Ecosyste.ms: Awesome
An open API service indexing awesome lists of open source software.
https://github.com/mishalaskin/rad
RAD: Reinforcement Learning with Augmented Data
https://github.com/mishalaskin/rad
codebase data- data-augmentations deep-learning deep-learning-algorithms deep-neural-networks deep-q-learning deep-q-network deep-reinforcement-learning deeplearning-ai dm-control model-free mujoc off-policy ppo rad reinforcement-learning rl sac soft-actor-critic
Last synced: 1 day ago
JSON representation
RAD: Reinforcement Learning with Augmented Data
- Host: GitHub
- URL: https://github.com/mishalaskin/rad
- Owner: MishaLaskin
- Created: 2020-04-09T13:43:18.000Z (almost 5 years ago)
- Default Branch: master
- Last Pushed: 2021-03-29T01:32:39.000Z (almost 4 years ago)
- Last Synced: 2025-01-14T02:06:50.610Z (9 days ago)
- Topics: codebase, data-, data-augmentations, deep-learning, deep-learning-algorithms, deep-neural-networks, deep-q-learning, deep-q-network, deep-reinforcement-learning, deeplearning-ai, dm-control, model-free, mujoc, off-policy, ppo, rad, reinforcement-learning, rl, sac, soft-actor-critic
- Language: Jupyter Notebook
- Homepage:
- Size: 2.63 MB
- Stars: 404
- Watchers: 16
- Forks: 71
- Open Issues: 4
-
Metadata Files:
- Readme: README.md
Awesome Lists containing this project
README
# Reinforcement Learning with Augmented Data (RAD)
Official codebase for [Reinforcement Learning with Augmented Data](https://mishalaskin.github.io/rad). This codebase was originally forked from [CURL](https://mishalaskin.github.io/curl).
Additionally, here is the [codebase link for ProcGen experiments](https://github.com/pokaxpoka/rad_procgen) and [codebase link for OpenAI Gym experiments](https://github.com/pokaxpoka/rad_openaigym).
## BibTex
```
@article{laskin2020reinforcement,
title={Reinforcement learning with augmented data},
author={Laskin, Michael and Lee, Kimin and Stooke, Adam and Pinto, Lerrel and Abbeel, Pieter and Srinivas, Aravind},
journal={arXiv preprint arXiv:2004.14990},
year={2020}
}
```## Installation
All of the dependencies are in the `conda_env.yml` file. They can be installed manually or with the following command:
```
conda env create -f conda_env.yml
```## Instructions
To train a RAD agent on the `cartpole swingup` task from image-based observations run `bash script/run.sh` from the root of this directory. The `run.sh` file contains the following command, which you can modify to try different environments / augmentations / hyperparamters.```
CUDA_VISIBLE_DEVICES=0 python train.py \
--domain_name cartpole \
--task_name swingup \
--encoder_type pixel --work_dir ./tmp/cartpole \
--action_repeat 8 --num_eval_episodes 10 \
--pre_transform_image_size 100 --image_size 84 \
--agent rad_sac --frame_stack 3 --data_augs flip \
--seed 23 --critic_lr 1e-3 --actor_lr 1e-3 --eval_freq 10000 --batch_size 128 --num_train_steps 200000 &
```## Data Augmentations
Augmentations can be specified through the `--data_augs` flag. This codebase supports the augmentations specified in `data_augs.py`. To chain multiple data augmentation simply separate the augmentation strings with a `-` string. For example to apply `crop -> rotate -> flip` you can do the following `--data_augs crop-rotate-flip`.
All data augmentations can be visualized in `All_Data_Augs.ipynb`. You can also test the efficiency of our modules by running `python data_aug.py`.
## Logging
In your console, you should see printouts that look like this:
```
| train | E: 13 | S: 2000 | D: 9.1 s | R: 48.3056 | BR: 0.8279 | A_LOSS: -3.6559 | CR_LOSS: 2.7563
| train | E: 17 | S: 2500 | D: 9.1 s | R: 146.5945 | BR: 0.9066 | A_LOSS: -5.8576 | CR_LOSS: 6.0176
| train | E: 21 | S: 3000 | D: 7.7 s | R: 138.7537 | BR: 1.0354 | A_LOSS: -7.8795 | CR_LOSS: 7.3928
| train | E: 25 | S: 3500 | D: 9.0 s | R: 181.5103 | BR: 1.0764 | A_LOSS: -10.9712 | CR_LOSS: 8.8753
| train | E: 29 | S: 4000 | D: 8.9 s | R: 240.6485 | BR: 1.2042 | A_LOSS: -13.8537 | CR_LOSS: 9.4001
```
The above output decodes as:```
train - training episode
E - total number of episodes
S - total number of environment steps
D - duration in seconds to train 1 episode
R - episode reward
BR - average reward of sampled batch
A_LOSS - average loss of actor
CR_LOSS - average loss of critic
```All data related to the run is stored in the specified `working_dir`. To enable model or video saving, use the `--save_model` or `--save_video` flags. For all available flags, inspect `train.py`. To visualize progress with tensorboard run:
```
tensorboard --logdir log --port 6006
```and go to `localhost:6006` in your browser. If you're running headlessly, try port forwarding with ssh.