https://github.com/mishalaskin/rad

RAD: Reinforcement Learning with Augmented Data
https://github.com/mishalaskin/rad

codebase data- data-augmentations deep-learning deep-learning-algorithms deep-neural-networks deep-q-learning deep-q-network deep-reinforcement-learning deeplearning-ai dm-control model-free mujoc off-policy ppo rad reinforcement-learning rl sac soft-actor-critic

Last synced: 3 months ago
JSON representation

RAD: Reinforcement Learning with Augmented Data

Host: GitHub
URL: https://github.com/mishalaskin/rad
Owner: MishaLaskin
Created: 2020-04-09T13:43:18.000Z (about 5 years ago)
Default Branch: master
Last Pushed: 2021-03-29T01:32:39.000Z (about 4 years ago)
Last Synced: 2025-03-30T02:08:29.432Z (3 months ago)
Topics: codebase, data-, data-augmentations, deep-learning, deep-learning-algorithms, deep-neural-networks, deep-q-learning, deep-q-network, deep-reinforcement-learning, deeplearning-ai, dm-control, model-free, mujoc, off-policy, ppo, rad, reinforcement-learning, rl, sac, soft-actor-critic
Language: Jupyter Notebook
Homepage:
Size: 2.63 MB
Stars: 409
Watchers: 15
Forks: 71
Open Issues: 4
Metadata Files:
- Readme: README.md

Awesome Lists containing this project

README

# Reinforcement Learning with Augmented Data (RAD)

Official codebase for [Reinforcement Learning with Augmented Data](https://mishalaskin.github.io/rad). This codebase was originally forked from [CURL](https://mishalaskin.github.io/curl).

Additionally, here is the [codebase link for ProcGen experiments](https://github.com/pokaxpoka/rad_procgen) and [codebase link for OpenAI Gym experiments](https://github.com/pokaxpoka/rad_openaigym).

## BibTex

```
@article{laskin2020reinforcement,
title={Reinforcement learning with augmented data},
author={Laskin, Michael and Lee, Kimin and Stooke, Adam and Pinto, Lerrel and Abbeel, Pieter and Srinivas, Aravind},
journal={arXiv preprint arXiv:2004.14990},
year={2020}
}
```

## Installation

All of the dependencies are in the `conda_env.yml` file. They can be installed manually or with the following command:

```
conda env create -f conda_env.yml
```

## Instructions
To train a RAD agent on the `cartpole swingup` task from image-based observations run `bash script/run.sh` from the root of this directory. The `run.sh` file contains the following command, which you can modify to try different environments / augmentations / hyperparamters.

```
CUDA_VISIBLE_DEVICES=0 python train.py \
--domain_name cartpole \
--task_name swingup \
--encoder_type pixel --work_dir ./tmp/cartpole \
--action_repeat 8 --num_eval_episodes 10 \
--pre_transform_image_size 100 --image_size 84 \
--agent rad_sac --frame_stack 3 --data_augs flip \
--seed 23 --critic_lr 1e-3 --actor_lr 1e-3 --eval_freq 10000 --batch_size 128 --num_train_steps 200000 &
```

## Data Augmentations

Augmentations can be specified through the `--data_augs` flag. This codebase supports the augmentations specified in `data_augs.py`. To chain multiple data augmentation simply separate the augmentation strings with a `-` string. For example to apply `crop -> rotate -> flip` you can do the following `--data_augs crop-rotate-flip`.

All data augmentations can be visualized in `All_Data_Augs.ipynb`. You can also test the efficiency of our modules by running `python data_aug.py`.

## Logging

In your console, you should see printouts that look like this:

```
| train | E: 13 | S: 2000 | D: 9.1 s | R: 48.3056 | BR: 0.8279 | A_LOSS: -3.6559 | CR_LOSS: 2.7563
| train | E: 17 | S: 2500 | D: 9.1 s | R: 146.5945 | BR: 0.9066 | A_LOSS: -5.8576 | CR_LOSS: 6.0176
| train | E: 21 | S: 3000 | D: 7.7 s | R: 138.7537 | BR: 1.0354 | A_LOSS: -7.8795 | CR_LOSS: 7.3928
| train | E: 25 | S: 3500 | D: 9.0 s | R: 181.5103 | BR: 1.0764 | A_LOSS: -10.9712 | CR_LOSS: 8.8753
| train | E: 29 | S: 4000 | D: 8.9 s | R: 240.6485 | BR: 1.2042 | A_LOSS: -13.8537 | CR_LOSS: 9.4001
```
The above output decodes as:

```
train - training episode
E - total number of episodes
S - total number of environment steps
D - duration in seconds to train 1 episode
R - episode reward
BR - average reward of sampled batch
A_LOSS - average loss of actor
CR_LOSS - average loss of critic
```

All data related to the run is stored in the specified `working_dir`. To enable model or video saving, use the `--save_model` or `--save_video` flags. For all available flags, inspect `train.py`. To visualize progress with tensorboard run:

```
tensorboard --logdir log --port 6006
```

and go to `localhost:6006` in your browser. If you're running headlessly, try port forwarding with ssh.

ecosyste.ms

Data

Tools

Indexes

Applications

Experiments

Awesome

https://github.com/mishalaskin/rad

Awesome Lists containing this project

README