{"id":21421855,"url":"https://github.com/mishalaskin/rad","last_synced_at":"2025-04-06T03:10:28.991Z","repository":{"id":38790059,"uuid":"254382571","full_name":"MishaLaskin/rad","owner":"MishaLaskin","description":"RAD: Reinforcement Learning with Augmented Data ","archived":false,"fork":false,"pushed_at":"2021-03-29T01:32:39.000Z","size":2760,"stargazers_count":409,"open_issues_count":4,"forks_count":71,"subscribers_count":15,"default_branch":"master","last_synced_at":"2025-03-30T02:08:29.432Z","etag":null,"topics":["codebase","data-","data-augmentations","deep-learning","deep-learning-algorithms","deep-neural-networks","deep-q-learning","deep-q-network","deep-reinforcement-learning","deeplearning-ai","dm-control","model-free","mujoc","off-policy","ppo","rad","reinforcement-learning","rl","sac","soft-actor-critic"],"latest_commit_sha":null,"homepage":"","language":"Jupyter Notebook","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":null,"status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/MishaLaskin.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":null,"code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null}},"created_at":"2020-04-09T13:43:18.000Z","updated_at":"2025-03-21T16:09:19.000Z","dependencies_parsed_at":"2022-09-23T15:50:31.850Z","dependency_job_id":null,"html_url":"https://github.com/MishaLaskin/rad","commit_stats":null,"previous_names":[],"tags_count":0,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/MishaLaskin%2Frad","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/MishaLaskin%2Frad/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/MishaLaskin%2Frad/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repo
sitories/MishaLaskin%2Frad/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/MishaLaskin","download_url":"https://codeload.github.com/MishaLaskin/rad/tar.gz/refs/heads/master","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":247427006,"owners_count":20937201,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["codebase","data-","data-augmentations","deep-learning","deep-learning-algorithms","deep-neural-networks","deep-q-learning","deep-q-network","deep-reinforcement-learning","deeplearning-ai","dm-control","model-free","mujoc","off-policy","ppo","rad","reinforcement-learning","rl","sac","soft-actor-critic"],"created_at":"2024-11-22T20:39:54.901Z","updated_at":"2025-04-06T03:10:28.967Z","avatar_url":"https://github.com/MishaLaskin.png","language":"Jupyter Notebook","funding_links":[],"categories":[],"sub_categories":[],"readme":"# Reinforcement Learning with Augmented Data (RAD)\n\nOfficial codebase for [Reinforcement Learning with Augmented Data](https://mishalaskin.github.io/rad). This codebase was originally forked from [CURL](https://mishalaskin.github.io/curl). 
\n\nAdditionally, here is the [codebase link for ProcGen experiments](https://github.com/pokaxpoka/rad_procgen) and [codebase link for OpenAI Gym experiments](https://github.com/pokaxpoka/rad_openaigym).\n\n\n## BibTeX\n\n```\n@article{laskin2020reinforcement,\n  title={Reinforcement learning with augmented data},\n  author={Laskin, Michael and Lee, Kimin and Stooke, Adam and Pinto, Lerrel and Abbeel, Pieter and Srinivas, Aravind},\n  journal={arXiv preprint arXiv:2004.14990},\n  year={2020}\n}\n```\n\n## Installation \n\nAll of the dependencies are in the `conda_env.yml` file. They can be installed manually or with the following command:\n\n```\nconda env create -f conda_env.yml\n```\n\n## Instructions\nTo train a RAD agent on the `cartpole swingup` task from image-based observations, run `bash script/run.sh` from the root of this directory. The `run.sh` file contains the following command, which you can modify to try different environments / augmentations / hyperparameters.\n\n```\nCUDA_VISIBLE_DEVICES=0 python train.py \\\n    --domain_name cartpole \\\n    --task_name swingup \\\n    --encoder_type pixel --work_dir ./tmp/cartpole \\\n    --action_repeat 8 --num_eval_episodes 10 \\\n    --pre_transform_image_size 100 --image_size 84 \\\n    --agent rad_sac --frame_stack 3 --data_augs flip  \\\n    --seed 23 --critic_lr 1e-3 --actor_lr 1e-3 --eval_freq 10000 --batch_size 128 --num_train_steps 200000 \u0026\n```\n\n## Data Augmentations \n\nAugmentations can be specified through the `--data_augs` flag. This codebase supports the augmentations specified in `data_augs.py`. To chain multiple data augmentations, separate the augmentation names with a `-`. For example, to apply `crop -\u003e rotate -\u003e flip`, pass `--data_augs crop-rotate-flip`. \n\nAll data augmentations can be visualized in `All_Data_Augs.ipynb`. 
You can also test the efficiency of our modules by running `python data_augs.py`.\n\n\n## Logging \n\nIn your console, you should see printouts that look like this:\n\n```\n| train | E: 13 | S: 2000 | D: 9.1 s | R: 48.3056 | BR: 0.8279 | A_LOSS: -3.6559 | CR_LOSS: 2.7563\n| train | E: 17 | S: 2500 | D: 9.1 s | R: 146.5945 | BR: 0.9066 | A_LOSS: -5.8576 | CR_LOSS: 6.0176\n| train | E: 21 | S: 3000 | D: 7.7 s | R: 138.7537 | BR: 1.0354 | A_LOSS: -7.8795 | CR_LOSS: 7.3928\n| train | E: 25 | S: 3500 | D: 9.0 s | R: 181.5103 | BR: 1.0764 | A_LOSS: -10.9712 | CR_LOSS: 8.8753\n| train | E: 29 | S: 4000 | D: 8.9 s | R: 240.6485 | BR: 1.2042 | A_LOSS: -13.8537 | CR_LOSS: 9.4001\n```\nThe above output decodes as:\n\n```\ntrain - training episode\nE - total number of episodes \nS - total number of environment steps\nD - duration in seconds to train 1 episode\nR - episode reward\nBR - average reward of sampled batch\nA_LOSS - average loss of actor\nCR_LOSS - average loss of critic\n```\n\nAll data related to the run is stored in the directory specified by `--work_dir`. To enable model or video saving, use the `--save_model` or `--save_video` flags. For all available flags, inspect `train.py`. To visualize progress with TensorBoard, run:\n\n```\ntensorboard --logdir log --port 6006\n```\n\nand go to `localhost:6006` in your browser. If you're running headlessly, try port forwarding with ssh.\n\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fmishalaskin%2Frad","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fmishalaskin%2Frad","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fmishalaskin%2Frad/lists"}