{"id":19401090,"url":"https://github.com/google-research/reincarnating_rl","last_synced_at":"2025-04-24T07:30:32.418Z","repository":{"id":39082330,"uuid":"506782707","full_name":"google-research/reincarnating_rl","owner":"google-research","description":"[NeurIPS 2022] Open source code for reusing prior computational work in RL.","archived":true,"fork":false,"pushed_at":"2023-07-05T23:15:37.000Z","size":7644,"stargazers_count":95,"open_issues_count":5,"forks_count":13,"subscribers_count":5,"default_branch":"master","last_synced_at":"2025-03-13T13:44:09.123Z","etag":null,"topics":["dopamine","reinforcement-learning"],"latest_commit_sha":null,"homepage":"https://agarwl.github.io/reincarnating_rl","language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"apache-2.0","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/google-research.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":"CONTRIBUTING.md","funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null}},"created_at":"2022-06-23T20:30:13.000Z","updated_at":"2025-03-08T02:19:16.000Z","dependencies_parsed_at":"2022-09-19T23:00:16.812Z","dependency_job_id":null,"html_url":"https://github.com/google-research/reincarnating_rl","commit_stats":null,"previous_names":[],"tags_count":0,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/google-research%2Freincarnating_rl","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/google-research%2Freincarnating_rl/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/google-research%2Freincarnating_rl/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/google-research%2Freincarnating_rl/manifests","owner_url":"https://r
epos.ecosyste.ms/api/v1/hosts/GitHub/owners/google-research","download_url":"https://codeload.github.com/google-research/reincarnating_rl/tar.gz/refs/heads/master","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":250582776,"owners_count":21453911,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["dopamine","reinforcement-learning"],"created_at":"2024-11-10T11:17:08.732Z","updated_at":"2025-04-24T07:30:31.726Z","avatar_url":"https://github.com/google-research.png","language":"Python","funding_links":[],"categories":[],"sub_categories":[],"readme":"\n# [![Open in Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/drive/1ktlNni_vwFpFtCgUez-RHW0OdGc2U_Wv?usp=sharing) [![Website](https://img.shields.io/badge/www-Website-green)](https://agarwl.github.io/reincarnating_rl) [![Blog](https://img.shields.io/badge/b-Blog-blue)](https://ai.googleblog.com/2022/11/beyond-tabula-rasa-reincarnating.html)\n\n\n\n# Reincarnating Reinforcement Learning: Reusing Prior Computation to Accelerate Progress\n\n## [(External Replication) Working implementation in CleanRL](https://github.com/vwxyzjn/cleanrl/pull/344)\n\n\n\u003cp class=\"cover\" align=\"center\"\u003e \u003cimg src=\"RRL.gif\" width=\"80%\" /\u003e \u003c/p\u003e\n\nThis codebase provides the open source implementation using the\n[Dopamine][dopamine] framework for running Atari experiments in\n[Reincarnating RL][paper]. 
In this work, we leverage the\npolicy from an existing agent (e.g., DQN trained for 400M environment frames) to\nreincarnate another deep Q-learning agent. Refer to\n[agarwl.github.io/reincarnating_rl][project_page] for the project page.\n\n*This release is a work-in-progress. More instructions to be added soon.*\n\n## Downloading Teacher Checkpoints\n\nThe teacher checkpoints for pre-trained deep RL agents are in the public GCP\nbucket `gs://rl_checkpoints` ([browser link][gcp_bucket]), which can be\ndownloaded using [`gsutil`][gsutil]. To install gsutil, follow the instructions\n[here][gsutil_install].\n\nAfter installing gsutil, run the following command to download the final\ncheckpoint and Dopamine replay buffer for a DQN (Adam) agent trained for\n400 million environment frames on Atari 2600 games:\n\n```\ngsutil -m cp -R gs://rl_checkpoints/DQN_400 ./\n```\n\nTo download the checkpoint for a specific Atari game only (*e.g.*, replace\n`[GAME_NAME]` with `Breakout` to download the checkpoint for the game of\nBreakout), run the command:\n\n```\ngsutil -m cp -R gs://rl_checkpoints/DQN_400/[GAME_NAME] ./\n```\n\nNote that the agents were trained using the recommended training protocol on\nAtari with sticky actions, *i.e.*, there is a 25% chance at every time step that\nthe environment will execute the agent's previous action again, instead of the\nagent's new action.\n\n## Installation\n\nInstall `Dopamine` as a library following the\n[instructions here](https://github.com/google/dopamine#installing-from-source).\nAlternatively, use the following command:\n\n```\npip install git+https://github.com/google/dopamine.git\n```\n\nTo use Atari environments, follow the instructions provided in\n[Dopamine prerequisites](https://github.com/google/dopamine#prerequisites).\n\n1.  Install the Atari ROMs following the instructions from\n    [atari-py](https://github.com/openai/atari-py#roms).\n2.  `pip install ale-py` (we recommend using a\n    [virtual environment](virtualenv)).\n3.  
`unzip $ROM_DIR/ROMS.zip -d $ROM_DIR \u0026\u0026 ale-import-roms $ROM_DIR/ROMS`\n    (replace $ROM_DIR with the directory you extracted the ROMs to).\n\nOnce you have set up `Dopamine`, clone this repository:\n\n```\ngit clone https://github.com/google-research/reincarnating_rl.git\n```\n\n### Running the code\n\nThe entry point for training policy-to-value reincarnating RL (PVRL) agents on\nAtari 2600 games is\n[reincarnating_rl/train.py](https://github.com/google-research/reincarnating_rl/reincarnating_rl/train.py).\n\nTo run any PVRL agent given a teacher agent, first download the teacher\ncheckpoints to `$TEACHER_CKPT_DIR`. The commands below download the\ncheckpoints of a DQN (Adam) agent trained for 400M frames on `Breakout`.\n\n```\nexport TEACHER_CKPT_DIR=\"\u003cInsert directory name here\u003e\"\nmkdir -p $TEACHER_CKPT_DIR/Breakout\ngsutil -m cp -R gs://rl_checkpoints/DQN_400/Breakout $TEACHER_CKPT_DIR\n```\n\nAssuming that you have cloned the [reincarnating_rl][repo] repository, run the\n`QDaggerRainbow` agent using the following command:\n\n```\ncd reincarnating_rl\npython -um reincarnating_rl.train \\\n  --agent qdagger_rainbow \\\n  --gin_files reincarnating_rl/configs/qdagger_rainbow.gin \\\n  --base_dir /tmp/qdagger_rainbow \\\n  --teacher_checkpoint_dir $TEACHER_CKPT_DIR/Breakout/1 \\\n  --teacher_checkpoint_number 399 \\\n  --run_number=1 \\\n  --atari_roms_path=/tmp/atari_roms \\\n  --alsologtostderr\n```\n\nTo use an `Impala CNN` architecture for the Rainbow agent, pass the flag\n`--gin_bindings @reincarnation_networks.ImpalaRainbowNetwork` to the above\ncommand. 
More generally, since this code is based on Dopamine, it can be easily\nconfigured using the [gin configuration](https://github.com/google/gin-config)\nframework.\n\nTo run a quick experiment for testing or debugging, you can use the following\ncommand:\n\n```\npython -um reincarnating_rl.train \\\n  --agent qdagger_rainbow \\\n  --gin_files reincarnating_rl/configs/qdagger_rainbow.gin \\\n  --base_dir /tmp/qdagger_rainbow \\\n  --teacher_checkpoint_dir $TEACHER_CKPT_DIR/Breakout/1 \\\n  --teacher_checkpoint_number 399 \\\n  --atari_roms_path=/tmp/atari_roms \\\n  --run_number=1 \\\n  --gin_bindings=\"Runner.evaluation_steps=10\" \\\n  --gin_bindings=\"RunnerWithTeacher.num_pretraining_iterations=2\" \\\n  --gin_bindings=\"RunnerWithTeacher.num_pretraining_steps=10\" \\\n  --gin_bindings=\"JaxDQNAgent.min_replay_history = 64\" \\\n  --alsologtostderr\n```\n\n[gsutil_install]: https://cloud.google.com/storage/docs/gsutil_install#install\n[gsutil]: https://cloud.google.com/storage/docs/gsutil\n[ale]: https://github.com/mgbellemare/Arcade-Learning-Environment\n[gcp_bucket]: https://console.cloud.google.com/storage/browser/rl_checkpoints\n[project_page]: https://agarwl.github.io/reincarnating_rl\n[paper]: https://arxiv.org/pdf/2206.01626.pdf\n[dopamine]: https://github.com/google/dopamine\n[repo]: https://github.com/google-research/reincarnating_rl\n\n### Citing\n\nIf you find this open source release useful, please cite it in your paper:\n\n\u003e Agarwal, R., Schwarzer, M., Castro, P. S., Courville, A., \u0026 Bellemare, M. G.\n\u003e (2022). 
Reincarnating Reinforcement Learning: Reusing Prior Computation\n\u003e to Accelerate Progress. *arXiv preprint arXiv:2206.01626*.\n\n```\n@inproceedings{agarwal2022beyond,\n  title={Reincarnating Reinforcement Learning: Reusing Prior Computation to Accelerate Progress},\n  author={Agarwal, Rishabh and Schwarzer, Max and Castro, Pablo Samuel and Courville, Aaron and Bellemare, Marc G},\n  booktitle={Thirty-Sixth Conference on Neural Information Processing Systems},\n  year={2022}\n}\n```\n\n*Disclaimer: This is not an official Google product.*\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fgoogle-research%2Freincarnating_rl","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fgoogle-research%2Freincarnating_rl","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fgoogle-research%2Freincarnating_rl/lists"}