{"id":13658921,"url":"https://github.com/Kaixhin/Rainbow","last_synced_at":"2025-04-24T11:33:03.849Z","repository":{"id":40414534,"uuid":"106249690","full_name":"Kaixhin/Rainbow","owner":"Kaixhin","description":"Rainbow: Combining Improvements in Deep Reinforcement Learning","archived":false,"fork":false,"pushed_at":"2022-01-13T01:24:38.000Z","size":176,"stargazers_count":1584,"open_issues_count":9,"forks_count":283,"subscribers_count":41,"default_branch":"master","last_synced_at":"2024-11-10T12:43:01.697Z","etag":null,"topics":["deep-learning","deep-reinforcement-learning"],"latest_commit_sha":null,"homepage":null,"language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"mit","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/Kaixhin.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":"CONTRIBUTING.md","funding":null,"license":"LICENSE.md","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null}},"created_at":"2017-10-09T07:21:24.000Z","updated_at":"2024-11-08T22:10:46.000Z","dependencies_parsed_at":"2022-08-09T19:40:57.247Z","dependency_job_id":null,"html_url":"https://github.com/Kaixhin/Rainbow","commit_stats":null,"previous_names":[],"tags_count":6,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/Kaixhin%2FRainbow","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/Kaixhin%2FRainbow/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/Kaixhin%2FRainbow/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/Kaixhin%2FRainbow/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/Kaixhin","download_url":"https://codeload.github.com/Kaixhin/Rainbow/tar.gz/refs/heads/master","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":250618728,"owners_count":21460143,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["deep-learning","deep-reinforcement-learning"],"created_at":"2024-08-02T05:01:03.766Z","updated_at":"2025-04-24T11:33:03.552Z","avatar_url":"https://github.com/Kaixhin.png","language":"Python","funding_links":[],"categories":["Python (144)","Paper implementations｜论文实现","Paper implementations"],"sub_categories":["Other libraries｜其他库:","Other libraries:"],"readme":"Rainbow\n=======\n[![MIT License](https://img.shields.io/badge/license-MIT-blue.svg)](LICENSE.md)\n\nRainbow: Combining Improvements in Deep Reinforcement Learning [[1]](#references).\n\nResults and pretrained models can be found in the [releases](https://github.com/Kaixhin/Rainbow/releases).\n\n- [x] DQN [[2]](#references)\n- [x] Double DQN [[3]](#references)\n- [x] Prioritised Experience Replay [[4]](#references)\n- [x] Dueling Network Architecture [[5]](#references)\n- [x] Multi-step Returns [[6]](#references)\n- [x] Distributional RL [[7]](#references)\n- [x] Noisy Nets [[8]](#references)\n\nRun the original Rainbow with the default arguments:\n\n```\npython main.py\n```\n\nData-efficient Rainbow [[9]](#references) can be run using the following options (note that the \"unbounded\" memory is implemented here in practice by manually setting the memory capacity to be the same as the maximum number of timesteps):\n\n```\npython main.py --target-update 2000 \\\n               --T-max 100000 \\\n               --learn-start 1600 \\\n               --memory-capacity 100000 \\\n               --replay-frequency 1 \\\n               --multi-step 20 \\\n               --architecture data-efficient \\\n               --hidden-size 256 \\\n               --learning-rate 0.0001 \\\n               --evaluation-interval 10000\n```\n\nNote that pretrained models from the [`1.3`](https://github.com/Kaixhin/Rainbow/releases/tag/1.3) release used a (slightly) incorrect network architecture. To use these, change the padding in the first convolutional layer from 0 to 1 (DeepMind uses \"valid\" (no) padding).\n\nRequirements\n------------\n\n- [atari-py](https://github.com/openai/atari-py)\n- [OpenCV Python](https://pypi.python.org/pypi/opencv-python)\n- [Plotly](https://plot.ly/)\n- [PyTorch](http://pytorch.org/)\n\nTo install all dependencies with Anaconda run `conda env create -f environment.yml` and use `source activate rainbow` to activate the environment.\n\nAvailable Atari games can be found in the [`atari-py` ROMs folder](https://github.com/openai/atari-py/tree/master/atari_py/atari_roms).\n\nAcknowledgements\n----------------\n\n- [@floringogianu](https://github.com/floringogianu) for [categorical-dqn](https://github.com/floringogianu/categorical-dqn)\n- [@jvmancuso](https://github.com/jvmancuso) for [Noisy layer](https://github.com/pytorch/pytorch/pull/2103)\n- [@jaara](https://github.com/jaara) for [AI-blog](https://github.com/jaara/AI-blog)\n- [@openai](https://github.com/openai) for [Baselines](https://github.com/openai/baselines)\n- [@mtthss](https://github.com/mtthss) for [implementation details](https://github.com/Kaixhin/Rainbow/wiki/Matteo's-Notes)\n\nReferences\n----------\n\n[1] [Rainbow: Combining Improvements in Deep Reinforcement Learning](https://arxiv.org/abs/1710.02298)  \n[2] [Playing Atari with Deep Reinforcement Learning](http://arxiv.org/abs/1312.5602)  \n[3] [Deep Reinforcement Learning with Double Q-learning](http://arxiv.org/abs/1509.06461)  \n[4] [Prioritized Experience Replay](http://arxiv.org/abs/1511.05952)  \n[5] [Dueling Network Architectures for Deep Reinforcement Learning](http://arxiv.org/abs/1511.06581)  \n[6] [Reinforcement Learning: An Introduction](http://www.incompleteideas.net/sutton/book/ebook/the-book.html)  \n[7] [A Distributional Perspective on Reinforcement Learning](https://arxiv.org/abs/1707.06887)  \n[8] [Noisy Networks for Exploration](https://arxiv.org/abs/1706.10295)  \n[9] [When to Use Parametric Models in Reinforcement Learning?](https://arxiv.org/abs/1906.05243)  \n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2FKaixhin%2FRainbow","html_url":"https://awesome.ecosyste.ms/projects/github.com%2FKaixhin%2FRainbow","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2FKaixhin%2FRainbow/lists"}