{"id":13545320,"url":"https://github.com/datawhalechina/joyrl","last_synced_at":"2025-05-16T00:08:52.390Z","repository":{"id":63476729,"uuid":"550323552","full_name":"datawhalechina/joyrl","owner":"datawhalechina","description":"An easier PyTorch deep reinforcement learning library.","archived":false,"fork":false,"pushed_at":"2024-12-19T05:53:15.000Z","size":19789,"stargazers_count":207,"open_issues_count":4,"forks_count":18,"subscribers_count":6,"default_branch":"main","last_synced_at":"2025-05-09T18:56:03.761Z","etag":null,"topics":[],"latest_commit_sha":null,"homepage":"https://datawhalechina.github.io/joyrl/","language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"mit","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/datawhalechina.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null}},"created_at":"2022-10-12T15:08:36.000Z","updated_at":"2025-05-07T06:14:52.000Z","dependencies_parsed_at":"2023-12-19T07:33:24.295Z","dependency_job_id":"2a622d04-2990-46c3-840e-2d909599bd51","html_url":"https://github.com/datawhalechina/joyrl","commit_stats":{"total_commits":20,"total_committers":2,"mean_commits":10.0,"dds":"0.30000000000000004","last_synced_commit":"5055cb9753212fb248928ba59e2ae7e6ab0ce316"},"previous_names":[],"tags_count":0,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/datawhalechina%2Fjoyrl","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/datawhalechina%2Fjoyrl/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/datawhalechina%2Fjoyrl/releases","manifests_url":"ht
tps://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/datawhalechina%2Fjoyrl/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/datawhalechina","download_url":"https://codeload.github.com/datawhalechina/joyrl/tar.gz/refs/heads/main","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":254442854,"owners_count":22071878,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":[],"created_at":"2024-08-01T11:01:00.846Z","updated_at":"2025-05-16T00:08:47.357Z","avatar_url":"https://github.com/datawhalechina.png","language":"Python","funding_links":[],"categories":["Python"],"sub_categories":[],"readme":"# JoyRL\n\n[![PyPI](https://img.shields.io/pypi/v/joyrl)](https://pypi.org/project/joyrl/)  [![GitHub issues](https://img.shields.io/github/issues/datawhalechina/joyrl)](https://github.com/datawhalechina/joyrl/issues) [![GitHub stars](https://img.shields.io/github/stars/datawhalechina/joyrl)](https://github.com/datawhalechina/joyrl/stargazers) [![GitHub forks](https://img.shields.io/github/forks/datawhalechina/joyrl)](https://github.com/datawhalechina/joyrl/network) [![GitHub license](https://img.shields.io/github/license/datawhalechina/joyrl)](https://github.com/datawhalechina/joyrl/blob/master/LICENSE)\n\n`JoyRL` is a parallel reinforcement learning library based on PyTorch and Ray. Unlike existing RL libraries, `JoyRL` relieves users of the burden of implementing algorithms with intricate details, unfriendly APIs, and so on. 
JoyRL is designed so that users can train and test RL algorithms by **configuring hyperparameters only**, which makes it much easier for beginners to learn and use. JoyRL also supports many state-of-the-art RL algorithms, including **RLHF (the core of ChatGPT)** (see the algorithms below). In addition, JoyRL provides a **modularized framework** so that users can customize their own algorithms and environments. \n\n## Install\n\n⚠️ Note: do not install JoyRL through any package mirror!\n\n```bash\n# you need to install Anaconda first\nconda create -n joyrl python=3.10\nconda activate joyrl\npip install -U joyrl\n```\n\nTorch install:\n\n```bash\n# CPU\npip install torch==2.2.1 torchvision==0.17.1 torchaudio==2.2.1\n# CUDA 11.8\npip install torch==2.2.1 torchvision==0.17.1 torchaudio==2.2.1 --index-url https://download.pytorch.org/whl/cu118\n# CUDA 12.1\npip install torch==2.2.1 torchvision==0.17.1 torchaudio==2.2.1 --index-url https://download.pytorch.org/whl/cu121\n```\n\n## Usage\n\n### Quick Start\n\nThe following demo shows how to use JoyRL. First create a YAML file to **configure hyperparameters**, then run the command below in your terminal. That is all you need to do to train a DQN agent on the CartPole-v1 environment.\n\n```bash\njoyrl --yaml ./presets/ClassControl/CartPole-v1/CartPole-v1_DQN.yaml\n```\nOr you can run the following code in a Python file. 
\n\n```python\nimport joyrl\nif __name__ == \"__main__\":\n    print(joyrl.__version__)\n    yaml_path = \"./presets/ClassControl/CartPole-v1/CartPole-v1_DQN.yaml\"\n    joyrl.run(yaml_path = yaml_path)\n```\n\n\n\n## Documentation\n\nMore tutorials and API documentation are hosted on [JoyRL docs](https://datawhalechina.github.io/joyrl/) or [JoyRL 中文文档](https://datawhalechina.github.io/joyrl-book/#/joyrl_docs/main).\n\n## Algorithms\n\n|       Name       |                          Reference                           |                    Author                     | Notes |\n| :--------------: | :----------------------------------------------------------: | :-------------------------------------------: | :---: |\n| Q-learning | [RL introduction](https://web.stanford.edu/class/psych209/Readings/SuttonBartoIPRLBook2ndEd.pdf) | [johnjim0816](https://github.com/johnjim0816) |       |\n| Sarsa | [RL introduction](https://web.stanford.edu/class/psych209/Readings/SuttonBartoIPRLBook2ndEd.pdf) | [johnjim0816](https://github.com/johnjim0816) | |\n| DQN | [DQN Paper](https://www.cs.toronto.edu/~vmnih/docs/dqn.pdf) | [johnjim0816](https://github.com/johnjim0816) | |\n| Double DQN  |     [DoubleDQN Paper](https://arxiv.org/abs/1509.06461)      | [johnjim0816](https://github.com/johnjim0816) | |\n| Dueling DQN | [DuelingDQN Paper](https://arxiv.org/abs/1511.06581) | [johnjim0816](https://github.com/johnjim0816) | |\n| NoisyDQN | [NoisyDQN Paper](https://arxiv.org/pdf/1706.10295.pdf) | [johnjim0816](https://github.com/johnjim0816) | |\n| CategoricalDQN | [CategoricalDQN Paper](https://arxiv.org/abs/1707.06887) | [johnjim0816](https://github.com/johnjim0816) | |\n| DDPG | [DDPG Paper](https://arxiv.org/abs/1509.02971) | [johnjim0816](https://github.com/johnjim0816) | |\n| TD3 | [TD3 Paper](https://arxiv.org/pdf/1802.09477) | [johnjim0816](https://github.com/johnjim0816) | |\n| A2C/A3C | [A3C Paper](https://arxiv.org/abs/1602.01783) | [johnjim0816](https://github.com/johnjim0816) | 
|\n| PPO | [PPO Paper](https://arxiv.org/abs/1707.06347) | [johnjim0816](https://github.com/johnjim0816) | |\n| SoftQ | [SoftQ Paper](https://arxiv.org/abs/1702.08165) | [johnjim0816](https://github.com/johnjim0816) | |\n\n## Why JoyRL?\n\n| RL Platform                                                  | GitHub Stars                                                 | # of Alg. \u003csup\u003e(1)\u003c/sup\u003e | Custom Env                     | Async Training      | RNN Support        | Multi-Head Observation | Backend                                           |\n| ------------------------------------------------------------ | ------------------------------------------------------------ | ------------------------ | ------------------------------ | ------------------ | ------------------ | ---------------------- | ------------------------------------------------- |\n| [Baselines](https://github.com/openai/baselines)             | [![GitHub stars](https://img.shields.io/github/stars/openai/baselines)](https://github.com/openai/baselines/stargazers) | 9                        | :heavy_check_mark: (gym)       | :x:                | :heavy_check_mark: | :x:                    | TF1                                               |\n| [Stable-Baselines](https://github.com/hill-a/stable-baselines) | [![GitHub stars](https://img.shields.io/github/stars/hill-a/stable-baselines)](https://github.com/hill-a/stable-baselines/stargazers) | 11                       | :heavy_check_mark: (gym)       | :x:                | :heavy_check_mark: | :x:                    | TF1                                               |\n| [Stable-Baselines3](https://github.com/DLR-RM/stable-baselines3) | [![GitHub stars](https://img.shields.io/github/stars/DLR-RM/stable-baselines3)](https://github.com/DLR-RM/stable-baselines3/stargazers) | 7        | :heavy_check_mark: (gym)       | :x:                | :x:                | :heavy_check_mark:     | PyTorch                                           |\n| 
[Ray/RLlib](https://github.com/ray-project/ray/tree/master/rllib/) | [![GitHub stars](https://img.shields.io/github/stars/ray-project/ray)](https://github.com/ray-project/ray/stargazers) | 16                       | :heavy_check_mark:             | :heavy_check_mark: | :heavy_check_mark: | :heavy_check_mark:     | TF/PyTorch                                        |\n| [SpinningUp](https://github.com/openai/spinningup)           | [![GitHub stars](https://img.shields.io/github/stars/openai/spinningup)](https://github.com/openai/spinningup/stargazers) | 6                        | :heavy_check_mark: (gym)       | :x:                | :x:                | :x:                    | PyTorch                                           |\n| [Dopamine](https://github.com/google/dopamine)               | [![GitHub stars](https://img.shields.io/github/stars/google/dopamine)](https://github.com/google/dopamine/stargazers) | 7                        | :x:                            | :x:                | :x:                | :x:                    | TF/JAX                                            |\n| [ACME](https://github.com/deepmind/acme)                     | [![GitHub stars](https://img.shields.io/github/stars/deepmind/acme)](https://github.com/deepmind/acme/stargazers) | 14                       | :heavy_check_mark: (dm_env)    | :x:                | :heavy_check_mark: | :heavy_check_mark:     | TF/JAX                                            |\n| [keras-rl](https://github.com/keras-rl/keras-rl)             | [![GitHub stars](https://img.shields.io/github/stars/keras-rl/keras-rl)](https://github.com/keras-rl/keras-rl/stargazers) | 7                        | :heavy_check_mark: (gym)       | :x:                | :x:                | :x:                    | Keras                                             |\n| [cleanrl](https://github.com/vwxyzjn/cleanrl)                | ![GitHub stars](https://img.shields.io/github/stars/vwxyzjn/cleanrl) | 9                        | 
:heavy_check_mark: (gym)       | :x:                | :x:                | :x:                    | [poetry](https://github.com/python-poetry/poetry) |\n| [rlpyt](https://github.com/astooke/rlpyt)                    | [![GitHub stars](https://img.shields.io/github/stars/astooke/rlpyt)](https://github.com/astooke/rlpyt/stargazers) | 11                       | :x:                            | :x:                | :heavy_check_mark: | :heavy_check_mark:     | PyTorch                                           |\n| [ChainerRL](https://github.com/chainer/chainerrl)            | [![GitHub stars](https://img.shields.io/github/stars/chainer/chainerrl)](https://github.com/chainer/chainerrl/stargazers) | 18                       | :heavy_check_mark: (gym)       | :x:                | :heavy_check_mark: | :x:                    | Chainer                                           |\n| [Tianshou](https://github.com/thu-ml/tianshou)               | [![GitHub stars](https://img.shields.io/github/stars/thu-ml/tianshou)](https://github.com/thu-ml/tianshou/stargazers) | 20                       | :heavy_check_mark: (Gymnasium) | :x:                | :heavy_check_mark: | :heavy_check_mark:     | PyTorch                                           |\n| [JoyRL](https://github.com/datawhalechina/joyrl)             | ![GitHub stars](https://img.shields.io/github/stars/datawhalechina/joyrl) | 12                    | :heavy_check_mark: (Gymnasium) | :heavy_check_mark: | :heavy_check_mark: | :heavy_check_mark:     | PyTorch                                           |\n\nHere are some other highlights of JoyRL:\n\n* Provides a series of Chinese courses, the [JoyRL Book](https://github.com/datawhalechina/joyrl-book) (with the English version in progress), suitable for beginners to start with a combination of theory\n\n## Contributors\n\n\u003ctable border=\"0\"\u003e\n  \u003ctbody\u003e\n    \u003ctr align=\"center\" \u003e\n        \u003ctd\u003e\n         \u003ca 
href=\"https://github.com/JohnJim0816\"\u003e\u003cimg width=\"70\" height=\"70\" src=\"https://github.com/JohnJim0816.png?s=40\" alt=\"pic\"\u003e\u003c/a\u003e\u003cbr\u003e\n         \u003ca href=\"https://github.com/JohnJim0816\"\u003eJohn Jim\u003c/a\u003e\n         \u003cp\u003ePeking University\u003c/p\u003e\n        \u003c/td\u003e\n        \u003ctd\u003e\n            \u003ca href=\"https://github.com/qiwang067\"\u003e\u003cimg width=\"70\" height=\"70\" src=\"https://github.com/qiwang067.png?s=40\" alt=\"pic\"\u003e\u003c/a\u003e\u003cbr\u003e\n            \u003ca href=\"https://github.com/qiwang067\"\u003eQi Wang\u003c/a\u003e \n            \u003cp\u003eShanghai Jiao Tong University\u003c/p\u003e\n        \u003c/td\u003e\n        \u003ctd\u003e\n            \u003ca href=\"https://github.com/yyysjz1997\"\u003e\u003cimg width=\"70\" height=\"70\" src=\"https://github.com/yyysjz1997.png?s=40\" alt=\"pic\"\u003e\u003c/a\u003e\u003cbr\u003e\n            \u003ca href=\"https://github.com/yyysjz1997\"\u003eYiyuan Yang\u003c/a\u003e \n            \u003cp\u003eUniversity of Oxford\u003c/p\u003e\n        \u003c/td\u003e\n    \u003c/tr\u003e\n  \u003c/tbody\u003e\n\u003c/table\u003e","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fdatawhalechina%2Fjoyrl","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fdatawhalechina%2Fjoyrl","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fdatawhalechina%2Fjoyrl/lists"}