Ecosyste.ms: Awesome
An open API service indexing awesome lists of open source software.
https://github.com/datawhalechina/joyrl
An easier PyTorch deep reinforcement learning library.
https://github.com/datawhalechina/joyrl
Last synced: 4 days ago
JSON representation
An easier PyTorch deep reinforcement learning library.
- Host: GitHub
- URL: https://github.com/datawhalechina/joyrl
- Owner: datawhalechina
- License: mit
- Created: 2022-10-12T15:08:36.000Z (about 2 years ago)
- Default Branch: main
- Last Pushed: 2024-08-20T05:18:55.000Z (3 months ago)
- Last Synced: 2024-10-30T10:44:55.961Z (15 days ago)
- Language: Python
- Homepage: https://datawhalechina.github.io/joyrl/
- Size: 18.8 MB
- Stars: 162
- Watchers: 6
- Forks: 18
- Open Issues: 5
-
Metadata Files:
- Readme: README.md
- License: LICENSE
Awesome Lists containing this project
README
# JoyRL
[![PyPI](https://img.shields.io/pypi/v/joyrl)](https://pypi.org/project/joyrl/) [![GitHub issues](https://img.shields.io/github/issues/datawhalechina/joyrl)](https://github.com/datawhalechina/joyrl/issues) [![GitHub stars](https://img.shields.io/github/stars/datawhalechina/joyrl)](https://github.com/datawhalechina/joyrl/stargazers) [![GitHub forks](https://img.shields.io/github/forks/datawhalechina/joyrl)](https://github.com/datawhalechina/joyrl/network) [![GitHub license](https://img.shields.io/github/license/datawhalechina/joyrl)](https://github.com/datawhalechina/joyrl/blob/master/LICENSE)
`JoyRL` is a parallel reinforcement learning library based on PyTorch and Ray. Unlike existing RL libraries, `JoyRL` is helping users to release the burden of implementing algorithms with tough details, unfriendly APIs, and etc. JoyRL is designed for users to train and test RL algorithms with **only hyperparameters configuration**, which is mush easier for beginners to learn and use. Also, JoyRL supports plenties of state-of-art RL algorithms including **RLHF(core of ChatGPT)**(See algorithms below). JoyRL provides a **modularized framework** for users as well to customize their own algorithms and environments.
## Install
⚠️ Note that donot install JoyRL through any mirror image!!!
```bash
# you need to install Anaconda first
conda create -n joyrl python=3.10
conda activate joyrl
pip install -U joyrl
```Torch install:
```bash
# CPU
pip install torch==2.2.1 torchvision==0.17.1 torchaudio==2.2.1
# CUDA 11.8
pip install torch==2.2.1 torchvision==0.17.1 torchaudio==2.2.1 --index-url https://download.pytorch.org/whl/cu118
# CUDA 12.1
pip install torch==2.2.1 torchvision==0.17.1 torchaudio==2.2.1 --index-url https://download.pytorch.org/whl/cu121
```## Usage
### Quick Start
the following presents a demo to use joyrl. As you can see, first create a yaml file to **config hyperparameters**, then run the command as below in your terminal. That's all you need to do to train a DQN agent on CartPole-v1 environment.
```bash
joyrl --yaml ./presets/ClassControl/CartPole-v1/CartPole-v1_DQN.yaml
```
or you can run the following code in your python file.```python
import joyrl
if __name__ == "__main__":
print(joyrl.__version__)
yaml_path = "./presets/ClassControl/CartPole-v1/CartPole-v1_DQN.yaml"
joyrl.run(yaml_path = yaml_path)
```## Documentation
More tutorials and API documentation are hosted on [JoyRL docs](https://datawhalechina.github.io/joyrl/) or [JoyRL 中文文档](https://datawhalechina.github.io/joyrl-book/#/joyrl_docs/main).
## Algorithms
| Name | Reference | Author | Notes |
| :--------------: | :----------------------------------------------------------: | :-------------------------------------------: | :---: |
| Q-learning | [RL introduction](https://web.stanford.edu/class/psych209/Readings/SuttonBartoIPRLBook2ndEd.pdf) | [johnjim0816](https://github.com/johnjim0816) | |
| Sarsa | [RL introduction](https://web.stanford.edu/class/psych209/Readings/SuttonBartoIPRLBook2ndEd.pdf) | [johnjim0816](https://github.com/johnjim0816) | |
| DQN | [DQN Paper](https://www.cs.toronto.edu/~vmnih/docs/dqn.pdf) | [johnjim0816](https://github.com/johnjim0816) | |
| Double DQN | [DoubleDQN Paper](https://arxiv.org/abs/1509.06461) | [johnjim0816](https://github.com/johnjim0816) | |
| Dueling DQN | [DuelingDQN Paper](https://arxiv.org/abs/1511.06581) | [johnjim0816](https://github.com/johnjim0816) | |
| NoisyDQN | [NoisyDQN Paper](https://arxiv.org/pdf/1706.10295.pdf) | [johnjim0816](https://github.com/johnjim0816) | |
| DDPG | [DDPG Paper](https://arxiv.org/abs/1509.02971) | [johnjim0816](https://github.com/johnjim0816) | |
| TD3 | [TD3 Paper](https://arxiv.org/pdf/1802.09477) | [johnjim0816](https://github.com/johnjim0816) | |
| A2C/A3C | [A3C Paper](https://arxiv.org/abs/1602.01783) | [johnjim0816](https://github.com/johnjim0816) | |
| PPO | [PPO Paper](https://arxiv.org/abs/1707.06347) | [johnjim0816](https://github.com/johnjim0816) | |
| SoftQ | [SoftQ Paper](https://arxiv.org/abs/1702.08165) | [johnjim0816](https://github.com/johnjim0816) | |## Why JoyRL?
| RL Platform | GitHub Stars | # of Alg. (1) | Custom Env | Async Training | RNN Support | Multi-Head Observation | Backend |
| ------------------------------------------------------------ | ------------------------------------------------------------ | ------------------------ | ------------------------------ | ------------------ | ------------------ | ---------------------- | ------------------------------------------------- |
| [Baselines](https://github.com/openai/baselines) | [![GitHub stars](https://img.shields.io/github/stars/openai/baselines)](https://github.com/openai/baselines/stargazers) | 9 | :heavy_check_mark: (gym) | :x: | :heavy_check_mark: | :x: | TF1 |
| [Stable-Baselines](https://github.com/hill-a/stable-baselines) | [![GitHub stars](https://img.shields.io/github/stars/hill-a/stable-baselines)](https://github.com/hill-a/stable-baselines/stargazers) | 11 | :heavy_check_mark: (gym) | :x: | :heavy_check_mark: | :x: | TF1 |
| [Stable-Baselines3](https://github.com/DLR-RM/stable-baselines3) | [![GitHub stars](https://img.shields.io/github/stars/DLR-RM/stable-baselines3)](https://github.com/DLR-RM/stable-baselines3/stargazers) | 7 | :heavy_check_mark: (gym) | :x: | :x: | :heavy_check_mark: | PyTorch |
| [Ray/RLlib](https://github.com/ray-project/ray/tree/master/rllib/) | [![GitHub stars](https://img.shields.io/github/stars/ray-project/ray)](https://github.com/ray-project/ray/stargazers) | 16 | :heavy_check_mark: | :heavy_check_mark: | :heavy_check_mark: | :heavy_check_mark: | TF/PyTorch |
| [SpinningUp](https://github.com/openai/spinningup) | [![GitHub stars](https://img.shields.io/github/stars/openai/spinningup)](https://github.com/openai/spinningupstargazers) | 6 | :heavy_check_mark: (gym) | :x: | :x: | :x: | PyTorch |
| [Dopamine](https://github.com/google/dopamine) | [![GitHub stars](https://img.shields.io/github/stars/google/dopamine)](https://github.com/google/dopamine/stargazers) | 7 | :x: | :x: | :x: | :x: | TF/JAX |
| [ACME](https://github.com/deepmind/acme) | [![GitHub stars](https://img.shields.io/github/stars/deepmind/acme)](https://github.com/deepmind/acme/stargazers) | 14 | :heavy_check_mark: (dm_env) | :x: | :heavy_check_mark: | :heavy_check_mark: | TF/JAX |
| [keras-rl](https://github.com/keras-rl/keras-rl) | [![GitHub stars](https://img.shields.io/github/stars/keras-rl/keras-rl)](https://github.com/keras-rl/keras-rlstargazers) | 7 | :heavy_check_mark: (gym) | :x: | :x: | :x: | Keras |
| [cleanrl](https://github.com/vwxyzjn/cleanrl) | ![GitHub stars](https://img.shields.io/github/stars/vwxyzjn/cleanrl) | 9 | :heavy_check_mark: (gym) | :x: | :x: | :x: | [poetry](https://github.com/python-poetry/poetry) |
| [rlpyt](https://github.com/astooke/rlpyt) | [![GitHub stars](https://img.shields.io/github/stars/astooke/rlpyt)](https://github.com/astooke/rlpyt/stargazers) | 11 | :x: | :x: | :heavy_check_mark: | :heavy_check_mark: | PyTorch |
| [ChainerRL](https://github.com/chainer/chainerrl) | [![GitHub stars](https://img.shields.io/github/stars/chainer/chainerrl)](https://github.com/chainer/chainerrl/stargazers) | 18 | :heavy_check_mark: (gym) | :x: | :heavy_check_mark: | :x: | Chainer |
| [Tianshou](https://github.com/thu-ml/tianshou) | [![GitHub stars](https://img.shields.io/github/stars/thu-ml/tianshou)](https://github.com/thu-ml/tianshou/stargazers) | 20 | :heavy_check_mark: (Gymnasium) | :x: | :heavy_check_mark: | :heavy_check_mark: | PyTorch |
| [JoyRL](https://github.com/datawhalechina/joyrl) | ![GitHub stars](https://img.shields.io/github/stars/datawhalechina/joyrl) | 11 | :heavy_check_mark: (Gymnasium) | :heavy_check_mark: | :heavy_check_mark: | :heavy_check_mark: | PyTorch |Here are some other highlghts of JoyRL:
* Provide a series of Chinese courses [JoyRL Book](https://github.com/datawhalechina/joyrl-book) (with the English version in progress), suitable for beginners to start with a combination of theory
## Contributors
Peking University
Qi Wang
Shanghai Jiao Tong University
Yiyuan Yang
University of Oxford