https://github.com/datawhalechina/joyrl

An easier PyTorch deep reinforcement learning library.
Host: GitHub
URL: https://github.com/datawhalechina/joyrl
Owner: datawhalechina
License: mit
Created: 2022-10-12T15:08:36.000Z (about 3 years ago)
Default Branch: main
Last Pushed: 2024-12-19T05:53:15.000Z (11 months ago)
Last Synced: 2025-05-09T18:56:03.761Z (7 months ago)
Language: Python
Homepage: https://datawhalechina.github.io/joyrl/
Size: 18.9 MB
Stars: 207
Watchers: 6
Forks: 18
Open Issues: 4
Metadata Files:
- Readme: README.md
- License: LICENSE
README

# JoyRL

[![PyPI](https://img.shields.io/pypi/v/joyrl)](https://pypi.org/project/joyrl/)  [![GitHub issues](https://img.shields.io/github/issues/datawhalechina/joyrl)](https://github.com/datawhalechina/joyrl/issues) [![GitHub stars](https://img.shields.io/github/stars/datawhalechina/joyrl)](https://github.com/datawhalechina/joyrl/stargazers) [![GitHub forks](https://img.shields.io/github/forks/datawhalechina/joyrl)](https://github.com/datawhalechina/joyrl/network) [![GitHub license](https://img.shields.io/github/license/datawhalechina/joyrl)](https://github.com/datawhalechina/joyrl/blob/master/LICENSE)

`JoyRL` is a parallel reinforcement learning library based on PyTorch and Ray. Unlike existing RL libraries, `JoyRL` relieves users of the burden of implementing algorithms with tricky details, unfriendly APIs, and so on. It is designed so that users can train and test RL algorithms with **hyperparameter configuration only**, which is much easier for beginners to learn and use. JoyRL also supports many state-of-the-art RL algorithms, including **RLHF (the core of ChatGPT)** (see the algorithms below), and provides a **modularized framework** for users to customize their own algorithms and environments.

## Install

⚠️ Note: do not install JoyRL through any mirror!

```bash
# you need to install Anaconda first
conda create -n joyrl python=3.10
conda activate joyrl
pip install -U joyrl
```

Torch install:

```bash
# CPU
pip install torch==2.2.1 torchvision==0.17.1 torchaudio==2.2.1

# CUDA 11.8
pip install torch==2.2.1 torchvision==0.17.1 torchaudio==2.2.1 --index-url https://download.pytorch.org/whl/cu118

# CUDA 12.1
pip install torch==2.2.1 torchvision==0.17.1 torchaudio==2.2.1 --index-url https://download.pytorch.org/whl/cu121
```
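
After installing, a quick check can confirm which PyTorch build is active in the environment. The following is a minimal sketch and not part of JoyRL: `describe_torch` is a hypothetical helper name, and it relies only on `torch.__version__`, `torch.version.cuda`, and `torch.cuda.is_available()`.

```python
import importlib.util


def describe_torch() -> str:
    """Return a one-line summary of the installed PyTorch build (hypothetical helper)."""
    # Avoid a hard failure when torch is missing from the active environment.
    if importlib.util.find_spec("torch") is None:
        return "torch is not installed in this environment"

    import torch

    # torch.version.cuda is None for CPU-only builds.
    cuda_build = torch.version.cuda or "none (CPU-only build)"
    return (
        f"torch {torch.__version__}, "
        f"CUDA build: {cuda_build}, "
        f"GPU available: {torch.cuda.is_available()}"
    )


if __name__ == "__main__":
    print(describe_torch())
```

If the summary reports a CPU-only build on a machine with a GPU, the `--index-url` used during installation probably did not match the local CUDA version.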

## Usage

### Quick Start

The following demo shows how to use JoyRL. First create a YAML file to **configure hyperparameters**, then run the command below in your terminal. That is all you need to do to train a DQN agent on the CartPole-v1 environment.

```bash
joyrl --yaml ./presets/ClassControl/CartPole-v1/CartPole-v1_DQN.yaml
```

Alternatively, run the following code in a Python file.

```python
import joyrl

if __name__ == "__main__":
    print(joyrl.__version__)
    yaml_path = "./presets/ClassControl/CartPole-v1/CartPole-v1_DQN.yaml"
    joyrl.run(yaml_path=yaml_path)
```
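
A mistyped preset path is an easy mistake when launching from Python. The sketch below checks that the YAML file exists before handing it to `joyrl.run`. `resolve_preset` and `train_from_preset` are hypothetical helper names and are not part of JoyRL; the only JoyRL call used is `joyrl.run(yaml_path=...)`, as shown in this README.

```python
from pathlib import Path


def resolve_preset(yaml_path: str) -> Path:
    """Return the absolute path of a preset file, or raise if it is missing."""
    path = Path(yaml_path).expanduser().resolve()
    if not path.is_file():
        raise FileNotFoundError(f"Preset YAML not found: {path}")
    return path


def train_from_preset(yaml_path: str) -> None:
    """Validate the path, then start JoyRL with it (hypothetical wrapper)."""
    preset = resolve_preset(yaml_path)
    # Imported here so the path check works even before joyrl is installed.
    import joyrl

    joyrl.run(yaml_path=str(preset))


if __name__ == "__main__":
    try:
        train_from_preset("./presets/ClassControl/CartPole-v1/CartPole-v1_DQN.yaml")
    except FileNotFoundError as err:
        # Report the missing preset instead of failing with a traceback.
        print(err)
```

Resolving the path up front means a wrong working directory is reported before any training setup begins.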

## Documentation

More tutorials and API documentation are hosted on [JoyRL docs](https://datawhalechina.github.io/joyrl/) or [JoyRL 中文文档](https://datawhalechina.github.io/joyrl-book/#/joyrl_docs/main).

## Algorithms

| Name | Reference | Author | Notes |
| :--------------: | :----------------------------------------------------------: | :-------------------------------------------: | :---: |
| Q-learning | [RL introduction](https://web.stanford.edu/class/psych209/Readings/SuttonBartoIPRLBook2ndEd.pdf) | [johnjim0816](https://github.com/johnjim0816) | |
| Sarsa | [RL introduction](https://web.stanford.edu/class/psych209/Readings/SuttonBartoIPRLBook2ndEd.pdf) | [johnjim0816](https://github.com/johnjim0816) | |
| DQN | [DQN Paper](https://www.cs.toronto.edu/~vmnih/docs/dqn.pdf) | [johnjim0816](https://github.com/johnjim0816) | |
| Double DQN | [DoubleDQN Paper](https://arxiv.org/abs/1509.06461) | [johnjim0816](https://github.com/johnjim0816) | |
| Dueling DQN | [DuelingDQN Paper](https://arxiv.org/abs/1511.06581) | [johnjim0816](https://github.com/johnjim0816) | |
| NoisyDQN | [NoisyDQN Paper](https://arxiv.org/pdf/1706.10295.pdf) | [johnjim0816](https://github.com/johnjim0816) | |
| CategoricalDQN | [CategoricalDQN Paper](https://arxiv.org/abs/1707.06887) | [johnjim0816](https://github.com/johnjim0816) | |
| DDPG | [DDPG Paper](https://arxiv.org/abs/1509.02971) | [johnjim0816](https://github.com/johnjim0816) | |
| TD3 | [TD3 Paper](https://arxiv.org/pdf/1802.09477) | [johnjim0816](https://github.com/johnjim0816) | |
| A2C/A3C | [A3C Paper](https://arxiv.org/abs/1602.01783) | [johnjim0816](https://github.com/johnjim0816) | |
| PPO | [PPO Paper](https://arxiv.org/abs/1707.06347) | [johnjim0816](https://github.com/johnjim0816) | |
| SoftQ | [SoftQ Paper](https://arxiv.org/abs/1702.08165) | [johnjim0816](https://github.com/johnjim0816) | |

## Why JoyRL?

| RL Platform | GitHub Stars | # of Alg. ⁽¹⁾ | Custom Env | Async Training | RNN Support | Multi-Head Observation | Backend |
| ------------------------------------------------------------ | ------------------------------------------------------------ | ------------------------ | ------------------------------ | ------------------ | ------------------ | ---------------------- | ------------------------------------------------- |
| [Baselines](https://github.com/openai/baselines) | [![GitHub stars](https://img.shields.io/github/stars/openai/baselines)](https://github.com/openai/baselines/stargazers) | 9 | :heavy_check_mark: (gym) | :x: | :heavy_check_mark: | :x: | TF1 |
| [Stable-Baselines](https://github.com/hill-a/stable-baselines) | [![GitHub stars](https://img.shields.io/github/stars/hill-a/stable-baselines)](https://github.com/hill-a/stable-baselines/stargazers) | 11 | :heavy_check_mark: (gym) | :x: | :heavy_check_mark: | :x: | TF1 |
| [Stable-Baselines3](https://github.com/DLR-RM/stable-baselines3) | [![GitHub stars](https://img.shields.io/github/stars/DLR-RM/stable-baselines3)](https://github.com/DLR-RM/stable-baselines3/stargazers) | 7 | :heavy_check_mark: (gym) | :x: | :x: | :heavy_check_mark: | PyTorch |
| [Ray/RLlib](https://github.com/ray-project/ray/tree/master/rllib/) | [![GitHub stars](https://img.shields.io/github/stars/ray-project/ray)](https://github.com/ray-project/ray/stargazers) | 16 | :heavy_check_mark: | :heavy_check_mark: | :heavy_check_mark: | :heavy_check_mark: | TF/PyTorch |
| [SpinningUp](https://github.com/openai/spinningup) | [![GitHub stars](https://img.shields.io/github/stars/openai/spinningup)](https://github.com/openai/spinningup/stargazers) | 6 | :heavy_check_mark: (gym) | :x: | :x: | :x: | PyTorch |
| [Dopamine](https://github.com/google/dopamine) | [![GitHub stars](https://img.shields.io/github/stars/google/dopamine)](https://github.com/google/dopamine/stargazers) | 7 | :x: | :x: | :x: | :x: | TF/JAX |
| [ACME](https://github.com/deepmind/acme) | [![GitHub stars](https://img.shields.io/github/stars/deepmind/acme)](https://github.com/deepmind/acme/stargazers) | 14 | :heavy_check_mark: (dm_env) | :x: | :heavy_check_mark: | :heavy_check_mark: | TF/JAX |
| [keras-rl](https://github.com/keras-rl/keras-rl) | [![GitHub stars](https://img.shields.io/github/stars/keras-rl/keras-rl)](https://github.com/keras-rl/keras-rl/stargazers) | 7 | :heavy_check_mark: (gym) | :x: | :x: | :x: | Keras |
| [cleanrl](https://github.com/vwxyzjn/cleanrl) | ![GitHub stars](https://img.shields.io/github/stars/vwxyzjn/cleanrl) | 9 | :heavy_check_mark: (gym) | :x: | :x: | :x: | [poetry](https://github.com/python-poetry/poetry) |
| [rlpyt](https://github.com/astooke/rlpyt) | [![GitHub stars](https://img.shields.io/github/stars/astooke/rlpyt)](https://github.com/astooke/rlpyt/stargazers) | 11 | :x: | :x: | :heavy_check_mark: | :heavy_check_mark: | PyTorch |
| [ChainerRL](https://github.com/chainer/chainerrl) | [![GitHub stars](https://img.shields.io/github/stars/chainer/chainerrl)](https://github.com/chainer/chainerrl/stargazers) | 18 | :heavy_check_mark: (gym) | :x: | :heavy_check_mark: | :x: | Chainer |
| [Tianshou](https://github.com/thu-ml/tianshou) | [![GitHub stars](https://img.shields.io/github/stars/thu-ml/tianshou)](https://github.com/thu-ml/tianshou/stargazers) | 20 | :heavy_check_mark: (Gymnasium) | :x: | :heavy_check_mark: | :heavy_check_mark: | PyTorch |
| [JoyRL](https://github.com/datawhalechina/joyrl) | ![GitHub stars](https://img.shields.io/github/stars/datawhalechina/joyrl) | 12 | :heavy_check_mark: (Gymnasium) | :heavy_check_mark: | :heavy_check_mark: | :heavy_check_mark: | PyTorch |

Here are some other highlights of JoyRL:

* Provide a series of Chinese courses [JoyRL Book](https://github.com/datawhalechina/joyrl-book) (with the English version in progress), suitable for beginners to start with a combination of theory

## Contributors

* John Jim, Peking University
* Qi Wang, Shanghai Jiao Tong University
* Yiyuan Yang, University of Oxford