# OneRL
Event-driven fully distributed reinforcement learning framework proposed in "A Versatile and Efficient Reinforcement Learning Approach for Autonomous Driving" (https://arxiv.org/abs/2110.11573), facilitating highly efficient policy learning in a wide range of real-world RL applications.

- Super fast RL training! (15~30 min for MuJoCo & Atari on a single machine)
- State-of-the-art performance
- Scheduled and pipelined sample collection
- Completely lock-free execution
- Fully distributed architecture
- Full profiling & overhead identification tools
- Online visualization & rendering
- Multi-GPU parallel training
- Export of trained policies to ONNX for faster inference & deployment

## Installation
1. Clone this repo
```
git clone https://github.com/imoneoi/onerl.git
```
2. Install PyTorch and related dependencies (see requirements.txt)
```
pip install -r requirements.txt
```

## Quick Start
**MuJoCo benchmark**
(Any machine with a single GPU)
```shell
python -m onerl.nodes.launcher examples/config/1_gpu/mujoco_sac_.yaml
```

**Atari games**
(For 2 GPUs)
```shell
python -m onerl.nodes.launcher examples/config/2_gpu/atari_ddqn_.yaml
```

## Performance
## Configuration and Namespaces
Nodes are grouped into namespaces that isolate them from each other; settings shared across all namespaces go under the special `$global` namespace.
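As a rough sketch of this layout (the namespace and node keys below are illustrative only; see the YAML files under `examples/config/` for real configurations):

```yaml
$global:
  # Options placed here are visible to every namespace
  profile: False

# Each remaining top-level key defines a namespace; nodes declared
# inside it are isolated from nodes in other namespaces (illustrative).
experiment_a:
  # ... node definitions for this namespace
```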
## Algorithm Settings
Algorithm hyperparameters are specified in the same YAML configuration files.
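For illustration only (the keys below are hypothetical, not OneRL's actual schema), an algorithm section might look like:

```yaml
# Hypothetical algorithm settings; key names are illustrative only.
algorithm:
  name: sac
  lr: 3.0e-4
  gamma: 0.99
  batch_size: 256
```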
## Custom Environments
Custom environments use the OpenAI Gym interface: implement `reset` and `step`, as in the sketch below.
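A minimal sketch of such an environment (the dynamics, shapes, and class name here are purely illustrative):

```python
import numpy as np


class ToyEnv:
    """Gym-style environment sketch: `reset` and `step` are the
    required methods (illustrative only)."""

    def reset(self):
        # Return the initial observation
        self.state = np.zeros(4, dtype=np.float32)
        return self.state

    def step(self, action):
        # Apply the action; return (observation, reward, done, info)
        self.state = self.state + np.asarray(action, dtype=np.float32)
        reward = -float(np.abs(self.state).sum())
        done = bool(np.abs(self.state).max() > 10.0)
        return self.state, reward, done, {}
```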
## Custom Algorithms
Custom algorithms subclass `Algorithm` and implement `forward`, `learn`, and `policy_state_dict`:

```python
from collections import OrderedDict

import torch

# Import paths below are assumed; check the onerl package for exact locations.
from onerl.algorithms import Algorithm
from onerl.utils.batch.cuda import BatchCuda


class RandomAlgorithm(Algorithm):
    def __init__(self, network: dict, env_params: dict, **kwargs):
        super().__init__(network, env_params)
        # Initialize algorithm here

    def forward(self, obs: torch.Tensor, ticks: int) -> torch.Tensor:
        # Return the selected action for observation (obs) at time (ticks) as a tensor
        if "act_n" in self.env_params:
            # Discrete action space
            return torch.randint(0, self.env_params["act_n"], (obs.shape[0], ))
        else:
            # Continuous action space: uniform in [-act_max, act_max]
            return self.env_params["act_max"] * \
                   (torch.rand(obs.shape[0], *self.env_params["act_shape"]) * 2 - 1)

    def learn(self, batch: BatchCuda, ticks: int) -> dict:
        # Update the policy using a batch of transitions (s, a, r)_t
        return {}

    def policy_state_dict(self) -> OrderedDict:
        # Return the state dict (a dict of torch parameters) of the actor,
        # which is periodically pushed to PolicyNode to interact with the environment
        return OrderedDict()
```

## Export trained policy
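A minimal hedged sketch of ONNX export in plain PyTorch (the stand-in policy network and shapes below are hypothetical; OneRL's actual export entry point may differ):

```python
import torch
import torch.nn as nn

# Hypothetical stand-in for a trained policy network
policy = nn.Sequential(nn.Linear(4, 64), nn.ReLU(), nn.Linear(64, 2))
policy.eval()

# Trace the policy with a dummy observation batch and export to ONNX
dummy_obs = torch.zeros(1, 4)
torch.onnx.export(policy, dummy_obs, "policy.onnx",
                  input_names=["obs"], output_names=["act"])
```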
## Profiling & Visualization
1. Enable profile recording
Set `profile: True` and `profile_log_path` in the `$global` namespace:
```yaml
$global:
# Profiling
profile: True
profile_log_path: profile_log
```

2. Launch the experiment; the profile will be recorded in the meantime
```shell
python -m onerl.nodes.launcher <config.yaml>
```

3. Convert the profile to JSON format
```
python -m onerl.scripts.convert_profile profile_log/
```

4. Open the JSON profile in the Perfetto UI
Open https://ui.perfetto.dev in a browser and drag & drop the converted JSON profile `profile.json`.
![](./docs/assets/perf.png)
## Distributed principles
### Pipeline execution with event-driven scheduling
![](./docs/assets/pipeline.png)
In action:
![](./docs/assets/pipeline_perf.png)
### Lock-free replay sampling
![](./docs/assets/lockfree-buffer.drawio.png)
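The figure above conveys the core idea; as a rough single-process illustration (hypothetical, not OneRL's actual implementation), each writer owns its own region of a circular buffer, so appends never contend and samplers read committed slots without taking locks:

```python
import numpy as np


class LockFreeReplaySketch:
    """Illustrative circular replay buffer: each environment worker owns
    its own region, so writes never contend and no locks are needed."""

    def __init__(self, num_writers: int, capacity_per_writer: int, obs_dim: int):
        self.capacity = capacity_per_writer
        self.obs = np.zeros((num_writers, capacity_per_writer, obs_dim), dtype=np.float32)
        self.heads = np.zeros(num_writers, dtype=np.int64)  # per-writer write counters

    def append(self, writer_id: int, obs: np.ndarray) -> None:
        # Only `writer_id` ever touches its own region, so no locking is required
        self.obs[writer_id, self.heads[writer_id] % self.capacity] = obs
        self.heads[writer_id] += 1

    def sample(self, batch_size: int) -> np.ndarray:
        # Readers sample only already-written slots; assumes every writer
        # has appended at least once
        sizes = np.minimum(self.heads, self.capacity)
        writers = np.random.randint(0, len(sizes), size=batch_size)
        slots = (np.random.random(batch_size) * sizes[writers]).astype(np.int64)
        return self.obs[writers, slots]
```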
## Citation
If you use the OneRL training framework in your project, please cite the following paper:
```
@inproceedings{
wang2022a,
title={A Versatile and Efficient Reinforcement Learning Approach for Autonomous Driving},
author={Guan Wang and Haoyi Niu and Desheng Zhu and Jianming Hu and Xianyuan Zhan and Guyue Zhou},
booktitle={NeurIPS 2022 Reinforcement Learning for Real Life Workshop},
year={2022}
}
```