https://github.com/sdpkjc/abcdrl
Modular Single-file Reinforcement Learning Algorithms Library
- Host: GitHub
- URL: https://github.com/sdpkjc/abcdrl
- Owner: sdpkjc
- License: other
- Created: 2022-11-12T07:18:55.000Z (over 2 years ago)
- Default Branch: main
- Last Pushed: 2023-05-16T09:16:32.000Z (almost 2 years ago)
- Last Synced: 2024-10-11T05:50:59.922Z (7 months ago)
- Topics: deep-learning, deep-reinforcement-learning, machine-learning, python, pytorch, reinfocement-learning
- Language: Python
- Homepage: http://docs.abcdrl.xyz
- Size: 7.92 MB
- Stars: 37
- Watchers: 1
- Forks: 1
- Open Issues: 1
Metadata Files:
- Readme: README.cn.md
- License: LICENSE
- Citation: CITATION.bib
README
# **abcdRL** (Implement an RL algorithm in four simple steps)

[English](./README.md) | Simplified Chinese
[GitHub](https://github.com/sdpkjc/abcdrl)
[Tests](https://github.com/sdpkjc/abcdrl/actions/workflows/test.yml)
[pre-commit](https://github.com/sdpkjc/abcdrl/actions/workflows/pre-commit.yml)
[PyPI](https://pypi.org/project/abcdrl)
[Docker Hub](https://hub.docker.com/r/sdpkjc/abcdrl/)
[Docs](https://docs.abcdrl.xyz/)
[Gitpod](https://gitpod.io/#https://github.com/sdpkjc/abcdrl)
[Benchmark Report](https://report.abcdrl.xyz/)
[Gitee](https://gitee.com/sdpkjc/abcdrl/)
[mypy](http://mypy-lang.org/)
[black](https://github.com/psf/black)
[isort](https://pycqa.github.io/isort/)
abcdRL is a **modular single-file reinforcement learning codebase** that provides a modular design which is "present but not strict", together with clear single-file algorithm implementations.

*When reading the code, you can grasp the complete implementation details of an algorithm within a single file; when improving an algorithm, the lightweight modular design lets you focus on only a small number of modules.*

> abcdRL mainly draws on the single-file design philosophy of [vwxyzjn/cleanrl](https://github.com/vwxyzjn/cleanrl/) and the module design of [PaddlePaddle/PARL](https://github.com/PaddlePaddle/PARL/).

***Documentation ➡️ [docs.abcdrl.xyz](https://docs.abcdrl.xyz/zh/)***

***Roadmap 🗺️ [#57](https://github.com/sdpkjc/abcdrl/issues/57)***
## 🚀 Quick Start

Open the project in Gitpod 🌐 and start coding immediately.

[Gitpod](https://gitpod.io/#https://github.com/sdpkjc/abcdrl)

Using Docker 📦:
```shell
# 0. Install Docker & the NVIDIA driver & the NVIDIA Container Toolkit
# 1. Run the DQN algorithm
docker run --rm --gpus all sdpkjc/abcdrl python abcdrl/dqn_torch.py
```

***[Detailed installation instructions 👀](https://docs.abcdrl.xyz/zh/install/)***
## 🐼 Features

- 👨👩👧👦 Unified code structure
- 📄 Single-file implementation
- 🐷 Low code reuse
- 📐 Minimal code differences between algorithms
- 📈 Integrated Tensorboard & Wandb logging (see the sketch after this list)
- 🛤 PEP8 & PEP526 compliant
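As a rough illustration of the Tensorboard & Wandb integration mentioned above, the snippet below shows what such combined logging typically looks like. It is a generic sketch, not abcdRL's actual logging code; the project name and metric tag are made up for the example.

```python
# Generic illustration of Tensorboard + Wandb logging; not abcdRL's actual code.
import wandb
from torch.utils.tensorboard import SummaryWriter

# "abcdrl-demo" is a hypothetical project name; offline mode avoids needing a Wandb login.
wandb.init(project="abcdrl-demo", mode="offline", sync_tensorboard=True)
writer = SummaryWriter(log_dir="runs/demo")

for global_step in range(100):
    # A real training loop would log episodic returns, losses, steps-per-second, etc.
    writer.add_scalar("charts/episodic_return", float(global_step), global_step)

writer.close()
wandb.finish()
```

With `sync_tensorboard=True`, metrics written to the `SummaryWriter` are mirrored to Wandb, so a single logging call serves both backends.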
## 🗽 Design Philosophy

- "Copy 📋", ~~not "inherit 🧬"~~
- "Single file 📜", ~~not "multiple files 📚"~~
- "Reuse functions 🛠", ~~not "reuse algorithms 🖨"~~
- "Consistent logic 🤖", ~~not "consistent interfaces 🔌"~~
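To make this philosophy concrete, here is a minimal, dependency-free sketch of what a "single file with lightly modular pieces" can look like. The class names (`ReplayBuffer`, `Algorithm`, `Agent`, `Trainer`) and the toy environment loop are assumptions made for illustration only, not abcdRL's actual API; the real implementations live in files such as `abcdrl/dqn_torch.py`.

```python
# Illustrative sketch of a "single file, lightly modular" layout.
# All names below are placeholders, NOT abcdRL's actual classes.
from __future__ import annotations

import random
from dataclasses import dataclass, field


@dataclass
class ReplayBuffer:
    """One small reusable piece: stores transitions and samples minibatches."""

    capacity: int = 10_000
    data: list = field(default_factory=list)

    def add(self, transition: tuple) -> None:
        if len(self.data) >= self.capacity:
            self.data.pop(0)
        self.data.append(transition)

    def sample(self, batch_size: int) -> list:
        return random.sample(self.data, min(batch_size, len(self.data)))


class Algorithm:
    """Owns the update rule; a real version would hold networks and optimizers."""

    def learn(self, batch: list) -> dict:
        return {"loss": 0.0, "batch_size": len(batch)}  # placeholder update


class Agent:
    """Wraps the algorithm with an action-selection policy."""

    def __init__(self, algorithm: Algorithm, n_actions: int) -> None:
        self.algorithm, self.n_actions = algorithm, n_actions

    def sample_action(self, obs: float) -> int:
        return random.randrange(self.n_actions)  # e.g. epsilon-greedy in practice


class Trainer:
    """Drives the interaction loop and wires the pieces together."""

    def __init__(self) -> None:
        self.buffer = ReplayBuffer()
        self.agent = Agent(Algorithm(), n_actions=2)

    def run(self, total_steps: int = 100) -> None:
        obs = 0.0  # stand-in for env.reset(); a real trainer would use gymnasium
        for step in range(total_steps):
            action = self.agent.sample_action(obs)
            next_obs, reward = obs + 1.0, 1.0  # stand-in for env.step(action)
            self.buffer.add((obs, action, reward, next_obs))
            obs = next_obs
            if step % 20 == 0:
                print(step, self.agent.algorithm.learn(self.buffer.sample(32)))


if __name__ == "__main__":
    Trainer().run()
```

Reading the file top to bottom gives the complete picture without jumping across files, while a change to the update rule stays confined to the `Algorithm`-like block.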
## ✅ Implemented Algorithms

***Weights & Biases benchmark report ➡️ [report.abcdrl.xyz](https://report.abcdrl.xyz)***
- [Deep Q Network (DQN)](https://doi.org/10.1038/nature14236) `dqn_torch.py`, `dqn_tf.py`, `dqn_atari_torch.py`, `dqn_atari_tf.py`
- [Deep Deterministic Policy Gradient (DDPG)](http://arxiv.org/abs/1509.02971) `ddpg_torch.py`
- [Twin Delayed Deep Deterministic Policy Gradient (TD3)](http://arxiv.org/abs/1802.09477) `td3_torch.py`
- [Soft Actor-Critic (SAC)](http://arxiv.org/abs/1801.01290) `sac_torch.py`
- [Proximal Policy Optimization (PPO)](http://arxiv.org/abs/1707.06347) `ppo_torch.py`

---
- [Double Deep Q Network (DDQN)](http://arxiv.org/abs/1509.06461) `ddqn_torch.py`, `ddqn_tf.py`
- [Prioritized Deep Q Network (PDQN)](http://arxiv.org/abs/1511.05952) `pdqn_torch.py`, `pdqn_tf.py`

## Citing abcdRL
```bibtex
@misc{zhao_abcdrl_2022,
  author = {Yanxiao, Zhao},
  month = {12},
  title = {{abcdRL: Modular Single-file Reinforcement Learning Algorithms Library}},
  url = {https://github.com/sdpkjc/abcdrl},
  year = {2022}
}
```