https://github.com/takuseno/nnabla-drl-collections
Deep reinforcement learning implementations written in NNabla
- Host: GitHub
- URL: https://github.com/takuseno/nnabla-drl-collections
- Owner: takuseno
- License: MIT
- Created: 2019-06-08T06:10:45.000Z (over 5 years ago)
- Default Branch: master
- Last Pushed: 2021-10-12T22:59:13.000Z (over 3 years ago)
- Last Synced: 2024-10-23T04:06:36.122Z (3 months ago)
- Topics: deep-reinforcement-learning, nnabla
- Language: Python
- Homepage:
- Size: 3.41 MB
- Stars: 5
- Watchers: 3
- Forks: 1
- Open Issues: 1
Metadata Files:
- Readme: README.md
- License: LICENSE
README
# NNabla DRL Collections
Deep reinforcement learning implementations with NNabla.

## install
```
$ pip install -r requirements.txt
```
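To confirm the install step worked, a quick check along these lines may help. It is only a sketch: it assumes `requirements.txt` installs the `nnabla` package (the import name of NNabla), and the `is_installed` helper is hypothetical, not part of this repository.

```python
import importlib.util


def is_installed(module_name: str) -> bool:
    """Return True if a top-level module can be located without importing it."""
    return importlib.util.find_spec(module_name) is not None


# Assumption: requirements.txt provides "nnabla"; extend the tuple with any
# other top-level modules you expect to be present.
for name in ("nnabla",):
    status = "found" if is_installed(name) else "missing"
    print(f"{name}: {status}")
```

If `nnabla` is reported as missing, re-run the `pip install` command inside the same Python environment you use to run the scripts.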
If you use a GPU, see the [NNabla CUDA installation guide](https://nnabla.readthedocs.io/en/latest/python/pip_installation_cuda.html).

## algorithms (discrete action-space)
- [x] [Deep Q-Network (DQN)](https://www.nature.com/articles/nature14236)
- [x] [Double DQN](https://arxiv.org/abs/1509.06461)
- [x] [Dueling DQN](https://arxiv.org/abs/1511.06581)
- [x] [NoisyNet-DQN](https://arxiv.org/abs/1706.10295)
- [x] [Prioritized Experience Replay](https://arxiv.org/abs/1511.05952)
- [x] [Categorical DQN](https://arxiv.org/abs/1707.06887)
- [x] [Bootstrapped DQN](https://arxiv.org/abs/1602.04621)
- [ ] [Rainbow](https://arxiv.org/abs/1710.02298)
- [ ] [Quantile Regression DQN (QR DQN)](https://arxiv.org/abs/1710.10044)
- [ ] [Implicit Quantile Networks (IQN)](https://arxiv.org/abs/1806.06923)
- [x] [Advantage Actor-Critic (A2C)](https://arxiv.org/abs/1602.01783)
- [ ] [Proximal Policy Optimization (PPO)](https://arxiv.org/abs/1707.06347)

## algorithms (continuous action-space)
- [x] [Deep Deterministic Policy Gradients (DDPG)](https://arxiv.org/abs/1509.02971)
- [x] [Twin Delayed Deep Deterministic Policy Gradients (TD3)](https://arxiv.org/abs/1802.09477)
- [x] [Soft Actor-Critic (SAC)](https://arxiv.org/abs/1801.01290)
- [x] [SAC (learned temperature)](https://arxiv.org/abs/1812.05905)
- [ ] [Proximal Policy Optimization (PPO)](https://arxiv.org/abs/1707.06347)
- [ ] [Trust Region Policy Optimization (TRPO)](https://arxiv.org/abs/1502.05477)
- [ ] [Actor-Critic with Experience Replay (ACER)](https://arxiv.org/abs/1611.01224)

## blog posts
- [Deep Q-Network Implementation with SONY’s NNabla](https://towardsdatascience.com/deep-q-network-implementation-with-sonys-nnabla-490d945deb8e)