https://github.com/takuseno/nnabla-drl-collections

Deep reinforcement learning implementations written in NNabla

deep-reinforcement-learning nnabla

Host: GitHub
URL: https://github.com/takuseno/nnabla-drl-collections
Owner: takuseno
License: mit
Created: 2019-06-08T06:10:45.000Z (over 5 years ago)
Default Branch: master
Last Pushed: 2021-10-12T22:59:13.000Z (over 3 years ago)
Last Synced: 2024-10-23T04:06:36.122Z (3 months ago)
Topics: deep-reinforcement-learning, nnabla
Language: Python
Homepage:
Size: 3.41 MB
Stars: 5
Watchers: 3
Forks: 1
Open Issues: 1
Metadata Files:
- Readme: README.md
- License: LICENSE

README

# NNabla DRL Collections

Deep reinforcement learning implementations with NNabla.

## install

```
$ pip install -r requirements.txt
```

If you use a GPU, see the [NNabla CUDA installation guide](https://nnabla.readthedocs.io/en/latest/python/pip_installation_cuda.html).

## algorithms (discrete action-space)

- [x] [Deep Q-Network (DQN)](https://www.nature.com/articles/nature14236)

- [x] [Double DQN](https://arxiv.org/abs/1509.06461)

- [x] [Dueling DQN](https://arxiv.org/abs/1511.06581)

- [x] [NoisyNet-DQN](https://arxiv.org/abs/1706.10295)

- [x] [Prioritized Experience Replay](https://arxiv.org/abs/1511.05952)

- [x] [Categorical DQN](https://arxiv.org/abs/1707.06887)

- [x] [Bootstrapped DQN](https://arxiv.org/abs/1602.04621)

- [ ] [Rainbow](https://arxiv.org/abs/1710.02298)

- [ ] [Quantile Regression DQN (QR DQN)](https://arxiv.org/abs/1710.10044)

- [ ] [Implicit Quantile Networks (IQN)](https://arxiv.org/abs/1806.06923)

- [x] [Advantage Actor-Critic (A2C)](https://arxiv.org/abs/1602.01783)

- [ ] [Proximal Policy Optimization (PPO)](https://arxiv.org/abs/1707.06347)
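
As a rough illustration of how the first two entries in the list above differ, here is a minimal, framework-free sketch of the one-step targets used by DQN and Double DQN. It is not taken from this repository; the function names (`dqn_target`, `double_dqn_target`), argument names, and default `gamma` are assumptions chosen for the example, and plain Python lists stand in for NNabla variables.

```python
def dqn_target(reward, next_q_target, done, gamma=0.99):
    """One-step DQN target: r + gamma * max_a Q_target(s', a).

    next_q_target is a list of target-network Q-values for the next state.
    """
    # No bootstrapping from a terminal state.
    bootstrap = 0.0 if done else max(next_q_target)
    return reward + gamma * bootstrap


def double_dqn_target(reward, next_q_online, next_q_target, done, gamma=0.99):
    """Double DQN target: the online network selects the next action,
    the target network evaluates that action."""
    if done:
        return reward
    # Greedy action according to the online network (hypothetical input list).
    best_action = max(range(len(next_q_online)), key=lambda a: next_q_online[a])
    return reward + gamma * next_q_target[best_action]


if __name__ == "__main__":
    online = [1.0, 3.0, 2.0]   # illustrative Q-values only
    target = [5.0, 0.5, 1.0]
    print(dqn_target(1.0, target, done=False, gamma=0.9))
    print(double_dqn_target(1.0, online, target, done=False, gamma=0.9))
```

With these illustrative values, the DQN target bootstraps from the largest target-network value, while Double DQN bootstraps from the target-network value of the action preferred by the online network, which is the mechanism the Double DQN paper uses to reduce overestimation.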

## algorithms (continuous action-space)

- [x] [Deep Deterministic Policy Gradients (DDPG)](https://arxiv.org/abs/1509.02971)

- [x] [Twin Delayed Deep Deterministic Policy Gradients (TD3)](https://arxiv.org/abs/1802.09477)

- [x] [Soft Actor-Critic (SAC)](https://arxiv.org/abs/1801.01290)

- [x] [SAC (learned temperature)](https://arxiv.org/abs/1812.05905)

- [ ] [Proximal Policy Optimization (PPO)](https://arxiv.org/abs/1707.06347)

- [ ] [Trust Region Policy Optimization (TRPO)](https://arxiv.org/abs/1502.05477)

- [ ] [Actor-Critic with Experience Replay (ACER)](https://arxiv.org/abs/1611.01224)
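
Two ideas recur across the continuous-control entries above: the soft (Polyak) target-network update described in the DDPG paper, and the clipped double-Q target described in the TD3 paper. The sketch below shows both in plain Python. It is not this repository's code; `soft_update`, `clipped_double_q_target`, and the default `tau` and `gamma` values are assumptions made for illustration, and TD3's target-policy smoothing noise is omitted.

```python
def soft_update(target_params, online_params, tau=0.005):
    """Polyak averaging: target <- (1 - tau) * target + tau * online.

    Both arguments are flat lists of floats standing in for network weights.
    """
    return [(1.0 - tau) * t + tau * o
            for t, o in zip(target_params, online_params)]


def clipped_double_q_target(reward, next_q1, next_q2, done, gamma=0.99):
    """TD3-style target: bootstrap from the smaller of two critic estimates."""
    # No bootstrapping from a terminal state.
    bootstrap = 0.0 if done else min(next_q1, next_q2)
    return reward + gamma * bootstrap


if __name__ == "__main__":
    print(soft_update([0.0, 1.0], [1.0, 1.0], tau=0.5))
    print(clipped_double_q_target(1.0, 2.0, 3.0, done=False, gamma=0.5))
```

A small `tau` makes the target network trail the online network slowly, and taking the minimum of the two critics is TD3's way of countering value overestimation.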

## blog posts

- [Deep Q-Network Implementation with SONY’s NNabla](https://towardsdatascience.com/deep-q-network-implementation-with-sonys-nnabla-490d945deb8e)