https://github.com/soumik12345/ddpg
PyTorch implementation of Deep Deterministic Policy Gradients for continuous control
- Host: GitHub
- URL: https://github.com/soumik12345/ddpg
- Owner: soumik12345
- Created: 2020-03-15T10:42:43.000Z (almost 5 years ago)
- Default Branch: master
- Last Pushed: 2022-12-08T03:48:32.000Z (about 2 years ago)
- Last Synced: 2024-10-24T14:28:03.688Z (about 2 months ago)
- Topics: bipedalwalker, deep-deterministic-policy-gradient, lunarlander, openai-gym, pytorch, reinforcement-learning
- Language: Jupyter Notebook
- Homepage: https://arxiv.org/abs/1509.02971
- Size: 9.4 MB
- Stars: 26
- Watchers: 2
- Forks: 6
- Open Issues: 6
Metadata Files:
- Readme: README.md
# Deep Deterministic Policy Gradients
PyTorch implementation of the Deep Deterministic Policy Gradients (DDPG) algorithm for continuous control, as described in the paper [Continuous control with deep reinforcement learning](https://arxiv.org/abs/1509.02971) by Timothy P. Lillicrap, Jonathan J. Hunt, Alexander Pritzel, Nicolas Heess, Tom Erez, Yuval Tassa, David Silver, and Daan Wierstra.
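As a rough orientation, two building blocks of DDPG from the paper are the bootstrapped critic target and the soft (Polyak) update of the target networks. The sketch below uses plain Python numbers and lists instead of PyTorch tensors so it runs without dependencies. The function names `td_target` and `soft_update` are illustrative assumptions and are not taken from this repository; the defaults `gamma=0.99` and `tau=0.001` follow the values reported in the paper.

```python
# Illustrative sketch only: td_target and soft_update are hypothetical names,
# not functions from this repository.

def td_target(reward, next_q, done, gamma=0.99):
    """Critic target y = r + gamma * Q'(s', mu'(s')).

    next_q stands for the target critic's value of the target actor's action
    in the next state. Terminal transitions do not bootstrap.
    """
    return reward + gamma * (0.0 if done else next_q)


def soft_update(target_params, online_params, tau=0.001):
    """Polyak averaging: theta_target <- tau * theta + (1 - tau) * theta_target."""
    return [
        tau * online + (1.0 - tau) * target
        for target, online in zip(target_params, online_params)
    ]
```

In a PyTorch implementation the same two formulas would typically be applied to tensors and to each parameter of the actor and critic networks; how this repository organizes that is not shown here.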
## Results
### BipedalWalker-V3
**Environment Link:** [https://gym.openai.com/envs/BipedalWalker-v2/](https://gym.openai.com/envs/BipedalWalker-v2/)
**Mean Reward:** `169.5047038212551`, averaged over `20` evaluation episodes.
Experiment conducted on a **Free-P5000** instance provided by [Paperspace Gradient](gradient.paperspace.com).
![](./Results/BipedalWalker-V3.gif)
### LunarLanderContinuous-V2
**Environment Link:** [https://gym.openai.com/envs/LunarLanderContinuous-v2/](https://gym.openai.com/envs/LunarLanderContinuous-v2/)
**Mean Reward:** `277.938417002226`, averaged over `20` evaluation episodes.
Experiment conducted on a **Free-P5000** instance provided by [Paperspace Gradient](gradient.paperspace.com).
![](./Results/LunarLanderContinuous-V2.gif)
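The rewards reported above are taken over `20` evaluation episodes each. A minimal sketch of such an evaluation loop is shown below, assuming the reported figure is the mean of the per-episode total reward. `ToyEnv` is a hypothetical stand-in with a simplified `reset`/`step` interface so the example is self-contained; it is not the Gym API and not this repository's evaluation code.

```python
from statistics import mean


class ToyEnv:
    """Hypothetical stand-in environment: fixed-length episodes, reward = -|action|."""

    def __init__(self, horizon=5):
        self.horizon = horizon
        self.t = 0

    def reset(self):
        self.t = 0
        return 0.0  # dummy observation

    def step(self, action):
        self.t += 1
        reward = -abs(action)
        done = self.t >= self.horizon
        return 0.0, reward, done  # (observation, reward, done)


def evaluate(env, policy, episodes=20):
    """Run the policy for a number of episodes and return the mean episode return."""
    returns = []
    for _ in range(episodes):
        obs, done, total = env.reset(), False, 0.0
        while not done:
            obs, reward, done = env.step(policy(obs))
            total += reward
        returns.append(total)
    return mean(returns)
```

For a trained agent, `policy` would be the deterministic actor called without exploration noise, and `env` would be the real environment wrapped to match whatever interface the installed Gym version exposes.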
## Reference
```
@misc{1509.02971,
Author = {Timothy P. Lillicrap and Jonathan J. Hunt and Alexander Pritzel and Nicolas Heess and Tom Erez and Yuval Tassa and David Silver and Daan Wierstra},
Title = {Continuous control with deep reinforcement learning},
Year = {2015},
Eprint = {arXiv:1509.02971},
}
```