https://github.com/soumik12345/twin-delayed-ddpg
PyTorch implementation of Twin Delayed Deep Deterministic Policy Gradients for continuous control
box2d deeplearning openai-gym pytorch reinforcement-learning
- Host: GitHub
- URL: https://github.com/soumik12345/twin-delayed-ddpg
- Owner: soumik12345
- Created: 2020-03-17T13:09:32.000Z (about 5 years ago)
- Default Branch: master
- Last Pushed: 2020-08-16T16:01:26.000Z (almost 5 years ago)
- Last Synced: 2025-05-12T13:12:36.672Z (about 1 month ago)
- Topics: box2d, deeplearning, openai-gym, pytorch, reinforcement-learning
- Language: Jupyter Notebook
- Homepage: https://arxiv.org/abs/1802.09477
- Size: 8.54 MB
- Stars: 12
- Watchers: 3
- Forks: 5
- Open Issues: 0
Metadata Files:
- Readme: README.md
README
# Twin Delayed DDPG
PyTorch implementation of the Twin Delayed Deep Deterministic Policy Gradient (TD3) algorithm for continuous control, as described in the paper [Addressing Function Approximation Error in Actor-Critic Methods](https://arxiv.org/abs/1802.09477) by Scott Fujimoto, Herke van Hoof, and David Meger.
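
For orientation, the three ideas the paper adds on top of DDPG (clipped double-Q targets, target policy smoothing, and delayed policy updates) can be sketched for a single transition in plain Python. This is only an illustrative sketch and not the code in this repository: the function names are hypothetical, the default hyperparameter values are assumptions taken from commonly used TD3 settings, and a real implementation would operate on batched PyTorch tensors.

```python
import random


def clip(x, lo, hi):
    """Clamp a scalar to the closed interval [lo, hi]."""
    return max(lo, min(hi, x))


def smoothed_target_action(target_action, max_action,
                           policy_noise=0.2, noise_clip=0.5, rng=random):
    """Target policy smoothing: perturb the target actor's action with clipped Gaussian noise.

    `policy_noise` and `noise_clip` are assumed defaults, not values read from this repo.
    """
    noise = clip(rng.gauss(0.0, policy_noise), -noise_clip, noise_clip)
    return clip(target_action + noise, -max_action, max_action)


def td3_target(reward, done, q1_next, q2_next, discount=0.99):
    """Clipped double-Q target: bootstrap from the smaller of the two target critics."""
    return reward + (1.0 - float(done)) * discount * min(q1_next, q2_next)


def should_update_actor(step, policy_delay=2):
    """Delayed policy updates: refresh the actor once every `policy_delay` critic updates."""
    return step % policy_delay == 0
```

Taking the minimum of the two critic estimates counteracts the overestimation bias that motivates the paper, while the delayed actor update lets the critics settle before the policy is moved.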
## Results
### BipedalWalker-v3
**Environment Link:** [https://gym.openai.com/envs/BipedalWalker-v2/](https://gym.openai.com/envs/BipedalWalker-v2/)
**Mean Reward:** `295.263390447903`, averaged over `20` evaluation episodes.
Experiment conducted on a **Free-P5000** instance provided by [Paperspace Gradient](https://gradient.paperspace.com).

### LunarLanderContinuous-v2
**Environment Link:** [https://gym.openai.com/envs/LunarLanderContinuous-v2/](https://gym.openai.com/envs/LunarLanderContinuous-v2/)
**Mean Reward:** `272.55341062406666`, averaged over `20` evaluation episodes.
Experiment conducted on a **Free-P5000** instance provided by [Paperspace Gradient](https://gradient.paperspace.com).
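
A mean-reward figure like the ones reported above could be computed with an evaluation loop along the following lines. This is a minimal sketch, assuming the reported number is the average undiscounted episode return and that the environment follows the classic Gym interface (`reset()` returns an observation, `step(action)` returns `(obs, reward, done, info)`). `evaluate_policy` and `ToyEnv` are hypothetical names introduced for illustration and are not taken from this repository.

```python
class ToyEnv:
    """Hypothetical stand-in environment: each episode lasts `horizon` steps, reward 1.0 per step."""

    def __init__(self, horizon=5):
        self.horizon = horizon
        self.t = 0

    def reset(self):
        self.t = 0
        return 0.0  # dummy observation

    def step(self, action):
        self.t += 1
        done = self.t >= self.horizon
        return float(self.t), 1.0, done, {}


def evaluate_policy(env, policy, episodes=20):
    """Return the mean undiscounted episode return over `episodes` rollouts."""
    total = 0.0
    for _ in range(episodes):
        obs, done = env.reset(), False
        while not done:
            # Act deterministically: no exploration noise during evaluation.
            obs, reward, done, _ = env.step(policy(obs))
            total += reward
    return total / episodes


# Usage with the toy environment and a constant policy.
mean_return = evaluate_policy(ToyEnv(horizon=5), policy=lambda obs: 0.0, episodes=20)  # 5.0
```

With a trained agent, `policy` would wrap the actor network's forward pass and `env` would be the Gym environment named in the section heading.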

## Reference
```
@misc{1802.09477,
Author = {Scott Fujimoto and Herke van Hoof and David Meger},
Title = {Addressing Function Approximation Error in Actor-Critic Methods},
Year = {2018},
Eprint = {arXiv:1802.09477},
}
```