https://github.com/soumik12345/ddpg

PyTorch implementation of Deep Deterministic Policy Gradients for Continuous Control

Host: GitHub
URL: https://github.com/soumik12345/ddpg
Owner: soumik12345
Created: 2020-03-15T10:42:43.000Z (almost 5 years ago)
Default Branch: master
Last Pushed: 2022-12-08T03:48:32.000Z (about 2 years ago)
Last Synced: 2024-10-24T14:28:03.688Z (about 2 months ago)
Topics: bipedalwalker, deep-deterministic-policy-gradient, lunarlander, openai-gym, pytorch, reinforcement-learning
Language: Jupyter Notebook
Homepage: https://arxiv.org/abs/1509.02971
Size: 9.4 MB
Stars: 26
Watchers: 2
Forks: 6
Open Issues: 6
Metadata Files:
- Readme: README.md

# Deep Deterministic Policy Gradients

[![HitCount](http://hits.dwyl.com/soumik12345/DDPG.svg)](http://hits.dwyl.com/soumik12345/DDPG)

PyTorch implementation of the Deep Deterministic Policy Gradients (DDPG) algorithm for continuous control, as described in the paper [Continuous control with deep reinforcement learning](https://arxiv.org/abs/1509.02971) by Timothy P. Lillicrap, Jonathan J. Hunt, Alexander Pritzel, Nicolas Heess, Tom Erez, Yuval Tassa, David Silver, and Daan Wierstra.
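For orientation, the two update rules at the heart of DDPG can be sketched in a few lines of plain Python. This is an illustrative sketch, not code from this repository: the function names `td_target` and `soft_update` and the dictionary-of-lists parameter layout are hypothetical, and the `done` mask is a common implementation convention rather than something stated in this README. The defaults `GAMMA = 0.99` and `TAU = 0.001` are the values reported in the paper.

```python
from typing import Dict, List

GAMMA = 0.99   # discount factor reported in the DDPG paper
TAU = 0.001    # soft target update rate reported in the DDPG paper


def td_target(reward: float, next_q: float, done: bool, gamma: float = GAMMA) -> float:
    """Critic regression target: y = r + gamma * Q'(s', mu'(s')).

    `next_q` is the target critic's value for the target actor's action
    in the next state. Bootstrapping is skipped at terminal states
    (an assumed convention, see the note above).
    """
    return reward + gamma * (0.0 if done else next_q)


def soft_update(
    target: Dict[str, List[float]],
    online: Dict[str, List[float]],
    tau: float = TAU,
) -> None:
    """In-place Polyak averaging: theta_target <- tau * theta + (1 - tau) * theta_target."""
    for name, online_values in online.items():
        target[name] = [
            tau * w + (1.0 - tau) * w_target
            for w, w_target in zip(online_values, target[name])
        ]
```

In a PyTorch implementation the same averaging would typically be applied to each parameter tensor of the target actor and target critic after every gradient step.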

## Results

### BipedalWalker-V3

**Environment Link:** [https://gym.openai.com/envs/BipedalWalker-v2/](https://gym.openai.com/envs/BipedalWalker-v2/)

**Mean Reward:** `169.5047038212551`, averaged over `20` evaluation episodes.

Experiment conducted on a **Free-P5000** instance provided by [Paperspace Gradient](https://gradient.paperspace.com).

![](./Results/BipedalWalker-V3.gif)

### LunarLanderContinuous-V2

**Environment Link:** [https://gym.openai.com/envs/LunarLanderContinuous-v2/](https://gym.openai.com/envs/LunarLanderContinuous-v2/)

**Mean Reward:** `277.938417002226`, averaged over `20` evaluation episodes.

Experiment conducted on a **Free-P5000** instance provided by [Paperspace Gradient](https://gradient.paperspace.com).

![](./Results/LunarLanderContinuous-V2.gif)
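The mean rewards above are averages of the total episode reward over 20 evaluation episodes. A minimal sketch of such an evaluation loop follows, under stated assumptions: `ToyEnv` is a hypothetical stand-in with a simplified `reset`/`step` interface (it is not a Gym class, and the real Gym `step` returns more values), and `evaluate` is an illustrative helper rather than a function from this repository.

```python
import statistics
from typing import Callable, List, Tuple


class ToyEnv:
    """Hypothetical stand-in for a continuous-control environment."""

    def __init__(self, horizon: int = 5) -> None:
        self.horizon = horizon
        self.t = 0

    def reset(self) -> float:
        self.t = 0
        return 0.0

    def step(self, action: float) -> Tuple[float, float, bool]:
        self.t += 1
        reward = 1.0 - abs(action)  # highest reward when the action is 0
        done = self.t >= self.horizon
        return float(self.t), reward, done


def evaluate(env, policy: Callable[[float], float], episodes: int = 20) -> float:
    """Run the policy for `episodes` episodes and return the mean episode return.

    The policy is called as-is, so pass the deterministic actor here
    (without the exploration noise used during training).
    """
    returns: List[float] = []
    for _ in range(episodes):
        state, done, total = env.reset(), False, 0.0
        while not done:
            state, reward, done = env.step(policy(state))
            total += reward
        returns.append(total)
    return statistics.mean(returns)
```

For example, `evaluate(ToyEnv(), lambda s: 0.0)` averages the returns of 20 episodes of the toy environment under a policy that always outputs `0.0`.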

## Reference

```
@misc{1509.02971,
    Author = {Timothy P. Lillicrap and Jonathan J. Hunt and Alexander Pritzel and Nicolas Heess and Tom Erez and Yuval Tassa and David Silver and Daan Wierstra},
    Title = {Continuous control with deep reinforcement learning},
    Year = {2015},
    Eprint = {arXiv:1509.02971},
}
```