https://github.com/soumik12345/twin-delayed-ddpg

Pytorch Implementation of Twin Delayed Deep Deterministic Policy Gradients for Continuous Control

box2d deeplearning openai-gym pytorch reinforcement-learning

Host: GitHub
URL: https://github.com/soumik12345/twin-delayed-ddpg
Owner: soumik12345
Created: 2020-03-17T13:09:32.000Z (about 5 years ago)
Default Branch: master
Last Pushed: 2020-08-16T16:01:26.000Z (almost 5 years ago)
Last Synced: 2025-05-12T13:12:36.672Z (about 1 month ago)
Topics: box2d, deeplearning, openai-gym, pytorch, reinforcement-learning
Language: Jupyter Notebook
Homepage: https://arxiv.org/abs/1802.09477
Size: 8.54 MB
Stars: 12
Watchers: 3
Forks: 5
Open Issues: 0
Metadata Files:
- Readme: README.md

README

# Twin Delayed DDPG

PyTorch implementation of the Twin Delayed Deep Deterministic Policy Gradient (TD3) algorithm for continuous control, as described in the paper [Addressing Function Approximation Error in Actor-Critic Methods](https://arxiv.org/abs/1802.09477) by Scott Fujimoto, Herke van Hoof, and David Meger.
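
For orientation, the paper's three changes to DDPG are clipped double-Q learning, target policy smoothing, and delayed policy updates. The following is a minimal, framework-free sketch of the critic target and the update schedule, not this repository's code. The function names (`td3_target`, `should_update_actor`), the scalar action, and the callable stand-ins for the target networks are hypothetical and used only for illustration; the default hyperparameters follow the values commonly used with the paper.

```python
import random


def clip(x, low, high):
    """Clamp a scalar to the closed interval [low, high]."""
    return max(low, min(high, x))


def td3_target(reward, done, next_state, target_actor, target_q1, target_q2,
               max_action=1.0, gamma=0.99, policy_noise=0.2, noise_clip=0.5,
               rng=random):
    """Critic target for one transition (scalar action, illustrative only).

    `target_actor`, `target_q1`, and `target_q2` are hypothetical callables
    standing in for the target networks.
    """
    # Target policy smoothing: add clipped Gaussian noise to the target action.
    noise = clip(rng.gauss(0.0, policy_noise), -noise_clip, noise_clip)
    next_action = clip(target_actor(next_state) + noise, -max_action, max_action)
    # Clipped double-Q learning: use the smaller of the two target critic values.
    q_next = min(target_q1(next_state, next_action),
                 target_q2(next_state, next_action))
    # Bootstrapped target; terminal transitions do not bootstrap.
    return reward + gamma * (1.0 - float(done)) * q_next


def should_update_actor(step, policy_delay=2):
    """Delayed policy updates: update the actor every `policy_delay` critic steps."""
    return step % policy_delay == 0
```

In a full implementation both critics are regressed toward this shared target, and the actor and the target networks are updated only on the steps where `should_update_actor` returns `True`.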

## Results

### BipedalWalker-v3

**Environment Link:** [https://gym.openai.com/envs/BipedalWalker-v2/](https://gym.openai.com/envs/BipedalWalker-v2/)

**Mean Reward:** `295.263390447903`, averaged over `20` evaluation episodes.

Experiment conducted on a **Free-P5000** instance provided by [Paperspace Gradient](https://gradient.paperspace.com).

![](./Results/BipedalWalker-v3.gif)

### LunarLanderContinuous-v2

**Environment Link:** [https://gym.openai.com/envs/LunarLanderContinuous-v2/](https://gym.openai.com/envs/LunarLanderContinuous-v2/)

**Mean Reward:** `272.55341062406666`, averaged over `20` evaluation episodes.

Experiment conducted on a **Free-P5000** instance provided by [Paperspace Gradient](https://gradient.paperspace.com).

![](./Results/LunarLander-v2.gif)
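
The mean rewards above are reported over 20 evaluation episodes. A minimal sketch of how such a figure could be computed is shown below. It is not the notebook's evaluation code: `evaluate_policy` and `CountdownEnv` are hypothetical names, and the loop assumes the classic Gym interface in which `reset()` returns an observation and `step()` returns a 4-tuple `(observation, reward, done, info)`. Newer Gymnasium releases return a 5-tuple from `step()`, so the loop would need adapting there.

```python
def evaluate_policy(env, policy, episodes=20):
    """Return the mean undiscounted episode return over `episodes` rollouts."""
    returns = []
    for _ in range(episodes):
        state = env.reset()
        done = False
        total = 0.0
        while not done:
            # Act with the policy as given; no exploration noise is added here.
            state, reward, done, _info = env.step(policy(state))
            total += reward
        returns.append(total)
    return sum(returns) / len(returns)


class CountdownEnv:
    """Hypothetical stand-in environment: pays +1 per step for `horizon` steps."""

    def __init__(self, horizon=5):
        self.horizon = horizon
        self.t = 0

    def reset(self):
        self.t = 0
        return 0.0

    def step(self, action):
        self.t += 1
        return float(self.t), 1.0, self.t >= self.horizon, {}


mean_return = evaluate_policy(CountdownEnv(horizon=5), lambda state: 0.0, episodes=20)
```

With a real environment, `env` would be the Gym environment instance and `policy` a function mapping an observation to the trained actor's action.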

## Reference

```
@misc{1802.09477,
    Author = {Scott Fujimoto and Herke van Hoof and David Meger},
    Title = {Addressing Function Approximation Error in Actor-Critic Methods},
    Year = {2018},
    Eprint = {arXiv:1802.09477},
}
```
