https://github.com/djbyrne/td3
Implementation of the TD3 algorithm written in PyTorch
- Host: GitHub
- URL: https://github.com/djbyrne/td3
- Owner: djbyrne
- Created: 2019-06-05T05:34:14.000Z (over 5 years ago)
- Default Branch: master
- Last Pushed: 2022-12-08T05:13:08.000Z (about 2 years ago)
- Last Synced: 2023-08-01T12:17:04.871Z (over 1 year ago)
- Language: Jupyter Notebook
- Size: 12.4 MB
- Stars: 11
- Watchers: 2
- Forks: 4
- Open Issues: 11
Metadata Files:
- Readme: README.md
# TD3
[![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/drive/1foJRBKv0ymV7I5cpmQ6bI4v6CyJ-qATs)
An implementation of the TD3 algorithm in PyTorch, trained on the Roboschool HalfCheetah environment. The code is based on the original TD3 authors' implementation, found [here](https://github.com/sfujim/TD3).
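For readers new to TD3, the sketch below illustrates how the critic's learning target is typically formed, using target policy smoothing and the minimum of two target critics. It is a simplified, scalar-action illustration in plain Python and is not taken from this repository's notebook. The names `td3_target`, `actor_target`, `critic1_target` and `critic2_target` are hypothetical stand-ins, and the default values are assumed from the reference implementation rather than read from the notebook.

```python
import random


def clip(value, low, high):
    """Clamp a scalar to the closed interval [low, high]."""
    return max(low, min(high, value))


def td3_target(reward, done, next_state,
               actor_target, critic1_target, critic2_target,
               gamma=0.99, policy_noise=0.2, noise_clip=0.5,
               max_action=1.0, rng=random):
    """Return the TD3 critic target for one transition (scalar action).

    All callables are hypothetical stand-ins for the target networks.
    """
    # Target policy smoothing: perturb the target action with clipped noise.
    noise = clip(rng.gauss(0.0, policy_noise), -noise_clip, noise_clip)
    next_action = clip(actor_target(next_state) + noise, -max_action, max_action)

    # Clipped double-Q: use the smaller of the two target critic estimates.
    next_q = min(critic1_target(next_state, next_action),
                 critic2_target(next_state, next_action))

    # Bootstrapped target; terminal transitions do not bootstrap.
    return reward + gamma * (1.0 - float(done)) * next_q


# Toy usage with constant stand-in functions (not real networks).
target = td3_target(
    reward=1.0, done=False, next_state=0.0,
    actor_target=lambda s: 0.0,
    critic1_target=lambda s, a: 5.0,
    critic2_target=lambda s, a: 3.0,
)
```

TD3 also updates the actor less often than the critics; that delayed update is left out here to keep the sketch short.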
## Getting Started
These instructions show how to set up a conda environment with all the requirements for the project.
### Installing
```
conda create -n rl_dev python=3.6
conda activate rl_dev
git clone https://github.com/djbyrne/TD3.git
cd TD3
python setup.py install
jupyter notebook
```
### Results
The notebook uses the same hyperparameters and architecture described in the paper. The agent is trained for 5 million timesteps and converged on a successful policy after 500k timesteps. The results below show the agent's average score over the previous 100 episodes.
As you can see, the agent learned rapidly and then briefly fell into a local optimum, but it was able to recover quickly. I believe that with hyperparameter tuning and a proper sample of trained agents, the results could still improve.
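The 100-episode average mentioned above can be tracked with a fixed-size window. This is a minimal sketch of one way to do it, not the notebook's actual logging code; the `RollingAverage` class is a hypothetical helper and the scores are placeholders, not real results.

```python
from collections import deque


class RollingAverage:
    """Running mean over the most recent `window` episode scores."""

    def __init__(self, window=100):
        # A deque with maxlen drops the oldest score once the window is full.
        self.scores = deque(maxlen=window)

    def add(self, episode_score):
        """Record a new episode score and return the current windowed mean."""
        self.scores.append(episode_score)
        return sum(self.scores) / len(self.scores)


tracker = RollingAverage(window=100)
for episode_score in [10.0, 20.0, 30.0]:  # placeholder scores for illustration
    avg_score = tracker.add(episode_score)
```

Until 100 episodes have been recorded, the mean is taken over however many scores are available.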
## Acknowledgments
* Scott Fujimoto [TD3](https://github.com/sfujim/TD3)
* OpenAI [Spinning Up](https://github.com/openai/spinningup)
* OpenAI [Baselines](https://github.com/openai/baselines)