https://github.com/ttitcombe/dqn
PyTorch implementation of Deep Q Learning
- Host: GitHub
- URL: https://github.com/ttitcombe/dqn
- Owner: TTitcombe
- License: mit
- Created: 2019-07-17T07:25:23.000Z (over 5 years ago)
- Default Branch: master
- Last Pushed: 2020-01-06T22:06:34.000Z (almost 5 years ago)
- Last Synced: 2024-11-07T06:32:53.337Z (about 2 months ago)
- Topics: cartpole, deep-learning, deep-q-learning, deep-q-network, deep-reinforcement-learning, dqn, dqn-pytorch, openai-gym, pytorch, pytorch-implementation, reinforcement-learning
- Language: Python
- Homepage:
- Size: 24.2 MB
- Stars: 2
- Watchers: 1
- Forks: 2
- Open Issues: 0
Metadata Files:
- Readme: README.md
- License: LICENSE
README
# DQN
![DDQN CarRacing](results/videos/carracing_good.gif)
This is a basic implementation of Deep Q Learning. We have implemented linear and convolutional models, trained with the DQN and double DQN (DDQN) algorithms.

Next steps:
* Model trained on Atari games
* Prioritized Experience Replay

## To run
To begin, set up [OpenAI gym](https://gym.openai.com/) and install the packages in `requirements.txt`.

We have an example script which trains a model on the CartPoleSwingUp environment (this requires gym <= 0.9.4).
Run `python -m examples.cartpoleswingup_linear` in the top-level directory.

(To run Box2D environments, I used [this Docker container](https://github.com/TTitcombe/docker_openai_gym) - check it out if you are also having problems installing Gym)

## Results
The best models trained on each env are present in `results/models/`. There you will find the saved PyTorch model as a `.pth` file and a graph comparing the reward per episode against random play.

**Linear models**
| Model | Env | Score |
|-------|-----------------|:---------------:|
| DQN | CartPole-v1 | 500 |
| DQN | CartPoleSwingUp\* | 872 +/- 3 |

You can see how dependent the linear model is on various hyperparameters in the following graph:
![DQN linear investigation](results/CartpoleSwingUp_investigation.png)
\* *Note: While we can train high-performing models on CartPoleSwingUp, these are very unstable, even when training for millions of frames and with a large (100000) capacity memory. It is not clear why this is the case.*

**Conv model**
| Model | Env | Score |
|-------|-----|:-----:|

![DQN_CartPoleSwingUp_Example](results/videos/double_dqn_cartpoleswingup.gif)
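For readers new to the terms used in this README, here is a minimal, framework-free sketch of two ideas mentioned above: a fixed-capacity replay memory and the difference between the DQN and double DQN targets. It is not the code in this repository. The names `ReplayMemory`, `dqn_target` and `double_dqn_target`, and the default `gamma=0.99`, are assumptions made for illustration, and Q-values are plain Python lists rather than PyTorch tensors.

```python
import random
from collections import deque


class ReplayMemory:
    """Hypothetical fixed-capacity buffer; the oldest transitions are dropped first."""

    def __init__(self, capacity=100_000):  # 100000 is the capacity quoted in the note above
        self.buffer = deque(maxlen=capacity)

    def push(self, state, action, reward, next_state, done):
        self.buffer.append((state, action, reward, next_state, done))

    def sample(self, batch_size):
        # Uniform sampling without replacement (not prioritized replay).
        return random.sample(list(self.buffer), batch_size)

    def __len__(self):
        return len(self.buffer)


def dqn_target(reward, done, next_q_target, gamma=0.99):
    """DQN: the target network both selects and evaluates the next action."""
    if done:
        return reward
    return reward + gamma * max(next_q_target)


def double_dqn_target(reward, done, next_q_online, next_q_target, gamma=0.99):
    """Double DQN: the online network selects the action, the target network evaluates it."""
    if done:
        return reward
    best_action = max(range(len(next_q_online)), key=next_q_online.__getitem__)
    return reward + gamma * next_q_target[best_action]
```

With `next_q_online = [1.0, 3.0]` and `next_q_target = [2.0, 0.5]`, the DQN target uses the target network's own maximum (2.0), whereas the double DQN target uses the target network's value for the action the online network prefers (0.5). This decoupling is how double DQN aims to reduce overestimation of Q-values.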