Ecosyste.ms: Awesome
An open API service indexing awesome lists of open source software.
https://github.com/spragunr/deep_q_rl
Theano-based implementation of Deep Q-learning
https://github.com/spragunr/deep_q_rl
Last synced: 13 days ago
JSON representation
Theano-based implementation of Deep Q-learning
- Host: GitHub
- URL: https://github.com/spragunr/deep_q_rl
- Owner: spragunr
- License: bsd-3-clause
- Created: 2014-09-29T16:09:34.000Z (about 10 years ago)
- Default Branch: master
- Last Pushed: 2017-04-14T01:57:26.000Z (over 7 years ago)
- Last Synced: 2024-10-15T10:22:27.883Z (28 days ago)
- Language: Python
- Size: 1.3 MB
- Stars: 1,078
- Watchers: 98
- Forks: 348
- Open Issues: 17
-
Metadata Files:
- Readme: README.md
- License: LICENSE.txt
Awesome Lists containing this project
README
# Introduction
This package provides a Lasagne/Theano-based implementation of the deep
Q-learning algorithm described in:[Playing Atari with Deep Reinforcement Learning](http://arxiv.org/abs/1312.5602)
Volodymyr Mnih, Koray Kavukcuoglu, David Silver, Alex Graves, Ioannis
Antonoglou, Daan Wierstra, Martin Riedmillerand
Mnih, Volodymyr, et al. "Human-level control through deep reinforcement learning." Nature 518.7540 (2015): 529-533.
Here is a video showing a trained network playing breakout (using an earlier version of the code):
http://youtu.be/SZ88F82KLX4
# Dependencies
* A reasonably modern NVIDIA GPU
* OpenCV
* [Theano](http://deeplearning.net/software/theano/) ([https://github.com/Theano/Theano](https://github.com/Theano/Theano))
* [Lasagne](http://lasagne.readthedocs.org/en/latest/) ([https://github.com/Lasagne/Lasagne](https://github.com/Lasagne/Lasagne)
* [Pylearn2](http://deeplearning.net/software/pylearn2/) ([https://github.com/lisa-lab/pylearn2](https://github.com/lisa-lab/pylearn2))
* [Arcade Learning Environment](http://www.arcadelearningenvironment.org/) ([https://github.com/mgbellemare/Arcade-Learning-Environment](https://github.com/mgbellemare/Arcade-Learning-Environment))The script `dep_script.sh` can be used to install all dependencies under Ubuntu.
# Running
Use the scripts `run_nips.py` or `run_nature.py` to start all the necessary processes:
`$ ./run_nips.py --rom breakout`
`$ ./run_nature.py --rom breakout`
The `run_nips.py` script uses parameters consistent with the original
NIPS workshop paper. This code should take 2-4 days to complete. The
`run_nature.py` script uses parameters consistent with the Nature
paper. The final policies should be better, but it will take 6-10
days to finish training.Either script will store output files in a folder prefixed with the
name of the ROM. Pickled version of the network objects are stored
after every epoch. The file `results.csv` will contain the testing
output. You can plot the progress by executing `plot_results.py`:`$ python plot_results.py breakout_05-28-17-09_0p00025_0p99/results.csv`
After training completes, you can watch the network play using the
`ale_run_watch.py` script:`$ python ale_run_watch.py breakout_05-28-17-09_0p00025_0p99/network_file_99.pkl`
# Performance Tuning
## Theano Configuration
Setting `allow_gc=False` in `THEANO_FLAGS` or in the `.theanorc` file
significantly improves performance at the expense of a slight increase
in memory usage on the GPU.# Getting Help
The [deep Q-learning web-forum](https://groups.google.com/forum/#!forum/deep-q-learning)
can be used for discussion and advice related to deep Q-learning in
general and this package in particular.# See Also
* https://sites.google.com/a/deepmind.com/dqn
This is the code DeepMind used for the Nature paper. The license
only permits the code to be used for "evaluating and reviewing" the
claims made in the paper.* https://github.com/muupan/dqn-in-the-caffe
Working Caffe-based implementation. (I haven't tried it, but there
is a video of the agent playing Pong successfully.)* https://github.com/kristjankorjus/Replicating-DeepMind
Defunct? As far as I know, this package was never fully functional. The project is described here:
http://robohub.org/artificial-general-intelligence-that-plays-atari-video-games-how-did-deepmind-do-it/* https://github.com/brian473/neural_rl
This is an almost-working implementation developed during Spring
2014 by my student Brian Brown. I haven't reused his code, but
Brian and I worked together to puzzle through some of the blank
areas of the original paper.