https://github.com/miyosuda/async_deep_reinforce
  
  
    Asynchronous Methods for Deep Reinforcement Learning 
  
a3c deep-learning reinforcement-learning tensorflow
    
Asynchronous Methods for Deep Reinforcement Learning
- Host: GitHub
- URL: https://github.com/miyosuda/async_deep_reinforce
- Owner: miyosuda
- License: apache-2.0
- Created: 2016-04-10T16:10:40.000Z (over 9 years ago)
- Default Branch: master
- Last Pushed: 2018-08-09T09:29:30.000Z (about 7 years ago)
- Last Synced: 2024-08-03T23:03:32.554Z (about 1 year ago)
- Topics: a3c, deep-learning, reinforcement-learning, tensorflow
- Language: Python
- Homepage:
- Size: 299 KB
- Stars: 590
- Watchers: 49
- Forks: 194
- Open Issues: 36
- Metadata Files:
  - Readme: README.md
  - License: LICENSE.txt
 
Awesome Lists containing this project
- Github-Repositories - Asynchronous Methods for Deep Reinforcement Learning
# async_deep_reinforce
Asynchronous deep reinforcement learning
## About
An attempt to reproduce Google DeepMind's paper "Asynchronous Methods for Deep Reinforcement Learning."
http://arxiv.org/abs/1602.01783
The Asynchronous Advantage Actor-Critic (A3C) method is implemented with TensorFlow to play "Atari Pong".
Both A3C-FF and A3C-LSTM are implemented.
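
As a rough illustration of the quantity A3C trains on, here is a minimal sketch of the n-step discounted return and advantage for one short rollout, following the description in the paper. It is not taken from this repository: the function names `n_step_returns` and `advantages`, and the discount value used in the example, are assumptions made for illustration.

```python
def n_step_returns(rewards, bootstrap_value, gamma=0.99):
    """Discounted returns for one rollout, accumulated from the last step backwards.

    rewards:         rewards r_t collected during the rollout, oldest first
    bootstrap_value: value estimate V(s) of the state after the last step
                     (0.0 if the episode ended there)
    gamma:           discount factor (hypothetical default)
    """
    returns = []
    running = bootstrap_value
    for reward in reversed(rewards):
        running = reward + gamma * running
        returns.append(running)
    returns.reverse()
    return returns


def advantages(returns, values):
    """Advantage A_t = R_t - V(s_t), used to weight the policy gradient."""
    return [ret - val for ret, val in zip(returns, values)]


# Tiny example with an easy-to-check discount factor.
rollout_returns = n_step_returns([1.0, 0.0, 1.0], bootstrap_value=0.0, gamma=0.5)
print(rollout_returns)                                # [1.25, 0.5, 1.0]
print(advantages(rollout_returns, [1.0, 0.5, 0.5]))   # [0.25, 0.0, 0.5]
```

Each worker thread would compute these values on its own rollout before applying gradients to the shared network; the rollout length corresponds to the paper's `t_max`.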
The learned gameplay after 26 hours of training (A3C-FF) looks like this:
https://youtu.be/ZU71YdAedZs
Any advice or suggestions are very welcome in the issues thread.
https://github.com/miyosuda/async_deep_reinforce/issues/1
## How to build
First, build the multi-thread-ready version of the Arcade Learning Environment.
I modified it so that it can run in a multi-threaded environment.
    $ git clone https://github.com/miyosuda/Arcade-Learning-Environment.git
    $ cd Arcade-Learning-Environment
    $ cmake -DUSE_SDL=ON -DUSE_RLGLUE=OFF -DBUILD_EXAMPLES=OFF .
    $ make -j 4
	
    $ pip install .
I recommend installing it in a virtualenv environment.
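
After the install step, a quick way to confirm that the Python bindings are visible to the active environment is to look the module up without importing it. This is a minimal sketch: the module name `ale_python_interface` is an assumption based on the upstream Arcade Learning Environment of that period, and `is_installed` is a hypothetical helper.

```python
import importlib.util


def is_installed(module_name):
    """Return True if a top-level module can be found in the current environment."""
    return importlib.util.find_spec(module_name) is not None


# Assumed module name for the ALE Python bindings; adjust if the fork differs.
ALE_MODULE = "ale_python_interface"

if is_installed(ALE_MODULE):
    print("ALE bindings found")
else:
    print("ALE bindings not found; check that 'pip install .' ran in this environment")
```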
## How to run
To train,
    $ python a3c.py
To display the result with game play,
    $ python a3c_disp.py
## Using GPU
To enable the GPU, change the "USE_GPU" flag in "constants.py".
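
A flag like this usually just selects the TensorFlow device string under which the graph is built. The following is a minimal sketch under that assumption; `select_device` is a hypothetical helper and the code is not copied from the repository.

```python
USE_GPU = True  # stand-in for the flag in constants.py


def select_device(use_gpu):
    """Map the boolean flag to a TensorFlow device string (hypothetical helper)."""
    return "/gpu:0" if use_gpu else "/cpu:0"


device = select_device(USE_GPU)
# With TensorFlow installed, graph construction could then be wrapped as:
#   with tf.device(device):
#       ...build the network...
print(device)
```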
When running with 8 parallel game environments, the speeds on GPU (GTX980Ti) and CPU (Core i7 6700) were as follows (recorded with LOCAL_T_MAX=20).
|type | A3C-FF             |A3C-LSTM          |
|-----|--------------------|------------------|
| GPU | 1722 steps per sec |864 steps per sec |
| CPU | 1077 steps per sec |540 steps per sec |
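
To put the table in perspective, the short sketch below computes the GPU-over-CPU speedup from the figures above and estimates wall-clock time for a given number of steps. The helper names and the 10-million-step example are illustrative assumptions, not numbers from this project.

```python
# Throughput figures copied from the table above (steps per second).
STEPS_PER_SEC = {
    "A3C-FF": {"GPU": 1722, "CPU": 1077},
    "A3C-LSTM": {"GPU": 864, "CPU": 540},
}


def gpu_speedup(model):
    """Ratio of GPU throughput to CPU throughput for one model type."""
    rates = STEPS_PER_SEC[model]
    return rates["GPU"] / rates["CPU"]


def hours_for_steps(model, device, total_steps):
    """Estimated wall-clock hours to run total_steps at the measured rate."""
    return total_steps / STEPS_PER_SEC[model][device] / 3600.0


for model in STEPS_PER_SEC:
    print("{}: GPU is {:.2f}x faster than CPU".format(model, gpu_speedup(model)))

# Hypothetical budget of 10 million steps for A3C-FF on the GPU.
print("{:.1f} hours".format(hours_for_steps("A3C-FF", "GPU", 10000000)))
```

For both model types the GPU comes out at roughly 1.6 times the CPU throughput.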
## Result
Score plots of the local threads for Pong were as follows (with GTX980Ti).
### A3C-LSTM LOCAL_T_MAX = 5

### A3C-LSTM LOCAL_T_MAX = 20

Unlike in the original paper, scores are not averaged using the global network.
## Requirements
- TensorFlow r1.0
- numpy
- cv2
- matplotlib
## References
This project uses the settings described in muupan's wiki: [muupan/async-rl](https://github.com/muupan/async-rl/wiki)
## Acknowledgements
- [@aravindsrinivas](https://github.com/aravindsrinivas) for providing information on some of the hyperparameters.