https://github.com/snowkylin/async_rl

TensorFlow implementation of asynchronous 1-step Q-learning from "Asynchronous Methods for Deep Reinforcement Learning", with an improved weight update process (using minibatches) to speed up training.

asynchronous-methods deep-reinforcement-learning mini-batch tensorflow

Host: GitHub
URL: https://github.com/snowkylin/async_rl
Owner: snowkylin
Created: 2017-01-23T12:36:26.000Z (over 9 years ago)
Default Branch: master
Last Pushed: 2017-01-31T12:32:35.000Z (over 9 years ago)
Last Synced: 2025-06-04T00:02:54.041Z (about 1 year ago)
Topics: asynchronous-methods, deep-reinforcement-learning, mini-batch, tensorflow
Language: Python
Homepage:
Size: 124 MB
Stars: 5
Watchers: 3
Forks: 3
Open Issues: 0
Metadata Files:
- Readme: readme.md

README

# Play Atari Games with TensorFlow and Asynchronous RL

This is a TensorFlow implementation of asynchronous 1-step Q-learning with an improved weight update process (using minibatches) to speed up training. The algorithm can be found in [Asynchronous Methods for Deep Reinforcement Learning](https://arxiv.org/abs/1602.01783).
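
As a rough illustration of the idea, the sketch below computes 1-step Q-learning targets, `y = r + gamma * max_a' Q(s', a')`, for a small minibatch of transitions. It is not taken from this repository: the function name `one_step_q_targets`, the discount `gamma=0.99`, and the sample values are assumptions made for the example, and plain Python lists stand in for the network's outputs.

```python
def one_step_q_targets(rewards, terminals, next_q_values, gamma=0.99):
    """Return the 1-step Q-learning target for each transition in a minibatch.

    rewards:       list of float, reward received after each action
    terminals:     list of bool, True if the episode ended on that step
    next_q_values: list of lists, Q(s', a') for every action a' in the next state
                   (hypothetical stand-in for the target network's output)
    gamma:         discount factor (assumed value, not from the repository)
    """
    targets = []
    for reward, done, q_next in zip(rewards, terminals, next_q_values):
        if done:
            # No future return after a terminal state.
            targets.append(reward)
        else:
            # Bootstrap from the best action value in the next state.
            targets.append(reward + gamma * max(q_next))
    return targets


# Two made-up transitions: one mid-episode, one terminal.
rewards = [1.0, 0.0]
terminals = [False, True]
next_q_values = [[0.5, 2.0], [3.0, 1.0]]
targets = one_step_q_targets(rewards, terminals, next_q_values)
print(targets)
```

Under this reading, the squared error between these targets and the network's predicted Q-values for the chosen actions would be minimized over the whole minibatch in one update step, rather than one transition at a time.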

## Demo

[![Play Flappy Bird with TensorFlow ](https://img.youtube.com/vi/ZxHAf5BM0QM/0.jpg)](https://www.youtube.com/watch?v=ZxHAf5BM0QM)

## Dependencies

* Python

* TensorFlow

* gym (with the Atari environments)

* OpenCV-Python

## Usage

Run `play.py` to watch the trained network play an Atari game (the default is Breakout-v0).

Run `train.py` to train the network on your computer.

You should get a comparatively good result (a score of 40+) once t exceeds 2000000. On my computer (i5-4590/16GB/GTX 1060 6GB), training takes at least 2-3 hours.

## Evaluation

You can find the evaluation at https://gym.openai.com/evaluations/eval_03aUUz45Sc6TBg0vifljwA ; the network for that run took 40 hours to train.

## Credit

* [coreylynch/async-rl](https://github.com/coreylynch/async-rl)

* [yenchenlin/DeepLearningFlappyBird](https://github.com/yenchenlin/DeepLearningFlappyBird)
