https://github.com/carpedm20/naf-tensorflow

"Continuous Deep Q-Learning with Model-based Acceleration" in TensorFlow
continuous-rl deep-learning deep-reinforcement-learning gym reinforcement-learning tensorflow


# Normalized Advantage Functions (NAF) in TensorFlow

TensorFlow implementation of [Continuous Deep Q-Learning with Model-based Acceleration](http://arxiv.org/abs/1603.00748).

![algorithm](https://github.com/carpedm20/naf-tensorflow/blob/master/assets/algorithm.png)
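
As a rough illustration of the idea behind NAF, the paper writes the Q-function as a state value plus a quadratic advantage term, `Q(x, u) = V(x) - 0.5 * (u - mu(x))^T P(x) (u - mu(x))`, where `P(x) = L(x) L(x)^T` and `L(x)` is lower-triangular with an exponentiated diagonal. The sketch below evaluates that decomposition with plain NumPy for a single state. It is not code from this repository: the function names and the flat layout of the `L` entries are assumptions made for the example, and in the actual model `v`, `mu`, and the `L` entries would be network outputs.

```python
import numpy as np


def build_precision_matrix(l_flat, action_dim):
    """Build P = L L^T from a flat vector of lower-triangular entries.

    Hypothetical helper: `l_flat` holds the action_dim * (action_dim + 1) / 2
    entries of L in row-major order of the lower triangle.
    """
    L = np.zeros((action_dim, action_dim))
    L[np.tril_indices(action_dim)] = l_flat
    # Exponentiate the diagonal so it is strictly positive,
    # which makes P = L L^T positive definite.
    diag = np.diag_indices(action_dim)
    L[diag] = np.exp(L[diag])
    return L.dot(L.T)


def naf_q_value(v, mu, l_flat, u):
    """Return Q(x, u) = V(x) + A(x, u) for one state.

    v      -- scalar state value V(x)
    mu     -- greedy action mu(x), shape (action_dim,)
    l_flat -- flat lower-triangular entries of L(x)
    u      -- action to evaluate, shape (action_dim,)
    """
    mu = np.asarray(mu, dtype=float)
    u = np.asarray(u, dtype=float)
    P = build_precision_matrix(l_flat, mu.shape[0])
    diff = u - mu
    # The quadratic advantage is never positive and is zero at u == mu.
    advantage = -0.5 * diff.dot(P).dot(diff)
    return v + advantage
```

Because the advantage is zero at `u == mu` and negative everywhere else, the action that maximizes `Q` is simply `mu(x)`, which is what makes Q-learning tractable with continuous actions.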

## Requirements

- Python 2.7
- [gym](https://github.com/openai/gym)
- [TensorFlow](https://www.tensorflow.org/) 0.9+

## Usage

First, install prerequisites with:

    $ pip install tqdm gym[all]

To train a model for an environment with a continuous action space:

    $ python main.py --env_name=Pendulum-v0 --is_train=True
    $ python main.py --env_name=Pendulum-v0 --is_train=True --display=True

To test a trained model and record the screen with gym:

    $ python main.py --env_name=Pendulum-v0 --is_train=False
    $ python main.py --env_name=Pendulum-v0 --is_train=False --display=True

## Results

Training results on `Pendulum-v0` with different hyperparameters. The comment after each command gives the color of its curve in the plot:

    $ python main.py --env_name=Pendulum-v0 # dark green
    $ python main.py --env_name=Pendulum-v0 --action_fn=tanh # light green
    $ python main.py --env_name=Pendulum-v0 --use_batch_norm=True # yellow
    $ python main.py --env_name=Pendulum-v0 --use_seperate_networks=True # green

![Pendulum-v0_2016-07-15](https://github.com/carpedm20/naf-tensorflow/blob/master/assets/Pendulum-v0_2016-07-15.png)

## References

- [rllab](https://github.com/rllab/rllab.git)
- [Keras implementation](https://gym.openai.com/evaluations/eval_CzoNQdPSAm0J3ikTBSTCg)

## Author

Taehoon Kim / [@carpedm20](http://carpedm20.github.io/)