https://github.com/sthalles/asynchronous-advantage-actor-critic
- Host: GitHub
- URL: https://github.com/sthalles/asynchronous-advantage-actor-critic
- Owner: sthalles
- Created: 2018-06-25T22:52:01.000Z (over 7 years ago)
- Default Branch: master
- Last Pushed: 2018-06-26T00:50:09.000Z (over 7 years ago)
- Last Synced: 2025-04-13T22:40:53.836Z (6 months ago)
- Language: Python
- Size: 1.15 MB
- Stars: 1
- Watchers: 2
- Forks: 0
- Open Issues: 0
Metadata Files:
- Readme: Readme.md
README
# Asynchronous Advantage Actor-Critic (A3C) TensorFlow implementation
## Dependencies
- Python 3.x
- TensorFlow 1.8
- NumPy
- OpenAI Gym
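
The dependencies can typically be installed with pip; the exact versions below are assumptions (the repository does not pin them), and the `atari` extra is needed for games such as DemonAttack:

```
$ pip install tensorflow==1.8.0 "gym[atari]" numpy
```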
## How to run the training algorithm

To run the training with the default parameters:

```
$ python train_agent.py
```

- The final model metadata will be saved at: *./model/\/*
- TensorBoard summaries of the **average score per episode** during training, as well as the **losses** and **learning rate decay** curves, will be saved at: *./summary/\/train/*
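
The training curves can then be inspected by pointing TensorBoard (installed alongside TensorFlow) at the summary directory, for example:

```
$ tensorboard --logdir ./summary/
```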
## Training
The following graph shows the training performance (average score per episode) of our agent running the A3C algorithm on the DemonAttack game using the OpenAI Gym environment.

Next we see the behavior of the learning rate decay, the policy loss, the value loss, and the total loss (policy + value losses) during training. The formulas for the gradient computation can be found in
[Asynchronous Methods for Deep Reinforcement Learning](https://arxiv.org/abs/1602.01783).

## Evaluation
To evaluate a trained agent using the default parameters:
```
$ python test_agent.py
```

The following graph shows the average score per game during testing. The y-axis represents the total score achieved by the agent, while the x-axis shows the number of games played.
