# Asynchronous Advantage Actor-Critic (A3C) TensorFlow implementation

## Dependencies
- Python 3.x
- TensorFlow 1.8
- NumPy
- OpenAI Gym
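
If these are not already installed, something like the following should work (the package names and the `gym[atari]` extra are the standard PyPI ones; adjust versions to your setup):

```
$ pip install tensorflow==1.8.0 numpy "gym[atari]"
```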

## How to run the training algorithm
To run training with the default parameters:

```
$ python train_agent.py
```
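
A3C trains by running several worker threads, each with its own copy of the environment and a local network that periodically synchronizes with a shared global network. Below is a minimal sketch of that asynchronous structure; the identifiers (`worker`, `global_params`, `T_MAX`, `NUM_WORKERS`) are illustrative and are not the names used in `train_agent.py`:

```
import threading
import gym

GLOBAL_LOCK = threading.Lock()
global_params = {"step": 0}   # stands in for the shared network weights
T_MAX = 5                     # rollout length between updates, as in the paper
NUM_WORKERS = 4
MAX_STEPS = 10000

def worker(worker_id):
    env = gym.make("DemonAttack-v0")   # each worker owns its own environment
    state = env.reset()
    while global_params["step"] < MAX_STEPS:
        # 1) copy the shared parameters into a thread-local network (omitted)
        # 2) roll out up to T_MAX steps with the local policy
        steps = 0
        for _ in range(T_MAX):
            action = env.action_space.sample()   # placeholder for pi(a|s)
            state, reward, done, _ = env.step(action)
            steps += 1
            if done:
                state = env.reset()
                break
        # 3) compute gradients from the rollout and apply them to the shared
        #    parameters asynchronously; the lock only guards the update itself
        with GLOBAL_LOCK:
            global_params["step"] += steps

threads = [threading.Thread(target=worker, args=(i,)) for i in range(NUM_WORKERS)]
for t in threads:
    t.start()
for t in threads:
    t.join()
```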

- The final model metadata will be saved at:
*./model/\/*

- TensorBoard summaries of the **average score per episode** during training, as well as the **losses** and **learning rate decay** curves, will be saved at: *./summary/\/train/*
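
In TensorFlow 1.x, checkpoints and TensorBoard summaries like these are typically produced with `tf.train.Saver` and `tf.summary.FileWriter`. A minimal sketch follows; the paths and tensor names here are examples, not the exact ones the script builds:

```
import tensorflow as tf

x = tf.Variable(0.0, name="dummy")            # stand-in for the model
avg_score = tf.placeholder(tf.float32, [])
score_summary = tf.summary.scalar("average_score_per_episode", avg_score)

saver = tf.train.Saver()
writer = tf.summary.FileWriter("./summary/example/train")

with tf.Session() as sess:
    sess.run(tf.global_variables_initializer())
    s = sess.run(score_summary, feed_dict={avg_score: 123.0})
    writer.add_summary(s, global_step=0)      # shows up in TensorBoard
    saver.save(sess, "./model/example/model.ckpt")
```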

## Training

The following graph shows the training performance (average score per episode) of our agent running the A3C algorithm on the DemonAttack game using the OpenAI Gym environment.

![Average score per episode](./results/average_score_per_episode.png)

Next we see the behavior of the learning rate decay, policy loss, value loss, and total loss (policy + value losses) during training. The formulas for computing the gradients can be found in:

[Asynchronous Methods for Deep Reinforcement Learning](https://arxiv.org/abs/1602.01783)
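
For reference, a sketch of the losses behind those curves, following the paper: the policy gradient term weights the log-probability of each taken action by the advantage A = R - V(s), the value loss regresses V(s) toward the n-step return R, and an entropy bonus encourages exploration. The 0.5 value-loss weight and 0.01 entropy weight are the paper's common defaults, not necessarily this repository's values:

```
import tensorflow as tf

n_actions = 6                                            # illustrative action count
logits = tf.placeholder(tf.float32, [None, n_actions])   # policy head output
value = tf.placeholder(tf.float32, [None])               # value head output
actions = tf.placeholder(tf.int32, [None])               # actions taken
returns = tf.placeholder(tf.float32, [None])             # n-step returns R

policy = tf.nn.softmax(logits)
log_policy = tf.nn.log_softmax(logits)
advantage = returns - value                              # A = R - V(s)

# log pi(a_t|s_t) for the chosen actions
action_onehot = tf.one_hot(actions, n_actions)
log_pi_a = tf.reduce_sum(log_policy * action_onehot, axis=1)

# the advantage is treated as a constant in the policy gradient
policy_loss = -tf.reduce_sum(log_pi_a * tf.stop_gradient(advantage))
value_loss = 0.5 * tf.reduce_sum(tf.square(advantage))
entropy = -tf.reduce_sum(policy * log_policy)            # exploration bonus

total_loss = policy_loss + 0.5 * value_loss - 0.01 * entropy
```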

![Training variables](./results/tensorboard.png)

## Evaluation

To run the evaluation with the default parameters:

```
$ python test_agent.py
```
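
Conceptually, evaluation restores the trained network and plays a number of full games, recording the total score of each. A minimal sketch of such a loop is below; the identifiers are placeholders, not the ones used in `test_agent.py`, and the random action stands in for the trained policy's greedy action:

```
import gym
import numpy as np

env = gym.make("DemonAttack-v0")
scores = []
for episode in range(10):
    state, done, total = env.reset(), False, 0.0
    while not done:
        action = env.action_space.sample()  # placeholder for argmax_a pi(a|s)
        state, reward, done, _ = env.step(action)
        total += reward
    scores.append(total)
print("average score per game:", np.mean(scores))
```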

The following plot shows the average score per game during testing. The y-axis represents the total score the agent achieved, while the x-axis shows the number of games played.

![Average score per game](./results/average_score_per_game_testing.png)