Ecosyste.ms: Awesome

An open API service indexing awesome lists of open source software.

https://github.com/onlytailei/A3C-PyTorch

PyTorch implementation of Advantage async actor-critic Algorithms (A3C) in PyTorch
https://github.com/onlytailei/A3C-PyTorch

a3c deep-reinforcement-learning

Last synced: 2 months ago
JSON representation

PyTorch implementation of Advantage async actor-critic Algorithms (A3C) in PyTorch

Host: GitHub
URL: https://github.com/onlytailei/A3C-PyTorch
Owner: onlytailei
Created: 2017-03-15T16:46:23.000Z (over 7 years ago)
Default Branch: master
Last Pushed: 2017-04-03T09:13:24.000Z (about 7 years ago)
Last Synced: 2024-01-25T20:08:56.008Z (5 months ago)
Topics: a3c, deep-reinforcement-learning
Language: Python
Homepage:
Size: 2.21 MB
Stars: 113
Watchers: 8
Forks: 22
Open Issues: 1
Metadata Files:
- Readme: README.md

Lists

Awesome-pytorch-list - A3C-PyTorch - critic Algorithms (A3C) in PyTorch (Paper implementations / Other libraries:)
Awesome-pytorch-list-CNVersion - A3C-PyTorch - critic)算法。 (Paper implementations｜论文实现 / Other libraries｜其他库:)

README

        # [Advantage async actor-critic Algorithms (A3C)](https://arxiv.org/abs/1602.01783) in PyTorch

```

@inproceedings{mnih2016asynchronous,

  title={Asynchronous methods for deep reinforcement learning},

  author={Mnih, Volodymyr and Badia, Adria Puigdomenech and Mirza, Mehdi and Graves, Alex and Lillicrap, Timothy P and Harley, Tim and Silver, David and Kavukcuoglu, Koray},

  booktitle={International Conference on Machine Learning},

  year={2016}}

```

This repository contains an implementation of Adavantage async Actor-Critic (A3C) in PyTorch based on the original paper by the authors and the [PyTorch implementation](https://github.com/ikostrikov/pytorch-a3c) by [Ilya Kostrikov](https://github.com/ikostrikov).

A3C is the state-of-art Deep Reinforcement Learning method.

## Dependencies

* Python 2.7

* PyTorch

* gym (OpenAI)

* universe (OpenAI)

* opencv (for env state processing)

* visdom (for visualization)

## Training

```

./train_lstm.sh

```

### Test wigh trained weight after 169000 updates for _PongDeterminisitc-v3_.

```

./test_lstm.sh 169000

```

A test result [video](https://youtu.be/Ohpo6BcMgZw) is available.

### Check the loss curves of all threads in http://localhost:8097

![loss_png](./assets/loss.png)

## References

* [Asynchronous methods for deep reinforcement learning on arXiv](https://arxiv.org/abs/1602.01783)

* [Ilya Kostrikov's implementation](https://github.com/ikostrikov/pytorch-a3c).