https://github.com/ikostrikov/pytorch-rl

pytorch reinforcement-learning reinforcement-learning-algorithms


Host: GitHub
URL: https://github.com/ikostrikov/pytorch-rl
Owner: ikostrikov
Created: 2017-09-08T00:00:28.000Z (about 8 years ago)
Default Branch: master
Last Pushed: 2018-08-28T02:06:01.000Z (about 7 years ago)
Last Synced: 2025-04-05T22:31:44.633Z (6 months ago)
Topics: pytorch, reinforcement-learning, reinforcement-learning-algorithms
Size: 1.95 KB
Stars: 56
Watchers: 3
Forks: 10
Open Issues: 0
Metadata Files:
- Readme: README.md

README

# pytorch-rl

A list of references to my reimplementations of RL algorithms:

* Asynchronous Methods for Deep Reinforcement Learning (A3C) ([arxiv](https://arxiv.org/abs/1602.01783), [my code](https://github.com/ikostrikov/pytorch-a3c))

* Advantage Actor Critic (A2C) ([my code](https://github.com/ikostrikov/pytorch-a2c-ppo-acktr))

* Proximal Policy Optimization Algorithms (PPO) ([arxiv](https://arxiv.org/abs/1707.06347), [my code](https://github.com/ikostrikov/pytorch-a2c-ppo-acktr))

* Scalable trust-region method for deep reinforcement learning using Kronecker-factored approximation (ACKTR) ([arxiv](https://arxiv.org/abs/1708.05144), [my code](https://github.com/ikostrikov/pytorch-a2c-ppo-acktr))

* Trust Region Policy Optimization (TRPO) ([arxiv](https://arxiv.org/pdf/1502.05477.pdf), [my code](https://github.com/ikostrikov/pytorch-trpo))

* Continuous Deep Q-Learning with Model-based Acceleration (NAF) ([arxiv](https://arxiv.org/abs/1603.00748), [my code](https://github.com/ikostrikov/pytorch-naf))

# TODO (volunteers are welcome)

* Move TRPO into the a2c-ppo-acktr code and implement it as a Hessian-free optimizer (in the same way that ACKTR is implemented as a K-FAC optimizer)
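
For anyone picking up the TODO item above, here is a minimal sketch of the conjugate gradient solve that a Hessian-free optimizer is usually built around. It is not code from any of the linked repositories. The names `conjugate_gradient`, `hvp`, `toy_hvp`, and the 2x2 matrix `H` are hypothetical, and the sketch uses plain Python lists instead of PyTorch tensors so that it runs without dependencies. In TRPO, `g` would be the policy gradient and `hvp` would return a Fisher-vector product computed by automatic differentiation, so the matrix is never formed explicitly.

```python
from typing import Callable, List

Vector = List[float]


def dot(a: Vector, b: Vector) -> float:
    """Inner product of two equally sized vectors."""
    return sum(x * y for x, y in zip(a, b))


def conjugate_gradient(hvp: Callable[[Vector], Vector], g: Vector,
                       iters: int = 10, tol: float = 1e-10) -> Vector:
    """Approximately solve H x = g using only matrix-vector products.

    `hvp(v)` must return H v for a symmetric positive-definite H.
    """
    x = [0.0] * len(g)
    r = list(g)   # residual g - H x, which equals g because x starts at 0
    p = list(g)   # current search direction
    rr = dot(r, r)
    for _ in range(iters):
        if rr < tol:
            break
        hp = hvp(p)
        alpha = rr / dot(p, hp)
        x = [xi + alpha * pi for xi, pi in zip(x, p)]
        r = [ri - alpha * hi for ri, hi in zip(r, hp)]
        rr_new = dot(r, r)
        p = [ri + (rr_new / rr) * pi for ri, pi in zip(r, p)]
        rr = rr_new
    return x


# Toy stand-in for the curvature matrix (an assumption, for illustration only).
H = [[4.0, 1.0], [1.0, 3.0]]


def toy_hvp(v: Vector) -> Vector:
    """Multiply the toy matrix H by v."""
    return [dot(row, v) for row in H]


step = conjugate_gradient(toy_hvp, [1.0, 2.0])
```

The optimizer only ever calls `hvp`, which is what would let it sit behind the same interface as the K-FAC optimizer: the caller supplies a gradient and a way to multiply by the curvature matrix, and gets back a search direction.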
