Ecosyste.ms: Awesome
An open API service indexing awesome lists of open source software.
https://github.com/awjuliani/deeprl-agents
A set of Deep Reinforcement Learning Agents implemented in Tensorflow.
reinforcement-learning tensorflow
- Host: GitHub
- URL: https://github.com/awjuliani/deeprl-agents
- Owner: awjuliani
- License: mit
- Created: 2016-06-14T22:25:31.000Z (over 8 years ago)
- Default Branch: master
- Last Pushed: 2019-02-12T17:26:26.000Z (almost 6 years ago)
- Last Synced: 2024-10-29T17:48:53.509Z (2 months ago)
- Topics: reinforcement-learning, tensorflow
- Language: Jupyter Notebook
- Homepage:
- Size: 352 KB
- Stars: 2,239
- Watchers: 119
- Forks: 826
- Open Issues: 45
Metadata Files:
- Readme: README.md
- License: LICENSE
README
# Deep Reinforcement Learning Agents
This repository contains a collection of reinforcement learning algorithms written in TensorFlow. The IPython notebooks here were written to go
along with a still-underway tutorial series I have been publishing on [Medium](https://medium.com/@awjuliani/simple-reinforcement-learning-with-tensorflow-part-0-q-learning-with-tables-and-neural-networks-d195264329d0#.4gyadb8a4).
If you are new to reinforcement learning, I recommend reading the accompanying post for each algorithm.

The repository currently contains the following algorithms:
* **Q-Table** - An implementation of Q-learning using tables to solve a stochastic environment problem.
* **Q-Network** - A neural network implementation of Q-Learning to solve the same environment as in Q-Table.
* **Simple-Policy** - An implementation of a policy gradient method for stateless environments such as n-armed bandit problems.
* **Contextual-Policy** - An implementation of a policy gradient method for stateful environments such as contextual bandit problems.
* **Policy-Network** - An implementation of a neural network policy-gradient agent that solves full RL problems with states and delayed rewards, and two opposite actions (e.g. CartPole or Pong).
* **Vanilla-Policy** - An implementation of a neural network vanilla-policy-gradient agent that solves full RL problems with states, delayed rewards, and an arbitrary number of actions.
* **Model-Network** - An addition to the Policy-Network algorithm that includes a separate network for modeling the environment dynamics.
* **Double-Dueling-DQN** - An implementation of a Deep-Q Network with the Double DQN and Dueling DQN additions to improve stability and performance.
* **Deep-Recurrent-Q-Network** - An implementation of a Deep Recurrent Q-Network which can solve reinforcement learning problems involving partial observability.
* **Q-Exploration** - An implementation of DQN containing multiple action-selection strategies for exploration. Strategies include: greedy, random, e-greedy, Boltzmann, and Bayesian Dropout.
* **A3C-Doom** - An implementation of the Asynchronous Advantage Actor-Critic (A3C) algorithm. It uses multiple agents to collectively improve a policy. This implementation can solve RL problems in 3D environments such as VizDoom challenges.
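
To give a feel for the simplest entry in the list above, here is a minimal sketch of tabular Q-learning with e-greedy action selection in plain Python. It is not the code from the Q-Table notebook: the `corridor_step` environment, the function names, and the hyperparameter values are all assumptions made up for this illustration.

```python
import random


def corridor_step(state, action, rng, n_states=5, slip=0.1):
    """Hypothetical stochastic corridor: action 1 moves right, action 0 moves left.

    With probability `slip` the chosen move is reversed. Reaching the last
    state ends the episode with reward 1; every other step gives reward 0.
    """
    if rng.random() < slip:
        action = 1 - action  # the move "slips" and goes the other way
    next_state = state + 1 if action == 1 else max(0, state - 1)
    done = next_state == n_states - 1
    reward = 1.0 if done else 0.0
    return next_state, reward, done


def q_learning(step, n_states=5, n_actions=2, episodes=500, max_steps=100,
               alpha=0.1, gamma=0.9, epsilon=0.1, seed=0):
    """Learn a Q-table for an environment exposed through `step(state, action, rng)`."""
    rng = random.Random(seed)  # private RNG so runs are reproducible
    q = [[0.0] * n_actions for _ in range(n_states)]
    for _ in range(episodes):
        state = 0  # every episode starts at the left end of the corridor
        for _ in range(max_steps):
            # e-greedy: explore with probability epsilon, otherwise act greedily,
            # breaking ties between equally valued actions at random.
            if rng.random() < epsilon:
                action = rng.randrange(n_actions)
            else:
                best = max(q[state])
                action = rng.choice([a for a, v in enumerate(q[state]) if v == best])
            next_state, reward, done = step(state, action, rng)
            # Q-learning target: immediate reward plus the discounted best
            # next-state value (no bootstrapping from a terminal state).
            target = reward if done else reward + gamma * max(q[next_state])
            q[state][action] += alpha * (target - q[state][action])
            state = next_state
            if done:
                break
    return q


if __name__ == "__main__":
    table = q_learning(corridor_step)
    for s, row in enumerate(table):
        print(s, [round(v, 3) for v in row])
```

The table-based version stores one value per state-action pair, which is only practical for small, discrete problems. The network-based agents in the list replace that table with a function approximator.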