Ecosyste.ms: Awesome
An open API service indexing awesome lists of open source software.
awesome-rl
Awesome RL: Papers, Books, Codes, Benchmarks
https://github.com/dbobrenko/awesome-rl
RL Frameworks & Implementations
RL Benchmarks
Policy-Based Generic Agents
- Soft Actor Critic [[code](https://github.com/rail-berkeley/softlearning/)] 2018 @ Google Brain, UC Berkeley
- IMPALA
- Scalable trust-region method for deep reinforcement learning using Kronecker-factored approximation (ACKTR, A2C)
- Proximal Policy Optimization Algorithms (PPO) 2017 @ OpenAI
- High-dimensional continuous control using generalized advantage estimation (GAE)
- Trust Region Policy Optimization (TRPO)
- Actor-Critic Algorithms
- Simple Statistical Gradient-Following Algorithms for Connectionist Reinforcement Learning (REINFORCE)
Value-Based Generic Agents
- Implicit Quantile Networks for Distributional Reinforcement Learning (IQN)
- A Distributional Perspective on Reinforcement Learning (C51)
- Rainbow: Combining Improvements in Deep Reinforcement Learning
- Dueling Network Architectures for Deep Reinforcement Learning (Dueling DQN)
- Deep Reinforcement Learning with Double Q-learning (Double DQN)
- Playing Atari with Deep Reinforcement Learning (DQN)
- Temporal Difference Learning and TD-Gammon
- Temporal Difference Learning and TD-Gammon
Model-Based Generic Agents
- Model-Based Reinforcement Learning for Atari
- navigation
- locomotion [[code](https://github.com/nagaban2/nn_dynamics)] 2017 @ Berkeley
- locomotion 2017 @ Google DeepMind
- navigation
Evolutionary Algorithms
Exploration
- Go-Explore
- Exploration by Random Network Distillation (RND) [[code](https://github.com/openai/random-network-distillation)] 2018 @ OpenAI
- navigation 2018 @ OpenAI, Berkeley, Univ. of Edinburgh
- RUDDER: Return Decomposition for Delayed Rewards 2018 @ Johannes Kepler Univ. Linz
- Deep Curiosity Search
- locomotion
- transfer 2017 @ DeepMind
Self-Play
Meta-Learning
- locomotion Frans et al., 2017 @ OpenAI, Berkeley
- Hybrid Reward Architecture for Reinforcement Learning (HRA)
Multi-Agent RL
- Learning with Opponent-Learning Awareness (LOLA) Foerster et al., 2017 @ OpenAI, Oxford, Berkeley, CMU
Inverse RL
Navigation
- Learning to Navigate in Cities Without a Map
- Human-level performance in first-person multiplayer games with population-based deep reinforcement learning Jaderberg et al., 2018 @ DeepMind
- generalization
- Learning to Navigate in Complex Environments
- transfer
- meta-learning
- Learning to act by predicting the future (VizDoom 2016 Full DM Winner)
- Playing FPS Games with Deep Reinforcement Learning (VizDoom 2016 Limited DM 2nd place)
Manipulation
- generalization Andrychowicz et al., 2018 @ OpenAI
- generalization Pinto et al., 2017 @ OpenAI, CMU
- generalization Peng et al., 2017 @ OpenAI, Berkeley
Locomotion
- Emergence of Locomotion Behaviours in Rich Environments Heess et al., 2017 @ DeepMind
- Programmable Agents
Auto ML
Other Domains
Books
Search for new Papers
Misc
Categories
- RL Frameworks & Implementations: 9
- Policy-Based Generic Agents: 8
- Value-Based Generic Agents: 8
- Navigation: 8
- Exploration: 7
- Auto ML: 5
- Model-Based Generic Agents: 5
- Evolutionary Algorithms: 4
- Manipulation: 3
- Inverse RL: 3
- RL Benchmarks: 3
- Self-Play: 3
- Locomotion: 2
- Misc: 2
- Meta-Learning: 2
- Books: 1
- Search for new Papers: 1
- Multi-Agent RL: 1
- Other Domains: 1
Keywords
- pytorch: 2
- reinforcement-learning: 2
- tensorflow: 1
- rl: 1
- ml: 1
- google: 1
- ai: 1
- toolbox: 1
- stable-baselines: 1
- sde: 1
- sb3: 1
- robotics: 1
- reinforcement-learning-algorithms: 1
- python: 1
- openai: 1
- machine-learning: 1
- gym: 1
- gsde: 1
- second-order: 1
- roboschool: 1
- proximal-policy-optimization: 1
- ppo: 1
- natural-gradients: 1
- mujoco: 1
- kronecker-factored-approximation: 1
- kfac: 1
- hessian: 1
- deep-reinforcement-learning: 1
- deep-learning: 1
- continuous-control: 1
- atari: 1
- ale: 1
- advantage-actor-critic: 1
- actor-critic: 1
- acktr: 1
- a2c: 1
- baselines: 1