awesome-deep-reinforcement-learning
Curated list for Deep Reinforcement Learning (DRL): software frameworks, models, datasets, gyms, baselines...
https://github.com/jgvictores/awesome-deep-reinforcement-learning
Last synced: 9 days ago
-
Neural Networks (NN) and Deep Neural Networks (DNN)
-
NN/DNN Datasets
- awesomedata/awesome-public-datasets
- MNIST (see also [Kuzushiji-MNIST](https://github.com/rois-codh/kmnist)).
- CIFAR-100
- Visual Genome
- MIT Places
- SVHN
- PASCAL VOC
- MIT MM Stimuli
- HowTo100M
- text8
- UMICH SI650
- wikipedia
- DOI: 10.1145/3447526.3472059
- Quick Draw (Google)
- iCubWorld (see also [iCub-camera-dataset](https://github.com/muratkrty/iCub-camera-dataset)).
- HICO
-
NN/DNN Software Frameworks
- ml5
- Torch
- CoreML - C) (support: Apple)
- Caffe
- presentation - deep-reinforcement-learning/blob/143a885cc10b4331b9b3fa3e1a9436d5325676af/doc/inria2017DLFrameworks.pdf)).
- DALI - A GPU-accelerated library containing highly optimized building blocks and an execution engine for data processing to accelerate deep learning training and inference applications.
- Sonnet
- OpenNN
- tensorflow/tensorflow (API: Python most stable, JavaScript, C++, Java...) (support: Google).
- keras-team/keras
- pytorch/pytorch
- oneapi-src/oneDNN
- flashlight/flashlight
- PaddlePaddle
- https://github.com/janhuenermann/neurojs
- jittor
- 1 - docker), [3](https://github.com/bethgelab/docker-deeplearning).
- sony/nnabla
- keras - learning-python), [2](https://elitedatascience.com/keras-tutorial-deep-learning-in-python)
- DL4J
- PyBrain
- OpenCV
- Chainer
- Darknet
-
NN/DNN Models
- arxiv - 3 weeks.
- arxiv - 7 million parameters, via smaller convs. A more aggressive cropping approach than that of Krizhevsky. Batch normalization, image distortions, RMSprop. Uses 9 novel "Inception modules" (at each layer of a traditional ConvNet, you must choose between a pooling operation and a conv operation, as well as the filter size; an Inception module performs all these operations in parallel), and no fully connected layers. Trained on CPU (estimated as weeks via GPU), implemented in DistBelief (closed-source predecessor of TensorFlow). Variants ([summary](https://towardsdatascience.com/a-simple-guide-to-the-versions-of-the-inception-network-7fc52b863202)): v1, v2, v4, resnet v1, resnet v2; v9 ([slides](http://lsun.cs.princeton.edu/slides/Christian.pdf)). Also see the [Xception (2017)](https://arxiv.org/pdf/1610.02357.pdf) paper.
- arxiv - Single-Shot-MultiBox-Detector)
- arxiv - Adversarial-Networks)
- arxiv - brief-history-of-cnns-in-image-segmentation-from-r-cnn-to-mask-r-cnn-34ea83205de4)): Fast R-CNN, Faster R-CNN, Mask R-CNN.
- arxiv - freiburg.de/people/ronneber/u-net/).
- arxiv - pytorch)
- 1
- arxiv - fcis).
- arxiv
- 1 - tricks.com/cnn/understand-resnet-alexnet-vgg-inception/), [3](https://medium.com/@sidereal/cnns-architectures-lenet-alexnet-vgg-googlenet-resnet-and-more-666091488df5)
- doi - justified finer tuning and visualization (namely Deconvolutional Network).
- doi
- Geometric deep learning
- ref
- tensorflow
- arxiv - painterly-harmonization)
- arxiv - photo-styletransfer)
- arxiv - style), keras [1](https://github.com/keras-team/keras/blob/master/examples/neural_style_transfer.py) [2](https://github.com/titu1994/Neural-Style-Transfer) [3](https://github.com/handong1587/handong1587.github.io/blob/master/_posts/deep_learning/2015-10-09-fun-with-deep-learning.md) [4](https://medium.com/mlreview/making-ai-art-with-style-transfer-using-keras-8bb5fa44b216)
- FFTNet - "FFTNet: a Real-Time Speaker-Dependent Neural Vocoder". [pytorch](https://github.com/mozilla/FFTNet)
- keras
- keras - image-similarity)
- chihming/awesome-network-embedding
- thunlp/GNNPapers
- caffe
- CycleGAN - Jun-Yan Zhu et al.; Berkeley; "Unpaired Image-to-Image Translation using Cycle-Consistent Adversarial Networks". [torch](https://github.com/junyanz/CycleGAN), later migrated to [pytorch](https://github.com/junyanz/pytorch-CycleGAN-and-pix2pix).
- hindupuravinash/the-gan-zoo
- 1
- facebookresearch/Detectron
- DLG
- pytorch
- tensorflow/gnn
- pytorch
- doi - 61 million parameters, split into 2 pipelines to enable 5-6 days of training on GTX 580 GPUs (with data augmentation running on the CPU in the meantime).
- tensorflow
- WaveNet
- arxiv
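The Inception entry in the model list above notes that an Inception module runs several convolution sizes and a pooling operation in parallel on the same input, instead of choosing one per layer. A minimal pure-Python sketch of that idea is given here. It assumes a single-channel feature map, fixed averaging kernels standing in for learned weights, and hypothetical helper names (`conv_same`, `max_pool_same`, `inception_block`). Real Inception modules also use many filters per branch and 1x1 bottleneck convolutions, which are omitted.

```python
from typing import List

Grid = List[List[float]]


def conv_same(x: Grid, kernel: Grid) -> Grid:
    """Single-channel 2D cross-correlation with zero padding (output size = input size)."""
    h, w = len(x), len(x[0])
    k = len(kernel)
    pad = k // 2
    out = [[0.0] * w for _ in range(h)]
    for i in range(h):
        for j in range(w):
            total = 0.0
            for di in range(k):
                for dj in range(k):
                    ii, jj = i + di - pad, j + dj - pad
                    # Positions outside the map count as zero padding.
                    if 0 <= ii < h and 0 <= jj < w:
                        total += kernel[di][dj] * x[ii][jj]
            out[i][j] = total
    return out


def max_pool_same(x: Grid, size: int = 3) -> Grid:
    """Stride-1 max pooling; windows are clipped at the border so the size is preserved."""
    h, w = len(x), len(x[0])
    pad = size // 2
    return [
        [
            max(
                x[ii][jj]
                for ii in range(max(0, i - pad), min(h, i + pad + 1))
                for jj in range(max(0, j - pad), min(w, j + pad + 1))
            )
            for j in range(w)
        ]
        for i in range(h)
    ]


def inception_block(x: Grid) -> List[Grid]:
    """Apply every branch to the same input and stack the results as channels."""
    # Illustrative fixed kernels (assumption): a trained network would learn these.
    k1 = [[1.0]]
    k3 = [[1.0 / 9.0] * 3 for _ in range(3)]
    k5 = [[1.0 / 25.0] * 5 for _ in range(5)]
    return [conv_same(x, k1), conv_same(x, k3), conv_same(x, k5), max_pool_same(x)]


feature_map = [[float(4 * r + c) for c in range(4)] for r in range(4)]
channels = inception_block(feature_map)
# One output channel per branch, each with the input's spatial size.
print(len(channels), len(channels[0]), len(channels[0][0]))  # prints: 4 4 4
```

Because every branch preserves the spatial size, the outputs can be concatenated along the channel axis, which is what lets the following layer pick whichever branch is most useful.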
-
NN/DNN Visualization and Explanation
- tensorboard
- keras - deep-learning-neural-network-model-keras/), [2](https://github.com/keplr-io/quiver), [3](https://raghakot.github.io/keras-vis/), [4](https://www.kaggle.com/amarjeet007/visualize-cnn-with-keras)
- EthicalML/xai
- tensorboardX
- loss-landscape
- netscope
- slundberg/shap
-
NN/DNN Techniques Misc
- wikipedia - validation).
- keras
- wikipedia - activations/), [ref](https://towardsdatascience.com/deep-study-of-a-not-very-deep-neural-network-part-2-activation-functions-fd9bd8d406fc).
- keras
- keras
- keras
- wikipedia
- keras
- facebookresearch/nevergrad
- tensorflow - tutorial-fine-tuning-using-pre-trained-models/)
-
NN/DNN Pretrained Models
- keras web - team/keras/tree/master/keras/applications), [keras 2](https://github.com/keras-team/keras-applications), [pytorch](https://pytorch.org/docs/stable/torchvision/models.html), [caffe](https://github.com/BVLC/caffe/wiki/Model-Zoo), [ONNX](https://github.com/onnx/models) (pytorch/caffe2).
- keras
- keras by keras - team/keras/tree/e15533e6c725dca8c37a861aacb13ef149789433/keras/applications)) / [keras by kaggle](https://www.kaggle.com/keras) / [pytorch by kaggle](https://www.kaggle.com/pytorch)
- keras
- keras
- caffe by original VGG author
- gensim
- keras - 10 weights](https://drive.google.com/open?id=0B4odNGNGJ56qVW9JdkthbzBsX28) / [keras CIFAR-100 weights](https://drive.google.com/open?id=0B4odNGNGJ56qTEdnT1RjTU44Zms)
-
NN/DNN Benchmarks
-
-
General Machine Learning (ML)
-
General ML Books
-
General ML Software Frameworks
-
-
Reinforcement Learning (RL) and Deep Reinforcement Learning (DRL)
-
RL/DRL Algorithms
- Reinforcement Learning Specialization - 20). Note that another major distinction is between off-policy and on-policy RL algorithms. DRL methods fall under function approximators.
- Part 2: Kinds of RL Algorithms - Rendered from <https://github.com/openai/spinningup/blob/038665d62d569055401d91856abb287263096178/docs/spinningup/rl_intro2.rst>
-
RL/DRL Algorithm Implementations and Software Frameworks
- RL-Glue - glue-ext/wikis/RLGlueCore.wiki)) (API: C/C++, Java, Matlab, Python, Lisp) (support: Alberta)
- keras-rl/keras-rl
- ray-project/ray (also covers multiagent)
- google/dopamine
- Unity-Technologies/ml-agents
- oxwhirl/pymarl - multi-agent reinforcement learning
- openai/spinningup
- DLR-RM/stable-baselines3 (successor of [hill-a/stable-baselines](https://github.com/hill-a/stable-baselines), a fork of [openai/baselines](https://github.com/openai/baselines))
- vwxyzjn/cleanrl
- catalyst-team/catalyst
- tensorflow/agents
- astooke/rlpyt
- thu-ml/tianshou
- chainer/chainerrl
- MushroomRL/mushroom-rl
- rail-berkeley/rlkit
- SurrealAI/surreal
- SoyGema/Startcraft_pysc2_minigames
- medipixel/rl_algorithms
- qfettes/DeepRL-Tutorials
- tinkoff-ai/CORL - "High-quality single-file implementations of SOTA Offline RL algorithms: AWAC, BC, CQL, DT, EDAC, IQL, SAC-N, TD3+BC"
- rll/rllab
- facebookresearch/mbrl-lib
- haarnoja/sac
- ikostrikov/jaxrl
- ikostrikov/jaxrl2
- learnables/cherry
- reinforceio/tensorforce
- deepmind/bsuite
- deepmind/trfl
- deepmind/acme
- trackmania-rl/tmrl
- ethanluoyc/magi
- Asap7772/PTR - Pre-Training for Robots: Leveraging Diverse Multitask Data via Offline Reinforcement Learning
- ikostrikov/pytorch-a2c-ppo-acktr
- google/jax
-
RL/DRL Environments
- Microsoft/malmo
- Farama-Foundation/Gymnasium. ~~DEPRECATED: [openai/gym](https://github.com/openai/gym), <https://gym.openai.com>, <https://gym.openai.com/docs/>~~
- koulanurag/ma-gym
- ppaquette/gym-doom
- duckietown/gym-duckietown
- minerllabs/minerl
- openai/retro
- openai/gym-soccer
- openai/roboschool
- openai/safety-gym
- benelot/pybullet-gym
- arex18/rocket-lander
- stanfordnmbl/osim-rl
- Unity-Technologies/obstacle-tower-env
- Unity-Technologies/marathon-envs
- Improbable-AI/walk-these-ways
- Farama-Foundation/PettingZoo
- leggedrobotics/legged_gym
- osudrl/cassie-mujoco-sim
- thedimlebowski/Trading-Gym
- eugenevinitsky/sequential_social_dilemma_games
- nadavbh12/Retro-Learning-Environment
- erlerobot/gym-gazebo
- Farama-Foundation/Minigrid
- utiasDSL/safe-control-gym
- upb-lea/openmodelica-microgrid-gym
- intelligent-environments-lab/CityLearn
- upb-lea/gym-electric-motor
- Farama-Foundation/Gymnasium-Robotics
- facebookresearch/minihack
- qgallouedec/panda-gym
- Healthcare-Robotics/assistive-gym
- Farama-Foundation/ViZDoom
- JKCooper2/gym-bandits
- kngwyu/mujoco-maze
- eleurent/highway-env
- deepmind/pysc2
- tobirohrer/building-energy-storage-simulation
- huggingface/gym-xarm
- NVIDIA-Omniverse/IsaacGymEnvs
- twitter/torch-twrl
- Farama-Foundation/MiniWorld
- LucasAlegre/sumo-rl
- dartsim/gym-dart
- Roboy/gym-roboy
- ucuapps/modelicagym
- denisyarats/dmc2gym
- UtkarshMishra04/bioimitation-gym
- magni84/gym_bandits
- ThomasLecat/gym-bandit-environments
- diegoalejogm/openai-k-armed-bandits
- Phylliade/awesome-openai-gym-environments
- Farama-Foundation/MAgent2
-
RL/DRL Benchmarking
- stepjam/RLBench - "A large-scale benchmark and learning environment."
- rlworkgroup/garage
- google-research/robel
- google-research/rliable
- Farama-Foundation/D4RL (formerly [rail-berkeley/d4rl](https://github.com/rail-berkeley/d4rl))
- Farama-Foundation/Minari (formerly [Farama-Foundation/Kabuki](https://github.com/Farama-Foundation/Kabuki))
- google-research/rlds
- google-research/rl-reliability-metrics
- HYDesmondLiu/B2RL
-
-
Evolutionary Algorithms (EA)
-
RL/DRL Books
-
Programming Languages