https://github.com/scitator/rl-intro



## RL Intro

2019 edition - Gym intro, Genetics, CEM, Tabular, DQN

#### 0. Gym interface
- `00-gym.ipynb` [![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/Scitator/rl-teaser/blob/master/2019/code/00-gym.ipynb)
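
As a rough orientation before opening the notebook, the sketch below mimics the classic Gym interaction loop (`reset()` returns an observation, `step(action)` returns `(observation, reward, done, info)`). `CoinFlipEnv` and `run_episode` are hypothetical stand-ins written for this example so that it runs without installing Gym; they are not taken from the notebook. Newer Gym and Gymnasium releases return `(observation, info)` from `reset()` and a five-element tuple from `step()`, so the unpacking would need adjusting there.

```python
import random


class CoinFlipEnv:
    """Hypothetical toy environment with a classic Gym-style interface."""

    def __init__(self, horizon=10, seed=0):
        self.horizon = horizon
        self.rng = random.Random(seed)
        self.t = 0
        self.coin = 0

    def reset(self):
        # Start a new episode and return the first observation.
        self.t = 0
        self.coin = self.rng.randint(0, 1)
        return self.coin

    def step(self, action):
        # Reward 1.0 if the action matches the coin the agent just observed.
        reward = 1.0 if action == self.coin else 0.0
        self.t += 1
        self.coin = self.rng.randint(0, 1)
        done = self.t >= self.horizon
        return self.coin, reward, done, {}


def run_episode(env, policy):
    """Roll out one episode and return the total reward."""
    obs = env.reset()
    total, done = 0.0, False
    while not done:
        action = policy(obs)
        obs, reward, done, _info = env.step(action)
        total += reward
    return total


env = CoinFlipEnv(horizon=10)
# A policy that copies the observation is rewarded at every step.
print(run_episode(env, policy=lambda obs: obs))  # prints 10.0
```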

#### 1. Genetic algorithm
- [slides](./2019/slides/01-genetics.pdf)
- `01-genetics.ipynb` [![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/Scitator/rl-teaser/blob/master/2019/code/01-genetics.ipynb)

##### Additional materials
* __[recommended]__ - OpenAI post about evolution strategies - [blog post](https://blog.openai.com/evolution-strategies/), [article](https://arxiv.org/abs/1703.03864)
* Video on genetic algorithms - https://www.youtube.com/watch?v=ejxfTy4lI6I
* Another guide to genetic algorithms - https://www.youtube.com/watch?v=zwYV11a__HQ
* PDF on differential evolution - http://jvanderw.une.edu.au/DE_1.pdf
* Video on Ant Colony Algorithm - https://www.youtube.com/watch?v=D58nLNLkb0I
* Longer video on Ant Colony Algorithm - https://www.youtube.com/watch?v=xpyKmjJuqhk
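
For readers who want the core loop in a few lines, here is a minimal sketch of a genetic algorithm with elitism, single-point crossover, and bit-flip mutation. It optimizes the toy OneMax problem (maximize the number of ones in a bit string) rather than an RL task, and every name and hyperparameter in it is an assumption chosen for illustration, not the notebook's code.

```python
import random


def fitness(bits):
    # OneMax: count the ones; the optimum is the all-ones string.
    return sum(bits)


def mutate(bits, rate, rng):
    # Flip each bit independently with probability `rate`.
    return [1 - b if rng.random() < rate else b for b in bits]


def crossover(a, b, rng):
    # Single-point crossover between two parents.
    point = rng.randrange(1, len(a))
    return a[:point] + b[point:]


def evolve(n_bits=20, pop_size=30, n_generations=40, n_elite=5, rate=0.05, seed=0):
    rng = random.Random(seed)
    population = [[rng.randint(0, 1) for _ in range(n_bits)] for _ in range(pop_size)]
    history = []  # best fitness at the start of each generation
    for _ in range(n_generations):
        population.sort(key=fitness, reverse=True)
        history.append(fitness(population[0]))
        elite = population[:n_elite]  # survivors are carried over unchanged
        children = []
        while len(elite) + len(children) < pop_size:
            mother, father = rng.sample(elite, 2)
            children.append(mutate(crossover(mother, father, rng), rate, rng))
        population = elite + children
    return max(population, key=fitness), history


best, history = evolve()
print(fitness(best), history[0])
```

Because the elite are copied unchanged, the best fitness in `history` never decreases from one generation to the next.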

#### 2. Cross Entropy Method
- [slides](./2019/slides/02-cem.pdf)
- `02-cem.ipynb` [![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/Scitator/rl-teaser/blob/master/2019/code/02-cem.ipynb)

##### Additional materials
* __[main]__ Video-intro by David Silver - https://www.youtube.com/watch?v=2pWv7GOvuf0
* Optional lecture by David Silver - https://www.youtube.com/watch?v=lfHX2hHRMVQ
* __[recommended]__ - formal explanation of the cross-entropy method in [general](https://people.smp.uq.edu.au/DirkKroese/ps/CEEncycl.pdf) and for [optimization](https://people.smp.uq.edu.au/DirkKroese/ps/CEopt.pdf)
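
The sample, select, refit cycle of the cross-entropy method for optimization can be sketched as follows. This is a one-dimensional toy with a Gaussian sampling distribution; the objective, the initial `mu` and `sigma`, and the sample counts are all assumptions for illustration. In an RL setting the sampled quantity would typically be policy parameters or whole sessions rather than a scalar.

```python
import random
import statistics


def objective(x):
    # Toy objective to maximize; the optimum is at x = 2.
    return -(x - 2.0) ** 2


def cross_entropy_method(n_iterations=30, n_samples=50, n_elite=10, seed=0):
    rng = random.Random(seed)
    mu, sigma = 0.0, 5.0  # initial sampling distribution (assumed)
    for _ in range(n_iterations):
        # 1. Sample candidates from the current distribution.
        samples = [rng.gauss(mu, sigma) for _ in range(n_samples)]
        # 2. Keep the best-scoring candidates.
        elite = sorted(samples, key=objective, reverse=True)[:n_elite]
        # 3. Refit the Gaussian to the elite samples.
        mu = statistics.mean(elite)
        sigma = statistics.pstdev(elite) + 1e-6  # small floor keeps sigma positive
    return mu, sigma


mu, sigma = cross_entropy_method()
print(mu, sigma)
```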

#### 3. Tabular
- [slides](./2019/slides/03-tabular.pdf)
- `03-tabular.ipynb` [![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/Scitator/rl-teaser/blob/master/2019/code/03-tabular.ipynb)

##### Additional materials
* __[main]__ lecture by David Silver - [url](https://www.youtube.com/watch?v=Nd1-UUMVfz4)
* Alternative lecture by Pieter Abbeel: [part 1](https://www.youtube.com/watch?v=i0o-ui1N35U), [part 2](https://www.youtube.com/watch?v=Csiiv6WGzKM)
* Alternative lecture by John Schulman: https://www.youtube.com/watch?v=IL3gVyJMmhg
* Definitive guide to policy/value iteration from Sutton: start from page 81 [here](http://incompleteideas.net/sutton/book/bookdraft2017june19.pdf).
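
A minimal sketch of value iteration, assuming a hypothetical four-state deterministic chain in which reaching the rightmost (terminal) state pays a reward of 1. The MDP, `GAMMA`, and the function names are invented for this example and are not taken from the slides or the notebook.

```python
GAMMA = 0.9
N_STATES = 4        # states 0..3; state 3 is terminal
ACTIONS = (-1, +1)  # move left or right


def transition(state, action):
    """Deterministic dynamics: return (next_state, reward)."""
    next_state = min(max(state + action, 0), N_STATES - 1)
    reward = 1.0 if next_state == N_STATES - 1 else 0.0
    return next_state, reward


def q_value(values, state, action):
    # One-step lookahead: immediate reward plus discounted next-state value.
    next_state, reward = transition(state, action)
    return reward + GAMMA * values[next_state]


def value_iteration(tol=1e-8):
    values = [0.0] * N_STATES
    while True:
        delta = 0.0
        for s in range(N_STATES - 1):  # the terminal state keeps value 0
            best = max(q_value(values, s, a) for a in ACTIONS)
            delta = max(delta, abs(best - values[s]))
            values[s] = best  # in-place Bellman optimality backup
        if delta < tol:
            return values


def greedy_policy(values):
    # Pick the action with the highest one-step lookahead in each state.
    return [max(ACTIONS, key=lambda a: q_value(values, s, a))
            for s in range(N_STATES - 1)]


values = value_iteration()
print(values)                 # approximately [0.81, 0.9, 1.0, 0.0]
print(greedy_policy(values))  # [1, 1, 1]
```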

#### 4. DQN
- [slides](./2019/slides/04-dqn.pdf)
- `04-dqn.ipynb` [![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/Scitator/rl-teaser/blob/master/2019/code/04-dqn.ipynb)

##### Additional materials
* Lecture by David Silver - [video part I](https://www.youtube.com/watch?v=PnHCvfgC_ZA), [video part II](https://www.youtube.com/watch?v=0g4j2k_Ggc4&t=43s)
* Alternative lecture by Pieter Abbeel - [video](https://www.youtube.com/watch?v=ifma8G7LegE)
* Alternative lecture by John Schulman - [video](https://www.youtube.com/watch?v=IL3gVyJMmhg)
* Blog post on Q-learning vs. SARSA - [url](https://studywolf.wordpress.com/2013/07/01/reinforcement-learning-sarsa-vs-q-learning/)
* N-step temporal difference from Sutton's book - [suttonbook](http://incompleteideas.net/book/RLbook2018.pdf) __chapter 7__
* Eligibility traces from Sutton's book - [suttonbook](http://incompleteideas.net/book/RLbook2018.pdf) __chapter 12__
* Blog post on eligibility traces - [url](http://pierrelucbacon.com/traces/)
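
The difference between the Q-learning and SARSA targets mentioned in the materials above can be sketched with a tabular Q-function. The function names and the tiny example table are hypothetical. DQN keeps the Q-learning form of the target but replaces the table with a neural network, which this sketch does not attempt to show.

```python
from collections import defaultdict


def q_learning_target(q, reward, next_state, actions, gamma, done):
    # Off-policy: bootstrap from the best action in the next state.
    if done:
        return reward
    return reward + gamma * max(q[(next_state, a)] for a in actions)


def sarsa_target(q, reward, next_state, next_action, gamma, done):
    # On-policy: bootstrap from the action actually taken next.
    if done:
        return reward
    return reward + gamma * q[(next_state, next_action)]


def td_update(q, state, action, target, alpha):
    # Move the current estimate a step of size alpha toward the target.
    q[(state, action)] += alpha * (target - q[(state, action)])


q = defaultdict(float)
q[("s1", "a")] = 1.0
q[("s1", "b")] = 3.0

ql = q_learning_target(q, reward=1.0, next_state="s1", actions=("a", "b"),
                       gamma=0.5, done=False)
sa = sarsa_target(q, reward=1.0, next_state="s1", next_action="a",
                  gamma=0.5, done=False)
print(ql, sa)  # 2.5 1.5

td_update(q, "s0", "a", target=ql, alpha=0.5)
print(q[("s0", "a")])  # 1.25
```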

2020 edition - Deep RL, DQN, DDPG

#### Credits

* [Berkeley CS188x](http://ai.berkeley.edu/home.html)
* [David Silver's Reinforcement Learning Course](http://www0.cs.ucl.ac.uk/staff/d.silver/web/Teaching.html)
* [dennybritz/reinforcement-learning](https://github.com/dennybritz/reinforcement-learning)
* [yandexdataschool/Practical_RL](https://github.com/yandexdataschool/Practical_RL)
* [yandexdataschool/AgentNet](https://github.com/yandexdataschool/AgentNet)
* [rl-course-experiments](https://github.com/Scitator/rl-course-experiments)
* [RL-Adventure](https://github.com/higgsfield/RL-Adventure)
* [RL-Adventure-2](https://github.com/higgsfield/RL-Adventure-2)