https://github.com/machine-learning-tokyo/reinforcement_learning
Material for MLT Reinforcement Learning workshops and study sessions
- Host: GitHub
- URL: https://github.com/machine-learning-tokyo/reinforcement_learning
- Owner: Machine-Learning-Tokyo
- Created: 2019-07-23T02:12:10.000Z (over 6 years ago)
- Default Branch: master
- Last Pushed: 2020-06-20T15:38:37.000Z (over 5 years ago)
- Last Synced: 2025-04-18T16:26:40.079Z (10 months ago)
- Language: Jupyter Notebook
- Size: 5.01 MB
- Stars: 51
- Watchers: 5
- Forks: 9
- Open Issues: 0
Metadata Files:
- Readme: README.md
README
# Reinforcement_Learning
Material for MLT Reinforcement Learning workshops and study sessions.
Also, check out our [MLT repo](https://github.com/Machine-Learning-Tokyo/Deep_Reinforcement_Learning) with top Deep RL resources (tutorials, code, books).
# RL Interactive Tools
1. ε Decay
2. k-Armed Bandit
3. Exploration vs Exploitation
- Original concept and Python code: [Anugraha Sinha](https://twitter.com/anugrahasinha)
- JavaScript implementation: [Francisco Dalla Rosa Soares](https://twitter.com/dallarosajp)
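The three tools above fit together in a single small experiment. The following is a minimal Python sketch, not the code behind the interactive tools: an ε-greedy agent on a k-armed bandit, with ε decayed exponentially so that the agent explores early and exploits later. The function names, the Gaussian reward model, and the decay schedule (`eps_start`, `eps_min`, `decay`) are all assumptions chosen for illustration.

```python
import random


def decayed_epsilon(step, eps_start=1.0, eps_min=0.01, decay=0.99):
    """Exponentially decay epsilon from eps_start down to a floor of eps_min."""
    return max(eps_min, eps_start * decay ** step)


def run_bandit(true_means, steps=2000, seed=0):
    """Run an epsilon-greedy agent on a k-armed bandit with Gaussian rewards.

    true_means holds the (hidden) mean reward of each arm; this reward model
    is an assumption made for the sketch.
    """
    rng = random.Random(seed)
    k = len(true_means)
    estimates = [0.0] * k  # running estimate of each arm's mean reward
    counts = [0] * k       # how many times each arm has been pulled
    for step in range(steps):
        eps = decayed_epsilon(step)
        if rng.random() < eps:
            arm = rng.randrange(k)                           # explore
        else:
            arm = max(range(k), key=lambda a: estimates[a])  # exploit
        reward = rng.gauss(true_means[arm], 1.0)
        counts[arm] += 1
        # Incremental sample-average update of the chosen arm's estimate.
        estimates[arm] += (reward - estimates[arm]) / counts[arm]
    return estimates, counts


estimates, counts = run_bandit([0.0, 1.0, 5.0])
print("pulls per arm:", counts)
print("estimated means:", [round(q, 2) for q in estimates])
```

A slower decay (a `decay` value closer to 1) keeps the agent exploring for longer, which helps when the arms have similar means but costs reward when one arm is clearly best.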
# Intro to Reinforcement Learning – Session #1
by [Anugraha Sinha](https://twitter.com/anugrahasinha)
### [[Meetup]](https://www.meetup.com/Machine-Learning-Tokyo/events/263347323/) & [[Slides and Code]](https://github.com/Machine-Learning-Tokyo/Reinforcement_Learning/tree/master/session%20%231)
Presentation
1. Introduction to RL
2. Important elements of an RL problem
3. Description of the Markov Decision Process (MDP) and the Markov assumption.
4. Importance of parametrization of State, Action, Reward and Environment.
5. Model-based and model-free methods
6. Meaning of Control Problem and Evaluation Problem.
7. Algorithms for policy evaluation and value iteration
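To make the evaluation problem from the outline concrete, here is a minimal sketch of iterative policy evaluation on a tabular MDP. It is not taken from the session slides: the dictionary layout for transitions (`P[s][a]` as a list of `(probability, next_state, reward)` tuples), the three-state chain, and the parameter names `gamma` and `theta` are assumptions for illustration.

```python
def policy_evaluation(P, policy, gamma=0.9, theta=1e-8):
    """Iteratively compute the state values of a fixed policy.

    P[s][a] is a list of (probability, next_state, reward) transitions and
    policy[s][a] is the probability of taking action a in state s.
    """
    V = {s: 0.0 for s in P}
    while True:
        delta = 0.0
        for s in P:
            # Bellman expectation backup for state s.
            v_new = sum(
                policy[s][a] * sum(p * (r + gamma * V[s2]) for p, s2, r in P[s][a])
                for a in P[s]
            )
            delta = max(delta, abs(v_new - V[s]))
            V[s] = v_new
        if delta < theta:  # stop once a full sweep changes no value noticeably
            return V


# Hypothetical chain A -> B -> T, where T is absorbing and B -> T pays reward 1.
P = {
    "A": {"go": [(1.0, "B", 0.0)]},
    "B": {"go": [(1.0, "T", 1.0)]},
    "T": {"stay": [(1.0, "T", 0.0)]},
}
policy = {"A": {"go": 1.0}, "B": {"go": 1.0}, "T": {"stay": 1.0}}
V = policy_evaluation(P, policy)
print(V)
```

Because `P` is given explicitly, this is a model-based method; a model-free method would have to estimate the same values from sampled transitions instead.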
Code examples
1. Finding the best route through a maze (obstacle avoidance) using the policy iteration algorithm.
2. The same problem solved with the value iteration algorithm.
3. Code exercise
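
As a rough idea of what the maze examples involve, the sketch below runs value iteration on a tiny grid maze and then follows the greedy policy from start to goal. It is not the workshop notebook code: the maze layout, the reward of -1 per move, the deterministic moves, and every function name are assumptions for illustration.

```python
MOVES = {"U": (-1, 0), "D": (1, 0), "L": (0, -1), "R": (0, 1)}


def next_state(state, action, free):
    """Deterministic move; hitting a wall or the border leaves the agent in place."""
    r, c = state[0] + MOVES[action][0], state[1] + MOVES[action][1]
    return (r, c) if (r, c) in free else state


def value_iteration(grid, goal, gamma=0.95, theta=1e-6):
    """Value iteration on a grid of strings: '.' is a free cell, '#' is a wall.

    Every move costs -1 and the goal is terminal with value 0.
    """
    free = {(r, c) for r, row in enumerate(grid)
            for c, ch in enumerate(row) if ch != "#"}
    V = {s: 0.0 for s in free}
    while True:
        delta = 0.0
        for s in free:
            if s == goal:
                continue
            # Bellman optimality backup: best one-step lookahead value.
            best = max(-1.0 + gamma * V[next_state(s, a, free)] for a in MOVES)
            delta = max(delta, abs(best - V[s]))
            V[s] = best
        if delta < theta:
            break
    # Greedy policy: every move costs the same, so pick the best successor.
    policy = {s: max(MOVES, key=lambda a: V[next_state(s, a, free)])
              for s in free if s != goal}
    return V, policy, free


def greedy_path(policy, free, start, goal, max_steps=100):
    """Follow the greedy policy from start until the goal or the step limit."""
    path = [start]
    while path[-1] != goal and len(path) <= max_steps:
        path.append(next_state(path[-1], policy[path[-1]], free))
    return path


maze = [
    "..#.",
    ".##.",
    "....",
]
V, policy, free = value_iteration(maze, goal=(0, 3))
print(greedy_path(policy, free, start=(0, 0), goal=(0, 3)))
```

Policy iteration reaches the same route by alternating a full policy evaluation with a greedy policy improvement step, whereas value iteration folds both into the single `max` backup shown here.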