https://github.com/applenob/rl_learn
My reinforcement learning notes and study materials :book: still updating ... ...
learning-by-doing reinforcement-learning
Last synced: about 1 year ago
- Host: GitHub
- URL: https://github.com/applenob/rl_learn
- Owner: applenob
- Created: 2017-09-22T02:38:40.000Z (over 8 years ago)
- Default Branch: master
- Last Pushed: 2019-06-09T14:29:30.000Z (about 7 years ago)
- Last Synced: 2025-03-30T10:06:44.572Z (about 1 year ago)
- Topics: learning-by-doing, reinforcement-learning
- Language: Jupyter Notebook
- Homepage:
- Size: 74.6 MB
- Stars: 343
- Watchers: 12
- Forks: 117
- Open Issues: 0
Metadata Files:
- Readme: README.md
README
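# [WIP] Reinforcement Learning Study Repository
This is a collection of classic learning materials, notes, and code that I gathered while studying **reinforcement learning**, shared here for everyone.
To view formulas directly in markdown files on GitHub, I recommend installing the Chrome extension: [MathJax Plugin for Github](https://chrome.google.com/webstore/detail/mathjax-plugin-for-github/ioemnmodlmafdkllaclgeombjnmnbima)
## Getting Started
- [Getting started guide](learning_route.md)
## Course Notes
- [Study notes for David Silver's Reinforcement Learning course.](class_note.ipynb)
- [All slides for the course](slides)
- Study notes for Sutton's book Reinforcement Learning: An Introduction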
- [1. Introduction](notes/intro_note_01.md)
- [2. Multi-armed Bandits](notes/intro_note_02.md)
- [3. Finite Markov Decision Processes](notes/intro_note_03.md)
- [4. Dynamic Programming](notes/intro_note_04.md)
- [5. Monte Carlo Methods](notes/intro_note_05.md)
- [6. Temporal-Difference Learning](notes/intro_note_06.md)
- [7. n-step Bootstrapping](notes/intro_note_07.md)
- [8. Planning and Learning with Tabular Methods](notes/intro_note_08.md)
- [9. On-policy Prediction with Approximation](notes/intro_note_09.md)
- [10. On-policy Control with Approximation](notes/intro_note_10.md)
- [11. Off-policy Methods with Approximation](notes/intro_note_11.md)
- [12. Eligibility Traces](notes/intro_note_12.md)
- [13. Policy Gradient Methods](notes/intro_note_13.md)
- [14. Psychology](notes/intro_note_14.md)
- [15. Neuroscience](notes/intro_note_15.md)
- [16. Applications and Case Studies](notes/intro_note_16.md)
- [17. Frontiers](notes/intro_note_17.md)
- [PDF versions of the book](book)
- [2017-6 draft](book/bookdraft2017june19.pdf)
- [2018 second edition](book/bookdraft2018.pdf)
## Experiments
All experiment source code lives in the `lib` directory and comes from [dennybritz](https://github.com/dennybritz/reinforcement-learning). On top of the original code, I added a detailed introduction to each experiment's background and a side-by-side mapping between the code and the formulas.
- [Gridworld](exp/1_gridworld.ipynb): **Dynamic Programming** for a known **MDP**.
- [Blackjack](exp/2_blackjack.ipynb): **Model-Free** **Monte Carlo** prediction and control.
- [Windy Gridworld](exp/3_windy_gridworld.ipynb): **Model-Free** **Temporal Difference** **on-policy control**: **SARSA**.
- [Cliff Walking](exp/4_cliff_walking.ipynb): **Model-Free** **Temporal Difference** **off-policy control**: **Q-learning**.
- [Mountain Car](exp/5_mountain_car.ipynb): **Q-Learning with Linear Function Approximation**, for cases where a Q-table would be too large to handle (the state space is continuous).
- [Atari](exp/6_atari.ipynb): **Deep Q-Learning**.
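
To make the on-policy versus off-policy distinction in the Windy Gridworld and Cliff Walking experiments concrete, here is a minimal sketch of the two tabular TD updates. It is not the code from `lib`: the function names, the dict-of-lists Q-table, the state labels, and the step-size and discount values are all assumptions chosen for illustration.

```python
from collections import defaultdict


def make_q_table(n_actions):
    """Hypothetical Q-table: maps a state to a list of action values."""
    return defaultdict(lambda: [0.0] * n_actions)


def sarsa_update(Q, s, a, r, s_next, a_next, alpha=0.5, gamma=1.0):
    """On-policy TD update: bootstrap from the action actually taken next."""
    td_target = r + gamma * Q[s_next][a_next]
    Q[s][a] += alpha * (td_target - Q[s][a])


def q_learning_update(Q, s, a, r, s_next, alpha=0.5, gamma=1.0):
    """Off-policy TD update: bootstrap from the greedy action in s_next."""
    td_target = r + gamma * max(Q[s_next])
    Q[s][a] += alpha * (td_target - Q[s][a])


# Same transition for both methods; only the bootstrap term differs.
Q_sarsa = make_q_table(2)
Q_qlearn = make_q_table(2)
for Q in (Q_sarsa, Q_qlearn):
    Q["s1"] = [2.0, 4.0]  # made-up values for the next state

# The behaviour policy happens to pick action 0 in s1 (e.g. while exploring).
sarsa_update(Q_sarsa, "s0", 0, r=-1.0, s_next="s1", a_next=0)
q_learning_update(Q_qlearn, "s0", 0, r=-1.0, s_next="s1")

print(Q_sarsa["s0"][0])   # 0.5: target is -1 + Q(s1, a_next) = 1
print(Q_qlearn["s0"][0])  # 1.5: target is -1 + max_a Q(s1, a) = 3
```

SARSA evaluates the policy it is actually following, exploratory steps included, while Q-learning evaluates the greedy policy regardless of which action is taken next.
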
## Other Important Learning Resources
- [WildML's blog](http://www.wildml.com/2016/10/learning-reinforcement-learning/)
- [David Silver’s Reinforcement Learning Course](http://www0.cs.ucl.ac.uk/staff/d.silver/web/Teaching.html)
- [Reinforcement Learning: An Introduction](http://incompleteideas.net/book/the-book-2nd.html)
- [Python implementation of the book's code](https://github.com/ShangtongZhang/reinforcement-learning-an-introduction)