https://github.com/luckyzxl2016/reinforcement-learning
The Road to Learning Reinforcement Learning
machine-learning python reinforcement-learning
Last synced: over 1 year ago
- Host: GitHub
- URL: https://github.com/luckyzxl2016/reinforcement-learning
- Owner: LuckyZXL2016
- Created: 2019-02-22T02:30:16.000Z (over 7 years ago)
- Default Branch: master
- Last Pushed: 2019-05-29T13:53:04.000Z (about 7 years ago)
- Last Synced: 2025-03-18T17:44:59.641Z (over 1 year ago)
- Topics: machine-learning, python, reinforcement-learning
- Language: Python
- Homepage:
- Size: 49.8 KB
- Stars: 32
- Watchers: 1
- Forks: 35
- Open Issues: 0
Metadata Files:
- Readme: README.md
README
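# Reinforcement Learning Blog Posts and Accompanying Code
This repository records my own progress in learning reinforcement learning, from the basics to more advanced topics. The main reference so far is David Silver's open course. Some of the code mentioned below comes from the internet.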
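## [Table of Contents](#table-of-contents)
- [Reinforcement Learning Blog Posts and Code](#reinforcement-learning-blog-posts-and-code)
## Reinforcement Learning Blog Posts and Code: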
|**Blog post** | **Code** |
| --------------------------------------------------------------------------------------------- |:-------------:|
| [Reinforcement Learning: Terminology and Mathematical Notation](https://blog.csdn.net/u011254180/article/details/84031546) | None |
| [Reinforcement Learning (1): Introduction](https://blog.csdn.net/u011254180/article/details/83349455) | None |
| [Reinforcement Learning (2): Markov Decision Processes](https://blog.csdn.net/u011254180/article/details/83387344) | None |
| [Reinforcement Learning (3): Finding the Optimal Policy with Dynamic Programming](https://blog.csdn.net/u011254180/article/details/83573220) | None |
| [Reinforcement Learning (4): Model-Free Prediction](https://blog.csdn.net/u011254180/article/details/83994391) | None |
| [Reinforcement Learning (5): Model-Free Control](https://blog.csdn.net/u011254180/article/details/84253095) | None |
| [Reinforcement Learning in Practice (1): The Tic-Tac-Toe Game](https://blog.csdn.net/u011254180/article/details/86479795) | [Code](/01-blog_code/Tic-Tac-Toe/example.py) |
| [Reinforcement Learning in Practice (2): Iterative Evaluation of a Random Policy in a 4\*4 Gridworld](https://blog.csdn.net/u011254180/article/details/88133551) | [Code](/01-blog_code/Gridworld/gridworld.py) |
| [Reinforcement Learning in Practice (3): Understanding gym's Modeling Approach](https://blog.csdn.net/u011254180/article/details/88211536) | None |
| [Reinforcement Learning in Practice (4): Writing a General-Purpose Gridworld Environment Class](https://blog.csdn.net/u011254180/article/details/88220484) | [Code](/01-blog_code/Gridworld2/gridworld2.py) |
| [Reinforcement Learning in Practice (5): The Agent Class and a SARSA Implementation](https://blog.csdn.net/u011254180/article/details/88430601) | [Code](/01-blog_code/sarsa/sarsa.py) |
| [Reinforcement Learning in Practice (6): Implementing SARSA(λ)](https://blog.csdn.net/u011254180/article/details/88673519) | [Code](/01-blog_code/sarsa/sarsa(lambda).py) |
| [Reinforcement Learning (6): Value Function Approximation](https://blog.csdn.net/u011254180/article/details/89238765) | None |
| [Reinforcement Learning in Practice (7): Adding Memory to the Agent](https://blog.csdn.net/u011254180/article/details/89326920) | [Code](/01-blog_code/core/core.py) |
| [Reinforcement Learning (7): Policy Gradient](https://blog.csdn.net/u011254180/article/details/89431822) | None |
| [Reinforcement Learning (8): Integrating Learning and Planning](https://blog.csdn.net/u011254180/article/details/89556617) | None |
| [Reinforcement Learning (9): Exploration and Exploitation](https://blog.csdn.net/u011254180/article/details/90063387) | None |
| [Reinforcement Learning in Practice (8): Implementing DQN](https://blog.csdn.net/u011254180/article/details/90240163) | [Code](/01-blog_code/dqn/approxagent.py) |
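
To give a flavor of the second practice entry in the table above (iterative evaluation of a random policy in a 4\*4 gridworld), here is a minimal, self-contained sketch. It is not the code in `gridworld.py`. The function name `evaluate_random_policy`, the reward of -1 per step, the two terminal corners, and the stopping threshold `theta` are all assumptions, chosen to match the textbook version of this example used in David Silver's course.

```python
def evaluate_random_policy(size=4, gamma=1.0, theta=1e-6):
    """Iteratively evaluate the uniform random policy on a size x size gridworld.

    Assumed setup (not taken from the repository): top-left and bottom-right
    cells are terminal, every step costs -1, and a move that would leave the
    grid keeps the agent in place.
    """
    terminals = {(0, 0), (size - 1, size - 1)}
    moves = [(-1, 0), (1, 0), (0, -1), (0, 1)]  # up, down, left, right
    values = [[0.0] * size for _ in range(size)]

    while True:
        # Synchronous backup: build the new table from the old one only.
        new_values = [[0.0] * size for _ in range(size)]
        delta = 0.0
        for r in range(size):
            for c in range(size):
                if (r, c) in terminals:
                    continue  # terminal states keep value 0
                total = 0.0
                for dr, dc in moves:
                    nr, nc = r + dr, c + dc
                    if not (0 <= nr < size and 0 <= nc < size):
                        nr, nc = r, c  # bumped into a wall: stay put
                    # Each action has probability 0.25 under the random policy.
                    total += 0.25 * (-1.0 + gamma * values[nr][nc])
                new_values[r][c] = total
                delta = max(delta, abs(total - values[r][c]))
        values = new_values
        if delta < theta:
            return values


if __name__ == "__main__":
    for row in evaluate_random_policy():
        print(" ".join(f"{v:7.2f}" for v in row))
```

Under these assumptions the top row of the value table converges to approximately 0, -14, -20, -22, which matches the well-known result for this small gridworld.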