https://github.com/zhuliquan/reinforcement_learning_basic_book
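A repository for learning the basic principles of reinforcement learning. It mainly contains code for some of the examples and end-of-chapter exercises in the book 《深入浅出强化学习原理入门》 (roughly, "Reinforcement Learning Made Accessible: An Introduction to the Principles").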
- Host: GitHub
- URL: https://github.com/zhuliquan/reinforcement_learning_basic_book
- Owner: zhuliquan
- Created: 2018-04-09T10:27:16.000Z (about 7 years ago)
- Default Branch: master
- Last Pushed: 2018-12-04T08:22:06.000Z (over 6 years ago)
- Last Synced: 2025-04-10T05:08:06.818Z (2 months ago)
- Language: Python
- Homepage:
- Size: 49.8 MB
- Stars: 258
- Watchers: 5
- Forks: 102
- Open Issues: 8
Metadata Files:
- Readme: README.md
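# Code Overview
## Description
> This is my code repository for studying the book 《深入浅出强化学习-原理入门》. It mainly contains code for examples from the book and for the exercises at the end of its chapters.
## Contents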
### 1-Extending gym (gym development)
1. [Configuring the files needed to extend gym](https://github.com/zhuliquan/reinforcement_learning_basic_book/blob/master/1-gym_developing/README.md)
2. [Modified version of gym's core.py](https://github.com/zhuliquan/reinforcement_learning_basic_book/blob/master/1-gym_developing/core.py)
3. [A grid game example built on gym](https://github.com/zhuliquan/reinforcement_learning_basic_book/blob/master/1-gym_developing/grid_game.py)
4. [A maze game example built on gym](https://github.com/zhuliquan/reinforcement_learning_basic_book/blob/master/1-gym_developing/maze_game.py)
### 2-Markov Decision Process
1. [Example: a student's daily life](https://github.com/zhuliquan/reinforcement_learning_basic_book/blob/master/2-markov_decision_process/our_life.py)
2. [Exercise: simulating the maze environment](https://github.com/zhuliquan/reinforcement_learning_basic_book/blob/master/2-markov_decision_process/game.py)
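
As a rough illustration of what an environment model for a Markov decision process can look like, the sketch below encodes a hypothetical three-state chain as a transition table and computes a discounted return. It is not the repository's `our_life.py` or `game.py`; the state names, actions, rewards, and the helpers `step`, `rollout`, and `discounted_return` are all assumptions made for this example.

```python
# Hypothetical 3-state chain MDP. States, actions and rewards are
# illustrative assumptions, not taken from the repository.
TRANSITIONS = {
    # (state, action): (next_state, reward, done)
    ("s0", "left"): ("s0", 0.0, False),
    ("s0", "right"): ("s1", 0.0, False),
    ("s1", "left"): ("s0", 0.0, False),
    ("s1", "right"): ("s2", 1.0, True),
}


def step(state, action):
    """Deterministic transition: look up (state, action) in the table."""
    return TRANSITIONS[(state, action)]


def rollout(start, actions):
    """Apply a fixed action sequence and collect the rewards received."""
    state, rewards = start, []
    for action in actions:
        state, reward, done = step(state, action)
        rewards.append(reward)
        if done:
            break
    return state, rewards


def discounted_return(rewards, gamma):
    """Compute G = r_0 + gamma * r_1 + gamma^2 * r_2 + ... by folding backwards."""
    g = 0.0
    for reward in reversed(rewards):
        g = reward + gamma * g
    return g
```

For instance, `rollout("s0", ["right", "right"])` ends in `s2` with rewards `[0.0, 1.0]`, and with `gamma = 0.5` the discounted return of that episode is `0.5`.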
### 3-Dynamic Programming
1. [Policy evaluation example for the grid game under a uniform policy](https://github.com/zhuliquan/reinforcement_learning_basic_book/blob/master/3-dynamic_program/grid_game_with_average_policy.py)
2. [Policy iteration algorithm flowchart](https://github.com/zhuliquan/reinforcement_learning_basic_book/blob/master/3-dynamic_program/policy_iteration_algorithm.png)
3. [Policy iteration example for the grid game under a greedy policy](https://github.com/zhuliquan/reinforcement_learning_basic_book/blob/master/3-dynamic_program/grid_game_with_policy_iterate.py)
4. [Value iteration algorithm flowchart](https://github.com/zhuliquan/reinforcement_learning_basic_book/blob/master/3-dynamic_program/value_iteration_algorithm.png)
5. [Value iteration example for the grid game under a greedy policy](https://github.com/zhuliquan/reinforcement_learning_basic_book/blob/master/3-dynamic_program/grid_game_with_value_iterate.py)
6. [Exercise: the maze game solved with dynamic programming](https://github.com/zhuliquan/reinforcement_learning_basic_book/blob/master/3-dynamic_program/maze_game_with_dynamic_program.py)
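
For readers who want a feel for value iteration before opening the files above, here is a minimal sketch on a hypothetical one-dimensional corridor. It does not reproduce the repository's grid game; the corridor layout, the step reward of -1, and the names `next_state`, `value_iteration`, and `greedy_policy` are assumptions chosen to keep the example short.

```python
# Hypothetical 1-D corridor: cells 0..4, cell 4 is the terminal goal.
# Every move costs -1, so the optimal value of a cell is minus its
# distance to the goal.
N_STATES = 5
GOAL = 4
ACTIONS = (-1, +1)  # move left / move right
GAMMA = 1.0
STEP_REWARD = -1.0


def next_state(s, a):
    """Move one cell and clip at the corridor walls."""
    return min(max(s + a, 0), N_STATES - 1)


def value_iteration(theta=1e-9):
    """Sweep the Bellman optimality backup in place until values stop changing."""
    v = [0.0] * N_STATES
    while True:
        delta = 0.0
        for s in range(N_STATES):
            if s == GOAL:
                continue  # the terminal state keeps value 0
            best = max(STEP_REWARD + GAMMA * v[next_state(s, a)] for a in ACTIONS)
            delta = max(delta, abs(best - v[s]))
            v[s] = best
        if delta < theta:
            return v


def greedy_policy(v):
    """Pick, in every non-terminal state, the action with the best one-step lookahead."""
    return {
        s: max(ACTIONS, key=lambda a: STEP_REWARD + GAMMA * v[next_state(s, a)])
        for s in range(N_STATES)
        if s != GOAL
    }
```

Policy iteration differs in that it alternates a full policy evaluation with a greedy improvement step, whereas the loop above folds the improvement (the `max`) into every backup.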
### 4-Monte Carlo Value Iteration
1. [Monte Carlo sampling](https://github.com/zhuliquan/reinforcement_learning_basic_book/blob/master/4-monte_carlo/monte_carlo_sample.py)
2. [Monte Carlo evaluation](https://github.com/zhuliquan/reinforcement_learning_basic_book/blob/master/4-monte_carlo/monte_carlo_evaluate.py)
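
The two steps listed above, sampling episodes and then averaging returns, can be sketched as first-visit Monte Carlo evaluation. The example uses a hypothetical five-state random walk rather than the repository's environment; `sample_episode`, `first_visit_mc`, the start state, and the reward scheme are assumptions for illustration only.

```python
import random

# Hypothetical random walk: non-terminal states 1..5, terminals 0 and 6.
# Stepping into state 6 yields reward +1, everything else yields 0.
# The policy being evaluated moves left or right with equal probability.
LEFT_TERMINAL, RIGHT_TERMINAL = 0, 6
START = 3


def sample_episode(rng):
    """Return [(state, reward), ...], where reward is received on leaving state."""
    state, episode = START, []
    while state not in (LEFT_TERMINAL, RIGHT_TERMINAL):
        nxt = state + rng.choice((-1, 1))
        reward = 1.0 if nxt == RIGHT_TERMINAL else 0.0
        episode.append((state, reward))
        state = nxt
    return episode


def first_visit_mc(num_episodes, gamma=1.0, seed=0):
    """Estimate V(s) as the average return following the first visit to s."""
    rng = random.Random(seed)
    returns_sum = {s: 0.0 for s in range(1, 6)}
    returns_cnt = {s: 0 for s in range(1, 6)}
    for _ in range(num_episodes):
        episode = sample_episode(rng)
        g = 0.0
        first_return = {}
        # Walking backwards, each earlier visit overwrites a later one,
        # so the dict ends up holding the return from the first visit.
        for state, reward in reversed(episode):
            g = reward + gamma * g
            first_return[state] = g
        for state, g_first in first_return.items():
            returns_sum[state] += g_first
            returns_cnt[state] += 1
    return {s: returns_sum[s] / returns_cnt[s] for s in returns_sum if returns_cnt[s]}
```

With `gamma = 1` the true value of state `s` in this walk is `s / 6`, so the estimates should approach 1/6, 2/6, ..., 5/6 as the number of episodes grows.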
### 5-Temporal-Difference Value Iteration
1. [Q-learning algorithm flowchart](https://github.com/zhuliquan/reinforcement_learning_basic_book/blob/master/5-temporal_difference/q_learning_algortihm.png)
2. [Sarsa algorithm flowchart](https://github.com/zhuliquan/reinforcement_learning_basic_book/master/5-temporal_difference/sarsa_algorithm.png)
3. [Sarsa(λ) algorithm flowchart](https://github.com/zhuliquan/reinforcement_learning_basic_book/master/5-temporal_difference/sarsa_lambda_algorithm.png)
4. [A box-pushing game example built on gym](https://github.com/zhuliquan/reinforcement_learning_basic_book/master/5-temporal_difference/push_box_game.py)
5. [Worked example: learning the box-pushing game with temporal-difference methods](https://github.com/zhuliquan/reinforcement_learning_basic_book/tree/master/5-temporal_difference/push_box_game)
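
To make the difference between the Q-learning and Sarsa flowcharts concrete, the sketch below implements both tabular updates on a hypothetical five-cell corridor. It is not the repository's box-pushing code; the environment, the hyperparameters, and the names `env_step`, `epsilon_greedy`, and `train` are assumptions. Sarsa(λ) with eligibility traces is not covered here.

```python
import random

N_STATES, GOAL = 5, 4   # hypothetical corridor, cells 0..4, goal at cell 4
ACTIONS = (-1, +1)      # move left / move right


def env_step(s, a):
    """Move one cell (clipped at the walls); every step costs -1."""
    s2 = min(max(s + a, 0), N_STATES - 1)
    return s2, -1.0, s2 == GOAL


def epsilon_greedy(q, s, eps, rng):
    """With probability eps pick a random action, otherwise the greedy one."""
    if rng.random() < eps:
        return rng.choice(ACTIONS)
    return max(ACTIONS, key=lambda a: q[(s, a)])


def train(method="q_learning", episodes=500, alpha=0.5, gamma=0.9, eps=0.1, seed=0):
    rng = random.Random(seed)
    q = {(s, a): 0.0 for s in range(N_STATES) for a in ACTIONS}
    for _ in range(episodes):
        s, done = 0, False
        a = epsilon_greedy(q, s, eps, rng)
        while not done:
            s2, r, done = env_step(s, a)
            if method == "sarsa":
                # On-policy: bootstrap from the action that will actually be taken.
                a2 = epsilon_greedy(q, s2, eps, rng)
                bootstrap = q[(s2, a2)]
            else:
                # Off-policy: bootstrap from the greedy action in the next state.
                bootstrap = max(q[(s2, b)] for b in ACTIONS)
            target = r if done else r + gamma * bootstrap
            q[(s, a)] += alpha * (target - q[(s, a)])
            if method != "sarsa":
                a2 = epsilon_greedy(q, s2, eps, rng)  # chosen after the update
            s, a = s2, a2
    return q
```

The only line that differs between the two methods is the bootstrap term: Sarsa uses the value of the next action it has already committed to, while Q-learning uses the maximum over next actions regardless of what is executed.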
### 6-Value Function Approximation
1. [Deep Q-learning algorithm flowchart](https://github.com/zhuliquan/reinforcement_learning_basic_book/blob/master/6-value_function_approximate/deep_q_network_algortihm.png)
2. [Deep Q-learning algorithm template](https://github.com/zhuliquan/reinforcement_learning_basic_book/blob/master/6-value_function_approximate/deep_q_network_template.py)
3. [A Flappy Bird game played with Deep Q-learning](https://github.com/zhuliquan/reinforcement_learning_basic_book/tree/master/6-value_function_approximate/deep_learning_flappy_bird)
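
Two pieces that Deep Q-learning implementations commonly share are an experience replay buffer and the computation of bootstrapped targets from a separate target network. The sketch below shows both in plain Python, with no neural network and no claim to match the repository's template; `ReplayBuffer`, `td_targets`, and the `q_target` callable (a stand-in for a target network that returns one value per action) are hypothetical names.

```python
import random
from collections import deque


class ReplayBuffer:
    """Fixed-size store of transitions; the oldest entries are dropped first."""

    def __init__(self, capacity, seed=0):
        self.buffer = deque(maxlen=capacity)
        self.rng = random.Random(seed)

    def push(self, state, action, reward, next_state, done):
        self.buffer.append((state, action, reward, next_state, done))

    def sample(self, batch_size):
        """Draw a batch uniformly at random, without replacement."""
        return self.rng.sample(list(self.buffer), batch_size)

    def __len__(self):
        return len(self.buffer)


def td_targets(batch, q_target, gamma):
    """y = r for terminal transitions, else r + gamma * max_a' Q_target(s', a').

    q_target is an assumed callable mapping a state to a list of action values.
    """
    targets = []
    for _state, _action, reward, next_state, done in batch:
        if done:
            targets.append(reward)
        else:
            targets.append(reward + gamma * max(q_target(next_state)))
    return targets
```

In a full training loop, the online network would be regressed toward these targets on each sampled batch, and the target network's weights would be refreshed from the online network at intervals.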