https://github.com/zhuliquan/reinforcement_learning_basic_book

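This is a repository for learning the basic principles of reinforcement learning. It mainly contains code for some of the examples and exercises in the book 《深入浅出强化学习原理入门》.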

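# Code Overview
## Description
> This is my code repository for studying 《深入浅出强化学习-原理入门》. It mainly contains code for the examples in the book and for the exercises at the end of it.
## Contents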
### 1-Extending gym (gym development)
1. [Configuration files for extending gym](https://github.com/zhuliquan/reinforcement_learning_basic_book/blob/master/1-gym_developing/README.md)
2. [Modified version of gym's core.py](https://github.com/zhuliquan/reinforcement_learning_basic_book/blob/master/1-gym_developing/core.py)
3. [A grid game example built on gym](https://github.com/zhuliquan/reinforcement_learning_basic_book/blob/master/1-gym_developing/grid_game.py)
4. [A maze game example built on gym](https://github.com/zhuliquan/reinforcement_learning_basic_book/blob/master/1-gym_developing/maze_game.py)
### 2-Markov Decision Process
1. [A study-life example](https://github.com/zhuliquan/reinforcement_learning_basic_book/blob/master/2-markov_decision_process/our_life.py)
2. [Exercise: simulating the maze environment](https://github.com/zhuliquan/reinforcement_learning_basic_book/blob/master/2-markov_decision_process/game.py)
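
To illustrate the kind of object this chapter works with, here is a minimal sketch that stores a tiny deterministic Markov decision process in a dictionary and computes the discounted return of one rollout. The state names, actions, rewards, and the helpers `rollout` and `discounted_return` are hypothetical and are not taken from the book or from the repository files.

```python
# Hypothetical two-state MDP; names and numbers are illustrative only.
# transitions[state][action] -> (next_state, reward)
transitions = {
    "study": {"continue": ("study", -1.0), "rest": ("sleep", 0.0)},
    "sleep": {},  # terminal state: no actions available
}


def discounted_return(rewards, gamma):
    """Return G = r_0 + gamma * r_1 + gamma^2 * r_2 + ..."""
    g = 0.0
    # Accumulate backwards using G_t = r_t + gamma * G_{t+1}.
    for r in reversed(rewards):
        g = r + gamma * g
    return g


def rollout(start, actions):
    """Follow a fixed action sequence and collect the rewards."""
    state, rewards = start, []
    for a in actions:
        state, r = transitions[state][a]
        rewards.append(r)
    return state, rewards


final_state, rewards = rollout("study", ["continue", "continue", "rest"])
print(final_state, rewards, discounted_return(rewards, gamma=0.5))
```

With `gamma=0.5` the rewards `[-1.0, -1.0, 0.0]` give a return of `-1.5`, because the second reward is weighted by 0.5 and the third by 0.25.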
### 3-Dynamic Programming
1. [Policy evaluation example for the grid game under a uniform policy](https://github.com/zhuliquan/reinforcement_learning_basic_book/blob/master/3-dynamic_program/grid_game_with_average_policy.py)
2. [Flowchart of the policy iteration algorithm](https://github.com/zhuliquan/reinforcement_learning_basic_book/blob/master/3-dynamic_program/policy_iteration_algorithm.png)
3. [Policy iteration example for the grid game under a greedy policy](https://github.com/zhuliquan/reinforcement_learning_basic_book/blob/master/3-dynamic_program/grid_game_with_policy_iterate.py)
4. [Flowchart of the value iteration algorithm](https://github.com/zhuliquan/reinforcement_learning_basic_book/blob/master/3-dynamic_program/value_iteration_algorithm.png)
5. [Value iteration example for the grid game under a greedy policy](https://github.com/zhuliquan/reinforcement_learning_basic_book/blob/master/3-dynamic_program/grid_game_with_value_iterate.py)
6. [Exercise: solving the maze game with dynamic programming](https://github.com/zhuliquan/reinforcement_learning_basic_book/blob/master/3-dynamic_program/maze_game_with_dynamic_program.py)
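
As a rough illustration of the value iteration listed above, the following sketch runs it on a hypothetical four-cell corridor instead of the repository's grid game. The corridor layout, the reward of -1 per move, and the names `step`, `value_iteration`, and `greedy_policy` are assumptions made for this example only.

```python
# Hypothetical corridor: states 0..3, state 3 is the terminal goal.
N_STATES, GOAL, GAMMA = 4, 3, 1.0
ACTIONS = {"left": -1, "right": +1}


def step(state, action):
    """Deterministic model: move, stay inside the corridor, pay -1 per move."""
    nxt = min(max(state + ACTIONS[action], 0), N_STATES - 1)
    return nxt, -1.0


def value_iteration(theta=1e-9):
    """Sweep the Bellman optimality backup until values stop changing."""
    v = [0.0] * N_STATES
    while True:
        delta = 0.0
        for s in range(N_STATES):
            if s == GOAL:
                continue  # the terminal value stays 0
            outcomes = [step(s, a) for a in ACTIONS]
            best = max(r + GAMMA * v[nxt] for nxt, r in outcomes)
            delta = max(delta, abs(best - v[s]))
            v[s] = best
        if delta < theta:
            return v


def greedy_policy(v):
    """Pick, in every non-terminal state, the action with the best backup."""
    policy = {}
    for s in range(N_STATES):
        if s == GOAL:
            continue

        def backup(a, s=s):
            nxt, r = step(s, a)
            return r + GAMMA * v[nxt]

        policy[s] = max(ACTIONS, key=backup)
    return policy


values = value_iteration()
print(values, greedy_policy(values))
```

Each value converges to minus the number of moves still needed to reach the goal, so the greedy policy moves right in every cell.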
### 4-Monte Carlo Value Iteration (Monte Carlo)
1. [Monte Carlo sampling](https://github.com/zhuliquan/reinforcement_learning_basic_book/blob/master/4-monte_carlo/monte_carlo_sample.py)
2. [Monte Carlo evaluation](https://github.com/zhuliquan/reinforcement_learning_basic_book/blob/master/4-monte_carlo/monte_carlo_evaluate.py)
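
The evaluation step can be sketched as first-visit Monte Carlo estimation: average, for each state, the return that follows its first occurrence in every episode. This is a minimal sketch, not the code in `monte_carlo_evaluate.py`. The function name `first_visit_mc` and the two hand-written episodes are hypothetical, and the episodes are written out by hand rather than sampled so that the result is reproducible.

```python
from collections import defaultdict


def first_visit_mc(episodes, gamma):
    """Estimate V(s) as the mean return following the first visit to s.

    Each episode is a list of (state, reward) pairs, where reward is the
    reward received after leaving that state.
    """
    returns_sum = defaultdict(float)
    returns_cnt = defaultdict(int)
    for episode in episodes:
        # Compute the return G_t for every time step by scanning backwards.
        g = 0.0
        returns = []
        for _, reward in reversed(episode):
            g = reward + gamma * g
            returns.append(g)
        returns.reverse()

        seen = set()
        for (state, _), g_t in zip(episode, returns):
            if state in seen:
                continue  # only the first visit in each episode counts
            seen.add(state)
            returns_sum[state] += g_t
            returns_cnt[state] += 1
    return {s: returns_sum[s] / returns_cnt[s] for s in returns_sum}


# Hypothetical episodes, written by hand for illustration.
episodes = [
    [("A", 1.0), ("B", 0.0), ("A", 2.0)],
    [("B", 4.0)],
]
estimates = first_visit_mc(episodes, gamma=0.5)
print(estimates)
```

In the first episode the second visit to `A` is ignored, so `A` is estimated from the single return 1.5, while `B` averages the returns 1.0 and 4.0.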
### 5-Temporal-Difference Value Iteration (Temporal Difference)
1. [Flowchart of the Q-learning algorithm](https://github.com/zhuliquan/reinforcement_learning_basic_book/blob/master/5-temporal_difference/q_learning_algortihm.png)
2. [Flowchart of the Sarsa algorithm](https://github.com/zhuliquan/reinforcement_learning_basic_book/blob/master/5-temporal_difference/sarsa_algorithm.png)
3. [Flowchart of the Sarsa(λ) algorithm](https://github.com/zhuliquan/reinforcement_learning_basic_book/blob/master/5-temporal_difference/sarsa_lambda_algorithm.png)
4. [A box-pushing game example built on gym](https://github.com/zhuliquan/reinforcement_learning_basic_book/blob/master/5-temporal_difference/push_box_game.py)
5. [Box-pushing example solved with temporal-difference learning](https://github.com/zhuliquan/reinforcement_learning_basic_book/tree/master/5-temporal_difference/push_box_game)
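
The difference between the Q-learning and Sarsa updates named above can be sketched in a few lines. This is a minimal tabular sketch, assuming a Q-table stored in a `defaultdict` keyed by `(state, action)`. The function names, state names, and numbers are hypothetical and do not come from the repository's box-pushing code.

```python
from collections import defaultdict


def q_learning_update(q, s, a, r, s_next, actions, alpha, gamma):
    """Off-policy TD update: bootstrap from the best action in s_next."""
    target = r + gamma * max(q[(s_next, b)] for b in actions)
    q[(s, a)] += alpha * (target - q[(s, a)])


def sarsa_update(q, s, a, r, s_next, a_next, alpha, gamma):
    """On-policy TD update: bootstrap from the action actually taken next."""
    target = r + gamma * q[(s_next, a_next)]
    q[(s, a)] += alpha * (target - q[(s, a)])


def make_table():
    """Hypothetical Q-table with two known values in state 's1'."""
    q = defaultdict(float)  # unseen (state, action) pairs default to 0.0
    q[("s1", "left")] = 1.0
    q[("s1", "right")] = 3.0
    return q


actions = ["left", "right"]

q_ql = make_table()
q_learning_update(q_ql, "s0", "right", 1.0, "s1", actions, alpha=0.5, gamma=0.5)

q_sarsa = make_table()
sarsa_update(q_sarsa, "s0", "right", 1.0, "s1", "left", alpha=0.5, gamma=0.5)

print(q_ql[("s0", "right")], q_sarsa[("s0", "right")])
```

Starting from the same table and the same transition, Q-learning bootstraps from the larger value 3.0 and moves the entry to 1.25, while Sarsa uses the value of the sampled next action `left` and moves it to 0.75.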
### 6-Value Function Approximation
1. [Flowchart of the Deep Q-learning algorithm](https://github.com/zhuliquan/reinforcement_learning_basic_book/blob/master/6-value_function_approximate/deep_q_network_algortihm.png)
2. [Deep Q-learning algorithm template](https://github.com/zhuliquan/reinforcement_learning_basic_book/blob/master/6-value_function_approximate/deep_q_network_template.py)
3. [Flappy Bird game written with Deep Q-learning](https://github.com/zhuliquan/reinforcement_learning_basic_book/tree/master/6-value_function_approximate/deep_learning_flappy_bird)