{"id":16389010,"url":"https://github.com/luckyzxl2016/reinforcement-learning","last_synced_at":"2025-03-23T04:31:33.420Z","repository":{"id":105601244,"uuid":"171980595","full_name":"LuckyZXL2016/Reinforcement-Learning","owner":"LuckyZXL2016","description":"The road to learning Reinforcement Learning","archived":false,"fork":false,"pushed_at":"2019-05-29T13:53:04.000Z","size":51,"stargazers_count":32,"open_issues_count":0,"forks_count":35,"subscribers_count":1,"default_branch":"master","last_synced_at":"2025-03-18T17:44:59.641Z","etag":null,"topics":["machine-learning","python","reinforcement-learning"],"latest_commit_sha":null,"homepage":"","language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":null,"status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/LuckyZXL2016.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":null,"code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null}},"created_at":"2019-02-22T02:30:16.000Z","updated_at":"2025-02-27T07:53:57.000Z","dependencies_parsed_at":null,"dependency_job_id":"4b49b006-2076-4de6-b36c-feb777537d66","html_url":"https://github.com/LuckyZXL2016/Reinforcement-Learning","commit_stats":null,"previous_names":[],"tags_count":0,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/LuckyZXL2016%2FReinforcement-Learning","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/LuckyZXL2016%2FReinforcement-Learning/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/LuckyZXL2016%2FReinforcement-Learning/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/LuckyZXL2016%2FReinforcement-Learning/manifests","
owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/LuckyZXL2016","download_url":"https://codeload.github.com/LuckyZXL2016/Reinforcement-Learning/tar.gz/refs/heads/master","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":245056889,"owners_count":20553855,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["machine-learning","python","reinforcement-learning"],"created_at":"2024-10-11T04:30:46.095Z","updated_at":"2025-03-23T04:31:33.133Z","avatar_url":"https://github.com/LuckyZXL2016.png","language":"Python","funding_links":[],"categories":[],"sub_categories":[],"readme":"# Reinforcement Learning Blog Posts and Accompanying Code\nA record of my own progress in learning reinforcement learning, from the basics to more advanced topics. The main reference so far is David Silver's open course; some of the code mentioned below comes from the internet.\n\n## [Table of Contents](#table-of-contents)\n- [Reinforcement learning blog posts and code](#reinforcement-learning-blog-posts-and-code)\n\n## Reinforcement learning blog posts and code:\n|**Blog post**                                                                                  | **Code**      | \n| --------------------------------------------------------------------------------------------- |:-------------:| \n| [Reinforcement Learning: Terminology and Mathematical Notation](https://blog.csdn.net/u011254180/article/details/84031546)            | None | \n| [Reinforcement Learning (1): Introduction](https://blog.csdn.net/u011254180/article/details/83349455)            | None      |   \n| [Reinforcement Learning (2): Markov Decision Processes](https://blog.csdn.net/u011254180/article/details/83387344)       | None      |    \n| [Reinforcement Learning (3): Finding the Optimal Policy with Dynamic Programming](https://blog.csdn.net/u011254180/article/details/83573220)       | None      |\n| [Reinforcement Learning (4): Model-Free Prediction](https://blog.csdn.net/u011254180/article/details/83994391)       | None      |\n| [Reinforcement Learning (5): Model-Free Control](https://blog.csdn.net/u011254180/article/details/84253095)       | None      |\n| 
[Reinforcement Learning in Practice (1): The Tic-Tac-Toe Game](https://blog.csdn.net/u011254180/article/details/86479795)       | [Code](/01-blog_code/Tic-Tac-Toe/example.py)      |\n| [Reinforcement Learning in Practice (2): Iterative Evaluation of a Random Policy in a 4\\*4 Gridworld](https://blog.csdn.net/u011254180/article/details/88133551)    | [Code](/01-blog_code/Gridworld/gridworld.py)      |\n| [Reinforcement Learning in Practice (3): Understanding gym's Modeling Approach](https://blog.csdn.net/u011254180/article/details/88211536)       | None  |  \n| [Reinforcement Learning in Practice (4): Writing a General-Purpose Gridworld Environment Class](https://blog.csdn.net/u011254180/article/details/88220484)       | [Code](/01-blog_code/Gridworld2/gridworld2.py)  | \n| [Reinforcement Learning in Practice (5): Implementing the Agent Class and the SARSA Algorithm](https://blog.csdn.net/u011254180/article/details/88430601)       | [Code](/01-blog_code/sarsa/sarsa.py)  |\n| [Reinforcement Learning in Practice (6): Implementing the SARSA(λ) Algorithm](https://blog.csdn.net/u011254180/article/details/88673519)       | [Code](/01-blog_code/sarsa/sarsa(lambda).py)  |\n| [Reinforcement Learning (6): Value Function Approximation](https://blog.csdn.net/u011254180/article/details/89238765)       | None      |\n| [Reinforcement Learning in Practice (7): Adding Memory to the Agent](https://blog.csdn.net/u011254180/article/details/89326920)       | [Code](/01-blog_code/core/core.py)  |\n| [Reinforcement Learning (7): Policy Gradient](https://blog.csdn.net/u011254180/article/details/89431822)       | None      |\n| [Reinforcement Learning (8): Integrating Learning and Planning](https://blog.csdn.net/u011254180/article/details/89556617)       | None      |\n| [Reinforcement Learning (9): Exploration and Exploitation](https://blog.csdn.net/u011254180/article/details/90063387)       | None      |\n| [Reinforcement Learning in Practice (8): Implementing DQN](https://blog.csdn.net/u011254180/article/details/90240163)       | [Code](/01-blog_code/dqn/approxagent.py)  |\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fluckyzxl2016%2Freinforcement-learning","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fluckyzxl2016%2Freinforcement-learning","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fluckyzxl2016%2Freinforcement-learning/lists"}