{"id":18049591,"url":"https://github.com/dalmia/david-silver-reinforcement-learning","last_synced_at":"2025-04-04T10:08:18.141Z","repository":{"id":51536257,"uuid":"116924594","full_name":"dalmia/David-Silver-Reinforcement-learning","owner":"dalmia","description":"Notes for the Reinforcement Learning course by David Silver along with implementation of various algorithms.","archived":false,"fork":false,"pushed_at":"2022-03-31T00:28:42.000Z","size":22952,"stargazers_count":801,"open_issues_count":5,"forks_count":213,"subscribers_count":21,"default_branch":"master","last_synced_at":"2025-03-28T09:08:31.466Z","etag":null,"topics":["artificial-intelligence","course-notes","gym-environment","open-ai","python","reinforcement-learning"],"latest_commit_sha":null,"homepage":"","language":"Jupyter Notebook","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"mit","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/dalmia.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null}},"created_at":"2018-01-10T07:38:51.000Z","updated_at":"2025-03-26T18:25:47.000Z","dependencies_parsed_at":"2022-08-12T23:31:13.269Z","dependency_job_id":null,"html_url":"https://github.com/dalmia/David-Silver-Reinforcement-learning","commit_stats":null,"previous_names":[],"tags_count":0,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/dalmia%2FDavid-Silver-Reinforcement-learning","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/dalmia%2FDavid-Silver-Reinforcement-learning/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/dalmia%2FDavid-Silver-Reinforcement-learning/releases","manifests_url":"https://repos.ecosyste.ms/api/v
1/hosts/GitHub/repositories/dalmia%2FDavid-Silver-Reinforcement-learning/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/dalmia","download_url":"https://codeload.github.com/dalmia/David-Silver-Reinforcement-learning/tar.gz/refs/heads/master","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":247157283,"owners_count":20893220,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["artificial-intelligence","course-notes","gym-environment","open-ai","python","reinforcement-learning"],"created_at":"2024-10-30T21:08:10.726Z","updated_at":"2025-04-04T10:08:18.117Z","avatar_url":"https://github.com/dalmia.png","language":"Jupyter Notebook","funding_links":[],"categories":[],"sub_categories":[],"readme":"# David-Silver-Reinforcement-learning \n\n[![Tweet](https://img.shields.io/twitter/url/http/shields.io.svg?style=social)](https://twitter.com/intent/tweet?text=David%20Silver%20Reinforcement%20Learning%20course%20notes%20along%20with%20implementation\u0026url=https://github.com/dalmia/David-Silver-Reinforcement-learning\u0026hashtags=deeplearning,reinforcementlearning,python,machinelearning,keras)\n\n[![apm](https://img.shields.io/apm/l/vim-mode.svg)]()\n[![Build Status](https://travis-ci.org/athityakumar/colorls.svg?branch=master)](https://travis-ci.org/athityakumar/colorls)\n[![PRs Welcome](https://img.shields.io/badge/PRs-welcome-brightgreen.svg?style=shields)](http://makeapullrequest.com)\n\nThis repository contains the notes for the Reinforcement Learning [course](http://www0.cs.ucl.ac.uk/staff/d.silver/web/Teaching.html) by 
[David Silver](http://www0.cs.ucl.ac.uk/staff/d.silver/web/Home.html) along with the implementation of the various algorithms discussed, both in Keras (with TensorFlow backend) and [OpenAI](https://openai.com/)'s [gym](https://github.com/openai/gym) framework.\n\n## Syllabus:\n\n- Week 1: Introduction to Reinforcement Learning [[slide](http://www0.cs.ucl.ac.uk/staff/d.silver/web/Teaching_files/intro_RL.pdf)][[video](https://www.youtube.com/watch?v=2pWv7GOvuf0\u0026list=PL7-jPKtc4r78-wCZcQn5IqyuWhBZ8fOxT\u0026index=1)]\n\n- Week 2: Markov Decision Processes  [[slide](http://www0.cs.ucl.ac.uk/staff/d.silver/web/Teaching_files/MDP.pdf)][[video](https://www.youtube.com/watch?v=lfHX2hHRMVQ\u0026list=PL7-jPKtc4r78-wCZcQn5IqyuWhBZ8fOxT\u0026index=2\u0026t=3223s)]\n\n- Week 3: Planning by Dynamic Programming  [[slide](http://www0.cs.ucl.ac.uk/staff/d.silver/web/Teaching_files/DP.pdf)][[video](https://www.youtube.com/watch?v=Nd1-UUMVfz4\u0026list=PL7-jPKtc4r78-wCZcQn5IqyuWhBZ8fOxT\u0026index=3\u0026t=417s)]\n\n- Week 4: Model-Free Prediction  [[slide](http://www0.cs.ucl.ac.uk/staff/d.silver/web/Teaching_files/MC-TD.pdf)][[video](https://www.youtube.com/watch?v=PnHCvfgC_ZA\u0026list=PL7-jPKtc4r78-wCZcQn5IqyuWhBZ8fOxT\u0026index=4)]\n\n- Week 5: Model-Free Control  [[slide](http://www0.cs.ucl.ac.uk/staff/d.silver/web/Teaching_files/control.pdf)][[video](https://www.youtube.com/watch?v=0g4j2k_Ggc4\u0026list=PL7-jPKtc4r78-wCZcQn5IqyuWhBZ8fOxT\u0026index=5)]\n\n- Week 6: Value Function Approximation  [[slide](http://www0.cs.ucl.ac.uk/staff/d.silver/web/Teaching_files/FA.pdf)][[video](https://www.youtube.com/watch?v=UoPei5o4fps\u0026list=PL7-jPKtc4r78-wCZcQn5IqyuWhBZ8fOxT\u0026index=6)]\n\n- Week 7: Policy Gradient Methods  [[slide](http://www0.cs.ucl.ac.uk/staff/d.silver/web/Teaching_files/pg.pdf)][[video](https://www.youtube.com/watch?v=KHZVXao4qXs\u0026list=PL7-jPKtc4r78-wCZcQn5IqyuWhBZ8fOxT\u0026index=7)]\n\n- Week 8: Integrating Learning and Planning  
[[slide](http://www0.cs.ucl.ac.uk/staff/d.silver/web/Teaching_files/dyna.pdf)][[video](https://www.youtube.com/watch?v=ItMutbeOHtc\u0026list=PL7-jPKtc4r78-wCZcQn5IqyuWhBZ8fOxT\u0026index=8)]\n\n- Week 9: Exploration and Exploitation  [[slide](http://www0.cs.ucl.ac.uk/staff/d.silver/web/Teaching_files/XX.pdf)][[video](https://www.youtube.com/watch?v=sGuiWX07sKw\u0026list=PL7-jPKtc4r78-wCZcQn5IqyuWhBZ8fOxT\u0026index=9)]\n\n- Week 10: Case Study: RL in Classic Games  [[slide](http://www0.cs.ucl.ac.uk/staff/d.silver/web/Teaching_files/games.pdf)][[video](https://www.youtube.com/watch?v=kZ_AUmFcZtk\u0026list=PL7-jPKtc4r78-wCZcQn5IqyuWhBZ8fOxT\u0026index=10)]\n\n\n## Dependencies\n- TensorFlow\n- Keras\n- Gym\n- Numpy\n\nInstall them using [pip](https://www.google.co.in/url?sa=t\u0026rct=j\u0026q=\u0026esrc=s\u0026source=web\u0026cd=1\u0026cad=rja\u0026uact=8\u0026ved=0ahUKEwjRhLWLnfHYAhVEtY8KHRqfCc4QFggoMAA\u0026url=https%3A%2F%2Fpip.pypa.io%2Fen%2Fstable%2F\u0026usg=AOvVaw18gydNGbBQg6WMxXoxO97K).\n\n## Contributing\nPlease feel free to create a Pull Request for adding implementations of the algorithms discussed in different frameworks like PyTorch, Caffe, etc. or improving the existing implementations. 
If you are a beginner, you can refer to [this guide](https://opensource.guide/how-to-contribute/) to get started.\n\n## Support\nIf you found this useful, please consider starring (★) the repo so that it can reach a broader audience.\n\n## License\nThis project is licensed under the MIT License - see the [LICENSE](https://github.com/dalmia/David-Silver-Reinforcement-learning/blob/master/LICENSE) file for details.\n\n## References\n- https://github.com/dennybritz/reinforcement-learning\n- https://github.com/llSourcell/AI_for_Video_Games_Syllabus\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fdalmia%2Fdavid-silver-reinforcement-learning","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fdalmia%2Fdavid-silver-reinforcement-learning","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fdalmia%2Fdavid-silver-reinforcement-learning/lists"}