{"id":15027421,"url":"https://github.com/awjuliani/deeprl-agents","last_synced_at":"2025-05-15T12:03:58.914Z","repository":{"id":48186974,"uuid":"61159758","full_name":"awjuliani/DeepRL-Agents","owner":"awjuliani","description":"A set of Deep Reinforcement Learning Agents implemented in Tensorflow.","archived":false,"fork":false,"pushed_at":"2019-02-12T17:26:26.000Z","size":360,"stargazers_count":2251,"open_issues_count":45,"forks_count":827,"subscribers_count":119,"default_branch":"master","last_synced_at":"2025-05-15T12:03:52.540Z","etag":null,"topics":["reinforcement-learning","tensorflow"],"latest_commit_sha":null,"homepage":"","language":"Jupyter Notebook","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"mit","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/awjuliani.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null}},"created_at":"2016-06-14T22:25:31.000Z","updated_at":"2025-05-02T20:10:58.000Z","dependencies_parsed_at":"2022-07-25T00:02:16.037Z","dependency_job_id":null,"html_url":"https://github.com/awjuliani/DeepRL-Agents","commit_stats":null,"previous_names":[],"tags_count":0,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/awjuliani%2FDeepRL-Agents","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/awjuliani%2FDeepRL-Agents/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/awjuliani%2FDeepRL-Agents/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/awjuliani%2FDeepRL-Agents/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/awjuliani","download_url":"https://codeload.github.com/awjuliani/DeepRL-Agents/t
ar.gz/refs/heads/master","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":254337612,"owners_count":22054253,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["reinforcement-learning","tensorflow"],"created_at":"2024-09-24T20:06:24.340Z","updated_at":"2025-05-15T12:03:53.899Z","avatar_url":"https://github.com/awjuliani.png","language":"Jupyter Notebook","funding_links":[],"categories":[],"sub_categories":[],"readme":"# Deep Reinforcement Learning Agents\n\nThis repository contains a collection of reinforcement learning algorithms written in TensorFlow. The IPython notebooks here were written to go\nalong with a still-underway tutorial series I have been publishing on [Medium](https://medium.com/@awjuliani/simple-reinforcement-learning-with-tensorflow-part-0-q-learning-with-tables-and-neural-networks-d195264329d0#.4gyadb8a4).\nIf you are new to reinforcement learning, I recommend reading the accompanying post for each algorithm.\n\nThe repository currently contains the following algorithms:\n* **Q-Table** - An implementation of Q-learning using tables to solve a stochastic environment problem.\n* **Q-Network** - A neural network implementation of Q-learning to solve the same environment as in Q-Table.\n* **Simple-Policy** - An implementation of a policy gradient method for stateless environments such as n-armed bandit problems.\n* **Contextual-Policy** - An implementation of a policy gradient method for stateful environments such as contextual bandit problems.\n* **Policy-Network** - An implementation of a neural network policy-gradient agent 
that solves full RL problems with states, delayed rewards, and two opposite actions (e.g., CartPole or Pong).\n* **Vanilla-Policy** - An implementation of a neural network vanilla-policy-gradient agent that solves full RL problems with states, delayed rewards, and an arbitrary number of actions.\n* **Model-Network** - An addition to the Policy-Network algorithm that includes a separate network modeling the environment dynamics.\n* **Double-Dueling-DQN** - An implementation of a Deep-Q Network with the Double DQN and Dueling DQN additions to improve stability and performance.\n* **Deep-Recurrent-Q-Network** - An implementation of a Deep Recurrent Q-Network which can solve reinforcement learning problems involving partial observability.\n* **Q-Exploration** - An implementation of DQN containing multiple action-selection strategies for exploration. Strategies include: greedy, random, e-greedy, Boltzmann, and Bayesian Dropout.\n* **A3C-Doom** - An implementation of the Asynchronous Advantage Actor-Critic (A3C) algorithm. It utilizes multiple agents to collectively improve a policy. This implementation can solve RL problems in 3D environments such as VizDoom challenges.\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fawjuliani%2Fdeeprl-agents","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fawjuliani%2Fdeeprl-agents","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fawjuliani%2Fdeeprl-agents/lists"}