{"id":20432366,"url":"https://github.com/captaine/deep-reinforcement-learning-a3c","last_synced_at":"2026-05-20T07:12:16.203Z","repository":{"id":134234094,"uuid":"198687408","full_name":"CaptainE/Deep-reinforcement-learning-A3C","owner":"CaptainE","description":"Repository containing material regarding a modified version of the Berkeley Deep reinforcement learning course  and an implementation of A3C as a project","archived":false,"fork":false,"pushed_at":"2019-10-10T18:46:39.000Z","size":11376,"stargazers_count":3,"open_issues_count":0,"forks_count":0,"subscribers_count":1,"default_branch":"master","last_synced_at":"2025-03-05T06:18:07.639Z","etag":null,"topics":["a3c","advantage-estimator","berkeley-reinforcement-learning","cs-294","reinforcement-learning"],"latest_commit_sha":null,"homepage":"","language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"mit","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/CaptainE.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null}},"created_at":"2019-07-24T18:13:24.000Z","updated_at":"2024-07-21T10:36:43.000Z","dependencies_parsed_at":null,"dependency_job_id":"e18bc0bf-cc0b-4ae4-ac9d-218f20104d84","html_url":"https://github.com/CaptainE/Deep-reinforcement-learning-A3C","commit_stats":null,"previous_names":[],"tags_count":0,"template":false,"template_full_name":null,"purl":"pkg:github/CaptainE/Deep-reinforcement-learning-A3C","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/CaptainE%2FDeep-reinforcement-learning-A3C","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/CaptainE%2FDeep-reinforcement-learning-A3C/
tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/CaptainE%2FDeep-reinforcement-learning-A3C/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/CaptainE%2FDeep-reinforcement-learning-A3C/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/CaptainE","download_url":"https://codeload.github.com/CaptainE/Deep-reinforcement-learning-A3C/tar.gz/refs/heads/master","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/CaptainE%2FDeep-reinforcement-learning-A3C/sbom","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":267983505,"owners_count":24176060,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","status":"online","status_checked_at":"2025-07-31T02:00:08.723Z","response_time":66,"last_error":null,"robots_txt_status":"success","robots_txt_updated_at":"2025-07-24T06:49:26.215Z","robots_txt_url":"https://github.com/robots.txt","online":true,"can_crawl_api":true,"host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["a3c","advantage-estimator","berkeley-reinforcement-learning","cs-294","reinforcement-learning"],"created_at":"2024-11-15T08:14:44.205Z","updated_at":"2026-05-20T07:12:13.445Z","avatar_url":"https://github.com/CaptainE.png","language":"Python","funding_links":[],"categories":[],"sub_categories":[],"readme":"# Deep-reinforcement-learning-A3C\nThis repository contains material from a modified version of the Berkeley Deep Reinforcement Learning course (it covers only some of the CS294-112 assignments) and a PyTorch implementation of Asynchronous Advantage Actor-Critic (A3C) 
using Generalized Advantage Estimation (GAE) as a project.\n\nSolution material is provided only for homework 1, 4 and 5a, implemented in TensorFlow.\n\nA3C relies on distributed learning: numerous workers interact with the environment in parallel\nand update the model parameters asynchronously (hence the name).\nThis removes the need for the memory buffer used by other algorithms, and it allows for more\nefficient use of hardware because multiple rollouts are generated in parallel.\n\nBecause our rollouts are generated on-policy, there is a high chance that all trajectories end up being similar,\nas action probabilities gradually become near-zero for all but one action in the discrete case, which ultimately limits exploration.\nA3C addresses this problem by adding an entropy term to the loss function, which is discussed in section 2.\nWe extend A3C by replacing the advantage estimator used in (Mnih, 2016) with the Generalized Advantage Estimate (GAE) proposed by (Schulman, 2015), and evaluate the algorithm on a number of environments.\n\nSee our paper for more details.\n\nThe work presented in this repository is open source under the MIT License. If you find this code useful in your research, please cite\n\n```\n@misc{hansen-ebert,\n  title={Distributed Deep Reinforcement Learning with Asynchronous Advantage Actor-Critic using Generalized Advantage Estimation},\n  author={Ebert, Peter and Hansen, Nicklas},\n  year={2019}\n}\n```\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fcaptaine%2Fdeep-reinforcement-learning-a3c","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fcaptaine%2Fdeep-reinforcement-learning-a3c","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fcaptaine%2Fdeep-reinforcement-learning-a3c/lists"}