{"id":26773252,"url":"https://github.com/rlopensource/spinning_up_kr","last_synced_at":"2025-07-16T02:44:58.881Z","repository":{"id":95667359,"uuid":"177943259","full_name":"RLOpensource/spinning_up_kr","owner":"RLOpensource","description":null,"archived":false,"fork":false,"pushed_at":"2019-04-02T21:52:00.000Z","size":2046,"stargazers_count":6,"open_issues_count":0,"forks_count":3,"subscribers_count":2,"default_branch":"master","last_synced_at":"2025-04-15T21:48:45.683Z","etag":null,"topics":["ddpg","deep-deterministic-policy-gradient","ou-noise","ppo","ppo2","proximal-policy-optimization","reinforcement-learning","robotics","sac","soft-actor-critic","spinningup","td3","trpo","trust-region-policy-optimization"],"latest_commit_sha":null,"homepage":null,"language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":null,"status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/RLOpensource.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":null,"code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null,"zenodo":null}},"created_at":"2019-03-27T07:44:17.000Z","updated_at":"2020-04-15T16:35:08.000Z","dependencies_parsed_at":"2023-05-28T18:30:53.835Z","dependency_job_id":null,"html_url":"https://github.com/RLOpensource/spinning_up_kr","commit_stats":null,"previous_names":[],"tags_count":0,"template":false,"template_full_name":null,"purl":"pkg:github/RLOpensource/spinning_up_kr","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/RLOpensource%2Fspinning_up_kr","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/RLOpensource%2Fspinning_up_kr/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/RLOpensource%2Fspi
nning_up_kr/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/RLOpensource%2Fspinning_up_kr/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/RLOpensource","download_url":"https://codeload.github.com/RLOpensource/spinning_up_kr/tar.gz/refs/heads/master","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/RLOpensource%2Fspinning_up_kr/sbom","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":265477308,"owners_count":23773029,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["ddpg","deep-deterministic-policy-gradient","ou-noise","ppo","ppo2","proximal-policy-optimization","reinforcement-learning","robotics","sac","soft-actor-critic","spinningup","td3","trpo","trust-region-policy-optimization"],"created_at":"2025-03-29T01:47:29.732Z","updated_at":"2025-07-16T02:44:58.874Z","avatar_url":"https://github.com/RLOpensource.png","language":"Python","funding_links":[],"categories":[],"sub_categories":[],"readme":"# Reconstruction of OpenAI spinningup for reinforcement-learning\n\n* The purpose of this repository is to study and research reinforcement learning for robotic control.\n\n* This repository provides the following model-free reinforcement learning algorithms:\n\n```\nDDPG\nTRPO\nPPO\nPPO2\nSAC\nTD3\n```\n\n* These algorithms are demonstrated in the Reacher environment with [ML-Agents](https://github.com/Unity-Technologies/ml-agents).\n\n* The directory structure must follow the format below.\n\n```\n└─spinning_up_kr\n   ├─env(environment of reacher in unity)\n   
├─mlagents\n   ├─buffer.py\n   ├─core.py\n   ├─ddpg.py\n   ├─ou_noise.py\n   ├─ppo.py\n   ├─ppo2.py\n   ├─sac.py\n   ├─td3.py\n   └─trpo.py\n```\n\n## Demonstration\n\n\u003cdiv align=\"center\"\u003e\n  \u003cimg src=\"source/graph.png\" width=\"60%\" height='300'\u003e\n  \u003cimg src=\"source/table.png\" width=\"39%\" height='200'\u003e\n  \u003cimg src=\"source/out-2.gif\" width=\"100%\" height='400'\u003e\n\u003c/div\u003e\n\n## References\n\n[1] [Proximal Policy Optimization](https://arxiv.org/abs/1707.06347)\n\n[2] [High-Dimensional Continuous Control Using Generalized Advantage Estimation](https://arxiv.org/abs/1506.02438)\n\n[3] [Continuous Control With Deep Reinforcement Learning](https://arxiv.org/pdf/1509.02971.pdf)\n\n[4] [OpenAI Spinningup](https://github.com/openai/spinningup)\n\n[5] [Reinforcement Learning Korea PG Travel](https://github.com/reinforcement-learning-kr/pg_travel)\n\n[6] [Medipixel Reinforcement Learning Repository](https://github.com/medipixel/rl_algorithms)\n\n[7] [Soft Actor-Critic: Off-Policy Maximum Entropy Deep Reinforcement Learning with a Stochastic Actor](https://arxiv.org/abs/1801.01290)\n\n[8] [TensorFlow reinforcement learning framework](https://github.com/RLOpensource/tensorflow_RL)\n\n[9] [Trust Region Policy Optimization](https://arxiv.org/abs/1502.05477)\n\n[10] [Addressing Function Approximation Error in Actor-Critic Methods](https://arxiv.org/abs/1802.09477)","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Frlopensource%2Fspinning_up_kr","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Frlopensource%2Fspinning_up_kr","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Frlopensource%2Fspinning_up_kr/lists"}