{"id":16936788,"url":"https://github.com/chendrag/mujoco-benchmark","last_synced_at":"2025-03-17T07:32:27.802Z","repository":{"id":150801088,"uuid":"350270014","full_name":"ChenDRAG/mujoco-benchmark","owner":"ChenDRAG","description":"Provide full reinforcement learning benchmark on mujoco environments, including ddpg, sac, td3, pg, a2c, ppo, library","archived":false,"fork":false,"pushed_at":"2021-04-29T11:09:07.000Z","size":240,"stargazers_count":85,"open_issues_count":0,"forks_count":6,"subscribers_count":5,"default_branch":"master","last_synced_at":"2025-02-27T20:40:33.427Z","etag":null,"topics":["baseline","benchmark","ddpg","drl","mujoco","performance","ppo","pytorch","results","rl","sac","tianshou"],"latest_commit_sha":null,"homepage":"","language":null,"has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":null,"status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/ChenDRAG.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":null,"code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null}},"created_at":"2021-03-22T08:47:34.000Z","updated_at":"2025-01-17T16:07:10.000Z","dependencies_parsed_at":"2023-05-21T09:45:36.124Z","dependency_job_id":null,"html_url":"https://github.com/ChenDRAG/mujoco-benchmark","commit_stats":null,"previous_names":[],"tags_count":0,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/ChenDRAG%2Fmujoco-benchmark","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/ChenDRAG%2Fmujoco-benchmark/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/ChenDRAG%2Fmujoco-benchmark/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/re
positories/ChenDRAG%2Fmujoco-benchmark/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/ChenDRAG","download_url":"https://codeload.github.com/ChenDRAG/mujoco-benchmark/tar.gz/refs/heads/master","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":243852425,"owners_count":20358270,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["baseline","benchmark","ddpg","drl","mujoco","performance","ppo","pytorch","results","rl","sac","tianshou"],"created_at":"2024-10-13T20:57:55.053Z","updated_at":"2025-03-17T07:32:27.797Z","avatar_url":"https://github.com/ChenDRAG.png","language":null,"readme":"\u003cdiv align=\"center\"\u003e\n  \u003ca href=\"http://tianshou.readthedocs.io\"\u003e\u003cimg width=\"300px\" height=\"auto\" src=\"tianshou-logo.png\"\u003e\u003c/a\u003e\n\u003c/div\u003e\n\n---\n\nThis repo serves only as a link to Tianshou's benchmark of MuJoCo environments. The latest benchmark is maintained under [thu-ml/tianshou](https://github.com/thu-ml/tianshou).
See the full benchmark [here](https://github.com/thu-ml/tianshou/tree/master/examples/mujoco).\n\n**Keywords**: deep reinforcement learning, pytorch, mujoco, benchmark, performance, Tianshou, baseline\n\n# Tianshou's MuJoCo Benchmark\n\nWe benchmarked Tianshou's algorithm implementations in 9 out of 13 environments from the MuJoCo Gym task suite.\n\nFor each supported algorithm and MuJoCo environment, we provide:\n- Default hyperparameters used for the benchmark and scripts to reproduce it;\n- A comparison of performance (or code-level details) with other open-source implementations or classic papers;\n- Graphs and raw data that can be used for research purposes;\n- Log details obtained during training;\n- Pretrained agents;\n- Some hints on how to tune the algorithm.\n\nSupported algorithms are listed below:\n- [Deep Deterministic Policy Gradient (DDPG)](https://arxiv.org/pdf/1509.02971.pdf), [commit id](https://github.com/thu-ml/tianshou/tree/e605bdea942b408126ef4fbc740359773259c9ec)\n- [Twin Delayed DDPG (TD3)](https://arxiv.org/pdf/1802.09477.pdf), [commit id](https://github.com/thu-ml/tianshou/tree/e605bdea942b408126ef4fbc740359773259c9ec)\n- [Soft Actor-Critic (SAC)](https://arxiv.org/pdf/1812.05905.pdf), [commit id](https://github.com/thu-ml/tianshou/tree/e605bdea942b408126ef4fbc740359773259c9ec)\n- [REINFORCE algorithm](https://papers.nips.cc/paper/1999/file/464d828b85b0bed98e80ade0a5c43b0f-Paper.pdf), [commit id](https://github.com/thu-ml/tianshou/tree/e27b5a26f330de446fe15388bf81c3777f024fb9)\n- [Natural Policy Gradient (NPG)](https://proceedings.neurips.cc/paper/2001/file/4b86abe48d358ecf194c56c69108433e-Paper.pdf), [commit id](https://github.com/thu-ml/tianshou/tree/844d7703c313009c4c364edb4018c91de93439ca)\n- [Advantage Actor-Critic (A2C)](https://openai.com/blog/baselines-acktr-a2c/), [commit id](https://github.com/thu-ml/tianshou/tree/1730a9008ad6bb67cac3b21347bed33b532b17bc)\n- [Proximal Policy Optimization (PPO)](https://arxiv.org/pdf/1707.06347.pdf), [commit id](https://github.com/thu-ml/tianshou/tree/6426a39796db052bafb7cabe85c764db20a722b0)\n- [Trust Region Policy Optimization (TRPO)](https://arxiv.org/pdf/1502.05477.pdf), [commit id](https://github.com/thu-ml/tianshou/tree/a57503c0aa6d40a0c939b37f4ae3c3a83bb2e126)\n- [Actor-Critic using Kronecker-Factored Trust Region (ACKTR)](https://arxiv.org/pdf/1708.05144.pdf), [commit id](https://github.com/thu-ml/tianshou/tree/844d7703c313009c4c364edb4018c91de93439ca)\n\n## Example benchmark\n### SAC\n\n|      Environment       |      Tianshou      | [SpinningUp (Pytorch)](https://spinningup.openai.com/en/latest/spinningup/bench.html) | [SAC paper](https://arxiv.org/abs/1801.01290) |\n| :--------------------: | :----------------: | :-------------------: | :---------: |\n|          Ant           |  **5850.2±475.7**  |         ~3980         |    ~3720    |\n|      HalfCheetah       | **12138.8±1049.3** |        ~11520         |   ~10400    |\n|         Hopper         |  **3542.2±51.5**   |         ~3150         |    ~3370    |\n|        Walker2d        |  **5007.0±251.5**  |         ~4250         |    ~3740    |\n|        Swimmer         |    **44.4±0.5**    |         ~41.7         |     N/A     |\n|        Humanoid        |  **5488.5±81.2**   |          N/A          |    ~5200    |\n|        Reacher         |    **-2.6±0.2**    |          N/A          |     N/A     |\n|    InvertedPendulum    |   **1000.0±0.0**   |          N/A          |     N/A     |\n| InvertedDoublePendulum |   **9359.5±0.4**   |          N/A          |     N/A     |\n\n\u003cimg src=\"./example_graph.png\" width=\"500\" height=\"450\"\u003e\n\n## Other resources\n- [SpinningUp Benchmark](https://spinningup.openai.com/en/latest/spinningup/bench.html)\n- [OpenAI Baselines Benchmark](https://htmlpreview.github.com/?https://github.com/openai/baselines/blob/master/benchmarks_mujoco1M.htm)\n- TODO and related discussions: [1](https://github.com/thu-ml/tianshou/issues/274), [2](https://github.com/thu-ml/tianshou/issues/307)\n\n","funding_links":[],"categories":[],"sub_categories":[],"project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fchendrag%2Fmujoco-benchmark","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fchendrag%2Fmujoco-benchmark","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fchendrag%2Fmujoco-benchmark/lists"}