{"id":28616025,"url":"https://github.com/agi-brain/xuance","last_synced_at":"2026-05-27T07:02:16.587Z","repository":{"id":167866419,"uuid":"643495755","full_name":"agi-brain/xuance","owner":"agi-brain","description":"XuanCe: A Comprehensive and Unified Deep Reinforcement Learning Library","archived":false,"fork":false,"pushed_at":"2026-05-26T16:48:43.000Z","size":494350,"stargazers_count":1065,"open_issues_count":31,"forks_count":156,"subscribers_count":16,"default_branch":"master","last_synced_at":"2026-05-26T18:25:28.571Z","etag":null,"topics":["a2c","atari","ddpg","dqn","google-research-football","maddpg","magent","mappo","mindspore","mpe","mujoco","multi-agent-reinforcement-learning","ppo","pytorch","qmix","reinforcement-learning","reinforcement-learning-library","starcraft2","tensorflow2"],"latest_commit_sha":null,"homepage":"https://xuance.org","language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"mit","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/agi-brain.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":"CONTRIBUTING.md","funding":null,"license":"LICENSE.txt","code_of_conduct":"CODE_OF_CONDUCT.md","threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null,"zenodo":null,"notice":null,"maintainers":null,"copyright":null,"agents":null,"dco":null,"cla":null}},"created_at":"2023-05-21T10:55:10.000Z","updated_at":"2026-05-26T16:48:47.000Z","dependencies_parsed_at":null,"dependency_job_id":"c3d7f373-c41c-445e-adeb-e00ac9437d3f","html_url":"https://github.com/agi-brain/xuance","commit_stats":null,"previous_names":["wenzhangliu/xuanpolicy","agi-brain/xuanpolicy","agi-brain/xuance"],"tags_count":48,"template":false,"template_full_name":null,"purl":"pkg:github/agi-brain/xuance","repository_url":"https://rep
os.ecosyste.ms/api/v1/hosts/GitHub/repositories/agi-brain%2Fxuance","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/agi-brain%2Fxuance/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/agi-brain%2Fxuance/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/agi-brain%2Fxuance/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/agi-brain","download_url":"https://codeload.github.com/agi-brain/xuance/tar.gz/refs/heads/master","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/agi-brain%2Fxuance/sbom","scorecard":null,"host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":286080680,"owners_count":33554780,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2026-05-26T15:22:16.424Z","status":"online","status_checked_at":"2026-05-27T02:00:06.184Z","response_time":53,"last_error":null,"robots_txt_status":"success","robots_txt_updated_at":"2025-07-24T06:49:26.215Z","robots_txt_url":"https://github.com/robots.txt","online":true,"can_crawl_api":true,"host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["a2c","atari","ddpg","dqn","google-research-football","maddpg","magent","mappo","mindspore","mpe","mujoco","multi-agent-reinforcement-learning","ppo","pytorch","qmix","reinforcement-learning","reinforcement-learning-library","starcraft2","tensorflow2"],"created_at":"2025-06-12T03:00:45.504Z","updated_at":"2026-05-27T07:02:16.581Z","avatar_url":"https://github.com/agi-brain.png","language":"Python","funding_links":[],"categories":[],"sub_categories":[],"readme":"\u003cdiv align=\"center\"\u003e\n\u003cimg 
src=\"docs/source/_static/figures/logo_1.png\" width=\"400\" height=\"auto\" align=center /\u003e\n\u003c/div\u003e\n\n# XuanCe: A Comprehensive and Unified Deep Reinforcement Learning Library\n\n[![PyPI](https://img.shields.io/pypi/v/xuance)](https://pypi.org/project/xuance/)\n[![Documentation Status](https://readthedocs.org/projects/xuance/badge/?version=latest)](https://xuance.org)\n[![GitHub](https://img.shields.io/github/license/agi-brain/xuance)](https://github.com/agi-brain/xuance/blob/master/LICENSE.txt)\n[![Downloads](https://static.pepy.tech/badge/xuance)](https://pepy.tech/project/xuance)\n[![GitHub Repo stars](https://img.shields.io/github/stars/agi-brain/xuance?style=social)](https://github.com/agi-brain/xuance/stargazers)\n[![GitHub forks](https://img.shields.io/github/forks/agi-brain/xuance?style=social)](https://github.com/agi-brain/xuance/forks)\n[![GitHub watchers](https://img.shields.io/github/watchers/agi-brain/xuance?style=social)](https://github.com/agi-brain/xuance/watchers)\n\n[![PyTorch](https://img.shields.io/badge/PyTorch-%3E%3D1.13.0-red)](https://pytorch.org/get-started/locally/)\n[![TensorFlow](https://img.shields.io/badge/TensorFlow-%3E%3D2.6.0-orange)](https://www.tensorflow.org/install)\n[![MindSpore](https://img.shields.io/badge/MindSpore-%3E%3D1.10.1-blue)](https://www.mindspore.cn/install/en)\n[![gymnasium](https://img.shields.io/badge/gymnasium-%3E%3D0.28.1-blue)](https://www.gymlibrary.dev/)\n[![pettingzoo](https://img.shields.io/badge/PettingZoo-%3E%3D1.23.0-blue)](https://pettingzoo.farama.org/)\n![PyPI - Python Version](https://img.shields.io/pypi/pyversions/xuance)\n\n[![Benchmarks](https://img.shields.io/badge/Benchmarks-Results-blue)](https://github.com/agi-brain/xuance-benchmarks.git)\n\n**[Full Documentation](https://xuance.org)**\n| **[中文文档](https://cn.xuance.org)**\n| **[README_CN.md](README_CN.md)**\n\n**XuanCe** is an open-source ensemble of Deep Reinforcement Learning (DRL) algorithm implementations.\n\nWe call it 
**Xuan-Ce (玄策)** in Chinese.\n\"**Xuan (玄)**\" means incredible, like a magic box, and \"**Ce (策)**\" means policy.\n\nDRL algorithms are sensitive to hyperparameter tuning, vary in performance with different implementation tricks,\nand suffer from unstable training, so they can sometimes seem elusive and \"Xuan\".\nThis project provides thorough, high-quality, and easy-to-understand implementations of DRL algorithms,\nand we hope they shed some light on the magic of reinforcement learning.\n\nWe expect it to be compatible with multiple deep learning backends\n(**[PyTorch](https://pytorch.org/)**,\n**[TensorFlow](https://www.tensorflow.org/)**, and\n**[MindSpore](https://www.mindspore.cn/en)**),\nand hope it can really become a zoo full of DRL algorithms.\n\n**Paper link**: [**https://arxiv.org/pdf/2312.16248.pdf**](https://arxiv.org/pdf/2312.16248.pdf)\n\n## Table of Contents:\n\n- [**Features**](#features)\n- [**Algorithms**](#algorithms)\n- [**Environments**](#environments)\n- [**Installation**](#point_right-installation)\n- [**Quickly Start**](#point_right-quickly-start)\n- [**Community**](#community)\n- [**Citation**](#citations)\n\n## Features\n\n- :school_satchel: Highly modularized.\n- :thumbsup: Easy to [learn](https://xuance.org), easy to [install](https://xuance.org/documents/usage/installation.html), and easy to [use](https://xuance.org/documents/usage/basic_usage.html).\n- :twisted_rightwards_arrows: Flexible for model combination.\n- :tada: Abundant [algorithms](https://xuance.org/#list-of-algorithms) for various tasks.\n- :couple: Supports both DRL and MARL tasks.\n- :key: High compatibility for different users. 
(PyTorch, TensorFlow2, MindSpore, CPU, GPU, Linux, Windows, MacOS, etc.)\n- :zap: Fast running speed with parallel environments.\n- :computer: Distributed training with multiple GPUs.\n- 🎛️ Supports automatic hyperparameter tuning.\n- :chart_with_upwards_trend: Good visualization with [TensorBoard](https://www.tensorflow.org/tensorboard) or [wandb](https://wandb.ai/site).\n\n## Algorithms\n\n### :point_right: DRL\n\n- **DQN**: Deep Q Network [[Paper](https://www.nature.com/articles/nature14236)]\n- **Double DQN**: DQN with Double Q-learning [[Paper](https://ojs.aaai.org/index.php/AAAI/article/view/10295)]\n- **Dueling DQN**: DQN with Dueling Network [[Paper](http://proceedings.mlr.press/v48/wangf16.pdf)]\n- **PER**: DQN with Prioritized Experience Replay [[Paper](https://arxiv.org/pdf/1511.05952.pdf)]\n- **NoisyDQN**: DQN with Parameter Space Noise for Exploration [[Paper](https://arxiv.org/pdf/1706.01905.pdf)]\n- **DRQN**: Deep Recurrent Q-Network [[Paper](https://cdn.aaai.org/ocs/11673/11673-51288-1-PB.pdf)]\n- **QRDQN**: DQN with Quantile Regression [[Paper](https://ojs.aaai.org/index.php/AAAI/article/view/11791)]\n- **C51**: Distributional Reinforcement Learning [[Paper](http://proceedings.mlr.press/v70/bellemare17a/bellemare17a.pdf)]\n- **PG**: Vanilla Policy Gradient [[Paper](https://proceedings.neurips.cc/paper_files/paper/1999/file/464d828b85b0bed98e80ade0a5c43b0f-Paper.pdf)]\n- **NPG**: Natural Policy Gradient [[Paper](https://proceedings.neurips.cc/paper_files/paper/2001/file/4b86abe48d358ecf194c56c69108433e-Paper.pdf)]\n- **PPG**: Phasic Policy Gradient [[Paper](http://proceedings.mlr.press/v139/cobbe21a/cobbe21a.pdf)] [[Code](https://github.com/openai/phasic-policy-gradient)]\n- **A2C**: Advantage Actor Critic [[Paper](http://proceedings.mlr.press/v48/mniha16.pdf)] [[Code](https://github.com/openai/baselines/tree/master/baselines/a2c)]\n- **SAC**: Soft Actor-Critic [[Paper](http://proceedings.mlr.press/v80/haarnoja18b/haarnoja18b.pdf)] 
[[Code](http://github.com/haarnoja/sac)]\n- **SAC-Discrete**: Soft Actor-Critic for Discrete Actions [[Paper](https://arxiv.org/pdf/1910.07207.pdf)] [[Code](https://github.com/p-christ/Deep-Reinforcement-Learning-Algorithms-with-PyTorch)]\n- **PPO-Clip**: Proximal Policy Optimization with Clipped Objective [[Paper](https://arxiv.org/pdf/1707.06347.pdf)] [[Code](https://github.com/berkeleydeeprlcourse/homework/tree/master/hw4)]\n- **PPO-KL**: Proximal Policy Optimization with KL Divergence [[Paper](https://arxiv.org/pdf/1707.06347.pdf)] [[Code](https://github.com/berkeleydeeprlcourse/homework/tree/master/hw4)]\n- **DDPG**: Deep Deterministic Policy Gradient [[Paper](https://arxiv.org/pdf/1509.02971.pdf)] [[Code](https://github.com/openai/baselines/tree/master/baselines/ddpg)]\n- **TD3**: Twin Delayed Deep Deterministic Policy Gradient [[Paper](http://proceedings.mlr.press/v80/fujimoto18a/fujimoto18a.pdf)][[Code](https://github.com/sfujim/TD3)]\n- **P-DQN**: Parameterised Deep Q-Network [[Paper](https://arxiv.org/pdf/1810.06394.pdf)]\n- **MP-DQN**: Multi-pass Parameterised Deep Q-network [[Paper](https://arxiv.org/pdf/1905.04388.pdf)] [[Code](https://github.com/cycraig/MP-DQN)]\n- **SP-DQN**: Split Parameterised Deep Q-Network [[Paper](https://arxiv.org/pdf/1810.06394.pdf)]\n\n### :point_right: Model-Based Reinforcement Learning (MBRL)\n\n- **DreamerV2** [[Paper](https://openreview.net/pdf?id=0oabwyZbOu)] [[Code](https://github.com/danijar/dreamerv2.git)]\n- **DreamerV3** [[Paper](https://www.nature.com/articles/s41586-025-08744-2.pdf)] [[Code](https://github.com/danijar/dreamerv3.git)]\n- **HarmonyDream** [[Paper](https://proceedings.mlr.press/v235/ma24o.html)] [[Code](https://github.com/thuml/HarmonyDream.git)]\n\n### :point_right: Multi-Agent Reinforcement Learning (MARL)\n\n- **IQL**: Independent Q-learning [[Paper](https://hal.science/file/index/docid/720669/filename/Matignon2012independent.pdf)] [[Code](https://github.com/oxwhirl/pymarl)]\n- **VDN**: Value 
Decomposition Networks [[Paper](https://arxiv.org/pdf/1706.05296.pdf)] [[Code](https://github.com/oxwhirl/pymarl)]\n- **QMIX**: Q-mixing networks [[Paper](http://proceedings.mlr.press/v80/rashid18a/rashid18a.pdf)] [[Code](https://github.com/oxwhirl/pymarl)]\n- **WQMIX**: Weighted Q-mixing networks [[Paper](https://proceedings.neurips.cc/paper/2020/file/73a427badebe0e32caa2e1fc7530b7f3-Paper.pdf)] [[Code](https://github.com/oxwhirl/wqmix)]\n- **QTRAN**: Q-transformation [[Paper](http://proceedings.mlr.press/v97/son19a/son19a.pdf)] [[Code](https://github.com/Sonkyunghwan/QTRAN)]\n- **DCG**: Deep Coordination Graphs [[Paper](http://proceedings.mlr.press/v119/boehmer20a/boehmer20a.pdf)] [[Code](https://github.com/wendelinboehmer/dcg)]\n- **IDDPG**: Independent Deep Deterministic Policy Gradient [[Paper](https://proceedings.neurips.cc/paper/2017/file/68a9750337a418a86fe06c1991a1d64c-Paper.pdf)]\n- **MADDPG**: Multi-agent Deep Deterministic Policy Gradient [[Paper](https://proceedings.neurips.cc/paper/2017/file/68a9750337a418a86fe06c1991a1d64c-Paper.pdf)] [[Code](https://github.com/openai/maddpg)]\n- **IAC**: Independent Actor-Critic [[Paper](https://ojs.aaai.org/index.php/AAAI/article/view/11794)] [[Code](https://github.com/oxwhirl/pymarl)]\n- **COMA**: Counterfactual Multi-agent Policy Gradient [[Paper](https://ojs.aaai.org/index.php/AAAI/article/view/11794)] [[Code](https://github.com/oxwhirl/pymarl)]\n- **VDAC**: Value-Decomposition Actor-Critic [[Paper](https://ojs.aaai.org/index.php/AAAI/article/view/17353)] [[Code](https://github.com/hahayonghuming/VDACs.git)]\n- **IPPO**: Independent Proximal Policy Optimization [[Paper](https://proceedings.neurips.cc/paper_files/paper/2022/file/9c1535a02f0ce079433344e14d910597-Paper-Datasets_and_Benchmarks.pdf)] [[Code](https://github.com/marlbenchmark/on-policy)]\n- **MAPPO**: Multi-agent Proximal Policy Optimization 
[[Paper](https://proceedings.neurips.cc/paper_files/paper/2022/file/9c1535a02f0ce079433344e14d910597-Paper-Datasets_and_Benchmarks.pdf)] [[Code](https://github.com/marlbenchmark/on-policy)]\n- **MFQ**: Mean-Field Q-learning [[Paper](http://proceedings.mlr.press/v80/yang18d/yang18d.pdf)] [[Code](https://github.com/mlii/mfrl)]\n- **MFAC**: Mean-Field Actor-Critic [[Paper](http://proceedings.mlr.press/v80/yang18d/yang18d.pdf)] [[Code](https://github.com/mlii/mfrl)]\n- **ISAC**: Independent Soft Actor-Critic\n- **MASAC**: Multi-agent Soft Actor-Critic [[Paper](https://arxiv.org/pdf/2104.06655.pdf)]\n- **MATD3**: Multi-agent Twin Delayed Deep Deterministic Policy Gradient [[Paper](https://arxiv.org/pdf/1910.01465.pdf)]\n- **IC3Net**: Individualized Controlled Continuous Communication Model [[Paper](https://arxiv.org/pdf/1812.09755)] [[Code](https://github.com/IC3Net/IC3Net.git)]\n- **CommNet**: Communication Neural Net [[Paper](https://proceedings.neurips.cc/paper_files/paper/2016/file/55b1927fdafef39c48e5b73b5d61ea60-Paper.pdf)][[Code](https://github.com/cts198859/deeprl_network.git)]\n\n### :point_right: Contrastive Reinforcement Learning (CRL)\n\n- **CURL**: Contrastive Unsupervised Representation Learning for Sample-Efficient Reinforcement Learning [[Paper](http://proceedings.mlr.press/v119/laskin20a/laskin20a.pdf)] [[Code](https://github.com/MishaLaskin/curl/blob/master/curl_sac.py)]\n- **SPR**: Data-Efficient Reinforcement Learning with Self-Predictive Representations [[Paper]](https://arxiv.org/abs/2007.05929) [[Code]](https://github.com/mila-iqia/spr)\n- **DrQ**: Image Augmentation Is All You Need: Regularizing Deep Reinforcement Learning from Pixels [[Paper]](https://openreview.net/forum?id=GY6-6sTvGaf) [[Code]](https://sites.google.com/view/data-regularized-q)\n\n## Environments\n\n### [Classic Control](https://xuance.org/documents/api/environments/single_agent_env/gym.html#classic-control)\n\n\u003ctable rules=\"none\" 
align=\"center\"\u003e\u003ctr\u003e\n\u003ctd\u003e \u003ccenter\u003e\n\u003cimg src=\"docs/source/_static/figures/classic_control/cart_pole.gif\" height=\"100\" /\u003e\u003cbr/\u003e\u003cfont color=\"AAAAAA\"\u003eCart Pole\u003c/font\u003e\n\u003c/center\u003e\u003c/td\u003e\n\u003ctd\u003e \u003ccenter\u003e\n\u003cimg src=\"docs/source/_static/figures/classic_control/pendulum.gif\" height=\"100\" /\u003e \u003cbr/\u003e \u003cfont color=\"AAAAAA\"\u003ePendulum\u003c/font\u003e\n\u003c/center\u003e \u003c/td\u003e\n\u003ctd\u003e \u003ccenter\u003e\n\u003cimg src=\"docs/source/_static/figures/classic_control/acrobot.gif\" height=\"100\" /\u003e \u003cbr/\u003e \u003cfont color=\"AAAAAA\"\u003eAcrobot\u003c/font\u003e\n\u003c/center\u003e \u003c/td\u003e\n\u003ctd\u003e \u003ccenter\u003e\n\u003cimg src=\"docs/source/_static/figures/classic_control/mountain_car.gif\" height=\"100\" /\u003e \u003cbr/\u003e \u003cfont color=\"AAAAAA\"\u003eMountainCar\u003c/font\u003e\n\u003c/center\u003e \u003c/td\u003e\n\u003c/tr\u003e\n\u003c/table\u003e\n\n### [Box2D](https://xuance.org/documents/api/environments/single_agent_env/gym.html#box2d)\n\n\u003ctable rules=\"none\" align=\"center\"\u003e\u003ctr\u003e\n\u003ctd\u003e \u003ccenter\u003e\n\u003cimg src=\"docs/source/_static/figures/box2d/bipedal_walker.gif\" height=\"100\" /\u003e\u003cbr/\u003e\u003cfont color=\"AAAAAA\"\u003eBipedal Walker\u003c/font\u003e\n\u003c/center\u003e\u003c/td\u003e\n\u003ctd\u003e \u003ccenter\u003e\n\u003cimg src=\"docs/source/_static/figures/box2d/car_racing.gif\" height=\"100\" /\u003e \u003cbr/\u003e \u003cfont color=\"AAAAAA\"\u003eCar Racing\u003c/font\u003e\n\u003c/center\u003e \u003c/td\u003e\n\u003ctd\u003e \u003ccenter\u003e\n\u003cimg src=\"docs/source/_static/figures/box2d/lunar_lander.gif\" height=\"100\" /\u003e \u003cbr/\u003e \u003cfont color=\"AAAAAA\"\u003eLunar Lander\u003c/font\u003e\n\u003c/center\u003e \u003c/td\u003e\n\u003c/tr\u003e\n\u003c/table\u003e\n\n### [MuJoCo 
Environments](https://xuance.org/documents/api/environments/single_agent_env/gym.html#mujoco)\n\n\u003ctable rules=\"none\" align=\"center\"\u003e\u003ctr\u003e\n\u003ctd\u003e \u003ccenter\u003e\n\u003cimg src=\"docs/source/_static/figures/mujoco/ant.gif\" height=\"100\" /\u003e\u003cbr/\u003e\u003cfont color=\"AAAAAA\"\u003eAnt\u003c/font\u003e\n\u003c/center\u003e\u003c/td\u003e\n\u003ctd\u003e \u003ccenter\u003e\n\u003cimg src=\"docs/source/_static/figures/mujoco/half_cheetah.gif\" height=\"100\" /\u003e \u003cbr/\u003e \u003cfont color=\"AAAAAA\"\u003eHalfCheetah\u003c/font\u003e\n\u003c/center\u003e \u003c/td\u003e\n\u003ctd\u003e \u003ccenter\u003e\n\u003cimg src=\"docs/source/_static/figures/mujoco/hopper.gif\" height=\"100\" /\u003e \u003cbr/\u003e \u003cfont color=\"AAAAAA\"\u003eHopper\u003c/font\u003e\n\u003c/center\u003e \u003c/td\u003e\n\u003ctd\u003e \u003ccenter\u003e\n\u003cimg src=\"docs/source/_static/figures/mujoco/humanoid_standup.gif\" height=\"100\" /\u003e \u003cbr/\u003e \u003cfont color=\"AAAAAA\"\u003eHumanoidStandup\u003c/font\u003e\n\u003c/center\u003e \u003c/td\u003e\n\u003ctd\u003e \u003ccenter\u003e\n\u003cimg src=\"docs/source/_static/figures/mujoco/humanoid.gif\" height=\"100\" /\u003e \u003cbr/\u003e \u003cfont color=\"AAAAAA\"\u003eHumanoid\u003c/font\u003e\n\u003c/center\u003e \u003c/td\u003e\n\u003ctd\u003e \u003ccenter\u003e\n\u003cimg src=\"docs/source/_static/figures/mujoco/inverted_pendulum.gif\" height=\"100\" /\u003e \u003cbr/\u003e \u003cfont color=\"AAAAAA\"\u003eInvertedPendulum\u003c/font\u003e\n\u003c/center\u003e \u003c/td\u003e\n\u003ctd\u003e \u003ccenter\u003e\n\u003cbr/\u003e \u003cfont color=\"AAAAAA\"\u003e...\u003c/font\u003e\n\u003c/center\u003e \u003c/td\u003e\n\u003c/tr\u003e\n\u003c/table\u003e\n\n### [Atari Environments](https://xuance.org/documents/api/environments/single_agent_env/gym.html#atari)\n\n\u003ctable rules=\"none\" align=\"center\"\u003e\u003ctr\u003e\n\u003ctd\u003e \u003ccenter\u003e\n\u003cimg 
src=\"docs/source/_static/figures/atari/adventure.gif\" height=\"100\" /\u003e \u003cbr/\u003e \u003cfont color=\"AAAAAA\"\u003eAdventure\u003c/font\u003e\n\u003c/center\u003e \u003c/td\u003e\n\u003ctd\u003e \u003ccenter\u003e\n\u003cimg src=\"docs/source/_static/figures/atari/air_raid.gif\" height=\"100\" /\u003e \u003cbr/\u003e \u003cfont color=\"AAAAAA\"\u003eAir Raid\u003c/font\u003e\n\u003c/center\u003e \u003c/td\u003e\n\u003ctd\u003e \u003ccenter\u003e\n\u003cimg src=\"docs/source/_static/figures/atari/alien.gif\" height=\"100\" /\u003e \u003cbr/\u003e \u003cfont color=\"AAAAAA\"\u003eAlien\u003c/font\u003e\n\u003c/center\u003e \u003c/td\u003e\n\u003ctd\u003e \u003ccenter\u003e\n\u003cimg src=\"docs/source/_static/figures/atari/amidar.gif\" height=\"100\" /\u003e\u003cbr/\u003e\u003cfont color=\"AAAAAA\"\u003eAmidar\u003c/font\u003e\n\u003c/center\u003e\u003c/td\u003e\n\u003ctd\u003e \u003ccenter\u003e\n\u003cimg src=\"docs/source/_static/figures/atari/assault.gif\" height=\"100\" /\u003e \u003cbr/\u003e \u003cfont color=\"AAAAAA\"\u003eAssault\u003c/font\u003e\n\u003c/center\u003e \u003c/td\u003e\n\u003ctd\u003e \u003ccenter\u003e\n\u003cimg src=\"docs/source/_static/figures/atari/asterix.gif\" height=\"100\" /\u003e \u003cbr/\u003e \u003cfont color=\"AAAAAA\"\u003eAsterix\u003c/font\u003e\n\u003c/center\u003e \u003c/td\u003e\n\u003ctd\u003e \u003ccenter\u003e\n\u003cimg src=\"docs/source/_static/figures/atari/asteroids.gif\" height=\"100\" /\u003e \u003cbr/\u003e \u003cfont color=\"AAAAAA\"\u003eAsteroids\u003c/font\u003e\n\u003c/center\u003e \u003c/td\u003e\n\u003ctd\u003e \u003ccenter\u003e\n\u003cbr/\u003e \u003cfont color=\"AAAAAA\"\u003e...\u003c/font\u003e\n\u003c/center\u003e \u003c/td\u003e\n\u003c/tr\u003e\n\u003c/table\u003e\n\n### [Minigrid Environments](https://xuance.org/documents/api/environments/single_agent_env/minigrid.html)\n\n\u003ctable rules=\"none\" align=\"center\"\u003e\u003ctr\u003e\n\u003ctd\u003e \u003ccenter\u003e\n\u003cimg 
src=\"docs/source/_static/figures/minigrid/GoToDoorEnv.gif\" height=\"100\" /\u003e\u003cbr/\u003e\u003cfont color=\"AAAAAA\"\u003eGoToDoorEnv\u003c/font\u003e\n\u003c/center\u003e\u003c/td\u003e\n\u003ctd\u003e \u003ccenter\u003e\n\u003cimg src=\"docs/source/_static/figures/minigrid/LockedRoomEnv.gif\" height=\"100\" /\u003e \u003cbr/\u003e \u003cfont color=\"AAAAAA\"\u003eLockedRoomEnv\u003c/font\u003e\n\u003c/center\u003e \u003c/td\u003e\n\u003ctd\u003e \u003ccenter\u003e\n\u003cimg src=\"docs/source/_static/figures/minigrid/MemoryEnv.gif\" height=\"100\" /\u003e \u003cbr/\u003e \u003cfont color=\"AAAAAA\"\u003eMemoryEnv\u003c/font\u003e\n\u003c/center\u003e \u003c/td\u003e\n\u003ctd\u003e \u003ccenter\u003e\n\u003cimg src=\"docs/source/_static/figures/minigrid/PlaygroundEnv.gif\" height=\"100\" /\u003e \u003cbr/\u003e \u003cfont color=\"AAAAAA\"\u003ePlaygroundEnv\u003c/font\u003e\n\u003c/center\u003e \u003c/td\u003e\n\u003ctd\u003e \u003ccenter\u003e\n\u003cbr/\u003e \u003cfont color=\"AAAAAA\"\u003e...\u003c/font\u003e\n\u003c/center\u003e \u003c/td\u003e\n\u003c/tr\u003e\n\u003c/table\u003e\n\n### [Drones Environments](https://xuance.org/documents/api/environments/multi_agent_env/drones.html)\n\n\u003ctable rules=\"none\" align=\"center\"\u003e\u003ctr\u003e\n\u003ctd\u003e \u003ccenter\u003e\n\u003cimg src=\"docs/source/_static/figures/drones/helix.gif\" height=\"100\" /\u003e\u003cbr/\u003e\u003cfont color=\"AAAAAA\"\u003eHelix\u003c/font\u003e\n\u003c/center\u003e\u003c/td\u003e\n\u003ctd\u003e \u003ccenter\u003e\n\u003cimg src=\"docs/source/_static/figures/drones/rl.gif\" height=\"100\" /\u003e \u003cbr/\u003e \u003cfont color=\"AAAAAA\"\u003eSingle-Agent Hover\u003c/font\u003e\n\u003c/center\u003e \u003c/td\u003e\n\u003ctd\u003e \u003ccenter\u003e\n\u003cimg src=\"docs/source/_static/figures/drones/marl.gif\" height=\"100\" /\u003e \u003cbr/\u003e \u003cfont color=\"AAAAAA\"\u003eMulti-Agent Hover\u003c/font\u003e\n\u003c/center\u003e \u003c/td\u003e\n\u003ctd\u003e 
\u003ccenter\u003e\n\u003cbr/\u003e \u003cfont color=\"AAAAAA\"\u003e...\u003c/font\u003e\n\u003c/center\u003e \u003c/td\u003e\n\u003c/tr\u003e\n\u003c/table\u003e\n\n### [MetaDrive](https://xuance.org/documents/api/environments/single_agent_env/metadrive.html)\n\n\u003cdiv align=\"center\"\u003e\n\u003cimg src=\"docs/source/_static/figures/metadrive/metadeive_teaser_1.gif\" width=\"auto\" height=\"120\" align=center /\u003e\n\u003c/div\u003e\n\n### [MPE Environments](https://xuance.org/documents/api/environments/multi_agent_env/mpe.html)\n\n\u003ctable rules=\"none\" align=\"center\"\u003e\u003ctr\u003e\n\u003ctd\u003e \u003ccenter\u003e\n\u003cimg src=\"docs/source/_static/figures/mpe/mpe_simple_push.gif\" height=\"100\" /\u003e\u003cbr/\u003e\u003cfont color=\"AAAAAA\"\u003eSimple Push\u003c/font\u003e\n\u003c/center\u003e\u003c/td\u003e\n\u003ctd\u003e \u003ccenter\u003e\n\u003cimg src=\"docs/source/_static/figures/mpe/mpe_simple_reference.gif\" height=\"100\" /\u003e \u003cbr/\u003e \u003cfont color=\"AAAAAA\"\u003eSimple Reference\u003c/font\u003e\n\u003c/center\u003e \u003c/td\u003e\n\u003ctd\u003e \u003ccenter\u003e\n\u003cimg src=\"docs/source/_static/figures/mpe/mpe_simple_spread.gif\" height=\"100\" /\u003e \u003cbr/\u003e \u003cfont color=\"AAAAAA\"\u003eSimple Spread\u003c/font\u003e\n\u003c/center\u003e \u003c/td\u003e\n\u003ctd\u003e \u003ccenter\u003e\n\u003cimg src=\"docs/source/_static/figures/mpe/mpe_simple_adversary.gif\" height=\"100\" /\u003e \u003cbr/\u003e \u003cfont color=\"AAAAAA\"\u003eSimple Adversary\u003c/font\u003e\n\u003c/center\u003e \u003c/td\u003e\n\u003ctd\u003e \u003ccenter\u003e\n\u003cbr/\u003e \u003cfont color=\"AAAAAA\"\u003e...\u003c/font\u003e\n\u003c/center\u003e \u003c/td\u003e\n\u003c/tr\u003e\n\u003c/table\u003e\n\n### [Robotic Warehouse](https://xuance.org/documents/api/environments/multi_agent_env/robotic_warehouse.html)\n\n\u003ctable rules=\"none\" align=\"center\"\u003e\u003ctr\u003e\n\u003ctd\u003e 
\u003ccenter\u003e\n\u003cimg src=\"docs/source/_static/figures/rware/rware.gif\" height=\"100\" /\u003e\u003cbr/\u003e\u003cfont color=\"AAAAAA\"\u003eExample 1\u003c/font\u003e\n\u003c/center\u003e\u003c/td\u003e\n\u003ctd\u003e \u003ccenter\u003e\n\u003cimg src=\"docs/source/_static/figures/rware/collision1.gif\" height=\"100\" /\u003e \u003cbr/\u003e \u003cfont color=\"AAAAAA\"\u003eExample 2\u003c/font\u003e\n\u003c/center\u003e \u003c/td\u003e\n\u003ctd\u003e \u003ccenter\u003e\n\u003cimg src=\"docs/source/_static/figures/rware/collision2.gif\" height=\"100\" /\u003e \u003cbr/\u003e \u003cfont color=\"AAAAAA\"\u003eExample 3\u003c/font\u003e\n\u003c/center\u003e \u003c/td\u003e\n\u003ctd\u003e \u003ccenter\u003e\n\u003cimg src=\"docs/source/_static/figures/rware/collision3.gif\" height=\"100\" /\u003e \u003cbr/\u003e \u003cfont color=\"AAAAAA\"\u003eExample 4\u003c/font\u003e\n\u003c/center\u003e \u003c/td\u003e\n\u003ctd\u003e \u003ccenter\u003e\n\u003cbr/\u003e \u003cfont color=\"AAAAAA\"\u003e...\u003c/font\u003e\n\u003c/center\u003e \u003c/td\u003e\n\u003c/tr\u003e\n\u003c/table\u003e\n\n### [SMAC](https://xuance.org/documents/api/environments/multi_agent_env/smac.html)\n\n\u003cdiv align=\"center\"\u003e\n\u003cimg src=\"docs/source/_static/figures/smac/smac.png\" width=\"715\" height=\"auto\" align=center /\u003e\n\u003c/div\u003e\n\n### [Google Research Football](https://xuance.org/documents/api/environments/multi_agent_env/football.html)\n\n\u003cdiv align=\"center\"\u003e\n\u003cimg src=\"docs/source/_static/figures/football/gfootball.png\" width=\"720\" height=\"auto\" align=center /\u003e\n\u003c/div\u003e\n\n## :point_right: Installation\n\n:computer: XuanCe runs on Linux, Windows, MacOS, EulerOS, etc.\n\n**Step 1**: Set up a Python environment\n\nWe recommend installing [Anaconda](https://www.anaconda.com/download) to manage your Python environment.\n(You can also download a specific Anaconda installer from 
[**here**](https://repo.anaconda.com/archive/).)\n\nThen open a terminal and create/activate a new conda environment (Python \u003e= 3.8 is recommended):\n\n```bash\nconda create -n xuance_env python=3.8 \u0026\u0026 conda activate xuance_env\n```\n\n**Step 2**: Install XuanCe\n\n```bash\npip install xuance\n```\n\nThis command does not install the dependencies of the deep learning backends. To install **XuanCe** together with\na deep learning backend, run `pip install xuance[torch]` for [PyTorch](https://pytorch.org/get-started/locally/),\n`pip install xuance[tensorflow]` for [TensorFlow2](https://www.tensorflow.org/install),\n`pip install xuance[mindspore]` for [MindSpore](https://www.mindspore.cn/install/en),\nor `pip install xuance[all]` for all dependencies.\n\nNote: Some extra packages need to be installed manually for further usage.\nSee [**here**](https://xuance.org/documents/usage/installation.html) for more installation details.\n\n## :point_right: Quickly Start\n\n### Train a Model\n\n```python\nimport xuance\n\nrunner = xuance.get_runner(algo='ppo',\n                           env='classic_control',\n                           env_id='CartPole-v1')\nrunner.run(mode='train')\n```\n\n### Test the Model\n\n```python\nimport xuance\n\nrunner = xuance.get_runner(algo='ppo',\n                           env='classic_control',\n                           env_id='CartPole-v1')\nrunner.run(mode='test')\n```\n\n### Visualize the results\n\n#### Tensorboard\n\nYou can use tensorboard to visualize what happened in the training process. 
After training, the log files are\nautomatically generated in the directory \".results/\", and you should be able to see the training data after running a\ncommand like the one below (the path shown is an example; point `--logdir` at the log directory of your own run).\n\n```\n$ tensorboard --logdir ./logs/dqn/torch/CartPole-v0\n```\n\n\u003cdiv align=\"center\"\u003e\n\u003cimg src=\"docs/source/_static/figures/log/tensorboard.png\" width=\"700\" height=\"auto\" align=center /\u003e\n\u003c/div\u003e\n\n#### Weights \u0026 Biases (wandb)\n\nXuanCe also supports Weights \u0026 Biases (wandb) for visualizing the results of a run.\n\nHow to use wandb online? :arrow_right: [https://github.com/wandb/wandb.git/](https://github.com/wandb/wandb.git/)\n\nHow to use wandb offline? :arrow_right: [https://github.com/wandb/server.git/](https://github.com/wandb/server.git/)\n\n\u003cdiv align=\"center\"\u003e\n\u003cimg src=\"docs/source/_static/figures/log/wandb.png\" width=\"700\" height=\"auto\" align=center /\u003e\n\u003c/div\u003e\n\n\u003c!-- If everything goes well, you should get a display similar to the one below. 
\n\n![Tensorboard](docs/source/figures/debug.png) --\u003e\n\n## Benchmarks\n\nXuanCe provides an official benchmark pipeline for evaluating DRL and MARL algorithms.\n\nTo avoid increasing the size of the main repository,\n**official benchmark results (including evaluation curves, summary tables, and pretrained models)**\nare maintained in a separate repository:\n\n👉 **https://github.com/agi-brain/xuance-benchmarks**\n\nUsers can either:\n\n- Run benchmarks locally using the provided pipeline, or\n- Directly inspect and reuse the official benchmark results without rerunning experiments.\n\n## Community\n\n- GitHub issues: [https://github.com/agi-brain/xuance/issues](https://github.com/agi-brain/xuance/issues)\n- GitHub discussions: [https://github.com/orgs/agi-brain/discussions](https://github.com/orgs/agi-brain/discussions)\n- Discord invite link: [https://discord.gg/HJn2TBQS7y](https://discord.gg/HJn2TBQS7y)\n- Slack invite link: [https://join.slack.com/t/xuancerllib/](https://join.slack.com/t/xuancerllib/shared_invite/zt-2x2r98msi-iMX6mSVcgWwXYj95abcXIw)\n- QQ group numbers: 552432695, 153966755\n- WeChat account: \"玄策 RLlib\"\n\n(Note: You can also post your questions on [Stack Overflow](https://stackoverflow.com/).)\n\n\u003cdetails open\u003e\n\u003csummary\u003e(QR codes for the QQ groups and the WeChat official account)\u003c/summary\u003e\n\n\u003ctable rules=\"none\" align=\"center\"\u003e\u003ctr\u003e\n\u003ctd\u003e \u003ccenter\u003e\n\u003cimg src=\"docs/source/_static/figures/QQ_group_1.JPG\" width=\"150\" height=\"auto\" /\u003e\u003cbr/\u003e\u003cfont color=\"AAAAAA\"\u003eQQ group 1\u003c/font\u003e\n\u003c/center\u003e\u003c/td\u003e\n\u003ctd\u003e \u003ccenter\u003e\n\u003cimg src=\"docs/source/_static/figures/QQ_group_2.JPG\" width=\"150\" height=\"auto\" /\u003e\u003cbr/\u003e\u003cfont color=\"AAAAAA\"\u003eQQ group 2\u003c/font\u003e\n\u003c/center\u003e\u003c/td\u003e\n\u003ctd\u003e \u003ccenter\u003e\n\u003cimg 
src=\"docs/source/_static/figures/Official_Account_Wechat.JPG\" width=\"150\" height=\"auto\" /\u003e \u003cbr/\u003e \u003cfont color=\"AAAAAA\"\u003eOfficial account (WeChat)\u003c/font\u003e\n\u003c/center\u003e \u003c/td\u003e\n\u003c/tr\u003e\n\u003c/table\u003e\n\n\u003c/details\u003e\n\n## Citations\n\nIf you use XuanCe in your research or development, please cite the paper:\n\n```\n@article{liu2023xuance,\n  title={XuanCe: A Comprehensive and Unified Deep Reinforcement Learning Library},\n  author={Liu, Wenzhang and Cai, Wenzhe and Jiang, Kun and Cheng, Guangran and Wang, Yuanda and Wang, Jiawei and Cao, Jingyu and Xu, Lele and Mu, Chaoxu and Sun, Changyin},\n  journal={arXiv preprint arXiv:2312.16248},\n  year={2023}\n}\n```\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fagi-brain%2Fxuance","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fagi-brain%2Fxuance","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fagi-brain%2Fxuance/lists"}