{"id":13958523,"url":"https://github.com/sjtu-marl/malib","last_synced_at":"2025-07-21T00:31:06.442Z","repository":{"id":40336931,"uuid":"365077762","full_name":"sjtu-marl/malib","owner":"sjtu-marl","description":"A parallel framework for population-based multi-agent reinforcement learning.","archived":false,"fork":false,"pushed_at":"2023-12-14T09:46:35.000Z","size":9621,"stargazers_count":525,"open_issues_count":6,"forks_count":63,"subscribers_count":9,"default_branch":"main","last_synced_at":"2025-04-21T09:53:09.108Z","etag":null,"topics":["distributed","games","multiagent","parallel","python","ray","reinforcement-learning"],"latest_commit_sha":null,"homepage":"https://malib.io","language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"mit","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/sjtu-marl.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":"CONTRIBUTING.md","funding":null,"license":"LICENSE","code_of_conduct":"CODE_OF_CONDUCT.md","threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null}},"created_at":"2021-05-07T01:08:37.000Z","updated_at":"2025-04-18T09:04:53.000Z","dependencies_parsed_at":"2023-02-18T13:46:31.244Z","dependency_job_id":"2c881b44-9377-4837-bd6b-1d80a6c17868","html_url":"https://github.com/sjtu-marl/malib","commit_stats":null,"previous_names":[],"tags_count":2,"template":false,"template_full_name":null,"purl":"pkg:github/sjtu-marl/malib","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/sjtu-marl%2Fmalib","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/sjtu-marl%2Fmalib/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/sjtu-marl%2Fmalib/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/sjtu-marl%2Fmalib/manifests"
,"owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/sjtu-marl","download_url":"https://codeload.github.com/sjtu-marl/malib/tar.gz/refs/heads/main","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/sjtu-marl%2Fmalib/sbom","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":266221252,"owners_count":23894965,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["distributed","games","multiagent","parallel","python","ray","reinforcement-learning"],"created_at":"2024-08-08T13:01:41.870Z","updated_at":"2025-07-21T00:31:05.726Z","avatar_url":"https://github.com/sjtu-marl.png","language":"Python","funding_links":[],"categories":["时间序列","Industry Strength RL"],"sub_categories":["网络服务_其他"],"readme":"\n\u003cdiv align=center\u003e\u003cimg src=\"docs/imgs/logo.svg\" width=\"35%\"\u003e\u003c/div\u003e\n\n\n# MALib: A parallel framework for population-based reinforcement learning\n\n[![GitHub license](https://img.shields.io/badge/license-MIT-blue.svg)](https://github.com/sjtu-marl/malib/blob/main/LICENSE)\n[![Documentation Status](https://readthedocs.org/projects/malib/badge/?version=latest)](https://malib.readthedocs.io/en/latest/?badge=latest)\n[![Build Status](https://app.travis-ci.com/sjtu-marl/malib.svg?branch=main)](https://app.travis-ci.com/sjtu-marl/malib.svg?branch=main)\n[![codecov](https://codecov.io/gh/sjtu-marl/malib/branch/main/graph/badge.svg?token=CJX14B2AJG)](https://codecov.io/gh/sjtu-marl/malib)\n\nMALib is a parallel framework of population-based learning nested with reinforcement learning methods, 
such as Policy Space Response Oracles, Self-Play, and Neural Fictitious Self-Play. MALib provides higher-level abstractions of MARL training paradigms, which enable efficient code reuse and flexible deployment on different distributed computing paradigms.\n\n![architecture](docs/imgs/architecture3.png)\n\n## Installation\n\nInstalling MALib is straightforward. We've tested MALib on Python 3.8 and above. This guide is based on Ubuntu 18.04 and above (currently, MALib runs only on Linux). We strongly recommend using [conda](https://docs.conda.io/en/latest/miniconda.html) to manage your dependencies and avoid version conflicts. The following example creates a conda environment based on Python 3.8.\n\n\n```bash\nconda create -n malib python==3.8 -y\nconda activate malib\n\n# install dependencies\n./install.sh\n```\n\n## Environments\n\nMALib integrates many popular reinforcement learning environments. Some of them are listed below.\n\n- [x] [OpenSpiel](https://github.com/deepmind/open_spiel): A framework for reinforcement learning in games that provides many environments for game theory research.\n- [x] [Gym](https://github.com/openai/gym): An open-source collection of environments for developing and comparing reinforcement learning algorithms.\n- [x] [Google Research Football](https://github.com/google-research/football): An RL environment based on the open-source game Gameplay Football.\n- [x] [SMAC](https://github.com/oxwhirl/smac): An environment for research in the field of collaborative multi-agent reinforcement learning (MARL) based on Blizzard's StarCraft II RTS game.\n- [x] [PettingZoo](https://github.com/Farama-Foundation/PettingZoo): A Python library for conducting research in multi-agent reinforcement learning, akin to a multi-agent version of [Gymnasium](https://github.com/Farama-Foundation/Gymnasium).\n- [ ] [DexterousHands](https://github.com/PKU-MARL/DexterousHands): An environment collection of bimanual dexterous manipulation 
tasks.\n\nSee [malib/envs](/malib/envs/) for more details. In addition, users can customize environments with MALib's environment interfaces. Please refer to our documentation.\n\n## Algorithms and Scenarios\n\nMALib integrates population-based reinforcement learning and popular deep reinforcement learning algorithms. See the algorithms table [here](/algorithms.md). The supported learning scenarios are listed as follows:\n\n- [x] Single-stream PSRO scenario: for single-stream population-based reinforcement learning algorithms, combined with empirical game-theoretic analysis methods. See [scenarios/psro_scenario.py](/malib/scenarios/psro_scenario.py)\n- [ ] Multi-stream PSRO scenario: for multi-stream population-based reinforcement learning algorithms, combined with empirical game-theoretic analysis methods. See [scenarios/p2sro_scenario.py](/malib/scenarios/p2sro_scenario.py)\n- [x] Multi-agent Reinforcement Learning scenario: for multi-/single-agent reinforcement learning, with distributed techniques. 
See [scenarios/marl_scenario.py](/malib/scenarios/marl_scenario.py)\n\n## Quick Start\n\nBefore running the examples, make sure the Python path is exported:\n\n```bash\ncd malib\n\n# if you installed malib with `pip install -e .`, you can skip the path export\nexport PYTHONPATH=./\n```\n\n- Run the PSRO example to start training on the Kuhn Poker game: `python examples/run_psro.py`\n- Run the RL example to start training on the CartPole-v1 game: `python examples/run_gym.py`\n\n## Documentation\n\nSee the online documentation at [MALib Docs](https://malib.readthedocs.io/), or build a local version from the source files:\n\n```bash\npip install -e .[dev]\nmake docs-compile\n```\n\nThen start a web server to view the docs:\n\n```bash\n# run the following command; the server will start at http://localhost:8000\nmake docs-view\n```\n\n## Contributing\n\nRead [CONTRIBUTING.md](/CONTRIBUTING.md) for more details.\n\n## Citing MALib\n\n\nIf you use MALib in your work, please cite the accompanying [paper](https://www.jmlr.org/papers/v24/22-0169.html).\n\n```bibtex\n@article{JMLR:v24:22-0169,\n  author  = {Ming Zhou and Ziyu Wan and Hanjing Wang and Muning Wen and Runzhe Wu and Ying Wen and Yaodong Yang and Yong Yu and Jun Wang and Weinan Zhang},\n  title   = {MALib: A Parallel Framework for Population-based Multi-agent Reinforcement Learning},\n  journal = {Journal of Machine Learning Research},\n  year    = {2023},\n  volume  = {24},\n  number  = {150},\n  pages   = {1--12},\n  url     = {http://jmlr.org/papers/v24/22-0169.html}\n}\n```\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fsjtu-marl%2Fmalib","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fsjtu-marl%2Fmalib","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fsjtu-marl%2Fmalib/lists"}