{"id":18478874,"url":"https://github.com/facebookresearch/benchmarl","last_synced_at":"2025-05-15T11:04:10.333Z","repository":{"id":199901978,"uuid":"681603891","full_name":"facebookresearch/BenchMARL","owner":"facebookresearch","description":"A collection of MARL benchmarks based on TorchRL","archived":false,"fork":false,"pushed_at":"2025-02-10T17:04:38.000Z","size":485,"stargazers_count":339,"open_issues_count":10,"forks_count":53,"subscribers_count":13,"default_branch":"main","last_synced_at":"2025-02-11T09:17:47.461Z","etag":null,"topics":["benchmark","machine-learning","marl","multi-agent","multi-agent-reinforcement-learning","pytorch","reinforcement-learning","rl","robotics","torch"],"latest_commit_sha":null,"homepage":"https://benchmarl.readthedocs.io/","language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"mit","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/facebookresearch.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":"CONTRIBUTING.md","funding":null,"license":"LICENSE","code_of_conduct":"CODE_OF_CONDUCT.md","threat_model":null,"audit":null,"citation":"CITATION.cff","codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null}},"created_at":"2023-08-22T11:19:20.000Z","updated_at":"2025-02-10T17:04:39.000Z","dependencies_parsed_at":"2023-11-25T17:31:26.323Z","dependency_job_id":"579bd3d3-7842-48f4-bf8c-af94c2d788ad","html_url":"https://github.com/facebookresearch/BenchMARL","commit_stats":{"total_commits":124,"total_committers":4,"mean_commits":31.0,"dds":"0.024193548387096753","last_synced_commit":"e910a837c2cedf2531ead63b0719d39516149d79"},"previous_names":["facebookresearch/benchmarl"],"tags_count":12,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/facebookresearch%2FBenc
hMARL","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/facebookresearch%2FBenchMARL/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/facebookresearch%2FBenchMARL/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/facebookresearch%2FBenchMARL/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/facebookresearch","download_url":"https://codeload.github.com/facebookresearch/BenchMARL/tar.gz/refs/heads/main","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":248027407,"owners_count":21035594,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["benchmark","machine-learning","marl","multi-agent","multi-agent-reinforcement-learning","pytorch","reinforcement-learning","rl","robotics","torch"],"created_at":"2024-11-06T12:12:23.069Z","updated_at":"2025-04-09T11:05:40.003Z","avatar_url":"https://github.com/facebookresearch.png","language":"Python","funding_links":[],"categories":[],"sub_categories":[],"readme":"![BenchMARL](https://raw.githubusercontent.com/matteobettini/benchmarl_sphinx_theme/master/benchmarl_sphinx_theme/static/img/benchmarl.png?raw=true)\n\n\n# BenchMARL\n[![tests](https://github.com/facebookresearch/BenchMARL/actions/workflows/unit_tests.yml/badge.svg)](test)\n[![Documentation 
Status](https://readthedocs.org/projects/benchmarl/badge/?version=latest)](https://benchmarl.readthedocs.io/en/latest/?badge=latest)\n[![Python](https://img.shields.io/badge/python-3.8%20%7C%203.9%20%7C%203.10%20%7C%203.11-blue.svg)](https://www.python.org/downloads/)\n\u003ca href=\"https://pypi.org/project/benchmarl\"\u003e\u003cimg src=\"https://img.shields.io/pypi/v/benchmarl\" alt=\"pypi version\"\u003e\u003c/a\u003e\n[![Downloads](https://static.pepy.tech/personalized-badge/benchmarl?period=total\u0026units=international_system\u0026left_color=grey\u0026right_color=blue\u0026left_text=Downloads)](https://pepy.tech/project/benchmarl)\n[![Discord Shield](https://dcbadge.vercel.app/api/server/jEEWCn6T3p?style=flat)](https://discord.gg/jEEWCn6T3p)\n[![arXiv](https://img.shields.io/badge/arXiv-2312.01472-b31b1b.svg)](https://arxiv.org/abs/2312.01472)\n\n```bash\npython benchmarl/run.py algorithm=mappo task=vmas/balance\n```\n\n\n[![Examples](https://img.shields.io/badge/Examples-blue.svg)](examples) [![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/facebookresearch/BenchMARL/blob/main/notebooks/run.ipynb)\n[![Static Badge](https://img.shields.io/badge/Benchmarks-Wandb-yellow)](https://wandb.ai/matteobettini/benchmarl-public/reportlist)\n\n- Watch the [talk on multi-agent simulation and learning in BenchMARL and TorchRL](https://www.youtube.com/watch?v=1tOIMgJf_VQ).\n- Watch the [lecture on creating a custom scenario in VMAS and training it in BenchMARL](https://www.youtube.com/watch?v=mIb1uGeRJsg)\n\n### What is BenchMARL 🧐?\n\nBenchMARL is a Multi-Agent Reinforcement Learning (MARL) training library created to enable reproducibility\nand benchmarking across different MARL algorithms and environments.\nIts mission is to present a standardized interface that allows easy integration of new algorithms and environments to \nprovide a fair comparison with existing solutions.\nBenchMARL uses 
[TorchRL](https://github.com/pytorch/rl) as its backend, which grants it high performance \nand state-of-the-art implementations.\nIt also uses [hydra](https://hydra.cc/docs/intro/) for flexible and modular configuration,\nand its data reporting is compatible with [marl-eval](https://sites.google.com/view/marl-standard-protocol/home) \nfor standardised and statistically strong evaluations.\n\nBenchMARL's **core design tenets** are:\n* _Reproducibility through systematic grounding and standardization of configuration_ \n* _Standardised and statistically-strong plotting and reporting_\n* _Experiments that are independent of the algorithm, environment, and model choices_\n* _Breadth over the MARL ecosystem_\n* _Easy implementation of new algorithms, environments, and models_\n* _Leveraging the know-how and infrastructure of [TorchRL](https://github.com/pytorch/rl), without reinventing the wheel_\n\n### Why would I BenchMARL 🤔?\n\nWhy would you BenchMARL, I hear you ask. \nWell, you can BenchMARL to compare different algorithms, environments, and models, \nto check how your new research compares to existing work, or simply to approach \nthe domain and get a picture of the landscape.\n\n### Table of contents\n\n- [BenchMARL](#benchmarl)\n  * [How to use](#how-to-use)\n    + [Notebooks](#notebooks)\n    + [Install](#install)\n    + [Run](#run)\n  * [Concept](#concept)\n  * [Fine-tuned public benchmarks](#fine-tuned-public-benchmarks)\n  * [Reporting and plotting](#reporting-and-plotting)\n  * [Extending](#extending)\n  * [Configuring](#configuring)\n    + [Experiment](#experiment)\n    + [Algorithm](#algorithm)\n    + [Task](#task)\n    + [Model](#model)\n  * [Features](#features)\n    + [Logging](#logging)\n    + [Checkpointing](#checkpointing)\n    + [Callbacks](#callbacks)\n  * [Citing BenchMARL](#citing-benchmarl)\n\n\n## How to use\n\n### Notebooks\n- [![Open In 
Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/facebookresearch/BenchMARL/blob/main/notebooks/run.ipynb) \u0026ensp; **Running BenchMARL experiments**.\n- [![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/proroklab/VectorizedMultiAgentSimulator/blob/main/notebooks/Simulation_and_training_in_VMAS_and_BenchMARL.ipynb) \u0026ensp;  **Creating a VMAS scenario and training it in BenchMARL**.  We will create a scenario where multiple robots with different embodiments need to navigate to their goals while avoiding each other (as well as obstacles) and train it using MAPPO and MLP/GNN policies.\n\n\n### Install\n\n#### Install TorchRL\n\nYou can install TorchRL from PyPI.\n\n```bash\npip install torchrl\n```\nFor more details, or for installing nightly versions, see the\n[TorchRL installation guide](https://github.com/pytorch/rl#installation).\n\n#### Install BenchMARL\nYou can install it from PyPI:\n```bash\npip install benchmarl\n```\nOr clone it locally to access the configs and scripts:\n```bash\ngit clone https://github.com/facebookresearch/BenchMARL.git\npip install -e BenchMARL\n```\n#### Install environments\n\nAll environment dependencies are optional in BenchMARL and can be installed separately.\n\n##### VMAS\n\n```bash\npip install vmas\n```\n\n##### PettingZoo\n```bash\npip install \"pettingzoo[all]\"\n```\n\n##### MeltingPot\n```bash\npip install dm-meltingpot\n```\n\n##### MAgent2\n\n```bash\npip install git+https://github.com/Farama-Foundation/MAgent2\n```\n\n##### SMACv2\n\nFollow the instructions on the environment [repository](https://github.com/oxwhirl/smacv2).\n\n[Here](.github/unittest/install_smacv2.sh) is how we install it on Linux.\n\n### Run\n\nExperiments are launched with a [default configuration](benchmarl/conf) that \ncan be overridden in many ways. 
\nTo learn how to customize and override configurations,\nplease refer to the [configuring section](#configuring).\n\n#### Command line\n\nTo launch an experiment from the command line you can do\n\n```bash\npython benchmarl/run.py algorithm=mappo task=vmas/balance\n```\n[![Example](https://img.shields.io/badge/Example-blue.svg)](examples/running/run_experiment.sh)\n\n\nThanks to [hydra](https://hydra.cc/docs/intro/), you can run benchmarks as multi-runs like:\n```bash\npython benchmarl/run.py -m algorithm=mappo,qmix,masac task=vmas/balance,vmas/sampling seed=0,1\n```\n[![Example](https://img.shields.io/badge/Example-blue.svg)](examples/running/run_benchmark.sh)\n\nThe default implementation for hydra multi-runs is sequential, but [parallel](https://hydra.cc/docs/plugins/joblib_launcher/)\nand [slurm](https://hydra.cc/docs/plugins/submitit_launcher/) launchers are also available.\n\n#### Script\n\nYou can also load and launch your experiments from within a script\n\n```python\nfrom benchmarl.algorithms import MappoConfig\nfrom benchmarl.environments import VmasTask\nfrom benchmarl.experiment import Experiment, ExperimentConfig\nfrom benchmarl.models.mlp import MlpConfig\n\n# Every component is loaded from its default yaml configuration\nexperiment = Experiment(\n    task=VmasTask.BALANCE.get_from_yaml(),\n    algorithm_config=MappoConfig.get_from_yaml(),\n    model_config=MlpConfig.get_from_yaml(),\n    critic_model_config=MlpConfig.get_from_yaml(),\n    seed=0,\n    config=ExperimentConfig.get_from_yaml(),\n)\nexperiment.run()\n```\n[![Example](https://img.shields.io/badge/Example-blue.svg)](examples/running/run_experiment.py)\n\n\nYou can also run multiple experiments in a `Benchmark`.\n\n```python\nfrom benchmarl.algorithms import MappoConfig, MasacConfig, QmixConfig\nfrom benchmarl.benchmark import Benchmark\nfrom benchmarl.environments import VmasTask\nfrom benchmarl.experiment import ExperimentConfig\nfrom benchmarl.models.mlp import MlpConfig\n\n# One experiment is run for each (algorithm, task, seed) combination\nbenchmark = Benchmark(\n    algorithm_configs=[\n        MappoConfig.get_from_yaml(),\n        QmixConfig.get_from_yaml(),\n        MasacConfig.get_from_yaml(),\n    ],\n    tasks=[\n        VmasTask.BALANCE.get_from_yaml(),\n        VmasTask.SAMPLING.get_from_yaml(),\n    ],\n    seeds={0, 1},\n    experiment_config=ExperimentConfig.get_from_yaml(),\n    model_config=MlpConfig.get_from_yaml(),\n    
critic_model_config=MlpConfig.get_from_yaml(),\n)\nbenchmark.run_sequential()\n```\n[![Example](https://img.shields.io/badge/Example-blue.svg)](examples/running/run_benchmark.py)\n\n\n## Concept\n\nThe goal of BenchMARL is to bring different MARL environments and algorithms\nunder the same interfaces to enable fair and reproducible comparison and benchmarking.\nBenchMARL is a full-pipeline unified training library with the goal of enabling users to run\nany comparison they want across our algorithms and tasks in just one line of code.\nTo achieve this, BenchMARL interconnects components from [TorchRL](https://github.com/pytorch/rl), \nwhich provides an efficient and reliable backend.\n\nThe library has a [default configuration](benchmarl/conf) for each of its components.\nWhile parts of this configuration are supposed to be changed (for example experiment configurations),\nother parts (such as tasks) should not be changed to allow for reproducibility.\nTo aid in this, each version of BenchMARL is paired with a default configuration.\n\nLet's now introduce each component in the library.\n\n**Experiment**. An experiment is a training run in which an algorithm, a task, and a model are fixed.\nExperiments are configured by passing these values alongside a seed and the experiment hyperparameters.\nThe experiment [hyperparameters](benchmarl/conf/experiment/base_experiment.yaml) cover both \non-policy and off-policy algorithms, discrete and continuous actions, and probabilistic and deterministic policies\n(as they are agnostic of the algorithm or task used).\nAn experiment can be launched from the command line or from a script. \nSee the [run](#run) section for more information.\n\n**Benchmark**. 
In the library we call `benchmark` a collection of experiments that can vary in task, algorithm, or model.\nA benchmark shares the same experiment configuration across all of its experiments.\nBenchmarks allow comparing different MARL components in a standardized way.\nA benchmark can be launched from the command line or from a script. \nSee the [run](#run) section for more information.\n\n**Algorithms**. Algorithms are an ensemble of components (e.g., losses, replay buffer) which\ndetermine the training strategy. Here is a table with the currently implemented algorithms in BenchMARL.\n\n| Name                                                                                                                                        | On/Off policy | Actor-critic | Full-observability in critic | Action compatibility  | Probabilistic actor |   \n|---------------------------------------------------------------------------------------------------------------------------------------------|---------------|--------------|------------------------------|-----------------------|---------------------|\n| [MAPPO](https://arxiv.org/abs/2103.01955)                                                                                                   | On            | Yes          | Yes                          | Continuous + Discrete | Yes                 |   \n| [IPPO](https://arxiv.org/abs/2011.09533)                                                                                                    | On            | Yes          | No                           | Continuous + Discrete | Yes                 |  \n| [MADDPG](https://arxiv.org/abs/1706.02275)                                                                                                  | Off           | Yes          | Yes                          | Continuous            | No                  | \n| [IDDPG](benchmarl/algorithms/iddpg.py)                                                                                          
            | Off           | Yes          | No                           | Continuous            | No                  |   \n| [MASAC](benchmarl/algorithms/masac.py)                                                                                                      | Off           | Yes          | Yes                          | Continuous + Discrete | Yes                 |   \n| [ISAC](benchmarl/algorithms/isac.py)                                                                                                        | Off           | Yes          | No                           | Continuous + Discrete | Yes                 |   \n| [QMIX](https://arxiv.org/abs/1803.11485)                                                                                                    | Off           | No           | NA                           | Discrete              | No                  | \n| [VDN](https://arxiv.org/abs/1706.05296)                                                                                                     | Off           | No           | NA                           | Discrete              | No                  |  \n| [IQL](https://www.semanticscholar.org/paper/Multi-Agent-Reinforcement-Learning%3A-Independent-Tan/59de874c1e547399b695337bcff23070664fa66e) | Off           | No           | NA                           | Discrete              | No                  |  \n\n\n**Tasks**. 
Tasks are scenarios from a specific environment which constitute the MARL\nchallenge to solve.\nThey differ in many aspects. Here is a table of the environments currently in BenchMARL:\n\n| Environment                                                         | Tasks                                | Cooperation               | Global state | Reward function               | Action space          |    Vectorized    |\n|---------------------------------------------------------------------|--------------------------------------|---------------------------|--------------|-------------------------------|-----------------------|:----------------:|\n| [VMAS](https://github.com/proroklab/VectorizedMultiAgentSimulator)  | [27](benchmarl/conf/task/vmas)       | Cooperative + Competitive | No           | Shared + Independent + Global | Continuous + Discrete |       Yes        |    \n| [SMACv2](https://github.com/oxwhirl/smacv2)                         | [15](benchmarl/conf/task/smacv2)     | Cooperative               | Yes          | Global                        | Discrete              |        No        |\n| [MPE](https://github.com/openai/multiagent-particle-envs)           | [8](benchmarl/conf/task/pettingzoo)  | Cooperative + Competitive | Yes          | Shared + Independent          | Continuous + Discrete |        No        |\n| [SISL](https://github.com/sisl/MADRL)                               | [2](benchmarl/conf/task/pettingzoo)  | Cooperative               | No           | Shared                        | Continuous            |        No        |\n| [MeltingPot](https://github.com/google-deepmind/meltingpot)         | [49](benchmarl/conf/task/meltingpot) | Cooperative + Competitive | Yes          | Independent                   | Discrete              |        No        |\n| [MAgent2](https://github.com/Farama-Foundation/magent2)             | [1](benchmarl/conf/task/magent)      | Cooperative + Competitive | Yes          | Global in groups              | 
Discrete              |        No        |\n\n\n\u003e [!NOTE]  \n\u003e BenchMARL uses the [TorchRL MARL API](https://github.com/pytorch/rl/issues/1463) for grouping agents.\n\u003e In competitive environments like MPE, for example, teams will be in different groups. Each group has its own loss,\n\u003e models, buffers, and so on. Parameter sharing options refer to sharing within the group. See the example on [creating\n\u003e a custom algorithm](examples/extending/algorithm/algorithms/customalgorithm.py) for more info.\n\n**Models**. Models are neural networks used to process data. They can be used as actors (policies) or, \nwhen requested, as critics. We provide a set of base models (layers) and a SequenceModel to concatenate\ndifferent layers. All the models can be used with or without parameter sharing within an \nagent group. Here is a table of the models implemented in BenchMARL\n\n| Name                                     | Decentralized | Centralized with local inputs | Centralized with global input | \n|------------------------------------------|:-------------:|:-----------------------------:|:-----------------------------:|\n| [MLP](benchmarl/models/mlp.py)           |      Yes      |              Yes              |              Yes              |\n| [GRU](benchmarl/models/gru.py)           |      Yes      |              Yes              |              Yes              |\n| [LSTM](benchmarl/models/lstm.py)         |      Yes      |              Yes              |              Yes              |\n| [GNN](benchmarl/models/gnn.py)           |      Yes      |              Yes              |              No               |\n| [CNN](benchmarl/models/cnn.py)           |      Yes      |              Yes              |              Yes              |\n| [Deepsets](benchmarl/models/deepsets.py) |      Yes      |              Yes              |              Yes              |\n\n\n## Fine-tuned public benchmarks\n\u003e [!WARNING]  \n\u003e This section is a 
work in progress. We are constantly working on fine-tuning\n\u003e our experiments to enable our users to have access to state-of-the-art benchmarks.\n\u003e If you would like to collaborate in this effort, please reach out to us.\n\nIn the [fine_tuned](fine_tuned) folder we are collecting some tested hyperparameters for\nspecific environments to enable users to bootstrap their benchmarking.\nYou can just run the scripts in this folder to automatically use the proposed hyperparameters.\n\nWe will tune benchmarks for you and publish the config and benchmarking plots on\n[Wandb](https://wandb.ai/matteobettini/benchmarl-public/reportlist) publicly\n\nCurrently available ones are:\n\n- **VMAS**:  [![Conf](https://img.shields.io/badge/Conf-purple.svg)](fine_tuned/vmas/conf/config.yaml)  [![Static Badge](https://img.shields.io/badge/Benchmarks-Wandb-yellow)](https://api.wandb.ai/links/matteobettini/r5744vas)\n\nIn the following, we report a table of the results:\n\n| **\u003cp align=\"center\"\u003eEnvironment\u003c/p\u003e** | **\u003cp align=\"center\"\u003eSample efficiency curves (all tasks)\u003c/p\u003e**                                                                                                                        | **\u003cp align=\"center\"\u003ePerformance profile\u003c/p\u003e**                                                                                                                               | **\u003cp align=\"center\"\u003eAggregate scores\u003c/p\u003e**                                                                                                                        
|\n|---------------------------------------|---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------|-------------------------------------------------------------------------------------------------------------------------------------------------------------------|\n| VMAS                                  | \u003cimg src=\"https://raw.githubusercontent.com/matteobettini/benchmarl_sphinx_theme/master/benchmarl_sphinx_theme/static/img/benchmarks/vmas/environemnt_sample_efficiency_curves.png\"/\u003e | \u003cimg src=\"https://raw.githubusercontent.com/matteobettini/benchmarl_sphinx_theme/master/benchmarl_sphinx_theme/static/img/benchmarks/vmas/performance_profile_figure.png\"/\u003e | \u003cimg src=\"https://raw.githubusercontent.com/matteobettini/benchmarl_sphinx_theme/master/benchmarl_sphinx_theme/static/img/benchmarks/vmas/aggregate_scores.png\"/\u003e |\n\n## Reporting and plotting\n\nReporting and plotting is compatible with [marl-eval](https://github.com/instadeepai/marl-eval). \nIf `experiment.create_json=True` (this is the default in the [experiment config](benchmarl/conf/experiment/base_experiment.yaml))\na file named `{experiment_name}.json` will be created in the experiment output folder with the format of [marl-eval](https://github.com/instadeepai/marl-eval).\nYou can load and merge these files using the utils in [eval_results](benchmarl/eval_results.py) to create beautiful plots of \nyour benchmarks.  
No more struggling with matplotlib and LaTeX!\n\n[![Example](https://img.shields.io/badge/Example-blue.svg)](examples/plotting)\n\n![aggregate_scores](https://raw.githubusercontent.com/matteobettini/benchmarl_sphinx_theme/master/benchmarl_sphinx_theme/static/img/benchmarks/vmas/aggregate_scores.png)\n![sample_efficiency](https://raw.githubusercontent.com/matteobettini/benchmarl_sphinx_theme/master/benchmarl_sphinx_theme/static/img/benchmarks/vmas/environemnt_sample_efficiency_curves.png)\n![performance_profile](https://raw.githubusercontent.com/matteobettini/benchmarl_sphinx_theme/master/benchmarl_sphinx_theme/static/img/benchmarks/vmas/performance_profile_figure.png)\n\n\n## Extending\nOne of the core tenets of BenchMARL is allowing users to leverage the existing algorithm\nand task implementations to benchmark their newly proposed solution.\n\nFor this reason, we expose standard interfaces with simple abstract methods\nfor [algorithms](benchmarl/algorithms/common.py), [tasks](benchmarl/environments/common.py) and [models](benchmarl/models/common.py).\nTo introduce your solution in the library, you just need to implement the abstract methods\nexposed by these base classes which use objects from the [TorchRL](https://github.com/pytorch/rl) library.\n\nHere is an example of how you can create a custom algorithm [![Example](https://img.shields.io/badge/Example-blue.svg)](examples/extending/algorithm).\n\nHere is an example of how you can create a custom task [![Example](https://img.shields.io/badge/Example-blue.svg)](examples/extending/task).\n\nHere is an example of how you can create a custom model [![Example](https://img.shields.io/badge/Example-blue.svg)](examples/extending/model).\n\n\n## Configuring\nAs highlighted in the [run](#run) section, the project can be configured either\nin the script itself or via [hydra](https://hydra.cc/docs/intro/). \nWe suggest reading the hydra documentation\nto get familiar with all its functionalities. 
\n\nEach component in the project has a corresponding yaml configuration in the BenchMARL \n[conf tree](benchmarl/conf). \nComponents' configurations are loaded from these files into python dataclasses that act \nas schemas for validation of parameter names and types. That way we keep the best of \nboth worlds: separation of all configuration from code and strong typing for validation! \nYou can also directly load and validate configuration yaml files without using hydra from a script by calling \n`ComponentConfig.get_from_yaml()`.\n\n### Experiment\n\nExperiment configurations are in [`benchmarl/conf/config.yaml`](benchmarl/conf/config.yaml).\nRunning custom experiments is greatly simplified by the [Hydra](https://hydra.cc/) configurations.\nThe default configuration for the library is contained in the [`benchmarl/conf`](benchmarl/conf) folder.\n\nWhen running an experiment you can override its hyperparameters like so\n```bash\npython benchmarl/run.py task=vmas/balance algorithm=mappo experiment.lr=0.03 experiment.evaluation=true experiment.train_device=\"cpu\"\n```\n\nExperiment hyperparameters are loaded from [`benchmarl/conf/experiment/base_experiment.yaml`](benchmarl/conf/experiment/base_experiment.yaml)\ninto a dataclass [`ExperimentConfig`](benchmarl/experiment/experiment.py) defining their domain.\nThis makes it so that all and only the parameters expected are loaded with the right types.\nYou can also directly load them from a script by calling `ExperimentConfig.get_from_yaml()`.\n\nHere is an example of overriding experiment hyperparameters from hydra \n[![Example](https://img.shields.io/badge/Example-blue.svg)](examples/configuring/configuring_experiment.sh) or from\na script [![Example](https://img.shields.io/badge/Example-blue.svg)](examples/configuring/configuring_experiment.py).\n\n### Algorithm\n\nYou can override an algorithm configuration when launching BenchMARL.\n\n```bash\npython benchmarl/run.py task=vmas/balance algorithm=masac 
algorithm.num_qvalue_nets=3 algorithm.target_entropy=auto algorithm.share_param_critic=true\n```\n\nAvailable algorithms and their default configs can be found at [`benchmarl/conf/algorithm`](benchmarl/conf/algorithm).\nThey are loaded into a dataclass [`AlgorithmConfig`](benchmarl/algorithms/common.py), present for each algorithm, defining their domain.\nThis makes it so that all and only the parameters expected are loaded with the right types.\nYou can also directly load them from a script by calling `YourAlgorithmConfig.get_from_yaml()`.\n\nHere is an example of overriding algorithm hyperparameters from hydra \n[![Example](https://img.shields.io/badge/Example-blue.svg)](examples/configuring/configuring_algorithm.sh) or from\na script [![Example](https://img.shields.io/badge/Example-blue.svg)](examples/configuring/configuring_algorithm.py).\n\n\n### Task\n\nYou can override a task configuration when launching BenchMARL.\nHowever, this is not recommended for benchmarking, as tasks should have a fixed version and parameters for reproducibility.\n\n```bash\npython benchmarl/run.py task=vmas/balance algorithm=mappo task.n_agents=4\n```\n\nAvailable tasks and their default configs can be found at [`benchmarl/conf/task`](benchmarl/conf/task).\nThey are loaded into a dataclass [`TaskConfig`](benchmarl/environments/common.py), defining their domain.\nTasks are enumerations under the environment name. For example, `VmasTask.NAVIGATION` represents the navigation task in the\nVMAS simulator. 
This allows autocompletion and seeing all available tasks at once.\nYou can also directly load them from a script by calling `YourEnvTask.TASK_NAME.get_from_yaml()`.\n\nHere is an example of overriding task hyperparameters from hydra \n[![Example](https://img.shields.io/badge/Example-blue.svg)](examples/configuring/configuring_task.sh) or from\na script [![Example](https://img.shields.io/badge/Example-blue.svg)](examples/configuring/configuring_task.py).\n\n### Model\n\nYou can override the model configuration when launching BenchMARL.\nBy default, an MLP model will be loaded with the default config.\nYou can change it like so:\n\n```bash\npython benchmarl/run.py task=vmas/balance algorithm=mappo model=layers/mlp model.layer_class=\"torch.nn.Linear\" \"model.num_cells=[32,32]\" model.activation_class=\"torch.nn.ReLU\"\n```\n\nAvailable models and their configs can be found at [`benchmarl/conf/model/layers`](benchmarl/conf/model/layers).\nThey are loaded into a dataclass [`ModelConfig`](benchmarl/models/common.py), defining their domain.\nYou can also directly load them from a script by calling `YourModelConfig.get_from_yaml()`.\n\nHere is an example of overriding model hyperparameters from hydra \n[![Example](https://img.shields.io/badge/Example-blue.svg)](examples/configuring/configuring_model.sh) or from\na script [![Example](https://img.shields.io/badge/Example-blue.svg)](examples/configuring/configuring_model.py).\n\n#### Sequence model\nYou can compose layers into a sequence model.\nAvailable layer names are in the [`benchmarl/conf/model/layers`](benchmarl/conf/model/layers) folder.\n\n```bash\npython benchmarl/run.py task=vmas/balance algorithm=mappo model=sequence \"model.intermediate_sizes=[256]\" \"model/layers@model.layers.l1=mlp\" \"model/layers@model.layers.l2=mlp\" \"+model/layers@model.layers.l3=mlp\" \"model.layers.l3.num_cells=[3]\"\n```\nAdd a layer with `\"+model/layers@model.layers.l3=mlp\"`.\n\nRemove a layer with 
`\"~model.layers.l2\"`.\n\nConfigure a layer with `\"model.layers.l1.num_cells=[3]\"`.\n\nHere is an example of creating a sequence model from hydra \n[![Example](https://img.shields.io/badge/Example-blue.svg)](examples/configuring/configuring_sequence_model.sh) or from\na script [![Example](https://img.shields.io/badge/Example-blue.svg)](examples/configuring/configuring_sequence_model.py).\n\n## Features\n\nBenchMARL has several features:\n- A test CI with integration and training test routines that are run for all simulators and algorithms\n- Integration in the official TorchRL ecosystem for dedicated support\n- Possibility of using different algorithms and models for different agent groups (see [`examples/ensemble`](examples/ensemble))\n\n\n### Logging\n\nBenchMARL is compatible with the [TorchRL loggers](https://github.com/pytorch/rl/tree/main/torchrl/record/loggers).\nA list of logger names can be provided in the [experiment config](benchmarl/conf/experiment/base_experiment.yaml).\nExamples of available options are: `wandb`, `csv`, `mlflow`, `tensorboard` or any other option available in TorchRL. 
You can specify the loggers\nin the yaml config files or in the script arguments like so:\n```bash\npython benchmarl/run.py algorithm=mappo task=vmas/balance \"experiment.loggers=[wandb]\"\n```\nThe wandb logger is fully compatible with experiment restoring and will automatically resume the run of \nthe loaded experiment.\n\n### Checkpointing\n\nExperiments can be checkpointed every `experiment.checkpoint_interval` collected frames.\nExperiments will use an output folder for logging and checkpointing which can be specified in `experiment.save_folder`.\nIf this is left unspecified,\nthe default will be the hydra output folder (if using hydra) or (otherwise) the current directory \nwhere the script is launched.\nThe output folder will contain a folder for each experiment with the corresponding experiment name.\nTheir checkpoints will be stored in a `\"checkpoints\"` folder within the experiment folder.\n```bash\npython benchmarl/run.py task=vmas/balance algorithm=mappo experiment.max_n_iters=3 experiment.on_policy_collected_frames_per_batch=100 experiment.checkpoint_interval=100\n```\n\nTo load from a checkpoint, pass the absolute checkpoint file name to `experiment.restore_file`.\n```bash\npython benchmarl/run.py task=vmas/balance algorithm=mappo experiment.max_n_iters=6 experiment.on_policy_collected_frames_per_batch=100 experiment.restore_file=\"/hydra/experiment/folder/checkpoint/checkpoint_300.pt\"\n```\nHere is a python example when modifying the config\n[![Example](https://img.shields.io/badge/Example-blue.svg)](examples/checkpointing/reload_experiment.py)\nand one keeping the same config \n[![Example](https://img.shields.io/badge/Example-blue.svg)](examples/checkpointing/resume_experiment.py).\n\nThere are also ways to **resume** and **evaluate** hydra experiments directly from the file\n```bash\npython benchmarl/evaluate.py ../outputs/2024-09-09/20-39-31/mappo_balance_mlp__cd977b69_24_09_09-20_39_31/checkpoints/checkpoint_100.pt\n```\n```bash\npython 
benchmarl/resume.py ../outputs/2024-09-09/20-39-31/mappo_balance_mlp__cd977b69_24_09_09-20_39_31/checkpoints/checkpoint_100.pt\n```\n\n### Callbacks\n\nExperiments optionally take a list of [`Callback`](benchmarl/experiment/callback.py) which have several methods\nthat you can implement to see what's going on during training such \nas `on_batch_collected`, `on_train_end`, and `on_evaluation_end`.\n\n[![Example](https://img.shields.io/badge/Example-blue.svg)](examples/callback/custom_callback.py)\n\n\n## Citing BenchMARL\n\nIf you use BenchMARL in your research please use the following BibTeX entry:\n\n```BibTeX\n@article{bettini2024benchmarl,\n  author  = {Matteo Bettini and Amanda Prorok and Vincent Moens},\n  title   = {BenchMARL: Benchmarking Multi-Agent Reinforcement Learning},\n  journal = {Journal of Machine Learning Research},\n  year    = {2024},\n  volume  = {25},\n  number  = {217},\n  pages   = {1--10},\n  url     = {http://jmlr.org/papers/v25/23-1612.html}\n}\n```\n\n## License\nBenchMARL is licensed under the MIT License. See [LICENSE](LICENSE) for details.\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Ffacebookresearch%2Fbenchmarl","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Ffacebookresearch%2Fbenchmarl","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Ffacebookresearch%2Fbenchmarl/lists"}