MATE: the Multi-Agent Tracking Environment.
- Host: GitHub
- URL: https://github.com/xuehaipan/mate
- Owner: XuehaiPan
- License: MIT
- Created: 2021-08-21T06:13:49.000Z (over 3 years ago)
- Default Branch: main
- Last Pushed: 2023-03-31T07:39:07.000Z (about 2 years ago)
- Topics: multi-agent-reinforcement-learning, openai-gym, openai-gym-environment, reinforcement-learning, reinforcement-learning-algorithms, reinforcement-learning-environment
- Language: Python
- Homepage: https://mate-gym.readthedocs.io
- Size: 492 KB
- Stars: 33
- Watchers: 3
- Forks: 22
- Open Issues: 2
- Metadata Files:
  - Readme: README.md
  - License: LICENSE
# MATE: the Multi-Agent Tracking Environment
This repo contains the source code of `MATE`, the _**M**ulti-**A**gent **T**racking **E**nvironment_. The full documentation can be found at https://mate-gym.readthedocs.io. The full list of implemented agents can be found in section [Implemented Algorithms](#implemented-algorithms). For a detailed description, please check out our paper ([PDF](https://openreview.net/pdf?id=SyoUVEyzJbE), [bibtex](#citation)).
MATE is an **asymmetric two-team zero-sum stochastic game** with _partial observations_, in which each team has multiple agents (multiplayer). Intra-team communication is allowed, but inter-team communication is prohibited. The game is **cooperative** among teammates but **competitive** between the two teams (opponents).
## Installation
```bash
git config --global core.symlinks true # required on Windows
pip3 install git+https://github.com/XuehaiPan/mate.git#egg=mate
```

**NOTE:** Python 3.7+ is required; Python versions lower than 3.7 are not supported.
It is highly recommended to create a new isolated virtual environment for `MATE` using [`conda`](https://docs.conda.io/en/latest/miniconda.html):
```bash
git clone https://github.com/XuehaiPan/mate.git && cd mate
conda env create --no-default-packages --file conda-recipes/basic.yaml # or full-cpu.yaml to install RLlib
conda activate mate
```

## Getting Started

Make the `MultiAgentTracking` environment and play!
```python
import mate

# Base environment for MultiAgentTracking
env = mate.make('MultiAgentTracking-v0')
env.seed(0)
done = False
camera_joint_observation, target_joint_observation = env.reset()

while not done:
    camera_joint_action, target_joint_action = env.action_space.sample()  # your agent here (this takes random actions)
    (
        (camera_joint_observation, target_joint_observation),
        (camera_team_reward, target_team_reward),
        done,
        (camera_infos, target_infos)
    ) = env.step((camera_joint_action, target_joint_action))
```

Another example with a built-in single-team wrapper (see also [Built-in Wrappers](#built-in-wrappers)):
```python
import mate

env = mate.make('MultiAgentTracking-v0')
env = mate.MultiTarget(env, camera_agent=mate.GreedyCameraAgent(seed=0))
env.seed(0)
done = False
target_joint_observation = env.reset()

while not done:
    target_joint_action = env.action_space.sample()  # your agent here (this takes random actions)
    target_joint_observation, target_team_reward, done, target_infos = env.step(target_joint_action)
```
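The camera team can be wrapped the same way. Below is a hedged sketch of the mirror-image setup, assuming `mate.MultiCamera` (listed in [Built-in Wrappers](#built-in-wrappers)) accepts a `target_agent` argument analogous to `camera_agent` above; check the wrapper documentation for the exact signature:

```python
import mate

# Single-team wrapper for the camera side; targets are driven by a built-in agent.
# NOTE: the `target_agent=` keyword is assumed by analogy with `camera_agent=` above.
env = mate.make('MultiAgentTracking-v0')
env = mate.MultiCamera(env, target_agent=mate.GreedyTargetAgent(seed=0))
env.seed(0)
done = False
camera_joint_observation = env.reset()

while not done:
    camera_joint_action = env.action_space.sample()  # your agent here (this takes random actions)
    camera_joint_observation, camera_team_reward, done, camera_infos = env.step(camera_joint_action)
```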
*(Demo GIF: 4 Cameras vs. 8 Targets, 9 Obstacles)*

### Examples and Demos
[`mate/evaluate.py`](mate/evaluate.py) contains the example evaluation code for the `MultiAgentTracking` environment. Try out the following demos:
```bash
# (4 cameras, 2 targets, 9 obstacles)
python3 -m mate.evaluate --episodes 1 --config MATE-4v2-9.yaml

# (4 cameras, 8 targets, 9 obstacles)
python3 -m mate.evaluate --episodes 1 --config MATE-4v8-9.yaml

# (8 cameras, 8 targets, 9 obstacles)
python3 -m mate.evaluate --episodes 1 --config MATE-8v8-9.yaml

# (4 cameras, 8 targets, 0 obstacles)
python3 -m mate.evaluate --episodes 1 --config MATE-4v8-0.yaml

# (0 cameras, 8 targets, 32 obstacles)
python3 -m mate.evaluate --episodes 1 --config MATE-Navigation.yaml
```
*(Demo GIFs: 4 Cameras vs. 2 Targets (9 obstacles) · 4 Cameras vs. 8 Targets (9 obstacles) · 8 Cameras vs. 8 Targets (9 obstacles) · 4 Cameras vs. 8 Targets (no obstacles) · 8 Targets Navigation (no cameras))*
You can specify the agent classes and arguments by:
```bash
python3 -m mate.evaluate --camera-agent module:class --camera-kwargs <JSON-STRING> --target-agent module:class --target-kwargs <JSON-STRING>
```

You can find the example code for agents in [`examples`](examples). The full list of implemented agents can be found in section [Implemented Algorithms](#implemented-algorithms). For example:
```bash
# Example demos in examples
python3 -m examples.naive

# Use the evaluation script
python3 -m mate.evaluate --episodes 1 --render-communication \
--camera-agent examples.greedy:GreedyCameraAgent --camera-kwargs '{"memory_period": 20}' \
--target-agent examples.greedy:GreedyTargetAgent \
--config MATE-4v8-9.yaml \
--seed 0
```
You can implement your own custom agent classes to play around. See [Make Your Own Agents](docs/source/getting-started.rst#make-your-own-agents) for more details.
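As a rough illustration (not MATE's official agent interface; see the linked docs and the [`examples`](examples) directory for that), a custom policy can be as simple as a plain class whose `act()` method is called from the rollout loop shown in [Getting Started](#getting-started):

```python
import mate


class ScriptedTargetPolicy:
    """Toy joint policy for the target team; replace act() with your own logic."""

    def __init__(self, joint_action_space):
        self.joint_action_space = joint_action_space

    def act(self, joint_observation):
        # Placeholder decision rule: ignore the observation and sample a random joint action.
        return self.joint_action_space.sample()


env = mate.make('MultiAgentTracking-v0')
env = mate.MultiTarget(env, camera_agent=mate.GreedyCameraAgent(seed=0))
env.seed(0)

policy = ScriptedTargetPolicy(env.action_space)
joint_observation = env.reset()
done = False
while not done:
    joint_action = policy.act(joint_observation)
    joint_observation, team_reward, done, infos = env.step(joint_action)
```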
## Environment Configurations
The `MultiAgentTracking` environment accepts a Python dictionary mapping or a configuration file in JSON or YAML format.
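For instance, since a plain dictionary is accepted, one option is to load a YAML file yourself and edit the resulting dict before constructing the environment. This is a minimal sketch, assuming PyYAML is installed, that the illustrative path below points at a valid MATE config, and that the `config=` keyword shown later in this section also takes the loaded dict:

```python
import yaml  # PyYAML; assumed available
import mate

# Illustrative path; see the copy command below for locating the bundled presets.
with open('/path/to/MATE-4v8-9.yaml') as f:
    config = yaml.safe_load(f)

# Edit fields of the dict here before constructing the environment.
env = mate.make('MultiAgentTracking-v0', config=config)
```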
If you want to use customized environment configurations, you can copy the default configuration file:

```bash
cp "$(python3 -m mate.assets)"/MATE-4v8-9.yaml MyEnvCfg.yaml
```

Then make your own modifications. Use the modified environment by:
```python
env = mate.make('MultiAgentTracking-v0', config='/path/to/your/cfg/file')
```

There are several preset configuration files in the [`mate/assets`](mate/assets) directory:
```python
# (4 cameras, 2 targets, 9 obstacles)
env = mate.make('MATE-4v2-9-v0')

# (4 cameras, 8 targets, 9 obstacles)
env = mate.make('MATE-4v8-9-v0')

# (8 cameras, 8 targets, 9 obstacles)
env = mate.make('MATE-8v8-9-v0')

# (4 cameras, 8 targets, 0 obstacles)
env = mate.make('MATE-4v8-0-v0')

# (0 cameras, 8 targets, 32 obstacles)
env = mate.make('MATE-Navigation-v0')
```

You can reinitialize the environment with a new configuration without creating a new instance:
```python
>>> env = mate.make('MultiAgentTracking-v0', wrappers=[mate.MoreTrainingInformation])  # we support wrappers
>>> print(env)
<MoreTrainingInformation<MultiAgentTracking<MultiAgentTracking-v0>>(4 cameras, 8 targets, 9 obstacles)>

>>> env.load_config('MATE-8v8-9.yaml')
>>> print(env)
<MoreTrainingInformation<MultiAgentTracking<MultiAgentTracking-v0>>(8 cameras, 8 targets, 9 obstacles)>
```

Besides, we provide a script [`mate/assets/generator.py`](mate/assets/generator.py) to generate a configuration file with reasonable camera placement:
```bash
python3 -m mate.assets.generator --path 24v48.yaml --num-cameras 24 --num-targets 48 --num-obstacles 20
```

See [Environment Customization](docs/source/getting-started.rst#environment-customization) for more details.
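The generated file can then be loaded like any other configuration (a brief sketch; `24v48.yaml` is the file produced by the command above):

```python
import mate

# Load the configuration file produced by mate.assets.generator.
env = mate.make('MultiAgentTracking-v0', config='24v48.yaml')
```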
## Built-in Wrappers
MATE provides multiple wrappers for different settings, such as _full observability_, _discrete action spaces_, and _single-team multi-agent_ setups. See [Built-in Wrappers](docs/source/wrappers.rst#wrappers) for more details.
| Type | Wrapper | Description |
| :-- | :-- | :-- |
| observation | `EnhancedObservation` | Enhance the agent's observation, which sets all observation masks to `True`. |
| observation | `SharedFieldOfView` | Share field of view among agents in the same team, which applies the `or` operator over the observation masks. The target agents share the empty status of warehouses. |
| observation | `MoreTrainingInformation` | Add more environment and agent information to the `info` field of `step()`, enabling full observability of the environment. |
| observation | `RescaledObservation` | Rescale all entity states in the observation to [-1, +1]. |
| observation | `RelativeCoordinates` | Convert all locations of other entities in the observation to relative coordinates. |
| action | `DiscreteCamera` | Allow cameras to use discrete actions. |
| action | `DiscreteTarget` | Allow targets to use discrete actions. |
| reward | `AuxiliaryCameraRewards` | Add additional auxiliary rewards for each individual camera. |
| reward | `AuxiliaryTargetRewards` | Add additional auxiliary rewards for each individual target. |
| single-team | `MultiCamera` / `MultiTarget` | Wrap into a single-team multi-agent environment. |
| single-team | `SingleCamera` / `SingleTarget` | Wrap into a single-team single-agent environment. |
| communication | `MessageFilter` | Filter messages from agents of intra-team communications. |
| communication | `RandomMessageDropout` | Randomly drop messages in communication channels. |
| communication | `RestrictedCommunicationRange` | Add a restricted communication range to channels. |
| communication | `NoCommunication` | Disable intra-team communications, i.e., filter out all messages. |
| communication | `ExtraCommunicationDelays` | Add extra message delays to communication channels. |
| miscellaneous | `RepeatedRewardIndividualDone` | Repeat the `reward` field and assign individual `done` field of `step()`, which is similar to MPE. |
You can create an environment with multiple wrappers at once. For example:
```python
env = mate.make('MultiAgentTracking-v0',
                wrappers=[
                    mate.EnhancedObservation,
                    mate.MoreTrainingInformation,
                    mate.WrapperSpec(mate.DiscreteCamera, levels=5),
                    mate.WrapperSpec(mate.MultiCamera, target_agent=mate.GreedyTargetAgent(seed=0)),
                    mate.RepeatedRewardIndividualDone,
                    mate.WrapperSpec(mate.AuxiliaryCameraRewards,
                                     coefficients={'raw_reward': 1.0,
                                                   'coverage_rate': 1.0,
                                                   'soft_coverage_score': 1.0,
                                                   'baseline': -2.0}),
                ])
```

## Implemented Algorithms
The following algorithms are implemented in [`examples`](examples):
- **Rule-based:**
  1. **Random** (source: [`mate/agents/random.py`](mate/agents/random.py))
  1. **Naive** (source: [`mate/agents/naive.py`](mate/agents/naive.py))
  1. **Greedy** (source: [`mate/agents/greedy.py`](mate/agents/greedy.py))
  1. **Heuristic** (source: [`mate/agents/heuristic.py`](mate/agents/heuristic.py))

- **Multi-Agent Reinforcement Learning Algorithms:**
  1. **IQL**
  1. **QMIX**
  1. **MADDPG** (MA-TD3)
  1. **IPPO**
  1. **MAPPO**

- _Multi-Agent Reinforcement Learning Algorithms_ with **Multi-Agent Communication:**
  1. **TarMAC** (base algorithm: IPPO)
  1. **TarMAC** (base algorithm: MAPPO)
  1. **I2C** (base algorithm: MAPPO)

- **Population Based Adversarial Policy Learning**, available meta-solvers:
  1. Self-Play (SP)
  1. Fictitious Self-Play (FSP)
  1. PSRO-Nash (NE)

**NOTE:** All learning-based algorithms are tested with [Ray 1.12.0](https://github.com/ray-project/ray) on Ubuntu 20.04 LTS.
## Citation
If you find MATE useful, please consider citing:
```bibtex
@inproceedings{pan2022mate,
title = {{MATE}: Benchmarking Multi-Agent Reinforcement Learning in Distributed Target Coverage Control},
author = {Xuehai Pan and Mickel Liu and Fangwei Zhong and Yaodong Yang and Song-Chun Zhu and Yizhou Wang},
booktitle = {Thirty-sixth Conference on Neural Information Processing Systems Datasets and Benchmarks Track},
year = {2022},
url = {https://openreview.net/forum?id=SyoUVEyzJbE}
}
```

## License
MIT License