
# MATE: the Multi-Agent Tracking Environment

This repo contains the source code of `MATE`, the _**M**ulti-**A**gent **T**racking **E**nvironment_. The full documentation is available on the project's documentation site. The full list of implemented agents can be found in section [Implemented Algorithms](#implemented-algorithms). For a detailed description, please check out our paper ([PDF](https://openreview.net/pdf?id=SyoUVEyzJbE), [bibtex](#citation)).

This is an **asymmetric two-team zero-sum stochastic game** with _partial observations_, where each team has multiple agents (multiplayer). Intra-team communications are allowed, but inter-team communications are prohibited. The game is **cooperative** among teammates but **competitive** between teams (opponents).

## Installation

```bash
git config --global core.symlinks true # required on Windows
pip3 install git+https://github.com/XuehaiPan/mate.git#egg=mate
```

**NOTE:** Python 3.7+ is required; Python versions lower than 3.7 are not supported.

It is highly recommended to create a new isolated virtual environment for `MATE` using [`conda`](https://docs.conda.io/en/latest/miniconda.html):

```bash
git clone https://github.com/XuehaiPan/mate.git && cd mate
conda env create --no-default-packages --file conda-recipes/basic.yaml # or full-cpu.yaml to install RLlib
conda activate mate
```
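
As a quick sanity check (not part of the official instructions), importing the package and creating the default environment should succeed after installation:

```bash
python3 -c "import mate; print(mate.make('MultiAgentTracking-v0'))"
```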

## Getting Started

Make the `MultiAgentTracking` environment and play!

```python
import mate

# Base environment for MultiAgentTracking
env = mate.make('MultiAgentTracking-v0')
env.seed(0)
done = False
camera_joint_observation, target_joint_observation = env.reset()
while not done:
    camera_joint_action, target_joint_action = env.action_space.sample()  # your agents here (this takes random actions)
    (
        (camera_joint_observation, target_joint_observation),
        (camera_team_reward, target_team_reward),
        done,
        (camera_infos, target_infos),
    ) = env.step((camera_joint_action, target_joint_action))
```
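
As the unpacking above suggests, observations, rewards, and infos come in per-team pairs. A short sketch for inspecting the joint spaces (standard Gym attributes; the exact space classes are not spelled out here):

```python
import mate

env = mate.make('MultiAgentTracking-v0')
# The joint action space covers both teams; sample() returns a
# (camera_joint_action, target_joint_action) pair as used above.
print(env.observation_space)
print(env.action_space)
```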

Another example with a built-in single-team wrapper (see also [Built-in Wrappers](#built-in-wrappers)):

```python
import mate

env = mate.make('MultiAgentTracking-v0')
env = mate.MultiTarget(env, camera_agent=mate.GreedyCameraAgent(seed=0))
env.seed(0)
done = False
target_joint_observation = env.reset()
while not done:
    target_joint_action = env.action_space.sample()  # your agents here (this takes random actions)
    target_joint_observation, target_team_reward, done, target_infos = env.step(target_joint_action)
```
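
The camera side works symmetrically: wrap the base environment with `mate.MultiCamera` (see [Built-in Wrappers](#built-in-wrappers)) and plug in a built-in target agent. A sketch mirroring the example above:

```python
import mate

# Single-team view for the cameras: targets are controlled by a built-in agent.
env = mate.make('MultiAgentTracking-v0')
env = mate.MultiCamera(env, target_agent=mate.GreedyTargetAgent(seed=0))
env.seed(0)
done = False
camera_joint_observation = env.reset()
while not done:
    camera_joint_action = env.action_space.sample()  # your camera agents here
    camera_joint_observation, camera_team_reward, done, camera_infos = env.step(camera_joint_action)
```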


*Screencast: 4 Cameras vs. 8 Targets (9 Obstacles)*

### Examples and Demos

[`mate/evaluate.py`](mate/evaluate.py) contains the example evaluation code for the `MultiAgentTracking` environment. Try out the following demos:

```bash
# (4 cameras, 2 targets, 9 obstacles)
python3 -m mate.evaluate --episodes 1 --config MATE-4v2-9.yaml

# (4 cameras, 8 targets, 9 obstacles)
python3 -m mate.evaluate --episodes 1 --config MATE-4v8-9.yaml

# (8 cameras, 8 targets, 9 obstacles)
python3 -m mate.evaluate --episodes 1 --config MATE-8v8-9.yaml

# (4 cameras, 8 targets, 0 obstacles)
python3 -m mate.evaluate --episodes 1 --config MATE-4v8-0.yaml

# (0 cameras, 8 targets, 32 obstacles)
python3 -m mate.evaluate --episodes 1 --config MATE-Navigation.yaml
```


*Demo screencasts: 4 Cameras vs. 2 Targets (9 obstacles), 4 Cameras vs. 8 Targets (9 obstacles), 8 Cameras vs. 8 Targets (9 obstacles), 4 Cameras vs. 8 Targets (no obstacles), and 8 Targets Navigation (no cameras).*

You can specify the agent classes and their keyword arguments (passed as JSON strings) on the command line:

```bash
python3 -m mate.evaluate --camera-agent module:class --camera-kwargs '<json-string>' --target-agent module:class --target-kwargs '<json-string>'
```

You can find the example code for agents in [`examples`](examples). The full list of implemented agents can be found in section [Implemented Algorithms](#implemented-algorithms). For example:

```bash
# Example demos in examples
python3 -m examples.naive

# Use the evaluation script
python3 -m mate.evaluate --episodes 1 --render-communication \
    --camera-agent examples.greedy:GreedyCameraAgent --camera-kwargs '{"memory_period": 20}' \
    --target-agent examples.greedy:GreedyTargetAgent \
    --config MATE-4v8-9.yaml \
    --seed 0
```


*Screencast: Communication*

You can implement your own custom agents classes to play around. See [Make Your Own Agents](docs/source/getting-started.rst#make-your-own-agents) for more details.
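
For instance, a minimal sketch of a custom target agent might look like the following. The base class `mate.TargetAgentBase`, the `act()` signature, and the `self.action_space` attribute are assumptions inferred from the built-in agents in [`mate/agents`](mate/agents); consult the linked documentation for the actual interface.

```python
import mate


class MyTargetAgent(mate.TargetAgentBase):  # assumed base class; see the docs
    """A toy target agent that picks random actions (placeholder for real logic)."""

    def act(self, observation, info=None):  # assumed signature
        # Replace this with your own decision logic based on `observation`.
        return self.action_space.sample()  # `self.action_space` is assumed to come from the base class
```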

## Environment Configurations

The `MultiAgentTracking` environment accepts a Python dictionary mapping or a configuration file in JSON or YAML format.
If you want to use customized environment configurations, you can copy the default configuration file:

```bash
cp "$(python3 -m mate.assets)"/MATE-4v8-9.yaml MyEnvCfg.yaml
```

Then modify it to suit your needs and use the customized configuration:

```python
env = mate.make('MultiAgentTracking-v0', config='/path/to/your/cfg/file')
```
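
Since a plain Python dictionary mapping is also accepted (as noted above), you can equally load the YAML file yourself and tweak values in code before creating the environment. A sketch, assuming [PyYAML](https://pyyaml.org) is installed:

```python
import mate
import yaml

# Load the copied configuration and modify it programmatically before use.
with open('MyEnvCfg.yaml') as file:
    config = yaml.safe_load(file)

env = mate.make('MultiAgentTracking-v0', config=config)
```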

There are several preset configuration files in the [`mate/assets`](mate/assets) directory:

```python
# (4 cameras, 2 targets, 9 obstacles)
env = mate.make('MATE-4v2-9-v0')

# (4 cameras, 8 targets, 9 obstacles)
env = mate.make('MATE-4v8-9-v0')

# (8 cameras, 8 targets, 9 obstacles)
env = mate.make('MATE-8v8-9-v0')

# (4 cameras, 8 targets, 0 obstacles)
env = mate.make('MATE-4v8-0-v0')

# (0 cameras, 8 targets, 32 obstacles)
env = mate.make('MATE-Navigation-v0')
```

You can reinitialize the environment with a new configuration without creating a new instance:

```python
>>> env = mate.make('MultiAgentTracking-v0', wrappers=[mate.MoreTrainingInformation])  # we support wrappers
>>> print(env)  # the default configuration: (4 cameras, 8 targets, 9 obstacles)

>>> env.load_config('MATE-8v8-9.yaml')
>>> print(env)  # now: (8 cameras, 8 targets, 9 obstacles)
```

In addition, we provide a script [`mate/assets/generator.py`](mate/assets/generator.py) to generate a configuration file with reasonable camera placement:

```bash
python3 -m mate.assets.generator --path 24v48.yaml --num-cameras 24 --num-targets 48 --num-obstacles 20
```
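
The generated file can then be passed to the environment like any other configuration:

```python
import mate

# Use the freshly generated 24-camera vs. 48-target configuration.
env = mate.make('MultiAgentTracking-v0', config='24v48.yaml')
```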

See [Environment Customization](docs/source/getting-started.rst#environment-customization) for more details.

## Built-in Wrappers

MATE provides multiple wrappers for different settings, such as _full observability_, _discrete action spaces_, and _single-team multi-agent_ control. See [Built-in Wrappers](docs/source/wrappers.rst#wrappers) for more details.



| Category | Wrapper | Description |
| :------- | :------ | :---------- |
| observation | `EnhancedObservation` | Enhance the agent's observation, which sets all observation masks to `True`. |
| observation | `SharedFieldOfView` | Share field of view among agents in the same team, which applies the `or` operator over the observation masks. The target agents share the empty status of warehouses. |
| observation | `MoreTrainingInformation` | Add more environment and agent information to the `info` field of `step()`, enabling full observability of the environment. |
| observation | `RescaledObservation` | Rescale all entity states in the observation to [-1, +1]. |
| observation | `RelativeCoordinates` | Convert all locations of other entities in the observation to relative coordinates. |
| action | `DiscreteCamera` | Allow cameras to use discrete actions. |
| action | `DiscreteTarget` | Allow targets to use discrete actions. |
| reward | `AuxiliaryCameraRewards` | Add additional auxiliary rewards for each individual camera. |
| reward | `AuxiliaryTargetRewards` | Add additional auxiliary rewards for each individual target. |
| single-team | `MultiCamera` / `MultiTarget` | Wrap into a single-team multi-agent environment. |
| single-team | `SingleCamera` / `SingleTarget` | Wrap into a single-team single-agent environment. |
| communication | `MessageFilter` | Filter messages from agents of intra-team communications. |
| communication | `RandomMessageDropout` | Randomly drop messages in communication channels. |
| communication | `RestrictedCommunicationRange` | Add a restricted communication range to channels. |
| communication | `NoCommunication` | Disable intra-team communications, i.e., filter out all messages. |
| communication | `ExtraCommunicationDelays` | Add extra message delays to communication channels. |
| miscellaneous | `RepeatedRewardIndividualDone` | Repeat the `reward` field and assign individual `done` field of `step()`, which is similar to MPE. |


You can create an environment with multiple wrappers at once. For example:

```python
env = mate.make('MultiAgentTracking-v0',
                wrappers=[
                    mate.EnhancedObservation,
                    mate.MoreTrainingInformation,
                    mate.WrapperSpec(mate.DiscreteCamera, levels=5),
                    mate.WrapperSpec(mate.MultiCamera, target_agent=mate.GreedyTargetAgent(seed=0)),
                    mate.RepeatedRewardIndividualDone,
                    mate.WrapperSpec(mate.AuxiliaryCameraRewards,
                                     coefficients={'raw_reward': 1.0,
                                                   'coverage_rate': 1.0,
                                                   'soft_coverage_score': 1.0,
                                                   'baseline': -2.0}),
                ])
```
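
As the example suggests, wrappers that need constructor arguments are passed via `mate.WrapperSpec`, which bundles the wrapper class with its keyword arguments so the environment can instantiate and apply the wrappers in the listed order.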

## Implemented Algorithms

The following algorithms are implemented in [`examples`](examples):

- **Rule-based:**

  1. **Random** (source: [`mate/agents/random.py`](mate/agents/random.py))
  1. **Naive** (source: [`mate/agents/naive.py`](mate/agents/naive.py))
  1. **Greedy** (source: [`mate/agents/greedy.py`](mate/agents/greedy.py))
  1. **Heuristic** (source: [`mate/agents/heuristic.py`](mate/agents/heuristic.py))

- **Multi-Agent Reinforcement Learning Algorithms:**

  1. **IQL**
  1. **QMIX**
  1. **MADDPG** (MA-TD3)
  1. **IPPO**
  1. **MAPPO**

- _Multi-Agent Reinforcement Learning Algorithms_ with **Multi-Agent Communication:**

  1. **TarMAC** (base algorithm: IPPO)
  1. **TarMAC** (base algorithm: MAPPO)
  1. **I2C** (base algorithm: MAPPO)

- **Population-Based Adversarial Policy Learning**, available meta-solvers:

  1. Self-Play (SP)
  1. Fictitious Self-Play (FSP)
  1. PSRO-Nash (NE)

**NOTE:** All learning-based algorithms are tested with [Ray 1.12.0](https://github.com/ray-project/ray) on Ubuntu 20.04 LTS.

## Citation

If you find MATE useful, please consider citing:

```bibtex
@inproceedings{pan2022mate,
  title     = {{MATE}: Benchmarking Multi-Agent Reinforcement Learning in Distributed Target Coverage Control},
  author    = {Xuehai Pan and Mickel Liu and Fangwei Zhong and Yaodong Yang and Song-Chun Zhu and Yizhou Wang},
  booktitle = {Thirty-sixth Conference on Neural Information Processing Systems Datasets and Benchmarks Track},
  year      = {2022},
  url       = {https://openreview.net/forum?id=SyoUVEyzJbE}
}
```

## License

MIT License