{"id":19834974,"url":"https://github.com/hijkzzz/noisy-mappo","last_synced_at":"2025-05-01T17:32:01.463Z","repository":{"id":48142342,"uuid":"380657801","full_name":"hijkzzz/noisy-mappo","owner":"hijkzzz","description":"Multi-agent PPO with noise (97% win rates on Hard scenarios of SMAC)","archived":false,"fork":false,"pushed_at":"2023-06-09T22:41:25.000Z","size":155,"stargazers_count":30,"open_issues_count":0,"forks_count":5,"subscribers_count":3,"default_branch":"master","last_synced_at":"2023-06-09T23:23:28.286Z","etag":null,"topics":["mappo","multi-agent-reinforcement-learning","smac","sota"],"latest_commit_sha":null,"homepage":"https://arxiv.org/abs/2106.14334","language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"mit","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/hijkzzz.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null}},"created_at":"2021-06-27T05:23:35.000Z","updated_at":"2023-06-09T23:23:28.286Z","dependencies_parsed_at":"2022-09-21T18:42:02.620Z","dependency_job_id":null,"html_url":"https://github.com/hijkzzz/noisy-mappo","commit_stats":null,"previous_names":[],"tags_count":null,"template":null,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/hijkzzz%2Fnoisy-mappo","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/hijkzzz%2Fnoisy-mappo/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/hijkzzz%2Fnoisy-mappo/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/hijkzzz%2Fnoisy-mappo/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/hijkzzz","download_url":"https://codeload.github.com/hijkzzz/noisy-mappo/ta
r.gz/refs/heads/master","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":224270292,"owners_count":17283649,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["mappo","multi-agent-reinforcement-learning","smac","sota"],"created_at":"2024-11-12T12:06:06.313Z","updated_at":"2024-11-12T12:06:06.896Z","avatar_url":"https://github.com/hijkzzz.png","language":"Python","funding_links":[],"categories":[],"sub_categories":[],"readme":"# Noisy-MAPPO\r\nCode for [Policy Regularization via Noisy Advantage Values for Cooperative Multi-agent Actor-Critic methods](https://arxiv.org/abs/2106.14334). This repository is heavily based on https://github.com/marlbenchmark/on-policy. In this study, we find that perturbing the advantage function with noise can effectively improve the performance of MAPPO in SMAC.\r\n\r\n## Environments supported:\r\n\r\n- [StarCraftII (SMAC)](https://github.com/oxwhirl/smac)\r\n\r\n**StarCraft 2 version: SC2.4.10. Difficulty: 7.**\r\n\r\n## 1. Usage\r\n**WARNING: by default, all experiments assume a policy shared by all agents, i.e. there is one neural network shared by all agents.**\r\n\r\nAll core code is located within the onpolicy folder. The algorithms/ subfolder contains code\r\nfor MAPPO. \r\n\r\n* The config.py file contains relevant hyperparameter and env settings. Most hyperparameters default to the ones\r\nused in the paper; however, please refer to the appendix for a full list of hyperparameters used. \r\n\r\n## 2. Installation\r\n\r\n Here we give an example installation on CUDA == 11.1. 
For non-GPU \u0026 other CUDA version installations, please refer to the [PyTorch website](https://pytorch.org/get-started/locally/).\r\n\r\n``` Bash\r\n# create conda environment\r\nconda create -n marl python==3.7\r\nconda activate marl\r\nconda install pytorch torchvision torchaudio cudatoolkit=11.1 -c pytorch-lts -c nvidia\r\npip install -r requirements.txt\r\n```\r\n\r\n```\r\n# install on-policy package\r\ncd on-policy\r\npip install -e .\r\n```\r\n\r\nEven though we provide requirements.txt, it may contain redundant packages. We recommend installing any other required packages by running the code and checking which packages are still missing.\r\n\r\n### 2.1 Install StarCraftII [4.10](https://blzdistsc2-a.akamaihd.net/Linux/SC2.4.10.zip)\r\n\r\n``` Bash\r\ncd ~\r\nwget https://blzdistsc2-a.akamaihd.net/Linux/SC2.4.10.zip\r\nunzip -P iagreetotheeula SC2.4.10.zip\r\nrm -rf SC2.4.10.zip\r\n# append to ~/.bashrc rather than overwriting it\r\necho \"export SC2PATH=~/StarCraftII/\" \u003e\u003e ~/.bashrc\r\n```\r\n\r\n* Download [SMAC Maps](https://github.com/oxwhirl/smac/releases/download/v1/SMAC_Maps_V1.tar.gz), and move them to `~/StarCraftII/Maps/`.\r\n```\r\nwget https://github.com/oxwhirl/smac/releases/download/v0.1-beta1/SMAC_Maps.zip\r\nunzip SMAC_Maps.zip\r\nmv ./SMAC_Maps ~/StarCraftII/Maps/\r\n```\r\n\r\n* To use a stableid, copy `stableid.json` from https://github.com/Blizzard/s2client-proto.git to `~/StarCraftII/`.\r\n\r\n## 3. Train\r\n**Please modify the hyperparameters in the shell scripts according to the Appendix of the paper.**\r\n\r\n**Noisy-Value MAPPO (NV-MAPPO)**\r\n\r\n```\r\n./train_smac_value.sh 3s5z_vs_3s6z 3\r\n```\r\n\r\n**Noisy-Advantage MAPPO (NA-MAPPO)**\r\n\r\n```\r\n./train_smac_adv.sh 3s5z_vs_3s6z 3\r\n```\r\n\r\n**Noisy-Value IPPO (NV-IPPO)**\r\n\r\n```\r\n./train_smac_value_ippo.sh 3s5z_vs_3s6z 3\r\n```\r\n\r\n**Vanilla MAPPO (MAPPO)**\r\n\r\n```\r\n./train_smac_vanilla.sh 3s5z_vs_3s6z 3\r\n```\r\n\r\nLocal results are stored in the subfolder scripts/results. 
Note that we use TensorBoard as the default visualization platform.\r\n\r\n## Citation\r\n```\r\n@article{hu2021policy,\r\n      title={Policy Regularization via Noisy Advantage Values for Cooperative Multi-agent Actor-Critic methods}, \r\n      author={Jian Hu and Siyue Hu and Shih-wei Liao},\r\n      year={2021},\r\n      eprint={2106.14334},\r\n      archivePrefix={arXiv},\r\n      primaryClass={cs.MA}\r\n}\r\n```\r\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fhijkzzz%2Fnoisy-mappo","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fhijkzzz%2Fnoisy-mappo","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fhijkzzz%2Fnoisy-mappo/lists"}