{"id":24845873,"url":"https://github.com/lamda-rl/odis","last_synced_at":"2025-10-14T17:31:11.509Z","repository":{"id":88847551,"uuid":"606692428","full_name":"LAMDA-RL/ODIS","owner":"LAMDA-RL","description":"The implementation of ICLR-2023 paper \"Discovering Generalizable Multi-agent Coordination Skills from Multi-task Offline Data\".","archived":false,"fork":false,"pushed_at":"2024-10-31T16:05:48.000Z","size":6810,"stargazers_count":36,"open_issues_count":1,"forks_count":5,"subscribers_count":1,"default_branch":"main","last_synced_at":"2024-10-31T17:18:34.798Z","etag":null,"topics":["deep-learning","machine-learning","multi-agent-reinforcement-learning","pytorch","reinforcement-learning"],"latest_commit_sha":null,"homepage":"","language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"apache-2.0","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/LAMDA-RL.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null}},"created_at":"2023-02-26T09:16:40.000Z","updated_at":"2024-10-31T16:05:54.000Z","dependencies_parsed_at":"2024-07-26T02:46:31.577Z","dependency_job_id":null,"html_url":"https://github.com/LAMDA-RL/ODIS","commit_stats":null,"previous_names":[],"tags_count":0,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/LAMDA-RL%2FODIS","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/LAMDA-RL%2FODIS/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/LAMDA-RL%2FODIS/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/LAMDA-RL%2FODIS/manifests","owner_
url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/LAMDA-RL","download_url":"https://codeload.github.com/LAMDA-RL/ODIS/tar.gz/refs/heads/main","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":236496058,"owners_count":19158144,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["deep-learning","machine-learning","multi-agent-reinforcement-learning","pytorch","reinforcement-learning"],"created_at":"2025-01-31T10:17:07.955Z","updated_at":"2025-10-14T17:31:05.182Z","avatar_url":"https://github.com/LAMDA-RL.png","language":"Python","funding_links":[],"categories":[],"sub_categories":[],"readme":"# ODIS: Offline coordination skill discovery in MARL\n\n[Paper Link](https://openreview.net/forum?id=53FyUAdP7d\u0026referrer=%5BAuthor%20Console%5D(%2Fgroup%3Fid%3DICLR.cc%2F2023%2FConference%2FAuthors%23your-submissions))\n\nThis is the implementation of the ICLR 2023 paper \"Discovering Generalizable Multi-agent Coordination Skills from Multi-task Offline Data\". \n\n## Installation instructions\n\n### Install StarCraft II\n\nSet up StarCraft II and SMAC:\n\n```bash\nbash install_sc2.sh\n```\n\nThis will download SC2.4.10 into the 3rdparty folder and copy the maps necessary to run over. 
You may also need to persist the environment variable `SC2PATH` (e.g., append this command to `.bashrc`):\n\n```bash\nexport SC2PATH=[Your SC2 folder like /abc/xyz/3rdparty/StarCraftII]\n```\n\n### Install Python environment\n\nInstall the Python environment with conda:\n\n```bash\nconda create -n odis python=3.10 -y\nconda activate odis\npip install -r requirements.txt\n```\n\n### Configure SMAC package\n\nWe extend the original [SMAC](https://github.com/oxwhirl/smac) package by adding additional maps for multi-task evaluation. Here is a simple script that makes some modifications to `smac` and copies the additional maps to the StarCraft II installation. Please make sure that you have set `SC2PATH` correctly.\n\n```bash\ngit clone https://github.com/oxwhirl/smac.git\npip install -e smac/\nbash install_smac_patch.sh\n```\n\n## Run experiments\n\nYou can execute the following command to run ODIS with a toy task config, which will perform training on a small batch of data:\n\n```bash\npython src/main.py --mto --config=odis --env-config=sc2_offline --task-config=toy --seed=1\n```\n\nThe `--task-config` flag accepts any existing config name in the `src/config/tasks/` directory, and any other config option named `xx` can be passed as `--xx=value`.\n\nAs the dataset is large, the default code base only includes a toy task config with `3m` medium data in the `dataset` folder. The full dataset is available at this [Google Drive URL](https://drive.google.com/file/d/1BZSNaAzEN7nAGthsDCpIxXOo1oVoLdqP/view?usp=share_link), and you can replace the original data with it. After putting the full dataset in the `dataset` folder, you can run experiments on our pre-defined task sets, for example:\n\n```bash\npython src/main.py --mto --config=odis --env-config=sc2_offline --task-config=marine-hard-expert --seed=1\n```\n\nAll results will be stored in the `results` folder. 
You can see the console output, config, and tensorboard logging in the corresponding directory.\n\nAlternatively, you can use the following command to run the baselines, including BC-t, BC-r, UPDeT-m, and UPDeT-l.\n\n```bash\npython src/main.py --baseline_run --config=updet-m --env-config=sc2_offline --task-config=toy --seed=1\n# config=[updet-m/updet-l/bc-t/bc-r]\n```\n\n### Cooperative Navigation Tasks\n\nWe also provide a dataset for the cooperative navigation tasks. Similarly, you can use the following command to run ODIS:\n\n```bash\npython src/main.py --mto --config=odis --env-config=cn_offline --task-config=cn-expert --entity_embed_dim=64 --t_max=30000 --seed=1\n```\n\nUse the following command to run the baselines:\n\n```bash\npython src/main.py --baseline_run --config=bc-t --env-config=cn_offline --task-config=cn-expert --entity_embed_dim=64 --t_max=30000 --seed=1\n# config=[updet-m/updet-l/bc-t/bc-r]\n```\n\n### Data Collection\n\nWe provide our scripts for collecting data in the ODIS dataset style. You can run the following command to collect data:\n\n```bash\npython src/main.py --data_collect --config=qmix --env-config=sc2_collect --offline_data_quality=expert --num_episodes_collected=2000 --map_name=5m_vs_6m --save_replay_buffer=False\n```\n\nTo collect medium data, specify a `stop_winrate`; the policy will start to collect data once it reaches that test win rate.\n\n```bash\npython src/main.py --data_collect --config=qmix --env-config=sc2_collect --offline_data_quality=medium --num_episodes_collected=2000 --map_name=5m_vs_6m --save_replay_buffer=False --stop_winrate=0.5\n```\n\n## License\n\nCode licensed under the Apache License v2.0.\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Flamda-rl%2Fodis","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Flamda-rl%2Fodis","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Flamda-rl%2Fodis/lists"}