{"id":27427326,"url":"https://github.com/deepbiolab/multi-agent-ddpg","last_synced_at":"2026-04-29T07:35:25.294Z","repository":{"id":282601579,"uuid":"949103862","full_name":"deepbiolab/multi-agent-ddpg","owner":"deepbiolab","description":"Implementation of MADDPG for cooperative-competitive multi-agent environments","archived":false,"fork":false,"pushed_at":"2025-03-15T17:28:47.000Z","size":7183,"stargazers_count":1,"open_issues_count":0,"forks_count":1,"subscribers_count":1,"default_branch":"main","last_synced_at":"2025-06-30T22:40:47.077Z","etag":null,"topics":["ddpg","maddpg","multiple-agent","reinforcement-learning"],"latest_commit_sha":null,"homepage":"","language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"mit","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/deepbiolab.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null}},"created_at":"2025-03-15T17:26:16.000Z","updated_at":"2025-04-29T14:14:27.000Z","dependencies_parsed_at":"2025-03-15T18:39:27.291Z","dependency_job_id":null,"html_url":"https://github.com/deepbiolab/multi-agent-ddpg","commit_stats":null,"previous_names":["deepbiolab/multi-agent-ddpg"],"tags_count":0,"template":false,"template_full_name":null,"purl":"pkg:github/deepbiolab/multi-agent-ddpg","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/deepbiolab%2Fmulti-agent-ddpg","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/deepbiolab%2Fmulti-agent-ddpg/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/deepbiolab%2Fmulti-agent-ddpg/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/deepbiolab%2Fmulti-agent-ddpg/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/deepbiolab","download_url":"https://codeload.github.com/deepbiolab/multi-agent-ddpg/tar.gz/refs/heads/main","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/deepbiolab%2Fmulti-agent-ddpg/sbom","scorecard":null,"host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":286080680,"owners_count":32416146,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2026-04-29T06:29:02.080Z","status":"ssl_error","status_checked_at":"2026-04-29T06:29:00.631Z","response_time":110,"last_error":"SSL_connect returned=1 errno=0 peeraddr=140.82.121.6:443 state=error: unexpected eof while reading","robots_txt_status":"success","robots_txt_updated_at":"2025-07-24T06:49:26.215Z","robots_txt_url":"https://github.com/robots.txt","online":false,"can_crawl_api":true,"host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["ddpg","maddpg","multiple-agent","reinforcement-learning"],"created_at":"2025-04-14T12:49:55.165Z","updated_at":"2026-04-29T07:35:25.274Z","avatar_url":"https://github.com/deepbiolab.png","language":"Python","funding_links":[],"categories":[],"sub_categories":[],"readme":"# Multi-Agent Deep Deterministic Policy Gradient (MADDPG)\n\nThis repository contains an implementation of MADDPG for cooperative-competitive multi-agent environments. The implementation is based on the paper [\"Multi-Agent Actor-Critic for Mixed Cooperative-Competitive Environments\"](https://arxiv.org/abs/1706.02275).\n\n### Environment\n\nThe implementation uses a modified version of the Multi-Agent Particle Environment, specifically the \"simple_adversary\" scenario where:\n\n![CleanShot-2025-03-15-23-44-13@2x](assets/CleanShot-2025-03-15-23-44-13@2x.png)\n\n- 2 good agents (blue) try to reach a target location\n- 1 adversary agent (red) tries to prevent them from reaching the target\n\n### Architecture\n\n\u003cimg src=\"assets/CleanShot-2025-03-15-17-45-23@2x.png\" alt=\"alt text\" style=\"zoom:50%;\" /\u003e\n\n#### MADDPG Agent\n\n- Implements centralized training with decentralized execution\n- Each agent has its own actor and critic networks\n- Critics have access to all agents' observations and actions\n- Actors only access their own observations\n\n#### Network Architecture\n\n- Actor Network: 3-layer MLP with leaky ReLU activation\n- Critic Network: 3-layer MLP with leaky ReLU activation\n- Configurable hidden layer dimensions\n\n#### Training Process\n\n1. Collect experiences from parallel environments\n2. Store transitions in replay buffer\n3. Sample batch from buffer\n4. Update critics using TD-error\n5. Update actors using policy gradient\n6. Soft update target networks\n\n### Project Structure\n\n```\nsrc/\n├── main.py          # Training script with configuration and training loop\n├── render_trained_model.py  # Visualization script for trained models\n├── env_wrapper.py   # Environment wrapper for parallel execution (SubprocVecEnv)\n├── envs.py          # Environment setup and initialization functions\n├── maddpg.py        # MADDPG algorithm core implementation\n├── network.py       # Neural network architectures for Actor and Critic\n├── utils.py         # Utility functions for tensor operations and updates\n├── replay_buffer.py # Experience replay buffer implementation\n├── multiagent/      # Multi-agent environment implementation\n    └── scenarios/   # Different scenarios/tasks implementations\n```\n\n### Requirements\n\n```bash\npip install -r requirements.txt\n```\n\n### Usage\n\n#### Training\n\nTo train the agents:\n\n```bash\npython main.py\n```\n\nKey parameters in `main.py`:\n- `parallel_envs`: Number of parallel environments (default: 8)\n- `number_of_episodes`: Total number of training episodes (default: 30000)\n- `episode_length`: Maximum steps per episode (default: 80)\n- `save_interval`: Episodes between model saves (default: 1000)\n\n#### Visualization\n\nTo visualize trained agents:\n\n```bash\npython render_trained_model.py\n```\n\n### Logging and Monitoring\n\nTraining progress can be monitored using TensorBoard:\n\n```bash\ntensorboard --logdir=log\n```\n\nMetrics logged:\n- Actor loss\n- Critic loss\n- Mean episode rewards for each agent\n\n### Save and Load\n\nModels are automatically saved to `model_dir/` at specified intervals. Each save contains:\n- Actor network parameters\n- Critic network parameters\n- Optimizer states\n\n## License\n\nMIT License\n\n## Acknowledgments\n\nThis implementation is based on the MADDPG paper by Lowe et al. and uses components from OpenAI's Multi-Agent Particle Environment.\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fdeepbiolab%2Fmulti-agent-ddpg","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fdeepbiolab%2Fmulti-agent-ddpg","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fdeepbiolab%2Fmulti-agent-ddpg/lists"}