https://github.com/coac/commnet-bicnet

CommNet and BiCnet implementation in tensorflow

multi-agent-reinforcement-learning reinforcement-learning tensorflow

Host: GitHub
URL: https://github.com/coac/commnet-bicnet
Owner: Coac
Created: 2018-05-19T11:56:32.000Z (about 8 years ago)
Default Branch: master
Last Pushed: 2018-07-27T11:13:25.000Z (almost 8 years ago)
Last Synced: 2023-10-25T21:28:52.721Z (over 2 years ago)
Topics: multi-agent-reinforcement-learning, reinforcement-learning, tensorflow
Language: Python
Size: 66.4 KB
Stars: 53
Watchers: 4
Forks: 17
Open Issues: 3
Metadata Files:
- Readme: README.md

# CommNet-BiCnet
[CommNet](https://arxiv.org/abs/1605.07736) and [BiCnet](https://arxiv.org/abs/1703.10069) implementations in TensorFlow.

## Training
Train CommNet using the DDPG algorithm:
```
python train_comm_net.py
```

## Hypersearch
To search for good hyperparameters, such as `actor_lr` or `critic_lr`, a simple grid search is implemented. It launches multiple instances of the trainer in parallel, based on the number of CPU cores:
```
python hypersearch.py
```
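The general pattern can be sketched as follows. This is a minimal illustration, not the contents of `hypersearch.py`: the function names `build_grid`, `run_trial`, and `grid_search`, the candidate values, and the placeholder score are all assumptions made for the example. A real trial would launch a training run and report its final reward.

```python
import itertools
import multiprocessing as mp


def build_grid(search_space):
    """Return one dict per combination of the values in search_space."""
    names = sorted(search_space)
    combos = itertools.product(*(search_space[n] for n in names))
    return [dict(zip(names, values)) for values in combos]


def run_trial(params):
    """Hypothetical stand-in for one training run; returns (params, score)."""
    # Placeholder score so the sketch runs without TensorFlow.
    score = -abs(params["actor_lr"] - 1e-3) - abs(params["critic_lr"] - 1e-2)
    return params, score


def grid_search(search_space, processes=None):
    """Evaluate every combination in parallel, one worker per CPU core by default."""
    grid = build_grid(search_space)
    with mp.Pool(processes=processes or mp.cpu_count()) as pool:
        results = pool.map(run_trial, grid)
    # Keep the combination with the highest score.
    return max(results, key=lambda r: r[1])


if __name__ == "__main__":
    # Candidate values are illustrative only.
    space = {"actor_lr": [1e-4, 1e-3, 1e-2], "critic_lr": [1e-3, 1e-2]}
    best_params, best_score = grid_search(space)
    print(best_params, best_score)
```

The grid grows multiplicatively with each added hyperparameter, so the number of parallel workers mainly determines how long the full sweep takes.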

## Guessing sum environment
This is a simple game, described in the [BiCnet](https://arxiv.org/abs/1703.10069) paper, for testing whether communication between agents works. The environment implements the essential methods of the core OpenAI Gym interface.

Each agent receives a scalar sampled from a truncated Gaussian over `[-10, 10]`. Each agent must output the sum of the inputs received by all agents. An agent gets a normalized reward in `[0, 1]` based on the absolute difference between the true sum and its output.
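The rules above can be sketched as a small Gym-style class. This is an illustration under stated assumptions, not the repository's environment: the class name `GuessingSumSketch`, the standard deviation of 5, the single-step episode, and the exact reward normalization (error divided by the width of the range the sum can take) are all choices made for the example.

```python
import numpy as np


class GuessingSumSketch:
    """Illustrative guessing-sum game; not the repository's environment class."""

    def __init__(self, n_agents=2, low=-10.0, high=10.0, std=5.0, seed=None):
        self.n_agents = n_agents
        self.low, self.high, self.std = low, high, std  # std is an assumption
        self.rng = np.random.default_rng(seed)
        self.sum = None

    def _truncated_gaussian(self):
        # Rejection sampling: redraw until the value falls inside [low, high].
        while True:
            x = self.rng.normal(0.0, self.std)
            if self.low <= x <= self.high:
                return x

    def reset(self):
        # One private scalar observation per agent.
        obs = np.array([self._truncated_gaussian() for _ in range(self.n_agents)])
        self.sum = obs.sum()
        return obs.reshape(self.n_agents, 1)

    def step(self, actions):
        # Assumed normalization: the largest possible error is the width of
        # the range the sum can take, n_agents * (high - low).
        max_error = self.n_agents * (self.high - self.low)
        guesses = np.clip(np.asarray(actions, dtype=float).reshape(-1),
                          self.n_agents * self.low, self.n_agents * self.high)
        rewards = 1.0 - np.abs(self.sum - guesses) / max_error
        done = True  # one guess per episode in this sketch
        return None, rewards, done, {}
```

Calling `reset()` and then `step()` with one guess per agent returns one reward per agent. A perfect guess yields a reward of 1, and no agent can score well without information about the other agents' inputs, which is what makes the game a communication test.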

## Results
### Training CommNet in the Guessing sum env with 2 agents
![2_agents_commnet_training_reward](docs/2_agents_commnet.png)
