# "Emergent Communication through Negotiation"

Reproduce "Emergent Communication through Negotiation", ICLR 2018 anonymous submission: https://openreview.net/forum?id=Hk6WhagRW&noteId=Hk6WhagRW

## To install

- install PyTorch 0.2, https://pytorch.org
- clone this repo: `git clone https://github.com/asappinc/emergent_comms_negotiation`

## To run

```
python ecn.py [--disable-comms] [--disable-proposal] [--disable-prosocial] [--enable-cuda] [--term-entropy-reg 0.5] [--utterance-entropy-reg 0.0001] [--proposal-entropy-reg 0.01] [--model-file model_saves/mymodel.dat] [--name gpu3box]
```
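
For example, a concrete invocation using the prosocial regularization values reported in the results section below (the model filename and run name here are arbitrary):

```
python ecn.py --enable-cuda --term-entropy-reg 0.5 --utterance-entropy-reg 0.0001 --proposal-entropy-reg 0.01 --model-file model_saves/prosocial.dat --name prosocial
```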

Where options are:
- `--enable-cuda`: use an NVIDIA GPU, instead of the CPU
- `--disable-comms`: disable the comms channel
- `--disable-proposal`: disable the proposal channel (i.e. the agent can still create proposals, but the other agent can't see them)
- `--disable-prosocial`: disable prosocial reward
- `--term-entropy-reg VALUE`: termination policy entropy regularization
- `--utterance-entropy-reg VALUE`: utterance policy entropy regularization
- `--proposal-entropy-reg VALUE`: proposal policy entropy regularization (see the sketch after this list for how such a bonus typically enters the loss)
- `--model-file model_saves/FILENAME`: where to save the model, and where to look for an existing model on startup
- `--name NAME`: included in the logfile name, purely to make logfiles easier to find and distinguish; no other purpose
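
The entropy regularization flags reward the policies for staying stochastic. As a rough illustration, here is a minimal sketch of how such a bonus is typically folded into a REINFORCE-style loss; the function and variable names are hypothetical, nothing here is taken from `ecn.py`:

```python
import torch
import torch.nn.functional as F

# Illustrative sketch only: names are hypothetical, not from ecn.py.
# Subtracting entropy_reg * H(policy) rewards keeping the policy
# stochastic, which encourages exploration early in training.
def policy_loss(logits, action, reward, entropy_reg):
    log_probs = F.log_softmax(logits, dim=-1)
    probs = log_probs.exp()
    entropy = -(probs * log_probs).sum()
    # REINFORCE term for the sampled action, minus the entropy bonus.
    return -log_probs[action] * reward - entropy_reg * entropy

# e.g. loss for a 3-way termination policy that sampled action 1:
loss = policy_loss(torch.tensor([0.2, 1.5, -0.3]), 1, reward=0.91,
                   entropy_reg=0.5)
```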

## Stdout layout

e.g. if we have:
```
000000 4:4/0 7:5/5 9:4/4
000000 4:5/0 6:1/5 7:2/4
000000 4:0/0 7:0/5 9:1/4
ACC
r: 0.91
```

Then:
- each of the first four lines here is the action of a single agent
- the `ACC` line is the agent accepting the previous proposal
- each proposal line is laid out as follows (see the parsing sketch after this list):
```
[utterance] [utility 0]:[proposal 0]/[pool 0] ... etc ...
```
- if the agents run out of time, the last line will be `[out of time]`
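
To make the layout concrete, here is a small parsing sketch; the helper name and the tuple layout it returns are my own invention, only the line format itself comes from the description above:

```python
# Hypothetical helper: the function name and return structure are
# illustrative; the layout is utterance, then utility:proposal/pool
# for each of the three item types.
def parse_proposal_line(line):
    utterance, *items = line.split()
    triples = []
    for item in items:  # each item is utility:proposal/pool
        utility, rest = item.split(':')
        proposal, pool = rest.split('/')
        triples.append((int(utility), int(proposal), int(pool)))
    return utterance, triples

print(parse_proposal_line('000000 4:4/0 7:5/5 9:4/4'))
# ('000000', [(4, 4, 0), (7, 5, 5), (9, 4, 4)])
```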

One negotiation is printed out every 3 seconds or so, using the training set; the other negotiations are executed silently. There is no test set for now.

## Results so far, summary

| Agent sociability | Proposal only | Linguistic only | Both | Neither |
|------------------------------|----------|------------|--------|---------|
| Self-interested, random term | | | >=0.80 | |
| Prosocial, random term | ~0.91 | ~0.83 | ~0.96 | >= 0.90 |

Notes:
- all prosocial runs use termreg=0.5, uttreg=0.0001, propreg=0.01
- the self-interested run uses termreg=0.05, uttreg=0.0001, propreg=0.005

### Scenario details

|Prop? | Comm? | Soc? | Rand term? | Term reg | Utt reg | Prop reg | Subjective variance | Reward | Greedy ratios |
|-----|-------|-------|-------------|--------|--------|------------|---------------------|---------|-----------|
| Y | Y | Y | Y | 0.5 | 0.0001 | 0.01 | Low | ~0.96 | term=0.7345 utt=0.7635 prop=0.8304 |
| Y | - | Y | Y | 0.5 | 0.0001 | 0.01 | Medium-High | ~0.91 | term=0.6965 utt=0.0000 prop=0.8741 |
| - | Y | Y | Y | 0.5 | 0.0001 | 0.01 | High | ~0.83 | term=0.6889 utt=0.7849 prop=0.8222 |
| - | - | Y | Y | 0.5 | 0.0001 | 0.01 | Very low | >= 0.90 (climbing) | term=0.7781 utt=0.0000 prop=0.6006 |
| Y | Y | - | Y | 0.5 | 0.0001 | 0.01 | Very High | ~0.25 | term=0.7467 utt=0.9284 prop=0.8137 |
| Y | Y | - | Y | 0.05 | 0.0001 | 0.005 | Very Low | >= 0.80 (climbing) | term=0.9820 utt=0.7040 prop=0.6523 |

### Training curves

__Proposal, comms, prosocial__

Three training runs, identical settings:

__Proposal, no comms, prosocial__

__No proposal, comms, prosocial__

__No proposal, no comms, prosocial__

__Proposal, comms, not prosocial__

Run 1, same entropy regularization as prosocial graphs:

Run 2, with reduced entropy regularization:

## Unit tests

- install pytest, e.g. `conda install -y pytest`, and then run:
```
py.test -svx
```
- there are also some additional tests in:
```
python net_tests.py
```
(these allow close examination of specific parts of the network, policies, and so on; but they aren't really unit tests as such, since they have neither termination criteria nor success criteria)

## Plotting graphs

__Assumptions__:
- running the training on remote Ubuntu 16.04 instances
- `ssh` access, as user `ubuntu`, to these instances
- remote has home directory `/home/ubuntu`
- logs are stored in subdirectory `logs` of current local directory
- the location of `logs` relative to `~` is identical on local computer and remote computer

__Setup/configuration__:
- copy `instances.yaml.templ` to `~/instances.yaml`, on your own machine
- configure `~/instances.yaml` with the following (a hypothetical example follows this list):
  - the name and IP of each instance (names are arbitrary)
  - the path to the private ssh key that can access these instances
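
As a sketch, `~/instances.yaml` might look something like the following; the authoritative schema is whatever `instances.yaml.templ` contains, and the key names and values below are illustrative only:

```yaml
# Hypothetical example; consult instances.yaml.templ for the real
# schema. Key names and values here are illustrative.
key_file: ~/.ssh/id_rsa
instances:
  - name: gpu1
    ip: 203.0.113.10
  - name: gpu2
    ip: 203.0.113.11
```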

__Procedure__:
- run:
```
python merge.py --hostname [name in instances.yaml] [--logfile logs/log_20171104_1234.log] \
[--title 'my graph title'] [--y-min 75 --y-max 85]
```

This will:
- `rsync` the logs from the remote instance identified by `--hostname`
- load the results from the logfile given by `--logfile`, if specified; otherwise use the most recent logfile, ordered by name
- plot the graph to `/tmp/out-reward.png`