https://github.com/ischlag/fast-weight-transformers

Official code repository of the paper Linear Transformers Are Secretly Fast Weight Programmers.

Host: GitHub
URL: https://github.com/ischlag/fast-weight-transformers
Owner: ischlag
License: mit
Created: 2021-02-11T11:50:33.000Z (about 4 years ago)
Default Branch: main
Last Pushed: 2021-06-10T13:22:22.000Z (over 3 years ago)
Language: Jupyter Notebook
Size: 934 KB
Stars: 95
Watchers: 5
Forks: 10
Open Issues: 0
Metadata Files:
- Readme: README.md
- License: LICENSE

# Linear Transformers Are Secretly Fast Weight Programmers

This repository contains the code accompanying the paper [*Linear Transformers Are Secretly Fast Weight Programmers*](https://arxiv.org/abs/2102.11174), published at ICML 2021.

It also contains the logs of all synthetic experiments.

## Synthetic Experiments

### Requirements

```bash
$ cat req.txt
jupyter==1.0.0
pandas==1.0.1
seaborn==0.10.0
torch==1.6.0
matplotlib==3.1.3
numpy==1.17.2
```

```bash
pip3 install -r req.txt
```
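Because the pins above are exact, it can help to confirm that the active environment actually matches them before rerunning anything. The following is a minimal sketch, not part of the repository: the helper names `parse_pins` and `check_pins` are hypothetical, and it assumes Python 3.8 or newer for `importlib.metadata`.

```python
from importlib import metadata


def parse_pins(text):
    """Parse 'name==version' lines into a dict, skipping blanks and comments."""
    pins = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        name, _, version = line.partition("==")
        pins[name] = version
    return pins


def check_pins(pins):
    """Return {name: (wanted, installed_or_None)} for every package that differs.

    The comparison is an exact string match, so a build with a local suffix
    in its version string is reported as a mismatch.
    """
    mismatches = {}
    for name, wanted in pins.items():
        try:
            installed = metadata.version(name)
        except metadata.PackageNotFoundError:
            installed = None
        if installed != wanted:
            mismatches[name] = (wanted, installed)
    return mismatches
```

One possible use is `check_pins(parse_pins(open("req.txt").read()))`, which returns an empty dict when every pinned package is installed at the requested version.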

### Rerun Experiments

Logs are provided in the `synthetic/logs` folder.

The files in that folder were produced by running the following commands:

Setting 1 (capacity):

```bash
python3 main.py --begin=20 --end=600 --step=20 --attn_name=softmax --update_rule=sum
python3 main.py --begin=20 --end=600 --step=20 --attn_name=linear --update_rule=sum
python3 main.py --begin=20 --end=600 --step=20 --attn_name=dpfp --attn_arg=1 --update_rule=sum
python3 main.py --begin=20 --end=600 --step=20 --attn_name=dpfp --attn_arg=2 --update_rule=sum
python3 main.py --begin=20 --end=600 --step=20 --attn_name=dpfp --attn_arg=3 --update_rule=sum
python3 main.py --begin=20 --end=600 --step=20 --attn_name=favor --attn_arg=64 --update_rule=sum
python3 main.py --begin=20 --end=600 --step=20 --attn_name=favor --attn_arg=128 --update_rule=sum
python3 main.py --begin=20 --end=600 --step=20 --attn_name=favor --attn_arg=512 --update_rule=sum
```
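The Setting 1 commands differ only in `--attn_name` and `--attn_arg`, so the sweep can also be generated rather than typed out. The sketch below is one way to do that. The helper `build_command` and the list `SETTING1_VARIANTS` are hypothetical names, and only the flags that appear in the commands above are used.

```python
# (attn_name, attn_arg) pairs copied from the Setting 1 commands above;
# None means the command passes no --attn_arg.
SETTING1_VARIANTS = [
    ("softmax", None),
    ("linear", None),
    ("dpfp", 1),
    ("dpfp", 2),
    ("dpfp", 3),
    ("favor", 64),
    ("favor", 128),
    ("favor", 512),
]


def build_command(attn_name, attn_arg=None, begin=20, end=600, step=20,
                  update_rule="sum"):
    """Build the argument list for one run, in the same flag order as above."""
    args = [
        "python3", "main.py",
        f"--begin={begin}", f"--end={end}", f"--step={step}",
        f"--attn_name={attn_name}",
    ]
    if attn_arg is not None:
        args.append(f"--attn_arg={attn_arg}")
    args.append(f"--update_rule={update_rule}")
    return args


commands = [build_command(name, arg) for name, arg in SETTING1_VARIANTS]

for command in commands:
    # Print instead of executing; pass `command` to subprocess.run(...) from
    # the directory that contains main.py to actually launch a run.
    print(" ".join(command))
```

Printing first makes it easy to compare the generated lines against the listed commands before launching anything.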

Setting 2 (update rule):

```bash
python3 main.py --begin=20 --end=200 --step=20 --attn_name=dpfp --attn_arg=1 --update_rule=sum --replace
python3 main.py --begin=20 --end=200 --step=20 --attn_name=dpfp --attn_arg=1 --update_rule=ours --replace
python3 main.py --begin=20 --end=200 --step=20 --attn_name=tanh --update_rule=fwm --replace
python3 main.py --begin=20 --end=200 --step=20 --attn_name=dpfp --attn_arg=1 --update_rule=fwm --replace
python3 main.py --begin=20 --end=200 --step=20 --attn_name=dpfp --attn_arg=2 --update_rule=ours --replace
python3 main.py --begin=20 --end=200 --step=20 --attn_name=linear --update_rule=ours --replace
python3 main.py --begin=20 --end=200 --step=20 --attn_name=favor --attn_arg=64 --update_rule=ours --replace
python3 main.py --begin=20 --end=200 --step=20 --attn_name=favor --attn_arg=128 --update_rule=ours --replace
```
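Setting 2 compares update rules when stored associations are replaced. This README does not define the flag values, so the following is only a toy illustration. It assumes that `sum` stands for a purely additive outer-product write and that `ours` stands for the delta-rule-style write described in the paper. The function names, the fixed write strength `beta`, and the vectors are made up for the example and are not taken from `main.py`. The code is dependency-free on purpose.

```python
def matvec(W, k):
    """Read from the fast weight matrix: returns W @ k."""
    return [sum(w * x for w, x in zip(row, k)) for row in W]


def sum_write(W, k, v):
    """Additive write: W + outer(v, k)."""
    return [[w + vi * kj for w, kj in zip(row, k)] for row, vi in zip(W, v)]


def delta_write(W, k, v, beta=1.0):
    """Delta-rule-style write: W + beta * outer(v - W @ k, k).

    The value currently retrieved by k is subtracted before writing, so with
    a unit-norm key and beta = 1 the old association is fully overwritten.
    """
    v_old = matvec(W, k)
    diff = [beta * (vi - oi) for vi, oi in zip(v, v_old)]
    return sum_write(W, k, diff)


k = [1.0, 0.0]    # unit-norm key, written twice
v1 = [1.0, 2.0]   # first value associated with k
v2 = [3.0, -1.0]  # replacement value for the same key
W0 = [[0.0, 0.0], [0.0, 0.0]]

W_sum = sum_write(sum_write(W0, k, v1), k, v2)
W_delta = delta_write(delta_write(W0, k, v1), k, v2)

print(matvec(W_sum, k))    # [4.0, 1.0]: both values are superimposed
print(matvec(W_delta, k))  # [3.0, -1.0]: the new value replaced the old one
```

In this toy case the additive write retrieves `v1 + v2` for the reused key, while the delta-rule-style write retrieves only `v2`.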

Generate figures from the logs using the following notebooks:

```
synthetic/setting1_generate_figure.ipynb
synthetic/setting2_generate_figure.ipynb
```

## Language Modeling & Machine Translation

The toolkit and scripts for language modeling experiments can be found at [IDSIA/lmtool-fwms](https://github.com/IDSIA/lmtool-fwms).

For machine translation experiments, we ported the different attention functions implemented in the language modeling toolkit to the multi-head attention implementation in [FAIRSEQ](https://github.com/pytorch/fairseq).

## Citation

```
@inproceedings{schlag2021linear,
      title={Linear Transformers Are Secretly Fast Weight Programmers},
      author={Imanol Schlag and Kazuki Irie and J\"urgen Schmidhuber},
      booktitle={Proc. Int. Conf. on Machine Learning (ICML)},
      address={Virtual only},
      month=jul,
      year={2021}
}
```