Ecosyste.ms: Awesome
An open API service indexing awesome lists of open source software.
https://github.com/btlmd/attentionaccelerations
Course Project for Numerical Analysis, THU-CST, 2023 Spring
https://github.com/btlmd/attentionaccelerations
Last synced: 12 days ago
JSON representation
Course Project for Numerical Analysis, THU-CST, 2023 Spring
- Host: GitHub
- URL: https://github.com/btlmd/attentionaccelerations
- Owner: Btlmd
- Created: 2023-06-06T09:26:33.000Z (over 1 year ago)
- Default Branch: master
- Last Pushed: 2023-06-15T10:51:58.000Z (over 1 year ago)
- Last Synced: 2024-11-21T05:12:38.797Z (2 months ago)
- Language: Python
- Homepage:
- Size: 1.77 MB
- Stars: 1
- Watchers: 1
- Forks: 0
- Open Issues: 0
-
Metadata Files:
- Readme: README.md
Awesome Lists containing this project
README
# Attention Accelerations
This repo provides code for the course project #7 of Numerical Analysis, THU-CST, 2023 Spring.
On Long Range Arena, we tried to reproduce results in [Skyformer](https://arxiv.org/abs/2111.00035), [CosFormer](https://arxiv.org/abs/2202.08791), [LARA](https://arxiv.org/abs/2204.04667) and [MEGA](https://arxiv.org/abs/2209.10655). We also measured the training speed and inference speed of these models.
## Data Preparation
- Download Preprocessed data from [TsinghuaCloud](https://cloud.tsinghua.edu.cn/d/76489e9a0b154692a502/)
- Unzip `lra_data_mega.zip` and `lra_data_skyformer.zip` and make the directory structure as follows:
```
data/skyformer
├── lra-image.dev.pickle
├── lra-image.test.pickle
├── lra-image.train.pickle
├── ...
├── lra-text.dev.pickle
├── lra-text.test.pickle
└── lra-text.train.pickle
data/mega
├── aan
│ ├── dict-bin
│ ├── label-bin
│ ├── src-bin
│ └── src1-bin
├── cifar10
│ ├── input
│ └── label
├── imdb-4000
│ ├── label-bin
│ └── src-bin
├── listops
│ ├── label-bin
│ └── src-bin
├── path-x
│ ├── input
│ └── label
└── pathfinder
├── input
└── label
```## Installation
Prepare the environment by
```bash
conda create -n acce python=3.8
conda activate acce# install `torch==1.8.0` follow your CUDA version, e.g.
conda install pytorch==1.8.0 torchvision==0.9.0 torchaudio==0.8.0 cudatoolkit=11.1 -c pytorch -c conda-forge# install skyformer dependencies
pip install -r skyformer/requirements.txt# install mega and its dependencies
pip install -e mega
```## Run
### Training & Inference
- CosFormer, LARA and Skyformer
```bash
cd skyformer
python main.py --mode train --attn --task
```- ``:
- `softmax`: baseline attention
- `skyformer`
- `cosformer`
- `lara`- ``:
- `lra-listops`
- `lra-pathfinder`
- `lra-retrieval`
- `lra-text`
- `lra-image`- MEGA
```bash
cd mega
bash training_scripts/run_.sh
```- ``:
- `listops`
- `pathfinder`
- `retrieval`
- `text`
- `image`- The scripts select the best checkpoint on vavlidation set and evaluate on test set at the end of training.
### Speed Test
- CosFormer, LARA and Skyformer
```bash
cd skyformer
bash speed_tests.sh
```It runs speed tests for all `softmax`, `skyformer`, `cosformer` and `lara` on all 5 tasks.
- MEGA
```bash
cd mega
bash timing_scripts/speed_tests.sh
```It runs speed tests for all `MEGA-∞` and `MEGA-128` on all 5 tasks.
## Acknowledgement and Refernce
This repo is derived from [Skyformer](https://github.com/pkuzengqi/Skyformer) and [MEGA](https://github.com/facebookresearch/mega), with implementation refernce from [CosFormer](https://github.com/OpenNLPLab/cosFormer) and [LARA](https://github.com/HKUNLP/efficient-attention). We thank the authors for their great work.