https://github.com/renll/SeqBoat

[NeurIPS 2023] Sparse Modular Activation for Efficient Sequence Modeling

Host: GitHub
URL: https://github.com/renll/SeqBoat
Owner: renll
License: mit
Created: 2023-06-15T06:27:22.000Z (about 2 years ago)
Default Branch: main
Last Pushed: 2023-12-02T21:43:17.000Z (over 1 year ago)
Last Synced: 2024-10-28T08:41:56.999Z (8 months ago)
Language: Assembly
Homepage: https://arxiv.org/abs/2306.11197
Size: 11.8 MB
Stars: 35
Watchers: 3
Forks: 1
Open Issues: 1
Metadata Files:
- Readme: README.md
- Contributing: CONTRIBUTING.md
- License: LICENSE
- Code of conduct: CODE_OF_CONDUCT.md

        
# Sparse Modular Activation for Efficient Sequence Modeling

Liliang Ren¹\*, Yang Liu², Shuohang Wang², Yichong Xu², Chenguang Zhu², ChengXiang Zhai¹

¹University of Illinois at Urbana-Champaign, ²Microsoft Azure Cognitive Services Research

\*Work done at Microsoft internship and UIUC.



[![arXiv](https://img.shields.io/badge/arXiv-2306.11197-brightgreen.svg?style=flat-square)](https://arxiv.org/abs/2306.11197) [![slides](https://img.shields.io/badge/slides-blue.svg?style=flat-square)](https://docs.google.com/presentation/d/1a-P_aWkjEDHPWW65ODy5nVrDfiqRjTYi_OgwhAwJO3o/edit?usp=sharing) [![poster](https://img.shields.io/badge/poster-orange.svg?style=flat-square)](https://drive.google.com/file/d/1GIrmReqLkc7AyegKXV-C4ns0jy2b-6xd/view?usp=sharing)

## Introduction

This is the PyTorch implementation of SeqBoat :speedboat: proposed in our paper. This repository is based on [MEGA](https://github.com/facebookresearch/mega) and the [fairseq package v0.9.0](https://github.com/pytorch/fairseq/tree/v0.9.0).



 



## Updates

- [Nov. 26] Added a standalone CIFAR-10 training [script](standalone_cifar.py) of SeqBoat for quickstart!

- [Nov. 5] Released training scripts for enwik8 and added a standalone implementation of SeqBoat [here](standalone_seqboat.py)!

- [Sep. 21] Our paper was accepted to NeurIPS 2023!

- [July 18] Released training scripts for LRA and Speech Commands.

## Code Overview

1. The *compress* and *extract* operators for Sparse Modular Activation (SMA) are implemented in [fairseq/modules/seqboat_utils.py](fairseq/modules/seqboat_utils.py) with the functions `compress_seq` and `extract` respectively.

2. The SeqBoat layer is implemented in [fairseq/modules/seqboat_unit.py](fairseq/modules/seqboat_unit.py).
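
To give a rough feel for what the *compress* and *extract* operators do, here is a minimal, framework-free sketch. It assumes that compress gathers the positions flagged as active and that extract scatters the module outputs back to their original positions, filling inactive slots with zeros. The names `compress_tokens`, `scatter_back`, and `sparse_apply` are hypothetical and are not the signatures of `compress_seq` or `extract` in this repository, which operate on batched tensors.

```python
from typing import Callable, List, Sequence, Tuple


def compress_tokens(
    tokens: Sequence[float], active: Sequence[bool]
) -> Tuple[List[float], List[int]]:
    """Gather the tokens flagged as active and remember where they came from."""
    if len(tokens) != len(active):
        raise ValueError("tokens and active must have the same length")
    positions = [i for i, keep in enumerate(active) if keep]
    return [tokens[i] for i in positions], positions


def scatter_back(
    outputs: Sequence[float], positions: Sequence[int], length: int, fill: float = 0.0
) -> List[float]:
    """Place each output at its original position; every other slot gets `fill`."""
    restored = [fill] * length
    for pos, out in zip(positions, outputs):
        restored[pos] = out
    return restored


def sparse_apply(
    tokens: Sequence[float],
    active: Sequence[bool],
    module: Callable[[List[float]], List[float]],
) -> List[float]:
    """Run `module` only on the active tokens (illustrative, not the repo's API)."""
    compressed, positions = compress_tokens(tokens, active)
    outputs = module(compressed)  # the module sees a shorter sequence
    return scatter_back(outputs, positions, len(tokens))


if __name__ == "__main__":
    # Hypothetical stand-in for a sub-module: scale every input by 10.
    result = sparse_apply(
        [1.0, 2.0, 3.0, 4.0],
        [True, False, True, False],
        lambda xs: [10 * x for x in xs],
    )
    print(result)  # [10.0, 0.0, 30.0, 0.0]
```

In this sketch the module's cost depends on the number of active tokens, not on the full sequence length.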

## Setup

This repository requires Python 3.8+ and PyTorch 1.11+.

```bash
# Install from this repo
pip install -e .
```

For faster training, install NVIDIA's apex library following [fairseq](https://github.com/facebookresearch/fairseq#requirements-and-installation).
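
If you want to confirm that an environment meets the version requirements above before installing, a small check along these lines may help. It is only a sketch: `parse_major_minor`, `meets_minimum`, and `check_environment` are hypothetical helpers, not part of this repository, and the check assumes PyTorch is installed under the distribution name `torch`.

```python
import re
import sys
from typing import List, Tuple


def parse_major_minor(version: str) -> Tuple[int, int]:
    """Extract (major, minor) from a version string such as '1.11.0+cu113'."""
    match = re.match(r"(\d+)\.(\d+)", version)
    if match is None:
        raise ValueError(f"unrecognized version string: {version!r}")
    return int(match.group(1)), int(match.group(2))


def meets_minimum(version: str, minimum: Tuple[int, int]) -> bool:
    """Return True if `version` is at least `minimum` (major, minor)."""
    return parse_major_minor(version) >= minimum


def check_environment() -> List[str]:
    """Return a list of human-readable problems; empty means the check passed."""
    if sys.version_info[:2] < (3, 8):
        return ["Python 3.8+ is required"]
    # importlib.metadata is in the standard library from Python 3.8 onward.
    from importlib import metadata

    try:
        torch_version = metadata.version("torch")
    except metadata.PackageNotFoundError:
        return ["PyTorch is not installed"]
    if not meets_minimum(torch_version, (1, 11)):
        return [f"PyTorch 1.11+ is required, found {torch_version}"]
    return []


if __name__ == "__main__":
    problems = check_environment()
    for problem in problems:
        print(problem)
    if not problems:
        print("Environment looks compatible.")
```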

## Quickstart

The easiest way to get started is to run the [standalone_cifar.py](standalone_cifar.py) script. This script trains a simple SeqBoat model on CIFAR-10:

```bash
python standalone_cifar.py --prenorm
```

## Experiments

- [Long Range Arena](examples/seqboat/README.lra.md)

- [Speech Classification](examples/seqboat/README.sc.md)

- [Language Modeling](examples/seqboat/README.lm.md)

We also provide the training and testing scripts for each of the tasks in the `experiment` directory.

## Citation

If you find our work useful, please consider citing:

```bibtex
@inproceedings{ren2023sparse,
  title={Sparse Modular Activation for Efficient Sequence Modeling},
  author={Liliang Ren and Yang Liu and Shuohang Wang and Yichong Xu and Chenguang Zhu and ChengXiang Zhai},
  booktitle={Thirty-seventh Conference on Neural Information Processing Systems},
  year={2023},
  url={https://openreview.net/forum?id=TfbzX6I14i}
}
```

## License

SeqBoat is released under the MIT license. The license also applies to the model checkpoints.

## Contact

Liliang Ren ([email protected])
