https://github.com/epfml/dynamic-sparse-flash-attention
- Host: GitHub
- URL: https://github.com/epfml/dynamic-sparse-flash-attention
- Owner: epfml
- License: other
- Created: 2023-05-24T06:16:16.000Z (almost 3 years ago)
- Default Branch: main
- Last Pushed: 2023-06-02T12:28:57.000Z (over 2 years ago)
- Last Synced: 2025-04-28T12:38:49.594Z (10 months ago)
- Language: Jupyter Notebook
- Size: 177 KB
- Stars: 143
- Watchers: 7
- Forks: 6
- Open Issues: 2
Metadata Files:
- Readme: README.md
- License: LICENSE
README
# Dynamic Sparse FlashAttention
Code to reproduce results for the paper "Faster Causal Attention Over Large Sequences Through Sparse Flash Attention"
# Setup
To install the required Python dependencies, first run:
```bash
pip install -r ./requirements.txt
```
Then install Triton:
```bash
git clone https://github.com/openai/triton.git
cd triton
git checkout b2a757d00028fe844a93904036a18e8670bfe92f
cd python
pip install cmake
pip install -e .
```
The commands above pin Triton to the commit used in our experiments. Feel free to experiment with later Triton versions.
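As a quick sanity check (not part of this repository), you can verify that the Triton install compiles and runs a trivial kernel. The snippet below is a minimal sketch and assumes a CUDA-capable GPU is available.
```python
# Minimal smoke test for the Triton install (illustrative; not part of this repo).
import torch
import triton
import triton.language as tl

@triton.jit
def add_kernel(x_ptr, y_ptr, out_ptr, n_elements, BLOCK_SIZE: tl.constexpr):
    # Each program instance handles one contiguous block of the vectors.
    pid = tl.program_id(axis=0)
    offsets = pid * BLOCK_SIZE + tl.arange(0, BLOCK_SIZE)
    mask = offsets < n_elements
    x = tl.load(x_ptr + offsets, mask=mask)
    y = tl.load(y_ptr + offsets, mask=mask)
    tl.store(out_ptr + offsets, x + y, mask=mask)

x = torch.randn(4096, device="cuda")
y = torch.randn(4096, device="cuda")
out = torch.empty_like(x)
grid = (triton.cdiv(x.numel(), 1024),)
add_kernel[grid](x, y, out, x.numel(), BLOCK_SIZE=1024)
assert torch.allclose(out, x + y)
print("Triton", triton.__version__, "OK")
```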
# Reproducing our LM experiments on OpenWebText2
**GPU requirements:** Preferably, you need at least one A100. Some of our experiments use data parallelism across up to 3 A100s. You should have no problem running these experiments on any GPU supporting `bfloat16`, though you may have to adjust the model parameters to fit the available memory.
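Before launching, you can quickly check that your environment meets these requirements. The snippet below is a small illustrative check, not part of the repository scripts.
```python
# Quick environment check (illustrative; not part of the repository scripts).
import torch

assert torch.cuda.is_available(), "A CUDA GPU is required for these experiments."
print("GPUs available:", torch.cuda.device_count())
print("bfloat16 supported:", torch.cuda.is_bf16_supported())
print("Device:", torch.cuda.get_device_name(0))
```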
Go to the `openwebtext2-experiments` folder and run `script/train-LMs.sh`.
# Reproducing our runtime results
**GPU requirements:** We used one A100.
For the Hash-sparse and QK-sparse results, go to the `runtime-experiments` folder and run the `timeperf-hash-and-qk-sparse.ipynb` notebook.
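For context, kernel runtime comparisons of this kind are commonly measured with `triton.testing.do_bench`. The sketch below times a dense causal-attention baseline only; it is an illustration rather than code from the notebook, and the shapes are placeholders.
```python
# Illustrative timing harness (not taken from the notebook): benchmarks a dense
# causal-attention baseline with triton.testing.do_bench.
import torch
import torch.nn.functional as F
import triton

B, H, T, D = 1, 16, 8192, 64  # placeholder shapes
q = torch.randn(B, H, T, D, device="cuda", dtype=torch.bfloat16)
k = torch.randn_like(q)
v = torch.randn_like(q)

dense_ms = triton.testing.do_bench(
    lambda: F.scaled_dot_product_attention(q, k, v, is_causal=True)
)
print(f"dense causal attention: {dense_ms:.3f} ms")
```
Sparse variants would be timed the same way and compared against this dense baseline across sequence lengths.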
# Reproducing our Reformer results
Coming soon