https://github.com/NVIDIA/transformer-ls

Official PyTorch Implementation of Long-Short Transformer (NeurIPS 2021).

Host: GitHub
URL: https://github.com/NVIDIA/transformer-ls
Owner: NVIDIA
License: mit
Created: 2021-07-22T18:35:04.000Z (over 4 years ago)
Default Branch: master
Last Pushed: 2022-04-18T18:18:29.000Z (over 3 years ago)
Last Synced: 2025-04-24T00:28:17.518Z (7 months ago)
Topics: efficient-transformers, long-sequence, transformer, vision-transformer
Language: Python
Homepage:
Size: 124 KB
Stars: 225
Watchers: 14
Forks: 33
Open Issues: 0
Metadata Files:
- Readme: README.md
- License: LICENSE

Awesome Lists containing this project

StarryDivineSky - NVIDIA/transformer-ls

README

# Long-Short Transformer (Transformer-LS)

This repository hosts the code and models for the paper:

[Long-Short Transformer: Efficient Transformers for Language and Vision](https://arxiv.org/abs/2107.02192)

# Updates

- December 6, 2021: Release the code for [autoregressive language modeling](./autoregressive)

- July 23, 2021: Release the code and models for [ImageNet classification](./imagenet) and [Long-Range Arena](./lra)

# Architecture

![plot](https://user-images.githubusercontent.com/18202259/125551111-28369067-22f1-4615-adaf-611934a9752d.png)

Long-Short Transformer replaces the full self-attention of the original Transformer with an efficient attention mechanism that captures both long-range and short-term correlations. Each query attends to the tokens in a segment-wise sliding window to capture short-term correlations, and to dynamically projected features to capture long-range correlations. To align the norms of the original and projected feature vectors and improve the efficacy of the aggregation, we normalize the original and projected feature vectors with two sets of Layer Normalizations.
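
The following is a minimal single-head NumPy sketch of the idea described above. It is not the repository's PyTorch implementation. The function name `long_short_attention`, the weight names `wq`, `wk`, `wv`, `wp`, the half-window overlap between neighbouring segments, and the parameter-free layer normalization are all assumptions made for illustration; the released code adds multi-head attention, learnable normalization parameters, and batching.

```python
import numpy as np

def softmax(x, axis):
    # Numerically stable softmax along the given axis.
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def layer_norm(x, eps=1e-5):
    # Parameter-free layer normalization over the feature axis (assumption:
    # the learnable gain and bias are omitted for brevity).
    mu = x.mean(axis=-1, keepdims=True)
    var = x.var(axis=-1, keepdims=True)
    return (x - mu) / np.sqrt(var + eps)

def long_short_attention(x, wq, wk, wv, wp, window):
    """x: (n, d); wq/wk/wv: (d, d); wp: (d, r); window: segment length."""
    n, d = x.shape
    q, k, v = x @ wq, x @ wk, x @ wv

    # Long-range branch: input-dependent projection of n tokens down to r.
    p = softmax(k @ wp, axis=0)       # (n, r), normalized over the sequence
    k_glob = layer_norm(p.T @ k)      # (r, d), first normalization
    v_glob = layer_norm(p.T @ v)

    # Short-term branch: original features, second normalization.
    k_loc, v_loc = layer_norm(k), layer_norm(v)

    out = np.empty_like(q)
    half = window // 2
    for start in range(0, n, window):
        end = min(start + window, n)
        # Each segment sees itself plus half a window on either side.
        lo, hi = max(0, start - half), min(n, end + half)
        keys = np.concatenate([k_loc[lo:hi], k_glob], axis=0)
        vals = np.concatenate([v_loc[lo:hi], v_glob], axis=0)
        # One softmax over local and projected keys aggregates both branches.
        attn = softmax(q[start:end] @ keys.T / np.sqrt(d), axis=-1)
        out[start:end] = attn @ vals
    return out

rng = np.random.default_rng(0)
n, d, r, w = 8, 4, 2, 4
x = rng.standard_normal((n, d))
wq, wk, wv = (rng.standard_normal((d, d)) for _ in range(3))
wp = rng.standard_normal((d, r))
out = long_short_attention(x, wq, wk, wv, wp, window=w)
print(out.shape)  # (8, 4)
```

In this sketch each query scores at most `2 * window + r` keys, so the cost grows linearly with the sequence length `n` instead of quadratically.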

# Tasks

- [>>> Transformer-LS for ImageNet classification](./imagenet)

- [>>> Transformer-LS for Long Range Arena](./lra)

- [>>> Transformer-LS for autoregressive language modeling](./autoregressive)
