# Neighborhood Attention Transformers
[NATTEN](https://github.com/SHI-Labs/NATTEN)

PapersWithCode leaderboards:
* [Instance segmentation on ADE20K (val)](https://paperswithcode.com/sota/instance-segmentation-on-ade20k-val?p=dilated-neighborhood-attention-transformer)
* [Panoptic segmentation on ADE20K (val)](https://paperswithcode.com/sota/panoptic-segmentation-on-ade20k-val?p=dilated-neighborhood-attention-transformer)
* [Semantic segmentation on ADE20K (val)](https://paperswithcode.com/sota/semantic-segmentation-on-ade20k-val?p=dilated-neighborhood-attention-transformer)
* [Instance segmentation on Cityscapes (val)](https://paperswithcode.com/sota/instance-segmentation-on-cityscapes-val?p=dilated-neighborhood-attention-transformer)
* [Semantic segmentation on Cityscapes (val)](https://paperswithcode.com/sota/semantic-segmentation-on-cityscapes-val?p=dilated-neighborhood-attention-transformer)
* [Panoptic segmentation on Cityscapes (val)](https://paperswithcode.com/sota/panoptic-segmentation-on-cityscapes-val?p=dilated-neighborhood-attention-transformer)
* [Instance segmentation on COCO (minival)](https://paperswithcode.com/sota/instance-segmentation-on-coco-minival?p=dilated-neighborhood-attention-transformer)
* [Panoptic segmentation on COCO (minival)](https://paperswithcode.com/sota/panoptic-segmentation-on-coco-minival?p=dilated-neighborhood-attention-transformer)
* [Image generation on FFHQ 256x256](https://paperswithcode.com/sota/image-generation-on-ffhq-256-x-256?p=stylenat-giving-each-head-a-new-perspective)
* [Image generation on FFHQ 1024x1024](https://paperswithcode.com/sota/image-generation-on-ffhq-1024-x-1024?p=stylenat-giving-each-head-a-new-perspective)
* [Image generation on LSUN Churches 256x256](https://paperswithcode.com/sota/image-generation-on-lsun-churches-256-x-256?p=stylenat-giving-each-head-a-new-perspective)
**Powerful hierarchical vision transformers based on sliding window attention.**
Neighborhood Attention (NA, local attention) was introduced in our original paper, [NAT](NAT.md), and runs efficiently with our extension to PyTorch, [NATTEN](https://github.com/SHI-Labs/NATTEN).

We recently introduced a new model, [DiNAT](DiNAT.md), which extends NA by dilating neighborhoods (DiNA, sparse global attention, a.k.a. dilated local attention).

Combinations of NA/DiNA are capable of preserving locality, maintaining translational equivariance, expanding the receptive field exponentially, and capturing longer-range inter-dependencies, leading to significant performance boosts in downstream vision tasks, such as [StyleNAT](https://github.com/SHI-Labs/StyleNAT) for image generation.
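As a rough back-of-the-envelope illustration of the receptive-field claim (an illustration only, not a figure from the papers): with kernel size k = 7, stacking NA layers at dilation 1 grows the receptive field linearly, by k - 1 = 6 tokens per layer, whereas doubling the dilation every layer (1, 2, 4, ...) adds 6 * 2^l tokens at layer l, i.e., exponential growth, while each query still attends to only k^2 = 49 keys.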
# News

### March 25, 2023

* Neighborhood Attention Transformer was accepted to CVPR 2023!

### November 18, 2022

* NAT and DiNAT are now available through HuggingFace's [transformers](https://github.com/huggingface/transformers).
* NAT and DiNAT classification models are also available on the HuggingFace Model Hub: [NAT](https://huggingface.co/models?filter=nat) | [DiNAT](https://huggingface.co/models?filter=dinat)

### November 11, 2022

* New preprint: [StyleNAT: Giving Each Head a New Perspective](https://github.com/SHI-Labs/StyleNAT).
* Style-based GAN powered with Neighborhood Attention sets a new SOTA on FFHQ-256 with a 2.05 FID.
### October 8, 2022

* [NATTEN](https://github.com/SHI-Labs/NATTEN) is now [available as a pip package](https://www.shi-labs.com/natten/)!
* You can now install NATTEN with pre-compiled wheels and start using it in seconds; see the sketch after this list.
* NATTEN will be maintained and developed as a [separate project](https://github.com/SHI-Labs/NATTEN) to support broader usage of sliding window attention, even beyond computer vision.
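A minimal usage sketch; module and argument names here follow NATTEN's own README at the time of writing and may differ across versions, so check the [NATTEN](https://github.com/SHI-Labs/NATTEN) repository for the current API and for the wheel install command matching your torch/CUDA setup:

```python
import torch
from natten import NeighborhoodAttention2D  # pip install natten

# 2D neighborhood attention: each token attends to a 7x7 neighborhood of
# tokens; dilation=2 spreads that 7x7 neighborhood over a 13x13 span.
na2d = NeighborhoodAttention2D(dim=128, num_heads=4, kernel_size=7, dilation=2)

x = torch.randn(1, 32, 32, 128)  # NATTEN modules expect channels-last (B, H, W, C)
out = na2d(x)                    # output shape matches the input: (1, 32, 32, 128)
```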
### September 29, 2022

* New preprint: [Dilated Neighborhood Attention Transformer](DiNAT.md).

# Dilated Neighborhood Attention :fire:

A new hierarchical vision transformer based on Neighborhood Attention (local attention) and Dilated Neighborhood Attention (sparse global attention) that enjoys a significant performance boost in downstream tasks.
Check out the [DiNAT README](DiNAT.md).
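To make the dilation idea concrete, here is a small pure-PyTorch sketch (for illustration only, not code from this repository) of how the attended key indices can be computed along one axis. Edge handling is simplified to clamping; actual DiNA keeps edge queries aligned with their own dilation group:

```python
import torch

def dilated_neighborhood_indices(length, kernel_size, dilation=1):
    """Key indices each query attends to along one axis.

    dilation=1 recovers plain Neighborhood Attention (NA); dilation>1 gives
    a sparser but wider neighborhood (DiNA) with the same number of keys.
    """
    radius = kernel_size // 2
    span = radius * dilation
    # Queries near the edges cannot center their window, so it is shifted
    # inward: every query still attends to exactly kernel_size keys.
    centers = torch.arange(length).clamp(span, length - 1 - span)
    offsets = torch.arange(-radius, radius + 1) * dilation
    return centers[:, None] + offsets[None, :]  # (length, kernel_size)

# Same number of attended keys, wider span when dilated:
print(dilated_neighborhood_indices(8, 3, dilation=1)[4])  # tensor([3, 4, 5])
print(dilated_neighborhood_indices(8, 3, dilation=2)[4])  # tensor([2, 4, 6])
```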
# Neighborhood Attention Transformer

Our original paper, [Neighborhood Attention Transformer (NAT)](NAT.md), introduced the first efficient sliding-window local attention.
# How Neighborhood Attention works
Neighborhood Attention localizes the query token's (red) receptive field to its nearest neighboring tokens in the key-value pair (green).
This is equivalent to dot-product self attention when the neighborhood size is identical to the image dimensions.
Note that tokens at the edges are special cases: they cannot center their neighborhood, so their window is shifted inward to keep the number of attended tokens fixed (see the sketch below).
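A minimal single-head 1D sketch of the mechanism in pure PyTorch (for illustration only; it omits multi-head projections and relative positional biases, and is not the optimized NATTEN kernel):

```python
import torch

def neighborhood_attention_1d(q, k, v, kernel_size):
    """Minimal single-head 1D neighborhood attention.

    q, k, v: (length, dim). Each query attends to its kernel_size nearest keys;
    kernel_size should be odd.
    """
    length, dim = q.shape
    radius = kernel_size // 2
    # Edge queries cannot center their window, so it is shifted inward;
    # every query always attends to exactly kernel_size keys.
    centers = torch.arange(length).clamp(radius, length - 1 - radius)
    idx = centers[:, None] + torch.arange(-radius, radius + 1)  # (length, kernel_size)
    attn = torch.einsum("ld,lkd->lk", q, k[idx]) / dim**0.5
    attn = attn.softmax(dim=-1)
    return torch.einsum("lk,lkd->ld", attn, v[idx])

# When kernel_size == length, every query sees every key, and the result
# matches ordinary dot-product self attention exactly.
q, k, v = (torch.randn(9, 16) for _ in range(3))
out_na = neighborhood_attention_1d(q, k, v, kernel_size=9)
out_sa = (q @ k.T / 16**0.5).softmax(-1) @ v
assert torch.allclose(out_na, out_sa, atol=1e-6)
```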
# Citation
```bibtex
@inproceedings{hassani2023neighborhood,
  title     = {Neighborhood Attention Transformer},
  author    = {Ali Hassani and Steven Walton and Jiachen Li and Shen Li and Humphrey Shi},
  booktitle = {Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)},
  month     = {June},
  year      = {2023},
  pages     = {6185-6194}
}
@article{hassani2022dilated,
  title         = {Dilated Neighborhood Attention Transformer},
  author        = {Ali Hassani and Humphrey Shi},
  year          = {2022},
  url           = {https://arxiv.org/abs/2209.15001},
  eprint        = {2209.15001},
  archiveprefix = {arXiv},
  primaryclass  = {cs.CV}
}
@article{walton2022stylenat,
  title         = {StyleNAT: Giving Each Head a New Perspective},
  author        = {Steven Walton and Ali Hassani and Xingqian Xu and Zhangyang Wang and Humphrey Shi},
  year          = {2022},
  url           = {https://arxiv.org/abs/2211.05770},
  eprint        = {2211.05770},
  archiveprefix = {arXiv},
  primaryclass  = {cs.CV}
}
```