# Neighborhood Attention Transformers
[NATTEN](https://github.com/SHI-Labs/NATTEN)

PapersWithCode leaderboards:
* [Instance segmentation on ADE20K (val)](https://paperswithcode.com/sota/instance-segmentation-on-ade20k-val?p=dilated-neighborhood-attention-transformer)
* [Panoptic segmentation on ADE20K (val)](https://paperswithcode.com/sota/panoptic-segmentation-on-ade20k-val?p=dilated-neighborhood-attention-transformer)
* [Semantic segmentation on ADE20K (val)](https://paperswithcode.com/sota/semantic-segmentation-on-ade20k-val?p=dilated-neighborhood-attention-transformer)
* [Instance segmentation on Cityscapes (val)](https://paperswithcode.com/sota/instance-segmentation-on-cityscapes-val?p=dilated-neighborhood-attention-transformer)
* [Semantic segmentation on Cityscapes (val)](https://paperswithcode.com/sota/semantic-segmentation-on-cityscapes-val?p=dilated-neighborhood-attention-transformer)
* [Panoptic segmentation on Cityscapes (val)](https://paperswithcode.com/sota/panoptic-segmentation-on-cityscapes-val?p=dilated-neighborhood-attention-transformer)
* [Instance segmentation on COCO (minival)](https://paperswithcode.com/sota/instance-segmentation-on-coco-minival?p=dilated-neighborhood-attention-transformer)
* [Panoptic segmentation on COCO (minival)](https://paperswithcode.com/sota/panoptic-segmentation-on-coco-minival?p=dilated-neighborhood-attention-transformer)
* [Image generation on FFHQ 256x256](https://paperswithcode.com/sota/image-generation-on-ffhq-256-x-256?p=stylenat-giving-each-head-a-new-perspective)
* [Image generation on FFHQ 1024x1024](https://paperswithcode.com/sota/image-generation-on-ffhq-1024-x-1024?p=stylenat-giving-each-head-a-new-perspective)
* [Image generation on LSUN Churches 256x256](https://paperswithcode.com/sota/image-generation-on-lsun-churches-256-x-256?p=stylenat-giving-each-head-a-new-perspective)
**Powerful hierarchical vision transformers based on sliding window attention.**
Neighborhood Attention (NA, local attention) was introduced in our original paper, [NAT](NAT.md), and runs efficiently with our extension to PyTorch, [NATTEN](https://github.com/SHI-Labs/NATTEN).

We recently introduced a new model, [DiNAT](DiNAT.md), which extends NA by dilating neighborhoods (DiNA, sparse global attention, a.k.a. dilated local attention).

Combinations of NA/DiNA are capable of preserving locality, maintaining translational equivariance, expanding the receptive field exponentially, and capturing longer-range inter-dependencies, leading to significant performance boosts in downstream vision tasks, such as [StyleNAT](https://github.com/SHI-Labs/StyleNAT) for image generation.
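As a rough back-of-the-envelope illustration of the receptive-field claim (an illustration only, not a figure from the papers): with kernel size k = 7, stacking NA layers at dilation 1 grows the receptive field linearly, by k - 1 = 6 tokens per layer, whereas doubling the dilation every layer (1, 2, 4, ...) adds 6 * 2^l tokens at layer l, i.e., exponential growth, while each query still attends to only k^2 = 49 keys.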
# News

### March 25, 2023

* Neighborhood Attention Transformer was accepted to CVPR 2023!

### November 18, 2022

* NAT and DiNAT are now available through HuggingFace's [transformers](https://github.com/huggingface/transformers).
* NAT and DiNAT classification models are also available on the HuggingFace Model Hub: [NAT](https://huggingface.co/models?filter=nat) | [DiNAT](https://huggingface.co/models?filter=dinat)

### November 11, 2022

* New preprint: [StyleNAT: Giving Each Head a New Perspective](https://github.com/SHI-Labs/StyleNAT).
* Style-based GAN powered with Neighborhood Attention sets a new SOTA on FFHQ-256 with a 2.05 FID.
### October 8, 2022

* [NATTEN](https://github.com/SHI-Labs/NATTEN) is now [available as a pip package](https://www.shi-labs.com/natten/)!
* You can now install NATTEN with pre-compiled wheels and start using it in seconds; see the sketch after this list.
* NATTEN will be maintained and developed as a [separate project](https://github.com/SHI-Labs/NATTEN) to support broader usage of sliding window attention, even beyond computer vision.
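A minimal usage sketch; module and argument names here follow NATTEN's own README at the time of writing and may differ across versions, so check the [NATTEN](https://github.com/SHI-Labs/NATTEN) repository for the current API and for the wheel install command matching your torch/CUDA setup:

```python
import torch
from natten import NeighborhoodAttention2D  # pip install natten

# 2D neighborhood attention: each token attends to a 7x7 neighborhood of
# tokens; dilation=2 spreads that 7x7 neighborhood over a 13x13 span.
na2d = NeighborhoodAttention2D(dim=128, num_heads=4, kernel_size=7, dilation=2)

x = torch.randn(1, 32, 32, 128)  # NATTEN modules expect channels-last (B, H, W, C)
out = na2d(x)                    # output shape matches the input: (1, 32, 32, 128)
```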
### September 29, 2022

* New preprint: [Dilated Neighborhood Attention Transformer](DiNAT.md).

# Dilated Neighborhood Attention :fire:

A new hierarchical vision transformer based on Neighborhood Attention (local attention) and Dilated Neighborhood Attention (sparse global attention) that enjoys a significant performance boost in downstream tasks.
Check out the [DiNAT README](DiNAT.md).
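To make the dilation idea concrete, here is a small pure-PyTorch sketch (for illustration only, not code from this repository) of how the attended key indices can be computed along one axis. Edge handling is simplified to clamping; actual DiNA keeps edge queries aligned with their own dilation group:

```python
import torch

def dilated_neighborhood_indices(length, kernel_size, dilation=1):
    """Key indices each query attends to along one axis.

    dilation=1 recovers plain Neighborhood Attention (NA); dilation>1 gives
    a sparser but wider neighborhood (DiNA) with the same number of keys.
    """
    radius = kernel_size // 2
    span = radius * dilation
    # Queries near the edges cannot center their window, so it is shifted
    # inward: every query still attends to exactly kernel_size keys.
    centers = torch.arange(length).clamp(span, length - 1 - span)
    offsets = torch.arange(-radius, radius + 1) * dilation
    return centers[:, None] + offsets[None, :]  # (length, kernel_size)

# Same number of attended keys, wider span when dilated:
print(dilated_neighborhood_indices(8, 3, dilation=1)[4])  # tensor([3, 4, 5])
print(dilated_neighborhood_indices(8, 3, dilation=2)[4])  # tensor([2, 4, 6])
```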
# Neighborhood Attention Transformer

Our original paper, [Neighborhood Attention Transformer (NAT)](NAT.md), introduced the first efficient sliding-window local attention.
# How Neighborhood Attention works
Neighborhood Attention localizes the query token's (red) receptive field to its nearest neighboring tokens in the key-value pair (green).
This is equivalent to dot-product self attention when the neighborhood size is identical to the image dimensions.
Note that tokens at the edges are special cases: they cannot center their neighborhood, so their window is shifted inward to keep the number of attended tokens fixed (see the sketch below).
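A minimal single-head 1D sketch of the mechanism in pure PyTorch (for illustration only; it omits multi-head projections and relative positional biases, and is not the optimized NATTEN kernel):

```python
import torch

def neighborhood_attention_1d(q, k, v, kernel_size):
    """Minimal single-head 1D neighborhood attention.

    q, k, v: (length, dim). Each query attends to its kernel_size nearest keys;
    kernel_size should be odd.
    """
    length, dim = q.shape
    radius = kernel_size // 2
    # Edge queries cannot center their window, so it is shifted inward;
    # every query always attends to exactly kernel_size keys.
    centers = torch.arange(length).clamp(radius, length - 1 - radius)
    idx = centers[:, None] + torch.arange(-radius, radius + 1)  # (length, kernel_size)
    attn = torch.einsum("ld,lkd->lk", q, k[idx]) / dim**0.5
    attn = attn.softmax(dim=-1)
    return torch.einsum("lk,lkd->ld", attn, v[idx])

# When kernel_size == length, every query sees every key, and the result
# matches ordinary dot-product self attention exactly.
q, k, v = (torch.randn(9, 16) for _ in range(3))
out_na = neighborhood_attention_1d(q, k, v, kernel_size=9)
out_sa = (q @ k.T / 16**0.5).softmax(-1) @ v
assert torch.allclose(out_na, out_sa, atol=1e-6)
```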
# Citation
```bibtex
@inproceedings{hassani2023neighborhood,
  title     = {Neighborhood Attention Transformer},
  author    = {Ali Hassani and Steven Walton and Jiachen Li and Shen Li and Humphrey Shi},
  booktitle = {Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)},
  month     = {June},
  year      = {2023},
  pages     = {6185-6194}
}
@article{hassani2022dilated,
  title         = {Dilated Neighborhood Attention Transformer},
  author        = {Ali Hassani and Humphrey Shi},
  year          = {2022},
  url           = {https://arxiv.org/abs/2209.15001},
  eprint        = {2209.15001},
  archiveprefix = {arXiv},
  primaryclass  = {cs.CV}
}
@article{walton2022stylenat,
  title         = {StyleNAT: Giving Each Head a New Perspective},
  author        = {Steven Walton and Ali Hassani and Xingqian Xu and Zhangyang Wang and Humphrey Shi},
  year          = {2022},
  url           = {https://arxiv.org/abs/2211.05770},
  eprint        = {2211.05770},
  archiveprefix = {arXiv},
  primaryclass  = {cs.CV}
}
```