https://github.com/cmsflash/efficient-attention
An implementation of the efficient attention module.
- Host: GitHub
- URL: https://github.com/cmsflash/efficient-attention
- Owner: cmsflash
- License: mit
- Created: 2019-01-23T14:44:40.000Z (over 7 years ago)
- Default Branch: master
- Last Pushed: 2020-11-30T06:06:12.000Z (over 5 years ago)
- Last Synced: 2023-11-07T14:31:57.213Z (over 2 years ago)
- Topics: attention-mechanism, computer-vision, deep-learning, paper, paper-implementation, paper-open-source
- Language: Python
- Homepage: https://arxiv.org/abs/1812.01243
- Size: 1.39 MB
- Stars: 244
- Watchers: 6
- Forks: 23
- Open Issues: 1
Metadata Files:
- Readme: README.md
- License: LICENSE
README
# Efficient Attention
An implementation of the [efficient attention](https://arxiv.org/abs/1812.01243) module.
## Description

Efficient attention is an attention mechanism that substantially improves memory and computational efficiency while retaining **exactly** the same expressive power as conventional dot-product attention. The illustration above compares the two types of attention. The efficient attention module is a drop-in replacement for the non-local module ([Wang et al., 2018](https://arxiv.org/abs/1711.07971)), while it:
- uses fewer resources to achieve the same accuracy;
- achieves higher accuracy under the same resource constraints (by allowing more insertions); and
- is applicable in domains and models where the non-local module is not (due to resource constraints).
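The comparison can be written out as a short sketch. The notation is an assumption and does not appear in this README: $Q, K \in \mathbb{R}^{n \times d_k}$ and $V \in \mathbb{R}^{n \times d_v}$ are the queries, keys, and values for $n$ positions, $\rho$ is a normalization applied to the $n \times n$ similarity matrix, and $\rho_q$, $\rho_k$ are normalizations applied to $Q$ and $K$ separately.

```latex
\begin{aligned}
% Dot-product attention materializes the n-by-n matrix Q K^T.
D(Q, K, V) &= \rho\!\left(Q K^{\mathsf{T}}\right) V
  && \text{memory } O(n^2),\ \text{compute } O\!\left(n^2 (d_k + d_v)\right) \\
% Efficient attention first forms the d_k-by-d_v matrix K^T V.
E(Q, K, V) &= \rho_q(Q)\left(\rho_k(K)^{\mathsf{T}} V\right)
  && \text{memory } O\!\left(n (d_k + d_v) + d_k d_v\right),\ \text{compute } O(n\, d_k d_v)
\end{aligned}
```

Once the normalization is applied to $Q$ and $K$ separately, matrix multiplication is associative, so $\left(\rho_q(Q)\,\rho_k(K)^{\mathsf{T}}\right) V = \rho_q(Q)\left(\rho_k(K)^{\mathsf{T}} V\right)$. The second ordering never builds an $n \times n$ matrix, which is where the linear dependence on $n$ comes from.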
## Resources
YouTube:
- Presentation: https://youtu.be/_wnjhTM04NM
bilibili (for users in Mainland China):
- Presentation: https://www.bilibili.com/video/BV1tK4y1f7Rm
- Presentation in Chinese: https://www.bilibili.com/video/bv1Gt4y1Y7E3
## Implementation details
This repository implements the efficient attention module with softmax normalization, output reprojection, and residual connection.
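The three components named above can be illustrated with a minimal NumPy sketch. It is not the repository's code: the function name `efficient_attention`, the weight matrices `w_q`, `w_k`, `w_v`, `w_o`, and the flat `(positions, channels)` layout are all assumptions made for illustration, and the softmax axes follow one common reading of softmax normalization (queries over channels, keys over positions).

```python
import numpy as np


def softmax(x, axis):
    # Numerically stable softmax along the given axis.
    shifted = x - x.max(axis=axis, keepdims=True)
    exp = np.exp(shifted)
    return exp / exp.sum(axis=axis, keepdims=True)


def efficient_attention(x, w_q, w_k, w_v, w_o):
    """Hypothetical single-head sketch.

    x: (n, c) input features for n positions.
    w_q, w_k: (c, d_k) query/key projections.
    w_v: (c, d_v) value projection.
    w_o: (d_v, c) output reprojection back to c channels.
    """
    # Softmax normalization (assumed axes): each query over its channels,
    # each key channel over all positions.
    q = softmax(x @ w_q, axis=1)
    k = softmax(x @ w_k, axis=0)
    v = x @ w_v

    # Aggregate values into a small (d_k, d_v) context first,
    # so no (n, n) attention map is ever built.
    context = k.T @ v
    attended = q @ context  # (n, d_v)

    # Output reprojection followed by the residual connection.
    return x + attended @ w_o


rng = np.random.default_rng(0)
x = rng.standard_normal((8, 4))
w_q, w_k = rng.standard_normal((4, 3)), rng.standard_normal((4, 3))
w_v, w_o = rng.standard_normal((4, 5)), rng.standard_normal((5, 4))
out = efficient_attention(x, w_q, w_k, w_v, w_o)
```

In this sketch the output has the same shape as the input, which is what allows the module to be inserted between existing layers. The implicit attention map `q @ k.T` still has rows that sum to one, because each row of `q` and each column of `k` sums to one.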
## Features not in the paper
This repository additionally implements the multi-head mechanism, which was not in the paper. To learn more about the mechanism, refer to [Vaswani et al. (2017)](https://arxiv.org/abs/1706.03762).
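One way to combine the multi-head idea with efficient attention is sketched below in NumPy. This is an assumption about the general approach, not a transcription of the repository's code: the function name `multi_head_efficient_attention`, the `head_count` parameter, and the choice to split channels into equal contiguous slices are all hypothetical.

```python
import numpy as np


def softmax(x, axis):
    # Numerically stable softmax along the given axis.
    shifted = x - x.max(axis=axis, keepdims=True)
    exp = np.exp(shifted)
    return exp / exp.sum(axis=axis, keepdims=True)


def multi_head_efficient_attention(q, k, v, head_count):
    """Hypothetical sketch: q, k are (n, d_k); v is (n, d_v)."""
    d_k, d_v = q.shape[1], v.shape[1]
    if d_k % head_count or d_v % head_count:
        raise ValueError("channel counts must be divisible by head_count")
    hk, hv = d_k // head_count, d_v // head_count

    heads = []
    for h in range(head_count):
        # Each head sees its own slice of the key/query and value channels.
        q_h = softmax(q[:, h * hk:(h + 1) * hk], axis=1)
        k_h = softmax(k[:, h * hk:(h + 1) * hk], axis=0)
        v_h = v[:, h * hv:(h + 1) * hv]
        # Per-head context is (hk, hv); no (n, n) matrix is formed.
        heads.append(q_h @ (k_h.T @ v_h))

    # Concatenate the heads along the channel axis: (n, d_v).
    return np.concatenate(heads, axis=1)


rng = np.random.default_rng(0)
q = rng.standard_normal((8, 4))
k = rng.standard_normal((8, 4))
v = rng.standard_normal((8, 6))
attended = multi_head_efficient_attention(q, k, v, head_count=2)
```

With `head_count=1` the sketch reduces to single-head efficient attention. With more heads, each head normalizes and aggregates over its own channel slice, so the heads can attend to different patterns at the same total channel budget.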
## Citation
The [paper](https://arxiv.org/abs/1812.01243) will appear at WACV 2021. If you use, compare with, or refer to this work, please cite:
```bibtex
@inproceedings{shen2021efficient,
    author = {Zhuoran Shen and Mingyuan Zhang and Haiyu Zhao and Shuai Yi and Hongsheng Li},
    title = {Efficient Attention: Attention with Linear Complexities},
    booktitle = {WACV},
    year = {2021},
}
```