https://github.com/cmsflash/efficient-attention

An implementation of the efficient attention module.
attention-mechanism computer-vision deep-learning paper paper-implementation paper-open-source

Host: GitHub
URL: https://github.com/cmsflash/efficient-attention
Owner: cmsflash
License: mit
Created: 2019-01-23T14:44:40.000Z (over 7 years ago)
Default Branch: master
Last Pushed: 2020-11-30T06:06:12.000Z (over 5 years ago)
Last Synced: 2023-11-07T14:31:57.213Z (over 2 years ago)
Topics: attention-mechanism, computer-vision, deep-learning, paper, paper-implementation, paper-open-source
Language: Python
Homepage: https://arxiv.org/abs/1812.01243
Size: 1.39 MB
Stars: 244
Watchers: 6
Forks: 23
Open Issues: 1
Metadata Files:
- Readme: README.md
- License: LICENSE

README

# Efficient Attention

An implementation of the [efficient attention](https://arxiv.org/abs/1812.01243) module.

## Description

![](illustration.png)

Efficient attention is an attention mechanism that substantially reduces memory and computational costs while retaining **exactly** the same expressive power as conventional dot-product attention. The illustration above compares the two types of attention. The efficient attention module is a drop-in replacement for the non-local module ([Wang et al., 2018](https://arxiv.org/abs/1711.07971)), and it:

- uses fewer resources to achieve the same accuracy;
- achieves higher accuracy with the same resource constraints (by allowing more insertions); and
- is applicable in domains and models where the non-local module is not (due to resource constraints).
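
The resource savings can be illustrated with a minimal NumPy sketch. It is not code from this repository: the variable names and sizes are made up for illustration, and normalization is left out on purpose so that only the reordering of the matrix products is shown. Under those assumptions, computing `K.T @ V` first gives the same output as building the full attention map, while the largest intermediate shrinks from `n x n` to `d_k x d_v`.

```python
import numpy as np

rng = np.random.default_rng(0)
# Illustrative sizes (assumptions): positions, key channels, value channels.
n, d_k, d_v = 1024, 32, 64

Q = rng.standard_normal((n, d_k))  # queries, one row per position
K = rng.standard_normal((n, d_k))  # keys, one row per position
V = rng.standard_normal((n, d_v))  # values, one row per position

# Dot-product ordering: materializes an n x n attention map.
attention_map = Q @ K.T               # shape (n, n)
out_dot_product = attention_map @ V   # shape (n, d_v)

# Efficient ordering: materializes only a d_k x d_v context matrix.
global_context = K.T @ V              # shape (d_k, d_v)
out_efficient = Q @ global_context    # shape (n, d_v)

# Matrix multiplication is associative, so both orderings agree
# up to floating-point error.
print(np.allclose(out_dot_product, out_efficient))
# Number of elements in the largest intermediate of each ordering.
print(attention_map.size, global_context.size)
```

The intermediate in the second ordering does not grow with the number of positions, which is what makes insertion at high resolutions feasible.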

## Resources

YouTube:
- Presentation: https://youtu.be/_wnjhTM04NM

bilibili (for users in Mainland China):
- Presentation: https://www.bilibili.com/video/BV1tK4y1f7Rm
- Presentation in Chinese: https://www.bilibili.com/video/bv1Gt4y1Y7E3

## Implementation details

This repository implements the efficient attention module with softmax normalization, output reprojection, and residual connection.
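
The three components named above can be sketched as follows. This is a simplified NumPy illustration, not the repository's module: the function name, the bias-free weight matrices `w_q`, `w_k`, `w_v`, `w_o`, and the flattened `(positions, channels)` input layout are all assumptions made for the example. Queries are softmax-normalized over channels at each position, keys are softmax-normalized over positions for each channel, the result is projected back to the input channel count, and the input is added as a residual.

```python
import numpy as np

def softmax(x, axis):
    # Numerically stable softmax along the given axis.
    shifted = x - x.max(axis=axis, keepdims=True)
    exp = np.exp(shifted)
    return exp / exp.sum(axis=axis, keepdims=True)

def efficient_attention(x, w_q, w_k, w_v, w_o):
    """Hypothetical single-head sketch; x has shape (n, c)."""
    q = softmax(x @ w_q, axis=1)   # normalize each position over channels
    k = softmax(x @ w_k, axis=0)   # normalize each channel over positions
    v = x @ w_v
    context = k.T @ v              # global context, shape (d_k, d_v)
    attended = q @ context         # shape (n, d_v)
    reprojected = attended @ w_o   # output reprojection back to c channels
    return x + reprojected         # residual connection

rng = np.random.default_rng(0)
n, c, d_k, d_v = 196, 16, 8, 8     # illustrative sizes (assumptions)
x = rng.standard_normal((n, c))
w_q = rng.standard_normal((c, d_k))
w_k = rng.standard_normal((c, d_k))
w_v = rng.standard_normal((c, d_v))
w_o = rng.standard_normal((d_v, c))

out = efficient_attention(x, w_q, w_k, w_v, w_o)
print(out.shape)
```

With these two softmax axes, each row of the implied attention map `q @ k.T` sums to 1, even though that map is never materialized.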

## Features not in the paper

This repository additionally implements the multi-head mechanism, which was not in the paper. To learn more about the mechanism, refer to [Vaswani et al.](https://arxiv.org/abs/1706.03762).
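
A rough sketch of how a multi-head variant could look is given below. It is an assumption-laden NumPy illustration rather than the repository's code: the function name and argument layout are hypothetical, the inputs are taken to be already-projected queries, keys, and values of shape `(positions, channels)`, and the channel counts are assumed to be divisible by `head_count`. Each head normalizes and attends over its own slice of channels, and the head outputs are concatenated along the channel axis.

```python
import numpy as np

def softmax(x, axis):
    # Numerically stable softmax along the given axis.
    shifted = x - x.max(axis=axis, keepdims=True)
    exp = np.exp(shifted)
    return exp / exp.sum(axis=axis, keepdims=True)

def multi_head_efficient_attention(queries, keys, values, head_count):
    """Hypothetical sketch; inputs have shapes (n, d_k), (n, d_k), (n, d_v)."""
    # np.split raises ValueError if the channels do not divide evenly.
    q_heads = np.split(queries, head_count, axis=1)
    k_heads = np.split(keys, head_count, axis=1)
    v_heads = np.split(values, head_count, axis=1)

    outputs = []
    for q, k, v in zip(q_heads, k_heads, v_heads):
        q = softmax(q, axis=1)       # over this head's channels
        k = softmax(k, axis=0)       # over positions
        outputs.append(q @ (k.T @ v))  # per-head context, then attend
    return np.concatenate(outputs, axis=1)  # shape (n, d_v)

rng = np.random.default_rng(0)
n, d_k, d_v = 100, 16, 32            # illustrative sizes (assumptions)
queries = rng.standard_normal((n, d_k))
keys = rng.standard_normal((n, d_k))
values = rng.standard_normal((n, d_v))

multi_head_out = multi_head_efficient_attention(queries, keys, values, head_count=4)
print(multi_head_out.shape)
```

In this sketch, output reprojection and the residual connection would be applied once to the concatenated result rather than per head.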

## Citation

The [paper](https://arxiv.org/abs/1812.01243) will appear at WACV 2021. If you use, compare with, or refer to this work, please cite

```bibtex
@inproceedings{shen2021efficient,
author = {Zhuoran Shen and Mingyuan Zhang and Haiyu Zhao and Shuai Yi and Hongsheng Li},
title = {Efficient Attention: Attention with Linear Complexities},
booktitle = {WACV},
year = {2021},
}
```
