The implementation of the paper "Efficient Attention Network: Accelerate Attention by Searching Where to Plug".
https://github.com/gbup-group/EAN-efficient-attention-network
- Host: GitHub
- URL: https://github.com/gbup-group/EAN-efficient-attention-network
- Owner: gbup-group
- License: MIT
- Created: 2020-11-28T02:38:41.000Z (almost 4 years ago)
- Default Branch: main
- Last Pushed: 2023-06-16T02:38:22.000Z (over 1 year ago)
- Last Synced: 2024-07-04T00:57:41.676Z (4 months ago)
- Language: Python
- Size: 552 KB
- Stars: 20
- Watchers: 5
- Forks: 6
- Open Issues: 0
Metadata Files:
- Readme: README.md
- License: LICENSE
README
# EAN-efficient-attention-network
![GitHub](https://img.shields.io/github/license/gbup-group/DIANet.svg)
![GitHub](https://img.shields.io/badge/gbup-%E7%A8%B3%E4%BD%8F-blue.svg)

By [Zhongzhan Huang](https://github.com/dedekinds), [Senwei Liang](https://leungsamwai.github.io), [Mingfu Liang](https://wuyujack.github.io/), [Wei He](https://github.com/erichhhhho) and [Haizhao Yang](https://haizhaoyang.github.io/).
The implementation of the paper "Efficient Attention Network: Accelerate Attention by Searching Where to Plug" [[paper]](https://arxiv.org/abs/2011.14058).
## Introduction
Efficient Attention Network (EAN) is a framework that improves the efficiency of existing attention modules in computer vision. In EAN, we leverage the sharing mechanism [(Huang et al. 2020)](https://arxiv.org/pdf/1905.10671.pdf) to share a single attention module across the backbone, and we use reinforcement learning to search for where to connect the shared attention module.
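As an illustration of this idea, here is a minimal PyTorch sketch (not the repository's code): one attention module is shared by all residual blocks of the backbone, and a binary connection scheme decides after which blocks it is plugged in. All class and argument names below are hypothetical.
```
import torch.nn as nn

class SharedAttentionBackbone(nn.Module):
    """Backbone whose blocks optionally apply a single shared attention module."""

    def __init__(self, blocks, shared_attention, scheme):
        super().__init__()
        self.blocks = nn.ModuleList(blocks)   # e.g. ResNet residual blocks
        self.attention = shared_attention     # one attention module shared by every block
        self.scheme = scheme                  # binary list: scheme[i] == 1 plugs attention after block i

    def forward(self, x):
        for block, plug in zip(self.blocks, self.scheme):
            x = block(x)
            if plug:                          # the searched connection scheme controls this
                x = self.attention(x)
        return x
```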
## Requirements
* Python 3.6 and [PyTorch 1.0](http://pytorch.org/)

## Implementation
Our implementation is divided into three parts. First, we pre-train a supernet. Second, we use a policy-gradient-based method to search for an optimal connection scheme within the supernet. Finally, we train from scratch the network found in the second step.
### Pretrain a Supernet
First, we pretrain a supernet; the checkpoint is saved in `NAS_ckpts`. For example, to train an SGE supernet,
```
CUDA_VISIBLE_DEVICES=0,1,2,3 python train_imagenet/train_imagenet_ensemble_subset.py -a forward_config_share_sge_resnet50 -data /home/jovyan/ILSVRC2012_Data --checkpoint NAS_ckpts/ensemble_sge_train_on_subset
```
or to train a DIA supernet,
```
CUDA_VISIBLE_DEVICES=0,1,2,3 python train_imagenet/train_imagenet_ensemble_subset.py -a forward_dia_fbresnet50 -data /home/jovyan/ILSVRC2012_Data --checkpoint NAS_ckpts/ensemble_dia_train_on_subset
```
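For intuition only, the snippet below sketches one common way a one-shot supernet of this kind can be pretrained: a random connection scheme is sampled for every batch so that each candidate plug-in position receives gradient updates. This is an assumption for illustration; the repository's actual procedure is implemented in `train_imagenet/train_imagenet_ensemble_subset.py`.
```
import random

def pretrain_supernet(supernet, loader, optimizer, criterion, num_blocks, epochs=1):
    # Hypothetical training loop; `supernet.scheme` follows the sketch in the Introduction.
    for _ in range(epochs):
        for images, targets in loader:
            # Sample a random 0/1 connection scheme for this batch.
            supernet.scheme = [random.randint(0, 1) for _ in range(num_blocks)]
            optimizer.zero_grad()
            loss = criterion(supernet(images), targets)
            loss.backward()
            optimizer.step()
```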
### Search an Optimal Connection Scheme
Then, we search for an optimal connection scheme within the supernet. For SGE,
```
python search_imagenet/run_code_search_sge.py
```
For DIA,
```
python search_imagenet/run_code_search_dia.py
```
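The policy-gradient-based search can be pictured as a REINFORCE-style controller that keeps one Bernoulli probability per candidate plug-in position, samples a scheme, scores it with the pretrained supernet (e.g. by validation accuracy), and pushes the probabilities toward schemes with higher reward. The sketch below is illustrative only and its names are assumptions; the repository's actual search lives in the `search_imagenet` scripts.
```
import torch
import torch.nn as nn

class ConnectionController(nn.Module):
    """Independent Bernoulli policy over each candidate plug-in position."""

    def __init__(self, num_blocks):
        super().__init__()
        self.logits = nn.Parameter(torch.zeros(num_blocks))

    def sample(self):
        probs = torch.sigmoid(self.logits)
        scheme = torch.bernoulli(probs)
        log_prob = (scheme * torch.log(probs + 1e-8)
                    + (1 - scheme) * torch.log(1 - probs + 1e-8)).sum()
        return scheme, log_prob

def reinforce_step(controller, evaluate_scheme, optimizer, baseline=0.0):
    scheme, log_prob = controller.sample()
    reward = evaluate_scheme(scheme)           # e.g. supernet validation accuracy under this scheme
    loss = -(reward - baseline) * log_prob     # REINFORCE gradient estimator
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return scheme, reward
```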
### Train a Network From Scratch
Last, we train from scratch the attention network with the connection scheme found in the second step. Note that to train the attention network with a different scheme, we need to edit `train_imagenet/run_codes_train_from_scratch.py`.
```
python train_imagenet/run_codes_train_from_scratch.py
```
The checkpoints will be saved in `NAS_ckpts`.
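For illustration only, a connection scheme can be thought of as a binary vector with one entry per candidate plug-in position (1 = attach the shared attention module there). The variable name and values below are hypothetical and are not the script's real interface.
```
# Hypothetical example of a searched connection scheme (NOT the script's actual format):
# 1 = plug the shared attention module after that block, 0 = skip it.
SEARCHED_SCHEME = [0, 1, 1, 0, 1, 0, 0, 1, 1, 0, 1, 0, 0, 1, 0, 1]
```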
## Citation
If you find this paper helpful in your research, please kindly cite [the paper](https://arxiv.org/abs/2011.14058).
## Acknowledgement
We would like to thank Taehoon Kim for his PyTorch version of the [ENAS framework](https://github.com/carpedm20/ENAS-pytorch) and Xiang Li for his [attention network framework](https://github.com/implus/PytorchInsight).