https://github.com/d-li14/lambda.pytorch
PyTorch implementation of Lambda Network and pretrained Lambda-ResNet
- Host: GitHub
- URL: https://github.com/d-li14/lambda.pytorch
- Owner: d-li14
- Created: 2020-11-28T00:25:43.000Z (about 5 years ago)
- Default Branch: main
- Last Pushed: 2021-03-11T00:53:06.000Z (almost 5 years ago)
- Last Synced: 2025-03-24T21:42:21.733Z (9 months ago)
- Topics: attention, iclr2021, imagenet, lambda-network, pre-trained-model, pytorch
- Language: Python
- Homepage: https://arxiv.org/abs/2102.08602
- Size: 9.77 KB
- Stars: 54
- Watchers: 2
- Forks: 7
- Open Issues: 2
Metadata Files:
- Readme: README.md
# lambda.pytorch
**[NEW!]** Check out our latest work [involution](https://github.com/d-li14/involution) in CVPR'21 that bridges convolution and self-attention operators.
---
PyTorch implementation of [LambdaNetworks: Modeling long-range Interactions without Attention](https://openreview.net/forum?id=xTJEN-ggl1b).
Lambda Networks apply the associative law of matrix multiplication to reverse the computation order of self-attention, achieving linear computational complexity with respect to content interactions.
Similar techniques have been used previously in [A2-Net](https://arxiv.org/abs/1810.11579) and [CGNL](https://arxiv.org/abs/1810.13125). Check out a collection of self-attention modules in another repository [dot-product-attention](https://github.com/d-li14/dot-product-attention).
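The associativity trick above can be sketched in a few lines of NumPy. This is a simplified, single-head illustration of the content-lambda path only (no position lambdas, no heads); the function names are illustrative and not taken from this repo:

```python
import numpy as np

def softmax(x, axis):
    # numerically stable softmax along the given axis
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def content_lambda(q, k, v):
    """Content-lambda sketch: q (n, dk), k (m, dk), v (m, dv).

    Standard attention materializes the (n, m) map softmax-related to
    Q K^T before multiplying by V. By associativity,
    (Q K_s^T) V == Q (K_s^T V), so we can first contract keys with
    values into a small (dk, dv) "content lambda" whose size is
    independent of the sequence lengths n and m.
    """
    k_s = softmax(k, axis=0)     # normalize keys over the m context positions
    lam_c = k_s.T @ v            # (dk, dv) content lambda
    return q @ lam_c             # (n, dv) output, never forming an (n, m) map
```

Both orderings give identical outputs; only the intermediate size changes, which is where the linear complexity in context length comes from.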
## Training Configuration
✓ SGD optimizer, initial learning rate 0.1, momentum 0.9, weight decay 0.0001
✓ 130 epochs, batch size 256, 8x Tesla V100 GPUs, cosine learning rate decay
✓ label smoothing 0.1
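The configuration above maps onto standard PyTorch components roughly as follows. This is a hedged sketch, not the repo's actual training script: the `model` is a stand-in for Lambda-ResNet-50, and the `label_smoothing` argument of `CrossEntropyLoss` requires PyTorch >= 1.10 (the original 2021 code likely implemented smoothing manually):

```python
import torch

# Placeholder model; the repo trains Lambda-ResNet-50 on ImageNet.
model = torch.nn.Linear(2048, 1000)

# SGD with the listed hyperparameters
optimizer = torch.optim.SGD(
    model.parameters(), lr=0.1, momentum=0.9, weight_decay=1e-4
)

# Cosine LR decay over the 130-epoch schedule (stepped once per epoch)
scheduler = torch.optim.lr_scheduler.CosineAnnealingLR(optimizer, T_max=130)

# Cross-entropy with label smoothing 0.1 (PyTorch >= 1.10)
criterion = torch.nn.CrossEntropyLoss(label_smoothing=0.1)
```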
## Pre-trained checkpoints
| Architecture | Parameters | FLOPs | Top-1 / Top-5 Acc. (%) | Download |
| :----------------------: | :--------: | :---: | :------------------------: | :------: |
| Lambda-ResNet-50 | 14.995M | 6.576G | 78.208 / 93.820 | [model](https://hkustconnect-my.sharepoint.com/:u:/g/personal/dlibh_connect_ust_hk/EUZkICtpXitIq6PGa6h6m_YBnFXCiCYTSuqoIUqiR33C5A?e=mhgEbC) / [log](https://hkustconnect-my.sharepoint.com/:t:/g/personal/dlibh_connect_ust_hk/EQuZ1itCS2dFpN2MBVepL5YBQe9N-ZUv6y4vNdO5uiVFig?e=dX7Id1) |
## Citation
If you find this repository useful in your research, please cite:
```bibtex
@InProceedings{Li_2021_CVPR,
author = {Li, Duo and Hu, Jie and Wang, Changhu and Li, Xiangtai and She, Qi and Zhu, Lei and Zhang, Tong and Chen, Qifeng},
title = {Involution: Inverting the Inherence of Convolution for Visual Recognition},
booktitle = {IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)},
month = {June},
year = {2021}
}
```
```bibtex
@inproceedings{
bello2021lambdanetworks,
title={LambdaNetworks: Modeling long-range Interactions without Attention},
author={Irwan Bello},
booktitle={International Conference on Learning Representations},
year={2021},
}
```