awesome-attention-mechanism-in-cv
Awesome List of Attention Modules and Plug&Play Modules in Computer Vision
https://github.com/pprp/awesome-attention-mechanism-in-cv
Plug and Play Module
- Enhancing feature fusion for human pose estimation
- ACNet: Strengthening the Kernel Skeletons for Powerful CNN via Asymmetric Convolution Blocks
- DeepLab: Semantic Image Segmentation with Deep Convolutional Nets, Atrous Convolution, and Fully Connected CRFs
- MixConv: Mixed Depthwise Convolutional Kernels
- Pyramid Scene Parsing Network
- Receptive Field Block Net for Accurate and Fast Object Detection
- Strip Pooling: Rethinking Spatial Pooling for Scene Parsing
- SSH: Single Stage Headless Face Detector
- GhostNet: More Features from Cheap Operations
- SlimConv: Reducing Channel Redundancy in Convolutional Neural Networks by Weights Flipping
- EfficientNet: Rethinking Model Scaling for Convolutional Neural Networks
- PP-NAS: Searching for Plug-and-Play Blocks on Convolutional Neural Network
- Dynamic Convolution: Attention over Convolution Kernels
- PSConv: Squeezing Feature Pyramid into One Compact Poly-Scale Convolutional Layer
- DCANet: Dense Context-Aware Network for Semantic Segmentation
- Object-Contextual Representations for Semantic Segmentation | [HRNet-OCR](https://github.com/HRNet/HRNet-Semantic-Segmentation/tree/HRNet-OCR?v=2)
- DO-Conv: Depthwise Over-parameterized Convolutional Layer | [DO-Conv](https://github.com/yangyanli/DO-Conv)
- Pyramidal Convolution: Rethinking Convolutional Neural Networks for Visual Recognition
- Dynamic Group Convolution for Accelerating Convolutional Neural Networks
- CondConv: Conditionally Parameterized Convolutions for Efficient Inference
- ULSAM: Ultra-Lightweight Subspace Attention Module for Compact Convolutional Neural Networks
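The entries above ship as PyTorch repos; to make the "plug-and-play" idea concrete, here is a minimal NumPy sketch of the Ghost module from GhostNet (the function name, shapes, and the per-channel "cheap operation" are illustrative simplifications, not code from the listed repo):

```python
import numpy as np

def ghost_module(x, w_primary, cheap_scale):
    """Toy Ghost module: half the output channels come from a 'real'
    1x1 convolution; the other half are derived from those maps by a
    cheap per-channel linear operation, then the two are concatenated.

    x: (C_in, H, W); w_primary: (C_half, C_in); cheap_scale: (C_half,)
    returns: (2 * C_half, H, W)
    """
    # primary features: pointwise (1x1) convolution over channels
    primary = np.tensordot(w_primary, x, axes=([1], [0]))
    # ghost features: cheap per-channel transform of the primary maps
    ghost = primary * cheap_scale[:, None, None]
    return np.concatenate([primary, ghost], axis=0)
```

Because the module only changes how a block's output channels are produced, it can replace a standard convolution anywhere in a backbone, which is what makes it "plug and play".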
Attention Mechanism
- SA-Net: Shuffle Attention for Deep Convolutional Neural Networks | [zhihu](https://zhuanlan.zhihu.com/p/350912960)
- Image Restoration via Residual Non-local Attention Networks
- Concurrent Spatial and Channel ‘Squeeze & Excitation’ in Fully Convolutional Networks | [zhihu](https://zhuanlan.zhihu.com/p/102036086)
- Squeeze-and-Excitation Networks | [zhihu](https://zhuanlan.zhihu.com/p/102035721)
- Neural Architecture Search for Lightweight Non-Local Networks
- Selective Kernel Network
- Convolutional Block Attention Module | [zhihu](https://zhuanlan.zhihu.com/p/102035273)
- BottleNeck Attention Module | [zhihu](https://zhuanlan.zhihu.com/p/102033063)
- Non-local Neural Networks | [Non-Local (NL)](https://github.com/AlexHex7/Non-local_pytorch) | [zhihu](https://zhuanlan.zhihu.com/p/102984842)
- GCNet: Non-local Networks Meet Squeeze-Excitation Networks and Beyond
- CCNet: Criss-Cross Attention for Semantic Segmentation
- ECA-Net: Efficient Channel Attention for Deep Convolutional Neural Networks
- Spatial Group-wise Enhance: Improving Semantic Feature Learning in Convolutional Networks
- FcaNet: Frequency Channel Attention Networks
- $A^2\text{-}Nets$: Double Attention Networks
- Asymmetric Non-local Neural Networks for Semantic Segmentation
- Efficient Attention: Attention with Linear Complexities
- Exploring Self-attention for Image Recognition
- An Empirical Study of Spatial Attention Mechanisms in Deep Networks
- Object-Contextual Representations for Semantic Segmentation | [HRNet-OCR](https://github.com/HRNet/HRNet-Semantic-Segmentation/tree/HRNet-OCR?v=2)
- IAUnet: Global Context-Aware Feature Learning for Person Re-Identification
- ResNeSt: Split-Attention Networks
- Gather-Excite: Exploiting Feature Context in Convolutional Neural Networks
- Improving Convolutional Networks with Self-calibrated Convolutions
- Rotate to Attend: Convolutional Triplet Attention Module
- Dual Attention Network for Scene Segmentation
- Relation-Aware Global Attention for Person Re-identification
- Attentional Feature Fusion
- An Attentive Survey of Attention Models
- Stand-Alone Self-Attention in Vision Models
- BiSeNet: Bilateral Segmentation Network for Real-time Semantic Segmentation
- DCANet: Learning Connected Attentions for Convolutional Neural Networks
- Look Closer to See Better: Recurrent Attention Convolutional Neural Network for Fine-grained Image Recognition | [RA-CNN](https://github.com/Jianlong-Fu/Recurrent-Attention-CNN)
- Guided Attention Network for Object Detection and Counting on Drones
- Attention Augmented Convolutional Networks
- Global Self-Attention Networks for Image Recognition
- Attention-Guided Hierarchical Structure Aggregation for Image Matting
- Weight Excitation: Built-in Attention Mechanisms in Convolutional Neural Networks
- Expectation-Maximization Attention Networks for Semantic Segmentation
- Dense-and-Implicit Attention Network
- Coordinate Attention for Efficient Mobile Network Design
- Cross-channel Communication Networks
- Gated Convolutional Networks with Hybrid Connectivity for Image Classification
- Weighted Channel Dropout for Regularization of Deep Convolutional Neural Network
- BA^2M: A Batch Aware Attention Module for Image Classification
- EPSANet: An Efficient Pyramid Split Attention Block on Convolutional Neural Network
- ResT: An Efficient Transformer for Visual Recognition
- SPANet: Spatial Pyramid Attention Network for Enhanced Image Recognition
- Space-time Mixing Attention for Video Transformer
- DMSANet: Dual Multi Scale Attention Network
- CompConv: A Compact Convolution Module for Efficient Feature Learning
- VOLO: Vision Outlooker for Visual Recognition
- Interflow: Aggregating Multi-layer Feature Mappings with Attention Mechanism
- MUSE: Parallel Multi-Scale Attention for Sequence to Sequence Learning
- Polarized Self-Attention: Towards High-quality Pixel-wise Regression
- CA-Net: Comprehensive Attention Convolutional Neural Networks for Explainable Medical Image Segmentation | [CA-Net](https://github.com/HiLab-git/CA-Net)
- BAM: A Lightweight and Efficient Balanced Attention Mechanism for Single Image Super Resolution
- Attention as Activation
- Region-based Non-local Operation for Video Classification
- MSAF: Multimodal Split Attention Fusion
- All-Attention Layer
- Compact Global Descriptor
- SimAM: A Simple, Parameter-Free Attention Module for Convolutional Neural Networks
- Drop an Octave: Reducing Spatial Redundancy in Convolutional Neural Networks With Octave Convolution
- Contextual Transformer Networks for Visual Recognition
- Residual Attention: A Simple but Effective Method for Multi-Label Recognition
- Self-supervised Equivariant Attention Mechanism for Weakly Supervised Semantic Segmentation
- An Attention Module for Convolutional Neural Networks
- Attentive Normalization
- Person Re-identification via Attention Pyramid
- Unifying Nonlocal Blocks for Neural Networks
- Tiled Squeeze-and-Excite: Channel Attention With Local Spatial Context
- PP-NAS: Searching for Plug-and-Play Blocks on Convolutional Neural Network | [PP-NAS](https://github.com/sbl1996/PP-NAS)
- Distilling Knowledge via Knowledge Review
- Dynamic Region-Aware Convolution
- Encoder Fusion Network With Co-Attention Embedding for Referring Image Segmentation
- Introvert: Human Trajectory Prediction via Conditional 3D Attention
- SSAN: Separable Self-Attention Network for Video Representation Learning
- Delving Deep into Many-to-many Attention for Few-shot Video Object Segmentation
- $A^2$-FPN: Attention Aggregation Based Feature Pyramid Network for Instance Segmentation
- Image Super-Resolution with Non-Local Sparse Attention
- Keep your Eyes on the Lane: Real-time Attention-guided Lane Detection
- NAM: Normalization-based Attention Module
- NAS-SCAM: Neural Architecture Search-Based Spatial and Channel Joint Attention Module for Nuclei Semantic Segmentation and Classification | [NAS-SCAM](https://github.com/ZuhaoLiu/NAS-SCAM)
- NASABN: A Neural Architecture Search Framework for Attention-Based Networks
- Att-DARTS: Differentiable Neural Architecture Search for Attention | [Att-DARTS](https://github.com/chomin/Att-DARTS)
- On the Integration of Self-Attention and Convolution
- BoxeR: Box-Attention for 2D and 3D Transformers
- CoAtNet: Marrying Convolution and Attention for All Data Sizes
- Pay Attention to MLPs
- IC-Conv: Inception Convolution With Efficient Dilation Search | [IC-Conv](https://github.com/yifan123/IC-Conv)
- SRM: A Style-based Recalibration Module for Convolutional Neural Networks
- Competitive Inner-Imaging Squeeze and Excitation for Residual Network | [Competitive-Inner-Imaging-SENet](https://github.com/scut-aitcm/Competitive-Inner-Imaging-SENet)
- Augmenting Convolutional networks with attention-based aggregation
- Context-aware Attentional Pooling (CAP) for Fine-grained Visual Classification
- Instance Enhancement Batch Normalization: An Adaptive Regulator of Batch Noise
- ASR: Attention-alike Structural Re-parameterization
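As a concrete reference point for the channel-attention entries above (Squeeze-and-Excitation and its many descendants), here is a minimal NumPy sketch of the SE block; the shapes and plain-NumPy formulation are illustrative, and the linked repos implement this in PyTorch:

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def squeeze_excite(x, w1, w2):
    """Toy Squeeze-and-Excitation block.

    Squeeze: global average pooling -> channel descriptor z.
    Excite: bottleneck MLP (ReLU, then sigmoid) -> per-channel gates.
    Scale: multiply each channel of x by its gate.
    x: (C, H, W); w1: (C // r, C); w2: (C, C // r), r = reduction ratio.
    """
    z = x.mean(axis=(1, 2))                        # squeeze: (C,)
    gates = sigmoid(w2 @ np.maximum(w1 @ z, 0.0))  # excite: (C,), each in (0, 1)
    return x * gates[:, None, None]                # channel-wise rescaling
```

Most later channel-attention modules (ECA, FcaNet, SRM, etc.) keep this squeeze-excite-scale skeleton and vary how the descriptor or the gates are computed.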
Vision Transformer
- Efficient Training of Visual Transformers with Small-Size Datasets
- Swin Transformer: Hierarchical Vision Transformer using Shifted Windows
- GLiT: Neural Architecture Search for Global and Local Image Transformer
- ConViT: Improving Vision Transformers with Soft Convolutional Inductive Biases
- CvT: Introducing Convolutions to Vision Transformers
- ResT: An Efficient Transformer for Visual Recognition
- DynamicViT: Efficient Vision Transformers with Dynamic Token Sparsification
- Early Convolutions Help Transformers See Better
- LeViT: a Vision Transformer in ConvNet's Clothing for Faster Inference
- ViTAE: Vision Transformer Advanced by Exploring Intrinsic Inductive Bias
- LocalViT: Bringing Locality to Vision Transformers
- DeiT: Training data-efficient image transformers & distillation through attention
- CaiT: Going deeper with Image Transformers
- Vision Transformer with Deformable Attention
- MaxViT: Multi-Axis Vision Transformer
- Rethinking Mobile Block for Efficient Neural Models
- ConvNeXt V2: Co-designing and Scaling ConvNets with Masked Autoencoders | [ConvNeXt-V2](https://github.com/facebookresearch/ConvNeXt-V2)
- A Close Look at Spatial Modeling: From Attention to Convolution
- Scalable Diffusion Models with Transformers
- Dynamic Grained Encoder for Vision Transformers
- Segment Anything
- Improved robustness of vision transformers via prelayernorm in patch embedding
- Demystify Transformers & Convolutions in Modern Image Deep Networks | [STM-Evaluation](https://github.com/OpenGVLab/STM-Evaluation)
- CPVT: Conditional Positional Encodings for Vision Transformer
- BoTNet: Bottleneck Transformers for Visual Recognition
- ConTNet: Why not use convolution and transformer at the same time?
- CoAtNet: Marrying Convolution and Attention for All Data Sizes
- MobileViT: Light-weight, General-purpose, and Mobile-friendly Vision Transformer
- Shuffle Transformer: Rethinking Spatial Shuffle for Vision Transformer
- CeiT: Incorporating Convolution Designs into Visual Transformers (LCA, LeFF)
- Compact Transformers: Escaping the Big Data Paradigm with Compact Transformers
- TransCNN: Transformer in Convolutional Neural Networks
- DVT: Not All Images are Worth 16x16 Words: Dynamic Transformers for Efficient Image Recognition
- Conv2Former: A Simple Transformer-Style ConvNet for Visual Recognition
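Several entries above localize self-attention to windows (Swin Transformer, Shuffle Transformer). A minimal NumPy sketch of the two core steps, window partitioning and scaled dot-product attention within each window (identity Q/K/V projections and no relative position bias, purely for brevity):

```python
import numpy as np

def window_partition(x, ws):
    """Split an (H, W, C) feature map into non-overlapping ws x ws
    windows, returning (num_windows, ws * ws, C) token groups."""
    h, w, c = x.shape
    x = x.reshape(h // ws, ws, w // ws, ws, c)
    return x.transpose(0, 2, 1, 3, 4).reshape(-1, ws * ws, c)

def window_self_attention(tokens):
    """Scaled dot-product self-attention applied independently to
    each window; tokens: (num_windows, ws * ws, C)."""
    d = tokens.shape[-1]
    scores = tokens @ tokens.transpose(0, 2, 1) / np.sqrt(d)
    scores -= scores.max(axis=-1, keepdims=True)  # numerical stability
    attn = np.exp(scores)
    attn /= attn.sum(axis=-1, keepdims=True)      # softmax over keys
    return attn @ tokens
```

Restricting attention to windows makes the cost linear in image size rather than quadratic; Swin's shifted windows then let information flow between neighboring windows across layers.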
Dynamic Networks
- Dynamic Neural Networks: A Survey
- DyNet: Dynamic Convolution for Accelerating Convolutional Neural Networks
- Dynamic Convolution: Attention over Convolution Kernels | [Dynamic-convolution-Pytorch](https://github.com/kaijieshi7/Dynamic-convolution-Pytorch)
- WeightNet: Revisiting the Design Space of Weight Networks
- Dynamic Filter Networks
- Dynamic deep neural networks: Optimizing accuracy-efficiency trade-offs by selective execution
- SkipNet: Learning Dynamic Routing in Convolutional Networks
- Pay Less Attention with Lightweight and Dynamic Convolutions
- Unified Dynamic Convolutional Network for Super-Resolution with Variational Degradations
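A recurring idea in this section (Dynamic Convolution, CondConv, DyNet) is to aggregate several candidate kernels with input-dependent attention weights before applying a single convolution. A minimal NumPy sketch for the 1x1 case; the shapes and the global-average-pool attention head are illustrative assumptions, not any listed repo's code:

```python
import numpy as np

def softmax(z):
    e = np.exp(z - z.max())
    return e / e.sum()

def dynamic_pointwise_conv(x, kernels, w_attn):
    """Toy dynamic 1x1 convolution: pick a per-input mixture of K
    candidate kernels, then convolve once with the mixed kernel.

    x: (C_in, H, W); kernels: (K, C_out, C_in); w_attn: (K, C_in)
    returns: (C_out, H, W)
    """
    ctx = x.mean(axis=(1, 2))                   # global context: (C_in,)
    alpha = softmax(w_attn @ ctx)               # (K,) attention over kernels
    w = np.tensordot(alpha, kernels, axes=1)    # mixed kernel: (C_out, C_in)
    return np.tensordot(w, x, axes=([1], [0]))  # pointwise conv: (C_out, H, W)
```

Because the kernels are mixed before the convolution, inference costs one convolution plus a tiny attention head, rather than K separate convolutions.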