Ecosyste.ms: Awesome
An open API service indexing awesome lists of open source software.
awesome-transformer-search
A curated list of awesome resources combining Transformers with Neural Architecture Search
https://github.com/automl/awesome-transformer-search
Last synced: 5 days ago
General Transformer Search
- Few-shot Task-agnostic Neural Architecture Search for Distilling Large Language Models
- Training Free Transformer Architecture Search
- LiteTransformerSearch: Training-free On-device Search for Efficient Autoregressive Language Models
- Searching the Search Space of Vision Transformer
- UniNet: Unified Architecture Search with Convolutions, Transformer and MLP
- Analyzing and Mitigating Interference in Neural Architecture Search
- BossNAS: Exploring Hybrid CNN-transformers with Block-wisely Self-supervised Neural Architecture Search
- Memory-Efficient Differentiable Transformer Architecture Search (IJCNLP'21; MSR, Peking University)
- Finding Fast Transformers: One-Shot Neural Architecture Search by Component Composition
- AutoTrans: Automating Transformer Design via Reinforced Architecture Search
- NASABN: A Neural Architecture Search Framework for Attention-Based Networks
- NAT: Neural Architecture Transformer for Accurate and Compact Architectures
- The Evolved Transformer
Domain Specific Transformer Search
Vision
- NASViT: Neural Architecture Search for Efficient Vision Transformers with Gradient Conflict aware Supernet Training
- AutoFormer: Searching Transformers for Visual Recognition
- GLiT: Neural Architecture Search for Global and Local Image Transformer
- Searching for Efficient Multi-Stage Vision Transformers
- HR-NAS: Searching Efficient High-Resolution Neural Architectures with Lightweight Transformers
- αNAS: Neural Architecture Search using Property Guided Synthesis
Natural Language Processing
- AutoBERT-Zero: Evolving the BERT backbone from scratch
- Primer: Searching for Efficient Transformers for Language Modeling
- AutoTinyBERT: Automatic Hyper-parameter Optimization for Efficient Pre-trained Language Models
- NAS-BERT: Task-Agnostic and Adaptive-Size BERT Compression with Neural Architecture Search
- HAT: Hardware-Aware Transformers for Efficient Natural Language Processing
Automatic Speech Recognition
Transformers Knowledge: Insights, Searchable parameters, Attention
- RWKV: Reinventing RNNs for the Transformer Era
- Patches Are All You Need?
- Separable Self-Attention for Mobile Vision Transformers
- Parameter-efficient Fine-tuning for Vision Transformers
- EfficientFormer: Vision Transformers at MobileNet Speed
- Neighborhood Attention Transformer
- Training Compute-Optimal Large Language Models
- CMT: Convolutional Neural Networks meet Vision Transformers
- Patch Slimming for Efficient Vision Transformers
- Lite Vision Transformer with Enhanced Self-Attention
- TubeDETR: Spatio-Temporal Video Grounding with Transformers
- Beyond Fixation: Dynamic Window Visual Transformer
- BEiT: BERT Pre-Training of Image Transformers
- How Do Vision Transformers Work?
- Scale Efficiently: Insights from Pretraining and FineTuning Transformers
- Tuformer: Data-Driven Design of Expressive Transformer by Tucker Tensor Representation
- DictFormer: Tiny Transformer with Shared Dictionary
- QuadTree Attention for Vision Transformers
- Expediting Vision Transformers via Token Reorganization
- UniFormer: Unified Transformer for Efficient Spatial-Temporal Representation Learning (SenseTime)
- Hierarchical Transformers Are More Efficient Language Models
- Transformer in Transformer
- Long-Short Transformer: Efficient Transformers for Language and Vision
- Memory-efficient Transformers via Top-k Attention
- Swin Transformer: Hierarchical Vision Transformer using Shifted Windows
- Rethinking Spatial Dimensions of Vision Transformers
- What makes for hierarchical vision transformers
- AutoAttend: Automated Attention Representation Search
- Rethinking Attention with Performers
- LambdaNetworks: Modeling long-range Interactions without Attention
- HyperGrid Transformers
- LocalViT: Bringing Locality to Vision Transformers
- Compressive Transformers for Long Range Sequence Modelling
- Improving Transformer Models by Reordering their Sublayers
- Analyzing Multi-Head Self-Attention: Specialized Heads Do the Heavy Lifting, the Rest Can Be Pruned
Transformer Surveys
Foundation Models
Misc resources
Keywords
- visual-transformer (2)
- transformer-with-cv (2)
- transformer-cv (2)
- transformer-awesome (2)
- transformer (2)
- detr (2)
- attention-mechanism (1)
- attention-mechanisms (1)
- awesome-list (1)
- computer-vision (1)
- deep-learning (1)
- papers (1)
- self-attention (1)
- transformer-architecture (1)
- transformer-models (1)
- transformers (1)
- vision-transformer (1)
- vit (1)