https://github.com/lucasjinreal/awesome_transformer

A curated list of transformer learning materials, shared blogs, technical reviews.

# Awesome Transformer [![Awesome](https://cdn.rawgit.com/sindresorhus/awesome/d7305f38d29fed78fa85652e3a63e154dd8e8829/media/badge.svg)](https://github.com/sindresorhus/awesome)

A curated list of all kinds of transformers, including some personal experiment results, applications, and thoughts from industry.

## Updates

- **2021.02.20**: GitHub Discussions is now open for this repository, so discussion about transformers can start there.

## Blogs

- [Attention is All You Need](https://arxiv.org/abs/1706.03762)
- [Chinese Blog] A 30,000-character long read for an easy introduction to vision transformers [[Link](https://zhuanlan.zhihu.com/p/308301901)]
- Transformers in Vision: A Survey [[paper](https://arxiv.org/abs/2101.01169)] - 2021.01.04
- A Survey on Visual Transformer [[paper](https://arxiv.org/abs/2012.12556)] - 2020.12.24
- [Chinese Blog] A 10,000-character long read on ViT, the hot topic of 2021 [[Link](https://zhuanlan.zhihu.com/p/342512339)]
- [Chinese Blog] Exploring linear attention: does attention have to have a softmax?

## Standalone GitHub Repos

- [simpletransformers](https://github.com/ThilinaRajapakse/simpletransformers): Transformers for Classification, NER, QA, Language Modelling, Language Generation, T5, Multi-Modal, and Conversational AI

## arXiv papers

- Training Vision Transformers for Image Retrieval [[paper](https://arxiv.org/abs/2102.05644)]
- **[TransReID]** TransReID: Transformer-based Object Re-Identification [[paper](https://arxiv.org/abs/2102.04378)]
- **[VTN]** Video Transformer Network [[paper](https://arxiv.org/abs/2102.00719)]
- **[T2T-ViT]** Tokens-to-Token ViT: Training Vision Transformers from Scratch on ImageNet [[paper](https://arxiv.org/abs/2101.11986)] [[code](https://github.com/yitu-opensource/T2T-ViT)]
- **[BoTNet]** Bottleneck Transformers for Visual Recognition [[paper](https://arxiv.org/abs/2101.11605)]
- **[CPTR]** CPTR: Full Transformer Network for Image Captioning [[paper](https://arxiv.org/abs/2101.10804)]
- Learn to Dance with AIST++: Music Conditioned 3D Dance Generation [[paper](https://arxiv.org/abs/2101.08779)] [[code](https://google.github.io/aichoreographer/)]
- **[Trans2Seg]** Segmenting Transparent Object in the Wild with Transformer [[paper](https://github.com/xieenze/Trans2Seg)] [[code](https://github.com/xieenze/Trans2Seg)]
- **[SMCA]** Fast Convergence of DETR with Spatially Modulated Co-Attention [[paper](https://arxiv.org/abs/2101.07448)]
- Investigating the Vision Transformer Model for Image Retrieval Tasks [[paper](https://arxiv.org/abs/2101.03771)]
- **[Trear]** Trear: Transformer-based RGB-D Egocentric Action Recognition [[paper](https://arxiv.org/abs/2101.03904)]
- **[VisTR]** End-to-End Video Instance Segmentation with Transformers [[paper](https://arxiv.org/abs/2011.14503)]
- **[VisualSparta]** VisualSparta: Sparse Transformer Fragment-level Matching for Large-scale Text-to-Image Search [[paper](https://arxiv.org/abs/2101.00265)]
- **[TrackFormer]** TrackFormer: Multi-Object Tracking with Transformers [[paper](https://arxiv.org/abs/2101.02702)]
- **[LETR]** Line Segment Detection Using Transformers without Edges [[paper](https://arxiv.org/abs/2101.01909)]
- **[TAPE]** Transformer Guided Geometry Model for Flow-Based Unsupervised Visual Odometry [[paper](https://arxiv.org/abs/2101.02143)]
- **[TRIQ]** Transformer for Image Quality Assessment [[paper](https://arxiv.org/abs/2101.01097)] [[code](https://github.com/junyongyou/triq)]
- **[TransTrack]** TransTrack: Multiple-Object Tracking with Transformer [[paper](https://arxiv.org/abs/2012.15460)] [[code](https://github.com/PeizeSun/TransTrack)]
- **[SETR]** Rethinking Semantic Segmentation from a Sequence-to-Sequence Perspective with Transformers [[paper](https://arxiv.org/abs/2012.15840)] [[code](https://fudan-zvg.github.io/SETR/)]
- **[TransPose]** TransPose: Towards Explainable Human Pose Estimation by Transformer [[paper](https://arxiv.org/abs/2012.14214)]
- **[DeiT]** Training data-efficient image transformers & distillation through attention [[paper](https://arxiv.org/abs/2012.12877)]
- **[Pointformer]** 3D Object Detection with Pointformer [[paper](https://arxiv.org/abs/2012.11409)]
- **[ViT-FRCNN]** Toward Transformer-Based Object Detection [[paper](https://arxiv.org/abs/2012.09958)]
- **[Taming-transformers]** Taming Transformers for High-Resolution Image Synthesis [[paper](https://arxiv.org/abs/2012.09841)] [[code](https://compvis.github.io/taming-transformers/)]
- **[SceneFormer]** SceneFormer: Indoor Scene Generation with Transformers [[paper](https://arxiv.org/abs/2012.09793)]
- **[PCT]** PCT: Point Cloud Transformer [[paper](https://arxiv.org/abs/2012.09688)]
- Transformer Interpretability Beyond Attention Visualization [[paper](https://arxiv.org/abs/2012.09838)] [[code](https://github.com/hila-chefer/Transformer-Explainability)]
- **[METRO]** End-to-End Human Pose and Mesh Reconstruction with Transformers [[paper](https://arxiv.org/abs/2012.09760)]
- **[PointTransformer]** Point Transformer [[paper](https://arxiv.org/abs/2012.09164)]
- **[PED]** DETR for Pedestrian Detection [[paper](https://arxiv.org/abs/2012.06785)]
- **[UP-DETR]** UP-DETR: Unsupervised Pre-training for Object Detection with Transformers [[paper](https://arxiv.org/abs/2011.09094)]
- **[LambdaNetworks]** LambdaNetworks: Modeling Long-Range Interactions Without Attention [[paper](https://openreview.net/pdf?id=xTJEN-ggl1b)] [[code](https://github.com/lucidrains/lambda-networks)]
- **[C-Tran]** General Multi-label Image Classification with Transformers [[paper](https://arxiv.org/abs/2011.14027)]
- **[TSP-FCOS]** Rethinking Transformer-based Set Prediction for Object Detection [[paper](https://arxiv.org/abs/2011.10881)]
- **[IPT]** Pre-Trained Image Processing Transformer [[paper](https://arxiv.org/abs/2012.00364)]
- **[ACT]** End-to-End Object Detection with Adaptive Clustering Transformer [[paper](https://arxiv.org/abs/2011.09315)]
- **[VTs]** Visual Transformers: Token-based Image Representation and Processing for Computer Vision [[paper](https://arxiv.org/abs/2006.03677)]

### 2021

- **[Vision Transformer]** An Image is Worth 16x16 Words: Transformers for Image Recognition at Scale (**ICLR**) [[paper](https://arxiv.org/abs/2010.11929)] [[code](https://github.com/google-research/vision_transformer)]
- **[Deformable DETR]** Deformable DETR: Deformable Transformers for End-to-End Object Detection (**ICLR**) [[paper](https://arxiv.org/abs/2010.04159)] [[code](https://github.com/fundamentalvision/Deformable-DETR)]
- **[LSTR]** End-to-end Lane Shape Prediction with Transformers (**WACV**) [[paper](https://arxiv.org/abs/2011.04233)] [[code](https://github.com/liuruijin17/LSTR)]

### 2020

- **[DETR]** End-to-End Object Detection with Transformers (**ECCV**) [[paper](https://arxiv.org/abs/2005.12872)] [[code](https://github.com/facebookresearch/detr)]
- **[FPT]** Feature Pyramid Transformer (**ECCV**) [[paper](https://arxiv.org/abs/2007.09451)] [[code](https://github.com/ZHANGDONG-NJUST/FPT)]
- **[TTSR]** Learning Texture Transformer Network for Image Super-Resolution (**CVPR**) [[paper](https://arxiv.org/abs/2006.04139)] [[code](https://github.com/researchmm/TTSR)]

## Reference

1. [origin](https://github.com/dk-liang/Awesome-Visual-Transformer)

## Copyright

Collected by Lucas Jin. 2021