Ecosyste.ms: Awesome
An open API service indexing awesome lists of open source software.
https://github.com/lijiaman/awesome-transformer-for-vision
List: awesome-transformer-for-vision
- Host: GitHub
- URL: https://github.com/lijiaman/awesome-transformer-for-vision
- Owner: lijiaman
- Created: 2021-01-02T05:15:33.000Z (almost 4 years ago)
- Default Branch: main
- Last Pushed: 2021-03-22T06:02:30.000Z (over 3 years ago)
- Last Synced: 2024-05-19T18:58:43.111Z (7 months ago)
- Size: 4.88 KB
- Stars: 280
- Watchers: 15
- Forks: 22
- Open Issues: 1
Metadata Files:
- Readme: README.md
- Contributing: contributing.md
Awesome Lists containing this project
- awesomeai - Awesome Transformer for Vision Resources List
- awesome-ai-awesomeness - Awesome Transformer for Vision Resources List
- Awesome-Transformer-Attention - Awesome Transformer for Vision Resources List (GitHub)
README
# Awesome Transformer for Vision Resources List [![Awesome](https://cdn.rawgit.com/sindresorhus/awesome/d7305f38d29fed78fa85652e3a63e154dd8e8829/media/badge.svg)](https://github.com/sindresorhus/awesome)
> A curated list of papers & resources linked to Transformer-based research mainly for vision and graphics tasks.
## Contents
- [Papers](#papers)
  - [Original Paper](#papers-ori)
  - [2D Vision Tasks](#papers-2d)
    - [Classification](#papers-classification)
    - [Detection](#papers-detection)
    - [Segmentation](#papers-segmentation)
    - [Tracking](#papers-tracking)
    - [Image Synthesis](#papers-image-synthesis)
    - [Action Understanding](#papers-action)
  - [3D Vision Tasks](#papers-3d)
    - [Point Cloud Processing](#papers-point-cloud)
    - [Motion Modeling](#papers-motion)
    - [Human Body Modeling](#papers-body)
  - [Others](#papers-others)
    - [Music Modeling](#papers-music)
- [Contributing](#contributing)
## Papers <a name="papers"></a>

### Original Paper <a name="papers-ori"></a>

- [Attention Is All You Need](https://papers.nips.cc/paper/2017/file/3f5ee243547dee91fbd053c1c4a845aa-Paper.pdf). Ashish Vaswani*, Noam Shazeer*, Niki Parmar*, Jakob Uszkoreit*, Llion Jones*, Aidan N. Gomez*, Łukasz Kaiser*, Illia Polosukhin*. NeurIPS 2017.
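As a quick illustration of the core operation this paper introduces (not part of the original list), here is a minimal NumPy sketch of scaled dot-product attention, `softmax(QKᵀ/√d_k)V`; function and variable names are my own:

```python
import numpy as np

def scaled_dot_product_attention(Q, K, V):
    """Minimal sketch: Attention(Q, K, V) = softmax(Q K^T / sqrt(d_k)) V."""
    d_k = K.shape[-1]
    scores = Q @ K.T / np.sqrt(d_k)                  # (n_q, n_k) similarity logits
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)   # numerically stable row softmax
    return weights @ V                               # attention-weighted sum of values

rng = np.random.default_rng(0)
Q, K, V = rng.normal(size=(4, 8)), rng.normal(size=(6, 8)), rng.normal(size=(6, 8))
out = scaled_dot_product_attention(Q, K, V)
print(out.shape)  # (4, 8): one d_v-dimensional output per query
```

The full paper wraps this in multi-head projections, masking, and residual/layer-norm blocks; this sketch only shows the single-head core.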
### 2D Vision Tasks <a name="papers-2d"></a>

#### Classification <a name="papers-classification"></a>

- [An Image is Worth 16x16 Words: Transformers for Image Recognition at Scale](https://arxiv.org/pdf/2010.11929.pdf). Alexey Dosovitskiy*, Lucas Beyer*, Alexander Kolesnikov*, Dirk Weissenborn*, Xiaohua Zhai*, Thomas Unterthiner, Mostafa Dehghani, Matthias Minderer, Georg Heigold, Sylvain Gelly, Jakob Uszkoreit, Neil Houlsby. arXiv 2020.
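The title describes ViT's key preprocessing step: the image is cut into fixed-size patches, each flattened into a vector and treated as a token. A minimal NumPy sketch of that step (names are my own, not from the paper's code):

```python
import numpy as np

def patchify(image, patch=16):
    """Split an (H, W, C) image into non-overlapping patch tokens of shape (p*p*C,)."""
    H, W, C = image.shape
    assert H % patch == 0 and W % patch == 0, "image must tile evenly into patches"
    # (H/p, p, W/p, p, C) -> (H/p, W/p, p, p, C) -> (num_patches, p*p*C)
    x = image.reshape(H // patch, patch, W // patch, patch, C)
    x = x.transpose(0, 2, 1, 3, 4)
    return x.reshape(-1, patch * patch * C)

tokens = patchify(np.zeros((224, 224, 3)))
print(tokens.shape)  # (196, 768): 14*14 patches, each 16*16*3 values
```

In the paper these flattened patches are then linearly projected, given position embeddings, and fed to a standard Transformer encoder.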
#### Detection <a name="papers-detection"></a>

- [Fast Convergence of DETR with Spatially Modulated Co-Attention](https://arxiv.org/pdf/2101.07448.pdf). Peng Gao, Minghang Zheng, Xiaogang Wang, Jifeng Dai, Hongsheng Li. arXiv 2021.
- [End-to-End Object Detection with Adaptive Clustering Transformer](https://arxiv.org/pdf/2011.09315.pdf). Minghang Zheng, Peng Gao, Xiaogang Wang, Hongsheng Li, Hao Dong. arXiv 2020.
- [Toward Transformer-Based Object Detection](https://arxiv.org/pdf/2012.09958.pdf). Josh Beal*, Eric Kim*, Eric Tzeng, Dong Huk Park, Andrew Zhai, Dmitry Kislyuk. arXiv 2020.
- [Rethinking Transformer-based Set Prediction for Object Detection](https://arxiv.org/pdf/2011.10881.pdf). Zhiqing Sun*, Shengcao Cao*, Yiming Yang, Kris Kitani. arXiv 2020.
- [UP-DETR: Unsupervised Pre-training for Object Detection with Transformers](https://arxiv.org/pdf/2011.09094.pdf). Zhigang Dai, Bolun Cai, Yugeng Lin, Junying Chen. arXiv 2020.
- [Deformable DETR: Deformable Transformers for End-to-End Object Detection](https://arxiv.org/pdf/2010.04159.pdf). Xizhou Zhu*, Weijie Su*, Lewei Lu, Bin Li, Xiaogang Wang, Jifeng Dai. arXiv 2020.
- [End-to-End Object Detection with Transformers](https://arxiv.org/pdf/2005.12872.pdf). Nicolas Carion*, Francisco Massa*, Gabriel Synnaeve, Nicolas Usunier, Alexander Kirillov, Sergey Zagoruyko. ECCV 2020.
#### Segmentation <a name="papers-segmentation"></a>

- [Rethinking Semantic Segmentation from a Sequence-to-Sequence Perspective with Transformers](https://arxiv.org/pdf/2012.15840.pdf). Sixiao Zheng, Jiachen Lu, Hengshuang Zhao, Xiatian Zhu, Zekun Luo, Yabiao Wang, Yanwei Fu, Jianfeng Feng, Tao Xiang, Philip H.S. Torr, Li Zhang. arXiv 2020.
- [End-to-End Video Instance Segmentation with Transformers](https://arxiv.org/pdf/2011.14503.pdf). Yuqing Wang, Zhaoliang Xu, Xinlong Wang, Chunhua Shen, Baoshan Cheng, Hao Shen, Huaxia Xia. arXiv 2020.
#### Tracking <a name="papers-tracking"></a>

- [TransTrack: Multiple-Object Tracking with Transformer](https://arxiv.org/pdf/2012.15460.pdf). Peize Sun, Yi Jiang, Rufeng Zhang, Enze Xie, Jinkun Cao, Xinting Hu, Tao Kong, Zehuan Yuan, Changhu Wang, Ping Luo. arXiv 2020.
#### Image Synthesis <a name="papers-image-synthesis"></a>

- [Taming Transformers for High-Resolution Image Synthesis](https://arxiv.org/pdf/2012.09841.pdf). Patrick Esser*, Robin Rombach*, Björn Ommer. arXiv 2020.
#### Action Understanding <a name="papers-action"></a>

- [Video Action Transformer Network](https://arxiv.org/pdf/1812.02707.pdf). Rohit Girdhar, Joao Carreira, Carl Doersch, Andrew Zisserman. CVPR 2019.
### 3D Vision Tasks <a name="papers-3d"></a>

#### Point Cloud Processing <a name="papers-point-cloud"></a>

- [PCT: Point Cloud Transformer](https://arxiv.org/pdf/2012.09688.pdf). Meng-Hao Guo, Jun-Xiong Cai, Zheng-Ning Liu, Tai-Jiang Mu, Ralph R. Martin, Shi-Min Hu. arXiv 2020.
- [Point Transformer](https://arxiv.org/pdf/2012.09164.pdf). Hengshuang Zhao, Li Jiang, Jiaya Jia, Philip Torr, Vladlen Koltun. arXiv 2020.
#### Motion Modeling <a name="papers-motion"></a>

- [Learning to Generate Diverse Dance Motions with Transformer](https://arxiv.org/pdf/2008.08171.pdf). Jiaman Li, Yihang Yin, Hang Chu, Yi Zhou, Tingwu Wang, Sanja Fidler, Hao Li. arXiv 2020.
- [A Spatio-temporal Transformer for 3D Human Motion Prediction](https://arxiv.org/pdf/2004.08692.pdf). Emre Aksan*, Peng Cao*, Manuel Kaufmann, Otmar Hilliges. arXiv 2020.
#### Human Body Modeling <a name="papers-body"></a>

- [End-to-End Human Pose and Mesh Reconstruction with Transformers](https://arxiv.org/pdf/2012.09760.pdf). Kevin Lin, Lijuan Wang, Zicheng Liu. arXiv 2020.
### Others <a name="papers-others"></a>

#### Music Modeling <a name="papers-music"></a>

- [Music Transformer: Generating Music with Long-Term Structure](https://arxiv.org/pdf/1809.04281.pdf). Cheng-Zhi Anna Huang, Ashish Vaswani, Jakob Uszkoreit, Noam Shazeer, Ian Simon, Curtis Hawthorne, Andrew M. Dai, Matthew D. Hoffman, Monica Dinculescu, Douglas Eck. arXiv 2018.
# Contributing
Please see [CONTRIBUTING](contributing.md) for details.