https://github.com/sayakpaul/vision-transformers-tf
A non-exhaustive collection of vision transformer models implemented in TensorFlow.
- Host: GitHub
- URL: https://github.com/sayakpaul/vision-transformers-tf
- Owner: sayakpaul
- License: apache-2.0
- Created: 2022-09-25T08:59:20.000Z (about 2 years ago)
- Default Branch: main
- Last Pushed: 2022-09-25T09:07:58.000Z (about 2 years ago)
- Last Synced: 2024-10-04T18:34:53.714Z (about 1 month ago)
- Topics: inductive-biases, keras, recognition, segmentation, tensorflow, transformers, vision
- Homepage:
- Size: 6.84 KB
- Stars: 10
- Watchers: 2
- Forks: 1
- Open Issues: 0
Metadata Files:
- Readme: README.md
- License: LICENSE
README
# vision-transformers-tf
This repository contains a **non-exhaustive** collection of vision transformer models that I have implemented in TensorFlow. The name is not meant to be confused with the original Vision Transformer (ViT) paper [1]: the architectures collected here are generally referred to as Vision Transformers because they use Transformers, in one way or another, for the vision modality.
## Transformer (-inspired) architectures for Computer Vision
* [ViT](https://github.com/sayakpaul/probing-vits/blob/main/notebooks/load-jax-weights-vitb16.ipynb)
* [DeiT](https://github.com/sayakpaul/deit-tf/)
* [Swin](https://github.com/sayakpaul/swin-transformers-tf)
* [CaiT](https://github.com/sayakpaul/cait-tf)
* [MobileViT](https://huggingface.co/docs/transformers/main/en/model_doc/mobilevit)
* [CCT](https://keras.io/examples/vision/cct)
* [ViT MAE (with Aritra Roy Gosthipaty)](https://huggingface.co/docs/transformers/main/en/model_doc/vit_mae)
* [ViT MSN](https://huggingface.co/docs/transformers/main/en/model_doc/vit_msn)
* [ViT data2vec](https://huggingface.co/docs/transformers/main/en/model_doc/data2vec)
* [SegFormer](https://huggingface.co/docs/transformers/main/en/model_doc/segformer)

## References
[1] Dosovitskiy, Alexey, et al. "An Image Is Worth 16x16 Words: Transformers for Image Recognition at Scale." arXiv:2010.11929, 3 June 2021. https://doi.org/10.48550/arXiv.2010.11929