https://github.com/sayakpaul/vision-transformers-tf

A non-exhaustive collection of vision transformer models implemented in TensorFlow.

inductive-biases keras recognition segmentation tensorflow transformers vision

Host: GitHub
URL: https://github.com/sayakpaul/vision-transformers-tf
Owner: sayakpaul
License: apache-2.0
Created: 2022-09-25T08:59:20.000Z (over 2 years ago)
Default Branch: main
Last Pushed: 2022-09-25T09:07:58.000Z (over 2 years ago)
Last Synced: 2025-01-10T17:47:34.511Z (24 days ago)
Topics: inductive-biases, keras, recognition, segmentation, tensorflow, transformers, vision
Homepage:
Size: 6.84 KB
Stars: 10
Watchers: 2
Forks: 1
Open Issues: 0
Metadata Files:
- Readme: README.md
- License: LICENSE

README

# vision-transformers-tf

This repository contains a **non-exhaustive** collection of vision transformer models that I implemented in TensorFlow. These are not to be confused with the model from the original Vision Transformer (ViT) paper [1] alone: the architectures listed here are called vision transformers in the broader sense that each of them uses Transformers in some way for the vision modality.
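
For readers new to the idea, here is a rough sketch of the step that the title of [1] alludes to: cutting an image into fixed-size patches so that each patch can be treated as one token in a sequence. It is a plain-Python illustration and not code from this repository or any of the linked implementations. The function name `patchify`, the 224 x 224 RGB input size, and the nested-list image layout are all assumptions made for the example.

```python
# Illustrative sketch only: plain-Python patch extraction, not code from this repository.

def patchify(image, patch_size):
    """Split an H x W x C nested-list image into flattened, non-overlapping patches.

    Returns a list of patches in row-major order; each patch is a flat list
    of length patch_size * patch_size * C.
    """
    height, width = len(image), len(image[0])
    if height % patch_size or width % patch_size:
        raise ValueError("image dimensions must be divisible by patch_size")

    patches = []
    for top in range(0, height, patch_size):
        for left in range(0, width, patch_size):
            # Flatten one patch: rows, then pixels within a row, then channels.
            patch = [
                channel
                for row in image[top:top + patch_size]
                for pixel in row[left:left + patch_size]
                for channel in pixel
            ]
            patches.append(patch)
    return patches


# A dummy 224 x 224 RGB image (all zeros); the size is an assumption for this example.
image = [[[0.0, 0.0, 0.0] for _ in range(224)] for _ in range(224)]
tokens = patchify(image, patch_size=16)
print(len(tokens), len(tokens[0]))  # 196 768
```

With 16 x 16 patches, a 224 x 224 image yields a 14 x 14 grid, so 196 tokens of 768 values each. A real model would then project each flattened patch to an embedding vector and feed the sequence to Transformer blocks; the architectures below differ in how they do this.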

## Transformer (-inspired) architectures for Computer Vision

* [ViT](https://github.com/sayakpaul/probing-vits/blob/main/notebooks/load-jax-weights-vitb16.ipynb)

* [DeiT](https://github.com/sayakpaul/deit-tf/)

* [Swin](https://github.com/sayakpaul/swin-transformers-tf)

* [CaiT](https://github.com/sayakpaul/cait-tf)

* [MobileViT](https://huggingface.co/docs/transformers/main/en/model_doc/mobilevit)

* [CCT](https://keras.io/examples/vision/cct)

* [ViT MAE (with Aritra Roy Gosthipaty)](https://huggingface.co/docs/transformers/main/en/model_doc/vit_mae)

* [ViT MSN](https://huggingface.co/docs/transformers/main/en/model_doc/vit_msn)

* [ViT data2vec](https://huggingface.co/docs/transformers/main/en/model_doc/data2vec)

* [SegFormer](https://huggingface.co/docs/transformers/main/en/model_doc/segformer)

## References

[1] Dosovitskiy, Alexey, et al. "An Image Is Worth 16x16 Words: Transformers for Image Recognition at Scale." arXiv:2010.11929, 3 June 2021. https://doi.org/10.48550/arXiv.2010.11929.