Ecosyste.ms: Awesome
An open API service indexing awesome lists of open source software.
https://github.com/yingkaisha/keras-vision-transformer
The Tensorflow, Keras implementation of Swin-Transformer and Swin-UNET
keras swinunet tensorflow transformer vision-transformer
- Host: GitHub
- URL: https://github.com/yingkaisha/keras-vision-transformer
- Owner: yingkaisha
- License: mit
- Created: 2021-06-05T15:57:27.000Z (over 3 years ago)
- Default Branch: main
- Last Pushed: 2024-04-18T15:16:51.000Z (7 months ago)
- Last Synced: 2024-04-18T17:16:54.191Z (7 months ago)
- Topics: keras, swinunet, tensorflow, transformer, vision-transformer
- Language: Python
- Homepage:
- Size: 175 KB
- Stars: 110
- Watchers: 2
- Forks: 37
- Open Issues: 4
Metadata Files:
- Readme: README.md
- License: LICENSE
Awesome Lists containing this project
README
# keras-vision-transformer
This repository contains the `tensorflow.keras` implementation of the Swin Transformer (Liu et al., 2021) and its applications to benchmark datasets.
* Liu, Z., Lin, Y., Cao, Y., Hu, H., Wei, Y., Zhang, Z., Lin, S. and Guo, B., 2021. Swin transformer: Hierarchical vision transformer using shifted windows. arXiv preprint arXiv:2103.14030. https://arxiv.org/abs/2103.14030.
* Hu, C., Wang, Y., Chen, J., Jiang, D., Zhang, X., Tian, Q. and Wang, M., 2021. Swin-Unet: Unet-like Pure Transformer for Medical Image Segmentation. arXiv preprint arXiv:2105.05537. https://arxiv.org/abs/2105.05537.
# Notebooks
Note: the Swin-UNET implementation is experimental
* MNIST image classification with Swin Transformers [[link](https://github.com/yingkaisha/keras-vision-transformer/blob/main/examples/Swin_Transformer_MNIST.ipynb)]
* Oxford IIIT Pet image segmentation with Swin-UNET [[link](https://github.com/yingkaisha/keras-vision-transformer/blob/main/examples/Swin_UNET_oxford_iiit.ipynb)]

# Dependencies
* TensorFlow 2.5.0, Keras 2.5.0, NumPy 1.19.5.
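
A quick way to confirm a local environment matches these pinned versions (a minimal sketch; it assumes TensorFlow's bundled `tf.keras`, which is what the implementation targets, and says nothing about whether newer releases work):

```python
# Minimal environment check against the versions pinned above.
import numpy as np
import tensorflow as tf

print("TensorFlow:", tf.__version__)    # expected: 2.5.0
print("Keras:", tf.keras.__version__)   # expected: 2.5.0
print("NumPy:", np.__version__)         # expected: 1.19.5
```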
# Overview
Swin Transformers are Transformer-based computer vision models whose self-attention is computed within shifted local windows. Whereas other vision transformer variants compute self-attention globally over all embedded patches (tokens), the Swin Transformer restricts attention to non-overlapping windows and alternately shifts those windows between consecutive Transformer blocks, so information still propagates across window boundaries. Because the attention cost within each window is fixed by the window size, the overall cost scales with the number of windows rather than quadratically with the number of tokens, which makes Swin Transformers better suited to high-resolution images. They have proven effective for image classification, object detection, and semantic segmentation.
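
The windowing mechanism can be illustrated with a short, self-contained sketch. This is not code from this repository; the feature-map shape, the 4x4 window size, and the half-window shift via `tf.roll` are illustrative assumptions.

```python
# Sketch of (1) partitioning a feature map into non-overlapping windows so
# self-attention is computed per window, and (2) cyclically shifting the map
# so alternating blocks attend across the previous window boundaries.
import tensorflow as tf

def window_partition(x, window_size):
    """Split (B, H, W, C) feature maps into (num_windows*B, ws*ws, C) token groups."""
    B, H, W, C = x.shape
    x = tf.reshape(x, (B, H // window_size, window_size, W // window_size, window_size, C))
    x = tf.transpose(x, (0, 1, 3, 2, 4, 5))                   # group window rows/cols together
    return tf.reshape(x, (-1, window_size * window_size, C))  # one row of tokens per window

# Toy feature map: batch of 2, 8x8 spatial grid, 32 channels; 4x4 windows.
feat = tf.random.normal((2, 8, 8, 32))
windows = window_partition(feat, window_size=4)               # (8, 16, 32)

# "Shifted" variant: roll the map by half a window before partitioning, so
# tokens that sat on old window borders now share a window.
shifted = tf.roll(feat, shift=(-2, -2), axis=(1, 2))
shifted_windows = window_partition(shifted, window_size=4)    # (8, 16, 32)

# Self-attention is then applied independently within each window.
mha = tf.keras.layers.MultiHeadAttention(num_heads=4, key_dim=8)
out = mha(windows, windows)                                   # (8, 16, 32)
print(windows.shape, shifted_windows.shape, out.shape)
```

In the full model, the shifted branch also applies an attention mask so tokens wrapped around by the cyclic shift do not attend to each other; that detail is omitted here for brevity.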
# Contact
Yingkai (Kyle) Sha <> <>
This work benefited from:
* The official PyTorch implementation of the Swin Transformer [[link](https://github.com/microsoft/Swin-Transformer)].
* Swin-Transformer-TF [[link](https://github.com/rishigami/Swin-Transformer-TF)].

# License
[MIT License](https://github.com/yingkaisha/swin_transformer_keras/blob/main/LICENSE)