https://github.com/emla2805/vision-transformer
TensorFlow implementation of the Vision Transformer (An Image is Worth 16x16 Words: Transformers for Image Recognition at Scale)
- Host: GitHub
- URL: https://github.com/emla2805/vision-transformer
- Owner: emla2805
- Created: 2020-10-07T18:58:50.000Z (over 5 years ago)
- Default Branch: master
- Last Pushed: 2020-10-18T21:07:01.000Z (over 5 years ago)
- Last Synced: 2023-11-07T18:12:47.820Z (over 2 years ago)
- Topics: computer-vision, tensorflow, transformer, vision-transformer
- Language: Python
- Homepage: https://openreview.net/pdf?id=YicbFdNTTy
- Size: 328 KB
- Stars: 196
- Watchers: 3
- Forks: 63
- Open Issues: 3
Metadata Files:
- Readme: README.md
README
# Vision Transformer (ViT)
TensorFlow implementation of the Vision Transformer (ViT) presented in
[An Image is Worth 16x16 Words: Transformers for Image Recognition at Scale](https://openreview.net/pdf?id=YicbFdNTTy),
where the authors show that Transformers applied directly to sequences of image patches and pre-trained on large datasets perform very well on image classification.
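
To make "applied directly to image patches" concrete, here is a minimal NumPy sketch of how an image can be cut into non-overlapping patches and flattened into a sequence of vectors. It is an illustration under stated assumptions, not this repository's code: the function name `image_to_patches` and the 32x32 example input are hypothetical, and the learned linear projection and position embeddings that the model applies afterwards are left out.

```python
import numpy as np


def image_to_patches(image: np.ndarray, patch_size: int) -> np.ndarray:
    """Split an (H, W, C) image into flattened, non-overlapping patches.

    Returns an array of shape (num_patches, patch_size * patch_size * C),
    with patches ordered row by row starting at the top-left corner.
    """
    height, width, channels = image.shape
    if height % patch_size or width % patch_size:
        raise ValueError("image height and width must be divisible by patch_size")
    rows, cols = height // patch_size, width // patch_size
    # (H, W, C) -> (rows, P, cols, P, C): split each spatial axis into blocks.
    blocks = image.reshape(rows, patch_size, cols, patch_size, channels)
    # Move the two block indices to the front: (rows, cols, P, P, C).
    blocks = blocks.transpose(0, 2, 1, 3, 4)
    # Flatten every P x P x C block into a single vector (one "word").
    return blocks.reshape(rows * cols, patch_size * patch_size * channels)


# Hypothetical input: a random 32x32 RGB image split into 16x16 patches.
image = np.random.rand(32, 32, 3)
patches = image_to_patches(image, patch_size=16)
print(patches.shape)  # (4, 768)
```

With 16x16 patches, a 32x32 RGB image yields 2 x 2 = 4 patches of 16 x 16 x 3 = 768 values each, so the Transformer sees a sequence of length 4 rather than a grid of pixels.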
## Install dependencies
Create a Python 3 virtual environment and activate it:
```bash
virtualenv -p python3 venv
source ./venv/bin/activate
```
Next, install the required dependencies:
```bash
pip install -r requirements.txt
```
## Train model
Start the model training by running:
```bash
python train.py --logdir path/to/log/dir
```
To track metrics, start TensorBoard
```bash
tensorboard --logdir path/to/log/dir
```
and then go to [localhost:6006](http://localhost:6006).
## Citation
```bibtex
@inproceedings{
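dosovitskiy2021an,
title={An Image is Worth 16x16 Words: Transformers for Image Recognition at Scale},
author={Alexey Dosovitskiy and Lucas Beyer and Alexander Kolesnikov and Dirk Weissenborn and Xiaohua Zhai and Thomas Unterthiner and Mostafa Dehghani and Matthias Minderer and Georg Heigold and Sylvain Gelly and Jakob Uszkoreit and Neil Houlsby},
booktitle={International Conference on Learning Representations},
year={2021},
url={https://openreview.net/forum?id=YicbFdNTTy}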
}
```