https://github.com/emla2805/vision-transformer

TensorFlow implementation of the Vision Transformer (An Image is Worth 16x16 Words: Transformers for Image Recognition at Scale)

Host: GitHub
URL: https://github.com/emla2805/vision-transformer
Owner: emla2805
Created: 2020-10-07T18:58:50.000Z (almost 6 years ago)
Default Branch: master
Last Pushed: 2020-10-18T21:07:01.000Z (over 5 years ago)
Last Synced: 2023-11-07T18:12:47.820Z (over 2 years ago)
Topics: computer-vision, tensorflow, transformer, vision-transformer
Language: Python
Homepage: https://openreview.net/pdf?id=YicbFdNTTy
Size: 328 KB
Stars: 196
Watchers: 3
Forks: 63
Open Issues: 3
Metadata Files:
- Readme: README.md


# Vision Transformer (ViT)

TensorFlow implementation of the Vision Transformer (ViT) presented in [An Image is Worth 16x16 Words: Transformers for Image Recognition at Scale](https://openreview.net/pdf?id=YicbFdNTTy), where the authors show that Transformers applied directly to sequences of image patches, and pre-trained on large datasets, perform very well on image classification.
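
To make "applied directly to image patches" concrete, the following is a minimal sketch of the patch-splitting step, written in plain Python on nested lists so it runs without any dependencies. The function name `split_into_patches` is hypothetical and this is not the code used in this repository, which works on TensorFlow tensors; it only illustrates how an image becomes a sequence of flattened patches.

```python
def split_into_patches(image, patch_size):
    """Split an H x W x C nested-list image into flattened, non-overlapping patches.

    Hypothetical helper for illustration only; not part of this repository.
    """
    height, width = len(image), len(image[0])
    if height % patch_size or width % patch_size:
        raise ValueError("image sides must be divisible by patch_size")

    patches = []
    # Walk the image in row-major order, one patch at a time.
    for top in range(0, height, patch_size):
        for left in range(0, width, patch_size):
            # Flatten the patch_size x patch_size x C block into one vector.
            patch = [
                value
                for row in image[top:top + patch_size]
                for pixel in row[left:left + patch_size]
                for value in pixel
            ]
            patches.append(patch)
    return patches


# A toy 4 x 4 RGB image whose pixels are all zeros.
image = [[[0, 0, 0] for _ in range(4)] for _ in range(4)]
patches = split_into_patches(image, patch_size=2)
print(len(patches), len(patches[0]))  # 4 patches, each of length 2 * 2 * 3 = 12
```

With the sizes from the paper's title, a 224 x 224 RGB image and 16 x 16 patches give (224 / 16)² = 196 patches of 16 · 16 · 3 = 768 values each. In the paper, each flattened patch is then linearly projected and combined with a position embedding before entering the Transformer encoder.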



    



## Install dependencies

Create a Python 3 virtual environment and activate it:

```bash
virtualenv -p python3 venv
source ./venv/bin/activate
```

Next, install the required dependencies:

```bash
pip install -r requirements.txt
```
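
After installing, it can be useful to confirm that the packages are importable from the active virtual environment. The sketch below uses only the standard library; the helper name `missing_packages` is hypothetical, and the assumption that `tensorflow` is among the requirements comes from the project description rather than from the contents of `requirements.txt`.

```python
import importlib.util


def missing_packages(names):
    """Return the names from `names` that cannot be found as importable modules."""
    return [name for name in names if importlib.util.find_spec(name) is None]


# "tensorflow" is an assumption about what requirements.txt installs.
print(missing_packages(["tensorflow"]))
```

An empty list means every listed module was found. Note that import names can differ from the package names given to `pip`.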

## Train model

Start the model training by running:

```bash
python train.py --logdir path/to/log/dir
```

To track metrics, start TensorBoard:

```bash
tensorboard --logdir path/to/log/dir
```

Then open [localhost:6006](http://localhost:6006) in a browser.
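
When training several times, one common convention is to give each run its own subdirectory under a shared log directory, since TensorBoard displays subdirectories of `--logdir` as separate runs. The following is a minimal sketch of generating such a path; `make_run_dir` is a hypothetical helper, and whether `train.py` already creates per-run subdirectories is not stated here, so treat this as an optional convention rather than part of the project.

```python
from datetime import datetime
from pathlib import Path


def make_run_dir(base_logdir, now=None):
    """Build a timestamped run directory path under `base_logdir`.

    Hypothetical helper; only builds the path and does not create it on disk.
    """
    now = now or datetime.now()
    return Path(base_logdir) / now.strftime("%Y%m%d-%H%M%S")


# A fixed timestamp is passed here so the printed result is reproducible.
run_dir = make_run_dir("path/to/log/dir", now=datetime(2020, 10, 18, 21, 7, 1))
print(run_dir.as_posix())  # path/to/log/dir/20201018-210701
```

The resulting path can be passed to `python train.py --logdir`, while `tensorboard --logdir` is pointed at the shared base directory to compare runs side by side.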

## Citation

```bibtex
@inproceedings{
    anonymous2021an,
    title={An Image is Worth 16x16 Words: Transformers for Image Recognition at Scale},
    author={Anonymous},
    booktitle={Submitted to International Conference on Learning Representations},
    year={2021},
    url={https://openreview.net/forum?id=YicbFdNTTy},
    note={under review}
}
```
