https://github.com/TACJu/TransFG

This is the official PyTorch implementation of the paper "TransFG: A Transformer Architecture for Fine-grained Recognition" (Ju He, Jie-Neng Chen, Shuai Liu, Adam Kortylewski, Cheng Yang, Yutong Bai, Changhu Wang, Alan Yuille).

Host: GitHub
URL: https://github.com/TACJu/TransFG
Owner: TACJu
License: mit
Created: 2021-03-28T10:33:00.000Z (about 4 years ago)
Default Branch: master
Last Pushed: 2022-10-04T06:39:47.000Z (over 2 years ago)
Last Synced: 2024-11-15T06:33:23.051Z (7 months ago)
Topics: fine-grained-recognition
Language: Python
Homepage:
Size: 728 KB
Stars: 388
Watchers: 5
Forks: 90
Open Issues: 31
Metadata Files:
- Readme: README.md
- License: LICENSE

# TransFG: A Transformer Architecture for Fine-grained Recognition

[![PWC](https://img.shields.io/endpoint.svg?url=https://paperswithcode.com/badge/transfg-a-transformer-architecture-for-fine/fine-grained-image-classification-on-cub-200)](https://paperswithcode.com/sota/fine-grained-image-classification-on-cub-200?p=transfg-a-transformer-architecture-for-fine) [![PWC](https://img.shields.io/endpoint.svg?url=https://paperswithcode.com/badge/transfg-a-transformer-architecture-for-fine/fine-grained-image-classification-on-nabirds)](https://paperswithcode.com/sota/fine-grained-image-classification-on-nabirds?p=transfg-a-transformer-architecture-for-fine) [![PWC](https://img.shields.io/endpoint.svg?url=https://paperswithcode.com/badge/transfg-a-transformer-architecture-for-fine/fine-grained-image-classification-on-stanford-1)](https://paperswithcode.com/sota/fine-grained-image-classification-on-stanford-1?p=transfg-a-transformer-architecture-for-fine) [![PWC](https://img.shields.io/endpoint.svg?url=https://paperswithcode.com/badge/transfg-a-transformer-architecture-for-fine/image-classification-on-inaturalist)](https://paperswithcode.com/sota/image-classification-on-inaturalist?p=transfg-a-transformer-architecture-for-fine)

Official PyTorch code for the paper [*TransFG: A Transformer Architecture for Fine-grained Recognition (AAAI 2022)*](https://arxiv.org/abs/2103.07976).

## Framework

![](./TransFG.png)

## Dependencies:

+ Python 3.7.3

+ PyTorch 1.5.1

+ torchvision 0.6.1

+ ml_collections

## Usage

### 1. Download Google pre-trained ViT models

* Download the models from [this link](https://console.cloud.google.com/storage/vit_models/): ViT-B_16, ViT-B_32...

```bash
wget https://storage.googleapis.com/vit_models/imagenet21k/{MODEL_NAME}.npz
```
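The same download can be scripted from Python. The following is a minimal sketch, not part of this repository: the helper names `vit_checkpoint_url` and `download_checkpoint` and the `target_dir` argument are hypothetical, and the URL pattern is simply the one used in the `wget` command above.

```python
import os
import urllib.request

# URL pattern copied from the wget command above.
BASE_URL = "https://storage.googleapis.com/vit_models/imagenet21k"


def vit_checkpoint_url(model_name):
    """Return the download URL for a model name such as "ViT-B_16"."""
    return f"{BASE_URL}/{model_name}.npz"


def download_checkpoint(model_name, target_dir="."):
    """Download the checkpoint unless it already exists; return its local path.

    `target_dir` is an assumption: store the file wherever your setup expects it.
    """
    path = os.path.join(target_dir, f"{model_name}.npz")
    if not os.path.exists(path):
        urllib.request.urlretrieve(vit_checkpoint_url(model_name), path)
    return path


print(vit_checkpoint_url("ViT-B_16"))
# https://storage.googleapis.com/vit_models/imagenet21k/ViT-B_16.npz
```

Calling `download_checkpoint("ViT-B_16")` would then fetch the file into the current directory.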

### 2. Prepare data

In the paper, we use data from 5 publicly available datasets:

+ [CUB-200-2011](http://www.vision.caltech.edu/visipedia/CUB-200-2011.html)

+ [Stanford Cars](https://ai.stanford.edu/~jkrause/cars/car_dataset.html)

+ [Stanford Dogs](http://vision.stanford.edu/aditya86/ImageNetDogs/)

+ [NABirds](http://dl.allaboutbirds.org/nabirds)

+ [iNaturalist 2017](https://github.com/visipedia/inat_comp/tree/master/2017)

Please download them from the official websites and put them in the corresponding folders.
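Before training, it can help to confirm that a dataset was extracted correctly. The sketch below is an illustration only and is not the data loader used by this repository. It assumes the standard layout of the official CUB-200-2011 archive (`images.txt`, `image_class_labels.txt`, and `train_test_split.txt` next to an `images/` folder); the `root` argument is a hypothetical path to that extracted folder.

```python
import os


def read_pairs(path):
    """Read a two-column, whitespace-separated file into a dict keyed by column 1."""
    pairs = {}
    with open(path) as f:
        for line in f:
            if line.strip():
                key, value = line.split(maxsplit=1)
                pairs[key] = value.strip()
    return pairs


def load_cub_split(root):
    """Return (train, test) lists of (relative_image_path, zero_based_label)."""
    names = read_pairs(os.path.join(root, "images.txt"))
    labels = read_pairs(os.path.join(root, "image_class_labels.txt"))
    is_train = read_pairs(os.path.join(root, "train_test_split.txt"))

    train, test = [], []
    for image_id, name in names.items():
        # Class ids in the archive start at 1; shift them to start at 0.
        sample = (os.path.join("images", name), int(labels[image_id]) - 1)
        (train if is_train[image_id] == "1" else test).append(sample)
    return train, test
```

Comparing `len(train)` and `len(test)` against the counts published on the dataset page is a quick sanity check on the extraction.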

### 3. Install required packages

Install dependencies with the following command:

```bash
pip3 install -r requirements.txt
```

### 4. Train

To train TransFG on the CUB-200-2011 dataset with 4 GPUs in FP16 mode for 10,000 steps, run:

```bash
CUDA_VISIBLE_DEVICES=0,1,2,3 python3 -m torch.distributed.launch --nproc_per_node=4 train.py --dataset CUB_200_2011 --split overlap --num_steps 10000 --fp16 --name sample_run
```
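When changing the number of GPUs, `CUDA_VISIBLE_DEVICES` and `--nproc_per_node` need to stay in step, since the launcher starts one process per GPU. As a rough sketch, the command above could be assembled from Python as follows. The helper `build_train_command` is hypothetical, and it only uses the flags that appear in the command above; which other values `train.py` accepts for them is not verified here.

```python
import os


def build_train_command(gpu_ids, dataset="CUB_200_2011", split="overlap",
                        num_steps=10000, fp16=True, name="sample_run"):
    """Return (env, argv) mirroring the launch command above for the given GPUs."""
    # Expose only the requested GPUs to the launched processes.
    env = dict(os.environ, CUDA_VISIBLE_DEVICES=",".join(str(g) for g in gpu_ids))
    argv = [
        "python3", "-m", "torch.distributed.launch",
        f"--nproc_per_node={len(gpu_ids)}",  # one process per visible GPU
        "train.py",
        "--dataset", dataset,
        "--split", split,
        "--num_steps", str(num_steps),
    ]
    if fp16:
        argv.append("--fp16")
    argv += ["--name", name]
    return env, argv


# Example (run from the repository root):
#   import subprocess
#   env, argv = build_train_command([0, 1])
#   subprocess.run(argv, env=env, check=True)
```

Deriving the process count from the GPU list avoids the common mismatch where one of the two values is edited and the other is not.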

## Citation

If you find our work helpful in your research, please cite it as:

```
@article{he2021transfg,
  title={TransFG: A Transformer Architecture for Fine-grained Recognition},
  author={He, Ju and Chen, Jie-Neng and Liu, Shuai and Kortylewski, Adam and Yang, Cheng and Bai, Yutong and Wang, Changhu and Yuille, Alan},
  journal={arXiv preprint arXiv:2103.07976},
  year={2021}
}
```

## Acknowledgement

Many thanks to [ViT-pytorch](https://github.com/jeonsworld/ViT-pytorch) for the PyTorch reimplementation of [An Image is Worth 16x16 Words: Transformers for Image Recognition at Scale](https://arxiv.org/abs/2010.11929).
