Ecosyste.ms: Awesome
An open API service indexing awesome lists of open source software.
https://github.com/TACJu/TransFG
This is the official PyTorch implementation of the paper "TransFG: A Transformer Architecture for Fine-grained Recognition" (Ju He, Jie-Neng Chen, Shuai Liu, Adam Kortylewski, Cheng Yang, Yutong Bai, Changhu Wang, Alan Yuille).
https://github.com/TACJu/TransFG
fine-grained-recognition
Last synced: about 1 month ago
JSON representation
This is the official PyTorch implementation of the paper "TransFG: A Transformer Architecture for Fine-grained Recognition" (Ju He, Jie-Neng Chen, Shuai Liu, Adam Kortylewski, Cheng Yang, Yutong Bai, Changhu Wang, Alan Yuille).
- Host: GitHub
- URL: https://github.com/TACJu/TransFG
- Owner: TACJu
- License: mit
- Created: 2021-03-28T10:33:00.000Z (over 3 years ago)
- Default Branch: master
- Last Pushed: 2022-10-04T06:39:47.000Z (about 2 years ago)
- Last Synced: 2024-08-04T03:11:42.884Z (5 months ago)
- Topics: fine-grained-recognition
- Language: Python
- Homepage:
- Size: 728 KB
- Stars: 381
- Watchers: 5
- Forks: 88
- Open Issues: 31
-
Metadata Files:
- Readme: README.md
- License: LICENSE
Awesome Lists containing this project
README
# TransFG: A Transformer Architecture for Fine-grained Recognition
[![PWC](https://img.shields.io/endpoint.svg?url=https://paperswithcode.com/badge/transfg-a-transformer-architecture-for-fine/fine-grained-image-classification-on-cub-200)](https://paperswithcode.com/sota/fine-grained-image-classification-on-cub-200?p=transfg-a-transformer-architecture-for-fine) [![PWC](https://img.shields.io/endpoint.svg?url=https://paperswithcode.com/badge/transfg-a-transformer-architecture-for-fine/fine-grained-image-classification-on-nabirds)](https://paperswithcode.com/sota/fine-grained-image-classification-on-nabirds?p=transfg-a-transformer-architecture-for-fine) [![PWC](https://img.shields.io/endpoint.svg?url=https://paperswithcode.com/badge/transfg-a-transformer-architecture-for-fine/fine-grained-image-classification-on-stanford-1)](https://paperswithcode.com/sota/fine-grained-image-classification-on-stanford-1?p=transfg-a-transformer-architecture-for-fine) [![PWC](https://img.shields.io/endpoint.svg?url=https://paperswithcode.com/badge/transfg-a-transformer-architecture-for-fine/image-classification-on-inaturalist)](https://paperswithcode.com/sota/image-classification-on-inaturalist?p=transfg-a-transformer-architecture-for-fine)
Official PyTorch code for the paper: [*TransFG: A Transformer Architecture for Fine-grained Recognition (AAAI2022)*](https://arxiv.org/abs/2103.07976)
## Framework
![](./TransFG.png)
## Dependencies:
+ Python 3.7.3
+ PyTorch 1.5.1
+ torchvision 0.6.1
+ ml_collections## Usage
### 1. Download Google pre-trained ViT models* [Get models in this link](https://console.cloud.google.com/storage/vit_models/): ViT-B_16, ViT-B_32...
```bash
wget https://storage.googleapis.com/vit_models/imagenet21k/{MODEL_NAME}.npz
```### 2. Prepare data
In the paper, we use data from 5 publicly available datasets:
+ [CUB-200-2011](http://www.vision.caltech.edu/visipedia/CUB-200-2011.html)
+ [Stanford Cars](https://ai.stanford.edu/~jkrause/cars/car_dataset.html)
+ [Stanford Dogs](http://vision.stanford.edu/aditya86/ImageNetDogs/)
+ [NABirds](http://dl.allaboutbirds.org/nabirds)
+ [iNaturalist 2017](https://github.com/visipedia/inat_comp/tree/master/2017)Please download them from the official websites and put them in the corresponding folders.
### 3. Install required packages
Install dependencies with the following command:
```bash
pip3 install -r requirements.txt
```### 4. Train
To train TransFG on CUB-200-2011 dataset with 4 gpus in FP-16 mode for 10000 steps run:
```bash
CUDA_VISIBLE_DEVICES=0,1,2,3 python3 -m torch.distributed.launch --nproc_per_node=4 train.py --dataset CUB_200_2011 --split overlap --num_steps 10000 --fp16 --name sample_run
```## Citation
If you find our work helpful in your research, please cite it as:
```
@article{he2021transfg,
title={TransFG: A Transformer Architecture for Fine-grained Recognition},
author={He, Ju and Chen, Jie-Neng and Liu, Shuai and Kortylewski, Adam and Yang, Cheng and Bai, Yutong and Wang, Changhu and Yuille, Alan},
journal={arXiv preprint arXiv:2103.07976},
year={2021}
}
```## Acknowledgement
Many thanks to [ViT-pytorch](https://github.com/jeonsworld/ViT-pytorch) for the PyTorch reimplementation of [An Image is Worth 16x16 Words: Transformers for Image Recognition at Scale](https://arxiv.org/abs/2010.11929)