https://github.com/QData/C-Tran
General Multi-label Image Classification with Transformers
- Host: GitHub
- URL: https://github.com/QData/C-Tran
- Owner: QData
- License: mit
- Created: 2021-01-25T17:19:22.000Z (over 4 years ago)
- Default Branch: main
- Last Pushed: 2024-11-02T02:16:30.000Z (6 months ago)
- Last Synced: 2025-04-04T01:18:42.373Z (25 days ago)
- Topics: computer-vision, multi-label-classification, transformers
- Language: Python
- Size: 2.97 MB
- Stars: 263
- Watchers: 7
- Forks: 43
- Open Issues: 12
Metadata Files:
- Readme: README.md
- License: LICENSE
**General Multi-label Image Classification with Transformers**
Jack Lanchantin, Tianlu Wang, Vicente Ordóñez Román, Yanjun Qi
Conference on Computer Vision and Pattern Recognition (CVPR) 2021
[[paper]](https://arxiv.org/abs/2011.14027) [[poster]](https://github.com/QData/C-Tran/blob/main/supplemental/ctran_poster.pdf) [[slides]](https://github.com/QData/C-Tran/blob/main/supplemental/ctran_slides.pdf)

## Training and Running C-Tran ##
Python 3.7 is required. All major packages and their versions are listed in `requirements.txt`.
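Since the code targets Python 3.7, a quick pre-flight check before installing the packages from `requirements.txt` can catch a mismatched interpreter early. The following is a minimal sketch and not part of the repository: the helper name `is_supported_python` and the choice to compare only the major and minor version are assumptions made for illustration.

```python
import sys

# Interpreter version required by this README, as (major, minor).
REQUIRED_PYTHON = (3, 7)


def is_supported_python(version_info=sys.version_info, required=REQUIRED_PYTHON):
    """Return True if the major.minor version equals the required one.

    Hypothetical helper for illustration; it is not shipped with C-Tran.
    """
    return tuple(version_info[:2]) == tuple(required)


if __name__ == "__main__":
    found = "{}.{}".format(*sys.version_info[:2])
    if is_supported_python():
        print("Python " + found + " matches the required 3.7")
    else:
        print("Python " + found + " found, but 3.7 is required")
```

Running the script with the interpreter you plan to train with prints whether it matches; the function can also be called with an explicit tuple such as `(3, 7, 9)`.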
### C-Tran on COCO80 Dataset ###
Download COCO data (19G)
```
wget https://www.cs.virginia.edu/~yq2h/jack/vision/coco.tar.gz
mkdir -p data/
tar -xvf coco.tar.gz -C data/
```

Train New Model
```
python main.py --batch_size 16 --lr 0.00001 --optim 'adam' --layers 3 --dataset 'coco' --use_lmt --dataroot data/
```

### C-Tran on VOC20 Dataset ###
Download VOC2007 data (1.7G)
```
wget https://www.cs.virginia.edu/~yq2h/jack/vision/voc.tar.gz
mkdir -p data/
tar -xvf voc.tar.gz -C data/
```

Train New Model
```
python main.py --batch_size 16 --lr 0.00001 --optim 'adam' --layers 3 --dataset 'voc' --use_lmt --grad_ac_step 2 --dataroot data/
```

## Citing ##
```bibtex
@article{lanchantin2020general,
title={General Multi-label Image Classification with Transformers},
author={Lanchantin, Jack and Wang, Tianlu and Ordonez, Vicente and Qi, Yanjun},
journal={arXiv preprint arXiv:2011.14027},
year={2020}
}
```