https://github.com/salesforce/must
PyTorch code for MUST
- Host: GitHub
- URL: https://github.com/salesforce/must
- Owner: salesforce
- License: bsd-3-clause
- Created: 2022-05-24T00:05:36.000Z (over 3 years ago)
- Default Branch: main
- Last Pushed: 2023-03-08T07:25:18.000Z (over 2 years ago)
- Last Synced: 2024-08-04T03:11:07.548Z (about 1 year ago)
- Topics: clip, masked-image-modeling, self-training, unsupervised-learning, zero-shot-classification, zero-shot-learning
- Language: Python
- Homepage:
- Size: 1.33 MB
- Stars: 103
- Watchers: 6
- Forks: 12
- Open Issues: 8
Metadata Files:
- Readme: README.md
- License: LICENSE.txt
- Code of conduct: CODE_OF_CONDUCT.md
- Codeowners: CODEOWNERS
- Security: SECURITY.md
README
# Masked Unsupervised Self-training for Zero-shot Image Classification
This is the PyTorch code of the [MUST paper](https://arxiv.org/abs/2206.02967). The repository supports finetuning a CLIP model on unlabeled images from a target domain.

### Requirements
* pytorch 1.10.0
* timm 0.4.12
* tensorboardX
* ftfy

### Dataset Setup
Dataset paths are stored in [dataset_catalog.json](https://github.com/salesforce/MUST/blob/main/dataset_catalog.json), which needs to be edited to point to your local paths. The ImageNet dataset follows the standard folder structure. For the other datasets, please refer to the scripts from [VISSL](https://github.com/facebookresearch/vissl/tree/main/extra_scripts/datasets) to download and prepare them. CLIP's class labels and prompt templates are stored in [classes.json](https://github.com/salesforce/MUST/blob/main/classes.json) and [templates.json](https://github.com/salesforce/MUST/blob/main/templates.json), respectively; a sketch of how such files are typically used is shown below.
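The following is a minimal sketch of how class names and prompt templates like those in classes.json and templates.json are commonly combined into zero-shot classifier weights with CLIP. It assumes the openai/CLIP package and assumes both JSON files map a dataset name to a list of strings; the repo's actual schema and loading code may differ.

```python
import json

import clip  # assumes the openai/CLIP package is installed
import torch

device = "cuda" if torch.cuda.is_available() else "cpu"
model, _ = clip.load("ViT-B/16", device=device)

# Assumed schema: {"imagenet": [...]} in both files; check the repo's
# classes.json / templates.json for the real layout.
classes = json.load(open("classes.json"))["imagenet"]
templates = json.load(open("templates.json"))["imagenet"]

weights = []
with torch.no_grad():
    for name in classes:
        # Fill each template (e.g. "a photo of a {}.") with the class name.
        prompts = clip.tokenize([t.format(name) for t in templates]).to(device)
        emb = model.encode_text(prompts).float()
        emb = emb / emb.norm(dim=-1, keepdim=True)
        weights.append(emb.mean(dim=0))  # average over prompt templates

# (num_classes, embed_dim) matrix; image_features @ zero_shot_weights.T
# then gives per-class logits for zero-shot classification.
zero_shot_weights = torch.stack(weights)
```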
### Training

Run the following command on 16 A100 GPUs:

    python -m torch.distributed.run --nproc_per_node=16 train.py --dataset [name_of_dataset] --clip_model ViT-B/16
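For intuition, here is a heavily simplified, hypothetical sketch of the self-training idea behind MUST: an EMA teacher produces pseudo-labels that supervise the student. It is not the repo's actual train.py, which additionally uses masked image modeling and other objectives; `student` and `teacher` are assumed to be classification heads on top of the CLIP image encoder.

```python
import torch
import torch.nn.functional as F

def self_training_loss(student, teacher, images, threshold=0.7):
    """Pseudo-label confident teacher predictions and train the student on them."""
    with torch.no_grad():
        probs = F.softmax(teacher(images), dim=-1)
        conf, pseudo_labels = probs.max(dim=-1)
        keep = conf > threshold  # only trust confident pseudo-labels
    if keep.sum() == 0:
        return images.new_zeros(())  # no confident samples in this batch
    return F.cross_entropy(student(images[keep]), pseudo_labels[keep])

@torch.no_grad()
def ema_update(student, teacher, momentum=0.999):
    # The teacher is an exponential moving average of the student.
    for ps, pt in zip(student.parameters(), teacher.parameters()):
        pt.mul_(momentum).add_(ps, alpha=1.0 - momentum)
```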
### Results

ViT-B/16:

Method | ImageNet | SUN397 | Food101 | GTSRB | DTD | UCF101
--- | :---: | :---: | :---: | :---: | :---: | :---:
CLIP | 68.3 | 64.4 | 88.7 | 43.4 | 44.7 | 68.8
MUST | 77.7 | 71.8 | 92.7 | 65.5 | 54.1 | 81.1

ViT-L/14:

Method | ImageNet | SUN397 | Food101 | GTSRB | DTD | UCF101
--- | :---: | :---: | :---: | :---: | :---: | :---:
CLIP | 75.5 | 67.4 | 92.9 | 50.6 | 55.4 | 77.0
MUST | 82.1 | 74.6 | 95.3 | 68.7 | 62.6 | 85.7

### Citation

    @inproceedings{li2022masked,
        title={Masked Unsupervised Self-training for Label-Free Image Classification},
        author={Junnan Li and Silvio Savarese and Steven C. H. Hoi},
        year={2023},
        booktitle={ICLR},
    }