Ecosyste.ms: Awesome

An open API service indexing awesome lists of open source software.

Awesome Lists | Featured Topics | Projects

https://github.com/Hhhhhhao/Noisy-Model-Learning

Last synced: 2 months ago
JSON representation

Host: GitHub
URL: https://github.com/Hhhhhhao/Noisy-Model-Learning
Owner: Hhhhhhao
Created: 2024-03-13T21:15:07.000Z (10 months ago)
Default Branch: master
Last Pushed: 2024-03-13T21:34:11.000Z (10 months ago)
Last Synced: 2024-10-26T00:36:24.793Z (2 months ago)
Language: Python
Size: 3.71 MB
Stars: 30
Watchers: 2
Forks: 3
Open Issues: 0
Metadata Files:
- Readme: README.md

Awesome Lists containing this project

README

        # Noisy Model Leanring 

Code for ICLR 2024 Spotlight Paper "[Understanding and Mitigating the Label Noise in Pre-training on Downstream Tasks](https://arxiv.org/abs/2309.17002)"

## Training 

### Linear Probing

ImageNet-1K Fully-Supervised Models

```

python src/finetune.py --seed=3907 --train-dataset=CIFAR10 --eval-datasets=CIFAR10 --model=resnet50_noise_5 --model_source=timm --epochs=30 --lr=0.01 --wd=0.0001 --batch-size=64 --save SAVE_PATH --results SAVE_PATH/results.jsonl --data-location=DATA_DIR --freeze-encoder  --template=simple_template  --load-zero-shot-classifier False  --num-shots=None

```

CLIP Models

```

python src/finetune.py --seed=3907 --train-dataset=CIFAR10 --eval-datasets=CIFAR10 --model=RN50_yfcc15m_orig --model_source=open_clip --epochs=30 --lr=0.01 --wd=0.0001 --batch-size=64 --save SAVE_PATH --results SAVE_PATH/results.jsonl --data-location=DATA_DIR --freeze-encoder  --template=simple_template  --load-zero-shot-classifier False  --num-shots=None

```

### NMTune 

ImageNet-1K Fully-Supervised Models

```

python src/nml_tune.py --seed=3907 --train-dataset=CIFAR10 --eval-datasets=CIFAR10 --model=resnet50_noise_5 --model_source=timm --epochs=30 --lr=0.001 --wd=0.0001 --batch-size=64 --save SAVE_PATH --results  SAVE_PATH/results.jsonl --data-location=DATA_DIR --template=openai_imagenet_template --freeze-encoder --load-zero-shot-classifier False --use-weak-strong=False --mlp-with-res=True --mlp-res-scale-init=1.0 --mlp-after-ratio=4 --mlp-after-layers=1 --mlp-after-layers=1 --mse-loss-weight=1.0 --cov-loss-weight=0.04 --svd-loss-weight=0.001

```

CLIP Models

```

python src/nml_tune.py --seed=3907 --train-dataset=CIFAR10 --eval-datasets=CIFAR10 --model=RN50_yfcc15m_noise_30 --model_source=open_clip --epochs=30 --lr=0.001 --wd=0.0001 --batch-size=64 --save SAVE_PATH --results  SAVE_PATH/results.jsonl --data-location=DATA_DIR --template=openai_imagenet_template --freeze-encoder --load-zero-shot-classifier False --use-weak-strong=False --mlp-with-res=True --mlp-res-scale-init=1.0 --mlp-after-ratio=4 --mlp-after-layers=1 --mlp-after-layers=1 --mse-loss-weight=1.0 --cov-loss-weight=0.04 --svd-loss-weight=0.001

```

## Pre-trained Models

Our pre-trained models (noisy) are released by by request.

You can also pre-train your own models on noisy datasets using [timm](https://github.com/huggingface/pytorch-image-models) and [open_clip](https://github.com/mlfoundations/open_clip).

Once you have the pre-trained models, replace/create the model config as in:

1. https://github.com/Hhhhhhao/Noisy-Model-Learning/blob/7547d662c9d3c801a10b71072a0dd7beb7e9b481/timm/models/resnet.py#L74

2. https://github.com/Hhhhhhao/Noisy-Model-Learning/blob/7547d662c9d3c801a10b71072a0dd7beb7e9b481/open_clip/open_clip/pretrained.py#L39

## Bibtex

```

@inproceedings{

chen2024understanding,

title={Understanding and Mitigating the Label Noise in Pre-training on Downstream Tasks},

author={Hao Chen and Jindong Wang and Ankit Shah and Ran Tao and Hongxin Wei and Xing Xie and Masashi Sugiyama and Bhiksha Raj},

booktitle={The Twelfth International Conference on Learning Representations},

year={2024},

}

```

## Acknowledge

Most of our code are developed based on [timm](https://github.com/huggingface/pytorch-image-models), [open_clip](https://github.com/mlfoundations/open_clip), and [wise-ft](https://github.com/mlfoundations/wise-ft)

## Contact

[email protected]