Ecosyste.ms: Awesome
An open API service indexing awesome lists of open source software.
https://github.com/Hhhhhhao/Noisy-Model-Learning
https://github.com/Hhhhhhao/Noisy-Model-Learning
Last synced: 7 days ago
JSON representation
- Host: GitHub
- URL: https://github.com/Hhhhhhao/Noisy-Model-Learning
- Owner: Hhhhhhao
- Created: 2024-03-13T21:15:07.000Z (8 months ago)
- Default Branch: master
- Last Pushed: 2024-03-13T21:34:11.000Z (8 months ago)
- Last Synced: 2024-10-26T00:36:24.793Z (18 days ago)
- Language: Python
- Size: 3.71 MB
- Stars: 30
- Watchers: 2
- Forks: 3
- Open Issues: 0
-
Metadata Files:
- Readme: README.md
Awesome Lists containing this project
README
# Noisy Model Leanring
Code for ICLR 2024 Spotlight Paper "[Understanding and Mitigating the Label Noise in Pre-training on Downstream Tasks](https://arxiv.org/abs/2309.17002)"
## Training
### Linear Probing
ImageNet-1K Fully-Supervised Models
```
python src/finetune.py --seed=3907 --train-dataset=CIFAR10 --eval-datasets=CIFAR10 --model=resnet50_noise_5 --model_source=timm --epochs=30 --lr=0.01 --wd=0.0001 --batch-size=64 --save SAVE_PATH --results SAVE_PATH/results.jsonl --data-location=DATA_DIR --freeze-encoder --template=simple_template --load-zero-shot-classifier False --num-shots=None
```CLIP Models
```
python src/finetune.py --seed=3907 --train-dataset=CIFAR10 --eval-datasets=CIFAR10 --model=RN50_yfcc15m_orig --model_source=open_clip --epochs=30 --lr=0.01 --wd=0.0001 --batch-size=64 --save SAVE_PATH --results SAVE_PATH/results.jsonl --data-location=DATA_DIR --freeze-encoder --template=simple_template --load-zero-shot-classifier False --num-shots=None
```### NMTune
ImageNet-1K Fully-Supervised Models
```
python src/nml_tune.py --seed=3907 --train-dataset=CIFAR10 --eval-datasets=CIFAR10 --model=resnet50_noise_5 --model_source=timm --epochs=30 --lr=0.001 --wd=0.0001 --batch-size=64 --save SAVE_PATH --results SAVE_PATH/results.jsonl --data-location=DATA_DIR --template=openai_imagenet_template --freeze-encoder --load-zero-shot-classifier False --use-weak-strong=False --mlp-with-res=True --mlp-res-scale-init=1.0 --mlp-after-ratio=4 --mlp-after-layers=1 --mlp-after-layers=1 --mse-loss-weight=1.0 --cov-loss-weight=0.04 --svd-loss-weight=0.001
```CLIP Models
```
python src/nml_tune.py --seed=3907 --train-dataset=CIFAR10 --eval-datasets=CIFAR10 --model=RN50_yfcc15m_noise_30 --model_source=open_clip --epochs=30 --lr=0.001 --wd=0.0001 --batch-size=64 --save SAVE_PATH --results SAVE_PATH/results.jsonl --data-location=DATA_DIR --template=openai_imagenet_template --freeze-encoder --load-zero-shot-classifier False --use-weak-strong=False --mlp-with-res=True --mlp-res-scale-init=1.0 --mlp-after-ratio=4 --mlp-after-layers=1 --mlp-after-layers=1 --mse-loss-weight=1.0 --cov-loss-weight=0.04 --svd-loss-weight=0.001
```## Pre-trained Models
Our pre-trained models (noisy) are released by by request.
You can also pre-train your own models on noisy datasets using [timm](https://github.com/huggingface/pytorch-image-models) and [open_clip](https://github.com/mlfoundations/open_clip).
Once you have the pre-trained models, replace/create the model config as in:
1. https://github.com/Hhhhhhao/Noisy-Model-Learning/blob/7547d662c9d3c801a10b71072a0dd7beb7e9b481/timm/models/resnet.py#L74
2. https://github.com/Hhhhhhao/Noisy-Model-Learning/blob/7547d662c9d3c801a10b71072a0dd7beb7e9b481/open_clip/open_clip/pretrained.py#L39## Bibtex
```
@inproceedings{
chen2024understanding,
title={Understanding and Mitigating the Label Noise in Pre-training on Downstream Tasks},
author={Hao Chen and Jindong Wang and Ankit Shah and Ran Tao and Hongxin Wei and Xing Xie and Masashi Sugiyama and Bhiksha Raj},
booktitle={The Twelfth International Conference on Learning Representations},
year={2024},
}
```## Acknowledge
Most of our code are developed based on [timm](https://github.com/huggingface/pytorch-image-models), [open_clip](https://github.com/mlfoundations/open_clip), and [wise-ft](https://github.com/mlfoundations/wise-ft)## Contact
[email protected]