https://github.com/LiJunnan1992/DivideMix

Code for paper: DivideMix: Learning with Noisy Labels as Semi-supervised Learning

Host: GitHub
URL: https://github.com/LiJunnan1992/DivideMix
Owner: LiJunnan1992
License: mit
Created: 2019-11-21T08:28:36.000Z (over 5 years ago)
Default Branch: master
Last Pushed: 2020-09-14T10:20:04.000Z (over 4 years ago)
Last Synced: 2024-11-13T19:39:40.312Z (7 months ago)
Language: Python
Size: 121 KB
Stars: 543
Watchers: 9
Forks: 84
Open Issues: 9
Metadata Files:
- Readme: README.md
- License: LICENSE

README

# DivideMix: Learning with Noisy Labels as Semi-supervised Learning
PyTorch code for the following paper at ICLR 2020:\
Title: DivideMix: Learning with Noisy Labels as Semi-supervised Learning [pdf]\
Authors: Junnan Li, Richard Socher, Steven C.H. Hoi\
Institute: Salesforce Research

Abstract\
Deep neural networks are known to be annotation-hungry. Numerous efforts have been devoted to reduce the annotation cost when learning with deep networks. Two prominent directions include learning with noisy labels and semi-supervised learning by exploiting unlabeled data. In this work, we propose DivideMix, a novel framework for learning with noisy labels by leveraging semi-supervised learning techniques. In particular, DivideMix models the per-sample loss distribution with a mixture model to dynamically divide the training data into a labeled set with clean samples and an unlabeled set with noisy samples, and trains the model on both the labeled and unlabeled data in a semi-supervised manner. To avoid confirmation bias, we simultaneously train two diverged networks where each network uses the dataset division from the other network. During the semi-supervised training phase, we improve the MixMatch strategy by performing label co-refinement and label co-guessing on labeled and unlabeled samples, respectively. Experiments on multiple benchmark datasets demonstrate substantial improvements over state-of-the-art methods.
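The division step described in the abstract (fit a mixture model to the per-sample losses, then treat low-loss samples as labeled and the rest as unlabeled) can be sketched as follows. This is a minimal illustration and not the repository's implementation: the function names `fit_two_gaussians`, `clean_probability`, and `divide`, the plain-Python EM loop, the 0.5 threshold, and the synthetic losses are all assumptions made for the example.

```python
import math
import random


def _weighted_densities(x, means, variances, weights):
    """Weighted Gaussian density of x under each of the two components."""
    return [
        weights[k]
        * math.exp(-0.5 * (x - means[k]) ** 2 / variances[k])
        / math.sqrt(2.0 * math.pi * variances[k])
        for k in range(2)
    ]


def fit_two_gaussians(losses, n_iter=50, min_var=1e-4):
    """Fit a two-component 1-D Gaussian mixture to `losses` with plain EM."""
    lo, hi = min(losses), max(losses)
    means = [lo, hi]  # start the two components at the extremes
    var0 = max(((hi - lo) / 4.0) ** 2, min_var)
    variances = [var0, var0]
    weights = [0.5, 0.5]

    for _ in range(n_iter):
        # E-step: responsibility of each component for each loss value.
        resp = []
        for x in losses:
            dens = _weighted_densities(x, means, variances, weights)
            total = sum(dens) or 1e-300  # guard against underflow to zero
            resp.append([d / total for d in dens])

        # M-step: re-estimate weights, means, and variances.
        for k in range(2):
            nk = sum(r[k] for r in resp) or 1e-300
            weights[k] = nk / len(losses)
            means[k] = sum(r[k] * x for r, x in zip(resp, losses)) / nk
            var_k = sum(r[k] * (x - means[k]) ** 2 for r, x in zip(resp, losses)) / nk
            variances[k] = max(var_k, min_var)  # floor keeps the density finite
    return means, variances, weights


def clean_probability(losses, means, variances, weights):
    """Posterior probability that each sample came from the lower-mean component."""
    clean = 0 if means[0] <= means[1] else 1
    probs = []
    for x in losses:
        dens = _weighted_densities(x, means, variances, weights)
        total = sum(dens) or 1e-300
        probs.append(dens[clean] / total)
    return probs


def divide(losses, threshold=0.5):
    """Split sample indices into a labeled (likely clean) and an unlabeled set."""
    params = fit_two_gaussians(losses)
    probs = clean_probability(losses, *params)
    labeled = [i for i, p in enumerate(probs) if p > threshold]
    unlabeled = [i for i, p in enumerate(probs) if p <= threshold]
    return labeled, unlabeled, probs


if __name__ == "__main__":
    random.seed(0)
    # Synthetic per-sample losses: 80 low-loss samples, then 20 high-loss samples.
    losses = [random.gauss(0.2, 0.05) for _ in range(80)]
    losses += [random.gauss(2.0, 0.3) for _ in range(20)]
    labeled, unlabeled, _ = divide(losses)
    print(len(labeled), "samples kept as labeled;", len(unlabeled), "treated as unlabeled")
```

In the two-network setup the abstract describes, each network would train on the division produced from the other network's losses, which is why `divide` returns index lists rather than modifying the data in place.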

Experiments\
First, create a folder named `checkpoint` to store the results:\
`mkdir checkpoint`\
Next, run the training script for your dataset:\
`python Train_{dataset_name}.py --data_path path-to-your-data`

Cite DivideMix\
If you find the code useful in your research, please consider citing our paper:


@inproceedings{
    li2020dividemix,
    title={DivideMix: Learning with Noisy Labels as Semi-supervised Learning},
    author={Junnan Li and Richard Socher and Steven C.H. Hoi},
    booktitle={International Conference on Learning Representations},
    year={2020},
}

License\
This project is licensed under the terms of the MIT license.
