https://github.com/thumnlab/curbench



# CurBench: A Curriculum Learning Benchmark

A benchmark for Curriculum Learning.

The code of ICML 2024 paper **CurBench: A Curriculum Learning Benchmark**.

The paper can be downloaded from the [official website](https://openreview.net/pdf?id=Htw0bSgjXE) or in the [docs directory](https://github.com/THUMNLab/CurBench/tree/master/docs).

## Environment

1. Python >= 3.7: [https://www.python.org/downloads/](https://www.python.org/downloads/)

2. PyTorch >= 1.12: [https://pytorch.org/](https://pytorch.org/)

3. torch_geometric: [https://pytorch-geometric.readthedocs.io/en/latest/install/installation.html](https://pytorch-geometric.readthedocs.io/en/latest/install/installation.html)

4. Other requirements:

```bash
pip install -r requirements.txt
```
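To confirm that the interpreter and PyTorch meet the minimums listed above, a small check like the following may help. It is a sketch, not part of CurBench: `parse_version` and `meets_minimum` are hypothetical helper names, and the parser only handles plain numeric versions with an optional local suffix such as `+cu113`.

```python
import sys


def parse_version(text):
    """Return the leading numeric components of a version string as a tuple.

    Parsing drops any '+local' suffix and stops at the first non-numeric part.
    """
    parts = []
    for piece in text.split("+")[0].split("."):
        if not piece.isdigit():
            break
        parts.append(int(piece))
    return tuple(parts)


def meets_minimum(installed, minimum):
    """Compare two version strings numerically, e.g. '1.12.1' against '1.12'."""
    return parse_version(installed) >= parse_version(minimum)


if __name__ == "__main__":
    python_version = ".".join(str(n) for n in sys.version_info[:3])
    status = "ok" if meets_minimum(python_version, "3.7") else "too old"
    print("python", python_version, status)
    try:
        import torch  # only importable once PyTorch is installed
    except ImportError:
        print("torch is not installed")
    else:
        status = "ok" if meets_minimum(torch.__version__, "1.12") else "too old"
        print("torch", torch.__version__, status)
```

Comparing tuples of integers avoids the string-comparison pitfall where `"1.9"` sorts after `"1.12"`.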

## Dataset

### Vision

**CIFAR-10** and **CIFAR-100** will be downloaded automatically.

**Tiny-ImageNet** is a subset of the ILSVRC2012 version of ImageNet and consists of 64 × 64 × 3 down-sampled images. It needs to be downloaded manually from the [official website](https://image-net.org/download.php).

``` bash
CurBench
└── data
    ├── cifar-10-batches-py
    │   ├── data_batch_1
    │   ├── data_batch_2
    │   ├── ...
    │   └── test_batch
    ├── cifar-100-python
    │   ├── train
    │   ├── test
    │   ├── meta
    │   └── ...
    └── tiny-imagenet-200
        ├── train
        ├── val
        └── test

# For easier data processing, we use a Tiny-ImageNet dataset utility class for pytorch: https://gist.github.com/lromor/bcfc69dcf31b2f3244358aea10b7a11b
# After the processing, the directory becomes:

CurBench
└── data
    └── tiny-imagenet-200
        ├── train_batch
        ├── val_batch
        └── ...
```
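Because Tiny-ImageNet has to be placed by hand, it can be worth checking that the archive was unpacked where the first tree above expects it before any processing. The following is a minimal sketch under that assumption; `missing_tiny_imagenet_dirs` is a hypothetical helper, not a CurBench function, and it expects to be pointed at the repository root.

```python
from pathlib import Path

# Split directories shown in the tree above, before the processing step.
TINY_IMAGENET_SPLITS = ("train", "val", "test")


def missing_tiny_imagenet_dirs(repo_root="."):
    """Return the split directories missing under data/tiny-imagenet-200."""
    base = Path(repo_root) / "data" / "tiny-imagenet-200"
    return [split for split in TINY_IMAGENET_SPLITS if not (base / split).is_dir()]


if __name__ == "__main__":
    missing = missing_tiny_imagenet_dirs(".")
    if missing:
        print("missing Tiny-ImageNet directories:", ", ".join(missing))
    else:
        print("Tiny-ImageNet layout looks complete")
```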

### Text

**GLUE** will be downloaded automatically. It consists of the tasks **cola**, **sst2**, **mrpc**, **qqp**, **stsb**, **mnli**, **qnli**, **rte**, ...

### Graph

**TUDataset** will be downloaded automatically. It contains many datasets, of which we use **MUTAG**, **PROTEINS**, and **NCI1**.

**OGB** will be downloaded automatically. It contains many datasets, of which we use **molhiv**.
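For reference, these graph datasets can also be fetched directly with the public loaders in `torch_geometric` and `ogb`. The sketch below shows that route. It is not necessarily how CurBench wraps the datasets internally: `load_graph_dataset` is a hypothetical helper, the `data` root is an assumption, and the `ogb` package is assumed to be installed.

```python
# Dataset names as listed above; OGB expects the prefixed identifier.
TU_NAMES = ("MUTAG", "PROTEINS", "NCI1")
OGB_NAMES = {"molhiv": "ogbg-molhiv"}


def load_graph_dataset(name, root="data"):
    """Download (on first use) and return the graph dataset called `name`."""
    if name in TU_NAMES:
        # Imported lazily so the module loads without torch_geometric present.
        from torch_geometric.datasets import TUDataset
        return TUDataset(root=root, name=name)
    if name in OGB_NAMES:
        from ogb.graphproppred import PygGraphPropPredDataset
        return PygGraphPropPredDataset(name=OGB_NAMES[name], root=root)
    raise ValueError(f"unknown graph dataset: {name}")
```

Calling `load_graph_dataset("MUTAG")` triggers the download on first use and reads from the local cache afterwards.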

## Quick Start

``` bash
# 1. clone from the repository
git clone https://github.com/THUMNLab/CurBench.git
cd CurBench

# 2. pip install local module: curbench
pip install -e .

# 3. prepare dataset

# 4. run the example code
python examples/base.py
```

## Run

### Single Run

```bash
# 1. vision standard
python examples/base.py --data <dataset> --net <net> --gpu <0/1/2/...>

# 2. text standard
python examples/base.py --data <dataset> --net <net> --gpu <0/1/2/...>

# 3. graph standard
python examples/base.py --data <dataset> --net <net> --gpu <0/1/2/...>

# Note: Do not use LRE, MW-Net, or DDS when the backbone model is an LSTM, because an LSTM is not suitable for direct gradient calculation.
```
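The restriction in the note above can be encoded as a small guard so that unsupported combinations are skipped before a run is launched. This is only a sketch: `is_supported` is a hypothetical helper, and the method names are spelled as in the note, which may differ from the identifiers CurBench uses in code.

```python
# Methods the note above rules out for an LSTM backbone.
INCOMPATIBLE_WITH_LSTM = {"LRE", "MW-Net", "DDS"}


def is_supported(method, net):
    """Return False for the method/backbone pairs the note rules out."""
    return not (net.lower() == "lstm" and method in INCOMPATIBLE_WITH_LSTM)
```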

### Batch Run
```bash
python run.py
```
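The contents of `run.py` are not shown here. As a rough illustration of what a batch launcher can look like, the sketch below builds one single-run command per dataset and backbone combination, using only the `--data`, `--net`, and `--gpu` flags from the Single Run section. `build_command` and `build_sweep` are hypothetical names, and the dataset and backbone values are placeholders to be replaced with names CurBench accepts.

```python
import itertools
import shlex


def build_command(data, net, gpu):
    """Build the argument list for one single run."""
    return ["python", "examples/base.py",
            "--data", data, "--net", net, "--gpu", str(gpu)]


def build_sweep(datasets, nets, gpus):
    """Pair every (dataset, net) combination with a GPU id, cycling through `gpus`."""
    combos = itertools.product(datasets, nets)
    return [build_command(d, n, g)
            for (d, n), g in zip(combos, itertools.cycle(gpus))]


if __name__ == "__main__":
    # Placeholder values: substitute real dataset and backbone names.
    for command in build_sweep(["<dataset-a>", "<dataset-b>"], ["<net>"], [0, 1]):
        # Dry run: print each command instead of launching it.
        print(" ".join(shlex.quote(part) for part in command))
```

Printing the commands first makes it easy to review a sweep; each list can then be passed to `subprocess.run` to launch the runs.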

## Cite

Please cite our paper as follows if you find our code useful:

```
@inproceedings{zhoucurbench,
  title={CurBench: Curriculum Learning Benchmark},
  author={Zhou, Yuwei and Pan, Zirui and Wang, Xin and Chen, Hong and Li, Haoyang and Huang, Yanwen and Xiong, Zhixiao and Xiong, Fangzhou and Xu, Peiyang and Zhu, Wenwu and others},
  booktitle={Forty-first International Conference on Machine Learning},
  year={2024}
}
```

You may also find our [survey paper](https://arxiv.org/pdf/2010.13166.pdf) helpful:
```
@article{wang2021survey,
  title={A survey on curriculum learning},
  author={Wang, Xin and Chen, Yudong and Zhu, Wenwu},
  journal={IEEE Transactions on Pattern Analysis and Machine Intelligence},
  year={2021},
  publisher={IEEE}
}
```