https://github.com/ekellbuch/longtail_ensembles
Evaluating ensemble performance in long-tailed datasets (Neurips 2023 Heavy Tails Workshop)
class-imbalance ensemble-learning fairness-ml imbalanced-classes imbalanced-classification imbalanced-data imbalanced-learning
- Host: GitHub
- URL: https://github.com/ekellbuch/longtail_ensembles
- Owner: ekellbuch
- License: mit
- Created: 2023-05-10T14:16:09.000Z (almost 2 years ago)
- Default Branch: main
- Last Pushed: 2024-04-26T23:16:25.000Z (about 1 year ago)
- Last Synced: 2024-04-27T23:30:17.591Z (about 1 year ago)
- Topics: class-imbalance, ensemble-learning, fairness-ml, imbalanced-classes, imbalanced-classification, imbalanced-data, imbalanced-learning
- Language: Python
- Homepage:
- Size: 1.58 MB
- Stars: 1
- Watchers: 3
- Forks: 0
- Open Issues: 0
Metadata Files:
- Readme: README.md
- License: LICENSE
Awesome Lists containing this project
- awesome-imbalanced-learning - [**Code**
# The Effects of Ensembling on Long-Tailed Data
Code for the paper ["The Effects of Ensembling on Long-Tailed Data"](https://openreview.net/pdf?id=l4GYs60kre), where we perform a systematic comparison between logit and probability ensembling for a variety of models trained on balanced and imbalanced datasets.

## Findings:
- Adding more ensemble members continues to improve performance on imbalanced datasets.
- There is no difference between logit and probability ensembles across a variety of balanced datasets.
- There are differences between logit and probability ensembles on imbalanced datasets depending on the ensemble diversity and dependency.
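
To make the two ensemble types concrete, the following is a minimal NumPy sketch, not the repository's implementation. It assumes the usual definitions: a logit ensemble averages the members' logits and applies softmax once, while a probability ensemble applies softmax to each member and averages the resulting probabilities. The function names and the toy logits are hypothetical.

```python
import numpy as np


def softmax(logits, axis=-1):
    # Subtract the max for numerical stability before exponentiating.
    shifted = logits - logits.max(axis=axis, keepdims=True)
    exp = np.exp(shifted)
    return exp / exp.sum(axis=axis, keepdims=True)


def logit_ensemble(member_logits):
    """Average logits over members, then apply softmax once.

    member_logits has shape (num_members, num_samples, num_classes).
    """
    return softmax(member_logits.mean(axis=0))


def probability_ensemble(member_logits):
    """Apply softmax per member, then average the probabilities."""
    return softmax(member_logits).mean(axis=0)


# Hypothetical toy input: 2 members, 1 sample, 3 classes.
member_logits = np.array([
    [[4.0, 0.0, 0.0]],  # member 1 is confident in class 0
    [[0.0, 1.0, 0.0]],  # member 2 mildly prefers class 1
])

p_logit = logit_ensemble(member_logits)
p_prob = probability_ensemble(member_logits)
print("logit ensemble:      ", p_logit)
print("probability ensemble:", p_prob)
```

Both outputs are valid probability vectors, but they generally differ whenever the members disagree, because softmax is nonlinear.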

```
@inproceedings{
buchanan2023the,
title={The Effects of Ensembling on Long-Tailed Data},
author={E. Kelly Buchanan and Geoff Pleiss and Yixin Wang and John Patrick Cunningham},
booktitle={NeurIPS 2023 Workshop Heavy Tails in Machine Learning},
year={2023}
}
```
Installation instructions are in [docs/README.md](docs/README.md).

## Experiments:
1. Train a resnet32 model on the CIFAR10 dataset.
```
python scripts/run.py --config-name="run_gpu_cifar10"
```
2. Train models on the CIFAR10-LT dataset across multiple losses.
```
wandb sweep experiments/compare_loss/train_gpu_loss_cifar10.yaml
```
3. Train additional models on CIFAR10-LT.
```
wandb sweep experiments/compare_loss/train_gpu_loss_cifar10_largeM.yaml
```

## Paper Experiments
| Wandb Experiment | Parameters | Comments |
|:---:|:---:|:---:|
| [nggmmw4m](https://wandb.ai/ekellbuch/uncategorized/sweeps/nggmmw4m), [0itowy8a](https://wandb.ai/ekellbuch/uncategorized/sweeps/0itowy8a), [d4s9wp4v](https://wandb.ai/ekellbuch/uncategorized/sweeps/d4s9wp4v) | Train resnet32 and resnet110 models on CIFAR10-LT using multiple losses and different seeds (IMBALANCECIFAR10). | Models trained using balanced softmax loss have the best performance. |
| [9hwaytks](https://wandb.ai/ekellbuch/longtail_ensembles-scripts/sweeps/9hwaytks), [gv4bucon](https://wandb.ai/ekellbuch/longtail_ensembles-scripts/sweeps/gv4bucon) | Train resnet32_cfa and resnet_110 on CIFAR100-LT using multiple losses and different seeds (IMBALANCECIFAR100Aug). | Models trained using balanced softmax loss have the best performance. |

## Reproduce paper tables and figures:
- [x] Fig: Ensemble size vs ensemble type across multiple losses:
```
python scripts/vis_scripts/plot_results_metric_M.py --config-path="../../results/configs/comparison_baseline_cifar10lt" --config-name="compare_M"
```
- [x] Table: Ensemble performance of models trained on CIFAR10-LT and CIFAR100-LT:
```
python scripts/compare_all_results.py --config-path="../results/configs/comparison_baseline_cifar10lt" --config-name="default"
python scripts/compare_all_results.py --config-path="../results/configs/comparison_baseline_cifar100lt" --config-name="default"
```
- [x] Fig: Class ID vs avg. disagreement:
```
python scripts/vis_scripts/plot_results_pclass.py
```
- [x] Fig: Class ID vs diversity/dependency:
```
python scripts/vis_scripts/plot_results_dkl_diff.py
```
- [x] Fig: Performance of logit and probability ensembles on balanced datasets:
```
python scripts/vis_scripts/plot_single_metric_xy.py --datasets=base --metric=error
```
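
As a rough illustration of what a "Class ID vs avg. disagreement" plot could measure, here is a small sketch under stated assumptions. It takes disagreement to be the fraction of ensemble-member pairs whose predicted labels differ on a sample, averaged over the samples of each true class. This is one common definition and may not match the metric computed by `plot_results_pclass.py`; the function name and toy arrays are hypothetical.

```python
import numpy as np


def per_class_disagreement(member_preds, labels, num_classes):
    """Mean pairwise disagreement among members, grouped by true class.

    member_preds: int array of shape (num_members, num_samples).
    labels: int array of shape (num_samples,) with true class IDs.
    Returns an array of shape (num_classes,); classes without samples are NaN.
    """
    num_members = member_preds.shape[0]
    if num_members < 2:
        raise ValueError("need at least two ensemble members")

    # For every pair of members, mark the samples where their predictions differ.
    pair_diffs = [
        member_preds[i] != member_preds[j]
        for i in range(num_members)
        for j in range(i + 1, num_members)
    ]
    # Fraction of disagreeing pairs for each sample.
    per_sample = np.mean(pair_diffs, axis=0)

    out = np.full(num_classes, np.nan)
    for c in range(num_classes):
        mask = labels == c
        if mask.any():
            out[c] = per_sample[mask].mean()
    return out


# Hypothetical toy input: 3 members, 4 samples, 2 classes.
member_preds = np.array([
    [0, 0, 1, 1],
    [0, 1, 1, 1],
    [0, 0, 1, 0],
])
labels = np.array([0, 0, 1, 1])
disagreement = per_class_disagreement(member_preds, labels, num_classes=2)
print(disagreement)
```

On a long-tailed dataset, plotting this vector against class ID (ordered from head to tail classes) shows whether members disagree more on rare classes.
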
## References:
- Balanced Meta Softmax: [github.com/jiawei-ren/BalancedMetaSoftmax-Classification](https://github.com/jiawei-ren/BalancedMetaSoftmax-Classification)