https://github.com/feedzai/fair-automl

Repo for the paper "Promoting Fairness through Hyperparameter Optimization" @ ICDM 2021

Host: GitHub
URL: https://github.com/feedzai/fair-automl
Owner: feedzai
License: other
Created: 2020-10-06T13:42:25.000Z (over 5 years ago)
Default Branch: main
Last Pushed: 2022-02-15T10:40:13.000Z (over 4 years ago)
Last Synced: 2025-04-30T04:49:13.921Z (about 1 year ago)
Language: Jupyter Notebook
Homepage:
Size: 24 MB
Stars: 10
Watchers: 4
Forks: 0
Open Issues: 0
Metadata Files:
- Readme: README.md
- License: LICENSE.md


# Promoting Fairness through Hyperparameter Optimization

This repository contains ML artifacts and other materials from the experiments performed for the [paper](https://arxiv.org/pdf/2103.12715.pdf).

## Key Contributions

- An approach for promoting model fairness that can be easily plugged into current ML pipelines with no extra development or computational cost.

- A set of competitive fairness-aware HO algorithms for multi-objective optimization of the fairness-accuracy trade-off that are agnostic to both the explored hyperparameter space and the objective metrics.

- Strong empirical evidence that hyperparameter optimization (HO) is an effective way to navigate the fairness-accuracy trade-off.

- A heuristic to automatically set the fairness-accuracy trade-off parameter.

- Competitive results on a real-world fraud detection use case, as well as on three datasets from the fairness literature (Adult, COMPAS, Donors Choose).
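
One common way to let a tuner rank configurations on two objectives at once is to fold them into a single weighted score. The sketch below illustrates that idea only; it assumes a convex combination of the two metrics, and the function names, the parameter name `alpha`, and the toy numbers are illustrative rather than taken from this repository or the paper.

```python
from typing import Dict, List


def scalarize(accuracy: float, fairness: float, alpha: float) -> float:
    """Convex combination of two higher-is-better metrics in [0, 1].

    alpha = 1.0 ranks by accuracy only, alpha = 0.0 by fairness only.
    (Assumed form of the trade-off; not copied from the paper.)
    """
    if not 0.0 <= alpha <= 1.0:
        raise ValueError("alpha must be in [0, 1]")
    return alpha * accuracy + (1.0 - alpha) * fairness


def select_best(configs: List[Dict[str, float]], alpha: float) -> Dict[str, float]:
    """Return the configuration with the highest scalarized score."""
    return max(configs, key=lambda c: scalarize(c["accuracy"], c["fairness"], alpha))


# Toy, made-up evaluations used only to exercise the functions above.
candidates = [
    {"id": 0, "accuracy": 0.90, "fairness": 0.40},
    {"id": 1, "accuracy": 0.80, "fairness": 0.80},
    {"id": 2, "accuracy": 0.60, "fairness": 0.95},
]
```

Calling `select_best(candidates, alpha)` with different values of `alpha` shows how the preferred configuration shifts as the weight moves between the two metrics.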

## Repository Structure

- [`data`](data) contains detailed artifacts generated from each experiment;

  - `all_tuner_iters_evals_.csv.gz` contains all HO iterations from all tuners for each dataset;

  - `_non-aggregated-results.csv` contains one row per HO run, for all tuners except TPE and FairTPE;

  - `all-datasets-with-TPE-tuner_non-aggregated-results.csv` contains one row per HO run for TPE and FairTPE (all datasets in the same file);

  - `results_all_datasets.csv` contains one row per HO run for all tuners, for all datasets;

  - `AOF-EG-experiment_non-aggregated-results.csv` contains data from the EG experiment (adding the Exponentiated Gradient reduction, a bias-reduction method, to the search space);

- [`code`](code) contains miscellaneous Jupyter notebooks used for the paper;

  - [`code/plots.ipynb`](code/plots.ipynb) generates plots for all datasets from the provided data files;

  - [`code/stats.ipynb`](code/stats.ipynb) computes validation/test results for each experiment, as well as p-values of statistical difference between hyperparameter tuners;

- [`imgs`](imgs) contains all generated plots for all datasets (all plots from the paper plus a few that didn't make it due to space);

- [`hyperparameters`](hyperparameters) contains details on the hyperparameter search space used for all HO tasks.
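
The files under `data` are plain or gzip-compressed CSVs, so they can be inspected without running the notebooks. The helper below is a minimal sketch that uses only the Python standard library; `load_rows` is a hypothetical name, and no column names are assumed, since the schema is not documented here.

```python
import csv
import gzip
from pathlib import Path
from typing import Dict, List, Union


def load_rows(path: Union[str, Path]) -> List[Dict[str, str]]:
    """Read a CSV file (optionally gzip-compressed) into a list of row dicts."""
    path = Path(path)
    # Pick the gzip opener for *.gz files, the built-in open otherwise.
    opener = gzip.open if path.suffix == ".gz" else open
    # newline="" lets the csv module handle line endings itself.
    with opener(path, mode="rt", newline="", encoding="utf-8") as handle:
        return list(csv.DictReader(handle))


# Example (run from the repository root):
# rows = load_rows("data/results_all_datasets.csv")
# print(len(rows), list(rows[0].keys()))
```

All values come back as strings; convert the columns you need to numbers after checking the header row.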

## Fairband: Selected Fairness-Accuracy Trade-off, broken down by Model Type

![EG Experiment on AOF dataset](imgs/AOF/AOF_fairness_performance_selected_by_model_type.png)

- Plot for the EG experiment on the Adult dataset [here](imgs/Adult/Adult_fairness_performance_selected_by_model_type.png).

- _Experiment:_ running Fairband (15 runs) on the AOF and Adult datasets, supplied with the following model choices: Neural Network (NN), Random Forest (RF), Decision Tree (DT), Logistic Regression (LR), LightGBM (LGBM), and Exponentiated Gradient reduction for fair classification (EG).

- EG is a state-of-the-art bias reduction method available in [fairlearn](https://github.com/fairlearn/fairlearn).

- As shown by the plot, **blindly applying bias reduction techniques may lead to suboptimal fairness-accuracy trade-offs**. In this example, EG is dominated by LGBM models on the AOF dataset, and by NN models on the Adult dataset. Fairband should be used in conjunction with a wide portfolio of model choices to achieve fairness.
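
"Dominated" above is used in the Pareto sense: one model dominates another when it is at least as good on both fairness and accuracy and strictly better on at least one. The following sketch shows that check for points given as `(accuracy, fairness)` pairs; it assumes both metrics are higher-is-better, and the names `dominates` and `pareto_front` are illustrative rather than taken from the repository.

```python
from typing import List, Tuple

Point = Tuple[float, float]  # (accuracy, fairness), both higher-is-better


def dominates(a: Point, b: Point) -> bool:
    """True if `a` is no worse than `b` on both metrics and strictly better on one."""
    return a[0] >= b[0] and a[1] >= b[1] and (a[0] > b[0] or a[1] > b[1])


def pareto_front(points: List[Point]) -> List[Point]:
    """Keep only the points that no other point dominates."""
    return [p for p in points if not any(dominates(q, p) for q in points)]
```

Applied to one point per trained model, `pareto_front` returns the models that represent non-dominated trade-offs; any model missing from the result is beaten on both metrics by some other model.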

## Citing

```
@inproceedings{cruz2021promoting,
    title={Promoting Fairness through Hyperparameter Optimization},
    author={Cruz, Andr{\'{e}} F. and Saleiro, Pedro and Bel{\'{e}}m, Catarina and Soares, Carlos and Bizarro, Pedro},
    booktitle={2021 {IEEE} International Conference on Data Mining ({ICDM})},
    year={2021},
    pages={1036-1041},
    publisher={{IEEE}},
    url={https://doi.org/10.1109/ICDM51629.2021.00119},
    doi={10.1109/ICDM51629.2021.00119}
}
```
