{"id":22848421,"url":"https://github.com/feedzai/fair-automl","last_synced_at":"2025-07-05T09:06:10.847Z","repository":{"id":100232610,"uuid":"301740230","full_name":"feedzai/fair-automl","owner":"feedzai","description":"Repo for the paper \"Promoting Fairness through Hyperparameter Optimization\" @ ICDM 2021","archived":false,"fork":false,"pushed_at":"2022-02-15T10:40:13.000Z","size":25212,"stargazers_count":10,"open_issues_count":0,"forks_count":0,"subscribers_count":4,"default_branch":"main","last_synced_at":"2025-04-30T04:49:13.921Z","etag":null,"topics":[],"latest_commit_sha":null,"homepage":"","language":"Jupyter Notebook","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"other","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/feedzai.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE.md","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null}},"created_at":"2020-10-06T13:42:25.000Z","updated_at":"2024-11-27T09:27:32.000Z","dependencies_parsed_at":"2023-05-13T00:15:31.285Z","dependency_job_id":null,"html_url":"https://github.com/feedzai/fair-automl","commit_stats":null,"previous_names":[],"tags_count":0,"template":false,"template_full_name":null,"purl":"pkg:github/feedzai/fair-automl","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/feedzai%2Ffair-automl","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/feedzai%2Ffair-automl/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/feedzai%2Ffair-automl/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/feedzai%2Ffair-automl/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owne
rs/feedzai","download_url":"https://codeload.github.com/feedzai/fair-automl/tar.gz/refs/heads/main","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/feedzai%2Ffair-automl/sbom","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":263715326,"owners_count":23500242,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":[],"created_at":"2024-12-13T04:12:09.061Z","updated_at":"2025-07-05T09:06:10.828Z","avatar_url":"https://github.com/feedzai.png","language":"Jupyter Notebook","funding_links":[],"categories":[],"sub_categories":[],"readme":"# Promoting Fairness through Hyperparameter Optimization\n\nThis repository contains ML artifacts and other materials from the experiments performed on the [paper](https://arxiv.org/pdf/2103.12715.pdf).\n\n## Key Contributions\n\n- An approach for promoting model fairness that can be easily plugged into current ML pipelines with no extra development or computational cost.\n- A set of competitive fairness-aware HO algorithms for multi-objective optimization of the fairness-accuracy trade-off that are agnostic to both the explored hyperparameter space and the objective metrics.\n- Strong empirical evidence that hyperparameter optimization (HO) is an effective way to navigate the fairness-accuracy trade-off.\n- A heuristic to automatically set the fairness-accuracy trade-off parameter.\n- Competitive results on a real-world fraud detection use case, as well as on three datasets from the fairness literature (Adult, COMPAS, Donors Choose).\n\n\n## Repository Structure\n\n- [`data`](data) contains 
detailed artifacts generated from each experiment;\n  - `all_tuner_iters_evals_\u003cdataset\u003e.csv.gz` contains all HO iterations from all tuners for each dataset;\n  - `\u003cdataset\u003e_non-aggregated-results.csv` contains one row per HO run, for all tuners except TPE and FairTPE;\n  - `all-datasets-with-TPE-tuner_non-aggregated-results.csv` contains one row per HO run for TPE and FairTPE (all datasets in the same file);\n  - `results_all_datasets.csv` contains one row per HO run for all tuners, for all datasets;\n  - `AOF-EG-experiment_non-aggregated-results.csv` contains data from the EG experiment (adding Exponentiated Gradient reduction, a bias-reduction method, to the search space);\n- [`code`](code) contains miscellaneous Jupyter notebooks used for the paper;\n  - [`code/plots.ipynb`](code/plots.ipynb) generates plots for all datasets from the provided data files;\n  - [`code/stats.ipynb`](code/stats.ipynb) computes validation/test results for each experiment, as well as p-values for statistical differences between hyperparameter tuners;\n- [`imgs`](imgs) contains all generated plots for all datasets (all plots from the paper plus a few that were omitted for lack of space);\n- [`hyperparameters`](hyperparameters) contains details on the hyperparameter search space used for all HO tasks;\n\n\n## Fairband: Selected Fairness-Accuracy Trade-off, discriminated by Model Type\n\n![EG Experiment on AOF dataset](imgs/AOF/AOF_fairness_performance_selected_by_model_type.png)\n\n- Plot for the EG experiment on the Adult dataset [here](imgs/Adult/Adult_fairness_performance_selected_by_model_type.png).\n- _Experiment:_ running Fairband (15 runs) on the AOF and Adult datasets, supplied with the following model choices: Neural Network (NN), Random Forest (RF), Decision Tree (DT), Logistic Regression (LR), LightGBM (LGBM), and Exponentiated Gradient reduction for fair classification (EG).\n- EG is a state-of-the-art bias-reduction method available at 
[fairlearn](https://github.com/fairlearn/fairlearn).\n- As shown by the plot, **blindly applying bias reduction techniques may lead to suboptimal fairness-accuracy trade-offs**. In this example, EG is dominated by LGBM models on the AOF dataset, and by NN models on the Adult dataset. Fairband should be used in conjunction with a wide portfolio of model choices to achieve fairness.\n\n\n## Citing\n```\n@inproceedings{cruz2021promoting,\n    title={Promoting Fairness through Hyperparameter Optimization},\n    author={Cruz, Andr{\\'{e}} F. and Saleiro, Pedro and Bel{\\'{e}}m, Catarina and Soares, Carlos and Bizarro, Pedro},\n    booktitle={2021 {IEEE} International Conference on Data Mining ({ICDM})},   \n    year={2021},\n    pages={1036-1041},\n    publisher={{IEEE}},\n    url={https://doi.org/10.1109/ICDM51629.2021.00119},\n    doi={10.1109/ICDM51629.2021.00119}\n}\n```\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Ffeedzai%2Ffair-automl","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Ffeedzai%2Ffair-automl","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Ffeedzai%2Ffair-automl/lists"}