https://github.com/storopoli/udacity-compas
Udacity's ML Engineer nanodegree capstone project - COMPAS Fair Classifier trained/tuned/deployed in AWS SageMaker
- Host: GitHub
- URL: https://github.com/storopoli/udacity-compas
- Owner: storopoli
- License: MIT
- Created: 2020-06-05T15:52:48.000Z (over 5 years ago)
- Default Branch: master
- Last Pushed: 2020-06-07T20:22:32.000Z (over 5 years ago)
- Last Synced: 2024-11-20T03:41:49.849Z (11 months ago)
- Topics: aws, binary-classification, classification, fairness, fairness-ml, gradient-boosting, machine-learning, sagemaker, xgboost
- Language: Jupyter Notebook
- Size: 1.1 MB
- Stars: 0
- Watchers: 2
- Forks: 0
- Open Issues: 0
Metadata Files:
- Readme: README.md
- License: LICENSE
README
# COMPAS Fair Classifier
This is a Capstone Project for Udacity's [Machine Learning Engineer nanodegree](https://www.udacity.com/course/machine-learning-engineer-nanodegree--nd009t). The goal is to **train/tune/deploy a fair binary classifier** for recidivism using [COMPAS data](https://github.com/propublica/compas-analysis).
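As a rough sketch of the setup (not necessarily the project's exact preprocessing), the two-year recidivism file from the ProPublica repository can be loaded and filtered along the lines of ProPublica's own analysis; the column names below follow that CSV, and the feature selection is purely illustrative:

```python
import pandas as pd

# ProPublica's two-year recidivism data (column names follow that CSV).
url = (
    "https://raw.githubusercontent.com/propublica/"
    "compas-analysis/master/compas-scores-two-years.csv"
)
df = pd.read_csv(url)

# Filter roughly as in ProPublica's analysis: keep rows with a screening
# date within 30 days of arrest, a known recidivism flag, and drop
# ordinary traffic offenses.
df = df[
    (df["days_b_screening_arrest"].abs() <= 30)
    & (df["is_recid"] != -1)
    & (df["c_charge_degree"] != "O")
]

# Illustrative feature set; binary target is rearrest within two years.
X = df[["age", "priors_count", "c_charge_degree", "race", "sex"]]
y = df["two_year_recid"]
```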
## COMPAS 2016 Scandal
In May 2016, [ProPublica published a report](https://www.propublica.org/article/machine-bias-risk-assessments-in-criminal-sentencing) on a judicial decision-support algorithm that outputs a risk score for defendant recidivism. The algorithm is called COMPAS, short for *Correctional Offender Management Profiling for Alternative Sanctions*. ProPublica showed that COMPAS is strongly biased against African-American offenders when compared to Caucasian offenders with the same prior and subsequent offenses.
## Model
The model employed was [SageMaker's `XGBoost`](https://docs.aws.amazon.com/sagemaker/latest/dg/xgboost.html), tuned to maximize MAP (mean average precision) in order to address the unbalanced false positive rates of the original COMPAS model.
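A minimal sketch of such a tuning-and-deployment job with the SageMaker Python SDK (written against SDK v2, which postdates this notebook); the IAM role, S3 bucket, and channel paths are placeholders:

```python
import sagemaker
from sagemaker.estimator import Estimator
from sagemaker.inputs import TrainingInput
from sagemaker.tuner import ContinuousParameter, IntegerParameter, HyperparameterTuner

session = sagemaker.Session()
role = "arn:aws:iam::123456789012:role/SageMakerRole"  # placeholder IAM role

# Built-in XGBoost container image for the session's region.
image = sagemaker.image_uris.retrieve(
    "xgboost", session.boto_region_name, version="1.5-1"
)

xgb = Estimator(
    image_uri=image,
    role=role,
    instance_count=1,
    instance_type="ml.m5.large",
    output_path="s3://my-bucket/compas/output",  # placeholder bucket
    sagemaker_session=session,
)
xgb.set_hyperparameters(objective="binary:logistic", eval_metric="map", num_round=200)

# Tune to maximize mean average precision on the validation channel.
tuner = HyperparameterTuner(
    estimator=xgb,
    objective_metric_name="validation:map",
    objective_type="Maximize",
    hyperparameter_ranges={
        "eta": ContinuousParameter(0.01, 0.3),
        "max_depth": IntegerParameter(3, 10),
        "min_child_weight": ContinuousParameter(1, 10),
    },
    max_jobs=20,
    max_parallel_jobs=2,
)
tuner.fit({
    "train": TrainingInput("s3://my-bucket/compas/train.csv", content_type="text/csv"),
    "validation": TrainingInput("s3://my-bucket/compas/validation.csv", content_type="text/csv"),
})

# Deploy the best model from the tuning job behind a real-time endpoint.
predictor = tuner.deploy(initial_instance_count=1, instance_type="ml.t2.medium")
```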
## Results
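The fairness comparison below rests on the per-group false positive rate, FPR = FP / (FP + TN): the share of people who did *not* reoffend but were flagged as high risk. A minimal sketch of computing it per race group (the arrays here are toy values; in practice they come from the held-out test set and the deployed endpoint):

```python
import numpy as np
import pandas as pd

def false_positive_rate(y_true, y_pred):
    """FPR = FP / (FP + TN): share of non-recidivists wrongly flagged."""
    y_true, y_pred = np.asarray(y_true), np.asarray(y_pred)
    fp = np.sum((y_pred == 1) & (y_true == 0))
    tn = np.sum((y_pred == 0) & (y_true == 0))
    return fp / (fp + tn)

# Toy example inputs, aligned per defendant.
results = pd.DataFrame({
    "race":   ["African-American"] * 4 + ["Caucasian"] * 4,
    "y_true": [0, 0, 1, 1, 0, 0, 1, 1],  # actual two-year recidivism
    "y_pred": [1, 0, 1, 1, 0, 0, 1, 0],  # model's high-risk flag
})
fpr_by_group = results.groupby("race")[["y_true", "y_pred"]].apply(
    lambda g: false_positive_rate(g["y_true"], g["y_pred"])
)
print(fpr_by_group)  # a fair classifier keeps these rates close to equal
```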
### Original COMPAS (Dressel & Farid, 2018)
**Accuracy**: 60.6%
| | **African-American** | **Caucasian** |
| ----------------------- | -------------------- | ------------- |
| **False Positive Rate** | 40.4% | 25.4% |

### Proposed Model
**Accuracy**: 100%
| | **African-American** | **Caucasian** |
| ----------------------- | -------------------- | ------------- |
| **False Positive Rate** | 0% | 0% |

## Author
Jose Storopoli, PhD - [ORCID](https://orcid.org/0000-0002-0559-5176) - [CV](https://storopoli.github.io)
[thestoropoli@gmail.com](mailto:thestoropoli@gmail.com)
## References
Dressel, J., & Farid, H. (2018). The accuracy, fairness, and limits of predicting recidivism. *Science Advances*, *4*(1), eaao5580. https://doi.org/10.1126/sciadv.aao5580