Ecosyste.ms: Awesome
An open API service indexing awesome lists of open source software.
https://github.com/sergioisidoro/aequalis
Discrimination free Naive Bayes - Replication of methods in the research paper Calders10
- Host: GitHub
- URL: https://github.com/sergioisidoro/aequalis
- Owner: sergioisidoro
- License: gpl-2.0
- Created: 2015-11-14T23:10:42.000Z (about 9 years ago)
- Default Branch: master
- Last Pushed: 2016-01-18T22:16:11.000Z (almost 9 years ago)
- Last Synced: 2024-10-05T20:41:13.174Z (3 months ago)
- Language: Python
- Homepage:
- Size: 1.86 MB
- Stars: 3
- Watchers: 2
- Forks: 3
- Open Issues: 0
Metadata Files:
- Readme: README.MD
- License: LICENSE
Awesome Lists containing this project
README
## Discrimination-free Naive Bayes
This is a project for a seminar on fairness-aware machine learning (Autumn 2015).
It aims to implement some of the methods described in the paper Calders10 (see the research folder),
introducing ways to make a Naive Bayes classifier non-discriminatory.

## Code
Most of the code can be found in `bayes.py`.
Some auxiliary functions for the models can be found in `bayes_utils` (with some
credit to Jason Brownlee -
http://machinelearningmastery.com/naive-bayes-classifier-scratch-python/ -
as some of his code was used to reduce boilerplate).

# Binary Bayes Model
A simple Naive Bayes model; a minimal illustrative sketch follows.
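The README does not show the implementation itself, so here is a rough sketch of what a plain Naive Bayes classifier for categorical features and a binary label can look like. The class and method names are hypothetical and are not taken from `bayes.py`:

```python
import math
from collections import Counter, defaultdict

class SimpleNaiveBayes:
    """Toy Naive Bayes for categorical features and a binary (0/1) label."""

    def fit(self, rows, labels):
        self.total = len(labels)
        self.class_counts = Counter(labels)
        # feature_counts[label][feature_index][value] -> occurrence count
        self.feature_counts = defaultdict(lambda: defaultdict(Counter))
        # feature_values[feature_index] -> set of values seen (for smoothing)
        self.feature_values = defaultdict(set)
        for row, y in zip(rows, labels):
            for i, value in enumerate(row):
                self.feature_counts[y][i][value] += 1
                self.feature_values[i].add(value)
        return self

    def predict(self, row):
        best_label, best_score = None, float("-inf")
        for y, class_count in self.class_counts.items():
            # log prior + log likelihoods with add-one (Laplace) smoothing
            score = math.log(class_count / self.total)
            for i, value in enumerate(row):
                count = self.feature_counts[y][i][value]
                score += math.log((count + 1) / (class_count + len(self.feature_values[i])))
            if score > best_score:
                best_label, best_score = y, score
        return best_label
```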
# Split Fair Bayes Model (2M Model)
This model splits the dataset into `n` subsets, one for each of the values
of the sensitive parameter that we do not want to discriminate against. By
creating a model for each of these subsets, it minimizes the discrimination
of the model; a minimal sketch follows.
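As an illustration only (the names below are hypothetical; the real logic lives in `bayes.py`), the splitting idea can be sketched as a thin wrapper around any classifier with a `fit`/`predict` interface, such as the toy one above:

```python
from collections import defaultdict

class SplitFairBayes:
    """Trains one sub-model per value of the sensitive attribute (the "2M" idea)."""

    def __init__(self, make_model):
        # make_model: factory returning an object with fit(rows, labels) / predict(row)
        self.make_model = make_model
        self.models = {}

    def fit(self, rows, labels, sensitive):
        # Partition the training data by sensitive-attribute value.
        grouped = defaultdict(lambda: ([], []))
        for row, y, s in zip(rows, labels, sensitive):
            grouped[s][0].append(row)
            grouped[s][1].append(y)
        # Train an independent model on each subset.
        for s, (sub_rows, sub_labels) in grouped.items():
            self.models[s] = self.make_model().fit(sub_rows, sub_labels)
        return self

    def predict(self, row, s):
        # Route the sample to the model trained on its own group.
        return self.models[s].predict(row)
```

With the toy classifier from the previous sketch this could be used as `SplitFairBayes(SimpleNaiveBayes).fit(rows, labels, sensitive)` followed by `model.predict(row, s)`; because each sub-model only ever sees one group, the sensitive attribute carries no information within that model.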
# Balanced Bayes Model (MODIFIED BAYES)
This model balances the dataset, making it fairer, by tweaking the likelihoods:
it changes the number of occurrences stored in the model. It does so, however,
without disturbing the overall probability of classifying a sample x as the
positive class (e.g. if we talk about loan attribution, we want to keep the
total number of loans the same as before); a rough sketch of the rebalancing
idea follows.
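The sketch below shows only the core count-moving idea under simplifying assumptions (two groups, a fixed step size, no re-training between steps); it is not the repo's actual procedure, and all names are made up:

```python
def discrimination(counts):
    """counts[s][c]: number of training samples with sensitive value s and class c."""
    def pos_rate(s):
        return counts[s][1] / (counts[s][0] + counts[s][1])
    return pos_rate("priv") - pos_rate("unpriv")

def rebalance(counts, step=1.0):
    """Shift positive counts between groups until the discrimination is gone.

    The total number of positives stays constant, mirroring the
    "keep the total number of loans the same" constraint described above.
    """
    while discrimination(counts) > 0:
        counts["unpriv"][1] += step   # more positives for the unprivileged group
        counts["unpriv"][0] -= step
        counts["priv"][1] -= step     # fewer positives for the privileged group
        counts["priv"][0] += step
    return counts

# Example: 4000 privileged samples at a 40% positive rate versus
# 4000 unprivileged samples at a 20% positive rate.
counts = {"priv": {0: 2400.0, 1: 1600.0}, "unpriv": {0: 3200.0, 1: 800.0}}
print(rebalance(counts))   # both groups end up at a 30% positive rate
```

Moving positive counts between the groups, rather than adding or removing them, is what keeps the total number of positive predictions unchanged while the gap between the groups shrinks.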
# RESULTS

```
NORMAL MODEL:
Accuracy: 78.4165591794
Discrimination score: 0.398512741018

2M MODEL:
Accuracy: 78.5332596278
Discrimination score: 0.165316892258

MODIFIED MODEL:
Accuracy: 75.9412812481
Discrimination score: -0.0101233590263
```