https://github.com/praveenpolisetty/fraudcost
- Host: GitHub
- URL: https://github.com/praveenpolisetty/fraudcost
- Owner: praveenpolisetty
- License: mit
- Created: 2026-06-04T11:38:10.000Z (13 days ago)
- Default Branch: main
- Last Pushed: 2026-06-04T11:59:31.000Z (13 days ago)
- Last Synced: 2026-06-04T13:16:19.344Z (13 days ago)
- Topics: anti-money-laundering, calibration, cost-sensitive-learning, financial-ml, fraud-detection, imbalanced-classification, machine-learning, python
- Language: Python
- Size: 9.77 KB
- Stars: 0
- Watchers: 0
- Forks: 0
- Open Issues: 0
Metadata Files:
- Readme: README.md
- Contributing: CONTRIBUTING.md
- License: LICENSE
- Citation: CITATION.cff
# fraudcost — cost-aware thresholding & calibration for fraud models
**Your fraud model is probably pointed at the wrong number.** Most detectors are tuned for AUC, but a
deployed system decides at a threshold where the two kinds of error cost very different amounts.
`fraudcost` is a tiny, dependency-light library that takes any classifier's scores and gives you:
- **Example-dependent cost** scoring (false negative = transaction amount, false positive = admin cost),
- **Probability calibration** (Platt / isotonic) so scores behave like probabilities,
- **Cost-optimal threshold** selection, and
- The **cost-vs-recall operating curve** that makes the trade-off legible.
No new model. It wraps the scores you already have.
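
To make the list above concrete, here is a minimal NumPy sketch of example-dependent cost and a brute-force threshold sweep. It is not `fraudcost`'s implementation: the helper names `total_cost` and `sweep` are hypothetical, and the cost matrix (missed fraud costs the amount, a false alarm costs one review, correct decisions cost nothing) is an assumption based on the description above.

```python
import numpy as np

def total_cost(y, scores, amounts, threshold, admin_cost=5.0):
    """Example-dependent cost of flagging every score >= threshold.

    Assumed cost matrix (not taken from fraudcost's source):
      false negative -> the transaction amount is lost
      false positive -> one admin_cost review
      true positive / true negative -> 0
    """
    y = np.asarray(y).astype(bool)
    flagged = np.asarray(scores) >= threshold
    missed_fraud = y & ~flagged
    false_alarm = ~y & flagged
    return float(np.asarray(amounts)[missed_fraud].sum() + admin_cost * false_alarm.sum())

def sweep(y, scores, amounts, admin_cost=5.0):
    """Return (threshold, cost, recall) rows, one per distinct score."""
    y = np.asarray(y).astype(bool)
    rows = []
    for t in np.unique(scores):
        flagged = np.asarray(scores) >= t
        recall = (y & flagged).sum() / max(y.sum(), 1)
        rows.append((float(t), total_cost(y, scores, amounts, t, admin_cost), float(recall)))
    return rows

# Toy data: six transactions, two of them fraudulent.
y = np.array([0, 0, 1, 0, 1, 0])
scores = np.array([0.05, 0.20, 0.30, 0.40, 0.90, 0.10])
amounts = np.array([20.0, 15.0, 8.0, 60.0, 500.0, 30.0])

rows = sweep(y, scores, amounts)
best_t, best_cost, best_recall = min(rows, key=lambda r: r[1])
print(best_t, best_cost, best_recall)
```

The rows returned by `sweep` are also the raw material for a cost-vs-recall plot: each threshold contributes one (recall, cost) point.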
> Companion research: *Paying for Precision: A Cost-Aware Evaluation of Card-Fraud Models on the
> IEEE-CIS Benchmark* (paper in preparation). `examples/ieee_cis.py` reproduces the paper's numbers.
## Install
```bash
pip install fraudcost # once published
# or, from source:
pip install -e .
```
## Quickstart
```python
import numpy as np
from fraudcost import CostModel, best_threshold, calibrate, cost_recall_curve
# Inputs (not defined here), one set each for the calibration and test slices:
#   y_*: 0/1 labels, scores_*: model probabilities, amounts_*: transaction amounts
cm = CostModel(admin_cost=5.0)  # $5 per review (false positive / flagged)

# Fit the calibration map on the calibration slice, then apply it to the test scores.
cal = calibrate(scores_calib, y_calib, method="isotonic")
scores_cal = cal(scores_test)

# Pick the threshold on the calibration slice, then evaluate once on test.
t = best_threshold(y_calib, cal(scores_calib), amounts_calib, cm)
report = cm.evaluate(y_test, scores_cal, amounts_test, t)
print(report)  # {'cost': ..., 'false_positives': ..., 'recall': ..., 'threshold': t}

curve = cost_recall_curve(y_test, scores_cal, amounts_test, cm)  # for plotting
```
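
The quickstart expects six arrays: labels, scores, and amounts for a calibration slice and for a test slice. If you only want to try the API, one way to get them is synthetic data. The sketch below is an illustration only: the fraud rate, the Beta score distributions, and the log-normal amounts are all made-up assumptions, not properties of any real dataset.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 10_000

# Roughly 3% fraud; purely synthetic, for illustration only.
y = (rng.random(n) < 0.03).astype(int)
# Scores in [0, 1] that skew higher for fraud (Beta draws stand in for a model).
scores = np.where(y == 1, rng.beta(4, 2, n), rng.beta(1, 6, n))
# Right-skewed transaction amounts.
amounts = rng.lognormal(mean=4.0, sigma=1.0, size=n)

# Half for calibration and threshold selection, half for the final report.
idx = rng.permutation(n)
calib, test = idx[: n // 2], idx[n // 2 :]
y_calib, scores_calib, amounts_calib = y[calib], scores[calib], amounts[calib]
y_test, scores_test, amounts_test = y[test], scores[test], amounts[test]
```

Keeping the two slices disjoint matters: a threshold chosen and evaluated on the same rows will look cheaper than it is.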
## Why it exists
Calibration and cost-sensitive thresholding are two decades old (Elkan 2001; Zadrozny & Elkan 2002;
Bahnsen et al. 2015), yet they rarely make it into how fraud models are *evaluated and reported*, where
AUC still rules. `fraudcost` makes the cost-aware lens a one-import step so teams can recalibrate and
rethreshold often — cheap, fast, and far less risky than retraining.
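
The expected-cost rule from Elkan (2001) shows why calibration and thresholding belong together. As a worked sketch, assume a calibrated fraud probability $p_i$ for transaction $i$ with amount $a_i$, a review cost $C_a$, and zero cost for correct decisions (that last point is an assumption; only the two error costs are named above). Flagging then costs $(1 - p_i)\,C_a$ in expectation and passing costs $p_i\,a_i$, so flagging is the cheaper action when:

$$
p_i\,a_i > (1 - p_i)\,C_a \quad\Longleftrightarrow\quad p_i > \frac{C_a}{a_i + C_a}
$$

With $C_a = 5$, a transaction of 500 is worth reviewing once $p_i$ exceeds $5/505 \approx 0.0099$, while a transaction of 20 needs $p_i > 5/25 = 0.2$. The rule only holds if $p_i$ behaves like a real probability, which is why calibration comes before thresholding. If every flagged transaction incurs the review cost, true positives included, the bound becomes $p_i > C_a / a_i$. This is the textbook per-transaction rule and not necessarily what `best_threshold` computes.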
## API
| Function | Purpose |
|----------|---------|
| `CostModel(admin_cost, fn_cost="amount")` | defines the example-dependent cost matrix |
| `calibrate(scores, y, method)` | returns a calibration map (`"platt"` or `"isotonic"`) |
| `best_threshold(y, scores, amounts, cost_model)` | cost-minimizing threshold |
| `CostModel.evaluate(y, scores, amounts, t)` | cost / FP / recall at a threshold |
| `cost_recall_curve(y, scores, amounts, cost_model)` | DataFrame for the operating curve |
| `expected_calibration_error(p, y)` | ECE for reliability checks |
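
For readers unfamiliar with ECE, here is a minimal sketch of the common equal-width-bin definition: the weighted average, over probability bins, of the gap between mean predicted probability and observed positive rate. The function name `ece`, the `n_bins` parameter, and the binning scheme are assumptions for illustration; `expected_calibration_error` may bin differently.

```python
import numpy as np

def ece(p, y, n_bins=10):
    """Expected calibration error with equal-width bins on [0, 1]."""
    p = np.asarray(p, dtype=float)
    y = np.asarray(y, dtype=float)
    edges = np.linspace(0.0, 1.0, n_bins + 1)
    # Bin index in [0, n_bins - 1]; p == 1.0 lands in the last bin.
    bins = np.clip(np.digitize(p, edges[1:-1]), 0, n_bins - 1)
    total = 0.0
    for b in range(n_bins):
        mask = bins == b
        if mask.any():
            gap = abs(p[mask].mean() - y[mask].mean())
            total += mask.mean() * gap  # weight by the share of samples in the bin
    return float(total)

# Scores of 0.15 for negatives and 0.85 for positives are each off by 0.15.
example_ece = ece([0.15, 0.15, 0.85, 0.85], [0, 0, 1, 1])
# A bin predicting 0.5 with a 50% positive rate is perfectly calibrated.
calibrated_ece = ece([0.5, 0.5], [0, 1])
print(example_ece, calibrated_ece)
```

Comparing this number before and after calibration on a held-out slice is a quick reliability check.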
## Reproduce the paper
```bash
python examples/ieee_cis.py --data_dir /path/to/ieee-cis --ca 5
```
## Roadmap
- [ ] AML / graph support (Elliptic dataset example)
- [ ] sklearn-compatible `CostAwareClassifier` wrapper
- [ ] cost-curve plotting helpers
- [ ] PyPI release
## Citing
If you use `fraudcost`, please cite the companion paper (see `CITATION.cff`).
## Contributing
Issues and PRs welcome — see `CONTRIBUTING.md`. Licensed under MIT.