https://github.com/shichenxie/scorecardpy
Scorecard development in Python
- Host: GitHub
- URL: https://github.com/shichenxie/scorecardpy
- Owner: ShichenXie
- License: mit
- Created: 2018-04-24T03:31:58.000Z (over 7 years ago)
- Default Branch: master
- Last Pushed: 2024-08-09T16:17:06.000Z (about 1 year ago)
- Last Synced: 2025-04-03T18:11:18.259Z (6 months ago)
- Topics: binning, credit-scoring, python, release, scorecard, woe, woebinning
- Language: Python
- Homepage: http://shichen.name/scorecard
- Size: 195 KB
- Stars: 738
- Watchers: 35
- Forks: 304
- Open Issues: 36
Metadata Files:
- Readme: README.md
- Changelog: NEWS.md
- Funding: .github/FUNDING.yml
- License: LICENSE
# scorecardpy
This package is a Python version of the R package [scorecard](https://github.com/ShichenXie/scorecard).
Its goal is to make developing traditional credit risk scorecard models easier and more efficient by providing functions for common tasks:
- data partition (`split_df`)
- variable selection (`iv`, `var_filter`)
- weight of evidence (woe) binning (`woebin`, `woebin_plot`, `woebin_adj`, `woebin_ply`)
- scorecard scaling (`scorecard`, `scorecard_ply`)
- performance evaluation (`perf_eva`, `perf_psi`)

## Installation

- Install the release version of `scorecardpy` from [PyPI](https://pypi.org/project/scorecardpy/) with:
```
pip install scorecardpy
```

- Install the latest version of `scorecardpy` from [GitHub](https://github.com/shichenxie/scorecardpy) with:
```
pip install git+https://github.com/shichenxie/scorecardpy.git
```

## Example

This is a basic example which shows you how to develop a common credit risk scorecard:
``` python
# Traditional Credit Scoring Using Logistic Regression
import scorecardpy as sc

# data preparation ------
# load germancredit data
dat = sc.germancredit()

# filter variables via missing rate, iv, identical value rate
dt_s = sc.var_filter(dat, y="creditability")

# split dt_s into train and test
train, test = sc.split_df(dt_s, 'creditability').values()

# woe binning ------
bins = sc.woebin(dt_s, y="creditability")
# sc.woebin_plot(bins)

# binning adjustment
# # adjust breaks interactively
# breaks_adj = sc.woebin_adj(dt_s, "creditability", bins)
# # or specify breaks manually
breaks_adj = {
    'age.in.years': [26, 35, 40],
    'other.debtors.or.guarantors': ["none", "co-applicant%,%guarantor"]
}
bins_adj = sc.woebin(dt_s, y="creditability", breaks_list=breaks_adj)

# convert train and test into woe values
train_woe = sc.woebin_ply(train, bins_adj)
test_woe = sc.woebin_ply(test, bins_adj)

y_train = train_woe.loc[:, 'creditability']
X_train = train_woe.loc[:, train_woe.columns != 'creditability']
y_test = test_woe.loc[:, 'creditability']
X_test = test_woe.loc[:, test_woe.columns != 'creditability']

# logistic regression ------
from sklearn.linear_model import LogisticRegression
lr = LogisticRegression(penalty='l1', C=0.9, solver='saga', n_jobs=-1)
lr.fit(X_train, y_train)
# lr.coef_
# lr.intercept_

# predicted probability
train_pred = lr.predict_proba(X_train)[:, 1]
test_pred = lr.predict_proba(X_test)[:, 1]

# performance ks & roc ------
train_perf = sc.perf_eva(y_train, train_pred, title="train")
test_perf = sc.perf_eva(y_test, test_pred, title="test")

# score ------
card = sc.scorecard(bins_adj, lr, X_train.columns)
# credit score
train_score = sc.scorecard_ply(train, card, print_step=0)
test_score = sc.scorecard_ply(test, card, print_step=0)

# psi
sc.perf_psi(
    score={'train': train_score, 'test': test_score},
    label={'train': y_train, 'test': y_test}
)
```
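
For readers new to the terms used above, the following is a minimal, standard-library sketch of how weight of evidence (WOE) and information value (IV) are commonly computed for one binned variable. It does not use or reproduce `scorecardpy` internals: the helper name `woe_iv`, the bin labels, and the good/bad counts are all made up for illustration, and the sign convention (log of bad share over good share) is an assumption, since tools differ on it. The sketch also assumes every bin contains at least one good and one bad observation.

``` python
import math

def woe_iv(bins):
    """Compute per-bin WOE and total IV.

    bins: list of (label, good_count, bad_count) tuples.
    Hypothetical helper for illustration, not part of scorecardpy.
    """
    total_good = sum(good for _, good, _ in bins)
    total_bad = sum(bad for _, _, bad in bins)
    rows = []
    total_iv = 0.0
    for label, good, bad in bins:
        good_share = good / total_good   # share of all goods in this bin
        bad_share = bad / total_bad      # share of all bads in this bin
        # Assumed convention: WOE = ln(bad share / good share)
        woe = math.log(bad_share / good_share)
        iv_part = (bad_share - good_share) * woe
        total_iv += iv_part
        rows.append((label, woe, iv_part))
    return rows, total_iv

# Made-up counts for an age-like variable (not taken from germancredit)
example_bins = [
    ("<26", 100, 60),
    ("[26,35)", 250, 110),
    ("[35,40)", 150, 50),
    (">=40", 200, 80),
]
rows, total_iv = woe_iv(example_bins)
for label, woe, iv_part in rows:
    print(f"{label:>8}  woe={woe:+.4f}  iv={iv_part:.4f}")
print(f"total IV = {total_iv:.4f}")
```

Under this convention a positive WOE marks a bin with a higher-than-average share of bads, and each bin's IV contribution is non-negative regardless of the sign convention chosen.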