https://github.com/s1dewalker/credit-risk-modeling-in-python
Credit risk modeling | EDA | Python | SQL | Model Validation and Tuning | Classifier
bagging-ensemble classifier credit-card credit-risk credit-risk-analysis exploratory-data-analysis model-validation predictive-modeling python random-forest risk risk-modelling sql tuning
- Host: GitHub
- URL: https://github.com/s1dewalker/credit-risk-modeling-in-python
- Owner: s1dewalker
- Created: 2024-10-30T00:45:52.000Z (12 months ago)
- Default Branch: main
- Last Pushed: 2024-12-19T13:12:43.000Z (10 months ago)
- Last Synced: 2025-03-30T23:14:32.309Z (6 months ago)
- Topics: bagging-ensemble, classifier, credit-card, credit-risk, credit-risk-analysis, exploratory-data-analysis, model-validation, predictive-modeling, python, random-forest, risk, risk-modelling, sql, tuning
- Language: Jupyter Notebook
- Homepage:
- Size: 1.86 MB
- Stars: 0
- Watchers: 1
- Forks: 0
- Open Issues: 0
Metadata Files:
- Readme: README.md
README
# Credit Risk Modeling in Python
## Exploratory data analysis (EDA) on credit data and credit risk modeling
### [Python](https://github.com/s1dewalker/Credit-Risk-Modeling-in-Python/blob/main/credit_risk_modeling-2.ipynb) : EDA + Credit Risk Modeling + Model Validation + Tuning
### [SQL](https://github.com/s1dewalker/Credit-Risk-Modeling-in-Python/blob/main/SQLQuery_cr_loan2.sql) : EDA + Data Cleaning
**EDA**: Exploring the data, removing duplicate rows with `drop_duplicates()`, finding anomalies and outliers, handling missing values with `fillna()` or `dropna()`, and building pivot-style summaries with `crosstab`.
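The EDA steps above can be sketched on a small toy frame. This is a minimal illustration, not the notebook's actual code: the column names (`person_age`, `person_income`, `loan_intent`, `loan_status`), the values, and the age cutoff of 100 are all assumptions made up for the example.

```python
import numpy as np
import pandas as pd

# Hypothetical toy data; column names and values are assumptions for illustration.
loans = pd.DataFrame({
    "person_age": [25, 25, 144, 31, 40, 52],
    "person_income": [48000, 48000, 52000, np.nan, 61000, 75000],
    "loan_intent": ["EDUCATION", "EDUCATION", "MEDICAL", "MEDICAL", "PERSONAL", None],
    "loan_status": [0, 0, 1, 1, 0, 0],  # 1 = default, 0 = no default
})

# Remove exact duplicate rows.
loans = loans.drop_duplicates()

# Treat implausible ages as anomalies and drop them (the cutoff is an assumption).
loans = loans[loans["person_age"] <= 100].copy()

# Fill missing numeric values with the column median.
loans["person_income"] = loans["person_income"].fillna(loans["person_income"].median())

# Drop rows where a categorical field is missing.
loans = loans.dropna(subset=["loan_intent"])

# Pivot-style summary: counts of each loan status per loan intent.
summary = pd.crosstab(loans["loan_intent"], loans["loan_status"])
print(summary)
```

Whether to fill or drop a missing value depends on the column: a median fill keeps the row when a numeric field is missing, while dropping may be the safer choice when a category cannot be sensibly guessed.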
**Risk Modeling**: Using `RandomForestClassifier` with error metrics such as recall and F1-score. Dealing with underfitting (high training error) and overfitting (testing error >> training error). Validating models with cross-validation.
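As a rough sketch of the modeling workflow described above, the following fits a `RandomForestClassifier` on synthetic data, compares training and test recall to look for over- or underfitting, and runs 5-fold cross-validation. The synthetic dataset, the hyperparameters (`n_estimators=100`, `max_depth=5`), and the split sizes are assumptions for illustration, not the settings used in the notebook.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import f1_score, recall_score
from sklearn.model_selection import cross_val_score, train_test_split

# Synthetic, imbalanced stand-in for credit data (roughly 20% positives).
X, y = make_classification(
    n_samples=1000, n_features=10, weights=[0.8, 0.2], random_state=0
)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, stratify=y, random_state=0
)

# Illustrative hyperparameters; limiting tree depth is one way to curb overfitting.
model = RandomForestClassifier(n_estimators=100, max_depth=5, random_state=0)
model.fit(X_train, y_train)

# A large gap between training and test scores suggests overfitting;
# low scores on both suggest underfitting.
train_recall = recall_score(y_train, model.predict(X_train))
test_recall = recall_score(y_test, model.predict(X_test))
test_f1 = f1_score(y_test, model.predict(X_test))

# 5-fold cross-validated recall on the training split.
cv_recall = cross_val_score(model, X_train, y_train, cv=5, scoring="recall")

print(f"train recall={train_recall:.2f}, test recall={test_recall:.2f}, test F1={test_f1:.2f}")
print(f"CV recall: mean={cv_recall.mean():.2f}, std={cv_recall.std():.2f}")
```

Passing `scoring="recall"` makes cross-validation report the same metric used for the final evaluation, rather than the default accuracy.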
## Analysing the 5 Cs of credit
- Character - borrower's credit history / creditworthiness, customer segmentation, demographics, card type, usage
- Capacity - income (history of stable income)
- Capital - savings, investments
- Collateral - loan, tenure
- Conditions - purpose of credit, economy, employment type

## Error metrics
Confusion matrix:
For credit card data, **recall** is the most important metric, since we want to minimize false negatives (FN), that is, actual frauds that the model failed to flag.
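To make the role of false negatives concrete, here is a small hand-computed example. The labels below are invented for illustration (1 = fraud, 0 = not fraud); the counts are tallied in plain Python so each confusion-matrix cell is explicit.

```python
# Hypothetical labels: 1 = fraud (positive class), 0 = not fraud.
y_true = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
y_pred = [1, 1, 0, 0, 0, 0, 0, 0, 0, 1]

pairs = list(zip(y_true, y_pred))
tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # frauds caught
fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # frauds missed
fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # false alarms
tn = sum(1 for t, p in pairs if t == 0 and p == 0)  # correctly cleared

# Recall: share of actual frauds that were flagged.
recall = tp / (tp + fn)
# Accuracy: share of all predictions that were correct.
accuracy = (tp + tn) / len(pairs)

print(f"TP={tp} FN={fn} FP={fp} TN={tn}")  # TP=2 FN=2 FP=1 TN=5
print(f"recall={recall:.2f}, accuracy={accuracy:.2f}")  # recall=0.50, accuracy=0.70
```

Here accuracy looks acceptable at 0.70 even though half of the actual frauds were missed, which is why recall is the more informative metric when false negatives are costly.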