https://github.com/dagiteferi/credit-scoring-model
Developing a comprehensive and effective credit scoring model for Bati Bank, leveraging data from an eCommerce platform to enable a buy-now-pay-later service for customers. This project involves understanding credit scoring methodologies, exploratory data analysis (EDA), feature engineering, model training and evaluation, and API development for re
https://github.com/dagiteferi/credit-scoring-model
credit-scoring fastapi mlops
Last synced: 2 months ago
JSON representation
Developing a comprehensive and effective credit scoring model for Bati Bank, leveraging data from an eCommerce platform to enable a buy-now-pay-later service for customers. This project involves understanding credit scoring methodologies, exploratory data analysis (EDA), feature engineering, model training and evaluation, and API development for re
- Host: GitHub
- URL: https://github.com/dagiteferi/credit-scoring-model
- Owner: dagiteferi
- Created: 2025-01-22T17:14:22.000Z (6 months ago)
- Default Branch: main
- Last Pushed: 2025-03-11T21:37:31.000Z (4 months ago)
- Last Synced: 2025-04-01T00:48:16.291Z (4 months ago)
- Topics: credit-scoring, fastapi, mlops
- Language: Jupyter Notebook
- Homepage: https://credit-scoring-frontend.onrender.com/
- Size: 3.49 MB
- Stars: 2
- Watchers: 1
- Forks: 0
- Open Issues: 0
-
Metadata Files:
- Readme: README.md
Awesome Lists containing this project
README
# Credit Scoring Model for Bati Bank

*Empowering Financial Inclusion with AI-Driven Credit Risk Assessment*---
## 📖 Table of Contents
- [Introduction](#introduction)
- [✨ Features](#-features)
- [📂 Project Structure](#-project-structure)
- [⚙️ Installation](#️-installation)
- [🚀 Usage](#-usage)
- [Running the Backend](#running-the-backend)
- [Using the Frontend](#using-the-frontend)
- [Making API Predictions](#making-api-predictions)
- [🔍 Exploratory Data Analysis (EDA)](#-exploratory-data-analysis-eda)
- [🛠️ Feature Engineering](#️-feature-engineering)
- [🤖 Model Training and Evaluation](#-model-training-and-evaluation)
- [🔮 Model Explainability](#-model-explainability)
- [🌐 API Development](#-api-development)
- [💻 Frontend Interface](#-frontend-interface)
- [🚀 Deployment](#-deployment)
- [🎯 Challenges and Solutions](#-challenges-and-solutions)
- [🔮 Future Work](#-future-work)
- [🤝 Contributing](#-contributing)---
## Introduction
The **Credit Scoring Model for Bati Bank** is an AI-powered platform designed to assess credit risk using eCommerce transaction data. This solution enables financial inclusion through:
- **📈 Accurate Predictions**: Random Forest model achieves **ROC-AUC: 0.9998**
- **🔍 Transparent Decisions**: SHAP explanations and feature importance visualizations
- **⚡ Real-Time Processing**: FastAPI backend with <100ms response times
- **📱 Mobile-First Interface**: Responsive design accessible on all devices---
## ✨ Features
- **Automated Data Pipelines**
- RFMS scoring (Recency, Frequency, Monetary, Score)
- WoE encoding for categorical features
- **Advanced Modeling**
- Hyperparameter-tuned Random Forest & Logistic Regression
- Cross-validation with stratified sampling
- **Production-Ready Deployment**
- Dockerized environment
- CI/CD pipeline with GitHub Actions
- **User-Centric Interface**
- Dual form system (Quick/Detailed assessment)
- Interactive risk visualization dashboard---
## 📂 Project Structure
~~~bash
dagiteferi-credit-scoring-model/
├── 📁 credit_scoring_app/ # FastAPI backend
├── 📁 models/ # Serialized ML models
├── 📁 notebooks/ # Jupyter analysis notebooks
├── 📁 scripts/ # Data processing scripts
├── 📁 static/ # CSS/JS assets
└── 📁 tests/ # Unit/integration tests
~~~---
## ⚙️ Installation
```bash
git clone https://github.com/your-repo/dagiteferi-credit-scoring-model.git
cd dagiteferi-credit-scoring-model
python3 -m venv venv
source venv/bin/activate # Linux/Mac
venv\Scripts\activate # Windows
pip install -r requirements.txt## 🚀 Usage
### Running the Backend
```bash
cd credit_scoring_app
uvicorn main:app --host 0.0.0.0 --port 8000
```
### Using the Frontend
Access at http://localhost:8000/static/index.html
#### Making API Predictions
```bash
curl -X POST "http://localhost:8000/predict/good" \
-H "Content-Type: application/json" \
-d '{
"TransactionId": 1,
"Amount": 0.05,
"FraudResult": 0
}'
```
## 🔍 Exploratory Data Analysis (EDA)**Key Insights:**
- 🎯 **Class Imbalance**: Only 0.2% fraud cases
- 📉 **Skewed Distributions**: Transaction amounts follow power law
- 🔗 **Strong Correlations**:
- `RFMS_score` ↔ `Total_Transaction_Amount` (ρ=0.89)
- `Transaction_Count` ↔ `Product_Variety` (ρ=0.76)---
## 🛠️ Feature Engineering
**Transformations Applied:**
1. **Temporal Features**
- Transaction hour/day/month
- Time since last transaction
2. **Aggregate Features**
- 30-day rolling transaction count
- Customer lifetime value---
## 🤖 Model Training and Evaluation
| Model | ROC-AUC | Precision | Recall | F1-Score |
|---------------------|---------|-----------|--------|----------|
| Random Forest | 0.9998 | 0.997 | 0.998 | 0.997 |
| Logistic Regression | 0.9962 | 0.982 | 0.961 | 0.971 |---
## 🔮 Model Explainability
**SHAP Analysis:**
- **Top Predictive Features**:
1. `Total_Transaction_Amount` (SHAP value: 1.42)
2. `RFMS_score` (SHAP value: 1.18)
3. `Transaction_Recency` (SHAP value: 0.76)---
## 🌐 API Development
**Endpoints:**
```python
@app.post("/predict/good")
async def predict_good_risk(data: CustomerData):
return predict(data, model_path="models/RandomForest_best_model.pkl")