https://github.com/razamehar/telco-customer-churn-prediction

Host: GitHub
URL: https://github.com/razamehar/telco-customer-churn-prediction
Owner: razamehar
Created: 2025-06-23T00:29:53.000Z (4 months ago)
Default Branch: main
Last Pushed: 2025-06-23T19:02:37.000Z (4 months ago)
Last Synced: 2025-06-23T20:20:36.350Z (4 months ago)
Language: Jupyter Notebook
Size: 5.87 MB
Stars: 0
Watchers: 0
Forks: 0
Open Issues: 0
Metadata Files:
- Readme: README.md

# Telco Customer Churn Prediction

This repository contains an end-to-end **MLOps project** built as a final assignment for the [MLOps Zoomcamp](https://github.com/DataTalksClub/mlops-zoomcamp) organized by **DataTalksClub** (site: https://datatalks.club/) and taught by **Alexey Grigorev**.

The project focuses on predicting customer churn for a telecom company using ML pipelines, data versioning, experiment tracking, and model deployment via FastAPI.

## Features

- Data ingestion from Kaggle's Telco Customer Churn dataset
- Data preprocessing (type conversion, NA handling, encoding)
- Train-test split and feature engineering
- Model training using Random Forest with MLflow logging
- Model evaluation with accuracy and precision metrics
- Model persistence with joblib
- Testing using `pytest`
- Code linting using `pylint`
- API deployment using FastAPI
- Experiment tracking using MLflow + DagsHub
- Feature scaling using StandardScaler
- Data versioning using DVC
- Docker containerization for a reproducible runtime environment
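
To make the two evaluation metrics listed above concrete, here is a minimal pure-Python sketch of how accuracy and precision are defined for binary churn labels. It is not the project's code: the function names and the sample labels are made up for illustration, and it assumes `1` marks a churned customer. scikit-learn provides the same metrics as `accuracy_score` and `precision_score`.

```python
from typing import Sequence


def confusion_counts(y_true: Sequence[int], y_pred: Sequence[int]) -> dict:
    """Count true/false positives and negatives for binary labels.

    Assumption: 1 = churn (positive class), 0 = no churn.
    """
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    counts = {"tp": 0, "fp": 0, "tn": 0, "fn": 0}
    for actual, predicted in zip(y_true, y_pred):
        if predicted == 1:
            counts["tp" if actual == 1 else "fp"] += 1
        else:
            counts["tn" if actual == 0 else "fn"] += 1
    return counts


def accuracy(y_true: Sequence[int], y_pred: Sequence[int]) -> float:
    """Share of all predictions that match the true label."""
    c = confusion_counts(y_true, y_pred)
    total = sum(c.values())
    return (c["tp"] + c["tn"]) / total if total else 0.0


def precision(y_true: Sequence[int], y_pred: Sequence[int]) -> float:
    """Share of predicted churners that actually churned."""
    c = confusion_counts(y_true, y_pred)
    predicted_positive = c["tp"] + c["fp"]
    return c["tp"] / predicted_positive if predicted_positive else 0.0


# Made-up labels for illustration only (not results from this project).
y_true = [1, 0, 1, 1, 0, 0, 1, 0]
y_pred = [1, 0, 0, 1, 0, 1, 1, 0]
print(accuracy(y_true, y_pred), precision(y_true, y_pred))
```

Precision is worth tracking next to accuracy on churn data because the classes are usually imbalanced: a model can score a high accuracy while flagging few real churners.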

## Example FastAPI Usage

### Root Endpoint
``` http
GET /
```
Response
``` json
{ "message": "This is the Telco Customer Churn Prediction API" }
```

### Predict Endpoint
``` http
POST /predict
```
Sample Request
``` json
{
  "gender": "Female",
  "tenure": 12,
  "MonthlyCharges": 80.5,
  "TotalCharges": 956.2
}
```

Response
``` json
{
  "prediction": 1
}
```
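
The request above can also be sent from Python. The following is a rough client sketch using only the standard library. It assumes the API is running locally on uvicorn's default address (`http://127.0.0.1:8000`) and that the four fields from the sample request are enough; the real schema may require more. The helper names `build_predict_request` and `predict` are hypothetical.

```python
import json
import urllib.request

# Assumption: the API runs locally on uvicorn's default host and port.
BASE_URL = "http://127.0.0.1:8000"


def build_predict_request(features: dict, base_url: str = BASE_URL) -> urllib.request.Request:
    """Build (but do not send) a JSON POST request for the /predict endpoint."""
    body = json.dumps(features).encode("utf-8")
    return urllib.request.Request(
        url=f"{base_url}/predict",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def predict(features: dict, base_url: str = BASE_URL, timeout: float = 10.0) -> int:
    """Send the request and return the integer from the "prediction" field."""
    request = build_predict_request(features, base_url)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        payload = json.load(response)
    return int(payload["prediction"])


# Field names copied from the sample request in this README.
sample = {
    "gender": "Female",
    "tenure": 12,
    "MonthlyCharges": 80.5,
    "TotalCharges": 956.2,
}

# Building the request needs no running server; sending it does.
request = build_predict_request(sample)
print(request.get_method(), request.full_url)
```

With the server running, `predict(sample)` returns the predicted class as an integer. The README does not say what each value means; presumably `1` indicates predicted churn.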

## Tech Stack

- **Python 3.10**
- **Data & Modeling**: pandas, scikit-learn, joblib
- **Experiment Tracking**: MLflow, DagsHub
- **API & Deployment**: FastAPI
- **Containerization**: Docker
- **Data Ingestion**: Kaggle (via kaggle CLI or KaggleHub)
- **Version Control**: DVC (Data Version Control)
- **Testing & Code Quality**: pytest, pylint

## MLflow Experiment Tracking

- Tracking URI: DagsHub MLflow Dashboard (https://dagshub.com/razamehar/telco-customer-churn-prediction.mlflow)
- Logged items:
  - Parameters: `max_depth`
  - Metrics: accuracy, precision
  - Artifacts: trained model (`.pkl`), scaler (`.pkl`)
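
As a sketch of how a run with these items could be logged, the snippet below uses MLflow's standard tracking calls. It is an illustration under assumptions, not the project's training script: the helper names, the numeric values, and the artifact file names `model.pkl` and `scaler.pkl` are hypothetical. `mlflow` is imported inside `log_run` so the record-building part works without it installed. Logging to DagsHub also needs credentials, which MLflow can read from the `MLFLOW_TRACKING_USERNAME` and `MLFLOW_TRACKING_PASSWORD` environment variables.

```python
from typing import Iterable

# Tracking URI as given in this README.
TRACKING_URI = "https://dagshub.com/razamehar/telco-customer-churn-prediction.mlflow"


def build_run_record(max_depth: int, accuracy: float, precision: float) -> dict:
    """Group the parameter and metrics named in the README into one record."""
    return {
        "params": {"max_depth": max_depth},
        "metrics": {"accuracy": accuracy, "precision": precision},
    }


def log_run(record: dict, artifact_paths: Iterable[str],
            tracking_uri: str = TRACKING_URI) -> None:
    """Log one run to MLflow (requires mlflow, credentials, and existing files)."""
    import mlflow  # imported lazily so the helper above has no dependency on it

    mlflow.set_tracking_uri(tracking_uri)
    with mlflow.start_run():
        mlflow.log_params(record["params"])
        mlflow.log_metrics(record["metrics"])
        for path in artifact_paths:
            mlflow.log_artifact(path)


# Placeholder values for illustration only (not results from this project).
record = build_run_record(max_depth=5, accuracy=0.8, precision=0.7)
print(record)

# Hypothetical file names; uncomment once the files exist and credentials are set:
# log_run(record, ["model.pkl", "scaler.pkl"])
```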

## Run Locally
1. Clone the repository
```bash
git clone https://github.com/razamehar/telco-customer-churn-prediction.git
cd telco-customer-churn-prediction
```
2. Install dependencies
```bash
pip install -r requirements.txt
```
3. Start FastAPI app
```bash
uvicorn app.main:app --reload
```

## Docker Support
### Build Docker Image
```bash
docker build -t churn-predictor .
```

### Run Docker Container
```bash
docker run -d -p 8000:8000 churn-predictor
```

## Run Tests
```bash
pytest tests/
```

## Run Linting
```bash
pylint app/*.py
```

## Contact
For any questions or clarifications, please contact Raza Mehar at [raza.mehar@gmail.com](mailto:raza.mehar@gmail.com).
