https://github.com/l1ght14/customer-churn-prediction
Predict customer churn using machine learning models like Logistic Regression and Random Forest. Includes data preprocessing, model evaluation, feature importance, and insights to drive retention strategies.
churn-prediction classification customer-churn customer-churn-prediction data-analysis logistic-regression machine-learning python random-forest scikit-learn telecom
- Host: GitHub
- URL: https://github.com/l1ght14/customer-churn-prediction
- Owner: l1ght14
- License: mit
- Created: 2025-04-10T09:57:15.000Z (6 months ago)
- Default Branch: main
- Last Pushed: 2025-04-10T10:00:22.000Z (6 months ago)
- Last Synced: 2025-04-10T11:37:15.023Z (6 months ago)
- Topics: churn-prediction, classification, customer-churn, customer-churn-prediction, data-analysis, logistic-regression, machine-learning, python, random-forest, scikit-learn, telecom
- Language: Jupyter Notebook
- Homepage:
- Size: 2.75 MB
- Stars: 0
- Watchers: 1
- Forks: 0
- Open Issues: 0
Metadata Files:
- Readme: README.md
- License: LICENSE
# Customer Churn Prediction
This project uses machine learning to predict customer churn based on service usage, contract type, billing method, and demographic details from the Telco Customer dataset.
## Dataset
- **Source**: [Kaggle - Telco Customer Churn](https://www.kaggle.com/datasets/blastchar/telco-customer-churn)
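- **Records**: 7032 customers (the raw Kaggle file has 7043 rows; 7032 remain once the 11 rows with a blank `TotalCharges` are dropped)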
- **Target**: `Churn` (Yes/No)
## Project Goals
- Predict whether a customer is likely to churn
- Identify the most important features influencing churn
- Compare different classification models
## Key Steps
- Data cleaning (handling TotalCharges nulls)
- Label and one-hot encoding for categorical features
- Train-test split with stratification
- Model training: Logistic Regression & Random Forest
- Evaluation using accuracy, recall, and F1-score
- Feature importance visualization
## Results
| Model | Accuracy | Recall (Churn) | F1-Score (Churn) |
|---------------------|----------|----------------|------------------|
| Logistic Regression | 79.9% | 57% | 60% |
| Random Forest | 78.5% | 50% | 55% |

> Logistic Regression outperformed Random Forest on accuracy as well as on recall and F1 for the churn class.
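
The Key Steps and the metrics reported above could be reproduced along the following lines. This is a minimal sketch, not the notebook's actual code: it runs on a small synthetic frame built by a hypothetical `make_toy_telco` helper, so its numbers will not match the table. The column names follow the Kaggle file; the 80/20 split, the `StandardScaler` step, and all hyperparameters are assumptions.

```python
import numpy as np
import pandas as pd
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score, f1_score, recall_score
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler


def make_toy_telco(n=400, seed=0):
    """Synthetic stand-in for the Telco CSV (made-up data, real column names)."""
    rng = np.random.default_rng(seed)
    tenure = rng.integers(0, 72, size=n)
    monthly = rng.uniform(20, 120, size=n).round(2)
    contract = rng.choice(["Month-to-month", "One year", "Two year"], size=n)
    # Like the real file, TotalCharges is text and blank for brand-new customers.
    total = np.where(tenure == 0, " ", (tenure * monthly).round(2).astype(str))
    # Made-up churn rule so the models have some signal to learn.
    p = 0.15 + 0.35 * (contract == "Month-to-month") + 0.25 * (tenure < 12)
    return pd.DataFrame({
        "tenure": tenure,
        "MonthlyCharges": monthly,
        "TotalCharges": total,
        "Contract": contract,
        "InternetService": rng.choice(["DSL", "Fiber optic", "No"], size=n),
        "PaymentMethod": rng.choice(["Electronic check", "Mailed check"], size=n),
        "Churn": np.where(rng.random(n) < p, "Yes", "No"),
    })


def preprocess(df):
    """Clean TotalCharges, label-encode the target, one-hot encode the rest."""
    df = df.copy()
    # Blank strings become NaN, then those rows are dropped.
    df["TotalCharges"] = pd.to_numeric(df["TotalCharges"], errors="coerce")
    df = df.dropna(subset=["TotalCharges"])
    y = (df["Churn"] == "Yes").astype(int)  # 1 = churned
    X = pd.get_dummies(df.drop(columns=["Churn"]), drop_first=True, dtype=float)
    return X, y


def evaluate(models, X_train, X_test, y_train, y_test):
    """Fit each model and collect accuracy plus churn-class recall and F1."""
    rows = {}
    for name, model in models.items():
        pred = model.fit(X_train, y_train).predict(X_test)
        rows[name] = {
            "accuracy": accuracy_score(y_test, pred),
            "recall_churn": recall_score(y_test, pred, pos_label=1),
            "f1_churn": f1_score(y_test, pred, pos_label=1),
        }
    return pd.DataFrame(rows).T


df = make_toy_telco()
X, y = preprocess(df)

# stratify=y keeps the churn rate similar in the train and test sets.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, stratify=y, random_state=42
)

models = {
    "Logistic Regression": make_pipeline(
        StandardScaler(), LogisticRegression(max_iter=1000)
    ),
    "Random Forest": RandomForestClassifier(n_estimators=200, random_state=42),
}
results = evaluate(models, X_train, X_test, y_train, y_test)
print(results.round(3))
```

To run the same sketch on the real data, replace `make_toy_telco()` with `pd.read_csv("WA_Fn-UseC_-Telco-Customer-Churn.csv")` and drop the `customerID` column before encoding.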
## Feature Insights
Top predictors of churn:
- TotalCharges
- Tenure
- MonthlyCharges
- Contract Type
- Internet Service Type
- Payment Method
## Folder Structure

    customer-churn-prediction/
    ├── models/
    │   ├── logistic_model.pkl
    │   └── random_forest_model.pkl
    ├── churn_prediction.ipynb
    ├── WA_Fn-UseC_-Telco-Customer-Churn.csv
    ├── README.md

## Tools Used
- Python
- Pandas, NumPy
- scikit-learn, joblib
- Matplotlib, Seaborn
## Author
Prakash Sharma