https://github.com/rebeccamorolong/mtn-customer-churn-prdiction
This project addresses a real-world business problem: predicting customer churn for MTN, a major telecommunications company. Customer churn significantly impacts profitability, and accurate predictions enable targeted retention efforts.
- Host: GitHub
- URL: https://github.com/rebeccamorolong/mtn-customer-churn-prdiction
- Owner: RebeccaMorolong
- Created: 2025-05-06T16:22:12.000Z (6 months ago)
- Default Branch: main
- Last Pushed: 2025-08-05T09:52:14.000Z (3 months ago)
- Last Synced: 2025-08-17T01:34:40.130Z (3 months ago)
- Topics: anova-test, matplotlib-pyplot, numpy, pandas, python, seaborn
- Language: Jupyter Notebook
- Homepage:
- Size: 180 KB
- Stars: 0
- Watchers: 1
- Forks: 0
- Open Issues: 0
Metadata Files:
- Readme: README.md
README
# MTN Customer Churn Prediction
# 🎯 Business Problem
MTN observed a decline in customer retention, resulting in revenue loss. The objectives were to:
- Predict the likelihood of churn for each customer.
- Identify key drivers of churn.
- Provide actionable insights for retention strategies.
# 2️⃣ Statistical Framing
- Translated the business problem into a binary classification task (churn = 1, no churn = 0).
- Conducted Chi-Square Tests for independence between categorical variables and churn.
- Analyzed numeric feature distributions across churn classes.
- Created new features to capture customer tenure, service usage, and contract types.
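As an illustration of the Chi-Square Test of Independence described above, here is a minimal sketch using `pandas.crosstab` and `scipy.stats.chi2_contingency`. The data is synthetic, and the column names (`contract_type`, `churn`) and per-contract churn rates are assumptions made for the example, not values from the MTN dataset or the project notebooks.

```python
import numpy as np
import pandas as pd
from scipy.stats import chi2_contingency

# Synthetic stand-in data: column names and churn rates are assumptions,
# not taken from the project's dataset.
rng = np.random.default_rng(0)
n = 1000
contract = rng.choice(
    ["month-to-month", "one-year", "two-year"], size=n, p=[0.5, 0.3, 0.2]
)
assumed_churn_rate = {"month-to-month": 0.40, "one-year": 0.15, "two-year": 0.05}
churn = (rng.random(n) < np.array([assumed_churn_rate[c] for c in contract])).astype(int)
df = pd.DataFrame({"contract_type": contract, "churn": churn})

# Contingency table: rows are contract types, columns are churn classes (0/1).
table = pd.crosstab(df["contract_type"], df["churn"])

# Null hypothesis: contract type and churn are independent.
chi2, p_value, dof, expected = chi2_contingency(table)

print(table)
print(f"chi2={chi2:.2f}, dof={dof}, p={p_value:.3g}")
```

A small p-value suggests that churn rates differ across the categories. The test says nothing about the direction or size of the effect, so the contingency table itself is still worth inspecting.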
# 3️⃣ Modeling
- Baseline Model: Logistic Regression for interpretability.
- Advanced Models:
  - Random Forest Classifier
  - XGBoost Classifier
- Validation:
  - Stratified K-Fold Cross-Validation
  - Evaluation metrics: Accuracy, Precision, Recall, F1 Score, ROC-AUC
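The validation setup above can be sketched with scikit-learn as follows. This is a hedged example on synthetic data from `make_classification`, not the project's training script: the sample size, class balance, fold count, and hyperparameters are all assumptions. XGBoost is left out so the sketch only needs scikit-learn; if the `xgboost` package is installed, its scikit-learn-compatible `XGBClassifier` could be added to the `models` dictionary in the same way.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import StratifiedKFold, cross_validate
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# Synthetic, imbalanced binary data standing in for the churn dataset (assumption).
X, y = make_classification(
    n_samples=600, n_features=10, n_informative=5,
    weights=[0.75, 0.25], random_state=42,
)

models = {
    # Scaling is applied inside the pipeline so it is refit on each training fold.
    "logistic_regression": make_pipeline(
        StandardScaler(), LogisticRegression(max_iter=1000)
    ),
    "random_forest": RandomForestClassifier(n_estimators=100, random_state=42),
}

scoring = ["accuracy", "precision", "recall", "f1", "roc_auc"]

# Stratification keeps the churn/no-churn ratio roughly equal in every fold.
cv = StratifiedKFold(n_splits=5, shuffle=True, random_state=42)

results = {}
for name, model in models.items():
    scores = cross_validate(model, X, y, cv=cv, scoring=scoring)
    results[name] = {m: scores[f"test_{m}"].mean() for m in scoring}
    print(name, {m: round(v, 3) for m, v in results[name].items()})
```

Reporting the mean across folds gives a more stable comparison between models than a single train/test split, which matters when the positive class is a minority.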
# 4️⃣ Model Evaluation
- Compared model performance:
  - Logistic Regression ROC-AUC: ~0.78
  - Random Forest ROC-AUC: ~0.83
  - XGBoost ROC-AUC: ~0.88 (best performance)
- Plotted confusion matrices and ROC curves.
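To make the evaluation quantities concrete, the following dependency-free sketch shows how confusion-matrix counts and ROC-AUC can be computed from labels and predicted probabilities. The helper names (`confusion_counts`, `roc_auc`) and the four-row toy data are hypothetical and only illustrate the definitions. In practice a library such as scikit-learn would normally be used, and the figures reported above come from the project, not from this code.

```python
def confusion_counts(y_true, y_score, threshold=0.5):
    """Return (tn, fp, fn, tp) after thresholding scores at `threshold`."""
    tn = fp = fn = tp = 0
    for label, score in zip(y_true, y_score):
        predicted = 1 if score >= threshold else 0
        if label == 1 and predicted == 1:
            tp += 1
        elif label == 1 and predicted == 0:
            fn += 1
        elif label == 0 and predicted == 1:
            fp += 1
        else:
            tn += 1
    return tn, fp, fn, tp


def roc_auc(y_true, y_score):
    """ROC-AUC as the fraction of (churner, non-churner) pairs in which
    the churner receives the higher score; ties count as half."""
    pos = [s for label, s in zip(y_true, y_score) if label == 1]
    neg = [s for label, s in zip(y_true, y_score) if label == 0]
    if not pos or not neg:
        raise ValueError("ROC-AUC needs at least one example of each class")
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in pos
        for n in neg
    )
    return wins / (len(pos) * len(neg))


# Toy data (hypothetical): 1 = churn, 0 = no churn.
y_true = [0, 0, 1, 1]
y_score = [0.1, 0.4, 0.35, 0.8]

tn, fp, fn, tp = confusion_counts(y_true, y_score)
precision = tp / (tp + fp)
recall = tp / (tp + fn)
auc = roc_auc(y_true, y_score)

print(tn, fp, fn, tp)     # 2 0 1 1
print(precision, recall)  # 1.0 0.5
print(auc)                # 0.75
```

The confusion matrix depends on the chosen threshold, whereas ROC-AUC summarizes ranking quality across all thresholds, which is why the two are usually reported together.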
# 📊 Key Statistical Techniques
- Logistic Regression
- Chi-Square Test of Independence
- Cross-Validation
- ROC Curve Analysis
- Feature Importance Analysis
- XGBoost Gradient Boosting
# ✅ Outcomes and Business Impact
- Top churn predictors identified:
  - Contract type (month-to-month contracts have higher churn risk)
  - Tenure (newer customers are more likely to churn)
  - Monthly charges (higher charges correlate with churn)
- Developed a predictive model enabling MTN to:
  - Prioritize high-risk customers for targeted offers.
  - Improve retention campaigns and reduce churn-related losses.
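One simple way to turn churn probabilities into a prioritized contact list is sketched below. The `prioritize` helper, the customer IDs, the probabilities, and the 0.5 cut-off are all hypothetical and chosen for illustration. A real campaign would pick the threshold and list size from budget and offer cost.

```python
def prioritize(scores, top_k=None, min_probability=0.0):
    """Rank customers by predicted churn probability, highest first.

    `scores` maps customer ID -> predicted churn probability.
    Customers below `min_probability` are dropped; `top_k` caps the list size.
    """
    ranked = sorted(
        ((cid, p) for cid, p in scores.items() if p >= min_probability),
        key=lambda item: item[1],
        reverse=True,
    )
    return ranked if top_k is None else ranked[:top_k]


# Hypothetical model outputs for five customers.
scores = {"C001": 0.12, "C002": 0.87, "C003": 0.55, "C004": 0.91, "C005": 0.33}

targets = prioritize(scores, top_k=2, min_probability=0.5)
print(targets)  # [('C004', 0.91), ('C002', 0.87)]
```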
# 🚀 Next Steps
- Deploy the model as an API for real-time scoring.
- Integrate predictions into CRM workflows.
- Conduct A/B testing of retention interventions.
# 📂 Repository Contents
- notebooks/: Exploratory Data Analysis and Modeling
- data/: Cleaned dataset (anonymized sample)
- scripts/: Model training and evaluation scripts
- outputs/: Visualizations and performance metrics
# 🔗 Links
📂 Project Repository
📄 Detailed Report or Notebook
# 🙋‍♀️ Contact
Rebecca Morolong
LinkedIn
Email