https://github.com/ceodaniyal/telecom_customer_churn_prediction
A machine learning project that predicts whether a telecom customer will churn (leave the service) using customer demographics, account information, and service usage. The repository includes data preprocessing, model training (with logistic regression), feature scaling, and example predictions.
classification customer-churn-prediction data-science logistic-regression machine-learning ml-project pandas prediction python scikit-learn streamlit telecom
Last synced: about 1 month ago
- Host: GitHub
- URL: https://github.com/ceodaniyal/telecom_customer_churn_prediction
- Owner: ceodaniyal
- Created: 2026-02-08T19:27:46.000Z (4 months ago)
- Default Branch: main
- Last Pushed: 2026-02-09T06:36:21.000Z (4 months ago)
- Last Synced: 2026-05-04T01:39:47.188Z (about 1 month ago)
- Topics: classification, customer-churn-prediction, data-science, logistic-regression, machine-learning, ml-project, pandas, prediction, python, scikit-learn, streamlit, telecom
- Language: Jupyter Notebook
- Homepage: https://telecomcustomerchurnprediction-4ihq7aesvxe8dxfknyzysm.streamlit.app/
- Size: 797 KB
- Stars: 0
- Watchers: 0
- Forks: 0
- Open Issues: 0
Metadata Files:
- Readme: README.md
README
# Telecom Customer Churn Prediction
A machine learning project to predict customer churn for a telecom company using historical usage and demographic data. By analyzing customer attributes and behavior, this project builds a predictive model that identifies customers likely to leave (churn), helping businesses improve retention strategies and reduce revenue loss.
---
## Table of Contents
* [Problem Statement](#problem-statement)
* [Objective](#objective)
* [Dataset](#dataset)
* [Tech Stack](#tech-stack)
* [Methodology](#methodology)
* [Model Training & Evaluation](#model-training--evaluation)
* [Usage](#usage)
* [Project Structure](#project-structure)
* [Results & Insights](#results--insights)
* [Future Work](#future-work)
* [Contact](#contact)
---
## Problem Statement
Telecom companies face a major challenge in **customer churn**: existing customers discontinuing their service, often in favor of a competitor. Because acquiring new customers is significantly more expensive than retaining existing ones, predicting churn so that high-risk customers can be retained proactively is critical for profitability and strategic decision-making.
---
## Objective
Build a robust machine learning model that:
* Predicts whether a customer will churn or not.
* Identifies key factors contributing to churn.
* Supports data-driven customer retention strategies.
---
## Dataset
The project uses the `Telco_Customer_Churn.csv` dataset, containing customer information such as:
* Customer demographics (gender, senior citizen status, dependents)
* Account details (tenure, contract type, billing method)
* Service subscriptions (internet, tech support, online security)
* Financial details (monthly charges, total charges)
* Target variable: `Churn` (Yes/No)
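A minimal loading sketch for data of this shape follows. The three-row inline sample and its column names are assumptions modeled on the public Telco churn data, not a copy of `Telco_Customer_Churn.csv`; the blank `TotalCharges` entry is included to show how a text-typed charge column could be coerced to numbers.

```python
import io

import pandas as pd

# Hypothetical three-row sample standing in for Telco_Customer_Churn.csv;
# the column names are assumptions, not verified against the real file.
sample_csv = io.StringIO(
    "gender,SeniorCitizen,tenure,Contract,MonthlyCharges,TotalCharges,Churn\n"
    "Female,0,1,Month-to-month,29.85,29.85,No\n"
    "Male,0,0,Two year,52.55, ,No\n"
    "Male,1,2,Month-to-month,70.70,151.65,Yes\n"
)
df = pd.read_csv(sample_csv)

# Coerce the charge column to numbers; entries that cannot be parsed
# (such as the blank one above) become NaN and can be handled explicitly.
df["TotalCharges"] = pd.to_numeric(df["TotalCharges"], errors="coerce")

# Map the Yes/No target to 1/0 for modeling.
df["Churn"] = df["Churn"].map({"Yes": 1, "No": 0})

print(df.dtypes)
print(df["TotalCharges"].isna().sum(), "missing TotalCharges value(s)")
```

With the real file, `pd.read_csv("Telco_Customer_Churn.csv")` would replace the inline sample, and the actual column names should be checked before reusing these steps.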
---
## Tech Stack
This project uses:
* **Python**
* **pandas**, **NumPy**
* **matplotlib**, **seaborn**
* **scikit-learn** for ML modeling
* **joblib / pickle** for model persistence
* Jupyter Notebook for experimentation
---
## Methodology
1. **Data Cleaning & Preprocessing**
* Handle missing values
* Encode categorical features
* Scale/Normalize numerical features using `MinMaxScaler` (saved as `minmax_scaler.joblib`)
2. **Exploratory Data Analysis (EDA)**
* Understand customer distribution by churn
* Analyze patterns across features like contract type and monthly charges
3. **Model Training**
* Train classification models
* Start with baseline models like Logistic Regression
* Save best model (`logistic_regression.pkl`)
4. **Evaluation**
* Accuracy, Precision, Recall
* Confusion matrix and other metrics
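The steps above can be sketched end to end as follows. This is an illustration on synthetic data, not the notebook's actual code: the column names, the made-up churn rule, and the choice to one-hot encode with `pd.get_dummies` are all assumptions.

```python
import numpy as np
import pandas as pd
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import (accuracy_score, confusion_matrix,
                             precision_score, recall_score)
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import MinMaxScaler

# Synthetic stand-in data (NOT the real dataset); column names are assumptions.
rng = np.random.default_rng(0)
n = 400
df = pd.DataFrame({
    "tenure": rng.integers(0, 73, n).astype(float),
    "MonthlyCharges": rng.uniform(20, 120, n),
    "Contract": rng.choice(["Month-to-month", "One year", "Two year"], n),
})
# Made-up rule so the toy target has some signal to learn.
score = (0.03 * (df["MonthlyCharges"] - 70) - 0.05 * (df["tenure"] - 36)
         + (df["Contract"] == "Month-to-month") - 0.5)
df["Churn"] = (rng.uniform(size=n) < 1 / (1 + np.exp(-score))).astype(int)

# 1. Encode categorical features as 0/1 indicator columns.
X = pd.get_dummies(df.drop(columns="Churn"), columns=["Contract"], dtype=float)
y = df["Churn"]

# 2. Split before scaling, so the scaler is fitted on training data only.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, stratify=y, random_state=0)

# 3. Min-max scale to [0, 1]; indicator columns already lie in that range.
scaler = MinMaxScaler()
X_train_scaled = scaler.fit_transform(X_train)
X_test_scaled = scaler.transform(X_test)

# 4. Train the baseline model and compute the metrics listed above.
model = LogisticRegression(max_iter=1000).fit(X_train_scaled, y_train)
y_pred = model.predict(X_test_scaled)
acc = accuracy_score(y_test, y_pred)
prec = precision_score(y_test, y_pred, zero_division=0)
rec = recall_score(y_test, y_pred, zero_division=0)
cm = confusion_matrix(y_test, y_pred, labels=[0, 1])

print(f"accuracy={acc:.3f} precision={prec:.3f} recall={rec:.3f}")
print(cm)
```

Fitting the scaler on the training split alone keeps test-set statistics out of preprocessing, which is why the split comes before scaling in this sketch.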
---
## Model Training & Evaluation
The Logistic Regression model is trained to classify customers as either:
* **Churn = Yes**
* **Churn = No**
The trained model and preprocessor are stored as:
* `logistic_regression.pkl`: trained ML model
* `minmax_scaler.joblib`: preprocessing scaler
You can evaluate performance on a hold-out test set or with cross-validation.
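As one possible way to run the cross-validation option, the sketch below wraps the scaler and the classifier in a scikit-learn pipeline so that scaling is refitted inside every fold. The data comes from `make_classification` as a stand-in, and the class ratio, fold count, and recall scoring are illustrative assumptions rather than settings taken from this project.

```python
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import StratifiedKFold, cross_val_score
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import MinMaxScaler

# Synthetic, imbalanced stand-in for the encoded churn features.
X, y = make_classification(n_samples=500, n_features=8,
                           weights=[0.73, 0.27], random_state=0)

# Scaler + model in one estimator: each fold fits the scaler on its own
# training portion, so no information leaks from the validation portion.
pipeline = make_pipeline(MinMaxScaler(), LogisticRegression(max_iter=1000))

# Stratified folds keep the churn/no-churn ratio similar in every split.
cv = StratifiedKFold(n_splits=5, shuffle=True, random_state=0)
scores = cross_val_score(pipeline, X, y, cv=cv, scoring="recall")

print(f"recall per fold: {scores.round(3)}")
print(f"mean={scores.mean():.3f} std={scores.std():.3f}")
```

Recall is used here because missed churners are usually the costly error in retention work; `scoring="accuracy"` or `"precision"` can be substituted.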
---
## Usage
### Run the Prediction Script
1. Clone the repository:
```bash
git clone https://github.com/ceodaniyal/telecom_customer_churn_prediction.git
cd telecom_customer_churn_prediction
```
2. Install dependencies:
```bash
pip install -r requirements.txt
```
3. Run the prediction script:
```bash
python main.py
```
### Prediction
Provide customer feature values via the script interface or API endpoint (if integrated) to get churn predictions.
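The sketch below shows how inference with the two saved artifacts might look. It is not the contents of `main.py`: the two-feature layout is hypothetical, and the stand-in artifacts are created in a temporary directory so the example runs on its own. It also assumes the `.pkl` file was written with `pickle` and the scaler with `joblib`, as the file extensions suggest.

```python
import pickle
import tempfile
from pathlib import Path

import joblib
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.preprocessing import MinMaxScaler

# --- Stand-in artifacts (hypothetical features: tenure, monthly charges) ---
rng = np.random.default_rng(0)
X = rng.uniform([0, 20], [72, 120], size=(200, 2))
y = (X[:, 1] - X[:, 0] > 40).astype(int)  # made-up churn rule
scaler = MinMaxScaler().fit(X)
model = LogisticRegression().fit(scaler.transform(X), y)

artifact_dir = Path(tempfile.mkdtemp())
joblib.dump(scaler, artifact_dir / "minmax_scaler.joblib")
with open(artifact_dir / "logistic_regression.pkl", "wb") as f:
    pickle.dump(model, f)

# --- Inference: load both artifacts, scale, then predict ---
loaded_scaler = joblib.load(artifact_dir / "minmax_scaler.joblib")
with open(artifact_dir / "logistic_regression.pkl", "rb") as f:
    loaded_model = pickle.load(f)

# Feature order must match the order used when the scaler was fitted.
new_customer = np.array([[2.0, 95.0]])
churn_probability = loaded_model.predict_proba(
    loaded_scaler.transform(new_customer))[0, 1]
label = "Yes" if churn_probability >= 0.5 else "No"

print(f"Churn = {label} (probability {churn_probability:.2f})")
```

Pickle and joblib files can execute arbitrary code when loaded, so only artifacts from a trusted source should be opened this way.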
---
## Project Structure
```
telecom_customer_churn_prediction/
├── Telco_Customer_Churn.csv                  # Churn dataset
├── telecom_customer_churn_prediction.ipynb   # Notebook with EDA & modeling
├── main.py                                   # Inference script
├── logistic_regression.pkl                   # Saved trained model
├── minmax_scaler.joblib                      # Preprocessing scaler
├── pyproject.toml                            # Project metadata / dependencies
├── .gitignore
└── README.md
```
---
## Results & Insights
Typical insights from this kind of churn analysis (general patterns, to be replaced with this project's measured results):
* **Monthly charges**, **contract type**, and **tenure** often correlate strongly with churn likelihood.
* Customers with **month-to-month contracts churn more** than those on long-term plans. ([GitHub][1])
* **Paperless billing customers** tend to show higher churn rates. ([GitHub][1])
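A pattern like the contract-type one above can be checked with a simple group-by. The ten rows below are invented toy data, not results from this project, and the `Contract` and `Churn` column names are assumptions; the point is only the shape of the computation.

```python
import pandas as pd

# Invented toy rows for illustration only (NOT measured results).
df = pd.DataFrame({
    "Contract": ["Month-to-month"] * 4 + ["One year"] * 4 + ["Two year"] * 2,
    "Churn": ["Yes", "Yes", "No", "Yes",
              "No", "Yes", "No", "No",
              "No", "No"],
})

# Share of churned customers within each contract type, highest first.
churn_rate = (
    df.assign(churned=df["Churn"].eq("Yes"))
      .groupby("Contract")["churned"]
      .mean()
      .sort_values(ascending=False)
)

print(churn_rate)
```

Running the same group-by on the real data, for example by `PaperlessBilling` if that column exists, would let the bullets above be replaced with measured rates.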
---
## Future Work
Future improvements could include:
* Feature engineering (interaction terms, tenure buckets, etc.)
* Hyperparameter tuning (GridSearch / RandomSearch)
* Ensemble methods like Random Forest / Gradient Boosting
* Handling class imbalance (SMOTE)
* Deployment (Flask/Streamlit app)
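As a rough sketch of two of these ideas, the example below tunes the regularization strength with `GridSearchCV` and compares plain against class-weighted logistic regression. Note the swap: class weighting stands in for SMOTE here because it needs no extra package (SMOTE lives in the separate `imbalanced-learn` library). The synthetic data, the grid values, and the F1 scoring are assumptions for illustration.

```python
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import GridSearchCV
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import MinMaxScaler

# Synthetic, imbalanced stand-in for the encoded churn features.
X, y = make_classification(n_samples=400, n_features=8,
                           weights=[0.75, 0.25], random_state=0)

pipe = Pipeline([
    ("scale", MinMaxScaler()),
    ("clf", LogisticRegression(max_iter=1000)),
])

# Step-name prefixes ("clf__") route each parameter to the right pipeline step.
param_grid = {
    "clf__C": [0.1, 1.0, 10.0],               # inverse regularization strength
    "clf__class_weight": [None, "balanced"],  # reweight the minority class
}

search = GridSearchCV(pipe, param_grid, scoring="f1", cv=5)
search.fit(X, y)

print("best params:", search.best_params_)
print(f"best mean F1: {search.best_score_:.3f}")
```

The same grid can be extended to other estimators, such as a random forest, by swapping the `clf` step and its parameter names.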
---
## Contact
Have questions or feedback? Reach out:
**Email:** [kdaniyal7865@gmail.com](mailto:kdaniyal7865@gmail.com)
[1]: https://github.com/Pradnya1208/Telecom-Customer-Churn-prediction "GitHub - Pradnya1208/Telecom-Customer-Churn-prediction"