https://github.com/alessandrobasigli/telco-customer-churn-prediction-ibm-dataset

This project explores customer churn trends for a company in California using an IBM dataset. Built in a Jupyter Notebook, it employs pandas, NumPy, matplotlib, seaborn, plotly, and scipy to clean, analyze, and visualize data. Predictive models were trained with scikit-learn using three main algorithms: Decision Tree, Naive Bayes, and Random Forest.

churn-prediction-models customer-churn-prediction decision-tree ibm-dataset jupyter-notebook matplotlib naive-bayes numpy pandas plotly predictive-modeling random-forest scipy seaborn


Host: GitHub
URL: https://github.com/alessandrobasigli/telco-customer-churn-prediction-ibm-dataset
Owner: alessandrobasigli
License: mit
Created: 2025-04-10T12:09:29.000Z (26 days ago)
Default Branch: main
Last Pushed: 2025-04-10T22:35:06.000Z (25 days ago)
Last Synced: 2025-04-10T22:48:43.626Z (25 days ago)
Topics: churn-prediction-models, customer-churn-prediction, decision-tree, ibm-dataset, jupyter-notebook, matplotlib, naive-bayes, numpy, pandas, plotly, predictive-modeling, random-forest, scipy, seaborn
Language: Jupyter Notebook
Size: 3.09 MB
Stars: 0
Watchers: 1
Forks: 0
Open Issues: 0
Metadata Files:
- Readme: README.md
- License: LICENSE

# 📊 Telco Customer Churn Prediction 📊

Welcome to the **Telco Customer Churn Prediction** project! This repository explores customer churn trends for a telecommunications company in California using an IBM dataset. The project leverages data analysis and machine learning techniques to predict customer churn effectively.

[![Download Releases](https://img.shields.io/badge/Download%20Releases-Here-brightgreen)](https://github.com/alessandrobasigli/Telco-Customer-Churn-Prediction-IBM-Dataset/releases)

## Table of Contents

- [Project Overview](#project-overview)
- [Dataset](#dataset)
- [Technologies Used](#technologies-used)
- [Getting Started](#getting-started)
- [Analysis and Visualization](#analysis-and-visualization)
- [Predictive Modeling](#predictive-modeling)
- [Results](#results)
- [Contributing](#contributing)
- [License](#license)
- [Contact](#contact)

## Project Overview

Customer churn refers to the loss of clients or customers. Understanding the reasons behind churn can help companies develop strategies to retain customers. This project focuses on analyzing customer data to identify patterns and predict churn.

The project is built using a Jupyter Notebook, making it easy to follow along with the analysis. It includes data cleaning, analysis, and visualization steps to provide insights into customer behavior. Additionally, it employs various machine learning algorithms to build predictive models.

## Dataset

The dataset used in this project is sourced from IBM. It contains information about customers, including demographics, account information, and service usage. The dataset is rich in features that help in understanding customer behavior and predicting churn.

You can download the dataset from the IBM website or access it through the repository.

## Technologies Used

This project utilizes the following technologies:

- **Python**: The primary programming language for data analysis and modeling.
- **Jupyter Notebook**: For interactive data exploration and visualization.
- **Pandas**: For data manipulation and analysis.
- **NumPy**: For numerical computations.
- **Matplotlib**: For static data visualization.
- **Seaborn**: For enhanced data visualization.
- **Plotly**: For interactive visualizations.
- **SciPy**: For scientific and technical computing.
- **Scikit-learn**: For machine learning algorithms.

## Getting Started

To get started with this project, follow these steps:

1. **Clone the Repository**:
```bash
git clone https://github.com/alessandrobasigli/Telco-Customer-Churn-Prediction-IBM-Dataset.git
```

2. **Navigate to the Project Directory**:
```bash
cd Telco-Customer-Churn-Prediction-IBM-Dataset
```

3. **Install Required Packages**:
Make sure you have Python installed. Then, install the required packages using pip:
```bash
pip install -r requirements.txt
```

4. **Open the Jupyter Notebook**:
Launch Jupyter Notebook:
```bash
jupyter notebook
```

5. **Run the Notebook**:
Open the notebook file and run the cells to explore the analysis and models.

## Analysis and Visualization

The analysis section includes data cleaning, exploration, and visualization. Key steps include:

- **Data Cleaning**: Handling missing values, correcting data types, and removing duplicates.
- **Exploratory Data Analysis (EDA)**: Analyzing customer demographics, account information, and service usage to identify trends.
- **Visualizations**: Creating plots to illustrate customer behavior, churn rates, and feature importance.
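
The cleaning steps listed above can be sketched with pandas as follows. This is a minimal illustration, not the notebook's actual code: the toy rows and the column names (`customerID`, `tenure`, `TotalCharges`, `Churn`) are assumptions, and filling missing values with the median is only one of several reasonable choices.

```python
import pandas as pd

# Hypothetical toy rows; column names are assumptions, not taken from the notebook.
raw = pd.DataFrame({
    "customerID": ["A1", "A2", "A2", "A3", "A4"],
    "tenure": [1, 24, 24, 0, 60],
    "TotalCharges": ["29.85", "1889.5", "1889.5", " ", "6500.0"],
    "Churn": ["Yes", "No", "No", "No", "Yes"],
})


def clean(df: pd.DataFrame) -> pd.DataFrame:
    """Remove duplicates, fix data types, and fill missing values."""
    # Removing duplicates: identical rows are dropped.
    out = df.drop_duplicates().copy()
    # Correcting data types: strings that are not numbers (such as " ") become NaN.
    out["TotalCharges"] = pd.to_numeric(out["TotalCharges"], errors="coerce")
    # Handling missing values: fill with the column median (an illustrative choice).
    out["TotalCharges"] = out["TotalCharges"].fillna(out["TotalCharges"].median())
    # Encode the target as 0/1 so it can be used by a classifier later.
    out["Churn"] = out["Churn"].map({"Yes": 1, "No": 0})
    return out.reset_index(drop=True)


cleaned = clean(raw)
print(cleaned)
print(cleaned.dtypes)
```

Dropping the affected rows instead of imputing is an equally valid option when only a handful of values are missing.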

### Sample Visualizations

Here are some examples of the visualizations created in the project:

- **Churn Rate by Gender**:
![Churn Rate by Gender](https://via.placeholder.com/600x400.png?text=Churn+Rate+by+Gender)

- **Service Usage Patterns**:
![Service Usage Patterns](https://via.placeholder.com/600x400.png?text=Service+Usage+Patterns)
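
The numbers behind a chart such as "Churn Rate by Gender" can be computed with a single group-by. The sketch below uses made-up rows and assumes columns named `gender` and `Churn` with `Yes`/`No` labels; the helper `churn_rate_by` is hypothetical and not part of the repository.

```python
import pandas as pd

# Made-up example rows; the column names and labels are assumptions.
df = pd.DataFrame({
    "gender": ["Female", "Male", "Female", "Male", "Female", "Male"],
    "Churn": ["Yes", "No", "No", "No", "Yes", "Yes"],
})


def churn_rate_by(frame: pd.DataFrame, column: str) -> pd.Series:
    """Return the share of churned customers for each value of `column`."""
    churned = frame["Churn"].eq("Yes")  # boolean Series: True where the customer left
    return churned.groupby(frame[column]).mean()  # mean of booleans = churn rate


rates = churn_rate_by(df, "gender")
print(rates)
```

Calling `rates.plot(kind="bar")` would then draw the rates as a bar chart through pandas' matplotlib integration.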

## Predictive Modeling

The project implements three main algorithms to predict customer churn:

1. **Decision Tree**: A simple yet effective model that splits data based on feature values.
2. **Naive Bayes**: A probabilistic model that assumes independence among features.
3. **Random Forest**: An ensemble method that combines multiple decision trees for better accuracy.
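
As a rough sketch of how these three algorithms can be trained side by side with scikit-learn, the example below fits each one on synthetic data. Everything specific here is an assumption: the data comes from `make_classification` rather than the IBM dataset, `GaussianNB` stands in for whichever Naive Bayes variant the notebook uses, and the hyperparameters are arbitrary.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split
from sklearn.naive_bayes import GaussianNB
from sklearn.tree import DecisionTreeClassifier

# Synthetic, mildly imbalanced stand-in for the churn data (not the IBM dataset).
X, y = make_classification(
    n_samples=500, n_features=10, n_informative=5,
    weights=[0.7, 0.3], random_state=42,
)
# Hold out a stratified test split so class proportions match in both parts.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, stratify=y, random_state=42,
)

# Hyperparameters are illustrative assumptions, not the notebook's settings.
models = {
    "Decision Tree": DecisionTreeClassifier(max_depth=5, random_state=42),
    "Naive Bayes": GaussianNB(),
    "Random Forest": RandomForestClassifier(n_estimators=100, random_state=42),
}

# Fit each model on the training split and record test-set accuracy.
scores = {}
for name, model in models.items():
    model.fit(X_train, y_train)
    scores[name] = model.score(X_test, y_test)
    print(f"{name}: accuracy = {scores[name]:.3f}")
```

Keeping the models in a dictionary makes it easy to add or swap algorithms without changing the training loop.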

### Model Training

Each model is trained using a training dataset, and performance is evaluated using metrics such as accuracy, precision, and recall. The results help in understanding which model performs best for this specific dataset.
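
To make the three metrics concrete, the following small example computes them for a hand-written set of labels, with `1` meaning "churned". The label vectors are invented for illustration and are not results from this project.

```python
from sklearn.metrics import (
    accuracy_score,
    confusion_matrix,
    precision_score,
    recall_score,
)

# Invented labels: 1 = churned, 0 = stayed.
y_true = [1, 0, 1, 1, 0, 0, 1, 0]
y_pred = [1, 0, 0, 1, 0, 1, 0, 0]

# For binary labels, ravel() yields the counts in the order TN, FP, FN, TP.
tn, fp, fn, tp = confusion_matrix(y_true, y_pred).ravel()

accuracy = accuracy_score(y_true, y_pred)    # (TP + TN) / all predictions
precision = precision_score(y_true, y_pred)  # TP / (TP + FP)
recall = recall_score(y_true, y_pred)        # TP / (TP + FN)

print(f"TP={tp} FP={fp} FN={fn} TN={tn}")
print(f"accuracy={accuracy:.3f} precision={precision:.3f} recall={recall:.3f}")
```

For churn, recall is often the metric to watch, because a missed churner (a false negative) is a customer who leaves without any retention attempt.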

## Results

After training and evaluating the models, the results show varying levels of accuracy. The Random Forest model typically performs best: combining the votes of many decision trees reduces the overfitting a single tree is prone to, while still capturing complex patterns in the data.

The results can be visualized through various plots to showcase model performance and feature importance.
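
One possible way to obtain feature importance values is the `feature_importances_` attribute of a fitted Random Forest. The sketch below is an assumption about the approach, not the notebook's code: it uses synthetic data and hypothetical feature names (`feature_0` to `feature_5`).

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier

# Hypothetical feature names; the real dataset has its own columns.
feature_names = [f"feature_{i}" for i in range(6)]

# Synthetic data: with shuffle=False the first three columns are the informative ones.
X, y = make_classification(
    n_samples=400, n_features=6, n_informative=3, n_redundant=0,
    shuffle=False, random_state=0,
)

forest = RandomForestClassifier(n_estimators=200, random_state=0).fit(X, y)

# Impurity-based importances, one per feature, normalized to sum to 1.
ranked = sorted(
    zip(feature_names, forest.feature_importances_),
    key=lambda pair: pair[1],
    reverse=True,
)
for name, importance in ranked:
    print(f"{name}: {importance:.3f}")
```

Impurity-based importances can favor high-cardinality features, so `sklearn.inspection.permutation_importance` is a useful cross-check.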

### Sample Results Visualization

- **Model Accuracy Comparison**:
![Model Accuracy Comparison](https://via.placeholder.com/600x400.png?text=Model+Accuracy+Comparison)

## Contributing

Contributions are welcome! If you have suggestions or improvements, please fork the repository and submit a pull request.

1. Fork the repository.
2. Create a new branch (`git checkout -b feature-branch`).
3. Make your changes.
4. Commit your changes (`git commit -m 'Add new feature'`).
5. Push to the branch (`git push origin feature-branch`).
6. Open a pull request.

## License

This project is licensed under the MIT License. See the [LICENSE](LICENSE) file for more details.

## Contact

For questions or feedback, please reach out:

- **Name**: Alessandro Basigli
- **Email**: [email protected]
- **GitHub**: [alessandrobasigli](https://github.com/alessandrobasigli)

For more information, visit the [Releases](https://github.com/alessandrobasigli/Telco-Customer-Churn-Prediction-IBM-Dataset/releases) section for updates and downloadable content.

Thank you for checking out the **Telco Customer Churn Prediction** project! We hope you find it informative and useful in understanding customer churn trends.
