Ecosyste.ms: Awesome

An open API service indexing awesome lists of open source software.

Awesome Lists | Featured Topics | Projects

https://github.com/cliffordnwanna/expresso_churn_prediction_app

The Expresso Churn Prediction App is a machine learning-based web application designed to predict customer churn for Expresso, a telecommunications service provider in Africa. This project demonstrates proficiency in data preprocessing, machine learning model training, and deploying interactive web applications using Streamlit.
https://github.com/cliffordnwanna/expresso_churn_prediction_app

churn-prediction cloud deployment jupyter-notebook machine-learning modeling preprocessing streamlit

Last synced: about 1 month ago
JSON representation

The Expresso Churn Prediction App is a machine learning-based web application designed to predict customer churn for Expresso, a telecommunications service provider in Africa. This project demonstrates proficiency in data preprocessing, machine learning model training, and deploying interactive web applications using Streamlit.

Awesome Lists containing this project

README

        

# **Expresso Churn Prediction**

## **Overview**
This project aims to predict customer churn for **Expresso**, an African telecommunications company, using machine learning techniques. By analyzing customer behavior and usage patterns, the model identifies clients who are likely to churn, helping the company take proactive steps to retain them.

The project involves data preprocessing, training a machine learning classifier, and deploying a simple web application to allow users to make churn predictions based on customer data inputs. The dataset used for this project is provided by the **Expresso Churn Prediction Challenge** on the Zindi platform.

## **Features**
- **Data Preprocessing**: Handle missing values, outliers, and encode categorical features.
- **Machine Learning**: Train a **Random Forest Classifier** to predict customer churn with high accuracy.
- **Web Application**: User-friendly interface built with **Streamlit** to make predictions based on customer data inputs.
- **Deployment**: Easily deploy the application on **Streamlit Cloud** for remote access.

## **Project Structure**
```
Expresso-Churn-Prediction/

├── data/ # Directory for storing datasets
│ └── Expresso_churn_dataset.csv # Dataset file (place it here)

├── notebooks/ # Jupyter notebooks for data exploration & analysis
│ └── data_cleaning.ipynb # Notebook for data preprocessing (optional)

├── models/ # Directory for trained models
│ └── main_trained_model_1.sav # Trained machine learning model file

├── app/ # Streamlit web application
│ └── app.py # Streamlit script for the web app

├── requirements.txt # File for project dependencies
├── README.md # Project overview and documentation
└── .gitignore # File to ignore unnecessary files on GitHub
```

## **Technologies Used**
- **Python**: Programming language for data processing and model building.
- **Pandas & NumPy**: Data manipulation and analysis.
- **Scikit-Learn**: Machine learning model training and evaluation.
- **Streamlit**: Web framework for deploying the app.
- **Pickle**: For saving and loading trained machine learning models.

## **Dataset**
The dataset includes customer behavior data and usage patterns for **2.5 million clients** across two African markets: **Mauritania** and **Senegal**. You can download the dataset from the following link:
[Expresso Churn Dataset](https://drive.google.com/file/d/12_KUHr5NlHO_6bN5SylpkxWc-JvpJNWe/view?usp=sharing)

## **Installation**
To get a copy of the project up and running on your local machine, follow these steps:

1. **Clone the Repository:**
```bash
git clone https://github.com/cliffordnwanna/EXPRESSO_CHURN_PREDICTION_APP.git
cd EXPRESSO_CHURN_PREDICTION_APP
```

2. **Install Dependencies:**
```bash
pip install -r requirements.txt
```

3. **Run the Web Application:**
```bash
streamlit run APP/app.py
```

## **Usage**
1. **Data Preprocessing**:
- The dataset is cleaned by handling missing values, encoding categorical features, and removing outliers.
- The script `Data_cleaning.ipynb` in the `NOTEBOOK/` directory contains the code for data preprocessing and training the Random Forest model.
- The trained model is saved as `main_trained_model_1.sav` in the `models/` directory.

2. **Streamlit Web App**:
- The Streamlit application (`APP/app.py`) allows users to input features and make churn predictions.
- Enter values for customer data fields, click the **"Make Prediction"** button, and see the prediction result (CHURN or NO CHURN).

3. **Example Prediction**:
- Input: `{"user_id": 482, "REGION": 0, "TENURE": 6, ...}`
- Output: **NO CHURN**

## **Deployment**
1. **Deploy on Streamlit Cloud**:
- Go to [Streamlit Cloud](https://share.streamlit.io/).
- Connect your GitHub repository.
- Set the `app/app.py` as the main file.
- Click **Deploy**.

2. **Configure Environment Variables**:
- Ensure the `requirements.txt` file is properly set up for deployment.
- Once deployed, the app will be accessible via a unique URL.

## **Model Performance**
The **Random Forest Classifier** achieved over **85% accuracy** on the test dataset. It was trained on **80% of the data** and validated on **20%**. The model was evaluated using metrics such as **accuracy, precision, recall, and F1-score**.

## **Contributing**
Contributions are welcome! If you have ideas to improve this project, feel free to fork the repository and create a pull request. Here’s how you can contribute:
1. **Fork the Repository**
2. **Create a New Branch** (`git checkout -b feature/YourFeature`)
3. **Commit Changes** (`git commit -m 'Add new feature'`)
4. **Push to the Branch** (`git push origin feature/YourFeature`)
5. **Open a Pull Request**

## **License**
This project is licensed under the **MIT License** - see the [LICENSE](LICENSE) file for details.

## **Contact**
For any inquiries or suggestions, please reach out to:
- **Name:** Chukwuma Nwanna
- **Email:** [email protected]
- **GitHub:** [https://github.com/yourusername](https://github.com/cliffordnwanna)