https://github.com/deaneeth/churn-prediction-model-training
Step-by-step guide to building machine learning models for customer churn prediction, continuing from the data preprocessing phase. The repo covers training, evaluation, and saving of models, with weekly updates.
- Host: GitHub
- URL: https://github.com/deaneeth/churn-prediction-model-training
- Owner: deaneeth
- Created: 2025-08-09T21:39:12.000Z (about 2 months ago)
- Default Branch: main
- Last Pushed: 2025-08-09T22:05:57.000Z (about 2 months ago)
- Last Synced: 2025-08-10T00:08:47.995Z (about 2 months ago)
- Topics: churn-prediction, data-science-projects, jupyter-notebook, machine-learning, model-evaluation, model-training, model-training-and-evaluation, python, scikit-learn
- Language: Jupyter Notebook
- Homepage:
- Size: 742 KB
- Stars: 0
- Watchers: 0
- Forks: 0
- Open Issues: 0
Metadata Files:
- Readme: README.md
- Contributing: CONTRIBUTING.md
# Customer Churn Prediction - Model Training & Evaluation Pipeline
Welcome to the **model training and evaluation** phase of the **Customer Churn Prediction** project! This repo follows the data preprocessing pipeline from [**Customer Churn Prediction - EDA & Data Preprocessing Pipeline**](https://github.com/deaneeth/churn-prediction-data-pipeline), where we prepared the data for churn modeling. Here, we focus on training machine learning models, evaluating their performance, and saving the trained models for future use.
**This repo is updated weekly** with:
- Clean, progressive Jupyter notebooks
- Raw & processed datasets
- Practical steps using Python, pandas and scikit-learn
- Real-world-style model training and evaluation for customer churn analysis

---
### What's Inside?
This repo covers the complete **model training and evaluation pipeline**, built step-by-step:
| Notebook | Description |
|-----------------------------------|-----------------------------------------------------------------------------------------------------------------------------|
| `0_data_preparation.ipynb` | Preparing the data for model training and evaluation. It includes loading datasets and applying necessary transformations. |
| `1_base_model_training.ipynb`     | Training the baseline machine learning model with logistic regression and plotting confusion matrices.                      |
| `2_kfold_validation.ipynb` | Performing K-Fold cross-validation to evaluate model performance, calculate metrics, and ensure generalization. |
| `3_multi_model_training.ipynb` | Training and evaluating multiple machine learning models to compare performance and select the best approach. |
| `4_hyperparameter_tuning.ipynb` | Optimizing model performance through hyperparameter tuning using search techniques to find the best parameter settings. |
| `5_threshhold_optimization.ipynb` | Adjusting the classification threshold to improve performance metrics and align predictions with specific objectives.       |

---
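The first notebooks in the table above (baseline training, K-Fold validation, multi-model comparison, and hyperparameter tuning) can be sketched in a few lines of scikit-learn. This is a minimal illustration, not the notebooks' actual code. It assumes a synthetic dataset from `make_classification` in place of the preprocessed churn data, uses a random forest as a hypothetical second model, and picks F1 scoring and the grid of `C` values arbitrarily.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import confusion_matrix
from sklearn.model_selection import (
    GridSearchCV,
    StratifiedKFold,
    cross_val_score,
    train_test_split,
)

# Assumption: a synthetic, imbalanced dataset stands in for the churn data.
X, y = make_classification(
    n_samples=600, n_features=10, weights=[0.75, 0.25], random_state=42
)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, stratify=y, random_state=42
)

# Baseline: logistic regression and its confusion matrix on the test split.
baseline = LogisticRegression(max_iter=1000)
baseline.fit(X_train, y_train)
cm = confusion_matrix(y_test, baseline.predict(X_test))
print("Confusion matrix (rows = actual, columns = predicted):")
print(cm)

# Stratified 5-fold cross-validation to compare candidate models.
# The random forest is a hypothetical second candidate, not taken from the repo.
cv = StratifiedKFold(n_splits=5, shuffle=True, random_state=42)
models = {
    "logistic_regression": LogisticRegression(max_iter=1000),
    "random_forest": RandomForestClassifier(n_estimators=50, random_state=42),
}
cv_scores = {
    name: cross_val_score(model, X_train, y_train, cv=cv, scoring="f1")
    for name, model in models.items()
}
for name, scores in cv_scores.items():
    print(f"{name}: mean F1 = {scores.mean():.3f} (+/- {scores.std():.3f})")

# Hyperparameter tuning: grid search over the regularization strength C.
search = GridSearchCV(
    LogisticRegression(max_iter=1000),
    param_grid={"C": [0.01, 0.1, 1.0, 10.0]},
    cv=cv,
    scoring="f1",
)
search.fit(X_train, y_train)
print("Best parameters:", search.best_params_)
```

Stratified folds keep the churn/non-churn ratio roughly constant in every split, which matters when the positive class is a minority.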
### Folder Structure:
```
artifacts/  - Model training results, including training/test data (X, Y) saved as .npz files
processed/  - Processed data used for model training
raw/        - Raw input data and initial notebook for data preparation
Notebooks   - Notebooks to prepare data for training, testing and evaluation
```

---
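Saving the train/test split as a single `.npz` file, as described for `artifacts/`, might look like the sketch below. The file name `train_test_split.npz`, the array shapes, and the random data are assumptions for illustration, and a temporary directory stands in for `artifacts/`.

```python
import tempfile
from pathlib import Path

import numpy as np

# Assumption: random arrays stand in for the real train/test split.
rng = np.random.default_rng(0)
X_train = rng.normal(size=(80, 5))
X_test = rng.normal(size=(20, 5))
Y_train = rng.integers(0, 2, size=80)
Y_test = rng.integers(0, 2, size=20)

# A temporary directory stands in for artifacts/; the file name is hypothetical.
artifacts_dir = Path(tempfile.mkdtemp())
split_path = artifacts_dir / "train_test_split.npz"

# Each keyword argument becomes a named array inside the archive.
np.savez(split_path, X_train=X_train, X_test=X_test, Y_train=Y_train, Y_test=Y_test)

# Load the arrays back by name; the context manager closes the file afterwards.
with np.load(split_path) as data:
    loaded = {key: data[key] for key in data.files}

print({key: value.shape for key, value in loaded.items()})
```

Keeping all four arrays in one archive means later notebooks can reload exactly the same split instead of re-running the preparation step.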
### Tools Used:
- Python, Pandas, Scikit-learn
- Matplotlib, Seaborn
- NumPy
- Jupyter Notebooks

---
### Goals:
- Train machine learning models on the churn prediction dataset
- Evaluate models' performance using various metrics
- Save and export model artifacts (X_train, X_test, Y_train, Y_test)
- Provide a solid template for future machine learning projects

---
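To make the evaluation goal concrete, the sketch below computes several metrics at the default 0.5 cutoff and at a threshold chosen to maximize F1, in the spirit of the threshold optimization notebook. It is an illustration under assumptions: the data is synthetic, the model is a plain logistic regression, and F1 is only one possible objective (a churn team might prefer recall at a fixed precision instead).

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import (
    accuracy_score,
    f1_score,
    precision_recall_curve,
    precision_score,
    recall_score,
    roc_auc_score,
)
from sklearn.model_selection import train_test_split

# Assumption: synthetic, imbalanced data in place of the churn dataset.
X, y = make_classification(
    n_samples=600, n_features=10, weights=[0.75, 0.25], random_state=0
)
X_train, X_val, y_train, y_val = train_test_split(
    X, y, test_size=0.3, stratify=y, random_state=0
)

model = LogisticRegression(max_iter=1000).fit(X_train, y_train)
proba = model.predict_proba(X_val)[:, 1]  # predicted probability of churn

# precision and recall have one more entry than thresholds; drop the final
# (precision=1, recall=0) point so the three arrays line up.
precision, recall, thresholds = precision_recall_curve(y_val, proba)
f1_curve = (
    2 * precision[:-1] * recall[:-1]
    / np.maximum(precision[:-1] + recall[:-1], 1e-12)
)
best_threshold = thresholds[np.argmax(f1_curve)]


def report(y_true, y_pred):
    """Collect the threshold-dependent metrics in one dictionary."""
    return {
        "accuracy": accuracy_score(y_true, y_pred),
        "precision": precision_score(y_true, y_pred, zero_division=0),
        "recall": recall_score(y_true, y_pred, zero_division=0),
        "f1": f1_score(y_true, y_pred, zero_division=0),
    }


default_metrics = report(y_val, (proba >= 0.5).astype(int))
tuned_metrics = report(y_val, (proba >= best_threshold).astype(int))

print(f"ROC AUC (threshold-independent): {roc_auc_score(y_val, proba):.3f}")
print("Threshold 0.50:", default_metrics)
print(f"Threshold {best_threshold:.2f}:", tuned_metrics)
```

Because the threshold is selected and scored on the same validation split here, the tuned numbers are optimistic; a separate hold-out set would give a fairer estimate.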
## Steps Followed from the Previous Repo
If you haven't already gone through the **data preprocessing steps**, check out the [Customer Churn Prediction - EDA & Data Preprocessing Pipeline](https://github.com/deaneeth/churn-prediction-data-pipeline) repo first. That repo covers preprocessing the data, including handling missing values, encoding features, and scaling the dataset, all of which are essential before model training.
---
## Getting Started
To get started with this repo, clone the repository and install the required dependencies:
```
git clone https://github.com/deaneeth/churn-prediction-model-training.git
cd churn-prediction-model-training
pip install -r requirements.txt
```

---
## Why You'll Like It:
- Easy-to-follow structure for model building and evaluation
- Consistent with the preprocessing steps from the previous repo
- Learn how to build, evaluate, and save machine learning models in Python
- Continuous weekly updates with new models, techniques, and results

---
## Contribute or Follow Along
This repo is updated **weekly**, with new models, evaluation metrics, and results. Star the repo to stay updated, and fork it to experiment with your own models. Contributions and feedback are always welcome; just make sure to check the [contributing guidelines](CONTRIBUTING.md) before submitting a pull request.
---
### Want to continue building real-world models for churn prediction?
You're in the right place! Let's train some powerful models together and predict customer churn like a pro.