{"id":30360507,"url":"https://github.com/deaneeth/churn-prediction-model-training","last_synced_at":"2026-05-11T07:06:53.792Z","repository":{"id":309113006,"uuid":"1035199094","full_name":"deaneeth/churn-prediction-model-training","owner":"deaneeth","description":"Step-by-step guide to building machine learning models for customer churn prediction, continuing from the data preprocessing phase. The repo covers training, evaluation, and saving of models, with weekly updates.","archived":false,"fork":false,"pushed_at":"2025-08-09T22:05:57.000Z","size":760,"stargazers_count":0,"open_issues_count":0,"forks_count":0,"subscribers_count":0,"default_branch":"main","last_synced_at":"2025-08-10T00:08:47.995Z","etag":null,"topics":["churn-prediction","data-science-projects","jupyter-notebook","machine-learning","model-evaluation","model-training","model-training-and-evaluation","python","scikit-learn"],"latest_commit_sha":null,"homepage":"","language":"Jupyter Notebook","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":null,"status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/deaneeth.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":"CONTRIBUTING.md","funding":null,"license":null,"code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null,"zenodo":null}},"created_at":"2025-08-09T21:39:12.000Z","updated_at":"2025-08-09T22:06:00.000Z","dependencies_parsed_at":"2025-08-10T00:08:52.440Z","dependency_job_id":"cbf734cc-3f72-4da5-980a-924c63ebc94b","html_url":"https://github.com/deaneeth/churn-prediction-model-training","commit_stats":null,"previous_names":["deaneeth/churn-prediction-model-training"],"tags_count":null,"template":false,"template_full_name":null,"purl":"pkg:github/deaneeth/churn-prediction-model-training","rep
ository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/deaneeth%2Fchurn-prediction-model-training","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/deaneeth%2Fchurn-prediction-model-training/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/deaneeth%2Fchurn-prediction-model-training/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/deaneeth%2Fchurn-prediction-model-training/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/deaneeth","download_url":"https://codeload.github.com/deaneeth/churn-prediction-model-training/tar.gz/refs/heads/main","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/deaneeth%2Fchurn-prediction-model-training/sbom","scorecard":null,"host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":271166860,"owners_count":24710583,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","status":"online","status_checked_at":"2025-08-19T02:00:09.176Z","response_time":63,"last_error":null,"robots_txt_status":"success","robots_txt_updated_at":"2025-07-24T06:49:26.215Z","robots_txt_url":"https://github.com/robots.txt","online":true,"can_crawl_api":true,"host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["churn-prediction","data-science-projects","jupyter-notebook","machine-learning","model-evaluation","model-training","model-training-and-evaluation","python","scikit-learn"],"created_at":"2025-08-19T14:22:56.872Z","updated_at":"2026-05-11T07:06:53.715Z","avatar_url":"https://github.com/deaneeth.png","language":"Jupyter 
Notebook","funding_links":[],"categories":[],"sub_categories":[],"readme":"# 🚀 Customer Churn Prediction – Model Training \u0026 Evaluation Pipeline\n\nWelcome to the **model training and evaluation** phase of the **Customer Churn Prediction** project! This repo follows the data preprocessing pipeline from [**Customer Churn Prediction – EDA \u0026 Data Preprocessing Pipeline**](https://github.com/deaneeth/churn-prediction-data-pipeline), where we prepared the data for churn modeling. Here, we focus on training machine learning models, evaluating their performance, and saving the trained models for future use.\n\n\n🚀 **This repo is updated weekly** with:\n- Clean, progressive Jupyter notebooks\n- Raw \u0026 processed datasets\n- Practical steps using Python, pandas and scikit-learn\n- Real-world-style model training \u0026 evaluation applied to customer churn analysis\n\n---\n\n### 📋 What's Inside?\n\nThis repo covers the complete **model training and evaluation pipeline**, built step-by-step:\n\n| Notebook                          | Description                                                                                                                 |\n|-----------------------------------|-----------------------------------------------------------------------------------------------------------------------------|\n| `0_data_preparation.ipynb`        | Preparing the data for model training and evaluation. It includes loading datasets and applying necessary transformations.  |\n| `1_base_model_training.ipynb`     | Training the base machine learning model for the analysis using logistic regression, and plotting confusion matrices.       |\n| `2_kfold_validation.ipynb`        | Performing K-Fold cross-validation to evaluate model performance, calculate metrics, and ensure generalization.             |\n| `3_multi_model_training.ipynb`    | Training and evaluating multiple machine learning models to compare performance and select the best approach.              
 |\n| `4_hyperparameter_tuning.ipynb`   | Optimizing model performance through hyperparameter tuning using search techniques to find the best parameter settings.     |\n| `5_threshhold_optimization.ipynb` | Adjusting the classification threshold to improve performance metrics and align predictions with specific objectives.       |\n\n---\n\n### 📁 Folder Structure:\n\n```\n📂 artifacts/ → Model training results, including training/test data (X, Y) saved as .npz files\n📂 processed/ → Processed data used for model training\n📂 raw/ → Raw input data and initial notebook for data preparation\n📓 Notebooks → Notebooks to prepare data for training, testing and evaluation\n```\n\n---\n\n### 🔧 Tools Used:\n\n- Python, Pandas, Scikit-learn\n- Matplotlib, Seaborn\n- NumPy\n- Jupyter Notebooks\n\n---\n\n### 🎯 Goals:\n\n- Train machine learning models on the churn prediction dataset\n- Evaluate models' performance using various metrics\n- Save and export model artifacts (X_train, X_test, Y_train, Y_test)\n- Provide a solid template for future machine learning projects\n\n---\n\n## 📌 Steps Followed from the Previous Repo\n\nIf you haven’t already gone through the **Data preprocessing steps**, make sure to check out the [Customer Churn Prediction – EDA \u0026 Data Preprocessing Pipeline](https://github.com/deaneeth/churn-prediction-data-pipeline) repo first. 
That repo covers the preprocessing itself, including handling missing values, encoding features, and scaling the dataset, all of which are essential steps before model training.\n\n---\n\n## 🚀 Getting Started\n\nTo get started with this repo, clone the repository and install the required dependencies:\n\n```\ngit clone https://github.com/deaneeth/churn-prediction-model-training.git\ncd churn-prediction-model-training\npip install -r requirements.txt\n```\n\n---\n\n## 🌟 Why You’ll Like It:\n\n- 📚 Easy-to-follow structure for model building and evaluation\n- 🧠 Consistent with the preprocessing steps from the previous repo\n- 🧼 Learn how to build, evaluate, and save machine learning models in Python\n- 💾 Continuous weekly updates with new models, techniques, and results\n\n---\n\n## 🤝 Contribute or Follow Along\n\nThis repo is updated **weekly**, with new models, evaluation metrics, and results. Star ⭐ the repo to stay updated, and fork 🍴 it to experiment with your own models. Contributions \u0026 feedback are always welcome; just make sure to check the [contributing guidelines](CONTRIBUTING.md) before submitting any pull requests.\n\n---\n\n### 👀 Want to continue building real-world models for churn prediction?\n\nYou're in the right place! Let's train some powerful models together and predict customer churn like a pro.\n\n---\n\n_Created with ❤️ by [deaneeth](https://github.com/deaneeth)_\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fdeaneeth%2Fchurn-prediction-model-training","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fdeaneeth%2Fchurn-prediction-model-training","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fdeaneeth%2Fchurn-prediction-model-training/lists"}