https://github.com/stephen-adwini-badu/10.-bank-churn-project
This project aims to predict customer churn in a banking environment using machine learning models. The goal is to identify patterns in customer behavior that lead to churn and use predictive analytics to classify customers as churners or non-churners.
- Host: GitHub
- URL: https://github.com/stephen-adwini-badu/10.-bank-churn-project
- Owner: Stephen-Adwini-Badu
- Created: 2025-01-15T19:21:56.000Z (about 1 month ago)
- Default Branch: main
- Last Pushed: 2025-01-15T19:24:00.000Z (about 1 month ago)
- Last Synced: 2025-01-15T21:44:26.031Z (about 1 month ago)
- Topics: churn-prediction, data-science, jupyter-notebook, machine-learning, predictive-modeling
- Language: Jupyter Notebook
- Homepage:
- Size: 6.87 MB
- Stars: 0
- Watchers: 1
- Forks: 0
- Open Issues: 0
Metadata Files:
- Readme: README.md
README
# Bank Churn Prediction Project
## Project Objective
This project aims to predict customer churn in a banking environment using machine learning models. The goal is to identify patterns in customer behavior that lead to churn and use predictive analytics to classify customers as churners or non-churners.
## Dataset Description
Two datasets are utilized:
- **Training Dataset:** Contains labeled data for model training.
- **Testing Dataset:** Unlabeled data for evaluating model performance.
## Methodology
### 1. Data Exploration and Preprocessing
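The steps in this section can be sketched roughly as follows. This is a minimal illustration rather than the project's notebook code: the column names (`CustomerId`, `Geography`, `Gender`, `Age`, `Exited`), the fill strategies, and the toy rows are all assumptions, and the commented-out file name is hypothetical.

```python
import pandas as pd

# In practice the data would be loaded from disk, for example:
# raw = pd.read_csv("train.csv")  # hypothetical file name
# Toy stand-in below; column names are assumptions, not taken from the project.
raw = pd.DataFrame({
    "CustomerId": [1, 2, 2, 3, 4],
    "Geography": ["France", "Spain", "Spain", None, "Germany"],
    "Gender": ["Female", "Male", "Male", "Female", "Male"],
    "Age": [42.0, 35.0, 35.0, None, 51.0],
    "Exited": [1, 0, 0, 0, 1],
})


def preprocess(df: pd.DataFrame) -> pd.DataFrame:
    """Drop duplicates, fill missing values, and one-hot encode categoricals."""
    df = df.drop_duplicates().copy()
    # Numeric gaps: fill with the column median (one possible strategy).
    df["Age"] = df["Age"].fillna(df["Age"].median())
    # Categorical gaps: fill with the most frequent value.
    df["Geography"] = df["Geography"].fillna(df["Geography"].mode().iloc[0])
    # One-hot encode the categorical columns so models receive numeric input.
    return pd.get_dummies(df, columns=["Geography", "Gender"], dtype=int)


clean = preprocess(raw)
print(clean)
```

Whether missing values should be filled or the affected rows dropped depends on how much data is missing; the median and mode fills here are only one reasonable default.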
- **Data Loading:** Importing datasets for analysis.
- **Data Cleaning:** Handling missing values and duplicate records.
- **Feature Transformation:** Encoding categorical features for machine learning.
### 2. Exploratory Data Analysis (EDA)
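As a rough sketch of what this kind of exploration might look like, the snippet below computes summary statistics and a per-group churn rate with pandas, and defines an optional plotting helper using Seaborn and Matplotlib. The column names (`Age`, `NumOfProducts`, `IsActiveMember`, `Exited`) and the toy values are assumptions made for illustration; they are not the project's data.

```python
import pandas as pd

# Toy data; column names and values are illustrative assumptions.
df = pd.DataFrame({
    "Age": [25, 38, 47, 52, 31, 60, 44, 29],
    "NumOfProducts": [1, 2, 1, 3, 2, 1, 4, 2],
    "IsActiveMember": [1, 1, 0, 0, 1, 0, 0, 1],
    "Exited": [0, 0, 1, 1, 0, 1, 0, 1],
})

# Summary statistics (count, mean, std, quartiles) for every numeric column.
summary = df.describe()

# Churn rate per activity status: the mean of a 0/1 label is a rate.
churn_by_activity = df.groupby("IsActiveMember")["Exited"].mean()


def plot_overview(data: pd.DataFrame):
    """Draw the age distribution and product counts, split by churn label."""
    # Imported here so the summaries above run without plotting libraries.
    import matplotlib.pyplot as plt
    import seaborn as sns

    fig, axes = plt.subplots(1, 2, figsize=(10, 4))
    sns.histplot(data=data, x="Age", hue="Exited", ax=axes[0])
    sns.countplot(data=data, x="NumOfProducts", hue="Exited", ax=axes[1])
    fig.tight_layout()
    return fig


print(summary)
print(churn_by_activity)
```

Calling `plot_overview(df)` returns a Matplotlib figure that can be shown in a notebook or saved with `fig.savefig(...)`.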
- **Statistical Analysis:** Summary statistics to understand feature distributions.
- **Visualizations:** Using Seaborn and Matplotlib to explore trends and relationships in the data.
### 3. Feature Engineering
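The README does not say how features were selected, so the following is only one possible approach: encode the categorical column, then keep the top-scoring features with scikit-learn's `SelectKBest` and the ANOVA F-test (`f_classif`). The data is synthetic, and the column names and the choice of `k=2` are assumptions.

```python
import numpy as np
import pandas as pd
from sklearn.feature_selection import SelectKBest, f_classif

rng = np.random.default_rng(0)
n = 200

# Synthetic data: only "Age" is constructed to relate to the label.
y = rng.integers(0, 2, size=n)
X = pd.DataFrame({
    "Age": 35 + 10 * y + rng.normal(0, 5, size=n),
    "Tenure": rng.integers(0, 11, size=n),
    "EstimatedSalary": rng.normal(60_000, 15_000, size=n),
    "Gender": rng.choice(["Female", "Male"], size=n),
})

# Encode the categorical column before scoring; drop_first avoids a
# redundant indicator for a two-level category.
X_encoded = pd.get_dummies(X, columns=["Gender"], drop_first=True, dtype=int)

# Score each feature against the label and keep the two best.
selector = SelectKBest(score_func=f_classif, k=2).fit(X_encoded, y)
selected = list(X_encoded.columns[selector.get_support()])
print(selected)
```

Univariate scores like these ignore interactions between features, so a model-based ranking is a common complement.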
- Selection of relevant features to improve model accuracy.
- Transformation of categorical variables using encoding techniques.
### 4. Model Training and Evaluation
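A minimal sketch of the split, train, and evaluate loop this section describes, using scikit-learn on synthetic data. The split ratio, hyperparameters, and the generated dataset are assumptions; the project's actual settings are not given in this README. XGBoost is left out of the runnable code so that the example depends only on scikit-learn.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import GradientBoostingClassifier, RandomForestClassifier
from sklearn.metrics import (accuracy_score, confusion_matrix, f1_score,
                             precision_score, recall_score, roc_auc_score)
from sklearn.model_selection import train_test_split

# Synthetic, imbalanced binary data standing in for the churn table.
X, y = make_classification(n_samples=600, n_features=8, n_informative=4,
                           weights=[0.8, 0.2], random_state=42)

# Hold out 25% for validation; stratify keeps the class ratio in both parts.
X_train, X_val, y_train, y_val = train_test_split(
    X, y, test_size=0.25, stratify=y, random_state=42)

models = {
    "random_forest": RandomForestClassifier(n_estimators=100, random_state=42),
    "gradient_boosting": GradientBoostingClassifier(random_state=42),
    # xgboost.XGBClassifier follows the same fit/predict interface and
    # could be added here if the xgboost package is installed.
}

results = {}
for name, model in models.items():
    model.fit(X_train, y_train)
    pred = model.predict(X_val)
    proba = model.predict_proba(X_val)[:, 1]  # probability of class 1
    results[name] = {
        "accuracy": accuracy_score(y_val, pred),
        "precision": precision_score(y_val, pred, zero_division=0),
        "recall": recall_score(y_val, pred, zero_division=0),
        "f1": f1_score(y_val, pred, zero_division=0),
        "roc_auc": roc_auc_score(y_val, proba),
        "confusion_matrix": confusion_matrix(y_val, pred),
    }

for name, metrics in results.items():
    print(name, {k: v for k, v in metrics.items() if k != "confusion_matrix"})
```

On imbalanced churn data, accuracy alone can look good while recall on churners is poor, which is why the other metrics are reported alongside it. Recent scikit-learn versions can plot the confusion matrix with `ConfusionMatrixDisplay.from_predictions(y_val, pred)`.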
- Splitting data into training and validation subsets.
- Training various models, including:
  - Random Forest Classifier
  - Gradient Boosting Classifier
  - XGBoost Classifier
- Evaluation metrics include:
  - Accuracy
  - Precision, Recall, F1-Score
  - ROC-AUC Score
  - Confusion Matrix Visualization
## Machine Learning Models
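Boosting differs from a random forest in that trees are added one at a time, each fitted to correct the current ensemble's errors. The sketch below makes that visible on synthetic data by tracking the training log loss after each added tree with scikit-learn's `staged_predict_proba`. The dataset and hyperparameters are assumptions chosen for illustration.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.metrics import log_loss

# Synthetic binary classification data (not the project's dataset).
X, y = make_classification(n_samples=400, n_features=8, n_informative=4,
                           random_state=0)

gb = GradientBoostingClassifier(n_estimators=50, learning_rate=0.1,
                                random_state=0).fit(X, y)

# staged_predict_proba yields the ensemble's predictions after each tree,
# so the list holds one training-loss value per boosting stage.
train_loss = [log_loss(y, proba) for proba in gb.staged_predict_proba(X)]

print(f"after 1 tree:   {train_loss[0]:.3f}")
print(f"after 50 trees: {train_loss[-1]:.3f}")
```

Training loss keeps falling as trees are added, so a held-out validation set is what indicates when further stages stop helping.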
- **Random Forest Classifier:** Ensemble method that builds multiple decision trees.
- **Gradient Boosting Classifier:** Boosting technique for sequential model improvement.
- **XGBoost Classifier:** Optimized gradient boosting implementation.
## Results and Metrics
- **Model Performance:** Metrics for evaluating the effectiveness of predictions, including confusion matrices.
- **Insights:** Identification of key factors contributing to customer churn.
  - Key Factors: **Age**, **Active Membership** and **Number of Products**
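The README does not state how these key factors were identified. One common approach, shown here as an assumption, is to rank a fitted random forest's impurity-based `feature_importances_`. The data is synthetic and the label is deliberately built from `Age` and `IsActiveMember`, so the resulting ranking reflects that construction and is not evidence for the project's findings. Column names are hypothetical.

```python
import numpy as np
import pandas as pd
from sklearn.ensemble import RandomForestClassifier

rng = np.random.default_rng(1)
n = 500

# Synthetic features with hypothetical column names.
X = pd.DataFrame({
    "Age": rng.integers(18, 80, size=n),
    "IsActiveMember": rng.integers(0, 2, size=n),
    "NumOfProducts": rng.integers(1, 5, size=n),
    "EstimatedSalary": rng.normal(60_000, 15_000, size=n),
})

# Synthetic label built from Age and IsActiveMember only, so the ranking
# below reflects this construction, not the project's actual results.
y = ((X["Age"] > 50) & (X["IsActiveMember"] == 0)).astype(int)

model = RandomForestClassifier(n_estimators=200, random_state=1).fit(X, y)

# Importances sum to 1; higher means the feature drove more of the splits.
importances = pd.Series(model.feature_importances_, index=X.columns)
importances = importances.sort_values(ascending=False)
print(importances)
```

Impurity-based importances can favor high-cardinality features, so `sklearn.inspection.permutation_importance` on held-out data is a useful cross-check.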