Ecosyste.ms: Awesome
An open API service indexing awesome lists of open source software.
https://github.com/ashishsingh789/customer_purchase_prediction_using_decision-tree-_classifier
Decision Tree Classifier to predict customer purchases using demographic and behavioral data. Key steps: data preprocessing, EDA, model training, evaluation, and feature importance analysis.
data datascience desiciontree eda machine-learning-algorithms matplotlib numpy pandas-dataframe python seaborn
- Host: GitHub
- URL: https://github.com/ashishsingh789/customer_purchase_prediction_using_decision-tree-_classifier
- Owner: AshishSingh789
- Created: 2024-10-01T17:18:45.000Z (about 2 months ago)
- Default Branch: main
- Last Pushed: 2024-10-01T17:28:02.000Z (about 2 months ago)
- Last Synced: 2024-10-19T12:04:27.703Z (27 days ago)
- Topics: data, datascience, desiciontree, eda, machine-learning-algorithms, matplotlib, numpy, pandas-dataframe, python, seaborn
- Language: Jupyter Notebook
- Homepage:
- Size: 130 KB
- Stars: 0
- Watchers: 1
- Forks: 0
- Open Issues: 0
Metadata Files:
- Readme: README.md
README
# Customer Purchase Prediction Using Decision Tree Classifier
# Project Overview
This project involves building a decision tree classifier to predict whether a customer will purchase a product or service. The prediction is based on demographic and behavioral data. The dataset used is similar to the Bank Marketing dataset from the UCI Machine Learning Repository.
# Task Description
The task was part of my virtual internship at Prodigy Infotech. The main objective was to build a machine learning model (Decision Tree) that can classify customers based on whether they are likely to purchase a product or service. This model helps in understanding which customer segments are most likely to convert, aiding marketing and sales efforts.
# Key Steps Performed
## Data Preprocessing
- Handled missing values.
- Encoded categorical variables using techniques like one-hot encoding.
- Scaled and normalized the data for better model performance.
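
The preprocessing steps above can be sketched roughly as follows. This is a minimal illustration, not the notebook's actual code: the toy DataFrame, its column names (`age`, `balance`, `job`), and the choice of median and most-frequent imputation are all assumptions made for the example.

```python
import numpy as np
import pandas as pd
from sklearn.preprocessing import StandardScaler

# Toy data with made-up values; the column names are assumptions.
df = pd.DataFrame({
    "age": [25.0, 40.0, np.nan, 58.0],
    "balance": [1200.0, np.nan, 300.0, 5000.0],
    "job": ["admin", "technician", None, "admin"],
})

numeric_cols = ["age", "balance"]
categorical_cols = ["job"]

# 1. Missing values: median for numeric columns,
#    most frequent value for categorical columns (assumed strategy).
df[numeric_cols] = df[numeric_cols].fillna(df[numeric_cols].median())
for col in categorical_cols:
    df[col] = df[col].fillna(df[col].mode()[0])

# 2. One-hot encode the categorical columns.
encoded = pd.get_dummies(df, columns=categorical_cols, dtype=int)

# 3. Standardize numeric columns to zero mean and unit variance.
encoded[numeric_cols] = StandardScaler().fit_transform(encoded[numeric_cols])

print(encoded)
```

In a full workflow the scaler would normally be fitted on the training split only and then applied to the test split, so that no information from the test data leaks into training.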
## Exploratory Data Analysis (EDA)
- Analyzed the distribution of variables (age, job type, marital status, etc.).
- Examined relationships between key variables and purchase decisions.
- Visualized correlations between features using heatmaps.
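
As a rough sketch of this kind of analysis, the snippet below summarizes distributions, computes the purchase rate per category, and builds the correlation matrix that a heatmap would display. The data values and column names (`marital`, `campaign`, `purchase`, and so on) are made up for illustration and are not taken from the notebook.

```python
import pandas as pd

# Toy data with made-up values; the column names are assumptions.
df = pd.DataFrame({
    "age": [25, 40, 33, 58, 46, 29],
    "balance": [1200, 150, 300, 5000, 2100, 80],
    "campaign": [1, 3, 2, 1, 1, 4],
    "marital": ["single", "married", "single", "married", "divorced", "single"],
    "purchase": [1, 0, 0, 1, 1, 0],
})

# Distribution of a numeric and a categorical variable.
age_summary = df["age"].describe()
marital_counts = df["marital"].value_counts()

# Relationship between a category and the target: purchase rate per group.
purchase_rate = df.groupby("marital")["purchase"].mean()

# Pairwise Pearson correlations between the numeric columns.
corr = df.corr(numeric_only=True)

print(age_summary)
print(marital_counts)
print(purchase_rate)
print(corr.round(2))
```

Passing `corr` to `seaborn.heatmap(corr, annot=True)` would render the matrix as an annotated heatmap.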
## Building the Decision Tree Classifier
- Split the data into training and testing sets.
- Trained a decision tree classifier to predict customer purchases.
- Optimized the model by tuning hyperparameters such as maximum depth and the splitting criterion.
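
One way to carry out the split, training, and tuning steps with scikit-learn is sketched below. It runs on synthetic data from `make_classification` rather than the project's dataset, and the parameter grid, the 80/20 split, and the F1 scoring are assumptions chosen for the example, not the values used in the notebook.

```python
from sklearn.datasets import make_classification
from sklearn.model_selection import GridSearchCV, train_test_split
from sklearn.tree import DecisionTreeClassifier

# Synthetic stand-in for the preprocessed customer features and target.
X, y = make_classification(
    n_samples=500, n_features=8, n_informative=4, random_state=42
)

# Hold out 20% for testing; stratify to keep the class ratio in both splits.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, stratify=y, random_state=42
)

# Candidate hyperparameters (illustrative values only).
param_grid = {
    "max_depth": [3, 5, 8, None],
    "criterion": ["gini", "entropy"],
    "min_samples_leaf": [1, 5, 20],
}

# 5-fold cross-validated grid search on the training split.
search = GridSearchCV(
    DecisionTreeClassifier(random_state=42),
    param_grid,
    cv=5,
    scoring="f1",
)
search.fit(X_train, y_train)

best_tree = search.best_estimator_
print("Best parameters:", search.best_params_)
print("Test accuracy:", best_tree.score(X_test, y_test))
```

Limiting `max_depth` or raising `min_samples_leaf` keeps the tree smaller, which usually reduces overfitting and makes the tree easier to interpret.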
## Model Evaluation
- Evaluated the model using metrics like accuracy, precision, recall, and F1-score.
- Generated a confusion matrix to visualize performance.
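
The metrics listed above can be computed with `sklearn.metrics`, as in the small sketch below. The true and predicted labels are made up for illustration and do not come from the project's model.

```python
from sklearn.metrics import (
    accuracy_score,
    confusion_matrix,
    f1_score,
    precision_score,
    recall_score,
)

# Made-up labels for illustration: 1 = purchased, 0 = did not purchase.
y_true = [1, 0, 1, 1, 0, 0, 1, 0]
y_pred = [1, 0, 0, 1, 0, 1, 1, 0]

accuracy = accuracy_score(y_true, y_pred)    # share of correct predictions
precision = precision_score(y_true, y_pred)  # TP / (TP + FP)
recall = recall_score(y_true, y_pred)        # TP / (TP + FN)
f1 = f1_score(y_true, y_pred)                # harmonic mean of precision and recall

# Rows are true classes, columns are predicted classes (label order 0, 1).
cm = confusion_matrix(y_true, y_pred)

print(f"accuracy={accuracy:.2f} precision={precision:.2f} "
      f"recall={recall:.2f} f1={f1:.2f}")
print(cm)
```

Here the toy predictions contain three true positives, three true negatives, one false positive, and one false negative, so all four metrics come out to 0.75. `sklearn.metrics.classification_report` prints the same per-class figures as a table.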
## Feature Importance
- Analyzed which features contributed the most to the model's decision-making process.
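
A fitted scikit-learn tree exposes impurity-based importances through its `feature_importances_` attribute. The sketch below ranks them on synthetic data; the feature names are placeholders attached to synthetic columns purely for readability, so the resulting ranking says nothing about the real dataset.

```python
import pandas as pd
from sklearn.datasets import make_classification
from sklearn.tree import DecisionTreeClassifier

# Placeholder names for the synthetic columns (assumed, not from the notebook).
feature_names = ["age", "balance", "campaign", "previous_outcome"]

# Synthetic data: 2 informative features and 2 noise features.
X, y = make_classification(
    n_samples=300,
    n_features=4,
    n_informative=2,
    n_redundant=0,
    random_state=0,
)

tree = DecisionTreeClassifier(max_depth=4, random_state=0).fit(X, y)

# Importances sum to 1; higher means the feature reduced impurity more.
importances = pd.Series(
    tree.feature_importances_, index=feature_names
).sort_values(ascending=False)

print(importances)
```

Impurity-based importances can favor features with many distinct values, so `sklearn.inspection.permutation_importance` on held-out data is a common cross-check.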
# Libraries Used
- pandas for data manipulation
- matplotlib and seaborn for data visualization
- scikit-learn for machine learning model building and evaluation
- NumPy for numerical operations
# Dataset
The dataset used is similar to the Bank Marketing dataset from the UCI Machine Learning Repository, containing features like:
- Age: Age of the customer.
- Job: Type of job.
- Marital Status: Whether the customer is married, single, or divorced.
- Education: Level of education.
- Balance: Bank balance of the customer.
- Campaign: Number of contacts performed during this campaign.
- Previous Outcome: Outcome of the previous marketing campaign.
- Purchase: Whether the customer purchased the product (target variable).
# Results
The decision tree classifier performed well in predicting customer purchases, with a focus on improving accuracy and interpretability through feature importance analysis.
# How to Use This Repository
1. Clone the repository to your local machine.
2. Install the required libraries:
   ```bash
   pip install -r requirements.txt
   ```
3. Run the Jupyter Notebook or Python scripts to see the full analysis and model-building process.
# Future Work
- Implement other machine learning models like Random Forest or Logistic Regression for comparison.
- Explore feature engineering techniques to improve model performance.
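
A comparison along these lines could start from a sketch like the one below, which scores each model with 5-fold cross-validation. It uses synthetic data, and the model settings and the F1 scoring are assumptions made for the example; nothing here reflects results from the project.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.tree import DecisionTreeClassifier

# Synthetic stand-in for the preprocessed customer data.
X, y = make_classification(
    n_samples=400, n_features=8, n_informative=4, random_state=1
)

# Candidate models (illustrative settings only).
models = {
    "decision_tree": DecisionTreeClassifier(max_depth=5, random_state=1),
    "random_forest": RandomForestClassifier(n_estimators=100, random_state=1),
    # Logistic regression benefits from scaled inputs, so it gets a scaler.
    "logistic_regression": make_pipeline(
        StandardScaler(), LogisticRegression(max_iter=1000)
    ),
}

# Mean F1 across 5 cross-validation folds for each model.
scores = {
    name: cross_val_score(model, X, y, cv=5, scoring="f1").mean()
    for name, model in models.items()
}

for name, score in scores.items():
    print(f"{name}: mean F1 = {score:.3f}")
```

Using the same folds and the same metric for every model keeps the comparison fair.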
# Glimpse of Data Visualisation
![classification_report_heatmap](https://github.com/user-attachments/assets/be02b9c6-3e7c-4e5a-9ee9-f272321078de)