https://github.com/ashishsingh789/customer_purchase_prediction_using_decision-tree-_classifier

Decision Tree Classifier to predict customer purchases using demographic and behavioral data. Key steps: data preprocessing, EDA, model training, evaluation, and feature importance analysis.

Host: GitHub
URL: https://github.com/ashishsingh789/customer_purchase_prediction_using_decision-tree-_classifier
Owner: AshishSingh789
Created: 2024-10-01T17:18:45.000Z (9 months ago)
Default Branch: main
Last Pushed: 2024-10-01T17:28:02.000Z (9 months ago)
Last Synced: 2024-10-19T12:04:27.703Z (9 months ago)
Topics: data, datascience, desiciontree, eda, machine-learning-algorithms, matplotlib, numpy, pandas-dataframe, python, seaborn
Language: Jupyter Notebook
Homepage:
Size: 130 KB
Stars: 0
Watchers: 1
Forks: 0
Open Issues: 0
Metadata Files:
- Readme: README.md

# Customer Purchase Prediction Using Decision Tree Classifier

# Project Overview

This project involves building a decision tree classifier to predict whether a customer will purchase a product or service. The prediction is based on demographic and behavioral data. The dataset used is similar to the Bank Marketing dataset from the UCI Machine Learning Repository.

# Task Description

The task was part of my virtual internship at Prodigy Infotech. The main objective was to build a machine learning model (Decision Tree) that can classify customers based on whether they are likely to purchase a product or service. This model helps in understanding which customer segments are most likely to convert, aiding marketing and sales efforts.

# Key Steps Performed

## Data Preprocessing

- Handled missing values.
- Categorical variables were encoded using techniques like one-hot encoding.
- Scaled and normalized the data for better model performance.

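The preprocessing steps above could be sketched roughly as follows. This is a minimal illustration, assuming scikit-learn's `ColumnTransformer` and `Pipeline`; the notebook's actual code may differ. The column names, the tiny synthetic frame, and the imputation strategies (median, most frequent) are assumptions made for the example.

```python
import numpy as np
import pandas as pd
from sklearn.compose import ColumnTransformer
from sklearn.impute import SimpleImputer
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import OneHotEncoder, StandardScaler

# Tiny synthetic frame; column names are hypothetical stand-ins for the real data.
df = pd.DataFrame({
    "age": [25, 40, np.nan, 58],
    "balance": [1200.0, -50.0, 300.0, np.nan],
    "job": ["admin", "technician", np.nan, "admin"],
    "marital": ["single", "married", "married", "divorced"],
})

numeric_cols = ["age", "balance"]
categorical_cols = ["job", "marital"]

# Numeric columns: fill gaps with the median, then standardize.
numeric_pipe = Pipeline([
    ("impute", SimpleImputer(strategy="median")),
    ("scale", StandardScaler()),
])

# Categorical columns: fill gaps with the most frequent value, then one-hot encode.
categorical_pipe = Pipeline([
    ("impute", SimpleImputer(strategy="most_frequent")),
    ("onehot", OneHotEncoder(handle_unknown="ignore")),
])

# sparse_threshold=0 forces a dense NumPy array as output.
preprocess = ColumnTransformer(
    [
        ("num", numeric_pipe, numeric_cols),
        ("cat", categorical_pipe, categorical_cols),
    ],
    sparse_threshold=0,
)

X = preprocess.fit_transform(df)
```

Scaling is included here only to mirror the step listed above; decision tree splits depend on the ordering of feature values, so standardizing typically does not change the fitted tree.
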
## Exploratory Data Analysis (EDA)

- Analyzed the distribution of variables (age, job type, marital status, etc.).
- Examined relationships between key variables and purchase decisions.
- Visualized correlations between features using heatmaps.

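A rough sketch of these EDA steps with pandas is shown below. The rows are synthetic and the column names are hypothetical, so the numbers say nothing about the real dataset; the point is only the shape of each computation.

```python
import pandas as pd

# Synthetic rows for illustration only; column names are hypothetical.
df = pd.DataFrame({
    "age": [25, 40, 33, 58, 47, 29],
    "balance": [1200, -50, 300, 4100, 2500, 80],
    "campaign": [1, 3, 2, 1, 1, 4],
    "marital": ["single", "married", "married", "divorced", "married", "single"],
    "purchase": [0, 0, 0, 1, 1, 0],
})

# Distribution of one numeric and one categorical variable.
age_summary = df["age"].describe()
marital_counts = df["marital"].value_counts()

# Relationship between a categorical feature and the target:
# the mean of a 0/1 target is the purchase rate within each group.
purchase_rate = df.groupby("marital")["purchase"].mean()

# Pairwise Pearson correlations between numeric columns (the input to a heatmap).
corr = df.select_dtypes("number").corr()
```

Passing `corr` to `seaborn.heatmap(corr, annot=True)` would then draw the correlation heatmap.
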
## Building the Decision Tree Classifier

- Split the data into training and testing sets.
- Trained a decision tree classifier to predict customer purchases.
- Optimized the model by tuning hyperparameters like maximum depth, splitting criteria, etc.

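One way to express the split, training, and tuning steps is sketched below, assuming scikit-learn's `train_test_split` and `GridSearchCV`. The 80/20 split, the parameter grid, the F1 scoring choice, and the synthetic data from `make_classification` are all assumptions for illustration, not the settings used in the notebook.

```python
from sklearn.datasets import make_classification
from sklearn.model_selection import GridSearchCV, train_test_split
from sklearn.tree import DecisionTreeClassifier

# Synthetic stand-in for the preprocessed customer features and purchase labels.
X, y = make_classification(
    n_samples=400, n_features=8, n_informative=4, random_state=42
)

# Hold out 20% for testing; stratify to keep the class balance in both parts.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, stratify=y, random_state=42
)

# Hypothetical search space over depth, split criterion, and leaf size.
param_grid = {
    "max_depth": [3, 5, 8, None],
    "criterion": ["gini", "entropy"],
    "min_samples_leaf": [1, 5, 10],
}

# 5-fold cross-validated grid search on the training set only.
search = GridSearchCV(
    DecisionTreeClassifier(random_state=42),
    param_grid,
    cv=5,
    scoring="f1",
)
search.fit(X_train, y_train)

best_tree = search.best_estimator_
test_accuracy = best_tree.score(X_test, y_test)
```

Tuning on cross-validation folds of the training set keeps the test set untouched until the final score, which avoids an optimistic estimate.
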
## Model Evaluation

- Evaluated the model using metrics like accuracy, precision, recall, and F1-score.
- Generated a confusion matrix to visualize performance.

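The metrics named above can be computed with `sklearn.metrics`, as in this small sketch. The label lists are hand-written for illustration (1 = purchased, 0 = did not purchase) and are not results from this project.

```python
from sklearn.metrics import (
    accuracy_score,
    confusion_matrix,
    f1_score,
    precision_score,
    recall_score,
)

# Hand-written labels for illustration only.
y_true = [1, 0, 1, 1, 0, 0, 1, 0, 0, 0]
y_pred = [1, 0, 0, 1, 0, 1, 1, 0, 0, 0]

accuracy = accuracy_score(y_true, y_pred)    # share of correct predictions
precision = precision_score(y_true, y_pred)  # TP / (TP + FP)
recall = recall_score(y_true, y_pred)        # TP / (TP + FN)
f1 = f1_score(y_true, y_pred)                # harmonic mean of precision and recall

# Rows are true classes, columns are predicted classes: [[TN, FP], [FN, TP]].
cm = confusion_matrix(y_true, y_pred)
```

For these example labels there are 3 true positives, 1 false positive, 1 false negative, and 5 true negatives, so accuracy is 0.8 and precision, recall, and F1 are each 0.75. A call such as `seaborn.heatmap(cm, annot=True, fmt="d")` would visualize the matrix.
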
## Feature Importance

- Analyzed which features contributed the most to the model's decision-making process.
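
Feature importance for a fitted tree can be read from scikit-learn's `feature_importances_` attribute. The sketch below uses synthetic data and hypothetical feature names, so the resulting ranking is meaningless for the real dataset; it only shows the mechanics.

```python
import pandas as pd
from sklearn.datasets import make_classification
from sklearn.tree import DecisionTreeClassifier

# Hypothetical names attached to synthetic columns, purely as labels.
feature_names = ["age", "balance", "campaign", "marital_married", "education_tertiary"]

X, y = make_classification(
    n_samples=300, n_features=5, n_informative=3, n_redundant=0, random_state=0
)
tree = DecisionTreeClassifier(max_depth=4, random_state=0).fit(X, y)

# Impurity-based importances, normalized by scikit-learn to sum to 1.
importances = pd.Series(
    tree.feature_importances_, index=feature_names
).sort_values(ascending=False)
```

These impurity-based scores tend to favor features with many distinct values, so they are best read as a rough guide rather than a definitive ranking.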

# Libraries Used

- pandas for data manipulation
- matplotlib and seaborn for data visualization
- scikit-learn for machine learning model building and evaluation
- NumPy for numerical operations

# Dataset

The dataset used is similar to the Bank Marketing dataset from the UCI Machine Learning Repository, containing features like:

- Age: Age of the customer.
- Job: Type of job.
- Marital Status: Whether the customer is married, single, or divorced.
- Education: Level of education.
- Balance: Bank balance of the customer.
- Campaign: Number of contacts performed during this campaign.
- Previous Outcome: Outcome of the previous marketing campaign.
- Purchase: Whether the customer purchased the product (target variable).

# Results

The decision tree classifier performed well in predicting customer purchases, with a focus on improving accuracy and interpretability through feature importance analysis.

# How to Use This Repository

1. Clone the repository to your local machine.
2. Install the required libraries by running the following:

   ```bash
   pip install -r requirements.txt
   ```

3. Run the Jupyter Notebook or Python scripts to see the full analysis and model-building process.

# Future Work

- Implement other machine learning models like Random Forest or Logistic Regression for comparison.
- Explore feature engineering techniques to improve model performance.

# Glimpse of Data Visualisation

![classification_report_heatmap](https://github.com/user-attachments/assets/be02b9c6-3e7c-4e5a-9ee9-f272321078de)
