https://github.com/razalkr70/customer-segmentation-using-dataset

A data science project that segments mall customers using K-Means clustering. Based on age, income, and spending score, it identifies customer groups and visualizes them with 2D and 3D plots for targeted marketing insights.

Host: GitHub
URL: https://github.com/razalkr70/customer-segmentation-using-dataset
Owner: Razalkr70
Created: 2025-05-13T11:51:14.000Z (about 1 year ago)
Default Branch: main
Last Pushed: 2025-05-14T04:32:52.000Z (about 1 year ago)
Last Synced: 2025-05-14T06:05:49.183Z (about 1 year ago)
Topics: clustering, customer-segmentation, data-science, data-visualization, kmeans, machine-learning, pca, python, scikit-learn
Language: Python
Homepage:
Size: 382 KB
Stars: 0
Watchers: 1
Forks: 0
Open Issues: 0
Metadata Files:
- Readme: README.md


# 🛍️ Customer Segmentation using K-Means Clustering

This project performs customer segmentation on a mall customer dataset using the K-Means clustering algorithm. It identifies groups based on features like age, income, and spending score, and visualizes the clusters using pair plots and PCA in 3D.

## 📌 Overview

Customer segmentation is a key technique in marketing and business analytics. In this project, the K-Means algorithm is applied to group customers based on their demographics and spending patterns.

### ✨ Features

- Data preprocessing and feature scaling
- Gender encoding
- Elbow method to determine optimal `k`
- Cluster formation using K-Means
- Cluster-wise statistical summary
- Visualizations using seaborn and matplotlib
- 3D PCA for better insight into clusters
- Customer labeling using custom logic
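
As a rough illustration of the elbow method listed above, the sketch below fits K-Means for several values of `k` and records the inertia (within-cluster sum of squares) for each. It is not taken from the project's script: the synthetic `make_blobs` data, the helper name `elbow_inertias`, and the range `k = 1..10` are all assumptions chosen for the example.

```python
from sklearn.cluster import KMeans
from sklearn.datasets import make_blobs
from sklearn.preprocessing import StandardScaler

# Synthetic stand-in for the scaled customer features (assumption:
# three numeric columns, e.g. age, income, and spending score).
X, _ = make_blobs(n_samples=200, centers=5, n_features=3, random_state=42)
X_scaled = StandardScaler().fit_transform(X)


def elbow_inertias(data, k_values):
    """Fit K-Means for each k and return the within-cluster sum of squares."""
    inertias = []
    for k in k_values:
        model = KMeans(n_clusters=k, n_init=10, random_state=42)
        model.fit(data)
        inertias.append(model.inertia_)
    return inertias


k_values = list(range(1, 11))
inertias = elbow_inertias(X_scaled, k_values)

# The "elbow" is the k after which inertia stops dropping sharply.
for k, inertia in zip(k_values, inertias):
    print(f"k={k:2d}  inertia={inertia:.1f}")
```

Plotting `inertias` against `k_values` gives the usual elbow curve; the point where the curve flattens is a reasonable candidate for `k`.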

## 🛠️ Tech Stack

- Python
- Pandas, NumPy
- Matplotlib, Seaborn
- Scikit-learn
- PCA (Principal Component Analysis)

## 📊 How It Works

1. The dataset is preprocessed: gender is encoded and the features are scaled.
2. The elbow method is used to choose the number of clusters `k`.
3. K-Means is applied to group the customers.
4. The clusters are visualized with seaborn pair plots and a 3D PCA projection.
5. Each cluster is labeled with an intuitive name such as "Young Spenders" or "Savers".
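
The clustering, summary, and labeling steps could look roughly like the sketch below. It runs on synthetic data rather than `Mall_Customers.csv`, and several details are assumptions made for illustration: `k=5`, clustering on the three numeric columns only, the median-based cutoffs, the helper `label_cluster`, and the "Older Spenders" label. The project's own labeling logic may differ.

```python
import numpy as np
import pandas as pd
from sklearn.cluster import KMeans
from sklearn.preprocessing import StandardScaler

# Synthetic stand-in for the mall data; every value here is made up.
rng = np.random.default_rng(0)
n = 200
df = pd.DataFrame({
    "Age": rng.integers(18, 70, size=n),
    "Annual Income (k$)": rng.integers(15, 140, size=n),
    "Spending Score (1-100)": rng.integers(1, 101, size=n),
})
features = list(df.columns)

# Scale so that no single feature dominates the Euclidean distances.
X_scaled = StandardScaler().fit_transform(df[features])

# k=5 is an assumed value; in practice it would be read off the elbow curve.
kmeans = KMeans(n_clusters=5, n_init=10, random_state=42)
df["Cluster"] = kmeans.fit_predict(X_scaled)

# Cluster-wise statistical summary: mean of each feature plus cluster size.
summary = df.groupby("Cluster")[features].mean().round(1)
summary["Size"] = df["Cluster"].value_counts().sort_index()


def label_cluster(mean_age, mean_score, age_cut, score_cut):
    """Hypothetical rule: name a cluster from its mean age and spending score."""
    if mean_score < score_cut:
        return "Savers"
    return "Young Spenders" if mean_age < age_cut else "Older Spenders"


age_cut = df["Age"].median()
score_cut = df["Spending Score (1-100)"].median()
summary["Label"] = [
    label_cluster(age, score, age_cut, score_cut)
    for age, score in zip(summary["Age"], summary["Spending Score (1-100)"])
]

# Attach the cluster label to every customer row.
df["Segment"] = df["Cluster"].map(summary["Label"])
print(summary)
```

Labeling from the per-cluster means keeps the naming rule separate from the clustering itself, so the thresholds can be adjusted without refitting the model.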

## 📂 Dataset

`Mall_Customers.csv` should be in your working directory. It contains the following columns:

- CustomerID
- Gender
- Age
- Annual Income (k$)
- Spending Score (1-100)
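
A minimal sketch of loading and preprocessing a file with these columns is shown below. To stay self-contained it reads four made-up rows from an in-memory string; with the real file you would call `pd.read_csv("Mall_Customers.csv")` instead. The `Male`/`Female` values and the 0/1 encoding are assumptions, as is dropping `CustomerID` before scaling.

```python
import io

import pandas as pd
from sklearn.preprocessing import StandardScaler

# Made-up rows in the documented column layout (not real customer data).
sample_csv = io.StringIO(
    "CustomerID,Gender,Age,Annual Income (k$),Spending Score (1-100)\n"
    "1,Male,25,40,60\n"
    "2,Female,34,72,18\n"
    "3,Female,52,30,45\n"
    "4,Male,41,95,88\n"
)
df = pd.read_csv(sample_csv)  # swap in pd.read_csv("Mall_Customers.csv")

# Fail early if the file does not have the expected columns.
expected = ["CustomerID", "Gender", "Age",
            "Annual Income (k$)", "Spending Score (1-100)"]
missing = [col for col in expected if col not in df.columns]
if missing:
    raise ValueError(f"Missing columns: {missing}")

# CustomerID is an identifier, not a feature, so it is dropped here.
df = df.drop(columns=["CustomerID"])

# Gender encoding (assumed category names and 0/1 mapping).
df["Gender"] = df["Gender"].map({"Male": 0, "Female": 1})
if df["Gender"].isna().any():
    raise ValueError("Unexpected value in the Gender column")

# Feature scaling: zero mean and unit variance per column.
features = ["Gender", "Age", "Annual Income (k$)", "Spending Score (1-100)"]
X_scaled = StandardScaler().fit_transform(df[features])
print(X_scaled.round(2))
```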

## 🚀 Run the Code

```bash
pip install pandas numpy matplotlib seaborn scikit-learn
python customer_segmentation.py
```

### 📈 Sample Output

- Cluster visualization via pairplots
- 3D PCA cluster plot
- Cluster statistics
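
For reference, a 3D PCA cluster plot of this kind can be produced along the following lines. This is only a sketch on synthetic data: the four-feature `make_blobs` input, `k=5`, the non-interactive `Agg` backend, and the output file name `pca_clusters_3d.png` are assumptions for the example, not details of the project's script.

```python
import matplotlib

matplotlib.use("Agg")  # non-interactive backend so the sketch runs without a display
import matplotlib.pyplot as plt
from sklearn.cluster import KMeans
from sklearn.datasets import make_blobs
from sklearn.decomposition import PCA
from sklearn.preprocessing import StandardScaler

# Synthetic stand-in for four scaled customer features.
X, _ = make_blobs(n_samples=200, centers=5, n_features=4, random_state=7)
X_scaled = StandardScaler().fit_transform(X)

# Cluster first, then project to three principal components for plotting.
labels = KMeans(n_clusters=5, n_init=10, random_state=7).fit_predict(X_scaled)
coords = PCA(n_components=3).fit_transform(X_scaled)

fig = plt.figure(figsize=(7, 6))
ax = fig.add_subplot(projection="3d")
scatter = ax.scatter(coords[:, 0], coords[:, 1], coords[:, 2],
                     c=labels, cmap="viridis", s=25)
ax.set_xlabel("PC1")
ax.set_ylabel("PC2")
ax.set_zlabel("PC3")
ax.set_title("K-Means clusters in PCA space")
fig.colorbar(scatter, ax=ax, label="Cluster")

# Save to a file instead of opening a window.
fig.savefig("pca_clusters_3d.png", dpi=120)
plt.close(fig)
```

In an interactive session you could drop the `matplotlib.use("Agg")` line and call `plt.show()` to rotate the plot.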
