https://github.com/kanika300393/customer_segmentation
This project performs customer segmentation on a retail dataset using K-Means clustering. The dataset includes features like annual income and spending score, which are used to group customers into 5 distinct segments. The Elbow method is applied to determine the optimal number of clusters, and visualizations are generated to represent the groups.
Last synced: 5 months ago
- Host: GitHub
- URL: https://github.com/kanika300393/customer_segmentation
- Owner: Kanika300393
- Created: 2024-12-26T00:43:25.000Z (7 months ago)
- Default Branch: main
- Last Pushed: 2024-12-26T01:04:22.000Z (7 months ago)
- Last Synced: 2025-01-06T02:18:53.920Z (6 months ago)
- Language: Jupyter Notebook
- Homepage:
- Size: 552 KB
- Stars: 0
- Watchers: 1
- Forks: 0
- Open Issues: 0
Metadata Files:
- Readme: README.md
README
# Customer Segmentation

This project focuses on segmenting customers based on their annual income and spending score using K-Means Clustering. The objective is to identify distinct customer groups that can be targeted with personalized marketing strategies.
## Project Overview
The project analyzes a dataset of mall customers and clusters them by annual income and spending score. K-Means is used as the unsupervised learning algorithm, with the number of clusters determined using the Elbow Method.

---
### Workflow
| **Step** | **Description** |
|------------------------------------|----------------------------------------------------------------------------------------------------------------------------------------------------------|
| **Data Collection and Analysis** | Loaded the dataset from `Mall_Customers.csv` and explored its structure. Checked for missing values and summary statistics of the dataset. |
| **Feature Selection** | Selected two features: **Annual Income** and **Spending Score**, to cluster the customers. |
| **Optimal Number of Clusters** | Used the Elbow Method to determine the optimal number of clusters. The number of clusters was found to be **5** based on the Within-Cluster Sum of Squares (WCSS) graph. |
| **K-Means Clustering** | Applied K-Means clustering with **5 clusters**, and obtained labels for each customer. |

---
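
The workflow steps above can be sketched roughly as follows. This is a minimal illustration, not the notebook's actual code: it assumes scikit-learn and NumPy are installed, and it uses synthetic data in place of `Mall_Customers.csv` so the snippet is self-contained. The blob centers, the `random_state` values, and all variable names are hypothetical.

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.datasets import make_blobs

# Synthetic stand-in for the two selected features (annual income, spending score).
# With the real data, you would load Mall_Customers.csv and select those two columns.
centers = [(25, 20), (25, 80), (55, 50), (88, 17), (88, 82)]  # hypothetical values
X, _ = make_blobs(n_samples=200, centers=centers, cluster_std=5.0, random_state=42)

# Elbow Method: fit K-Means for k = 1..10 and record the WCSS,
# which scikit-learn exposes as the fitted model's `inertia_` attribute.
wcss = []
for k in range(1, 11):
    model = KMeans(n_clusters=k, init="k-means++", n_init=10, random_state=42)
    model.fit(X)
    wcss.append(model.inertia_)

# Final model with the 5 clusters reported in the table above.
kmeans = KMeans(n_clusters=5, init="k-means++", n_init=10, random_state=42)
labels = kmeans.fit_predict(X)

print("WCSS per k:", [round(w, 1) for w in wcss])
print("Cluster sizes:", np.bincount(labels))
```

Plotting `wcss` against `k` gives the elbow graph. The "elbow" is the point after which adding more clusters reduces the WCSS only marginally.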
## Visualization
Plotted the customer clusters with different colors and highlighted the centroids to show how customers are grouped based on income and spending scores.
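
A plot of this kind could be produced along the following lines. This is a sketch under assumptions, not the notebook's code: it assumes matplotlib and scikit-learn, and reuses synthetic two-feature data in place of the real dataset. The colors, figure size, and output filename `customer_clusters.png` are hypothetical choices.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so the script runs without a display
import matplotlib.pyplot as plt
from sklearn.cluster import KMeans
from sklearn.datasets import make_blobs

# Synthetic stand-in for (annual income, spending score); values are hypothetical.
centers = [(25, 20), (25, 80), (55, 50), (88, 17), (88, 82)]
X, _ = make_blobs(n_samples=200, centers=centers, cluster_std=5.0, random_state=42)

kmeans = KMeans(n_clusters=5, n_init=10, random_state=42)
labels = kmeans.fit_predict(X)

fig, ax = plt.subplots(figsize=(8, 6))
colors = ["green", "red", "gold", "violet", "blue"]

# One scatter call per cluster so each group gets its own color and legend entry.
for cluster_id, color in enumerate(colors):
    mask = labels == cluster_id
    ax.scatter(X[mask, 0], X[mask, 1], s=50, c=color, label=f"Cluster {cluster_id + 1}")

# Highlight the centroids with a larger, distinct marker.
ax.scatter(
    kmeans.cluster_centers_[:, 0],
    kmeans.cluster_centers_[:, 1],
    s=150, c="cyan", marker="X", edgecolors="black", label="Centroids",
)

ax.set_title("Customer Groups")
ax.set_xlabel("Annual Income")
ax.set_ylabel("Spending Score")
ax.legend()
fig.savefig("customer_clusters.png", dpi=150)
```

In a Jupyter notebook, you would typically drop the `matplotlib.use("Agg")` line and call `plt.show()` instead of saving the figure.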

## Results
The model successfully segmented the customers into 5 distinct groups based on their annual income and spending score, enabling targeted strategies for each group.
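
One way to turn the cluster labels into per-group profiles is to aggregate the features by cluster. The sketch below assumes pandas and scikit-learn and again uses synthetic data, so the printed numbers are illustrative only and do not reflect the project's actual results. The column names `annual_income` and `spending_score` are hypothetical.

```python
import pandas as pd
from sklearn.cluster import KMeans
from sklearn.datasets import make_blobs

# Synthetic stand-in for the two features; values are hypothetical.
centers = [(25, 20), (25, 80), (55, 50), (88, 17), (88, 82)]
X, _ = make_blobs(n_samples=200, centers=centers, cluster_std=5.0, random_state=42)

df = pd.DataFrame(X, columns=["annual_income", "spending_score"])
df["cluster"] = KMeans(n_clusters=5, n_init=10, random_state=42).fit_predict(X)

# Size and mean feature values of each segment.
profile = df.groupby("cluster").agg(
    customers=("annual_income", "size"),
    mean_income=("annual_income", "mean"),
    mean_spending=("spending_score", "mean"),
).round(1)

print(profile)
```

A table like this makes it easier to describe each segment (for example, high income with low spending) before deciding how to target it.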