https://github.com/harshjuly12/customer-segmentation-using-k-means-clustering

Repository for customer segmentation using KMeans clustering, utilizing techniques for data analysis and cluster identification. Includes dataset from Kaggle and open-source tools.
# Customer Segmentation Using K-Means


## Table of Contents
1. [Project Overview](#project-overview)
2. [Dataset](#dataset)
3. [Project Structure](#project-structure)
4. [Requirements](#requirements)
5. [Installation](#installation)
6. [Usage](#usage)
7. [Analysis and Results](#analysis-and-results)
8. [Contributing](#contributing)
9. [License](#license)
10. [Author](#author)

## Project Overview
This project demonstrates customer segmentation with the K-means clustering algorithm. The goal is to group the customers of a retail store into segments based on their demographics, income, and spending behavior.

## Dataset
The dataset used for this project can be found on [Kaggle](https://www.kaggle.com/datasets/vjchoudhary7/customer-segmentation-tutorial-in-python). It contains the following columns:
- `CustomerID`: Unique identifier for each customer
- `Gender`: Gender of the customer
- `Age`: Age of the customer
- `Annual Income (k$)`: Annual income of the customer in thousands of dollars
- `Spending Score (1-100)`: Spending score assigned by the store based on customer behavior
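
As a quick sanity check before any analysis, the columns listed above can be verified at load time. The sketch below is illustrative only: `load_customers` and `EXPECTED_COLUMNS` are hypothetical names, the column names are copied from this README (adjust them if your copy of the file uses different headers), and the sample rows are synthetic rather than taken from the Kaggle file.

```python
import io

import pandas as pd

# Column names as listed in this README (assumed to match the CSV header).
EXPECTED_COLUMNS = [
    "CustomerID",
    "Gender",
    "Age",
    "Annual Income (k$)",
    "Spending Score (1-100)",
]


def load_customers(source):
    """Read the CSV and fail early if an expected column is missing."""
    df = pd.read_csv(source)
    missing = [c for c in EXPECTED_COLUMNS if c not in df.columns]
    if missing:
        raise ValueError(f"missing columns: {missing}")
    return df


# Synthetic rows for illustration only; not taken from the Kaggle dataset.
SAMPLE_CSV = """CustomerID,Gender,Age,Annual Income (k$),Spending Score (1-100)
1,Male,25,30,40
2,Female,40,60,70
3,Female,33,85,20
"""

sample = load_customers(io.StringIO(SAMPLE_CSV))
print(sample.dtypes)
```

With the real file, the same function would be called as `load_customers("Mall_Customers.csv")`.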

## Project Structure
The repository contains the following files:
- `CustomerSegmentationUsingKMeans.ipynb`: Jupyter Notebook with the complete analysis and K-means clustering implementation
- `Mall_Customers.csv`: Dataset used for the analysis

## Requirements
To run the project, you need the following libraries:
- NumPy
- Pandas
- Matplotlib
- Seaborn
- Plotly
- Scikit-learn

## Installation
1. **Clone the repo:**
```sh
git clone https://github.com/harshjuly12/customer-segmentation-using-k-means-clustering.git
cd customer-segmentation-using-k-means-clustering
```

2. **Install the required Python packages:**
```sh
pip install numpy pandas matplotlib seaborn plotly scikit-learn
```
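
To confirm that the installation succeeded, a short check like the following may help. It is a sketch, not part of the repository: `missing_packages` and `REQUIRED` are hypothetical names, and the mapping from pip package name to import name is written out by hand.

```python
import importlib.util

# pip package name -> importable module name
REQUIRED = {
    "numpy": "numpy",
    "pandas": "pandas",
    "matplotlib": "matplotlib",
    "seaborn": "seaborn",
    "plotly": "plotly",
    "scikit-learn": "sklearn",
}


def missing_packages(required=REQUIRED):
    """Return the pip names of packages whose module cannot be found."""
    return [
        pip_name
        for pip_name, module in required.items()
        if importlib.util.find_spec(module) is None
    ]


print("Missing packages:", missing_packages() or "none")
```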

## Usage
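Open the notebook with `jupyter notebook CustomerSegmentationUsingKMeans.ipynb` and run the cells from top to bottom. Jupyter is not in the package list above; install it with `pip install notebook` if it is missing.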

## Analysis and Results
The notebook contains the following steps:
1. Importing Libraries: Importing necessary libraries for analysis and visualization.
2. Data Exploration: Exploring the dataset to understand the distribution and relationships between different variables.
3. Data Preprocessing: Preparing the data for clustering by scaling the features.
4. K-means Clustering: Implementing K-means clustering to group customers into segments.
5. Visualization: Visualizing the clusters to interpret the results.
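
The preprocessing and clustering steps above can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the notebook's actual code: the choice of features (`Annual Income (k$)` and `Spending Score (1-100)`), the number of clusters, and the helper names `segment_customers` and `inertia_by_k` are all assumptions, and the data is a synthetic stand-in rather than the Kaggle dataset.

```python
import numpy as np
import pandas as pd
from sklearn.cluster import KMeans
from sklearn.preprocessing import StandardScaler

# Assumed feature choice; the notebook may use different columns.
FEATURES = ["Annual Income (k$)", "Spending Score (1-100)"]


def segment_customers(df, n_clusters, random_state=42):
    """Scale the chosen features, fit K-means, return (labels, model)."""
    scaled = StandardScaler().fit_transform(df[FEATURES])
    model = KMeans(n_clusters=n_clusters, n_init=10, random_state=random_state)
    labels = model.fit_predict(scaled)
    return labels, model


def inertia_by_k(df, k_values, random_state=42):
    """Within-cluster sum of squares per k, the input to an elbow plot."""
    scaled = StandardScaler().fit_transform(df[FEATURES])
    return {
        k: KMeans(n_clusters=k, n_init=10, random_state=random_state)
        .fit(scaled)
        .inertia_
        for k in k_values
    }


# Synthetic stand-in data: three well-separated blobs, NOT the real dataset.
rng = np.random.default_rng(0)
centers = np.array([[20.0, 80.0], [60.0, 50.0], [100.0, 15.0]])
points = np.vstack([c + rng.normal(scale=3.0, size=(30, 2)) for c in centers])
demo = pd.DataFrame(points, columns=FEATURES)

labels, model = segment_customers(demo, n_clusters=3)
demo["Segment"] = labels

# Per-segment feature means help interpret what each cluster represents.
print(demo.groupby("Segment")[FEATURES].mean().round(1))
print(inertia_by_k(demo, range(1, 6)))
```

Scaling matters here because K-means relies on Euclidean distance, so a feature with a larger numeric range would otherwise dominate the clustering. The inertia values can be plotted against `k` to look for an elbow when choosing the number of segments.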

## Contributing
Contributions are welcome! Please open an issue or submit a pull request for any improvements or bug fixes.

## License
This project is licensed under the MIT License - see the [LICENSE](LICENSE) file for details.

## Author
For any questions or suggestions, please contact:
- Harsh Singh: [harshjuly12@gmail.com](mailto:harshjuly12@gmail.com)
- GitHub: [harshjuly12](https://github.com/harshjuly12)