Ecosyste.ms: Awesome
An open API service indexing awesome lists of open source software.
https://github.com/dmarks84/ind_project_mall-customer-clustering--kaggle
Independent Project - Kaggle Dataset: I worked with the Mall Customer Segmentation Dataset, which provides various instances of shoppers of different ages, incomes, etc. I used unsupervised ML clustering algorithms to identify useful customer segments.
- Host: GitHub
- URL: https://github.com/dmarks84/ind_project_mall-customer-clustering--kaggle
- Owner: dmarks84
- License: bsd-3-clause
- Created: 2024-02-12T19:58:53.000Z (9 months ago)
- Default Branch: main
- Last Pushed: 2024-02-27T22:39:11.000Z (8 months ago)
- Last Synced: 2024-02-27T23:37:13.286Z (8 months ago)
- Topics: clustering, dataframes, dbscan, kmeans-clustering, market-segmentation, mean-shift, pandas, python, sklearn, technical-analysis, technical-communication, unsupervised-ml
- Language: Jupyter Notebook
- Homepage: https://www.kaggle.com/code/danpmarks/mall-customer-clustering
- Size: 1.36 MB
- Stars: 0
- Watchers: 1
- Forks: 0
- Open Issues: 0
Metadata Files:
- Readme: README.md
- License: LICENSE
README
# Ind_Project_Mall-Customer-Clustering--Kaggle
## Screenshot
![screenshot](https://github.com/dmarks84/Ind_Project_Mall-Customer-Clustering--Kaggle/blob/main/mall_screenshot.png)

## Summary
I worked with the Mall Customer Segmentation Dataset, which provides various instances of shoppers of different ages, incomes, etc. I first reviewed the data and used pair plots to determine that gender was not a great differentiator/indicator of customer segmentation. I then employed three different Scikit-learn unsupervised clustering algorithms (K-Means, DBSCAN, and Mean Shift) to determine whether customer segmentation as a function of the other features, Age and Income, was possible and/or clear. With K-Means, I plotted inertia for different numbers of clusters to home in on potential elbow points, and I ran through several combinations of `min_samples` and epsilon for DBSCAN.

## Results
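The inertia sweep and the DBSCAN parameter search mentioned in the Summary could look roughly like the sketch below. It is not the notebook's actual code: the data is a synthetic stand-in generated with `make_blobs` (not the Kaggle dataset), and the helper names `inertia_by_k` and `dbscan_sweep`, the value ranges, and the standardization step are all assumptions made for illustration.

```python
import numpy as np
from sklearn.cluster import DBSCAN, KMeans
from sklearn.datasets import make_blobs
from sklearn.preprocessing import StandardScaler

# ASSUMPTION: synthetic two-feature data standing in for income and
# spending score. This is NOT the Mall Customer Segmentation Dataset.
X, _ = make_blobs(n_samples=200, centers=5, cluster_std=0.6, random_state=0)
X = StandardScaler().fit_transform(X)


def inertia_by_k(X, k_values):
    """Fit K-Means once per k and return {k: inertia} for an elbow plot."""
    return {
        k: KMeans(n_clusters=k, n_init=10, random_state=0).fit(X).inertia_
        for k in k_values
    }


def dbscan_sweep(X, eps_values, min_samples_values):
    """Return (eps, min_samples, n_clusters, n_noise) for each combination."""
    rows = []
    for eps in eps_values:
        for min_samples in min_samples_values:
            labels = DBSCAN(eps=eps, min_samples=min_samples).fit_predict(X)
            # DBSCAN marks noise points with the label -1.
            n_clusters = len(set(labels)) - (1 if -1 in labels else 0)
            n_noise = int(np.sum(labels == -1))
            rows.append((eps, min_samples, n_clusters, n_noise))
    return rows


# Inertia per k; plotting these values against k gives the elbow curve.
inertias = inertia_by_k(X, range(1, 11))
for k, inertia in inertias.items():
    print(f"k={k:2d}  inertia={inertia:.1f}")

# Hypothetical grid of eps / min_samples values.
sweep = dbscan_sweep(X, eps_values=(0.2, 0.3, 0.5), min_samples_values=(3, 5, 10))
for row in sweep:
    print("eps=%.1f  min_samples=%2d  clusters=%d  noise=%d" % row)
```

Plotting `inertias` against `k` and looking for the bend is the usual way to pick a candidate cluster count. For DBSCAN, the sweep table shows how sensitive the number of clusters and noise points is to each parameter pair.
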
Both K-Means and Mean Shift settled on 5 clusters when looking at the Spending Score as a function of Income. Their segment/cluster assignments overlapped substantially, giving a clear indication of the "meaning" of each segment:
- **Label 0** is "mid" income and "mid" spending
- **Label 1** is "high" income and "low" spending
- **Label 2** is "high" income and "high" spending
- **Label 3** is "low" income and "low" spending
- **Label 4** is "low" income and "high" spending

## Skills (Developed & Applied)
Programming, Python, Pandas, Dataframes, Unsupervised-ML, Clustering, DBSCAN, K-Means, Mean Shift, Scikit-learn, Technical Communication
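
## Sketch: Checking Cluster Agreement

The Results note that the K-Means and Mean Shift assignments overlapped substantially. One way to quantify that kind of overlap is the adjusted Rand index together with a contingency table, as in the minimal sketch below. It is an illustration only: the data is synthetic (`make_blobs`, not the Kaggle dataset), the `quantile=0.2` bandwidth setting is an arbitrary assumption, and the README does not say which agreement measure, if any, the notebook used.

```python
from sklearn.cluster import KMeans, MeanShift, estimate_bandwidth
from sklearn.datasets import make_blobs
from sklearn.metrics import adjusted_rand_score
from sklearn.metrics.cluster import contingency_matrix

# ASSUMPTION: synthetic two-feature data standing in for income and
# spending score. This is NOT the Mall Customer Segmentation Dataset.
X, _ = make_blobs(n_samples=200, centers=5, cluster_std=0.6, random_state=0)

# K-Means needs the number of clusters up front.
kmeans_labels = KMeans(n_clusters=5, n_init=10, random_state=0).fit_predict(X)

# Mean Shift infers the number of clusters from the kernel bandwidth.
# The quantile value here is an arbitrary choice for illustration.
bandwidth = estimate_bandwidth(X, quantile=0.2, random_state=0)
mean_shift_labels = MeanShift(bandwidth=bandwidth).fit_predict(X)

# The adjusted Rand index ignores how labels are numbered: 1.0 means the
# two partitions are identical, values near 0 mean chance-level agreement.
ari = adjusted_rand_score(kmeans_labels, mean_shift_labels)
print(f"Adjusted Rand index: {ari:.3f}")

# Rows are K-Means labels, columns are Mean Shift labels; each cell counts
# the points that received that pair of labels.
table = contingency_matrix(kmeans_labels, mean_shift_labels)
print(table)
```

Because the two algorithms number their clusters independently, a label-invariant score such as the adjusted Rand index is more informative than comparing label values directly.
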