https://github.com/saniyaabushakimova/brewing-insights-with-unsupervised-learning
Conducted a clustering analysis to group beers by features such as Astringency, Alcohol content, Bitterness, Sourness, and more. Used k-medoids and hierarchical agglomerative clustering algorithms to form the groups. Tech: Python (numpy, pandas, seaborn, matplotlib, sklearn, scipy)
dendrogram hierarchical-clustering k-medoids-clustering python silhouette-score t-sne unsupervised-learning
- Host: GitHub
- URL: https://github.com/saniyaabushakimova/brewing-insights-with-unsupervised-learning
- Owner: SaniyaAbushakimova
- Created: 2024-07-12T14:45:56.000Z (over 1 year ago)
- Default Branch: main
- Last Pushed: 2024-07-13T22:14:15.000Z (over 1 year ago)
- Last Synced: 2024-07-14T17:21:05.134Z (about 1 year ago)
- Topics: dendrogram, hierarchical-clustering, k-medoids-clustering, python, silhouette-score, t-sne, unsupervised-learning
- Language: Jupyter Notebook
- Homepage:
- Size: 50 MB
- Stars: 0
- Watchers: 1
- Forks: 0
- Open Issues: 0
Metadata Files:
- Readme: README.md
README
Project completed on May 7, 2024.
## Project description
In the world of beer, certain varieties stand out due to their versatile flavors, making them popular choices among consumers. A business owner aiming to meet popular demand needs to curate a simple yet appealing range of beers. However, given the overwhelming number of beer styles available, it is impractical to include every type in the inventory.
This project uses **clustering analysis** to help the business owner identify a representative sample of beers. By examining various features of different beers (e.g., Astringency, Bitter, Alcohol), the analysis groups them into distinct clusters, enabling the owner to select a diverse yet manageable assortment for their inventory.
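To illustrate why a medoid-based method suits the goal of picking representative beers, here is a minimal pure-Python sketch of k-medoids. It is not the notebook's code: it uses a simple alternating (assign, then re-pick medoids) variant rather than any specific library implementation, the names `euclidean` and `k_medoids` are hypothetical, and the two-feature toy rows are invented placeholders rather than values from the dataset. The point it shows is that every cluster center is an actual row, so each medoid could be read directly as a beer to stock.

```python
import math
import random


def euclidean(a, b):
    """Euclidean distance between two equal-length feature vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def k_medoids(points, k, max_iter=100, seed=0):
    """Alternating k-medoids (illustrative sketch, not the notebook's method).

    Returns (medoids, labels): medoids holds row indices into `points`, and
    labels[i] is the position in `medoids` of the medoid nearest to points[i].
    """
    rng = random.Random(seed)
    n = len(points)
    # Full pairwise distance matrix; acceptable for small datasets only.
    dist = [[euclidean(p, q) for q in points] for p in points]

    def assign(medoids):
        # Each point joins the cluster of its nearest medoid.
        return [min(range(k), key=lambda c: dist[i][medoids[c]]) for i in range(n)]

    medoids = rng.sample(range(n), k)
    for _ in range(max_iter):
        labels = assign(medoids)
        new_medoids = []
        for c in range(k):
            members = [i for i in range(n) if labels[i] == c]
            if not members:
                # Empty cluster (possible with duplicate rows): keep old medoid.
                new_medoids.append(medoids[c])
                continue
            # New medoid: the member with the smallest total distance to the rest.
            new_medoids.append(min(members, key=lambda m: sum(dist[m][j] for j in members)))
        if new_medoids == medoids:
            break  # converged
        medoids = new_medoids
    return medoids, assign(medoids)


# Invented, already-scaled toy rows with two hypothetical features each.
toy = [(1.0, 1.0), (1.2, 0.9), (0.8, 1.1), (5.0, 5.0), (5.1, 4.8), (4.9, 5.2)]
medoids, labels = k_medoids(toy, k=2)
print("representative rows:", sorted(medoids))
print("cluster labels:", labels)
```

Because the result depends on the random starting medoids, a real analysis would typically try several seeds and compare the outcomes before settling on a final assortment.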
## Project outline
`analysis_and_report.ipynb`
1. Introduction
2. Dataset Discussion
3. Dataset Cleaning and Exploration
4. Basic Descriptive Analytics
5. Scaling Decisions
6. Clusterability and Clustering Structure
7. Clustering Algorithm Selection Motivation
8. Clustering Algorithm #1: K-Medoids
9. Clustering Algorithm #2: HAC with Ward's Linkage
10. Discussion
11. Conclusion
## Unsupervised Learning tools used in this project
- Hopkins Statistic
- t-SNE plot
- Elbow plot
- Average Silhouette score
- Silhouette plot
- Cluster Sorted Similarity Matrix
- K-Medoids Clustering
- Hierarchical Agglomerative Clustering (HAC) with Single, Complete, Average, Ward's linkages
- Dendrogram
## Other details
`beer_profile_and_ratings.csv` -- raw dataset (retrieved from [Kaggle](https://www.kaggle.com/datasets/ruthgn/beer-profile-and-ratings-data-set/data))
`presentation.pdf` -- a short presentation with the project overview
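
For readers unfamiliar with the average silhouette score named in the tools list, the following is a small pure-Python sketch of how it can be computed. It is an illustration under stated assumptions, not the notebook's code: the function names are hypothetical, the toy points are invented, and in practice a library routine from the packages listed in the project would normally be used instead. For each point, `a` is the mean distance to its own cluster, `b` is the mean distance to the nearest other cluster, and the point's score is `(b - a) / max(a, b)`.

```python
import math


def euclidean(a, b):
    """Euclidean distance between two equal-length feature vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def average_silhouette(points, labels):
    """Mean silhouette value over all points (illustrative sketch)."""
    clusters = {}
    for i, lab in enumerate(labels):
        clusters.setdefault(lab, []).append(i)
    if len(clusters) < 2:
        raise ValueError("the silhouette score needs at least two clusters")

    total = 0.0
    for i, p in enumerate(points):
        own = clusters[labels[i]]
        if len(own) == 1:
            continue  # common convention: a singleton cluster contributes 0
        # a: mean distance to the other members of the point's own cluster.
        a = sum(euclidean(p, points[j]) for j in own if j != i) / (len(own) - 1)
        # b: smallest mean distance to the members of any other cluster.
        b = min(
            sum(euclidean(p, points[j]) for j in other) / len(other)
            for lab, other in clusters.items()
            if lab != labels[i]
        )
        denom = max(a, b)
        if denom > 0:
            total += (b - a) / denom
    return total / len(points)


# Invented toy rows: two tight, well-separated groups.
toy = [(1.0, 1.0), (1.2, 0.9), (0.8, 1.1), (5.0, 5.0), (5.1, 4.8), (4.9, 5.2)]
print(average_silhouette(toy, [0, 0, 0, 1, 1, 1]))  # labels matching the groups
print(average_silhouette(toy, [0, 1, 0, 1, 0, 1]))  # deliberately mixed labels
```

Scores close to 1 indicate compact, well-separated clusters, while values near or below 0 suggest that points sit between clusters or are assigned to the wrong one, which is why the score is useful for comparing candidate numbers of clusters.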