https://github.com/saniyaabushakimova/brewing-insights-with-unsupervised-learning

Clustering analysis that groups beers by features such as Astringency, Alcohol content, Bitterness, and Sourness, using k-medoids and hierarchical agglomerative clustering. Tech: Python (numpy, pandas, seaborn, matplotlib, sklearn, scipy)

dendrogram hierarchical-clustering k-medoids-clustering python silhouette-score t-sne unsupervised-learning

Project completed on May 7, 2024.

## Project description

In the world of beer, certain varieties stand out due to their versatile flavors, making them popular choices among consumers. A business owner aiming to meet popular demand needs to curate a simple yet appealing range of beers. However, given the overwhelming number of beer styles available, it is impractical to include every type in the inventory.

This project uses **clustering analysis** to help the business owner identify a representative sample of beers. By examining features of each beer (e.g., Astringency, Bitter, Alcohol), the analysis groups beers into distinct clusters, so the owner can pick a diverse yet manageable assortment for the inventory.
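
One reason k-medoids suits this goal is that each cluster center (medoid) is an actual row of the dataset, so the medoids can be read directly as candidate beers to stock. The sketch below is a minimal NumPy illustration on synthetic 2-D data, not the notebook's code. The function name `k_medoids`, the farthest-point initialization, and the Euclidean distance are assumptions made for the example.

```python
import numpy as np


def k_medoids(X, k, n_iter=100):
    """Alternating k-medoids on Euclidean distances.

    Returns (medoids, labels), where medoids are row indices into X.
    """
    # Full pairwise distance matrix; fine for a few thousand rows.
    dist = np.linalg.norm(X[:, None, :] - X[None, :, :], axis=2)

    # Farthest-point initialization (an assumption of this sketch):
    # start from the most central row, then repeatedly add the row
    # farthest from the medoids chosen so far.
    medoids = [int(np.argmin(dist.sum(axis=1)))]
    while len(medoids) < k:
        nearest = dist[:, medoids].min(axis=1)
        medoids.append(int(np.argmax(nearest)))
    medoids = np.array(medoids)

    for _ in range(n_iter):
        # Assignment step: each row joins its closest medoid.
        labels = np.argmin(dist[:, medoids], axis=1)
        # Update step: within each cluster, the new medoid is the member
        # with the smallest total distance to the other members.
        new_medoids = medoids.copy()
        for c in range(k):
            members = np.flatnonzero(labels == c)
            if members.size == 0:
                continue  # keep the old medoid if a cluster is empty
            within = dist[np.ix_(members, members)].sum(axis=1)
            new_medoids[c] = members[np.argmin(within)]
        if np.array_equal(new_medoids, medoids):
            break
        medoids = new_medoids

    labels = np.argmin(dist[:, medoids], axis=1)
    return medoids, labels


# Synthetic stand-in for scaled beer features: three well-separated groups.
rng = np.random.default_rng(42)
centers = np.array([[0.0, 0.0], [10.0, 10.0], [0.0, 10.0]])
X = np.vstack([c + rng.normal(scale=0.5, size=(30, 2)) for c in centers])

medoids, labels = k_medoids(X, k=3)
representatives = X[medoids]  # one real row per cluster
```

With the real dataset, `X` would be the scaled feature matrix, and the rows indexed by `medoids` would be the beers proposed for the shortlist.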

## Project outline

`analysis_and_report.ipynb`
1. Introduction
2. Dataset Discussion
3. Dataset Cleaning and Exploration
4. Basic Descriptive Analytics
5. Scaling Decisions
6. Clusterability and Clustering Structure
7. Clustering Algorithm Selection Motivation
8. Clustering Algorithm #1: K-Medoids
9. Clustering Algorithm #2: HAC with Ward's Linkage
10. Discussion
11. Conclusion
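
Steps 5 and 6 of the outline (scaling, then clusterability) can be illustrated with a small sketch of z-score standardization followed by the Hopkins statistic. This is not the notebook's implementation. The helper names, the sample size `m`, and the synthetic data are assumptions. The convention used here is H = sum(u) / (sum(u) + sum(w)), so values near 0.5 suggest spatially random data and values near 1 suggest cluster structure. Some references use the opposite convention.

```python
import numpy as np


def standardize(X):
    """Z-score each column (zero mean, unit variance)."""
    return (X - X.mean(axis=0)) / X.std(axis=0)


def hopkins_statistic(X, m=None, seed=0):
    """Hopkins statistic with H close to 1 meaning 'clusterable'."""
    rng = np.random.default_rng(seed)
    n, d = X.shape
    m = m or max(1, n // 10)

    # m real rows, sampled without replacement.
    idx = rng.choice(n, size=m, replace=False)
    # m synthetic points, uniform over the bounding box of the data.
    lo, hi = X.min(axis=0), X.max(axis=0)
    U = rng.uniform(lo, hi, size=(m, d))

    # u: distance from each synthetic point to its nearest real row.
    u = np.linalg.norm(U[:, None, :] - X[None, :, :], axis=2).min(axis=1)

    # w: distance from each sampled real row to its nearest *other* row.
    dw = np.linalg.norm(X[idx][:, None, :] - X[None, :, :], axis=2)
    dw[np.arange(m), idx] = np.inf  # exclude the zero self-distance
    w = dw.min(axis=1)

    return u.sum() / (u.sum() + w.sum())


rng = np.random.default_rng(1)
# Two tight blobs versus uniformly scattered points.
X_clustered = np.vstack([
    rng.normal(loc=0.0, scale=0.1, size=(150, 2)),
    rng.normal(loc=10.0, scale=0.1, size=(150, 2)),
])
X_uniform = rng.uniform(0.0, 10.0, size=(300, 2))

h_clustered = hopkins_statistic(standardize(X_clustered))
h_uniform = hopkins_statistic(standardize(X_uniform))
```

On data like this, `h_clustered` should come out well above `h_uniform`, which is the signal that clustering is worth attempting at all.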

## Unsupervised Learning tools used in this project

- Hopkins Statistic
- t-SNE plot
- Elbow plot
- Average Silhouette score
- Silhouette plot
- Cluster Sorted Similarity Matrix
- K-Medoids Clustering
- Hierarchical Agglomerative Clustering (HAC) with Single, Complete, Average, and Ward's linkages
- Dendrogram
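
As a rough illustration of how several of these tools fit together, the sketch below runs HAC with each of the four linkages using SciPy, cuts every tree into three clusters, and compares the results by average silhouette score from scikit-learn. The synthetic data, the choice of three clusters, and the variable names are assumptions for the example and do not come from the notebook.

```python
import numpy as np
from scipy.cluster.hierarchy import dendrogram, fcluster, linkage
from sklearn.metrics import silhouette_score

# Synthetic stand-in for scaled beer features: three separated groups.
rng = np.random.default_rng(0)
centers = np.array([[0.0, 0.0], [8.0, 8.0], [0.0, 8.0]])
X = np.vstack([c + rng.normal(scale=0.5, size=(40, 2)) for c in centers])

scores = {}
for method in ("single", "complete", "average", "ward"):
    # Build the merge tree for this linkage (Euclidean distances).
    Z = linkage(X, method=method)
    # Cut the tree so that at most three flat clusters remain.
    labels = fcluster(Z, t=3, criterion="maxclust")
    # Average silhouette score: higher means tighter, better-separated clusters.
    scores[method] = silhouette_score(X, labels)

# Dendrogram layout for Ward's linkage, computed without drawing a figure.
tree = dendrogram(linkage(X, method="ward"), no_plot=True)
```

Dropping `no_plot=True` draws the dendrogram with matplotlib. On real data the linkages usually disagree more than they do on clean synthetic blobs, which is why the silhouette comparison is useful.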

## Other details

`beer_profile_and_ratings.csv` -- raw dataset (retrieved from [Kaggle](https://www.kaggle.com/datasets/ruthgn/beer-profile-and-ratings-data-set/data))

`presentation.pdf` -- a short presentation with the project overview