https://github.com/saniyaabushakimova/brewing-insights-with-unsupervised-learning

Clustering analysis that groups beers by features such as Astringency, Alcohol content, Bitterness, and Sourness, using k-medoids and hierarchical agglomerative clustering. Tech: Python (numpy, pandas, seaborn, matplotlib, sklearn, scipy)
dendrogram hierarchical-clustering k-medoids-clustering python silhouette-score t-sne unsupervised-learning

Host: GitHub
URL: https://github.com/saniyaabushakimova/brewing-insights-with-unsupervised-learning
Owner: SaniyaAbushakimova
Created: 2024-07-12T14:45:56.000Z (over 1 year ago)
Default Branch: main
Last Pushed: 2024-07-13T22:14:15.000Z (over 1 year ago)
Last Synced: 2024-07-14T17:21:05.134Z (about 1 year ago)
Topics: dendrogram, hierarchical-clustering, k-medoids-clustering, python, silhouette-score, t-sne, unsupervised-learning
Language: Jupyter Notebook
Homepage:
Size: 50 MB
Stars: 0
Watchers: 1
Forks: 0
Open Issues: 0
Metadata Files:
- Readme: README.md

README

Project completed on May 7, 2024.

## Project description

In the world of beer, certain varieties stand out due to their versatile flavors, making them popular choices among consumers. A business owner aiming to meet popular demand needs to curate a simple yet appealing range of beers. However, given the overwhelming number of beer styles available, it is impractical to include every type in the inventory.

This project uses **clustering analysis** to help the business owner identify a representative sample of beers. By examining features of different beers (e.g., Astringency, Bitter, Alcohol), the analysis groups them into distinct clusters, so the owner can select a diverse yet manageable assortment for the inventory.
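
The idea of picking one representative beer per cluster can be sketched with a small k-medoids routine, since each medoid is an actual row of the data rather than an averaged point. This is a minimal illustration, not the notebook's code: the data is synthetic, the two columns stand in for hypothetical scaled flavor scores, and the function `k_medoids` with its farthest-first initialization is an assumption made for this example (the notebook may rely on a library implementation instead).

```python
import numpy as np


def k_medoids(X, k, n_iter=100):
    """Simple alternating k-medoids (assign, then update medoids)."""
    # Pairwise Euclidean distances between all rows.
    D = np.linalg.norm(X[:, None, :] - X[None, :, :], axis=2)

    # Deterministic farthest-first initialization (an illustrative choice):
    # start from the most central point, then repeatedly add the point
    # farthest from the medoids chosen so far.
    medoids = [int(np.argmin(D.sum(axis=1)))]
    while len(medoids) < k:
        nearest = D[:, medoids].min(axis=1)
        medoids.append(int(np.argmax(nearest)))
    medoids = np.array(medoids)

    for _ in range(n_iter):
        # Assignment step: each point joins its nearest medoid.
        labels = np.argmin(D[:, medoids], axis=1)
        new_medoids = medoids.copy()
        for j in range(k):
            members = np.flatnonzero(labels == j)
            if members.size == 0:
                continue
            # Update step: the member with the smallest total distance
            # to the other members becomes the new medoid.
            within = D[np.ix_(members, members)].sum(axis=1)
            new_medoids[j] = members[np.argmin(within)]
        if np.array_equal(new_medoids, medoids):
            break
        medoids = new_medoids

    labels = np.argmin(D[:, medoids], axis=1)
    return medoids, labels


# Synthetic stand-in for scaled beer features: three groups of 40 rows.
rng = np.random.default_rng(0)
centers = np.array([[0.0, 0.0], [10.0, 0.0], [0.0, 10.0]])
X = np.vstack([c + rng.normal(scale=0.5, size=(40, 2)) for c in centers])

medoids, labels = k_medoids(X, k=3)
representatives = X[medoids]  # one real row per cluster
```

With real data, `medoids` would index rows of the beer table, so the corresponding beer names could be read off directly as the candidate assortment.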

## Project outline

`analysis_and_report.ipynb`
1. Introduction
2. Dataset Discussion
3. Dataset Cleaning and Exploration
4. Basic Descriptive Analytics
5. Scaling Decisions
6. Clusterability and Clustering Structure
7. Clustering Algorithm Selection Motivation
8. Clustering Algorithm #1: K-Medoids
9. Clustering Algorithm #2: HAC with Ward's Linkage
10. Discussion
11. Conclusion
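
For the clusterability step in the outline, one common check is the Hopkins statistic. The sketch below is an assumption about how such a check might look, not the notebook's implementation: the function name `hopkins_statistic`, the sample size, and the synthetic data are all made up for illustration. It uses the convention in which values near 0.5 suggest spatially random data and values near 1 suggest cluster structure; some references raise the distances to the power of the dimension or report the complement instead.

```python
import numpy as np
from scipy.spatial import cKDTree


def hopkins_statistic(X, m=None, seed=0):
    """Hopkins statistic, H = sum(u) / (sum(u) + sum(w))."""
    X = np.asarray(X, dtype=float)
    n, d = X.shape
    m = m or max(1, n // 10)  # sample roughly 10% of the rows by default
    rng = np.random.default_rng(seed)
    tree = cKDTree(X)

    # w: distance from sampled real points to their nearest *other* real
    # point (k=2 because the closest hit is the point itself).
    sample_idx = rng.choice(n, size=m, replace=False)
    w = tree.query(X[sample_idx], k=2)[0][:, 1]

    # u: distance from uniform points in the bounding box of X
    # to their nearest real point.
    uniform = rng.uniform(X.min(axis=0), X.max(axis=0), size=(m, d))
    u = tree.query(uniform, k=1)[0]

    return u.sum() / (u.sum() + w.sum())


# Synthetic comparison: clustered data versus uniform noise.
rng = np.random.default_rng(1)
centers = np.array([[0.0, 0.0], [10.0, 10.0], [0.0, 10.0]])
clustered = np.vstack([c + rng.normal(scale=0.3, size=(100, 2)) for c in centers])
noise = rng.uniform(size=(300, 2))

h_clustered = hopkins_statistic(clustered)
h_noise = hopkins_statistic(noise)
```

Because the statistic depends on a random sample, it is usually worth averaging it over several seeds before drawing conclusions.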

## Unsupervised Learning tools used in this project

- Hopkins Statistic
- t-SNE plot
- Elbow plot
- Average Silhouette score
- Silhouette plot
- Cluster Sorted Similarity Matrix
- K-Medoids Clustering
- Hierarchical Agglomerative Clustering (HAC) with Single, Complete, Average, Ward's linkages
- Dendrogram
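
Several of the tools listed above combine naturally: the four HAC linkages can be compared by their average silhouette score at a fixed number of clusters, and the chosen linkage can then be turned into a dendrogram. The following is a rough sketch on synthetic data, assuming three clusters and Euclidean distance; it is not taken from the notebook, and the variable names are illustrative.

```python
import numpy as np
from scipy.cluster.hierarchy import dendrogram, fcluster, linkage
from sklearn.metrics import silhouette_score

# Synthetic stand-in for scaled beer features: three groups of 30 rows.
rng = np.random.default_rng(0)
centers = np.array([[0.0, 0.0], [6.0, 0.0], [0.0, 6.0]])
X = np.vstack([c + rng.normal(scale=0.5, size=(30, 2)) for c in centers])

# Compare linkages by average silhouette score at k = 3 clusters.
scores = {}
for method in ("single", "complete", "average", "ward"):
    Z = linkage(X, method=method)  # Euclidean distance by default
    labels = fcluster(Z, t=3, criterion="maxclust")  # cut into at most 3 clusters
    scores[method] = silhouette_score(X, labels)

best_method = max(scores, key=scores.get)

# Build the dendrogram structure for Ward's linkage without drawing it;
# drop no_plot=True (with matplotlib installed) to render the plot.
Z_ward = linkage(X, method="ward")
tree = dendrogram(Z_ward, no_plot=True)
```

On real data the linkages often disagree far more than on well-separated synthetic groups, which is where the silhouette plot and the cluster-sorted similarity matrix help judge the result.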

## Other details

`beer_profile_and_ratings.csv` -- raw dataset (retrieved from [Kaggle](https://www.kaggle.com/datasets/ruthgn/beer-profile-and-ratings-data-set/data))

`presentation.pdf` -- a short presentation with the project overview
