https://github.com/quantumudit/analyzing-beerwulf-beers
This project scrapes all beer-related information available on the BeerWulf website through its private backend API, transforms the scraped data, and then analyzes and visualizes it with Jupyter Notebook and Power BI.
- Host: GitHub
- URL: https://github.com/quantumudit/analyzing-beerwulf-beers
- Owner: quantumudit
- License: other
- Created: 2021-10-10T07:56:04.000Z (about 4 years ago)
- Default Branch: master
- Last Pushed: 2022-06-18T14:56:45.000Z (over 3 years ago)
- Last Synced: 2025-05-15T11:50:12.352Z (6 months ago)
- Topics: api, data-analytics, power-bi, python, webscraping
- Language: Python
- Homepage:
- Size: 1.81 MB
- Stars: 3
- Watchers: 1
- Forks: 0
- Open Issues: 0
Metadata Files:
- Readme: readme.md
- License: LICENSE
README
# ![Project Logo][project_logo]
---
Scraping & analyzing beers from the Beerwulf website with Python and Power BI
Overview • Prerequisites • Architecture • Demo • Support • License
## Overview
This project scrapes the varieties of beers and their associated metrics from [Beerwulf][website_link], performs exploratory data analysis to generate insights, and visualizes them with Power BI.
[![Website Snippet][website_snippet]][website_link]
The repository directory structure is as follows:
Analyzing-Beerwulf-Beers
├─ 01_WEBSCRAPING
├─ 02_ETL
├─ 03_DATA
├─ 04_ANALYSIS
├─ 05_DASHBOARD
└─ 06_RESOURCES
The type of content present in the directories is as follows:
**01_WEBSCRAPING**
This directory contains the Python script that scrapes data from the website, along with the flat file that holds the scraped data.
**02_ETL**
This directory contains the ETL script that takes the scraped dataset as input, transforms it and exports an analysis-ready dataset into the _03_DATA_ directory.
**03_DATA**
This directory contains the data that can be directly used for exploratory data analysis and data visualization purposes.
**04_ANALYSIS**
This directory contains the Python notebooks that analyze the clean dataset to generate insights.
**05_DASHBOARD**
This directory contains the Python notebook with an embedded Power BI report that visualizes the data. The Power BI dashboard includes slicers, cross-filtering, and other advanced capabilities that end users can use to explore a specific facet of the data or to gain additional insights.
**06_RESOURCES**
This directory contains the images, icons, layouts, etc. that are used in this project.
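To make the role of the _01_WEBSCRAPING_ directory more concrete, here is a minimal sketch of how a paginated JSON API could be walked and its records written to a flat file. It is not the project's actual script: the response shape (`items`), the field names (`name`, `abv`) and the `fake_fetch` stand-in are all assumptions for illustration, since this README does not document the Beerwulf API.

```python
import csv
import io

# Hypothetical sample pages standing in for API responses.
# The real endpoint and its response format are not described in this README.
SAMPLE_PAGES = [
    {"items": [{"name": "Example Lager", "abv": "5.0%"},
               {"name": "Example IPA", "abv": "6.5%"}]},
    {"items": [{"name": "Example Stout", "abv": "7.2%"}]},
]


def fake_fetch(page):
    """Stand-in for an HTTP call; returns an empty page past the last one."""
    if page < len(SAMPLE_PAGES):
        return SAMPLE_PAGES[page]
    return {"items": []}


def iter_items(fetch_page):
    """Yield records page by page until a page comes back empty."""
    page = 0
    while True:
        items = fetch_page(page).get("items", [])
        if not items:
            break
        yield from items
        page += 1


def write_flat_file(rows, fieldnames, handle):
    """Write records as CSV to an open text handle and return the row count."""
    writer = csv.DictWriter(handle, fieldnames=fieldnames, extrasaction="ignore")
    writer.writeheader()
    count = 0
    for row in rows:
        writer.writerow(row)
        count += 1
    return count


# Write to an in-memory buffer here; a real run would open a file instead.
buffer = io.StringIO()
rows_written = write_flat_file(iter_items(fake_fetch), ["name", "abv"], buffer)
```

Passing the fetch function in as a parameter keeps the paging logic testable offline; a real fetcher would replace `fake_fetch` with an HTTP request to the actual endpoint.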
## Prerequisites
The main skills required to fully understand this project are as follows:
- Basics of Python & Jupyter Notebook
- Understanding of the Python libraries listed in the [requirements.txt][requirements] file
- Basics of HTML & CSS
- Basics of Power BI
> The choice of applications & their installation might vary based on individual preferences & system settings.
## Architecture
The project architecture is quite straightforward and is summarized in the image below:
![Process Architecture][process_workflow]
As the workflow above shows, we first scrape the data from the website using a Python script and collect it in a flat file, which is then processed and cleaned by a separate ETL-specific Python script.
Finally, we use the clean, analysis-ready dataset for exploratory data analysis (EDA) in Jupyter Notebook and to create an insightful report in Power BI.
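The transform and analysis stages of this workflow can be sketched in a few lines. This is only an illustration under stated assumptions, not the project's ETL script or notebooks: the column names (`name`, `style`, `abv`), the cleaning rules and the sample rows are hypothetical, and only the Python standard library is used rather than the libraries listed in requirements.txt.

```python
import csv
import io
from collections import defaultdict
from statistics import mean

# Hypothetical scraped flat file; the real columns are not listed in this README.
RAW_CSV = """name,style,abv
 Example Lager ,lager,5.0%
Example IPA,india pale ale,6.5%
example lager,Lager,5.0%
Example Pils,lager,4.4%
"""


def clean_row(row):
    """Normalize one scraped record: trim text and parse ABV as a float."""
    return {
        "name": row["name"].strip(),
        "style": row["style"].strip().title(),
        "abv_percent": float(row["abv"].strip().rstrip("%")),
    }


def transform(raw_rows):
    """Clean all rows and drop case-insensitive duplicate names."""
    seen = set()
    cleaned = []
    for row in raw_rows:
        record = clean_row(row)
        key = record["name"].lower()
        if key in seen:
            continue  # keep only the first occurrence of each beer
        seen.add(key)
        cleaned.append(record)
    return cleaned


def mean_abv_by_style(records):
    """A small EDA-style summary: average ABV per beer style."""
    grouped = defaultdict(list)
    for record in records:
        grouped[record["style"]].append(record["abv_percent"])
    return {style: round(mean(values), 2) for style, values in grouped.items()}


clean = transform(csv.DictReader(io.StringIO(RAW_CSV)))
summary = mean_abv_by_style(clean)
```

In this sketch `clean` plays the role of the analysis-ready dataset exported to _03_DATA_, and `summary` stands in for the kind of aggregate a notebook or Power BI visual might show.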
## Demo
The graphic below shows the data being scraped from the website:
![Scraping Graphic][scraping_graphic]
The interactive Power BI dashboard can be viewed here:
[![Power BI Dashboard][dashboard_image]][dashboard_link]
## Support
If you have any doubts, queries, or suggestions, please connect with me on any of the following platforms:
[![Linkedin Badge][linkedinbadge]][linkedin] [![Twitter Badge][twitterbadge]][twitter]
If you like my work, you can support me on [Patreon][patreon].
## License
This license allows reusers to distribute, remix, adapt, and build upon the material in any medium or format for noncommercial purposes only, and only so long as attribution is given to the creator. If you remix, adapt, or build upon the material, you must license the modified material under identical terms.
[project_logo]: 06_RESOURCES/project_cover_image.png
[process_workflow]: 06_RESOURCES/process_architecture.png
[scraping_graphic]: 06_RESOURCES/scraping_graphic.gif
[website_snippet]: 06_RESOURCES/website_snip.png
[dashboard_image]: 06_RESOURCES/dashboard_image.png
[website_link]: https://www.beerwulf.com/en-gb/c/all-beers
[requirements]: ./requirements.txt
[dashboard_link]: https://app.powerbi.com/view?r=eyJrIjoiZDk0MmNkOWQtODAwYS00YzIyLWIzYWYtNWNmMGI2MDI4OGY2IiwidCI6IjcwODlkNGIxLTQyMmUtNDYzZi1hNGM3LTViY2FiOTk0MGRiZCJ9
[linkedin]: https://www.linkedin.com/in/uditkumarchatterjee/
[twitter]: https://twitter.com/quantumudit
[patreon]: https://www.patreon.com/quantumudit
[linkedinbadge]: https://img.shields.io/badge/-uditkumarchatterjee-0e76a8?style=flat&labelColor=0e76a8&logo=linkedin&logoColor=white
[twitterbadge]: https://img.shields.io/badge/-@quantumudit-1ca0f1?style=flat&labelColor=1ca0f1&logo=twitter&logoColor=white&link=https://twitter.com/quantumudit