https://github.com/henrylin03/video-games
Using Python and SQL to clean, analyse and visualise video games' data from Metacritic. Includes scraping using BeautifulSoup.
analysis beautifulsoup beautifulsoup4 data data-analysis data-science eda jupyter-notebook pandas python sql sqlite3 video-game video-games
- Host: GitHub
- URL: https://github.com/henrylin03/video-games
- Owner: henrylin03
- Created: 2021-10-27T08:00:17.000Z (over 4 years ago)
- Default Branch: main
- Last Pushed: 2023-04-30T00:35:22.000Z (about 3 years ago)
- Last Synced: 2025-01-14T12:51:59.159Z (over 1 year ago)
- Topics: analysis, beautifulsoup, beautifulsoup4, data, data-analysis, data-science, eda, jupyter-notebook, pandas, python, sql, sqlite3, video-game, video-games
- Language: Jupyter Notebook
- Size: 22 MB
- Stars: 2
- Watchers: 2
- Forks: 1
- Open Issues: 7
Metadata Files:
- Readme: README.md
README
# Video Games: Data Analysis
## Description
This personal project showcases my skills and experience in data scraping, data pipeline building, and data analysis and visualisation. In it, I analyse video game review data from [Metacritic.com](https://www.metacritic.com) using Python and SQL.
## Methodology
The project is split into two parts:
### Data Scraping
I wrote a scraper using `BeautifulSoup` to extract data from [Metacritic.com](https://www.metacritic.com). For each gaming platform, I sort all games alphabetically by name, then scrape the required attributes. This approach approximately halved the time taken to scrape the required input data compared with using `selenium`.
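As a rough illustration of the parsing step, here is a minimal sketch of how `BeautifulSoup` can pull attributes out of a listing page. It is not the scraper from this repository: the HTML snippet, the class names (`game`, `title`, `platform`, `score`, `release`), the sample games and the `parse_games` helper are all hypothetical, and the real Metacritic markup will differ.

```python
from bs4 import BeautifulSoup

# Hypothetical markup for illustration only; the real page structure differs.
SAMPLE_HTML = """
<table>
  <tr class="game">
    <td class="title">Alpha Quest</td>
    <td class="platform">PC</td>
    <td class="score">85</td>
    <td class="release">2019</td>
  </tr>
  <tr class="game">
    <td class="title">Beta Racer</td>
    <td class="platform">PC</td>
    <td class="score">tbd</td>
    <td class="release">2021</td>
  </tr>
</table>
"""


def parse_games(html):
    """Return one dict per game row found in the (assumed) listing markup."""
    soup = BeautifulSoup(html, "html.parser")
    games = []
    for row in soup.find_all("tr", class_="game"):
        score_text = row.find("td", class_="score").get_text(strip=True)
        games.append(
            {
                "title": row.find("td", class_="title").get_text(strip=True),
                "platform": row.find("td", class_="platform").get_text(strip=True),
                # Keep a missing or non-numeric score as None rather than guessing.
                "score": int(score_text) if score_text.isdigit() else None,
                "release_year": int(row.find("td", class_="release").get_text(strip=True)),
            }
        )
    return games


games = parse_games(SAMPLE_HTML)
for game in games:
    print(game)
```

In a real run, the HTML would come from an HTTP response for each platform's listing page rather than from an inline string.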
### Data Analysis & Visualisation
I then perform data analysis and visualisation using SQL (`sqlite3`) and Python's `pandas`, `seaborn`, and `matplotlib` libraries. The analysis explores the relationships between game scores and their respective platforms and release years. I also create several visualisations, such as bar charts, boxplots and scatterplots, to highlight insights.
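To give a feel for how `sqlite3` and `pandas` can work together in this kind of analysis, here is a minimal sketch that compares scores by platform and by release year. The table name `games`, the column names and the sample rows are made up for illustration and are not taken from the project's data.

```python
import sqlite3

import pandas as pd

# Hypothetical sample rows; the real dataset comes from the scraping step.
games = pd.DataFrame(
    {
        "title": ["Alpha Quest", "Beta Racer", "Gamma Farm", "Delta Siege"],
        "platform": ["PC", "PC", "Switch", "Switch"],
        "release_year": [2019, 2021, 2019, 2021],
        "metascore": [85, 75, 90, 72],
    }
)

# Load the rows into an in-memory SQLite database so they can be queried with SQL.
conn = sqlite3.connect(":memory:")
games.to_sql("games", conn, index=False)

query = """
SELECT platform,
       COUNT(*)       AS n_games,
       AVG(metascore) AS avg_score
FROM games
GROUP BY platform
ORDER BY platform
"""
by_platform = pd.read_sql_query(query, conn)
conn.close()

# The same kind of aggregation done directly in pandas, this time by release year.
by_year = games.groupby("release_year")["metascore"].mean()

print(by_platform)
print(by_year)
```

Summary tables like `by_platform` and `by_year` are the sort of input that could then be passed to `seaborn` or `matplotlib` for bar charts and boxplots.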
## Results
Through this project, I identified trends and patterns in the video game industry, such as the most popular gaming platforms and the top-rated games in recent years. These insights may be useful to game developers, publishers, and other industry stakeholders.
## Feedback
Thank you for reviewing my personal project! If you have any feedback or suggestions, please feel free to reach out to me via email or raise a [GitHub Issue](https://github.com/henrylin03/video-games/issues).