https://github.com/davutbayik/metacritic-games-sentiment-analysis
Sentiment analysis of Metacritic game reviews using a fine-tuned, pre-trained BERT model, with visualizations built with the matplotlib, seaborn, and plotly libraries.
bert bert-fine-tuning bert-model bert-tokenizer data-visualization games matplotlib matplotlib-pyplot metacritic metacritic-analysis natural-language-processing nlp nlp-machine-learning plotly python seaborn seaborn-plots sentiment-analysis sentiment-classification
Last synced: 7 months ago
- Host: GitHub
- URL: https://github.com/davutbayik/metacritic-games-sentiment-analysis
- Owner: davutbayik
- License: mit
- Created: 2025-04-15T13:12:06.000Z (7 months ago)
- Default Branch: master
- Last Pushed: 2025-04-15T13:38:21.000Z (7 months ago)
- Last Synced: 2025-04-15T14:33:55.153Z (7 months ago)
- Topics: bert, bert-fine-tuning, bert-model, bert-tokenizer, data-visualization, games, matplotlib, matplotlib-pyplot, metacritic, metacritic-analysis, natural-language-processing, nlp, nlp-machine-learning, plotly, python, seaborn, seaborn-plots, sentiment-analysis, sentiment-classification
- Language: Jupyter Notebook
- Homepage:
- Size: 1010 KB
- Stars: 0
- Watchers: 1
- Forks: 0
- Open Issues: 0
Metadata Files:
- Readme: README.md
- License: LICENSE
# Metacritic Games Sentiment Analysis
## Project Overview
This repository contains a comprehensive sentiment analysis project focused on Metacritic game reviews. It leverages a fine-tuned BERT (Bidirectional Encoder Representations from Transformers) model to analyze and classify the sentiment of user reviews for video games listed on Metacritic.
The project combines data scraping, natural language processing, and data visualization to provide insights into player sentiment across different games, platforms, and time periods.
## Key Features
- **Data Collection**: Integration with a custom Metacritic scraper to gather game reviews
- **Sentiment Analysis**: Fine-tuned BERT model for accurate sentiment classification
- **Visualization**: Interactive charts and graphs to present sentiment trends and patterns
- **Comprehensive Analysis**: Breakdown of sentiment by game title, genre, platform, and release date
- **Performance Metrics**: Evaluation of model accuracy, precision, recall, and F1 score
## Dataset
The project uses a dataset of video game reviews from Metacritic, collected with a custom scraper from a [companion repository](https://github.com/davutbayik/metacritic-backend-scraper). The dataset is available on [Kaggle](https://www.kaggle.com/datasets/davutb/metacritic-games) and includes:
- User reviews for thousands of video games (over 1.6M rows)
- Review text content
- User scores
- Game metadata (title, platform, release date)
- Publication dates for reviews
## BERT Fine-tuning Process
The sentiment analysis leverages a pre-trained [BERT model](https://huggingface.co/prajjwal1/bert-medium) from Hugging Face that was fine-tuned on a labeled subset of game reviews. The fine-tuning process involved:
1. **Data Preparation**: Cleaning and preprocessing review text, balancing sentiment classes
2. **Model Selection**: Using the `bert-medium` model as the foundation
3. **Fine-tuning**: Training the model with review text and sentiment labels
4. **Evaluation**: Testing model performance on a held-out validation set
The fine-tuned model achieves over 90% accuracy on the test dataset, demonstrating strong performance in classifying game review sentiments.
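The data preparation step (labeling and class balancing) can be sketched in plain Python. This is a minimal illustration, not the notebook's actual code: the score thresholds (8 and above positive, 5 to 7 neutral, 4 and below negative on a 0-10 scale), the downsampling strategy, and the names `score_to_label` and `balance_by_downsampling` are all assumptions, and the reviews are made up.

```python
import random
from collections import Counter, defaultdict


def score_to_label(user_score: int) -> str:
    """Map a 0-10 user score to a sentiment label (assumed thresholds)."""
    if user_score >= 8:
        return "positive"
    if user_score >= 5:
        return "neutral"
    return "negative"


def balance_by_downsampling(rows, seed=42):
    """Downsample every class to the size of the smallest class."""
    by_label = defaultdict(list)
    for text, score in rows:
        by_label[score_to_label(score)].append(text)

    smallest = min(len(texts) for texts in by_label.values())
    rng = random.Random(seed)  # fixed seed so the sample is reproducible

    balanced = []
    for label, texts in by_label.items():
        balanced.extend((text, label) for text in rng.sample(texts, smallest))
    rng.shuffle(balanced)
    return balanced


# Toy reviews standing in for the real dataset (invented for illustration).
reviews = [
    ("Loved every minute of it", 10),
    ("Great combat, great story", 9),
    ("Solid sequel", 8),
    ("Fine, but forgettable", 6),
    ("Crashes constantly", 2),
    ("Not worth the price", 3),
]

balanced = balance_by_downsampling(reviews)
label_counts = Counter(label for _, label in balanced)
print(label_counts)
```

Downsampling is only one way to balance classes; oversampling or class weights in the loss are common alternatives when discarding data is too costly.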
## Sentiment Analysis Results
The sentiment analysis categorizes reviews into three sentiment classes:
- **Positive**: Reviews expressing satisfaction, enjoyment, or praise
- **Neutral**: Reviews with balanced opinions or mixed sentiments
- **Negative**: Reviews expressing disappointment, frustration, or criticism
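As a rough illustration of how a three-class classifier's raw outputs become one of these labels, the sketch below applies a softmax to three logits and picks the most probable class. The label order in `LABELS`, the example logits, and the helper names are assumptions; the real order depends on how the labels were encoded during fine-tuning.

```python
import math

# Assumed label order; the actual mapping depends on the training setup.
LABELS = ["negative", "neutral", "positive"]


def softmax(logits):
    """Convert raw logits into probabilities that sum to 1."""
    peak = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def classify(logits):
    """Return the most probable label and its probability."""
    probs = softmax(logits)
    best = max(range(len(probs)), key=probs.__getitem__)
    return LABELS[best], probs[best]


# Made-up logits for a single review.
label, confidence = classify([-1.2, 0.3, 2.5])
print(label, round(confidence, 3))
```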
Key insights from the analysis include:
- Correlation between user sentiment and critic scores
- Sentiment trends over time for major game franchises
- Platform-specific sentiment patterns
- Genre-based sentiment distribution
## Visualizations
The repository includes various visualizations that illustrate sentiment patterns:
- Distribution of sentiments vs user scores
- Sentiment distribution across game genres
- Platform comparison charts
- Word clouds for positive, negative and neutral sentiment vocabulary
- Sentiment scores across score bins
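The aggregation behind a "sentiment score per score bin" chart might look like the sketch below. It is an assumption-laden illustration: the bin edges, the idea of a sentiment score in the range -1 to 1, the function names, and the sample records are all hypothetical, and the notebooks may compute this differently (for example with pandas).

```python
from collections import defaultdict
from statistics import mean

# Hypothetical bin edges over the 0-10 user score range (inclusive).
BINS = [(0, 4), (5, 7), (8, 10)]


def bin_label(user_score):
    """Return the label of the bin that contains the given user score."""
    for low, high in BINS:
        if low <= user_score <= high:
            return f"{low}-{high}"
    raise ValueError(f"user score out of range: {user_score}")


def mean_sentiment_by_bin(records):
    """Average the sentiment score of all reviews that fall in each bin."""
    grouped = defaultdict(list)
    for user_score, sentiment in records:
        grouped[bin_label(user_score)].append(sentiment)
    return {label: mean(values) for label, values in grouped.items()}


# Made-up (user_score, sentiment_score) pairs for illustration.
records = [(1, -0.9), (3, -0.5), (6, 0.1), (7, 0.3), (9, 0.8), (10, 0.6)]
bin_means = mean_sentiment_by_bin(records)
print(bin_means)
```

The resulting dictionary can be handed to a plotting library, for instance as the categories and heights of a matplotlib bar chart.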
## Example Visuals
The image below shows the distribution of sentiment scores vs user scores

The image below shows sentiment score densities over user scores

## Getting Started
### Prerequisites
- Python 3.9+
- CUDA-capable GPU (recommended for faster model training)
- Jupyter Notebook
### Installation
1. **Clone the repository**:
```bash
git clone https://github.com/davutbayik/metacritic-games-sentiment-analysis.git
cd metacritic-games-sentiment-analysis
```
2. **Create a virtual environment (optional but recommended)**:
```bash
python -m venv .venv
source .venv/bin/activate # On Windows: .venv\Scripts\activate
```
3. **Install dependencies**:
```bash
pip install -r requirements.txt
```
4. **Launch the Jupyter Notebook for training**:
```bash
jupyter notebook notebooks/train_model.ipynb
```
5. **Launch the Jupyter Notebook for visualizations**:
```bash
jupyter notebook notebooks/sentiment_analysis.ipynb
```
You can follow the step-by-step process in the Jupyter notebooks in the `notebooks/` directory.
## Notebooks Guide
The repository includes two Jupyter notebooks that walk through the entire project workflow:
1. **train_model.ipynb**: Text cleaning, tokenization, and data preparation, followed by fine-tuning the BERT model on labeled game reviews
2. **sentiment_analysis.ipynb**: Applying the fine-tuned model to classify review sentiments, creating visualizations and extracting insights from the sentiment data
## Model Performance
The fine-tuned BERT model achieves the following performance metrics on the test set:
| Metric | Score |
|-----------|--------|
| Accuracy | 92.3% |
| Precision | 91.7% |
| Recall | 90.9% |
| F1 Score | 91.3% |
The confusion matrix demonstrates particularly strong performance in distinguishing between positive and negative reviews, with some expected overlap in the neutral category.
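For reference, the four metrics in the table can be computed from true and predicted labels as sketched below. Macro averaging over the three classes is an assumption (the averaging method behind the reported scores is not stated), the function name `macro_metrics` is hypothetical, and the labels are toy data unrelated to the reported results.

```python
LABELS = ["negative", "neutral", "positive"]


def macro_metrics(y_true, y_pred, labels=LABELS):
    """Compute accuracy plus macro-averaged precision, recall, and F1."""
    pairs = list(zip(y_true, y_pred))
    accuracy = sum(t == p for t, p in pairs) / len(pairs)

    precisions, recalls, f1s = [], [], []
    for label in labels:
        tp = sum(t == label and p == label for t, p in pairs)
        fp = sum(t != label and p == label for t, p in pairs)
        fn = sum(t == label and p != label for t, p in pairs)
        precision = tp / (tp + fp) if tp + fp else 0.0
        recall = tp / (tp + fn) if tp + fn else 0.0
        f1 = (
            2 * precision * recall / (precision + recall)
            if precision + recall
            else 0.0
        )
        precisions.append(precision)
        recalls.append(recall)
        f1s.append(f1)

    n = len(labels)
    return {
        "accuracy": accuracy,
        "precision": sum(precisions) / n,
        "recall": sum(recalls) / n,
        "f1": sum(f1s) / n,
    }


# Toy labels for illustration only.
y_true = ["positive", "positive", "neutral", "negative", "negative", "neutral"]
y_pred = ["positive", "neutral", "neutral", "negative", "negative", "positive"]

metrics = macro_metrics(y_true, y_pred)
print(metrics)
```

In this toy example the only errors are confusions between neutral and positive, which mirrors the kind of overlap in the neutral category described above.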
## Future Work
- Implement aspect-based sentiment analysis to extract opinions about specific game features
- Extend the model to include more fine-grained sentiment categories
- Create an interactive web dashboard for exploring sentiment data
- Develop temporal analysis to track sentiment evolution for game franchises
- Compare sentiment across different gaming platforms
- Analyze the impact of updates and patches on player sentiment
## License
This project is licensed under the terms of the [MIT License](LICENSE).
You are free to use, modify, and distribute this software as long as you include the original license.
## Contact
Made with ❤️ by [Davut Bayık](https://github.com/davutbayik). Feel free to reach out via GitHub for questions, feedback, or collaboration ideas.
---
⭐ If you found this project helpful, consider giving it a star!