https://github.com/maguids/supervised-learning---video-games
This project consists of exploratory data analysis and the application of supervised learning classification models to a Video Games dataset. It was developed in the second semester of the first year of the Bachelor's Degree in Artificial Intelligence and Data Science.
jupyter-notebook machine-learning matplotlib numpy pandas scikit-learn seaborn supervised-learning
- Host: GitHub
- URL: https://github.com/maguids/supervised-learning---video-games
- Owner: Maguids
- Created: 2024-01-25T14:29:13.000Z (over 2 years ago)
- Default Branch: main
- Last Pushed: 2025-01-22T15:35:49.000Z (over 1 year ago)
- Last Synced: 2025-02-26T13:15:56.919Z (over 1 year ago)
- Topics: jupyter-notebook, machine-learning, matplotlib, numpy, pandas, scikit-learn, seaborn, supervised-learning
- Language: Jupyter Notebook
- Homepage:
- Size: 3.48 MB
- Stars: 0
- Watchers: 1
- Forks: 0
- Open Issues: 0
Metadata Files:
- Readme: README.md
README
# Supervised Learning - Video Games Dataset
This project focused on a video game dataset and required **data exploration**, **creation of new features**, and **pre-processing** (treatment of missing values, normalization, etc.). The goal was to **train and evaluate two supervised classification algorithms** (Decision Trees and K-NN) to predict the average user rating class (bad, mediocre, good or great) and to compare the results using metrics such as accuracy and the confusion matrix.
It was developed for the "Elements of Artificial Intelligence and Data Science" course.
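As an illustration of how a numeric average rating can be turned into the four classes mentioned above, here is a minimal sketch. The column name `user_score`, the 0-10 scale, the median imputation and the cut points are all assumptions made for this example; the notebook may use different names, thresholds and missing-value treatment.

```python
import numpy as np
import pandas as pd

# Hypothetical ratings on a 0-10 scale; the real column name and scale may differ.
ratings = pd.DataFrame({"user_score": [2.1, 4.8, 6.5, 8.9, np.nan, 7.4]})

# Treat the missing value by filling it with the median (one option among several).
ratings["user_score"] = ratings["user_score"].fillna(ratings["user_score"].median())

# Assumed cut points; intervals are right-inclusive: [0, 4], (4, 6], (6, 8], (8, 10].
bins = [0, 4, 6, 8, 10]
labels = ["bad", "mediocre", "good", "great"]
ratings["rating_class"] = pd.cut(
    ratings["user_score"], bins=bins, labels=labels, include_lowest=True
)

print(ratings)
```

The resulting `rating_class` column can then be used as the target of a classifier.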
**Authors:**
- [Magda Costa](https://github.com/Maguids)
- Sofia Machado
## Requirements:
- Python
- NumPy
- Pandas
- Scikit-learn
- Matplotlib
- Seaborn
## Work Developed:
- **Problem Formulation**;
- **Data Analysis**;
- **Data Preprocessing**;
- **Exploratory Analysis**;
- **Classification** - in this phase we ran several tests, and in each test we did the following:
  - Data Configuration;
  - Data Division;
  - Decision Tree;
  - KNN;
  - Cross-Validation (except in test 1);
  - Assess the Importance of Variables;
- **Comparison of Results**;
- **Parameter Tuning**;
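
The steps of one classification test can be sketched roughly as follows with scikit-learn. This is not the notebook's code: it uses synthetic data from `make_classification` as a stand-in for the cleaned dataset, and the split ratio, number of folds, scaling step and parameter grid are assumptions chosen for illustration.

```python
from sklearn.datasets import make_classification
from sklearn.metrics import accuracy_score, confusion_matrix
from sklearn.model_selection import GridSearchCV, cross_val_score, train_test_split
from sklearn.neighbors import KNeighborsClassifier
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.tree import DecisionTreeClassifier

# Synthetic four-class data standing in for the real features and rating classes.
X, y = make_classification(
    n_samples=400, n_features=8, n_informative=5,
    n_classes=4, n_clusters_per_class=1, random_state=0,
)

# Data division: hold out a stratified test set (25% is an assumed ratio).
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, stratify=y, random_state=0
)

# K-NN is distance-based, so it is wrapped in a pipeline that scales the features.
models = {
    "decision_tree": DecisionTreeClassifier(random_state=0),
    "knn": make_pipeline(StandardScaler(), KNeighborsClassifier(n_neighbors=5)),
}

results = {}
for name, model in models.items():
    model.fit(X_train, y_train)
    y_pred = model.predict(X_test)
    results[name] = {
        "accuracy": accuracy_score(y_test, y_pred),
        "confusion_matrix": confusion_matrix(y_test, y_pred),
        # 5-fold cross-validation on the training data (fold count is assumed).
        "cv_mean": cross_val_score(model, X_train, y_train, cv=5).mean(),
    }

# Importance of variables, as estimated by the fitted decision tree.
importances = models["decision_tree"].feature_importances_

# Parameter tuning: grid search over the number of neighbours for K-NN.
search = GridSearchCV(
    models["knn"],
    param_grid={"kneighborsclassifier__n_neighbors": [3, 5, 7, 9]},
    cv=5,
)
search.fit(X_train, y_train)
best_k = search.best_params_["kneighborsclassifier__n_neighbors"]

for name, res in results.items():
    print(name, round(res["accuracy"], 3), round(res["cv_mean"], 3))
print("best k:", best_k)
```

Comparing `results["decision_tree"]` with `results["knn"]` mirrors the comparison of results listed above, using accuracy and the confusion matrix as metrics.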
## About the repository:
In this repository, you can find several files:
- video_games.csv ➡️ The dataset provided by the teacher;
- video_games_clean.csv ➡️ The dataset after pre-processing;
- Trabalho.ipynb ➡️ The notebook with the work developed;
- DataScience_videoQ.pdf ➡️ The PowerPoint slides used to present this project, in PDF format;
- EIACD_Assignment2_2022_23.pdf ➡️ The assignment;
## Link to the course:
This course is part of the **second semester** of the **first year** of the **Bachelor's Degree in Artificial Intelligence and Data Science** at **FCUP** and **FEUP** in the academic year 2022/2023. You can find more information about this course at the following link: