https://github.com/cyblx/clustering
This project explores clustering techniques and supervised learning applied to World Cup team performance analysis. The methodologies include K-Means, DBSCAN, K-Nearest Neighbors, Gaussian Mixture Models (GMM), and Agglomerative Clustering.
clustering data-analysis dbscan gmm kmeans supervised-learning unsupervised-learning world-cup
- Host: GitHub
- URL: https://github.com/cyblx/clustering
- Owner: CybLX
- License: MIT
- Created: 2024-10-18T17:34:09.000Z (over 1 year ago)
- Default Branch: main
- Last Pushed: 2024-10-18T18:02:33.000Z (over 1 year ago)
- Last Synced: 2025-05-28T07:58:06.300Z (about 1 year ago)
- Topics: clustering, data-analysis, dbscan, gmm, kmeans, supervised-learning, unsupervised-learning, world-cup
- Language: Jupyter Notebook
- Homepage:
- Size: 1.6 MB
- Stars: 0
- Watchers: 1
- Forks: 0
- Open Issues: 0
Metadata Files:
- Readme: README.md
- License: LICENSE
README
# Clustering Techniques and Supervised Learning
## Overview
This project applies clustering techniques and supervised learning to the analysis of team performance in the World Cup. The clustering methods covered are K-Means, DBSCAN, Gaussian Mixture Models (GMM), and Agglomerative Clustering; K-Nearest Neighbors is the supervised method.
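
K-Nearest Neighbors differs from the other methods listed here because it needs labels to learn from. The sketch below shows that supervised workflow with scikit-learn. It is only an illustration: the feature values are synthetic, and the "strong team" label is a hypothetical target that is not taken from the project's dataset or notebook.

```python
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

rng = np.random.default_rng(0)

# Synthetic stand-in for per-team features: [wins, goal difference].
# These numbers are invented for illustration, not read from the dataset.
weak = rng.normal(loc=[0.5, -4.0], scale=[0.5, 1.5], size=(40, 2))
strong = rng.normal(loc=[4.0, 6.0], scale=[1.0, 2.0], size=(40, 2))
X = np.vstack([weak, strong])
y = np.array([0] * 40 + [1] * 40)  # hypothetical label: 1 = "strong" team

# Hold out a quarter of the rows so accuracy is measured on unseen data.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, stratify=y, random_state=0
)

# KNN is distance-based, so features are standardized before fitting.
knn = make_pipeline(StandardScaler(), KNeighborsClassifier(n_neighbors=5))
knn.fit(X_train, y_train)

accuracy = knn.score(X_test, y_test)
print(f"held-out accuracy: {accuracy:.2f}")
```

The value `n_neighbors=5` is scikit-learn's default and is used here only as a starting point; on real data it would normally be tuned, for example with cross-validation.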
## Dataset Features
The dataset used in this project contains information such as:
- **Position**: Team's ranking position
- **Team**: Name of the team
- **Games Played**: Total number of games played
- **Win**: Total number of wins
- **Draw**: Total number of draws
- **Loss**: Total number of losses
- **Goals For**: Total goals scored by the team
- **Goals Against**: Total goals conceded by the team
- **Goal Difference**: Difference between goals scored and conceded
- **Points**: Total points accumulated
- **Year**: Year of the competition
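
Several of these columns are tied together by definition, which gives a cheap sanity check before clustering. The sketch below assumes the column names match the labels in the list above, which may not be how the actual file names them, and it uses three invented rows rather than real results. Points is left out because the scoring rule is not stated here.

```python
import pandas as pd

# Hypothetical rows with invented values; column names mirror the feature list.
df = pd.DataFrame(
    {
        "Team": ["Team A", "Team B", "Team C"],
        "Games Played": [7, 4, 3],
        "Win": [5, 1, 0],
        "Draw": [1, 2, 1],
        "Loss": [1, 1, 2],
        "Goals For": [14, 4, 2],
        "Goals Against": [5, 4, 6],
        "Goal Difference": [9, 0, -4],
    }
)

# Every game ends in a win, a draw, or a loss.
games_ok = df["Games Played"] == df["Win"] + df["Draw"] + df["Loss"]
# Goal difference is goals scored minus goals conceded.
gd_ok = df["Goal Difference"] == df["Goals For"] - df["Goals Against"]

inconsistent = df.loc[~(games_ok & gd_ok), "Team"].tolist()
print("inconsistent rows:", inconsistent)

# Optional normalization (an assumption, not necessarily what the notebook does):
# divide totals by games played so teams with longer runs are comparable.
totals = df[["Win", "Draw", "Loss", "Goals For", "Goals Against"]]
per_game = totals.div(df["Games Played"], axis=0)
print(per_game.round(2))
```

Per-game rates are one possible choice of input features. Raw totals partly reflect how far a team advanced, which may or may not be what the clusters should capture.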
## Project Goals
The main objective of this project is to apply clustering techniques to understand the structure of the data and the relationships among its variables. We aim to identify groups of similar teams, segment the data, and evaluate how the machine learning algorithms perform in different scenarios, with an emphasis on teaching unsupervised learning techniques.
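
One way to compare the clustering methods on equal footing is to fit each one on the same standardized features and score the resulting partitions. The sketch below does this with scikit-learn on synthetic blobs. The data, the choice of three clusters, the DBSCAN parameters, and the use of the silhouette score are all assumptions made for illustration, not settings taken from the notebook.

```python
import numpy as np
from sklearn.cluster import DBSCAN, AgglomerativeClustering, KMeans
from sklearn.datasets import make_blobs
from sklearn.metrics import silhouette_score
from sklearn.mixture import GaussianMixture
from sklearn.preprocessing import StandardScaler

# Synthetic 2-D data standing in for team features (not the project's dataset).
X, _ = make_blobs(n_samples=150, centers=3, cluster_std=0.6, random_state=42)
X = StandardScaler().fit_transform(X)

models = {
    "K-Means": KMeans(n_clusters=3, n_init=10, random_state=42),
    "GMM": GaussianMixture(n_components=3, random_state=42),
    "Agglomerative": AgglomerativeClustering(n_clusters=3),
    "DBSCAN": DBSCAN(eps=0.3, min_samples=5),  # illustrative parameters
}

scores = {}
for name, model in models.items():
    labels = model.fit_predict(X)
    # DBSCAN marks noise points with -1; leave them out of the score.
    mask = labels != -1
    n_clusters = len(set(labels[mask]))
    # The silhouette score is only defined for two or more clusters.
    if n_clusters >= 2:
        scores[name] = silhouette_score(X[mask], labels[mask])
    else:
        scores[name] = None

for name, score in scores.items():
    shown = "n/a" if score is None else f"{score:.3f}"
    print(f"{name:>14}: silhouette = {shown}")
```

Silhouette values range from -1 to 1, with higher values indicating tighter, better-separated groups. Because DBSCAN's noise points are excluded here, its score is not strictly comparable with the others and should be read alongside the number of points it discarded.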
## Tools Used
- Python
- Jupyter Notebook
- Libraries: Pandas, NumPy, Scikit-learn, Matplotlib, Seaborn, among others.
## How to Use
1. Clone the repository to your local machine and move into the project directory:
```bash
git clone https://github.com/cyblx/clustering.git
cd clustering
```
2. Install the required libraries:
```bash
pip install -r requirements.txt
```
3. Open Jupyter Notebook and run the analysis:
```bash
jupyter notebook
```
4. Follow the instructions within the notebook to explore the dataset and view the analysis results.
## For More Information
For more information, code, tutorials, and other projects, use the contacts below:
- Email: alves_lucasoliveira@usp.br
- GitHub: [cyblx](https://github.com/cyblx)
- LinkedIn: [Cyblx](https://www.linkedin.com/in/cyblx)