https://github.com/baghettigh/f1-podium-prediction

Host: GitHub
URL: https://github.com/baghettigh/f1-podium-prediction
Owner: BaghettiGH
Created: 2024-11-01T03:27:48.000Z (about 2 months ago)
Default Branch: main
Last Pushed: 2024-11-11T04:41:44.000Z (about 1 month ago)
Last Synced: 2024-11-11T05:30:38.363Z (about 1 month ago)
Topics: data-science, formula1, logistic-regression, prediction, streamlit
Language: Python
Homepage:
Size: 1.48 MB
Stars: 0
Watchers: 2
Forks: 0
Open Issues: 0
Metadata Files:
- Readme: README.md

README

# F1 Podium Prediction
A Streamlit web application that predicts whether a driver will finish on the podium (1st, 2nd, or 3rd place) in a Formula 1 Grand Prix using their qualifying times and initial position. The application performs **EDA**, **Data Preprocessing**, and **Supervised Machine Learning**, and makes its predictions using **Logistic Regression**.

![Main Page Screenshot](screenshots/main_page_screenshot.png)

### 🔗 Links:
- [Streamlit Link]()
- [Google Colab Notebook](https://colab.research.google.com/drive/1AxRBCJX24u00DtShTovHU3tbSkpzfIt7?usp=sharing)

### 📊 Dataset:
- [Formula 1 World Championship (1950 - 2024)(Kaggle)](https://www.kaggle.com/datasets/rohanrao/formula-1-world-championship-1950-2020)

This repository serves as a project guide template for students in my **Introduction to Data Science** course to use in their final project. It includes a Python file, `dashboard_template.py`, which provides boilerplate for a Streamlit dashboard.

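
Before qualifying times can be used as model inputs, they need to be numeric, and each race result needs a podium label. The sketch below illustrates one way to do both. It assumes that qualifying times are stored as strings such as `1:26.572`, that missing values are marked `\N`, and that the finishing place is available as an integer; these are assumptions about the dataset's layout, not details confirmed by this README. The helper names `lap_time_to_seconds` and `is_podium` are hypothetical.

```python
from typing import Optional


def lap_time_to_seconds(raw: Optional[str]) -> Optional[float]:
    """Convert a lap time like '1:26.572' to seconds.

    Returns None when the value is missing (None, empty, or the
    assumed missing-value marker '\\N').
    """
    if raw is None:
        return None
    text = raw.strip()
    if text in ("", "\\N"):
        return None
    # Split on the last ':' so times without a minutes part also work.
    minutes, _, seconds = text.rpartition(":")
    return (int(minutes) if minutes else 0) * 60 + float(seconds)


def is_podium(finishing_position: int) -> int:
    """Binary target: 1 for a top-three finish, 0 otherwise."""
    return 1 if 1 <= finishing_position <= 3 else 0
```

For example, `lap_time_to_seconds("1:26.572")` yields a value of roughly 86.572 seconds, while `is_podium(4)` returns `0`.
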
### 📖 Pages:

1. `Dataset` - Brief description of the Formula 1 Dataset used in this dashboard.
2. `EDA` - Exploratory Data Analysis of the F1 Dataset. Highlights the relationship between initial position, qualifying times, and finishing on the podium. Includes bar graphs, histograms, scatter plots, etc.
3. `Data Cleaning / Pre-processing` - Data cleaning and pre-processing steps such as encoding categorical columns and splitting the dataset into training and testing sets.
4. `Machine Learning` - Training the Logistic Regression model. This page also includes the model evaluation, feature importance, and classification report.
5. `Prediction` - Prediction page where users can input values to predict if the driver will finish on the podium.
6. `Conclusion` - Summary of the insights and observations from the EDA and model training.
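
To make the `Machine Learning` and `Prediction` pages more concrete, here is a minimal sketch of how a logistic regression model turns inputs into a podium probability. It is not the repository's implementation: it uses plain Python with batch gradient descent instead of a library estimator such as scikit-learn's `LogisticRegression`, the two features (grid position scaled by 20, and gap to pole in seconds) are assumed for illustration, and the training rows are made up.

```python
import math


def sigmoid(z: float) -> float:
    """Map a real-valued score to a probability in (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))


def predict_podium_probability(weights, bias, features):
    """Probability of a podium finish for one feature vector."""
    score = bias + sum(w * x for w, x in zip(weights, features))
    return sigmoid(score)


def train_logistic_regression(rows, labels, lr=0.5, epochs=2000):
    """Fit weights and bias by batch gradient descent on the log loss."""
    weights = [0.0] * len(rows[0])
    bias = 0.0
    n = len(rows)
    for _ in range(epochs):
        grad_w = [0.0] * len(weights)
        grad_b = 0.0
        for x, y in zip(rows, labels):
            error = predict_podium_probability(weights, bias, x) - y
            for j, xj in enumerate(x):
                grad_w[j] += error * xj
            grad_b += error
        weights = [w - lr * g / n for w, g in zip(weights, grad_w)]
        bias -= lr * grad_b / n
    return weights, bias


def make_features(grid_position: int, gap_to_pole_s: float):
    """Hypothetical feature vector: scaled grid slot and gap to pole."""
    return [grid_position / 20.0, gap_to_pole_s]


# Made-up training rows: (grid position, gap to pole in seconds, podium?)
toy_data = [
    (1, 0.0, 1), (2, 0.1, 1), (3, 0.2, 1), (5, 0.4, 1),
    (2, 0.15, 0), (4, 0.3, 0), (6, 0.5, 0), (8, 0.8, 0),
    (10, 1.0, 0), (12, 1.2, 0), (15, 1.5, 0),
]
rows = [make_features(grid, gap) for grid, gap, _ in toy_data]
labels = [podium for _, _, podium in toy_data]

weights, bias = train_logistic_regression(rows, labels)

p_front = predict_podium_probability(weights, bias, make_features(1, 0.0))
p_back = predict_podium_probability(weights, bias, make_features(15, 1.5))
print(f"front row: {p_front:.2f}, midfield: {p_back:.2f}")
```

On this toy data the model assigns a higher podium probability to the front-row input than to the midfield one, which is the same kind of output the `Prediction` page would display for user-supplied values.
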

### 💡 Findings / Insights

Through exploratory data analysis and training of a `Logistic Regression` classifier on the **Formula 1 World Championship dataset**, the key insights and observations are:

#### 1. 📊 **Dataset Characteristics**:

-
-

#### 2. 📝 **Feature Distributions and Separability**:

-
-

#### 3. 📈 **Model Performance (Logistic Regression)**:

-
-
##### **Conclusion:**

(Text here)