https://github.com/baghettigh/f1-podium-prediction

# F1 Podium Prediction
A Streamlit web application that predicts whether a driver will finish on the podium (1st, 2nd, or 3rd place) in a Formula 1 Grand Prix, based on their qualifying times and initial (grid) position. The application performs **EDA**, **Data Preprocessing**, and **Supervised Machine Learning**, using **Logistic Regression** to make its predictions.

![Main Page Screenshot](screenshots/main_page_screenshot.png)

### 🔗 Links:
- [Streamlit Link]()
- [Google Colab Notebook](https://colab.research.google.com/drive/1AxRBCJX24u00DtShTovHU3tbSkpzfIt7?usp=sharing)

### 📊 Dataset:
- [Formula 1 World Championship (1950 - 2024) (Kaggle)](https://www.kaggle.com/datasets/rohanrao/formula-1-world-championship-1950-2020)

This repository serves as a project guide template for my students in the **Introduction to Data Science** course for their final project. It contains a Python file, `dashboard_template.py`, which provides boilerplate for a Streamlit dashboard.

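
Working with this dataset typically means turning raw columns into model-ready values. The following is a minimal sketch, not code from this repository, under the assumptions that qualifying lap times are stored as strings such as `1:26.572`, that missing values appear as `\N`, and that a finishing position of 1 to 3 counts as a podium. The helper names `lap_time_to_seconds` and `is_podium` are hypothetical.

```python
from typing import Optional

MISSING = r"\N"  # assumed missing-value marker in the CSV files


def lap_time_to_seconds(value: str) -> Optional[float]:
    """Convert a lap time such as '1:26.572' to seconds; None if missing."""
    value = value.strip()
    if not value or value == MISSING:
        return None
    # rpartition yields an empty minutes part when there is no ':' in the value
    minutes, _, seconds = value.rpartition(":")
    return (int(minutes) if minutes else 0) * 60 + float(seconds)


def is_podium(finish_position: int) -> int:
    """Binary target: 1 for a top-three finish, 0 otherwise."""
    return 1 if 1 <= finish_position <= 3 else 0
```

Returning `None` for missing times leaves the choice of how to handle them (dropping the row or imputing a value) to the preprocessing step.
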
### 📖 Pages:

1. `Dataset` - Brief description of the Formula 1 Dataset used in this dashboard.
2. `EDA` - Exploratory Data Analysis of the F1 Dataset. Highlights the relationship between initial position, qualifying times, and finishing on the podium. Includes bar graphs, histograms, scatter plots, etc.
3. `Data Cleaning / Pre-processing` - Data cleaning and pre-processing steps such as encoding categorical columns and splitting the dataset into training and testing sets.
4. `Machine Learning` - Training of the Logistic Regression model. This page also includes the model evaluation, feature importance, and classification report.
5. `Prediction` - Prediction page where users can input values to predict if the driver will finish on the podium.
6. `Conclusion` - Summary of the insights and observations from the EDA and model training.
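
The idea behind the `Machine Learning` and `Prediction` pages can be illustrated with a small from-scratch logistic regression. This is only a sketch and not the repository's code: the toy rows are made up, the two features (scaled grid position and qualifying gap to pole) are an assumed encoding, and `train_logistic_regression` and `predict_podium_probability` are hypothetical names.

```python
import math
from typing import List, Sequence, Tuple


def sigmoid(z: float) -> float:
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def train_logistic_regression(
    rows: Sequence[Sequence[float]],
    labels: Sequence[int],
    lr: float = 0.5,
    epochs: int = 2000,
) -> Tuple[List[float], float]:
    """Fit weights and bias with batch gradient descent on the log loss."""
    n, n_features = len(rows), len(rows[0])
    weights, bias = [0.0] * n_features, 0.0
    for _ in range(epochs):
        grad_w, grad_b = [0.0] * n_features, 0.0
        for x, y in zip(rows, labels):
            error = sigmoid(sum(w * xi for w, xi in zip(weights, x)) + bias) - y
            for j, xi in enumerate(x):
                grad_w[j] += error * xi
            grad_b += error
        weights = [w - lr * g / n for w, g in zip(weights, grad_w)]
        bias -= lr * grad_b / n
    return weights, bias


def predict_podium_probability(
    weights: Sequence[float], bias: float, features: Sequence[float]
) -> float:
    """Estimated probability of a podium finish for one driver."""
    return sigmoid(sum(w * x for w, x in zip(weights, features)) + bias)


# Made-up rows: (grid position / 20, qualifying gap to pole in seconds)
rows = [(0.05, 0.0), (0.10, 0.1), (0.15, 0.3), (0.20, 0.4),
        (0.50, 1.0), (0.60, 1.2), (0.75, 1.6), (0.90, 2.1)]
labels = [1, 1, 1, 1, 0, 0, 0, 0]  # 1 = podium, 0 = no podium

weights, bias = train_logistic_regression(rows, labels)
print(predict_podium_probability(weights, bias, (0.05, 0.0)))  # front of the grid
print(predict_podium_probability(weights, bias, (0.90, 2.1)))  # back of the grid
```

A prediction page would then compare the returned probability with a threshold (commonly 0.5) to report "podium" or "no podium". In practice a library implementation such as scikit-learn's `LogisticRegression` would usually replace the hand-written training loop.
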

### 💡 Findings / Insights

Through exploratory data analysis and training of a classification model (`Logistic Regression`) on the **Formula 1 World Championship dataset**, the key insights and observations are:

#### 1. 📊 **Dataset Characteristics**:

-
-

#### 2. 📝 **Feature Distributions and Separability**:

-
-

#### 3. 📈 **Model Performance (Logistic Regression)**:

-
-

##### **Conclusion:**

(Text here)