An open API service indexing awesome lists of open source software.

https://github.com/oceanuz/car-price-regression

A comprehensive ML evaluation and improvement notebook for a car price prediction model. It includes topics such as scoring with r2, cross-validation, overfitting/underfitting diagnosis, and polynomial regression. *Ridge regression* is applied to reduce overfitting, and (GridSearchCV) techniques are used to find the best alpha hyperparameter.
https://github.com/oceanuz/car-price-regression

cross-validation data-science grid-search hyperparameter-tuning machine-learning machine-learning-models model-evaluation overfitting python regression ridge-regression scikit-learn

Last synced: 6 months ago
JSON representation

A comprehensive ML evaluation and improvement notebook for a car price prediction model. It includes topics such as scoring with r2, cross-validation, overfitting/underfitting diagnosis, and polynomial regression. *Ridge regression* is applied to reduce overfitting, and (GridSearchCV) techniques are used to find the best alpha hyperparameter.

Awesome Lists containing this project

README

          

# 🚗 Model Evaluation and Refinement for Car Price Prediction

This repository contains a comprehensive Jupyter Notebook (`Model_Evaluation_and_Refinement_cars.ipynb`) focused on the essential steps of **Machine Learning model evaluation and refinement**.

The analysis uses a car features dataset to predict prices, demonstrating key concepts from model selection to hyperparameter tuning using Python and `scikit-learn`.

## 📂 Notebook Contents

The notebook systematically covers the following crucial topics:

1. **Data Splitting:** Splitting the dataset into **training** and **testing** sets.
2. **Model Scoring:** Evaluating performance using the **$R^2$ (Coefficient of Determination)** metric.
3. **Cross-Validation:** Implementation of **K-Fold Cross-Validation** (using `cross_val_score` and `cross_val_predict`).
4. **Model Diagnostics:** Identification and visualization of **Overfitting** and **Underfitting**, especially with **Polynomial Regression**.
5. **Regularization:** Application of **Ridge Regression** to control model complexity.
6. **Hyperparameter Tuning:** Using **Grid Search (`GridSearchCV`)** to find the optimal $\alpha$ (alpha) hyperparameter for the Ridge model.

## 🛠️ Requirements

To run this notebook, you will need the following Python libraries:

* `pandas`
* `numpy`
* `scikit-learn` (`sklearn`)
* `matplotlib`
* `seaborn`

Install them via pip:

```bash
pip install pandas numpy scikit-learn matplotlib seaborn