https://github.com/cyprianfusi/kaggle_bulldozer_price_prediction

Validation RMSLE obtained: 0.21163, which is lower than the RMSLE (0.22909) that won the Kaggle competition.
bulldozer-price-prediction feature-engineering feature-selection pca random-forest-regressor rmsle

Host: GitHub
URL: https://github.com/cyprianfusi/kaggle_bulldozer_price_prediction
Owner: CyprianFusi
Created: 2024-02-04T09:44:11.000Z (almost 2 years ago)
Default Branch: main
Last Pushed: 2024-07-23T03:12:29.000Z (over 1 year ago)
Last Synced: 2025-01-26T11:28:34.383Z (10 months ago)
Topics: bulldozer-price-prediction, feature-engineering, feature-selection, pca, random-forest-regressor, rmsle
Language: Jupyter Notebook
Homepage:
Size: 1.11 MB
Stars: 0
Watchers: 1
Forks: 0
Open Issues: 0
Metadata Files:
- Readme: README.md

# Predicting the Sale Price of Bulldozers using RandomForestRegressor
## The Goal
* "Predict the auction sale price for a piece of heavy equipment to create a 'blue book' for bulldozers."

The goal was to obtain a **better RMSLE** than the **best** score achieved in the **Kaggle competition**.

## Results Obtained
* Validation RMSLE obtained: **`0.21163`**, Validation $R^2$: `0.90759`
* Best Kaggle RMSLE: **`0.22909`**
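
For readers unfamiliar with the second metric, the sketch below shows one standard way to compute $R^2$ (the coefficient of determination). It is an illustration only: the function name `r_squared` and the sample numbers are hypothetical, and the README does not state whether the reported $R^2$ was computed on raw or log-transformed prices.

```python
def r_squared(actual, predicted):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    if len(actual) != len(predicted) or not actual:
        raise ValueError("inputs must be non-empty and of equal length")
    mean_actual = sum(actual) / len(actual)
    # Residual sum of squares: error left after using the model's predictions.
    ss_res = sum((a - p) ** 2 for a, p in zip(actual, predicted))
    # Total sum of squares: error of always predicting the mean.
    ss_tot = sum((a - mean_actual) ** 2 for a in actual)
    if ss_tot == 0:
        raise ValueError("R^2 is undefined when all actual values are equal")
    return 1.0 - ss_res / ss_tot


# Hypothetical values, for illustration only (not taken from the dataset).
example_r2 = r_squared([1.0, 2.0, 3.0], [1.1, 1.9, 3.2])
```

A value of 1 means the predictions match the targets exactly, and a value of 0 means the model does no better than always predicting the mean.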

## Dataset
For a description of the dataset, see **Kaggle's** [Blue Book for Bulldozers](https://www.kaggle.com/c/bluebook-for-bulldozers/data). The data for this competition is split into three parts:

* **Train.csv** is the training set, which contains data through the end of 2011.
* **Valid.csv** is the validation set, which contains data from January 1, 2012 - April 30, 2012. You make predictions on this set throughout the majority of the competition. Your score on this set is used to create the public leaderboard.
* **Test.csv** is the test set, which won't be released until the last week of the competition. It contains data from May 1, 2012 - November 2012. Your score on the test set determines your final rank for the competition.
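
The three files are separated by sale date, so a model is always evaluated on sales that happened after the ones it was trained on. The sketch below mirrors those date boundaries on a few made-up records. The helper `assign_split` and the sample rows are hypothetical, and the `saledate` field name is an assumption about the column layout. Kaggle already ships the data as separate files, so this is needed only when re-creating a similar time-based split from a single table.

```python
from datetime import date

# Boundaries taken from the dataset description above.
TRAIN_END = date(2011, 12, 31)   # Train.csv: data through the end of 2011
VALID_END = date(2012, 4, 30)    # Valid.csv: January 1, 2012 - April 30, 2012


def assign_split(sale_date):
    """Return which split a sale belongs to, based only on its date."""
    if sale_date <= TRAIN_END:
        return "train"
    if sale_date <= VALID_END:
        return "valid"
    return "test"                # May 1, 2012 onwards


# Hypothetical records, for illustration only (not taken from the dataset).
records = [
    {"saledate": date(2009, 6, 15), "SalePrice": 32000.0},
    {"saledate": date(2012, 2, 1), "SalePrice": 41000.0},
    {"saledate": date(2012, 7, 20), "SalePrice": 18500.0},
]

# Group the records by split.
splits = {"train": [], "valid": [], "test": []}
for record in records:
    splits[assign_split(record["saledate"])].append(record)
```

Splitting by time rather than at random keeps later sales out of the training data, which matters when the aim is to predict prices at future auctions.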

## Evaluation
"The evaluation metric for this competition is the **RMSLE (root mean squared log error)** between the actual and predicted auction prices." See [Evaluation](https://www.kaggle.com/c/bluebook-for-bulldozers/overview) on Kaggle.
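
As a concrete reading of the metric, here is a minimal sketch of RMSLE in plain Python. It assumes the common `log(1 + x)` form of the error. The function name `rmsle` and the sample prices are hypothetical and are not taken from the notebook.

```python
import math


def rmsle(actual, predicted):
    """Root mean squared log error, using log(1 + x) so a zero price is safe."""
    if len(actual) != len(predicted) or not actual:
        raise ValueError("inputs must be non-empty and of equal length")
    squared_log_errors = [
        (math.log1p(p) - math.log1p(a)) ** 2
        for a, p in zip(actual, predicted)
    ]
    return math.sqrt(sum(squared_log_errors) / len(squared_log_errors))


# Hypothetical prices, for illustration only (not taken from the dataset).
actual_prices = [10000.0, 25000.0, 60000.0]
predicted_prices = [11000.0, 24000.0, 55000.0]
score = rmsle(actual_prices, predicted_prices)
```

Because the error is measured on log prices, RMSLE penalizes relative rather than absolute differences: being off by 10% costs about the same on a cheap machine as on an expensive one.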
