https://github.com/abideen-olawuwo/bulldozer-prediction

Predicting the Future Price of Bulldozer
machine-learning matplotlib numpy pandas python random-forest-regressor scikit-learn

Host: GitHub
URL: https://github.com/abideen-olawuwo/bulldozer-prediction
Owner: abideen-olawuwo
Created: 2023-02-12T12:08:28.000Z (almost 2 years ago)
Default Branch: main
Last Pushed: 2023-05-29T09:09:52.000Z (over 1 year ago)
Last Synced: 2024-11-15T00:33:13.090Z (2 months ago)
Topics: machine-learning, matplotlib, numpy, pandas, python, random-forest-regressor, scikit-learn
Language: Jupyter Notebook
Homepage:
Size: 353 KB
Stars: 0
Watchers: 1
Forks: 0
Open Issues: 0
Metadata Files:
- Readme: README.md

README

* Problem definition

How well can we predict the future sale price of a bulldozer, given its characteristics and previous examples of how much similar bulldozers have been sold for?

* Data

The data is downloaded from the Kaggle Bluebook for Bulldozers competition: https://www.kaggle.com/c/bluebook-for-bulldozers/data

There are 3 main datasets:

* Train.csv is the training set, which contains data through the end of 2011.
* Valid.csv is the validation set, which contains data from January 1, 2012 - April 30, 2012. You make predictions on this set throughout the majority of the competition. Your score on this set is used to create the public leaderboard.
* Test.csv is the test set, which won't be released until the last week of the competition. It contains data from May 1, 2012 - November 2012. Your score on the test set determines your final rank for the competition.
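
The date ranges above imply a time-based split rather than a random one. A minimal sketch with pandas follows. The column names `saledate` and `SalePrice` are assumptions (check the data dictionary), and a tiny made-up frame stands in for the real CSV files:

```python
import pandas as pd

# Tiny made-up sample standing in for the Kaggle data. The column names
# "saledate" and "SalePrice" are assumptions, not taken from this README.
df = pd.DataFrame({
    "saledate": ["2010-06-15", "2011-12-31", "2012-01-01", "2012-04-30", "2012-05-01"],
    "SalePrice": [30000, 45000, 52000, 27000, 61000],
})
df["saledate"] = pd.to_datetime(df["saledate"])


def split_by_date(frame, date_col="saledate"):
    """Split rows into train / validation / test using the date ranges listed above."""
    valid_start = pd.Timestamp("2012-01-01")
    test_start = pd.Timestamp("2012-05-01")
    dates = frame[date_col]
    train = frame[dates < valid_start]                            # through the end of 2011
    valid = frame[(dates >= valid_start) & (dates < test_start)]  # Jan 1 - Apr 30, 2012
    test = frame[dates >= test_start]                             # May 1, 2012 onward
    return train, valid, test


train_df, valid_df, test_df = split_by_date(df)
```

With the real files, something like `pd.read_csv("Train.csv", parse_dates=["saledate"])` would presumably replace the hand-built frame, again assuming that column name.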

* Evaluation

The evaluation metric for this competition is the RMSLE (root mean squared log error) between the actual and predicted auction prices.

For more on the evaluation of this project check: https://www.kaggle.com/c/bluebook-for-bulldozers/overview/evaluation

**Note:** The goal for most regression evaluation metrics is to minimize the error. For this project, that means building a model that minimizes RMSLE.
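
To make the metric concrete, here is a small sketch of RMSLE in plain Python. It assumes the common `log(x + 1)` form (the same convention as scikit-learn's `mean_squared_log_error`). The function name `rmsle` and the sample prices are illustrative only:

```python
import math


def rmsle(y_true, y_pred):
    """Root mean squared log error between actual and predicted prices."""
    if len(y_true) != len(y_pred) or len(y_true) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    squared_log_errors = [
        (math.log1p(pred) - math.log1p(actual)) ** 2
        for actual, pred in zip(y_true, y_pred)
    ]
    return math.sqrt(sum(squared_log_errors) / len(squared_log_errors))


# Made-up prices for illustration.
actual = [30000, 45000, 52000]
perfect = rmsle(actual, actual)                           # identical predictions
off_by_10_percent = rmsle(actual, [33000, 49500, 57200])  # every prediction 10% too high
```

Because the error is taken on the log scale, being 10% off on a cheap machine costs about as much as being 10% off on an expensive one, which suits auction prices that span a wide range.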

* Features

Kaggle provides a data dictionary detailing all of the features of the dataset. You can view this data dictionary on Google Sheets: https://docs.google.com/spreadsheets/d/18ly-bLR8sbDJLITkWG7ozKm8l3RyieQ2Fpgix-beSYI/edit?usp=sharing

* Model

The model used for prediction was scikit-learn's RandomForestRegressor.
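
A minimal sketch of fitting a RandomForestRegressor is shown below. It uses synthetic data, since the README does not state the notebook's preprocessing or hyperparameters. The feature layout, `n_estimators=50`, and the 150/50 split are all assumptions for illustration:

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor

# Synthetic stand-in data: two numeric features and a price that depends on them.
# This is not the bulldozer dataset.
rng = np.random.default_rng(42)
X = rng.uniform(0, 10, size=(200, 2))
y = 20000 + 3000 * X[:, 0] - 1000 * X[:, 1] + rng.normal(0, 500, size=200)

# Hold out the last 50 rows for validation.
X_train, X_valid = X[:150], X[150:]
y_train, y_valid = y[:150], y[150:]

# Hyperparameters here are illustrative, not the ones used in the notebook.
model = RandomForestRegressor(n_estimators=50, random_state=42)
model.fit(X_train, y_train)

preds = model.predict(X_valid)
valid_r2 = model.score(X_valid, y_valid)  # R^2 on the held-out rows
```

On the real data, categorical columns and missing values would need to be encoded or filled first, and RMSLE on the validation set would be the score to track.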