https://github.com/vanilladucky/housing-prediction

This is a data analytics and machine learning project built on a Kaggle housing dataset, undertaken to put my machine learning knowledge into practice through a practical application.

data-science machine-learning python scikit-learn


Host: GitHub
URL: https://github.com/vanilladucky/housing-prediction
Owner: vanilladucky
Created: 2022-04-28T13:38:00.000Z (about 4 years ago)
Default Branch: main
Last Pushed: 2022-05-11T09:13:46.000Z (about 4 years ago)
Last Synced: 2025-01-23T06:13:05.033Z (over 1 year ago)
Topics: data-science, machine-learning, python, scikit-learn
Language: Jupyter Notebook
Homepage: https://share.streamlit.io/vanilladucky/housing-prediction/main/prediction_project.py
Size: 21.9 MB
Stars: 0
Watchers: 1
Forks: 0
Open Issues: 0
Metadata Files:
- Readme: README.md

README

# A housing price prediction web application

## Summary
This is a data analytics and machine learning project built on a Kaggle housing dataset, undertaken to put my machine learning knowledge into practice through a practical application.

## Work explained
The **data** folder contains the cleaned and external datasets.
The external data contained numerical and categorical features along with numerous NaN values. I imputed the missing values with logical, context-aware rules so that no unexplained NaN values remained. Where a missing value was itself meaningful for a house, I kept that information and handled it through label encoding of the categorical features.
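
The imputation and label-encoding steps described above might look roughly like the sketch below. It is illustrative only: the column names (`lot_frontage`, `garage_type`, `garage_area`), the `impute_and_encode` helper, and the specific fill rules are assumptions, not taken from the project's notebooks.

```python
import numpy as np
import pandas as pd

# Hypothetical toy frame; the column names are illustrative, not the project's.
df = pd.DataFrame({
    "lot_frontage": [65.0, np.nan, 80.0, 70.0],
    "garage_type": ["Attached", np.nan, "Detached", "Attached"],
    "garage_area": [548.0, np.nan, 460.0, 500.0],
})


def impute_and_encode(frame: pd.DataFrame) -> pd.DataFrame:
    """Fill NaN values with context-aware rules, then label-encode categories."""
    out = frame.copy()
    # Assumption: a missing garage type means "no garage", so keep it as a category.
    out["garage_type"] = out["garage_type"].fillna("None")
    # A house without a garage has zero garage area.
    out["garage_area"] = out["garage_area"].fillna(0.0)
    # A genuinely unknown measurement falls back to the column median.
    out["lot_frontage"] = out["lot_frontage"].fillna(out["lot_frontage"].median())
    # Label encoding: map each category to an integer code (alphabetical order).
    out["garage_type"] = out["garage_type"].astype("category").cat.codes
    return out


clean = impute_and_encode(df)
print(clean)
```

The key design choice is that "missing" is treated differently per column: a structural absence becomes its own category or a zero, while an unrecorded measurement is filled statistically.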

All of the data cleaning, visualization, feature engineering, and categorical mapping steps are in the **notebooks** folder.

In the **model** notebook, I use these datasets to train several models of varying complexity. I selected the two algorithms that performed best, tuned their hyperparameters, and stacked them with linear regression to form the final model, which yielded the lowest error.
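
As a rough illustration of the stacking step, the sketch below combines two base regressors with a linear regression meta-model using scikit-learn's `StackingRegressor`. The README does not name the two chosen algorithms, so the random forest and gradient boosting models here are assumptions, as is the synthetic dataset; hyperparameter tuning is omitted for brevity.

```python
from sklearn.datasets import make_regression
from sklearn.ensemble import (
    GradientBoostingRegressor,
    RandomForestRegressor,
    StackingRegressor,
)
from sklearn.linear_model import LinearRegression
from sklearn.metrics import mean_squared_error
from sklearn.model_selection import train_test_split

# Synthetic stand-in for the housing data (assumption: not the Kaggle dataset).
X, y = make_regression(n_samples=300, n_features=8, noise=10.0, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0
)

# Two base models (assumed choices) whose out-of-fold predictions
# become the input features of the linear regression meta-model.
stack = StackingRegressor(
    estimators=[
        ("rf", RandomForestRegressor(n_estimators=50, random_state=0)),
        ("gbr", GradientBoostingRegressor(random_state=0)),
    ],
    final_estimator=LinearRegression(),
    cv=5,
)
stack.fit(X_train, y_train)

preds = stack.predict(X_test)
rmse = mean_squared_error(y_test, preds) ** 0.5
print(f"Stacked model RMSE: {rmse:.2f}")
```

In a real run, each base model would first be tuned (for example with `GridSearchCV`) and the stacked model's error compared against each base model on the same held-out split.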

## Tech used
* Python
* Jupyter Notebook
* Scikit-Learn
* Matplotlib
* Streamlit

## Web App
https://share.streamlit.io/vanilladucky/housing-prediction/main/prediction_project.py
