Ecosyste.ms: Awesome
An open API service indexing awesome lists of open source software.
https://github.com/hugo-hattori/property_price_prediction_machine_learning
Machine Learning Project to create an AI capable of predicting property's rental prices.
- Host: GitHub
- URL: https://github.com/hugo-hattori/property_price_prediction_machine_learning
- Owner: Hugo-Hattori
- License: mit
- Created: 2023-09-04T04:18:44.000Z (over 1 year ago)
- Default Branch: main
- Last Pushed: 2023-11-06T21:55:12.000Z (about 1 year ago)
- Last Synced: 2024-11-07T14:32:07.489Z (about 2 months ago)
- Topics: ai-models, artificial-intelligence, joblib, jupyter-notebook, machine-learning, machine-learning-algorithms, matplotlib, matplotlib-pyplot, numpy, pandas, pathlib, plotly, plotly-express, python, seaborn, sklearn, sklearn-linear-model, sklearn-metrics, sklearn-models, streamlit
- Language: Jupyter Notebook
- Homepage:
- Size: 16.5 MB
- Stars: 1
- Watchers: 1
- Forks: 0
- Open Issues: 0
Metadata Files:
- Readme: README.md
- License: LICENSE
README
# Property Price Prediction Machine Learning
## Project Scenario Description
Airbnb is an American company operating an online marketplace for short- and long-term homestays
and experiences alike. The company acts as a broker and charges a commission on each booking.
By using Airbnb's services anyone who has a room or a property of any kind (apartment, house,
chalet, hostel, etc.) can offer their property to be rented out on a daily basis.

In the scenario where you are a property owner and want to offer your property for rent,
you create a profile as a host and an ad for your property. The ad must describe
the characteristics of the property as fully as possible, in order to help renters/travelers
choose the best property for them and also to make your ad more attractive. There are dozens
of possible customizations, from minimum daily rate, price and number of rooms to cancellation
rules, extra fees for extra guests and a requirement to verify the landlord's identity, etc.

## Project's Objective
To build a price prediction AI model that allows an average person who owns a property
to know how much they should charge per day to rent it out.

Or even to allow someone looking for a certain type of property to know
whether that property is attractively priced (below the average for properties with the
same characteristics) or not.

### Inspiration and Database Source
The database was taken from Kaggle (https://www.kaggle.com/allanbruno/airbnb-rio-de-janeiro)
and contains Airbnb data on properties in the city of Rio de Janeiro.

This project was inspired by the following project: https://www.kaggle.com/allanbruno/helping-regular-people-price-listings-on-airbnb
### Packages used:
+ sklearn.metrics
+ sklearn.linear_model
+ sklearn.ensemble
+ sklearn.model_selection
+ pandas
+ pathlib
+ numpy
+ seaborn
+ matplotlib.pyplot
+ plotly.express
+ joblib
+ streamlit

## The Database
The database consists of .csv files from April 2018 to September 2019 with 108 different
features of properties located in Rio de Janeiro. After a qualitative analysis the features
were reduced to 34. The analysis excluded features according to these criteria:

1) IDs, links, and information not relevant to the model;
2) Repeated or redundant information;
3) Columns filled with long descriptive text;
4) Columns where all or almost all values are equal.

## Handling Empty Values
Visualizing the data reveals a large disparity in the amount of missing data between columns.
Columns with more than 300,000 NaN values were excluded from the analysis.

For the other columns, which have a relatively small number of NaN values, the rows
containing NaN data were deleted (since we have over 900,000 rows of data).
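
The two-step cleanup described above can be sketched with pandas as follows. This is only a minimal illustration, not the notebook's actual code: the helper name `drop_sparse_columns_then_rows`, the toy column names and the toy threshold are assumptions made for the example.

```python
import numpy as np
import pandas as pd


def drop_sparse_columns_then_rows(df: pd.DataFrame, max_nan: int) -> pd.DataFrame:
    """Drop columns with more than `max_nan` NaN values, then drop rows with any NaN.

    Hypothetical helper written for this example only.
    """
    nan_counts = df.isnull().sum()
    sparse_columns = nan_counts[nan_counts > max_nan].index
    return df.drop(columns=sparse_columns).dropna()


# Toy data (made up): "square_feet" is mostly empty, "bedrooms" has a single gap.
toy = pd.DataFrame({
    "price": [100.0, 250.0, 80.0, 400.0],
    "bedrooms": [1.0, 2.0, np.nan, 3.0],
    "square_feet": [np.nan, np.nan, np.nan, 55.0],
})

# The README uses a threshold of 300,000 NaN values on over 900,000 rows;
# the threshold of 2 here is scaled down to fit the toy data.
cleaned = drop_sparse_columns_then_rows(toy, max_nan=2)
print(cleaned.shape)  # (3, 2)
```

Dropping the sparse columns first matters: if rows were dropped first, a mostly empty column would remove almost every row.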
After this second analysis the database was left with 23 features.

## Exploratory Data Analysis (EDA)
In this project the EDA will have 3 major objectives:
1) See the correlation between the features and decide if we will keep all the features we have;
2) Exclude outliers by following the Interquartile Range Rule;
3) Confirm whether all the features we have really make sense for our model or
whether any of them won't help us and should be deleted.

There are 3 different types of features to be analyzed: Continuous Numeric Values
(e.g. monetary value), Discrete Numeric Values (e.g. number of bedrooms) and Categories
(e.g. property type).

## Machine Learning
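
The database is only ready for modelling once outliers have been excluded. A minimal sketch of the Interquartile Range Rule used during the EDA is shown here, assuming the conventional 1.5 × IQR fences (the README does not state the multiplier). The function names `iqr_limits` and `exclude_outliers` and the toy prices are hypothetical.

```python
import pandas as pd


def iqr_limits(column: pd.Series) -> tuple[float, float]:
    """Return (lower, upper) fences: Q1 - 1.5 * IQR and Q3 + 1.5 * IQR."""
    q1 = column.quantile(0.25)
    q3 = column.quantile(0.75)
    iqr = q3 - q1
    return float(q1 - 1.5 * iqr), float(q3 + 1.5 * iqr)


def exclude_outliers(df: pd.DataFrame, column_name: str) -> pd.DataFrame:
    """Keep only the rows whose value in `column_name` lies within the IQR fences."""
    lower, upper = iqr_limits(df[column_name])
    return df[df[column_name].between(lower, upper)]


# Toy prices (made up): the last listing is far above the rest.
toy = pd.DataFrame({"price": [100.0, 120.0, 130.0, 150.0, 160.0, 5000.0]})
filtered = exclude_outliers(toy, "price")
print(len(toy), "->", len(filtered))  # 6 -> 5
```

Applying the rule one column at a time, as here, makes it possible to check how many rows each feature removes before committing to the cut.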
With the database ready it's time to start the Machine Learning process which consists of:
1) Defining the Algorithms for the AI Model creation which are: Random Forest,
Linear Regression and Extra Trees;
2) Separating the database into Training Data and Testing Data;
3) Training the AI Models;
4) Evaluating and choosing the best AI Model.

## Project Deployment
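
Deployment starts from a trained and selected model. A minimal sketch of the four Machine Learning steps that produce it is given here using scikit-learn. The synthetic data, the feature names, the hyperparameters and the choice of R² and RMSE as evaluation metrics are assumptions for illustration; the README only names the three algorithms and the `sklearn.metrics` package.

```python
import numpy as np
import pandas as pd
from sklearn.ensemble import ExtraTreesRegressor, RandomForestRegressor
from sklearn.linear_model import LinearRegression
from sklearn.metrics import mean_squared_error, r2_score
from sklearn.model_selection import train_test_split

# Synthetic stand-in for the processed database (made-up features and prices).
rng = np.random.default_rng(0)
n_rows = 300
X = pd.DataFrame({
    "bedrooms": rng.integers(1, 5, size=n_rows),
    "accommodates": rng.integers(1, 9, size=n_rows),
})
y = 80 * X["bedrooms"] + 25 * X["accommodates"] + rng.normal(0, 10, size=n_rows)

# 1) Define the algorithms.
models = {
    "RandomForest": RandomForestRegressor(n_estimators=50, random_state=0),
    "LinearRegression": LinearRegression(),
    "ExtraTrees": ExtraTreesRegressor(n_estimators=50, random_state=0),
}

# 2) Separate the data into training and testing sets.
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

# 3) Train each model and 4) evaluate it on the held-out test data.
results = {}
for name, model in models.items():
    model.fit(X_train, y_train)
    predictions = model.predict(X_test)
    results[name] = {
        "r2": float(r2_score(y_test, predictions)),
        "rmse": float(np.sqrt(mean_squared_error(y_test, predictions))),
    }

# Pick the model with the highest R² (assumed selection criterion).
best_name = max(results, key=lambda name: results[name]["r2"])
print(best_name, results[best_name])
```

On real data the ranking of the three models can differ from this toy run, so the comparison should be repeated on the processed database.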
The last step is to deploy the project. For that purpose we will export the processed database
to a .csv file and save the chosen AI Model to a file by using the joblib package. The deployed
project will utilize the processed database for the Property Price Prediction.

"DeployProjetoAirbnb.py" is the Python script responsible for this project deployment.
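
The export step can be sketched as follows. This is an illustration under stated assumptions rather than the project's actual code: the file names `processed_data.csv` and `model.joblib`, the toy data and the use of `ExtraTreesRegressor` as the chosen model are all hypothetical.

```python
import tempfile
from pathlib import Path

import joblib
import numpy as np
import pandas as pd
from sklearn.ensemble import ExtraTreesRegressor

# Toy stand-in for the processed database (made-up columns and values).
rng = np.random.default_rng(0)
data = pd.DataFrame(rng.random((50, 3)), columns=["f1", "f2", "f3"])
data["price"] = 100 * data["f1"] + 50 * data["f2"] + 10 * data["f3"]

features = data.drop(columns="price")
model = ExtraTreesRegressor(n_estimators=10, random_state=0).fit(features, data["price"])

with tempfile.TemporaryDirectory() as tmp:
    csv_path = Path(tmp) / "processed_data.csv"  # hypothetical file name
    model_path = Path(tmp) / "model.joblib"      # hypothetical file name

    # Export the processed data and serialize the trained model.
    data.to_csv(csv_path, index=False)
    joblib.dump(model, model_path)

    # What a deployment script would do: load both files back.
    restored_data = pd.read_csv(csv_path)
    restored_model = joblib.load(model_path)

# The restored model should reproduce the original predictions.
same_predictions = np.allclose(model.predict(features), restored_model.predict(features))
print(same_predictions)
```

A model file written by joblib should be loaded with the same scikit-learn version that produced it, and only from a trusted source, since loading it can execute arbitrary code.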
![img.png](img.png)
![img_1.png](img_1.png)
As shown in the images, the AI Model can be used by entering the proper information in each
field; pressing "Prever Valor do Imóvel" ("Predict Property Value") displays the predicted price below.

Note: this project has no direct relationship with Airbnb and was created for academic purposes.
The data used was made publicly available by Airbnb.