Ecosyste.ms: Awesome
An open API service indexing awesome lists of open source software.
https://github.com/viveckh/lilhomie
A Machine Learning Project implemented from scratch that involves web scraping, data engineering, exploratory data analysis, and machine learning to predict housing prices in the New York Tri-State Area.
Last synced: about 2 months ago
- Host: GitHub
- URL: https://github.com/viveckh/lilhomie
- Owner: Viveckh
- Created: 2019-02-18T15:07:53.000Z (almost 6 years ago)
- Default Branch: master
- Last Pushed: 2022-12-08T01:37:18.000Z (about 2 years ago)
- Last Synced: 2024-10-11T10:39:30.177Z (2 months ago)
- Topics: data-engineering, eda, housing-price-analysis, housing-price-prediction, machine-learning, machine-learning-projects, predictions, random-forest-regressor, scrapy-crawler, spiders, trulia, web-crawler
- Language: Jupyter Notebook
- Homepage:
- Size: 10.5 MB
- Stars: 81
- Watchers: 4
- Forks: 19
- Open Issues: 14
Metadata Files:
- Readme: README.md
Awesome Lists containing this project
README
### LilHomie - Housing Price Prediction Rapid Prototype
### Author: [(EJ) Vivek Pandey](https://viveckh.com)
LilHomie is a rapid prototyping project that generates housing appraisals to estimate property values in the New York Tri-State Area.
This repository contains all of the associated work for the project, which includes:
* Web Crawler to gather housing data
* Notebooks associated with data engineering, EDA, and ML modeling (see the modeling sketch after this list)
* Serverless API setup to make predictions off the serialized models (see the handler sketch after this list)
* Web App
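The project's topics mention a random forest regressor, and the README refers to serialized models, but the actual training code lives in the notebooks. As an illustration only, here is a minimal sketch of how such a model might be trained and serialized with scikit-learn; the file paths, feature names, and hyperparameters below are assumptions, not the project's actual notebook code.

```python
import pandas as pd
import joblib
from sklearn.ensemble import RandomForestRegressor
from sklearn.model_selection import train_test_split
from sklearn.metrics import mean_absolute_error

# Load the engineered listing data (file name and columns are hypothetical).
listings = pd.read_csv("data/tristate_listings_engineered.csv")

# Hypothetical feature columns and target; the real notebooks define their own.
features = ["bedrooms", "bathrooms", "sqft", "lot_size", "year_built"]
target = "price"

X_train, X_test, y_train, y_test = train_test_split(
    listings[features], listings[target], test_size=0.2, random_state=42
)

# Fit a random forest regressor and report a simple hold-out error.
model = RandomForestRegressor(n_estimators=200, random_state=42)
model.fit(X_train, y_train)
print("MAE:", mean_absolute_error(y_test, model.predict(X_test)))

# Serialize the fitted model so a prediction API can load it later.
joblib.dump(model, "models/random_forest_tristate.joblib")
```

The serverless API is described as making predictions off the serialized models. The sketch below assumes an AWS Lambda-style handler and the hypothetical model file from the previous sketch; it is not the project's actual deployment code.

```python
import json
import joblib

# Load the serialized model once per container so repeated invocations reuse it.
# The path is an assumption; the real setup may fetch the model from elsewhere.
MODEL = joblib.load("models/random_forest_tristate.joblib")
FEATURES = ["bedrooms", "bathrooms", "sqft", "lot_size", "year_built"]

def handler(event, context):
    """Lambda-style entry point: expects a JSON body with one value per feature."""
    body = json.loads(event.get("body", "{}"))
    row = [[body[name] for name in FEATURES]]
    predicted_price = float(MODEL.predict(row)[0])
    return {
        "statusCode": 200,
        "body": json.dumps({"predicted_price": round(predicted_price, 2)}),
    }
```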
### Future Enhancements
* Adding support to crawl and extract the remaining three property page formats on Trulia
* Adding spiders to the web crawler to extract data from Zillow (see the spider sketch after this list)
* Speeding up the crawler with distributed spiders
* Feeding the ML model data on properties across the US, instead of the Tri-State properties it is currently limited to, and making the necessary adjustments based on the new results (this requires the three enhancements above to be done first)
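The first two enhancements come down to adding Scrapy spiders for more page formats and sources. As a sketch only, a new spider would typically follow the pattern below; the spider name, start URL, and CSS selectors are placeholders, not selectors that work against Zillow's real markup.

```python
import scrapy

class ZillowSpider(scrapy.Spider):
    """Skeleton for an additional spider; all selectors below are placeholders."""
    name = "zillow"
    start_urls = ["https://www.zillow.com/new-york-ny/"]  # assumed starting point

    def parse(self, response):
        # Hypothetical selectors: a real spider needs to inspect the site's markup
        # (and often handle anti-bot measures) before these would return data.
        for card in response.css("article.listing-card"):
            yield {
                "address": card.css(".address::text").get(),
                "price": card.css(".price::text").get(),
                "beds": card.css(".beds::text").get(),
                "baths": card.css(".baths::text").get(),
            }

        # Follow pagination if a next-page link is present.
        next_page = response.css("a.next-page::attr(href)").get()
        if next_page:
            yield response.follow(next_page, callback=self.parse)
```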
### Questions?
Email the author at [email protected]