{"id":16210575,"url":"https://github.com/viveckh/lilhomie","last_synced_at":"2025-09-09T07:42:29.554Z","repository":{"id":37062521,"uuid":"171305173","full_name":"Viveckh/LilHomie","owner":"Viveckh","description":"A Machine Learning Project implemented from scratch which involves web scraping, data engineering, exploratory data analysis and machine learning to predict housing prices in New York Tri-State Area.","archived":false,"fork":false,"pushed_at":"2022-12-08T01:37:18.000Z","size":10959,"stargazers_count":92,"open_issues_count":14,"forks_count":19,"subscribers_count":3,"default_branch":"master","last_synced_at":"2025-04-03T01:06:42.642Z","etag":null,"topics":["data-engineering","eda","housing-price-analysis","housing-price-prediction","machine-learning","machine-learning-projects","predictions","random-forest-regressor","scrapy-crawler","spiders","trulia","web-crawler"],"latest_commit_sha":null,"homepage":"","language":"Jupyter Notebook","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":null,"status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/Viveckh.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":null,"code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null}},"created_at":"2019-02-18T15:07:53.000Z","updated_at":"2025-04-02T16:38:28.000Z","dependencies_parsed_at":"2023-01-24T21:00:46.472Z","dependency_job_id":null,"html_url":"https://github.com/Viveckh/LilHomie","commit_stats":null,"previous_names":[],"tags_count":0,"template":false,"template_full_name":null,"purl":"pkg:github/Viveckh/LilHomie","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/Viveckh%2FLilHomie","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/Viveckh%2FLilHomie/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/
Viveckh%2FLilHomie/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/Viveckh%2FLilHomie/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/Viveckh","download_url":"https://codeload.github.com/Viveckh/LilHomie/tar.gz/refs/heads/master","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/Viveckh%2FLilHomie/sbom","scorecard":null,"host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":274261390,"owners_count":25251946,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","status":"online","status_checked_at":"2025-09-09T02:00:10.223Z","response_time":80,"last_error":null,"robots_txt_status":"success","robots_txt_updated_at":"2025-07-24T06:49:26.215Z","robots_txt_url":"https://github.com/robots.txt","online":true,"can_crawl_api":true,"host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["data-engineering","eda","housing-price-analysis","housing-price-prediction","machine-learning","machine-learning-projects","predictions","random-forest-regressor","scrapy-crawler","spiders","trulia","web-crawler"],"created_at":"2024-10-10T10:39:22.775Z","updated_at":"2025-09-09T07:42:29.527Z","avatar_url":"https://github.com/Viveckh.png","language":"Jupyter Notebook","funding_links":[],"categories":[],"sub_categories":[],"readme":"### LilHomie - Housing Price Prediction Rapid Prototype\n\n### Author: [(EJ) Vivek Pandey](https://viveckh.com)\n\nLilHomie is a rapid prototyping project that aims to generate housing appraisals to determine values of properties in the New York Tri-state Area. 
\n\nThis repository contains all the work done so far for that area, including:\n* Web crawler to gather housing data\n* Notebooks for data engineering, EDA, and ML modeling\n* Serverless API setup to make predictions from the serialized models\n* Web app\n\n### Future Enhancements\n* Adding support for crawling and extracting the remaining 3 property page formats on Trulia\n* Adding spiders to the web crawler to extract data from Zillow\n* Speeding up the crawler with distributed spiders\n* Feeding the ML model with data on properties across the US, instead of only the tri-state properties it is currently limited to, and making the necessary adjustments based on the new results (this requires the three enhancements above to be done first)\n\n\n### Questions?\nEmail the author at anton.503.overload@gmail.com","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fviveckh%2Flilhomie","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fviveckh%2Flilhomie","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fviveckh%2Flilhomie/lists"}