https://github.com/henleykuang/rent_price_collection
Rent Price Collection is an end-to-end collection process to collect rental data from Trulia and Zillow and a web application to visualize the collected data.
- Host: GitHub
- URL: https://github.com/henleykuang/rent_price_collection
- Owner: HenleyKuang
- Created: 2019-08-28T01:30:31.000Z (over 5 years ago)
- Default Branch: master
- Last Pushed: 2022-12-27T15:36:43.000Z (about 2 years ago)
- Last Synced: 2024-11-22T13:44:58.786Z (3 months ago)
- Topics: javascript, mariadb, mysql, python, python27, react
- Language: Python
- Homepage:
- Size: 164 KB
- Stars: 0
- Watchers: 2
- Forks: 1
- Open Issues: 3
Metadata Files:
- Readme: README.rst
=====================
Rent Price Collection
=====================
By Henley Kuang

Pre-requisites
--------------

.. code-block:: bash

    https://www.python.org/downloads/release/python-2712/
    https://pip.pypa.io/en/stable/installing/
    https://sourceforge.net/projects/mysql-python/
    https://nodejs.org/en/download/
    https://www.atlassian.com/git/tutorials/install-git#windows

Installation
------------

.. code-block:: bash

    git clone https://github.com/HenleyKuang/rent_price_collection.git
    cd rent_price_collection
    pip install -r requirements.txt

Configuration
-------------

Set up your configuration for email notifications and for the MySQL connection used to store the crawled data:


.. code-block:: python

    # To send gmail notifications
    GMAIL_USER = '@gmail.com'
    GMAIL_PASSWORD = ''
    GMAIL_SENT_TO_EMAILS = ['@gmail.com'] # [..., ..., ...]

    # mysql connection for storage of crawled data
    MYSQL_HOST = ''
    MYSQL_USER = ''
    MYSQL_PASS = ''
    MYSQL_DB = ''
    MYSQL_PORT =

Create the MySQL tables for the collected RPC data: one for Trulia, one for Zillow, and one for the merged listings.


.. code-block:: sql

    CREATE TABLE `trulia_listings` (
    `listing_id` bigint(20) unsigned NOT NULL,
    `card_url` varchar(255) NOT NULL,
    `street_address` varchar(255) NOT NULL,
    `city` varchar(255) NOT NULL,
    `state` varchar(255) NOT NULL,
    `zip_code` mediumint(6) signed NOT NULL,
    `beds` varchar(255) NOT NULL,
    `baths` varchar(255) NOT NULL,
    `lat` double(10,7) NOT NULL,
    `lng` double(10,7) NOT NULL,
    `sqft` varchar(255) NOT NULL,
    `price` varchar(255) NOT NULL,
    `date_collected` datetime DEFAULT CURRENT_TIMESTAMP,
    `date_updated` datetime DEFAULT CURRENT_TIMESTAMP ON UPDATE CURRENT_TIMESTAMP,
    PRIMARY KEY (`listing_id`)
    ) ENGINE=InnoDB DEFAULT CHARSET=latin1 ROW_FORMAT=COMPRESSED;

    CREATE TABLE `zillow_listings` (
    `listing_id` varchar(255) NOT NULL,
    `detail_url` varchar(255) NOT NULL,
    `street_address` varchar(255) NOT NULL,
    `city` varchar(255) NOT NULL,
    `state` varchar(255) NOT NULL,
    `building_name` varchar(255) NOT NULL,
    `zip_code` mediumint(6) signed NOT NULL,
    `beds` varchar(255) NOT NULL,
    `baths` varchar(255) NOT NULL,
    `lat` double(10,7) NOT NULL,
    `lng` double(10,7) NOT NULL,
    `sqft` varchar(255) NOT NULL,
    `price` varchar(255) NOT NULL,
    `date_collected` datetime DEFAULT CURRENT_TIMESTAMP,
    `date_updated` datetime DEFAULT CURRENT_TIMESTAMP ON UPDATE CURRENT_TIMESTAMP,
    PRIMARY KEY (`listing_id`)
    ) ENGINE=InnoDB DEFAULT CHARSET=latin1 ROW_FORMAT=COMPRESSED;

    CREATE TABLE `all_listings` (
    `id` varchar(255) NOT NULL,
    `source` varchar(255) NOT NULL,
    `url` varchar(255) NOT NULL,
    `street_address` varchar(255) NOT NULL,
    `city` varchar(255) NOT NULL,
    `state` varchar(255) NOT NULL,
    `zip_code` varchar(10) NOT NULL,
    `beds` varchar(255) NOT NULL,
    `baths` varchar(255) NOT NULL,
    `sqft` varchar(255) NOT NULL,
    `price` varchar(255) NOT NULL,
    `date_collected` datetime DEFAULT CURRENT_TIMESTAMP,
    `date_updated` datetime DEFAULT CURRENT_TIMESTAMP ON UPDATE CURRENT_TIMESTAMP,
    PRIMARY KEY (`id`)
    ) ENGINE=InnoDB DEFAULT CHARSET=latin1 ROW_FORMAT=COMPRESSED;

Options
-------

.. csv-table::
    :header: "Name", "Description", "Required", "Type", "Accepted Values", "Default Value"
    :widths: 30, 75, 5, 5, 5, 5

    "--location", "Location Name", "Yes", "String", "", ""
    "--start-page-num", "Page number to start crawling from", "No", "Integer", "", "1"
    "--proxy-ip", "Proxy Host [Optional]", "No", "String", "", ""
    "--proxy-port", "Proxy Port [Optional]", "No", "Integer", "", ""
    "--proxy-user", "Proxy User Auth for proxy [Optional]", "No", "String", "", ""
    "--proxy-pass", "Proxy Password Auth for proxy [Optional]", "No", "String", "", ""

How to Run
----------

1. Collect Data


.. code-block:: bash

    python .\rent_price_collection\app\trulia_rpc.py --location "Hanover Park,IL"
    python .\rent_price_collection\app\trulia_rpc.py --location "Bolingbrook,IL"
    python .\rent_price_collection\app\trulia_rpc.py --location "Round Lake,IL"
    python .\rent_price_collection\app\trulia_rpc.py --location-file "location_file.txt"

    python .\rent_price_collection\app\zillow_rpc.py --location "Round Lake,IL"
    python .\rent_price_collection\app\zillow_rpc.py --location-file "location_file.txt"

2. Post Process Data (Merge data into 1 table)


.. code-block:: bash

    python .\rent_price_collection\storage\all_listings_mysql.py union

3. Run API & Database UI

.. code-block:: bash

    .\rent_price_collection\scripts\run_api_and_ui.sh
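
The merge in step 2 is not described beyond the ``union`` command, so the following is only a rough sketch of what such a step could look like, based on the three table definitions above. The names ``to_all_listings_row`` and ``union_rows`` are hypothetical and do not come from the repository, and the zero-padding of zip codes is an assumption; the real ``all_listings_mysql.py`` may work differently. The sketch operates on plain dictionaries rather than a live MySQL connection.

.. code-block:: python

    # Hypothetical sketch; the real all_listings_mysql.py may work differently.

    # Per-source names of the id and URL columns, from the CREATE TABLE statements.
    SOURCE_COLUMNS = {
        "trulia": {"id": "listing_id", "url": "card_url"},
        "zillow": {"id": "listing_id", "url": "detail_url"},
    }

    # Columns that share a name between the source tables and all_listings.
    SHARED_COLUMNS = ("street_address", "city", "state",
                      "beds", "baths", "sqft", "price")


    def to_all_listings_row(source, row):
        """Map one trulia_listings or zillow_listings row to the all_listings shape."""
        names = SOURCE_COLUMNS[source]
        merged = {
            "id": str(row[names["id"]]),  # all_listings.id is a varchar
            "source": source,
            "url": row[names["url"]],
            # Assumption: zip codes are zero-padded to five digits when moving
            # from the integer column to the varchar(10) column.
            "zip_code": "%05d" % int(row["zip_code"]),
        }
        for column in SHARED_COLUMNS:
            merged[column] = row[column]
        # lat, lng and building_name are dropped: all_listings has no such columns.
        return merged


    def union_rows(trulia_rows, zillow_rows):
        """Combine rows from both source tables into one all_listings-shaped list."""
        merged = [to_all_listings_row("trulia", row) for row in trulia_rows]
        merged.extend(to_all_listings_row("zillow", row) for row in zillow_rows)
        return merged


    if __name__ == "__main__":
        # Made-up sample row, for illustration only.
        sample = {
            "listing_id": 123, "card_url": "https://example.com/t/123",
            "street_address": "1 Main St", "city": "Round Lake", "state": "IL",
            "zip_code": 60073, "beds": "2", "baths": "1",
            "lat": 42.35, "lng": -88.09, "sqft": "900", "price": "$1,200",
        }
        for listing in union_rows([sample], []):
            print(listing["source"], listing["id"], listing["zip_code"])

Because both source tables use their own ``listing_id`` values, a real merge would also need to decide how to handle an id that appears in both sources; this sketch leaves that open.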