Ecosyste.ms: Awesome
An open API service indexing awesome lists of open source software.
https://github.com/sunlightpolicy/sql4housing
Create housing databases
census-data cities civic-tech databases housing housing-advocates housing-data hud postgis socrata-open-data-api
- Host: GitHub
- URL: https://github.com/sunlightpolicy/sql4housing
- Owner: sunlightpolicy
- License: mit
- Created: 2019-07-02T19:38:53.000Z (over 5 years ago)
- Default Branch: master
- Last Pushed: 2022-06-21T22:31:39.000Z (over 2 years ago)
- Last Synced: 2024-12-14T00:32:54.073Z (8 days ago)
- Topics: census-data, cities, civic-tech, databases, housing, housing-advocates, housing-data, hud, postgis, socrata-open-data-api
- Language: Python
- Homepage: https://sunlightfoundation.com/2019/07/22/hacking-for-housing-how-open-data-and-civic-hacking-creates-wins-for-housing-advocates/
- Size: 579 KB
- Stars: 6
- Watchers: 4
- Forks: 3
- Open Issues: 2
Metadata Files:
- Readme: README.md
- License: license.txt
# sql4housing
## Background
Sql4housing is based on a broader effort to encourage collaboration between civic hackers and housing advocates. Read more about this work on our blog here:
[Hacking for Housing: How open data and civic hacking creates wins for housing advocates](https://sunlightfoundation.com/2019/07/22/hacking-for-housing-how-open-data-and-civic-hacking-creates-wins-for-housing-advocates/)
[Ownership, evictions, and violations: an overview of housing data use cases](https://sunlightfoundation.com/2019/08/20/ownership-evictions-and-violations-an-overview-of-housing-data-use-cases/)

## Introduction
Sql4housing is based on a cloned copy of The Dallas Morning News' [socrata2sql](https://github.com/DallasMorningNews/socrata2sql), a command-line tool that imports any dataset available through the Socrata API into a SQL database of your choice. Here, I aim to adapt socrata2sql so that it can import datasets from the following sources:
- [HUD's Open Data Portal](https://hudgis-hud.opendata.arcgis.com/)
- Any locally saved Excel file or Excel download hyperlink
- Any locally saved .csv file or .csv download hyperlink
- Any locally saved .shp file or .zip download hyperlink containing a .shp file
- Any locally saved .geojson file or .geojson download hyperlink
- Any dataset on a Socrata open data portal
- Census variables within the 5-year American Community Survey or Decennial Census

## Requirements
- Python 3.x
- Any database supported by SQLAlchemy
- Download package via: `pip install sql4housing`

## Usage
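Because the loader accepts any database supported by SQLAlchemy, the destination is identified by a connection string of the form `dialect+driver://username:password@host:port/database`. The sketch below is an illustration only and is not part of sql4housing: the helper `describe_database_url` is hypothetical, and it uses just the standard library to show which part of the string plays which role. The second URL, including its driver and credentials, is made up for the example.

```python
from urllib.parse import urlsplit


def describe_database_url(database_url):
    """Split a SQLAlchemy-style URL into named parts (hypothetical helper)."""
    parts = urlsplit(database_url)
    # The scheme holds either "dialect" or "dialect+driver".
    dialect, _, driver = parts.scheme.partition("+")
    return {
        "dialect": dialect,
        "driver": driver or None,
        "username": parts.username,
        "password": parts.password,
        "host": parts.hostname,
        "port": parts.port,
        # The database name is the path without its leading slash.
        "database": parts.path.lstrip("/") or None,
    }


# Local database reached through default connection settings.
print(describe_database_url("postgresql:///mydb"))
# Fully specified URL (invented values, for illustration only).
print(describe_database_url(
    "postgresql+psycopg2://user:secret@localhost:5432/housingdb"
))
```

For real validation, SQLAlchemy's own `make_url` (in `sqlalchemy.engine.url`) is likely the better choice, since it understands dialect-specific details that this sketch ignores.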
Changes in usage will be periodically updated and documented within the docstring of [cli.py](https://github.com/sunlightpolicy/sql4housing/blob/master/sql4housing/cli.py)
See [/chicago_example](https://github.com/sunlightpolicy/sql4housing/tree/master/chicago_example) for a detailed use case.
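The subcommands documented in the `cli.py` docstring can also be driven from a Python script, which may help when several datasets are loaded in sequence. The following is a minimal sketch, not part of sql4housing: the helper `build_socrata_command` is hypothetical, the `--a=`, `--d=`, and `--t=` flags are taken from the usage text, and the order of the two positional arguments (site, then dataset ID) is inferred from the documented examples.

```python
import shlex


def build_socrata_command(site, dataset_id, database_url=None,
                          table=None, app_token=None):
    """Assemble the argument list for `sql4housing socrata` (hypothetical helper)."""
    command = ["sql4housing", "socrata", site, dataset_id]
    # Optional flags are added only when a value is supplied,
    # so the CLI's own defaults apply otherwise.
    if app_token:
        command.append(f"--a={app_token}")
    if database_url:
        command.append(f"--d={database_url}")
    if table:
        command.append(f"--t={table}")
    return command


command = build_socrata_command(
    "www.dallasopendata.com",
    "64pp-jeba",
    database_url="postgresql:///mydb",
)
# Print a shell-safe rendering of the command for inspection.
print(shlex.join(command))
# To execute it, pass the list to subprocess.run(command, check=True).
```

Building the command as a list rather than a single string avoids shell quoting problems when a connection string or table name contains special characters.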
"""Housing data to SQL database loader
Load a dataset directly from an API (Socrata, HUD) or file (csv or shp)
into a SQL database. The loader supports any database supported by SQLAlchemy.
This file is adapted from a forked copy of DallasMorningNews/socrata2sql

Usage:
sql4housing bulk_load
sql4housing hud [--d=] [--t=]
sql4housing socrata [--a=] [--d=] [--t=]
sql4housing csv [--d=] [--t=]
sql4housing excel [--d=] [--t=]
sql4housing shp [--d=] [--t=]
sql4housing geojson [--d=] [--t=]
sql4housing census (decennial2010 | (acs [--y=])) (--m= | --c= | --n= | --s= | --p=) [--l=] [--d=] [--t=]
sql4housing (-h | --help)
sql4housing (-v | --version)

Options:
Loads all datasets documented within a file entitled bulk_load.yaml.
Must be run in the same folder where bulk_load.yaml is saved.

The domain for the open data site. For Socrata, this is the
URL to the open data portal (Ex: www.dallasopendata.com).
For HUD, this is the Query URL as created in the API
Explorer portion of each dataset's page on the site
https://hudgis-hud.opendata.arcgis.com. See example use cases
for detailed instructions.

The ID of the dataset on Socrata's open data site. This is
usually a few characters, separated by a hyphen, at the end
of the URL. Ex: 64pp-jeba

Either the path or download URL where the file can be accessed.

--d= Database connection string for destination database as
dialect+driver://username:password@host:port/database.
Default: "postgresql:///mydb"
--t= Destination table in the database. Defaults to a sanitized
version of the dataset or file's name.
--a= App token for the Socrata site. Only necessary for
high-volume requests. Default: None
--y= Optional year specification for the 5-year American Community
Survey. Defaults to 2017.
--m= The metropolitan statistical area to include.
Ex: --m="new york-newark-jersey city"
--c= The combined statistical area to include.
Ex: --c="New York-Newark, NY-NJ-CT-PA"
--n= The county to include.
Ex: --n="cook county, IL"
--s= The state to include.
Ex: --s="illinois"
--p= The census place to include.
Ex: --p="chicago, IL"
--l= The geographic level at which to extract data, e.g. tract,
block, county, region, division. Refer to the cenpy documentation
to learn more: https://github.com/cenpy-devs/cenpy
-h --help Show this screen.
-v --version Show version.

Examples:
Load the Dallas check register into a local SQLite file (file name chosen
from the dataset name):
$ sql4housing socrata www.dallasopendata.com 64pp-jeba

Load it into a PostgreSQL database called mydb:
$ sql4housing socrata www.dallasopendata.com 64pp-jeba --d="postgresql:///mydb"

Load Public Housing Buildings from HUD into a PostgreSQL database called mydb:
$ sql4housing hud "https://services.arcgis.com/VTyQ9soqVukalItT/arcgis/rest/services/Public_Housing_Buildings/FeatureServer/0/query?outFields=*&where=1%3D1" --d=postgresql:///mydb

Load Public Housing Physical Inspection scores into a PostgreSQL database called housingdb:
$ sql4housing excel "http://www.huduser.org/portal/datasets/pis/public_housing_physical_inspection_scores.xlsx" --d=postgresql:///housingdb
"""
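The `hud` and `socrata` subcommands each need a value copied from the data portal: a Query URL for HUD and a dataset ID for Socrata. As a rough sketch of how those values are shaped, the hypothetical helpers below (not part of sql4housing) assume that a Socrata ID has the common four-character, hyphen, four-character form, and that a HUD Query URL is a FeatureServer layer URL followed by `/query` and the two parameters seen in the HUD example above. The path of the Dallas dataset URL is invented for illustration.

```python
import re
from urllib.parse import urlencode, urlsplit

# Assumed shape of a Socrata dataset ID, e.g. 64pp-jeba.
SOCRATA_ID = re.compile(r"[a-z0-9]{4}-[a-z0-9]{4}")


def socrata_id_from_url(dataset_url):
    """Return the dataset ID at the end of a Socrata dataset URL, or None."""
    last_segment = urlsplit(dataset_url).path.rstrip("/").rsplit("/", 1)[-1]
    return last_segment if SOCRATA_ID.fullmatch(last_segment) else None


def hud_query_url(layer_url, where="1=1", out_fields="*"):
    """Append the query parameters from the HUD example to a layer URL."""
    # safe="*" keeps the wildcard readable; "=" in the where clause is escaped.
    query = urlencode({"outFields": out_fields, "where": where}, safe="*")
    return f"{layer_url.rstrip('/')}/query?{query}"


# Hypothetical dataset page URL ending in the ID used in the examples.
print(socrata_id_from_url(
    "https://www.dallasopendata.com/Budget-Finance/Check-Register/64pp-jeba"
))
print(hud_query_url(
    "https://services.arcgis.com/VTyQ9soqVukalItT/arcgis/rest/services/"
    "Public_Housing_Buildings/FeatureServer/0"
))
```

In practice the Query URL is best copied from the API Explorer on the dataset's page, as the docstring recommends; the builder only shows how the example URL is put together.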