https://github.com/deep-diver/data-wrangling-on-openstreemap
- Host: GitHub
- URL: https://github.com/deep-diver/data-wrangling-on-openstreemap
- Owner: deep-diver
- Created: 2017-08-20T22:56:24.000Z (over 7 years ago)
- Default Branch: master
- Last Pushed: 2017-09-14T03:59:11.000Z (over 7 years ago)
- Last Synced: 2024-10-04T13:42:21.860Z (3 months ago)
- Topics: audit, data-wrangling, open-street-map, seattle
- Language: Jupyter Notebook
- Size: 1.69 MB
- Stars: 2
- Watchers: 2
- Forks: 0
- Open Issues: 0
Metadata Files:
- Readme: README.md
README
# Data Wrangling on Open Street Map Data in the Seattle Area
## What files does this project include?
- wrangling.py
  - A standalone Python source file. It consists of the following functions:
    - audit_speed(), audit_phone(), audit_street_name() audit selected fields.
    - update_speed(), update_phone(), update_street_name() make inconsistent values consistent.
    - shape_tag_element(), shape_tag_elements(), shape_nd_element(), shape_common_for_node_and_way(), shape_element() extract, audit, and update the given data and store it in dictionaries.
    - process_map() writes the reorganized data to separate CSV files for later import into a SQLite database.
    - generate_sample() generates a sample OSM data file from the original, since the original file is too big to audit at first.
- wrangle_us.ipynb
  - This notebook describes in more detail, step by step, how I analyzed, audited, and updated the data and wrote the SQL queries.
- wrangle_seattle.pdf
  - This file is a converted version of the wrangle_us.ipynb file.

## About the data
- Open Street Map
  - I chose this kind of data to practice data wrangling because it is not machine generated; many people contributed to the data set. That means it contains a lot of inconsistent data, so examining and reorganizing it is good data wrangling practice.
- Map area
  - Seattle
    - The data can be downloaded here (https://mapzen.com/data/metro-extracts/metro/seattle_washington/)
    - Mapzen provides pre-generated datasets for popular areas.
  - Why Seattle?
    - Just out of personal interest, since I spent a year as an exchange student at UW in 2011.

## Reference Sites
- Open Street Map (https://www.openstreetmap.org)
- Mapzen (https://mapzen.com/data/metro-extracts/)
- Open Street Map Wiki (https://wiki.openstreetmap.org)
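
To illustrate the audit/update pairing listed for wrangling.py, here is a minimal sketch of how a street-name audit and update could work. It is not the project's actual code: the function names are borrowed from the README, but their signatures, the EXPECTED and MAPPING values, the regular expression, and the SAMPLE_OSM snippet are all assumptions made for this example.

```python
import io
import re
import xml.etree.ElementTree as ET
from collections import defaultdict

# Hypothetical values for illustration; the project's real lists are not shown in the README.
EXPECTED = {"Street", "Avenue", "Boulevard", "Drive", "Road", "Way", "Place", "Lane"}
MAPPING = {
    "St": "Street", "St.": "Street",
    "Ave": "Avenue", "Ave.": "Avenue",
    "Rd": "Road", "Blvd": "Boulevard",
}

# Matches the last whitespace-delimited token of a street name.
STREET_TYPE_RE = re.compile(r"\S+$")

# Tiny made-up OSM fragment so the sketch can run without the real extract.
SAMPLE_OSM = b"""<osm>
  <node id="1" lat="47.66" lon="-122.31">
    <tag k="addr:street" v="NE 45th St"/>
  </node>
  <node id="2" lat="47.65" lon="-122.30">
    <tag k="addr:street" v="University Ave."/>
    <tag k="name" v="Cafe"/>
  </node>
  <way id="3">
    <tag k="addr:street" v="Pine Street"/>
  </way>
</osm>"""


def audit_street_name(osm_file):
    """Group street names by their last word when that word is not expected."""
    unexpected = defaultdict(set)
    # iterparse streams the file, which matters for large OSM extracts.
    for _, elem in ET.iterparse(osm_file, events=("end",)):
        if elem.tag == "tag" and elem.get("k") == "addr:street":
            name = elem.get("v", "")
            match = STREET_TYPE_RE.search(name)
            if match and match.group() not in EXPECTED:
                unexpected[match.group()].add(name)
    return dict(unexpected)


def update_street_name(name, mapping=MAPPING):
    """Replace a known abbreviation at the end of the name; otherwise return it unchanged."""
    match = STREET_TYPE_RE.search(name)
    if match and match.group() in mapping:
        return name[: match.start()] + mapping[match.group()]
    return name


if __name__ == "__main__":
    for street_type, names in audit_street_name(io.BytesIO(SAMPLE_OSM)).items():
        for name in names:
            print(street_type, "|", name, "->", update_street_name(name))
```

This sketch only looks at the last word of a name, so a name with a trailing directional such as "1st Ave S" is reported by the audit but left unchanged by the update. A fuller version would need to handle such suffixes explicitly.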