https://github.com/jplusplus/overlap

Extrapolate data from one set of administrative entities to another

Host: GitHub
URL: https://github.com/jplusplus/overlap
Owner: jplusplus
Created: 2015-03-30T10:35:46.000Z (about 11 years ago)
Default Branch: master
Last Pushed: 2015-03-30T10:48:06.000Z (about 11 years ago)
Last Synced: 2024-04-14T04:55:28.500Z (about 2 years ago)
Language: Python
Size: 125 KB
Stars: 0
Watchers: 12
Forks: 0
Open Issues: 0
Metadata Files:
- Readme: readme.md

The problem
-----------
You have two different, overlapping administrative divisions, and want statistics based on one of them extrapolated to the other.

Case: We have hardly any demographic data for the 15,000 or so Swedish postal codes. On the other hand, we have plenty of interesting data on the ≈ 6,000 electoral districts. Given a fairly large dataset where we know the postal codes, we can extrapolate statistics from the electoral districts and get a fair approximation.

The solution
------------
Use your favourite GIS software to intersect the two administrative systems. Create a .dbf (QGIS) or .csv file containing an area column for the intersections. Run the file through `create_factors.py` to create a table of weighting factors, then run your statistics through `run_stats.py` to apply them.

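This will obviously produce useful results only for fairly small and homogeneous administrative entities and fairly large datasets: because the factors are based on area alone, the method treats whatever you are counting as evenly spread within each entity. Common sense is your friend here.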
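
As a rough illustration of what a table of weighting factors could look like, the sketch below computes, for every target entity, the share of each source entity's area that falls inside it. This is not the code in `create_factors.py`: the function name `area_factors`, the column names, and the sample areas are all made up for illustration, and the area-share definition is an assumption inferred from the example further down.

```python
import csv
import io
from collections import defaultdict


def area_factors(rows, target_key, source_key, area_key):
    """Return {target: {source: share of the source's area inside target}}.

    Hypothetical helper, not part of the repository.
    """
    rows = list(rows)

    # Total intersected area of each source entity.
    source_area = defaultdict(float)
    for row in rows:
        source_area[row[source_key]] += float(row[area_key])

    factors = defaultdict(dict)
    for row in rows:
        total = source_area[row[source_key]]
        if total <= 0:
            continue  # no usable area for this source entity
        share = float(row[area_key]) / total
        by_source = factors[row[target_key]]
        # Sum shares in case one pair is split over several polygons.
        by_source[row[source_key]] = by_source.get(row[source_key], 0.0) + share
    return dict(factors)


# Made-up intersection areas, chosen only to give simple proportions.
intersections = io.StringIO(
    "county,province,area\n"
    "Värmlands län,Värmland,100\n"
    "Värmlands län,Dalarna,3\n"
    "Dalarnas län,Dalarna,97\n"
)
factors = area_factors(
    csv.DictReader(intersections), "county", "province", "area"
)
```

With these inputs, the factors for each source entity sum to 1 across all target entities, so totals are preserved when the factors are applied.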

Example
-------
We have two administrative systems: *counties* and *provinces*. We know the number of camels in each *province*:

    province, num_camels
    Värmland, 12
    Dalarna, 20

Now we want to know the approximate number of camels in each *county*.

1. Using QGIS, we produce a .dbf file with all intersections, including an `area` column.

2. Then we run `weighted_data --id_1=county --id_2=province --area=area` to produce a JSON file, `factors.json`, with weighting factors:

`"Värmlands län": {"Värmland": 1, "Dalarna": 0.03},`
`"Dalarnas län": {"Dalarna": 0.97}`

3. Finally, we run our camel data through this filter, `run_stats --id=province --value=num_camels --factors=factors.json --input=input.csv`, to get the approximate count of camels in each county:

`Värmlands län, 12.6`
`Dalarnas län, 19.4`
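
The arithmetic behind step 3 can be checked by hand: each county's estimate is the sum of the province values multiplied by the matching factors, so Värmlands län gets 12 × 1 + 20 × 0.03 = 12.6 and Dalarnas län gets 20 × 0.97 = 19.4. A minimal Python sketch of that calculation follows. It is not the code in `run_stats.py`; the function name `apply_factors` is hypothetical, and provinces missing from the input are assumed to count as zero.

```python
import csv
import io


def apply_factors(rows, factors, id_key, value_key):
    """Estimate a value per target entity as a weighted sum of source values.

    Hypothetical helper, not part of the repository.
    """
    values = {row[id_key]: float(row[value_key]) for row in rows}
    return {
        target: sum(
            weight * values.get(source, 0.0)  # assumption: missing means 0
            for source, weight in weights.items()
        )
        for target, weights in factors.items()
    }


# Weighting factors and camel counts from the example above.
factors = {
    "Värmlands län": {"Värmland": 1, "Dalarna": 0.03},
    "Dalarnas län": {"Dalarna": 0.97},
}
camels = io.StringIO(
    "province, num_camels\n"
    "Värmland, 12\n"
    "Dalarna, 20\n"
)
# skipinitialspace drops the blank after each comma in the sample file.
reader = csv.DictReader(camels, skipinitialspace=True)
estimates = apply_factors(reader, factors, "province", "num_camels")

for county, estimate in estimates.items():
    print(f"{county}, {estimate:.1f}")
```

The two estimates add up to 32, the same as the province totals, because the factors for each province sum to 1.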
