Ecosyste.ms: Awesome
An open API service indexing awesome lists of open source software.
https://github.com/eyeseast/dorchester
A toolkit for making dot-density maps in Python
https://github.com/eyeseast/dorchester
Last synced: 13 days ago
JSON representation
A toolkit for making dot-density maps in Python
- Host: GitHub
- URL: https://github.com/eyeseast/dorchester
- Owner: eyeseast
- License: apache-2.0
- Created: 2021-04-07T00:40:21.000Z (over 3 years ago)
- Default Branch: main
- Last Pushed: 2022-03-30T15:37:59.000Z (over 2 years ago)
- Last Synced: 2024-10-28T06:01:35.280Z (17 days ago)
- Language: Jupyter Notebook
- Homepage:
- Size: 3.2 MB
- Stars: 3
- Watchers: 4
- Forks: 0
- Open Issues: 12
-
Metadata Files:
- Readme: README.md
- License: LICENSE
Awesome Lists containing this project
README
# dorchester
[![PyPI](https://img.shields.io/pypi/v/dorchester.svg)](https://pypi.org/project/dorchester/)
[![Changelog](https://img.shields.io/github/v/release/eyeseast/dorchester?include_prereleases&label=changelog)](https://github.com/eyeseast/dorchester/releases)
[![Tests](https://github.com/eyeseast/dorchester/workflows/Test/badge.svg)](https://github.com/eyeseast/dorchester/actions?query=workflow%3ATest)
[![License](https://img.shields.io/badge/license-Apache%202.0-blue.svg)](https://github.com/eyeseast/dorchester/blob/master/LICENSE)A tool for making dot-density maps in Python.
## Caveat emptor
This is very alpha right now. Use at your own risk and evaluate any editorial usage of this of this library before publishing.
## Installation
Install this tool using `pip`:
$ pip install dorchester
## Usage
The main command is `dorchester plot`. That takes an input file, an output file and one or more property keys to extract population counts.
```sh
dorchester plot --help
Usage: dorchester plot [OPTIONS] SOURCE DESTGenerate data for a dot-density map. Input may be any GIS format readable
by Fiona (Shapefile, GeoJSON, etc).Options:
-k, --key TEXT Property name for a population. Use multiple
to map different population classes.-f, --format [csv|geojson|null]
Output format. If not given, will guess
based on output file extension.-m, --mode [w|a|x] File mode for destination [default: w]
--fid TEXT Use a property key (instead of feature.id)
to uniquely identify each feature--coerce Coerce properties passed in --key to
integers. BE CAREFUL. This could cause
incorrect results if misused.--progress Show a progress bar [default: False]
-m, --multiprocessing Use multiprocessing
--help Show this message and exit.
```Input can be in any format readable by [Fiona](https://fiona.readthedocs.io/en/stable/index.html), such as Shapefiles and GeoJSON. The input file needs to contain both population data and boundaries. You may need to join different files together before plotting with `dorchester`.
Output format (`--format`) can be CSV or GeoJSON (more formats coming soon). For GeoJSON, the output will be a stream of newline-delimited `Point` features, like this:
```json
{"type": "Feature", "geometry": {"type": "Point", "coordinates": [76, 38]}, "properties": {"group": "population", "fid": 1}}
{"type": "Feature", "geometry": {"type": "Point", "coordinates": [77, 39]}, "properties": {"group": "population", "fid": 1}}
{"type": "Feature", "geometry": {"type": "Point", "coordinates": [78, 37]}, "properties": {"group": "population", "fid": 1}}
```This will be _big_ files, because we are creating a point for every individual. Massachusetts, for example, had a population of 6.631 million in 2010, which means a dot density CSV file will be 6,336,107 lines long and 305 mb.
Each key (`--key`) should correspond to a property on each feature whose value is a whole number. In a block like this, use `--key POP10` to extract population:
```json
{
"geometry": {
"coordinates": [...],
"type": "Polygon"
},
"id": "0",
"properties": {
"BLOCKCE": "4023",
"BLOCKID10": "250010112004023",
"COUNTYFP10": "001",
"HOUSING10": 16,
"PARTFLG": "N",
"POP10": 12,
"STATEFP10": "25",
"TRACTCE10": "011200"
},
"type": "Feature"
}
```You can pass multiple `--key` options to create different groups that will be layered together. This is how you would create a map showing different racial groups, for example.
The `--mode` option controls how the output file is opened:
- `w` will create or overwrite the output file
- `a` will append to an existing file
- `x` will try to create a new file and fail if that file already existsSetting `--fid` will use a property key to identify each feature, instead of the feature's `id` field (which is often missing, or will be an index number in shapefiles). In the Census block example above, `BLOCKID10` will uniquely identify this block, while `id: 0` only identifies it as the first feature in its source shapefile.
For data sources where properties are encoded as strings, the `--coerce` option will recast anything passed via `--key` to integers. Be careful with this option, as it involves changing data. It will fail (and stop plotting) if it encounters something that can't be coerced into an integer.
Use the `--progress` flag to show a progress bar. This is off by default.
Use `-m` or `--multiprocessing` to use Python's [multiprocessing](https://docs.python.org/3/library/multiprocessing.html) module to significantly speed up point generation. This will try to use every processor on your machine instead of just one.
## Putting points on a map
For small-ish areas, QGIS will render lots of points just fine. Generate points, and load the output as a delimited or GeoJSON file.
To build an interactive dot density map, you can use [tippecanoe](https://github.com/mapbox/tippecanoe) to generate an MBTiles file, which can be uploaded to Mapbox (or possibly other hosting providers). This has worked for me:
```sh
tippecanoe -zg -o points.mbtiles --drop-densest-as-needed --extend-zooms-if-still-dropping points.csv
```## About the name
[Dorchester](https://en.wikipedia.org/wiki/Dorchester,_Boston) is the largest and most diverse neighborhood in Boston, Massachusetts, and is often referred to as Dot.
The name is also a nod to [Englewood](https://github.com/newsapps/englewood), built by the Chicago Tribune News Apps team. This is, hopefully, a worthy successor.
## Development
To contribute to this tool, first checkout the code. Then create a new virtual environment:
cd dorchester
python -m venv .venv
source .venv/bin/activateOr if you are using `pipenv`:
pipenv shell
Now install the dependencies and tests:
pip install -e '.[test]'
To run the tests:
pytest