Ecosyste.ms: Awesome

An open API service indexing awesome lists of open source software.

Awesome Lists | Featured Topics | Projects

https://github.com/kraina-ai/overturemaestro

An open-source tool for reading OvertureMaps data with multiprocessing and additional Quality-of-Life features
https://github.com/kraina-ai/overturemaestro

geo geospatial open-source openstreetmap overture-maps overturemaps pyarrow python

Last synced: about 2 months ago
JSON representation

An open-source tool for reading OvertureMaps data with multiprocessing and additional Quality-of-Life features

Awesome Lists containing this project

README

        




Generated using DALLΒ·E 3 model with this prompt: Cute stylized conducting virtuoso using a paper map as music sheet. White background, minimalistic, vector graphics, clean background, encased in a circle. In navy and gold colours. Logo for a python library, should work well as small icon.


GitHub
Checks

GitHub Workflow Status - DEV


GitHub Workflow Status - PROD


pre-commit.ci status

CodeFactor Grade
Codecov

Package version


Supported Python versions


PyPI - Downloads

# OvertureMaestro

An open-source tool for reading OvertureMaps data with multiprocessing and additional Quality-of-Life features.

## What is **OvertureMaestro** 🎼🌍?

- Scalable reader for OvertureMaps data.
- Is based on top of `PyArrow`[^1].
- Saves files in the `GeoParquet`[^2] file format for easier integration with modern cloud stacks.
- Filters data based on geometry.
- Can filter data using PyArrow expressions.
- Utilizes multiprocessing for faster data download.
- Utilizes dedicated index of all features in the Overture Maps dataset to download only specific parts based on the geometry filter.
- Utilizes caching to reduce repeatable computations.
- Can be used as Python module as well as a beautiful CLI based on `Typer`[^3].

[^1]: [PyArrow Website](https://arrow.apache.org/docs/python/)
[^2]: [GeoParquet data format](https://geoparquet.org/)
[^3]: [Typer docs](https://typer.tiangolo.com/)

## Installing

### As pure Python module

```
pip install overturemaestro
```

### With beautiful CLI

```
pip install overturemaestro[cli]
```

### Required Python version?

OvertureMaestro supports **Python >= 3.9**

### Dependencies

Required:

- `overturemaps (>=0.8.0)`: Reusing oficial CLI library with dedicated schema related functions

- `pyarrow (>=16.0.0)`: For OvertureMaps GeoParquet dataset wrangling

- `geopandas (>=1.0)`: For returning GeoDataFrames and reading Geo files

- `shapely (>=2.0)`: For parsing WKT and GeoJSON strings and filtering data with STRIndex

- `geoarrow-rust-core (>=0.3.0)`: For transforming Arrow data to Shapely objects

- `pooch (>=1.6.0)`: For downloading precalculated dataset indexes

- `rich (>=12.0.0)`: For showing progress bars

- `fsspec (>=2021.04.0)` & `aiohttp (>=3.8.0)`: For accessing AWS S3 datasets in PyArrow and GitHub files for precalculated datasets

- `geopy (>=2.0.0)`: For geocoding of strings

Optional:

- `typer[all] (>=0.9.0)` (click, colorama, rich, shellingham): Required in CLI

- `h3 (>=4.0.0b1)`: For reading H3 strings. Required in CLI

- `s2 (>=0.1.9)`: For transforming S2 indexes into geometries. Required in CLI

- `python-geohash (>=0.8)`: For transforming GeoHash indexes into geometries. Required in CLI

- `scikit-learn (>=1.0)`: For clustering geometries when generating release index. Required for generating release index

- `polars (>=0.20.4)`: For calculating total bounding box from many bounding boxes. Required for generating release index

## Usage

TODO