Ecosyste.ms: Awesome
An open API service indexing awesome lists of open source software.
https://github.com/kraina-ai/overturemaestro
An open-source tool for reading OvertureMaps data with multiprocessing and additional Quality-of-Life features
geo geospatial open-source openstreetmap overture-maps overturemaps pyarrow python
- Host: GitHub
- URL: https://github.com/kraina-ai/overturemaestro
- Owner: kraina-ai
- License: apache-2.0
- Created: 2024-08-14T20:27:28.000Z (3 months ago)
- Default Branch: main
- Last Pushed: 2024-10-26T21:34:31.000Z (20 days ago)
- Last Synced: 2024-10-28T01:00:39.055Z (19 days ago)
- Topics: geo, geospatial, open-source, openstreetmap, overture-maps, overturemaps, pyarrow, python
- Language: Python
- Homepage: https://kraina-ai.github.io/overturemaestro/
- Size: 1.14 MB
- Stars: 13
- Watchers: 3
- Forks: 0
- Open Issues: 12
Metadata Files:
- Readme: README.md
- Changelog: CHANGELOG.md
- License: LICENSE
Generated using DALL·E 3 model with this prompt: Cute stylized conducting virtuoso using a paper map as music sheet. White background, minimalistic, vector graphics, clean background, encased in a circle. In navy and gold colours. Logo for a python library, should work well as small icon.

# OvertureMaestro
An open-source tool for reading OvertureMaps data with multiprocessing and additional Quality-of-Life features.
## What is **OvertureMaestro** 🎼?
- Scalable reader for OvertureMaps data.
- Built on top of `PyArrow`[^1].
- Saves files in the `GeoParquet`[^2] file format for easier integration with modern cloud stacks.
- Filters data based on geometry.
- Can filter data using PyArrow expressions.
- Utilizes multiprocessing for faster data download.
- Utilizes a dedicated index of all features in the Overture Maps dataset to download only the parts that match the geometry filter.
- Utilizes caching to avoid repeated computations.
- Can be used as a Python module as well as a beautiful CLI based on `Typer`[^3].

[^1]: [PyArrow Website](https://arrow.apache.org/docs/python/)
[^2]: [GeoParquet data format](https://geoparquet.org/)
[^3]: [Typer docs](https://typer.tiangolo.com/)

## Installing
### As pure Python module
```
pip install overturemaestro
```

### With beautiful CLI
```
pip install overturemaestro[cli]
```

### Required Python version?
OvertureMaestro supports **Python >= 3.9**
### Dependencies
Required:
- `overturemaps (>=0.8.0)`: Reuses the official CLI library and its dedicated schema-related functions
- `pyarrow (>=16.0.0)`: For OvertureMaps GeoParquet dataset wrangling
- `geopandas (>=1.0)`: For returning GeoDataFrames and reading Geo files
- `shapely (>=2.0)`: For parsing WKT and GeoJSON strings and filtering data with `STRtree`
- `geoarrow-rust-core (>=0.3.0)`: For transforming Arrow data to Shapely objects
- `pooch (>=1.6.0)`: For downloading precalculated dataset indexes
- `rich (>=12.0.0)`: For showing progress bars
- `fsspec (>=2021.04.0)` & `aiohttp (>=3.8.0)`: For accessing AWS S3 datasets in PyArrow and GitHub files for precalculated datasets
- `geopy (>=2.0.0)`: For geocoding of strings
Optional:
- `typer[all] (>=0.9.0)` (click, colorama, rich, shellingham): Required in CLI
- `h3 (>=4.0.0b1)`: For reading H3 strings. Required in CLI
- `s2 (>=0.1.9)`: For transforming S2 indexes into geometries. Required in CLI
- `python-geohash (>=0.8)`: For transforming GeoHash indexes into geometries. Required in CLI
- `scikit-learn (>=1.0)`: For clustering geometries. Required for generating a release index
- `polars (>=0.20.4)`: For calculating the total bounding box from many bounding boxes. Required for generating a release index
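
To see quickly whether an environment meets the requirements above, a small standard-library script can help. This is a minimal sketch, not part of OvertureMaestro: the helper names `python_is_supported` and `installed_versions` are made up for illustration, and the script only reports which versions are installed without comparing them against the minimum versions listed.

```python
import sys
from importlib.metadata import PackageNotFoundError, version

# Distribution names copied from the "Required" list above.
REQUIRED = [
    "overturemaps", "pyarrow", "geopandas", "shapely", "geoarrow-rust-core",
    "pooch", "rich", "fsspec", "aiohttp", "geopy",
]


def python_is_supported(version_info=sys.version_info):
    """Return True if the interpreter is Python 3.9 or newer."""
    return tuple(version_info[:2]) >= (3, 9)


def installed_versions(packages):
    """Map each distribution name to its installed version, or None if missing."""
    found = {}
    for name in packages:
        try:
            found[name] = version(name)
        except PackageNotFoundError:
            found[name] = None
    return found


print("Python supported:", python_is_supported())
for name, installed in installed_versions(REQUIRED).items():
    print(f"{name}: {installed or 'not installed'}")
```

A package reported as `not installed` would normally be pulled in automatically by `pip install overturemaestro`; the optional packages can be checked the same way by passing their names to `installed_versions`.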
## Usage
TODO
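
To illustrate the idea behind the index-based download mentioned in the feature list, here is a minimal pure-Python sketch. It does not use OvertureMaestro's API and does not reflect its real index format: `BBox`, `IndexEntry`, `select_files`, the file names, and the coordinates are all hypothetical. It only shows how a precalculated index of bounding boxes lets a reader skip files that cannot contain features inside the geometry filter.

```python
from typing import NamedTuple


class BBox(NamedTuple):
    xmin: float
    ymin: float
    xmax: float
    ymax: float


class IndexEntry(NamedTuple):
    # Hypothetical index row: a data file and the bounding box of its features.
    file_name: str
    bbox: BBox


def intersects(a: BBox, b: BBox) -> bool:
    """Return True if two bounding boxes overlap or touch."""
    return (
        a.xmin <= b.xmax and b.xmin <= a.xmax
        and a.ymin <= b.ymax and b.ymin <= a.ymax
    )


def select_files(index: list[IndexEntry], query: BBox) -> list[str]:
    """Keep only the files whose bounding box intersects the query box."""
    return [entry.file_name for entry in index if intersects(entry.bbox, query)]


# Made-up index with three files covering different regions.
index = [
    IndexEntry("part-000.parquet", BBox(-10.0, 35.0, 5.0, 45.0)),
    IndexEntry("part-001.parquet", BBox(5.0, 45.0, 25.0, 60.0)),
    IndexEntry("part-002.parquet", BBox(100.0, -45.0, 155.0, -10.0)),
]

# Bounding box of the geometry filter (longitude/latitude order).
query = BBox(16.8, 51.0, 17.2, 51.2)
selected = select_files(index, query)
print(selected)
```

Only the selected files would then need to be downloaded and filtered precisely against the actual geometry, which is where the savings over scanning the whole dataset come from.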