https://github.com/kalisio/kazarr
Lightweight service for reading and transforming Zarr data
- Host: GitHub
- URL: https://github.com/kalisio/kazarr
- Owner: kalisio
- License: mit
- Created: 2025-11-26T08:36:38.000Z (6 months ago)
- Default Branch: master
- Last Pushed: 2025-12-24T08:48:51.000Z (5 months ago)
- Last Synced: 2025-12-25T04:42:43.837Z (5 months ago)
- Topics: access, processing, python3, zarr
- Language: Python
- Homepage:
- Size: 77.1 KB
- Stars: 5
- Watchers: 0
- Forks: 0
- Open Issues: 0
Metadata Files:
- Readme: README.md
- License: LICENSE
# kazarr
A lightweight **FastAPI** service that exposes endpoints to interact with **Zarr datasets** stored in an **S3 (Simple Storage Service)** bucket:
- a **datasets** endpoint to explore available multi-dimensional arrays,
- an **extraction** endpoint to slice and dice data,
- a **probe** endpoint to query specific values at given coordinates,
- an **isoline** endpoint to compute contour lines dynamically.
## API
> [!TIP]
> Auto-generated API documentation is available at the `/docs` and `/redoc` endpoints.
### /health (GET)
Check the service's health. Returns a JSON object with a single member, `status`.
### /datasets (GET)
Return a list of all available Zarr datasets with their id and description.
### /datasets/{dataset} (GET)
Return metadata (dimensions, variables, attributes) for a specific Zarr dataset.
The `dataset` path parameter is the dataset id, as returned by the `/datasets` endpoint.
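
As an illustration, the two dataset endpoints could be queried from Python with the standard library alone. This is a minimal sketch, not part of kazarr: the helper names `dataset_url` and `get_json` are hypothetical, the base URL is assumed from the default `HOSTNAME` and `PORT` values, and nothing is assumed about the response beyond it being JSON.

```python
import json
from urllib.parse import quote
from urllib.request import urlopen

# Assumption: the service is reachable at the default HOSTNAME and PORT.
BASE_URL = "http://localhost:8000"


def dataset_url(dataset_id=None, base_url=BASE_URL):
    """Build the URL for /datasets, or /datasets/{dataset} when an id is given."""
    url = f"{base_url.rstrip('/')}/datasets"
    if dataset_id is not None:
        # Percent-encode the id so it is safe as a single path segment.
        url += "/" + quote(str(dataset_id), safe="")
    return url


def get_json(url, timeout=10):
    """Send a GET request and decode the JSON response body."""
    with urlopen(url, timeout=timeout) as response:
        return json.load(response)
```

With a running service, `get_json(dataset_url())` would list the datasets, and `get_json(dataset_url(some_id))` would return the metadata of one of them.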
### /datasets/{dataset}/extract (GET)
Extracts a subset of the data based on a bounding box and a specific variable.
> [!WARNING]
> Large extractions may impact performance. Be mindful of the bounding box size for high-resolution datasets.
The `extract` endpoint accepts the following query parameters:
| Name | Description | Optional |
|------------|----------------------------------------------|:--------:|
| `variable` | The variable to extract. | ✗ |
| `lon_min` | Minimum longitude of the bounding box. | ✓ |
| `lat_min` | Minimum latitude of the bounding box. | ✓ |
| `lon_max` | Maximum longitude of the bounding box. | ✓ |
| `lat_max` | Maximum latitude of the bounding box. | ✓ |
| `time` | The time value/slice to extract. | ✓ |
> [!IMPORTANT]
> Depending on your dataset, you may need to specify additional, dataset-specific variables or dimensions. Pass them as extra query parameters, e.g. `&my_additional_variable={VALUE}`.
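
The query string for an extraction can be assembled programmatically. The sketch below is illustrative only: `extract_url` is a hypothetical helper, the base URL is assumed from the default `HOSTNAME` and `PORT`, and the accepted format of `time` values is not specified here, so the ISO-like string in the usage note is an assumption.

```python
from urllib.parse import quote, urlencode


def extract_url(dataset_id, variable, bbox=None, time=None,
                base_url="http://localhost:8000", **extra_dims):
    """Build a URL for GET /datasets/{dataset}/extract."""
    params = {"variable": variable}  # the only required parameter
    if bbox is not None:
        # Order used by this helper: (lon_min, lat_min, lon_max, lat_max).
        lon_min, lat_min, lon_max, lat_max = bbox
        params.update(lon_min=lon_min, lat_min=lat_min,
                      lon_max=lon_max, lat_max=lat_max)
    if time is not None:
        params["time"] = time
    # Dataset-specific dimensions are passed through as extra query parameters.
    params.update(extra_dims)
    path = f"/datasets/{quote(str(dataset_id), safe='')}/extract"
    return f"{base_url.rstrip('/')}{path}?{urlencode(params)}"
```

For example, `extract_url("my_dataset", "temperature", bbox=(-5, 41, 10, 51.5), time="2025-01-01T00:00:00", level=850)` adds a hypothetical `level` dimension as an extra parameter. Keeping the bounding box small limits the amount of data the service has to read.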
### /datasets/{dataset}/probe (GET)
Retrieves the values of specified variables at a specific geographical location (point query).
The `probe` endpoint accepts the following query parameters:
| Name | Description | Optional |
|-------------|----------------------------------------------|:--------:|
| `variables` | The list of variables to probe. | ✗ |
| `lon` | The longitude coordinate to probe. | ✗ |
| `lat` | The latitude coordinate to probe. | ✗ |
| `height` | The height coordinate to probe (if 3D data). | ✓ |
> [!IMPORTANT]
> Depending on your dataset, you may need to specify additional, dataset-specific variables or dimensions. Pass them as extra query parameters, e.g. `&my_additional_variable={VALUE}`.
> [!TIP]
> You can request multiple variables at once by repeating the `variables` parameter in the query string (e.g., `?variables=temp&variables=wind`).
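
Repeating the `variables` parameter is easy to get wrong when building URLs by hand. In Python, `urllib.parse.urlencode` with `doseq=True` expands a list into one `key=value` pair per item. The helper name `probe_url` and the base URL below are assumptions made for this sketch.

```python
from urllib.parse import quote, urlencode


def probe_url(dataset_id, variables, lon, lat, height=None,
              base_url="http://localhost:8000", **extra_dims):
    """Build a URL for GET /datasets/{dataset}/probe."""
    if isinstance(variables, str):
        variables = [variables]  # accept a single variable name as well
    params = {"variables": list(variables), "lon": lon, "lat": lat}
    if height is not None:
        params["height"] = height  # only meaningful for 3D data
    params.update(extra_dims)
    path = f"/datasets/{quote(str(dataset_id), safe='')}/probe"
    # doseq=True turns the list into repeated `variables=...` pairs.
    return f"{base_url.rstrip('/')}{path}?{urlencode(params, doseq=True)}"
```

Calling `probe_url("my_dataset", ["temp", "wind"], lon=1.5, lat=43.6)` yields a query string of the form `?variables=temp&variables=wind&lon=1.5&lat=43.6`.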
### /datasets/{dataset}/isoline (GET)
Computes isolines (contour lines) for a given variable and specific levels.
The `isoline` endpoint accepts the following query parameters:
| Name | Description | Optional |
|------------|---------------------------------------------------------------|:--------:|
| `variable` | The variable to generate isolines for. | ✗ |
| `levels` | Comma-separated list of levels for isoline generation. | ✗ |
| `time` | The time value to use for isoline generation. | ✓ |
> [!IMPORTANT]
> Depending on your dataset, you may need to specify additional, dataset-specific variables or dimensions. Pass them as extra query parameters, e.g. `&my_additional_variable={VALUE}`.
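
Because `levels` is a single comma-separated value rather than a repeated parameter, it has to be joined before encoding. A minimal sketch, assuming the default base URL and using a hypothetical helper name `isoline_url`:

```python
from urllib.parse import quote, urlencode


def isoline_url(dataset_id, variable, levels, time=None,
                base_url="http://localhost:8000", **extra_dims):
    """Build a URL for GET /datasets/{dataset}/isoline."""
    params = {
        "variable": variable,
        # Join the levels into one comma-separated value.
        "levels": ",".join(str(level) for level in levels),
    }
    if time is not None:
        params["time"] = time
    params.update(extra_dims)
    path = f"/datasets/{quote(str(dataset_id), safe='')}/isoline"
    # safe="," keeps the commas readable instead of encoding them as %2C.
    return f"{base_url.rstrip('/')}{path}?{urlencode(params, safe=',')}"
```

For instance, `isoline_url("my_dataset", "pressure", [1000, 1010, 1020])` produces `...?variable=pressure&levels=1000,1010,1020`.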
## Configuring
### Environment variables
| Variable | Description | Default value |
|-----------------------|----------------------------------------------------------|---------------|
| PORT | The port to be used when exposing the service | 8000 |
| HOSTNAME | The hostname to be used when exposing the service | localhost |
| AWS_ACCESS_KEY_ID | Access key ID of the S3 in which zarr data is stored | |
| AWS_SECRET_ACCESS_KEY | Secret access key of the S3 in which zarr data is stored | |
| AWS_REGION | Region of the S3 in which zarr data is stored | |
| AWS_ENDPOINT_URL | Endpoint URL of the S3 in which zarr data is stored | |
| BUCKET_NAME | The name of the bucket in which zarr data is stored | |
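
To make the defaults concrete, the table above could be read into a typed settings object as follows. This is an illustrative sketch, not kazarr's actual configuration code: the `Settings` class and `load_settings` function are hypothetical, and variables without a default are simply left as `None`.

```python
import os
from dataclasses import dataclass
from typing import Mapping, Optional


@dataclass(frozen=True)
class Settings:
    port: int
    hostname: str
    aws_access_key_id: Optional[str]
    aws_secret_access_key: Optional[str]
    aws_region: Optional[str]
    aws_endpoint_url: Optional[str]
    bucket_name: Optional[str]


def load_settings(env: Mapping[str, str] = os.environ) -> Settings:
    """Read the documented variables, applying the documented defaults."""
    return Settings(
        port=int(env.get("PORT", "8000")),
        hostname=env.get("HOSTNAME", "localhost"),
        # No default is documented for the S3 settings, so None means "unset".
        aws_access_key_id=env.get("AWS_ACCESS_KEY_ID"),
        aws_secret_access_key=env.get("AWS_SECRET_ACCESS_KEY"),
        aws_region=env.get("AWS_REGION"),
        aws_endpoint_url=env.get("AWS_ENDPOINT_URL"),
        bucket_name=env.get("BUCKET_NAME"),
    )
```

Passing a plain dictionary instead of `os.environ`, as in `load_settings({"PORT": "9000"})`, makes the defaults easy to check without touching the real environment.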
## Usage
### Manual build
You can build the image with the following command:
```bash
# <image-name> is a placeholder: choose your own image tag
docker build -t <image-name> .
```
And then start the service with:
```bash
# <image-name> is a placeholder: use the tag given at build time
docker run -p 8000:8000 <image-name>
```
### Run locally
Running the app locally requires several Python packages.
The simplest way to get them is to install Anaconda and run the following commands:
```bash
conda create -y -n kazarr_env python=3.11
```
```bash
conda install -y -n kazarr_env -c conda-forge \
fastapi \
uvicorn \
xarray \
zarr \
cfgrib \
numpy \
pyproj \
dask \
s3fs \
matplotlib \
pyvista
```
```bash
conda activate kazarr_env
```
```bash
python main.py start-api
```
## Contributing
Please read the [Contributing file](https://github.com/kalisio/k2/blob/master/.github/CONTRIBUTING.md) for details on our code of conduct, and the process for submitting pull requests to us.
## Versioning
We use [SemVer](https://semver.org/) for versioning. For the versions available, see the tags on this repository.
## Authors
This project is sponsored by

## License
This project is licensed under the MIT License - see the [license file](./LICENSE.md) for details.