https://github.com/cedadev/padocc

Host: GitHub
URL: https://github.com/cedadev/padocc
Owner: cedadev
License: other
Created: 2023-11-24T11:05:38.000Z (over 2 years ago)
Default Branch: main
Last Pushed: 2025-06-20T12:46:26.000Z (12 months ago)
Last Synced: 2025-06-20T13:47:25.487Z (12 months ago)
Language: Python
Size: 17.5 MB
Stars: 3
Watchers: 6
Forks: 1
Open Issues: 13
Metadata Files:
- Readme: README.md
- License: LICENSE

README

# PADOCC Package

[![PyPI version](https://badge.fury.io/py/padocc.svg)](https://pypi.python.org/pypi/padocc/)

PADOCC (Pipeline to Aggregate Data for Optimal Cloud Capabilities) is a data aggregation pipeline that creates Kerchunk (or alternative) files to represent datasets stored in a variety of original formats.

The pipeline currently writes Kerchunk files in JSON or Parquet format from NetCDF/HDF input files. Further development is planned to support GeoTIFF, GRIB and possibly Met Office (.pp) files, and to use the Pangeo [Rechunker](https://rechunker.readthedocs.io/en/latest/) tool to create Zarr stores for datasets that are incompatible with Kerchunk.
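As a rough illustration of what a JSON Kerchunk file contains, the sketch below builds a small reference set by hand in the Kerchunk version 1 layout and looks up where a chunk lives. It uses only the Python standard library and is not PADOCC's API. The file name `example.nc`, the variable `tas`, the byte offsets and the helper `resolve` are all hypothetical.

```python
import json

# A hand-written reference set in the Kerchunk version 1 layout.
# The file name, variable name, offsets and lengths are hypothetical.
refs = {
    "version": 1,
    "refs": {
        # Zarr metadata is stored inline as JSON text.
        ".zgroup": json.dumps({"zarr_format": 2}),
        "tas/.zarray": json.dumps({
            "shape": [2, 4], "chunks": [1, 4], "dtype": "<f4",
            "compressor": None, "fill_value": None, "filters": None,
            "order": "C", "zarr_format": 2,
        }),
        # Each chunk points at a byte range in the original file:
        # [url, offset, length]
        "tas/0.0": ["example.nc", 8192, 16],
        "tas/1.0": ["example.nc", 8208, 16],
    },
}


def resolve(reference_set, key):
    """Return ("inline", text) or ("range", url, offset, length) for a key."""
    entry = reference_set["refs"][key]
    if isinstance(entry, str):
        return ("inline", entry)
    if len(entry) == 1:
        # A bare [url] entry refers to the whole file.
        return ("range", entry[0], 0, None)
    url, offset, length = entry
    return ("range", url, offset, length)


# The reference set is plain JSON, so it survives a round trip unchanged.
loaded = json.loads(json.dumps(refs))
print(resolve(loaded, "tas/1.0"))
```

A reader such as fsspec's reference filesystem uses entries like these to fetch only the byte ranges it needs from the original files, which is why the source data does not have to be copied or converted.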

[Example Notebooks at this link](https://mybinder.org/v2/gh/cedadev/padocc.git/main?filepath=showcase/notebooks)

[Documentation hosted at this link](https://cedadev.github.io/kerchunk-builder/)

![Kerchunk Pipeline](docs/source/_images/pipeline.png)

## Release 1.4.4

Release date: 22nd January 2026

See the [release notes](https://github.com/cedadev/padocc/releases/tag/v1.4.4) for details.

This package acknowledges contributions by [Matt Brown](mailto:matbro@ceh.ac.uk) as a pre-release tester.

## Installation

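To install this package, clone the repository with `git clone`, then install it with its dependencies using the steps below.

```
# Clone the repository and move into it
git clone https://github.com/cedadev/padocc.git
cd padocc

# Create and activate a virtual environment
python -m venv .venv
source .venv/bin/activate

# Install Poetry, then the package and its dependencies
pip install poetry
poetry install
```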

## Usage

See the documentation linked above for full details on how to use PADOCC.
