https://github.com/cedadev/padocc
Last synced: 5 months ago
- Host: GitHub
- URL: https://github.com/cedadev/padocc
- Owner: cedadev
- License: other
- Created: 2023-11-24T11:05:38.000Z (over 2 years ago)
- Default Branch: main
- Last Pushed: 2025-06-20T12:46:26.000Z (12 months ago)
- Last Synced: 2025-06-20T13:47:25.487Z (12 months ago)
- Language: Python
- Size: 17.5 MB
- Stars: 3
- Watchers: 6
- Forks: 1
- Open Issues: 13
Metadata Files:
- Readme: README.md
- License: LICENSE
# PADOCC Package
[PyPI package](https://pypi.python.org/pypi/padocc/)
PADOCC (Pipeline to Aggregate Data for Optimal Cloud Capabilities) is a data aggregation pipeline that creates Kerchunk (or alternative) files to represent datasets stored in a range of original formats.
The pipeline currently supports writing JSON or Parquet Kerchunk files from NetCDF/HDF input files. Planned developments include support for GeoTIFF, GRIB and possibly Met Office (.pp) files, as well as use of the Pangeo [Rechunker](https://rechunker.readthedocs.io/en/latest/) tool to create Zarr stores for Kerchunk-incompatible datasets.
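To make the idea of a Kerchunk file more concrete, the sketch below inspects a small hand-written reference set laid out like a Kerchunk JSON file, assuming the version 1 reference layout. It does not use PADOCC itself; the `summarise_refs` helper, the `tas` variable and the `data/file_*.nc` paths are hypothetical and chosen only for illustration.

```python
import json


def summarise_refs(reference: dict) -> dict:
    """Summarise a Kerchunk-style reference set (assumed version 1 layout).

    String values hold data inline (typically Zarr metadata), while list
    values point at a source file, optionally with a byte offset and length.
    """
    inline = 0
    chunk_references = 0
    source_files = set()
    for value in reference["refs"].values():
        if isinstance(value, str):
            inline += 1
        elif isinstance(value, list) and value:
            chunk_references += 1
            source_files.add(value[0])
    return {
        "inline": inline,
        "chunk_references": chunk_references,
        "source_files": sorted(source_files),
    }


# Hypothetical reference set: two NetCDF files exposed as one "tas" array.
example = {
    "version": 1,
    "refs": {
        ".zgroup": json.dumps({"zarr_format": 2}),
        "tas/.zarray": json.dumps({"shape": [2, 10, 10], "chunks": [1, 10, 10]}),
        "tas/0.0.0": ["data/file_a.nc", 4096, 800],
        "tas/1.0.0": ["data/file_b.nc", 4096, 800],
    },
}

summary = summarise_refs(example)
print(summary)
```

The point of the layout is that the original files are never copied: each chunk entry only records where the bytes already live, which is why a single small reference file can present many source files as one dataset.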
[Example Notebooks at this link](https://mybinder.org/v2/gh/cedadev/padocc.git/main?filepath=showcase/notebooks)
[Documentation hosted at this link](https://cedadev.github.io/kerchunk-builder/)

## Release 1.4.4
Release date: 22nd January 2026
See the  for details.
This package acknowledges contributions by [Matt Brown](mailto:matbro@ceh.ac.uk) as a pre-release tester.
## Installation
To install this package, clone the repository with `git clone`, then run the steps below from the repository root to install it with the necessary dependencies.
```
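# Create and activate an isolated virtual environment
python -m venv .venv
source .venv/bin/activate
# Install Poetry, then use it to install PADOCC and its dependencies
pip install poetry
poetry install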
```
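As a quick check after installation, the following sketch looks up the installed distribution using only the Python standard library. It assumes the distribution is named `padocc`, as on PyPI; the `installed_version` helper is a hypothetical name used for illustration and is not part of PADOCC.

```python
from importlib.metadata import PackageNotFoundError, version
from typing import Optional


def installed_version(distribution: str) -> Optional[str]:
    """Return the installed version of a distribution, or None if absent."""
    try:
        return version(distribution)
    except PackageNotFoundError:
        return None


# Assumption: the distribution name is "padocc", matching the PyPI listing.
padocc_version = installed_version("padocc")
print(padocc_version or "padocc is not installed in this environment")
```

Run this with the virtual environment activated, otherwise the lookup is made against a different Python environment.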
## Usage
Please refer to the documentation pages linked above for details on how to use PADOCC effectively.