https://github.com/xarray-contrib/xarray-schema
Schema validation for Xarray objects
- Host: GitHub
- URL: https://github.com/xarray-contrib/xarray-schema
- Owner: xarray-contrib
- License: mit
- Created: 2021-11-08T05:16:39.000Z (over 4 years ago)
- Default Branch: main
- Last Pushed: 2025-03-31T18:43:43.000Z (11 months ago)
- Last Synced: 2025-04-11T18:16:25.828Z (11 months ago)
- Language: Python
- Homepage: https://xarray-schema.readthedocs.io/en/latest/index.html
- Size: 203 KB
- Stars: 42
- Watchers: 6
- Forks: 9
- Open Issues: 21
Metadata Files:
- Readme: README.md
- License: LICENSE
README
# xarray-schema
Schema validation for Xarray

## installation
Install xarray-schema from PyPI:
```shell
pip install xarray-schema
```
Conda:
```shell
conda install -c conda-forge xarray-schema
```
Or install it from source:
```shell
pip install git+https://github.com/xarray-contrib/xarray-schema
```
## usage
Xarray-schema's API is modeled after [Pandera](https://pandera.readthedocs.io/en/stable/). The `DataArraySchema` and `DatasetSchema` objects both have `.validate()` methods.
The basic usage is as follows:
```python
import numpy as np
import xarray as xr
from xarray_schema import DataArraySchema, DatasetSchema, CoordsSchema
da = xr.DataArray(np.ones(4, dtype='i4'), dims=['x'], name='foo')
schema = DataArraySchema(dtype=np.integer, name='foo', shape=(4, ), dims=['x'])
schema.validate(da)
```
You can also use it to validate a `Dataset` like so:
```python
schema_ds = DatasetSchema({'foo': schema})
schema_ds.validate(da.to_dataset())
```
Each component of the Xarray data model is implemented as a standalone class:
```python
from xarray_schema.components import (
    DTypeSchema,
    DimsSchema,
    ShapeSchema,
    NameSchema,
    ChunksSchema,
    ArrayTypeSchema,
    AttrSchema,
    AttrsSchema,
)
# example constructions
dtype_schema = DTypeSchema('i4')
dims_schema = DimsSchema(('x', 'y', None)) # None is used as a wildcard
shape_schema = ShapeSchema((5, 10, None)) # None is used as a wildcard
name_schema = NameSchema('foo')
chunk_schema = ChunksSchema({'x': None, 'y': -1}) # None is used as a wildcard, -1 is used as
array_type_schema = ArrayTypeSchema(np.ndarray)  # lowercase name avoids shadowing the class
# Example usage
dtype_schema.validate(da.dtype)
# Each object schema can be exported to JSON format
dtype_json = dtype_schema.to_json()
```
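To make the `None` wildcard used by `DimsSchema` and `ShapeSchema` above more concrete, here is a minimal pure-Python sketch of positional wildcard matching. It is an illustration only, not xarray-schema's implementation: the helper name `matches_with_wildcards` is hypothetical, and the sketch assumes that a wildcard stands in for exactly one position, so the lengths must agree.

```python
from typing import Optional, Sequence


def matches_with_wildcards(
    expected: Sequence[Optional[object]], actual: Sequence[object]
) -> bool:
    """Return True if `actual` matches `expected`, treating None as "anything".

    Hypothetical helper for illustration; not part of xarray-schema.
    """
    # Assumption: a wildcard covers exactly one position, so lengths must agree.
    if len(expected) != len(actual):
        return False
    # Each position matches if it is a wildcard or the values are equal.
    return all(e is None or e == a for e, a in zip(expected, actual))


# Dimension names: the third dimension may be called anything.
print(matches_with_wildcards(('x', 'y', None), ('x', 'y', 'time')))  # True
# Shape: the first axis has the wrong length, so the match fails.
print(matches_with_wildcards((5, 10, None), (4, 10, 3)))  # False
```

The same comparison works for dimension names and for shapes because both are plain tuples compared position by position.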
## roadmap
This is a very early prototype of a library. Some key things are missing:
1. Exceptions: Pandera accumulates schema exceptions and reports them all at once. Currently, we eagerly raise `SchemaErrors` as soon as they are found.
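
The difference between eager raising and accumulating errors can be sketched in plain Python. This is a rough illustration under stated assumptions, not xarray-schema's or Pandera's code: the classes `CheckFailed` and `CollectedFailures`, the two check functions, and the dict standing in for a data array are all hypothetical.

```python
from typing import Callable, List


class CheckFailed(Exception):
    """Hypothetical error for one failed check (not an xarray-schema class)."""


class CollectedFailures(Exception):
    """Hypothetical error that carries every failure found in one pass."""

    def __init__(self, failures: List[CheckFailed]):
        self.failures = failures
        super().__init__("; ".join(str(f) for f in failures))


def check_dtype(obj: dict) -> None:
    if obj["dtype"] != "int32":
        raise CheckFailed(f"dtype: expected int32, got {obj['dtype']}")


def check_dims(obj: dict) -> None:
    if obj["dims"] != ("x",):
        raise CheckFailed(f"dims: expected ('x',), got {obj['dims']}")


CHECKS: List[Callable[[dict], None]] = [check_dtype, check_dims]


def validate_eager(obj: dict) -> None:
    """Stop at the first failing check (the behaviour described above)."""
    for check in CHECKS:
        check(obj)


def validate_lazy(obj: dict) -> None:
    """Run every check, then raise once with all failures collected."""
    failures = []
    for check in CHECKS:
        try:
            check(obj)
        except CheckFailed as err:
            failures.append(err)
    if failures:
        raise CollectedFailures(failures)


# A plain dict stands in for a data array in this sketch.
bad = {"dtype": "float64", "dims": ("y",)}
try:
    validate_lazy(bad)
except CollectedFailures as err:
    print(len(err.failures))  # 2
```

With the eager variant, the caller sees only the dtype failure and has to fix it before learning about the dims failure; the accumulating variant reports both in a single pass.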
## license
All the code in this repository is [MIT](https://choosealicense.com/licenses/mit/) licensed.
## history
This project was originally developed at [CarbonPlan](https://carbonplan.org/). It was transferred to the xarray-contrib organization in August 2022.