{"id":16285404,"url":"https://github.com/constantinpape/z5","last_synced_at":"2026-04-01T17:50:18.198Z","repository":{"id":24474050,"uuid":"101700504","full_name":"constantinpape/z5","owner":"constantinpape","description":"Lightweight C++ and Python interface for datasets in zarr and N5 format","archived":false,"fork":false,"pushed_at":"2026-03-18T22:24:05.000Z","size":1210,"stargazers_count":130,"open_issues_count":38,"forks_count":30,"subscribers_count":10,"default_branch":"master","last_synced_at":"2026-03-19T09:46:00.004Z","etag":null,"topics":["chunked-storage","cloud","h5py","multidimensional-arrays","multidimensional-data","n5","storage","zarr"],"latest_commit_sha":null,"homepage":"","language":"C++","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"mit","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/constantinpape.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null,"zenodo":null}},"created_at":"2017-08-29T00:31:10.000Z","updated_at":"2026-03-18T22:22:17.000Z","dependencies_parsed_at":"2024-07-09T10:03:54.344Z","dependency_job_id":"a2007b15-2d71-4b53-8c46-bc526991a903","html_url":"https://github.com/constantinpape/z5","commit_stats":{"total_commits":648,"total_committers":27,"mean_commits":24.0,"dds":0.5679012345679013,"last_synced_commit":"632fd551b5add84026dea45439a8ad749f9b592a"},"previous_names":[],"tags_count":43,"template":false,"template_full_name":null,"purl":"pkg:github/constantinpape/z5","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/constantinpape%2Fz5","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/constantinpape%2Fz5/tags","release
s_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/constantinpape%2Fz5/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/constantinpape%2Fz5/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/constantinpape","download_url":"https://codeload.github.com/constantinpape/z5/tar.gz/refs/heads/master","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/constantinpape%2Fz5/sbom","scorecard":null,"host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":286080680,"owners_count":31290625,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2026-04-01T13:12:26.723Z","status":"ssl_error","status_checked_at":"2026-04-01T13:12:25.102Z","response_time":53,"last_error":"SSL_read: unexpected eof while reading","robots_txt_status":"success","robots_txt_updated_at":"2025-07-24T06:49:26.215Z","robots_txt_url":"https://github.com/robots.txt","online":false,"can_crawl_api":true,"host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["chunked-storage","cloud","h5py","multidimensional-arrays","multidimensional-data","n5","storage","zarr"],"created_at":"2024-10-10T19:23:26.181Z","updated_at":"2026-04-01T17:50:18.180Z","avatar_url":"https://github.com/constantinpape.png","language":"C++","funding_links":[],"categories":["Zarr \u0026 other array data formats"],"sub_categories":["N5"],"readme":"# z5\n\n[![Anaconda-Server Badge](https://anaconda.org/conda-forge/z5py/badges/version.svg)](https://anaconda.org/conda-forge/z5py)\n[![Build 
Status](https://github.com/constantinpape/z5/workflows/build/badge.svg)](https://github.com/constantinpape/z5/actions)\n[![Documentation Status](https://readthedocs.org/projects/z5/badge/?version=latest)](https://z5.readthedocs.io/en/latest/?badge=latest)\n[![DOI](https://zenodo.org/badge/101700504.svg)](https://zenodo.org/badge/latestdoi/101700504)\n\n\n\nC++ and Python wrapper for [zarr](https://github.com/zarr-developers/zarr-python) and [n5](https://github.com/saalfeldlab/n5) file formats.\nImplements the file system specification of these formats. Implementations for cloud based storage are work in progress. Any\nhelp is highly appreciated. See issues [#136](https://github.com/constantinpape/z5/issues/136) and [#137](https://github.com/constantinpape/z5/issues/137) for details.\n\nSupport for the following compression codecs:\n- [Blosc](https://github.com/Blosc/c-blosc)\n- [Zlib / Gzip](https://zlib.net/)\n- [Bzip2](http://www.bzip.org/)\n- [XZ](https://tukaani.org/xz/)\n- [LZ4](https://github.com/lz4/lz4)\n\n## Installation\n\n### Conda\n\nConda packages for the relevant systems and python versions are hosted on conda-forge:\n\n```\n$ conda install -c conda-forge z5py\n```\n\n### From Source\n\nThe easiest way to build the library from source is using a conda-environment with all necessary dependencies.\nYou can find the conda environment files for build environments in `.environments/unix`\n\nTo set up the conda environment and install the package on unix:\n\n```bash\n$ conda env create -f environments/unix/z5-dev.yaml\n$ conda activate z5-dev\n$ mkdir bld\n$ cd bld\n$ cmake -DWITH_ZLIB=ON -DWITH_BZIP2=ON -DCMAKE_INSTALL_PREFIX=/path/to/install ..\n$ make install\n```\n\nNote that in the CMakeLists.txt, we try to infer the active conda-environment automatically.\nIf this fails, you can set it manually via `-DCMAKE_PREFIX_PATH=/path/to/conda-env`.\nTo specify where to install the package, set:\n\n- `CMAKE_INSTALL_PREFIX`: where to install the C++ headers\n- 
`PYTHON_MODULE_INSTALL_DIR`: where to install the python package (set to `site-packages` of active conda env by default)\n\nIf you want to include z5 in another C++ project, note that the library itself is header-only. However, you need to link against the compression codecs that you use.\n\nIf you don't want to use conda for dependency management, the following dependencies are necessary:\n- [xtensor](https://github.com/xtensor-stack/xtensor)\n- [nlohmann_json](https://github.com/nlohmann/json)\n- [pybind11](https://github.com/pybind/pybind11) (only for python bindings)\n- [xtensor_python](https://github.com/xtensor-stack/xtensor-python) (only for python bindings)\n\n## Examples / Usage\n\n### Python\n\nThe Python API is very similar to `h5py`.\nSome differences are: \n- The constructor of `File` takes the boolean argument `use_zarr_format`, which determines whether\nthe zarr or N5 format is used (if set to `None`, an attempt is made to automatically infer the format).\n- There is no need to close `File`, hence the `with` block isn't necessary (but supported).\n- Linked datasets (`my_file['new_ds'] = my_file['old_ds']`) are not supported\n- Broadcasting is only supported for scalars in `Dataset.__setitem__`\n- Arbitrary leading and trailing singleton dimensions can be added/removed/rolled through in `Dataset.__setitem__`\n- Compatibility of exception handling is a goal, but not necessarily guaranteed.\n- Because zarr/N5 are usually used with large data, `z5py` compresses blocks by default where `h5py` does not. 
The default compressors are\n  - zarr: `\"blosc\"`\n  - n5: `\"gzip\"`\n\nSome examples:\n\n```python\nimport z5py\nimport numpy as np\n\n# create a file and a dataset\nf = z5py.File('array.zr', use_zarr_format=True)\nds = f.create_dataset('data', shape=(1000, 1000), chunks=(100, 100), dtype='float32')\n\n# write array to a roi\nx = np.random.random_sample(size=(500, 500)).astype('float32')\nds[:500, :500] = x\n\n# broadcast a scalar to a roi\nds[500:, 500:] = 42.\n\n# read array from a roi\ny = ds[250:750, 250:750]\n\n# create a group and create a dataset in the group\ng = f.create_group('local_group')\ng.create_dataset('local_data', shape=(100, 100), chunks=(10, 10), dtype='uint32')\n\n# open dataset from group or file\nds_local1 = f['local_group/local_data']\nds_local2 = g['local_data']\n\n# read and write attributes\nattributes = ds.attrs\nattributes['foo'] = 'bar'\nbaz = attributes['foo']\n```\n\nThere are convenience functions to convert n5 and zarr files to and from hdf5 or tif.\nAdditional data formats will follow.\n\n```python\n# convert existing h5 file to n5\n# this only works if h5py is available\nfrom z5py.converter import convert_from_h5\n\nh5_file = '/path/to/file.h5'\nn5_file = '/path/to/file.n5'\nh5_key = n5_key = 'data'\ntarget_chunks = (64, 64, 64)\nn_threads = 8\n\nconvert_from_h5(h5_file, n5_file,\n                in_path_in_file=h5_key,\n                out_path_in_file=n5_key,\n                chunks=target_chunks,\n                n_threads=n_threads,\n                compression='gzip')\n```\n\n### C++\n\n`Z5` aims to support different storage implementations. The default is the filesystem; implementations for AWS-S3 and Google Cloud Storage are work in progress.\nThe API implements factory functions like `createFile` or `createDataset` in [the factory header](https://github.com/constantinpape/z5/blob/master/include/z5/factory.hxx). 
\nThese functions need to be called with the corresponding handle, such as `z5::filesystem::handle::File` or `z5::s3::handle::File`, to specify which backend to use.\n\nThe library is intended to be used with a multiarray that holds data in memory.\nBy default [xtensor](https://github.com/QuantStack/xtensor) is used; see the [implementation](https://github.com/constantinpape/z5/blob/master/include/z5/multiarray/xtensor_access.hxx).\nThere is also an interface for [marray](https://github.com/bjoern-andres/marray); see the [implementation](https://github.com/constantinpape/z5/blob/master/include/z5/multiarray/marray_access.hxx).\nTo interface with other multiarray implementations, reimplement `readSubarray` and `writeSubarray`.\nPull requests for additional multiarray support are welcome.\n\nSome examples:\n\n```c++\n#include \u003cstring\u003e\n#include \u003cvector\u003e\n\n#include \"json.hpp\"\n#include \"xtensor/xarray.hpp\"\n\n// factory functions to create files, groups and datasets\n#include \"z5/factory.hxx\"\n// handles for z5 filesystem objects\n#include \"z5/filesystem/handle.hxx\"\n// io for xtensor multi-arrays\n#include \"z5/multiarray/xtensor_access.hxx\"\n// attribute functionality\n#include \"z5/attributes.hxx\"\n\nint main() {\n\n  // get handle to a File on the filesystem\n  z5::filesystem::handle::File f(\"data.zr\");\n  // if you wanted to use a different backend, for example AWS, you\n  // would need to use this instead:\n  // z5::s3::handle::File f;\n\n  // create the file in zarr format\n  const bool createAsZarr = true;\n  z5::createFile(f, createAsZarr);\n\n  // create a new zarr dataset\n  const std::string dsName = \"data\";\n  std::vector\u003csize_t\u003e shape = { 1000, 1000, 1000 };\n  std::vector\u003csize_t\u003e chunks = { 100, 100, 100 };\n  auto ds = z5::createDataset(f, dsName, \"float32\", shape, chunks);\n\n  // write array to roi\n  z5::types::ShapeType offset1 = { 50, 100, 150 };\n  xt::xarray\u003cfloat\u003e::shape_type shape1 = { 150, 200, 100 };\n  xt::xarray\u003cfloat\u003e 
array1(shape1, 42.0);\n  z5::multiarray::writeSubarray\u003cfloat\u003e(ds, array1, offset1.begin());\n\n  // read array from roi (values that were not written before are filled with a fill-value)\n  z5::types::ShapeType offset2 = { 100, 100, 100 };\n  xt::xarray\u003cfloat\u003e::shape_type shape2 = { 300, 200, 75 };\n  xt::xarray\u003cfloat\u003e array2(shape2);\n  z5::multiarray::readSubarray\u003cfloat\u003e(ds, array2, offset2.begin());\n\n  // get handle for the dataset\n  const auto dsHandle = z5::filesystem::handle::Dataset(f, dsName);\n\n  // read and write json attributes\n  nlohmann::json attributesIn;\n  attributesIn[\"bar\"] = \"foo\";\n  attributesIn[\"pi\"] = 3.141593;\n  z5::writeAttributes(dsHandle, attributesIn);\n\n  nlohmann::json attributesOut;\n  z5::readAttributes(dsHandle, attributesOut);\n  \n  return 0;\n}\n```\n\n### C\n\nThere are external efforts to implement a C-Api wrapper for z5.\nYou can check it out [here](https://github.com/kmpaul/cz5test).\n\n### R\n\nThere exists a prototype by @gdkrmr to provide [R bindings for z5](https://github.com/gdkrmr/zarr-R).\nIt is still in an early stage, but looks very promising.\n\n## Citation\n\nIf you use this library in your research, please cite it via the associated DOI:\n```\n@misc{pape_z5_2019,\n  doi = {10.5281/ZENODO.3585752},\n  url = {https://zenodo.org/record/3585752},\n  author = {Pape,  Constantin},\n  title = {constantinpape/z5},\n  publisher = {Zenodo},\n  year = {2019}\n}\n```\n\n## When to use this library?\n\nThis library implements the zarr and n5 data specification in C++ and Python.\nUse it, if you need access to these formats from these languages.\nZarr / n5 have native implementations in Python / Java.\nIf you only need access in the respective native language,\nit is recommended to use these implementations, which are more thoroughly tested.\n\n\n## Current Limitations / TODOs\n\n- No thread / process synchronization -\u003e writing to the same chunk in parallel will lead to 
undefined behavior.\n- Supports only little endianness and C-order for the zarr format.\n\n\n## A note on axis ordering\n\nInternally, n5 uses column-major (i.e. x, y, z) axis ordering, while z5 uses row-major (i.e. z, y, x).\nWhile this is mostly handled internally, it means that the metadata does not transfer\n1 to 1, but needs to be reversed for most shapes. Concretely:\n\n|           |n5                      |z5              |\n|----------:|-----------------------:|---------------:|  \n|Shape      | s_x, s_y, s_z          |s_z, s_y, s_x   |\n|Chunk-Shape| c_x, c_y, c_z          |c_z, c_y, c_x   | \n|Chunk-Ids  | i_x, i_y, i_z          |i_z, i_y, i_x   |\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fconstantinpape%2Fz5","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fconstantinpape%2Fz5","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fconstantinpape%2Fz5/lists"}