{"id":15195184,"url":"https://github.com/gbeckers/darr","last_synced_at":"2025-08-21T00:30:58.668Z","repository":{"id":33172560,"uuid":"151593293","full_name":"gbeckers/Darr","owner":"gbeckers","description":"A Python library for numpy arrays that persist on disk in a format that is simple, self-documented and tool-independent, and maximizes universal readability.","archived":false,"fork":false,"pushed_at":"2024-11-07T14:34:46.000Z","size":2008,"stargazers_count":21,"open_issues_count":5,"forks_count":2,"subscribers_count":3,"default_branch":"master","last_synced_at":"2024-12-15T11:48:06.862Z","etag":null,"topics":["array","bsd-3-clause","data-science","data-sharing","data-storage","idl","interoperability","jagged-array","julia-language","maple","mathematica","matlab","numeric","octave","python","r","ragged-array","science","scilab"],"latest_commit_sha":null,"homepage":"","language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"other","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/gbeckers.png","metadata":{"files":{"readme":"README.rst","changelog":null,"contributing":"docs/contributing.rst","funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null}},"created_at":"2018-10-04T15:21:58.000Z","updated_at":"2024-11-07T14:35:27.000Z","dependencies_parsed_at":"2023-09-24T05:23:18.935Z","dependency_job_id":"b56012e6-dbec-41da-82b0-bdca1976173d","html_url":"https://github.com/gbeckers/Darr","commit_stats":{"total_commits":1014,"total_committers":4,"mean_commits":253.5,"dds":"0.21597633136094674","last_synced_commit":"bc5f225e9266fea1f9afd24648532e4c32c634f7"},"previous_names":[],"tags_count":30,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/gbeckers%2FDarr","tags_url":"https://repos.ecosys
te.ms/api/v1/hosts/GitHub/repositories/gbeckers%2FDarr/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/gbeckers%2FDarr/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/gbeckers%2FDarr/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/gbeckers","download_url":"https://codeload.github.com/gbeckers/Darr/tar.gz/refs/heads/master","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":230471175,"owners_count":18231193,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["array","bsd-3-clause","data-science","data-sharing","data-storage","idl","interoperability","jagged-array","julia-language","maple","mathematica","matlab","numeric","octave","python","r","ragged-array","science","scilab"],"created_at":"2024-09-27T23:06:53.172Z","updated_at":"2025-08-21T00:30:58.661Z","avatar_url":"https://github.com/gbeckers.png","language":"Python","funding_links":[],"categories":[],"sub_categories":[],"readme":"Darr\n====\n\n|Github CI Status| |PyPi version| |Conda Forge|\n|Codecov Badge| |Docs Status| |Zenodo Badge|\n\nDarr is a Python science library that allows you to work efficiently with\npotentially very large, disk-based Numpy arrays that are widely readable and\nself-documented. Documentation includes copy-paste ready code to read arrays\nin many popular data science languages such as R, Julia, Scilab, IDL,\nMatlab, Maple, and Mathematica, or in Python/Numpy without Darr. 
Without\nexporting them and with minimal effort.\n\nUniversal readability of data is a pillar of good scientific practice. It is\nalso generally a good idea for anyone who wants to flexibly move between\nanalysis environments, who wants to save data for the longer term, or who\nwants to share data with others without spending much time on figuring out\nand/or explaining how the receiver can read it. No idea how to read your\n7-dimensional uint32 numpy array in Matlab to quickly try out an algorithm\nyour colleague wrote? No worries, a quick copy-paste of code from the array\ndocumentation is all that is needed to read your data in, e.g., R or Matlab\n(see `example\n\u003chttps://github.com/gbeckers/Darr/tree/master/examplearrays/arrays/array_int32_2D.darr\u003e`__).\nAs you work with your array, its documentation is automatically kept up to\ndate. No need to export anything, make notes, or to provide elaborate\nexplanation. No looking up things. No dependence on complicated formats or\nspecialized libraries for reading your data elsewhere later.\n\nIn essence, Darr makes it trivially easy to share your numerical arrays with\nothers or with yourself when working in different computing environments,\nand stores them in a future-proof way.\n\nMore rationale for a tool-independent approach to numeric array storage is\nprovided `here \u003chttps://darr.readthedocs.io/en/latest/rationale.html\u003e`__.\n\nUnder the hood, Darr uses NumPy memory-mapped arrays, which is a widely\nestablished and trusted way of working with disk-based numerical data, and\nwhich makes Darr fully NumPy compatible. This enables efficient out-of-core\nread/write access to potentially very large arrays. In addition to automatic\ndocumentation, Darr adds other functionality to NumPy's memmap, such as easy\nappending and truncating of data, support for ragged arrays, the ability\nto create arrays from iterators, and easy use of metadata. 
Flat binary files\nand (JSON) text files are accompanied by a README text file that explains how\nthe array and metadata are stored (`see example arrays\n\u003chttps://github.com/gbeckers/Darr/tree/master/examplearrays/\u003e`__).\n\nSee this `tutorial \u003chttps://darr.readthedocs.io/en/latest/tutorialarray.html\u003e`__\nfor a brief introduction, or the\n`documentation \u003chttp://darr.readthedocs.io/\u003e`__ for more info.\n\nDarr is currently pre-1.0, and still undergoing development. It is open source\nand freely available under the `New BSD License\n\u003chttps://opensource.org/licenses/BSD-3-Clause\u003e`__ terms.\n\nFeatures\n--------\n-  Data is stored purely in flat binary and text files, maximizing\n   universal readability.\n-  Automatic self-documentation, including copy-paste ready code snippets for\n   reading the array in a number of popular data analysis environments, such as\n   Python (without Darr), R, Julia, Scilab, Octave/Matlab, GDL/IDL, and\n   Mathematica\n   (see `example array\n   \u003chttps://github.com/gbeckers/Darr/tree/master/examplearrays/arrays/array_int32_2D.darr\u003e`__).\n-  Disk-persistent array data is directly accessible through `NumPy\n   indexing \u003chttps://numpy.org/doc/stable/reference/arrays.indexing.html\u003e`__\n   and may be larger than RAM.\n-  Easy and efficient appending of data (`see example \u003chttps://darr.readthedocs.io/en/latest/tutorialarray.html#appending-data\u003e`__).\n-  Supports ragged arrays.\n-  Easy use of metadata, stored in a widely readable separate\n   JSON text file (`see example\n   \u003chttps://darr.readthedocs.io/en/latest/tutorialarray.html#metadata\u003e`__).\n-  Many numeric types are supported: (u)int8-(u)int64, float16-float64,\n   complex64, complex128.\n-  Integrates easily with the `Dask \u003chttps://dask.pydata.org/en/latest/\u003e`__\n   library for out-of-core computation on very large arrays.\n-  Minimal dependencies, only `NumPy 
\u003chttp://www.numpy.org/\u003e`__.\n\nLimitations\n-----------\n- No `structured (record) arrays \u003chttps://numpy.org/doc/stable/user/basics.rec.html\u003e`__\n  supported yet, just\n  `ndarrays \u003chttps://numpy.org/doc/stable/reference/arrays.ndarray.html\u003e`__\n- No string data, just numeric.\n- No compression, although compression for archiving purposes is supported.\n- Uses multiple files per array, as binary data is separated from text\n  documentation and metadata. This can be a disadvantage in terms of storage\n  space if you have very many very small arrays.\n\nInstallation\n------------\n\nDarr officially depends on Python 3.9 or higher. Older versions may work\n(probably \u003e= 3.6) but are not tested.\n\nInstall Darr from PyPI::\n\n    $ pip install darr\n\nOr, install Darr via conda::\n\n    $ conda install -c conda-forge darr\n\nTo install the latest development version, use pip with the latest GitHub\nmaster::\n\n    $ pip install git+https://github.com/gbeckers/darr@master\n\n\nDocumentation\n-------------\nSee the `documentation \u003chttp://darr.readthedocs.io/\u003e`_ for more information.\n\nContributing\n------------\nAny help / suggestions / ideas / contributions are welcome and very much\nappreciated. For any comment, question, or error, please open an issue or\npropose a pull request.\n\n\nOther interesting projects\n--------------------------\nIf Darr is not exactly what you are looking for, have a look at these projects:\n\n-  `asdf \u003chttps://github.com/asdf-format/asdf\u003e`__\n-  `exdir \u003chttps://github.com/CINPLA/exdir/\u003e`__\n-  `h5py \u003chttps://github.com/h5py/h5py\u003e`__\n-  `pyfbf \u003chttps://github.com/davidh-ssec/pyfbf\u003e`__\n-  `pytables \u003chttps://github.com/PyTables/PyTables\u003e`__\n-  `zarr \u003chttps://github.com/zarr-developers/zarr\u003e`__\n\n\n\nDarr is BSD licensed (BSD 3-Clause License). (c) 2017-2024, Gabriël\nBeckers\n\n.. 
|Github CI Status| image:: https://github.com/gbeckers/Darr/actions/workflows/python_package.yml/badge.svg\n   :target: https://github.com/gbeckers/Darr/actions/workflows/python_package.yml\n.. |PyPi version| image:: https://img.shields.io/badge/pypi-0.6.3-orange.svg\n   :target: https://pypi.org/project/darr/\n.. |Conda Forge| image:: https://anaconda.org/conda-forge/darr/badges/version.svg\n   :target: https://anaconda.org/conda-forge/darr\n.. |Docs Status| image:: https://readthedocs.org/projects/darr/badge/?version=stable\n   :target: https://darr.readthedocs.io/en/latest/\n.. |Repo Status| image:: https://www.repostatus.org/badges/latest/active.svg\n   :alt: Project Status: Active – The project has reached a stable, usable state and is being actively developed.\n   :target: https://www.repostatus.org/#active\n.. |Codacy Badge| image:: https://api.codacy.com/project/badge/Grade/c0157592ce7a4ecca5f7d8527874ce54\n   :alt: Codacy Badge\n   :target: https://app.codacy.com/app/gbeckers/Darr?utm_source=github.com\u0026utm_medium=referral\u0026utm_content=gbeckers/Darr\u0026utm_campaign=Badge_Grade_Dashboard\n.. |Zenodo Badge| image:: https://zenodo.org/badge/151593293.svg\n   :target: https://zenodo.org/badge/latestdoi/151593293\n.. |Codecov Badge| image:: https://codecov.io/gh/gbeckers/Darr/branch/master/graph/badge.svg?token=BBV0WDIUSJ\n   :target: https://codecov.io/gh/gbeckers/Darr\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fgbeckers%2Fdarr","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fgbeckers%2Fdarr","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fgbeckers%2Fdarr/lists"}