{"id":19407442,"url":"https://github.com/observingclouds/car_referencer","last_synced_at":"2025-04-24T09:31:36.530Z","repository":{"id":50362550,"uuid":"514521371","full_name":"observingClouds/car_referencer","owner":"observingClouds","description":"Create reference filesystem for collections of car files.","archived":false,"fork":false,"pushed_at":"2025-01-06T18:13:35.000Z","size":74,"stargazers_count":2,"open_issues_count":4,"forks_count":1,"subscribers_count":1,"default_branch":"main","last_synced_at":"2025-04-03T02:23:17.478Z","etag":null,"topics":["content-addressable-storage","ipfs"],"latest_commit_sha":null,"homepage":"","language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"other","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/observingClouds.png","metadata":{"files":{"readme":"README.rst","changelog":null,"contributing":"CONTRIBUTING.rst","funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":"AUTHORS.rst","dei":null,"publiccode":null,"codemeta":null}},"created_at":"2022-07-16T08:26:35.000Z","updated_at":"2024-10-21T05:50:35.000Z","dependencies_parsed_at":"2024-04-01T18:52:33.305Z","dependency_job_id":"33cc1bc0-1a0a-4e5c-9af2-85ca276e7025","html_url":"https://github.com/observingClouds/car_referencer","commit_stats":null,"previous_names":[],"tags_count":0,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/observingClouds%2Fcar_referencer","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/observingClouds%2Fcar_referencer/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/observingClouds%2Fcar_referencer/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/observingC
louds%2Fcar_referencer/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/observingClouds","download_url":"https://codeload.github.com/observingClouds/car_referencer/tar.gz/refs/heads/main","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":250600707,"owners_count":21457012,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["content-addressable-storage","ipfs"],"created_at":"2024-11-10T11:47:17.650Z","updated_at":"2025-04-24T09:31:36.206Z","avatar_url":"https://github.com/observingClouds.png","language":"Python","funding_links":[],"categories":[],"sub_categories":[],"readme":"================================================================================\ncar_referencer: Creating parquet file reference system for car collections.\n================================================================================\n\n.. image:: https://github.com/observingClouds/car_referencer/actions/workflows/ci.yaml/badge.svg\n        :target: https://github.com/observingClouds/car_referencer/actions\n        :alt: GitHub CI status\n\n\n.. warning::\n    This project is still under development and needs further testing.\n\nSimilar to tape archive (tar) files, `content addressable archive \u003chttps://ipld.io/specs/transport/car/\u003e`_ (car) files bundle many objects into a single larger file.\nBesides being uploaded to an object store, these car files can also be used to store collections of objects on a traditional filesystem. 
A reference file system makes it possible to access these collections without extracting the individual objects.\n\n``car_referencer`` can create the needed reference file from a single ``car`` file or from multiple ``car`` files that are part of the same Merkle DAG.\n\nCommand line usage\n------------------\n\n``car_referencer`` creates the reference file in two steps. The first step identifies all available references within the provided ``car`` files (here ``carfiles.*.car``) and saves them as an index file (e.g. ``index.parquet``), which is reused if it already exists. The second step creates the reference file (e.g. ``preffs.parquet``) based on the ``ROOT-HASH``. Note that ``ROOT-HASH`` is NOT the root CID of the car file, but the root CID of the root file object. For a Zarr dataset such as ``example.zarr``, ``ROOT-HASH`` would refer to ``example.zarr`` itself.\n\n.. code-block:: bash\n\n    car_referencer -c \"carfiles.*.car\" -p preffs.parquet -r ROOT-HASH -i index.parquet\n\nThanks to https://github.com/d70-t/preffs, the created file ``preffs.parquet`` can then be opened with\n\n.. code-block:: python\n\n    import xarray as xr\n\n    ds = xr.open_zarr(\"preffs::preffs.parquet\")\n\n\nInstallation\n------------\n\n.. code-block:: bash\n\n    git clone https://github.com/observingClouds/car_referencer.git\n    cd car_referencer\n    pip install .\n\nDevelopment\n-----------\n\nTesting requires additional dependencies, including some packages written in Go. The required environment can be created with\n\n.. code-block:: bash\n\n    git clone https://github.com/observingClouds/car_referencer.git\n    cd car_referencer\n    mamba env create\n    source activate test-env\n\nCredits\n-------\n\nThis package was created with Cookiecutter_ and the `audreyr/cookiecutter-pypackage`_ project template.\n\n.. _Cookiecutter: https://github.com/audreyr/cookiecutter\n.. 
_`audreyr/cookiecutter-pypackage`: https://github.com/audreyr/cookiecutter-pypackage\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fobservingclouds%2Fcar_referencer","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fobservingclouds%2Fcar_referencer","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fobservingclouds%2Fcar_referencer/lists"}