{"id":27185032,"url":"https://github.com/threefoldtech/0-stor_v2","last_synced_at":"2026-02-17T07:39:36.682Z","repository":{"id":39742703,"uuid":"321800808","full_name":"threefoldtech/0-stor_v2","owner":"threefoldtech","description":null,"archived":false,"fork":false,"pushed_at":"2025-01-29T16:36:14.000Z","size":860,"stargazers_count":2,"open_issues_count":11,"forks_count":1,"subscribers_count":9,"default_branch":"master","last_synced_at":"2025-04-09T17:10:33.366Z","etag":null,"topics":[],"latest_commit_sha":null,"homepage":null,"language":"Rust","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"apache-2.0","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/threefoldtech.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":"CODEOWNERS","security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null}},"created_at":"2020-12-15T22:07:17.000Z","updated_at":"2025-01-21T13:40:04.000Z","dependencies_parsed_at":"2024-08-23T14:33:41.409Z","dependency_job_id":"0e335e56-ff15-412f-a69c-35191cce063e","html_url":"https://github.com/threefoldtech/0-stor_v2","commit_stats":null,"previous_names":[],"tags_count":20,"template":false,"template_full_name":null,"purl":"pkg:github/threefoldtech/0-stor_v2","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/threefoldtech%2F0-stor_v2","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/threefoldtech%2F0-stor_v2/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/threefoldtech%2F0-stor_v2/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/threefoldtech%2F0-stor_v2/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/threef
oldtech","download_url":"https://codeload.github.com/threefoldtech/0-stor_v2/tar.gz/refs/heads/master","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/threefoldtech%2F0-stor_v2/sbom","scorecard":null,"host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":286080680,"owners_count":29536918,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2026-02-17T05:00:25.817Z","status":"ssl_error","status_checked_at":"2026-02-17T04:57:16.126Z","response_time":100,"last_error":"SSL_connect returned=1 errno=0 peeraddr=140.82.121.6:443 state=error: unexpected eof while reading","robots_txt_status":"success","robots_txt_updated_at":"2025-07-24T06:49:26.215Z","robots_txt_url":"https://github.com/robots.txt","online":false,"can_crawl_api":true,"host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":[],"created_at":"2025-04-09T17:10:10.324Z","updated_at":"2026-02-17T07:39:36.670Z","avatar_url":"https://github.com/threefoldtech.png","language":"Rust","funding_links":[],"categories":[],"sub_categories":[],"readme":"# 0-stor_v2\n\n`zstor` is an object encoding storage system. It can be run in either a\ndaemon - client setup, or it can perform single actions without an\nassociated daemon, which is mainly useful for uploading/retrieving\nsingle items. The daemon is part of the same binary, and will run other\nuseful features, such as a repair queue which periodically verifies the\nintegrity of objects.\n\n## Storage and data integrity\n\nZstor uses 0-db's to store the data. 
It does so by splitting the data into chunks and distributing them over N 0-db's.\n\n```mermaid\nC4Component\ntitle Zstor setup\n\nComponent(zstor, \"Zstor instance\")\n\nDeployment_Node(zerodbgroup0,\"0-db group\", \"\"){\n    System(zerodb1,\"0-db 1\")\n    System(zerodb2,\"0-db 2\")\n}\nDeployment_Node(zerodbgroup1,\"0-db group\", \"\"){\n    System(zerodbx,\"0-db ...\") \n    System(zerodbn,\"0-db N\") \n}\n\nRel(zstor, zerodb1, \"\")\nRel(zstor, zerodb2, \"\")\nRel(zstor, zerodbx, \"\")\nRel(zstor, zerodbn, \"\")\n```\n\nZstor uses forward looking error correcting codes (FLECC) for data consistency and to protect against data loss.\n\nThis means zstor constantly tries to spread the data over N 0-db's, N being the *expected_shards*.\n\nAs long as M (*minimal_shards*) chunks of data are intact, M being smaller than N of course, zstor can recover the data.\n\n## Expected setup\n\nCurrently, `zstor` expects a stable system to start from, which is user\nprovided:\n\n- `zstor` has a redundancy configuration which introduces the notion of\n `groups`: a group is a list of 0-db's which have an inherently larger risk of going\n down together. For example, grid 0-db's which are deployed on the same farm.\n\n## Daemon - client usage vs standalone usage\n\nThe daemon, or monitor, can be started by invoking `zstor` with the\n`monitor` subcommand. This starts a long-running process, and opens up a\nunix socket on the path specified in the config. 
Regular command\ninvocations (for example \"store\") of `zstor` will then read the path to the\nunix socket from the config, connect to it, send the command, and wait\nuntil the monitor daemon returns a response after executing the command.\nThis setup is recommended as:\n\n- It exposes optional metrics for prometheus to scrape.\n- Only a single upload/download of a file happens at once, meaning you\n won't burn out your whole cpu by sending multiple upload commands in\n quick succession.\n\nIf the socket path is not specified, `zstor` will fall back to its\nsingle command flow, where it executes the command in process, and then\nexits. Invoking `zstor` multiple times in quick succession might cause\nmultiple uploads to be performed at the same time, causing multiple cpu\ncores to be used for the encryption/compression.\n\n## Current features\n\n### Supported commands\n\n- `Store` data in multiple chunks on zdb backends, according to a given policy\n- `Retrieve` said data, using just the path and the metadata store. Zdbs can be\nremoved, as long as enough are left to recover the data.\n- `Rebuild` the data, loading existing data (as long as enough zdbs are left),\nreencoding it, and storing it in (new) zdbs according to the current config\n- `Check` a file, returning a 16 byte `blake2b` checksum (in hex) if it\n is present in the backend (by fetching it from the metastore).\n- `Status`: get statistics about backends\n\n### Other features\n\n- Config file hot reloading. If the config file is edited, `SIGUSR1` can\n be sent to a running 0-stor process to reload the configuration. Only the\n backends in the new configuration are used. Data is not rebuilt on these\n new backends as long as the old backends are still operational.\n- Monitoring of active 0-db backends. 
An active backend is considered a\nbackend that is tracked in the config, which has sufficient space\n left to write new blocks.\n- Repair queue: periodically, all 0-db's used are checked, to see if they\n are still online. If a 0-db is unreachable, all objects which have a\n chunk stored on that 0-db will be rebuilt on fully healthy 0-db's.\n- Prometheus metrics. The metrics server is bound to all interfaces, on\n the port specified in the config. The path is `/metrics`. If no port\n is set in the config, the metrics server won't be enabled.\n\n## Building\n\nMake sure you have the latest Rust stable installed. Clone the repository:\n\n```shell\ngit clone https://github.com/threefoldtech/0-stor_v2\ncd 0-stor_v2\n```\n\nThen build with the standard toolchain through cargo:\n\n```shell\ncargo build\n```\n\nThis will produce the executable in `./target/debug/zstor_v2`.\n\n### Static binary\n\nOn linux, a fully static binary can be compiled by using the `x86_64-unknown-linux-musl`\ntarget, as follows:\n\n```shell\ncargo build --target x86_64-unknown-linux-musl --release\n```\n\n## Config file\n\nRunning `zstor` requires a config file. 
An example config, and\nexplanation of the parameters is found below.\n\n### Example config file\n\n```toml\nminimal_shards = 10\nexpected_shards = 15\nredundant_groups = 1\nredundant_nodes = 1\nroot = \"/virtualroot\"\nsocket = \"/tmp/zstor.sock\"\nprometheus_port = 9100\nzdb_data_dir_path = \"/tmp/0-db/data\"\nmax_zdb_data_dir_size = 25600\n\n[encryption]\nalgorithm = \"AES\"\nkey = \"0000000000000000000000000000000000000000000000000000000000000000\"\n\n[compression]\nalgorithm = \"snappy\"\n\n[meta]\ntype = \"zdb\"\n\n[meta.config]\nprefix = \"someprefix\"\n\n[meta.config.encryption]\nalgorithm = \"AES\"\nkey = \"0101010101010101010101010101010101010101010101010101010101010101\"\n\n[[meta.config.backends]]\naddress = \"[2a02:1802:5e::dead:beef]:9900\"\nnamespace = \"test2\"\npassword = \"supersecretpass\"\n\n[[meta.config.backends]]\naddress = \"[2a02:1802:5e::dead:beef]:9901\"\nnamespace = \"test2\"\npassword = \"supersecretpass\"\n\n[[meta.config.backends]]\naddress = \"[2a02:1802:5e::dead:beef]:9902\"\nnamespace = \"test2\"\npassword = \"supersecretpass\"\n\n[[meta.config.backends]]\naddress = \"[2a02:1802:5e::dead:beef]:9903\"\nnamespace = \"test2\"\npassword = \"supersecretpass\"\n\n[[groups]]\n[[groups.backends]]\naddress = \"[fe80::1]:9900\"\n\n[[groups.backends]]\naddress = \"[fe80::1]:9900\"\nnamespace = \"test\"\n\n[[groups]]\n[[groups.backends]]\naddress = \"[2a02:1802:5e::dead:babe]:9900\"\n\n[[groups.backends]]\naddress = \"[2a02:1802:5e::dead:beef]:9900\"\nnamespace = \"test2\"\npassword = \"supersecretpass\"\n```\n\n### Config file explanation\n\n- `minimal_shards`: The minimum amount of shards which are needed to recover\n    the original data.\n- `expected_shards`: The amount of shards which are generated when the data is\n    encoded. Essentially, this is the amount of shards which is needed to be able\n    to recover the data, and some disposable shards which could be lost. 
The\n    amount of disposable shards can be calculated as\n    `expected_shards - minimal_shards`.\n- `redundant_groups`: The amount of groups which one should be able to\n    lose while still being able to recover the original data.\n- `redundant_nodes`: The amount of nodes that can be lost in every group\n    while still being able to recover the original data.\n- `root`: virtual root on the filesystem to use, this path will be removed\n    from all files saved. If a file path is loaded, the path will be\n    interpreted as relative to this directory\n- `socket`: Optional path to a unix socket. This socket is required in\n    case zstor needs to run in daemon mode. If this is present, zstor\n    invocations will first try to connect to the socket. If it is not found,\n    the command is run in-process, else it is encoded and sent to the socket\n    so the daemon can process it.\n- `zdb_data_dir_path`: Optional path to the local 0-db data file directory.\n    If set, it will be monitored and kept within the size limits. This is primarily\n    used when 0-stor is running as part of a QSFS deployment. In this case, a 0-db-fs\n    instance is running, which is using a local 0-db as read/write cache. When this\n    option is set, the size of this cache is monitored, and if needed the least recently\n    accessed files are removed.\n- `max_zdb_data_dir_size`: Maximum size of the data dir in MiB, if this\n    is set and the sum of the file sizes in the data dir gets higher than\n    this value, the least used, already encoded file will be removed.\n- `zdbfs_mountpoint`: Optional path of a 0-db-fs mount. If present, a syscall\n    will be executed periodically to retrieve file system statistics, which will\n    then be exposed through the built-in prometheus server.\n- `prometheus_port`: An optional port on which prometheus metrics will be\n    exposed. If this is not set, the metrics will not get exposed.\n- `encryption`: configuration to use for the encryption stage. 
Currently\n    only `AES` is supported. The encryption `key` is 32 random bytes in hexadecimal form.\n- `compression`: configuration to use for the compression stage.\n    Currently only `snappy` is supported.\n- `meta`: configuration for the metadata store to use, currently only\n    `zdb` is supported.\n- `groups`: The backend groups to write the data to.\n\n## Metadata\n\nWhen data is encoded, metadata is generated to later retrieve this data.\nThe metadata is stored in 4 0-dbs, with a given prefix.\n\nFor every file, we get the full path of the file on the system, generate a 16 byte\nblake2b hash, and hex encode the bytes. We then append this to the prefix to\ngenerate the final key.\n\nThe key structure is: `/{prefix}/meta/{hashed_path_hex}`\n\nThe metadata itself is encrypted, binary encoded, and then dispersed in\nthe metadata 0-dbs.\n\n### Metadata cluster requirements\n\nSince the metadata is also encoded before being stored, we need to know the used\nencoding to be able to decode again. Since we can't store metadata about metadata\nitself, this is a static setup. As said, at present there are 4 metadata storage\n0-db's defined. Since the key is defined by the system, these must be run in `user`\nmode. At the moment, it is not possible to define more metadata stores as can be\ndone with regular data stores.\n\nThe actual metadata is encoded in a 2:2 setup, that is, 2 data shards and 2 parity\nshards. This allows up to 2 (i.e. half) of the metadata stores to be lost, while\nstill retaining access to the data. 
Any 2 stores can be used to recover the data,\nthere is no specific difference between them.\n\nBecause the system is designed to prioritize recoverability over availability,\nwrites will be rejected if the metadata storage is in the degraded state, that is,\nnot all 4 stores are available and writeable.\nHowever, read operations are still possible with at least two stores available.\nSimilarly, the 0-stor daemon can be started with a minimum of two stores available.\n\nA metadata store can be replaced by a new one, by removing the old one in the config\nand inserting the new one. The repair subsystem will take care of rebuilding the data,\nregenerating the shards, and storing the new shards on the new metadata store.\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fthreefoldtech%2F0-stor_v2","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fthreefoldtech%2F0-stor_v2","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fthreefoldtech%2F0-stor_v2/lists"}