{"id":21598503,"url":"https://github.com/aviksaikat/bmt-py","last_synced_at":"2025-10-26T04:34:57.634Z","repository":{"id":230269294,"uuid":"778948475","full_name":"Aviksaikat/bmt-py","owner":"Aviksaikat","description":"Binary Merkle Tree operations on data","archived":false,"fork":false,"pushed_at":"2024-10-28T09:04:14.000Z","size":81682,"stargazers_count":3,"open_issues_count":1,"forks_count":2,"subscribers_count":2,"default_branch":"main","last_synced_at":"2025-04-11T01:04:44.300Z","etag":null,"topics":["algorithm","bmt","hatch","python3","swarm"],"latest_commit_sha":null,"homepage":"https://aviksaikat.github.io/bmt-py/","language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"bsd-3-clause","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/Aviksaikat.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":".github/funding.yml","license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null,"zenodo":null},"funding":{"github":["aviksaikat"],"buy_me_a_coffee":"aviksaikat","patreon":"CyberPuzzlePros","polar":"aviksaikat","custom":["https://www.paypal.me/aviksaikat007"]}},"created_at":"2024-03-28T18:17:09.000Z","updated_at":"2024-12-21T07:36:28.000Z","dependencies_parsed_at":"2024-11-24T23:31:19.990Z","dependency_job_id":null,"html_url":"https://github.com/Aviksaikat/bmt-py","commit_stats":null,"previous_names":["aviksaikat/bmt-py"],"tags_count":3,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/Aviksaikat%2Fbmt-py","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/Aviksaikat%2Fbmt-py/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/Aviksaikat%2F
bmt-py/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/Aviksaikat%2Fbmt-py/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/Aviksaikat","download_url":"https://codeload.github.com/Aviksaikat/bmt-py/tar.gz/refs/heads/main","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":248322597,"owners_count":21084336,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["algorithm","bmt","hatch","python3","swarm"],"created_at":"2024-11-24T18:12:23.389Z","updated_at":"2025-10-26T04:34:57.582Z","avatar_url":"https://github.com/Aviksaikat.png","language":"Python","funding_links":["https://github.com/sponsors/aviksaikat","https://buymeacoffee.com/aviksaikat","https://patreon.com/CyberPuzzlePros","https://polar.sh/aviksaikat","https://www.paypal.me/aviksaikat007"],"categories":[],"sub_categories":[],"readme":"# Bmt-py\n\n\u003cp align=\"center\"\u003e\n    \u003cem\u003eBinary Merkle Tree operations on data\u003c/em\u003e\n\u003c/p\u003e\n\n\u003cdiv align=\"center\"\u003e\n\n| Feature       | Value                     |\n| ------------- | 
-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- |\n| Technology    | [![Python](https://img.shields.io/badge/Python-3776AB.svg?style=flat\u0026logo=Python\u0026logoColor=white)](https://www.python.org/) [![Hatch project](https://img.shields.io/badge/%F0%9F%A5%9A-Hatch-4051b5.svg)](https://github.com/pypa/hatch) [![GitHub Actions](https://img.shields.io/badge/GitHub%20Actions-2088FF.svg?style=flat\u0026logo=GitHub-Actions\u0026logoColor=white)](https://github.com/features/actions) [![Pytest](https://img.shields.io/badge/Pytest-0A9EDC.svg?style=flat\u0026logo=Pytest\u0026logoColor=white)](https://github.com/aviksaikat/bmt-py/actions/workflows/tests.yml/badge.svg)                           |\n| Type Checking | [![Ruff](https://img.shields.io/endpoint?url=https://raw.githubusercontent.com/astral-sh/ruff/main/assets/badge/v2.json)](https://github.com/astral-sh/ruff) [![Checked with mypy](http://www.mypy-lang.org/static/mypy_badge.svg)](http://mypy-lang.org/)                                                                                                                                                                                                                                                                                                                                                                                                          
                                           |\n| CI/CD         | [![Build](https://github.com/Aviksaikat/bmt-py/actions/workflows/build.yml/badge.svg)](https://github.com/Aviksaikat/bmt-py/actions/workflows/build.yml) [![Tests](https://github.com/aviksaikat/bmt-py/actions/workflows/tests.yml/badge.svg)](https://github.com/aviksaikat/bmt-py/actions/workflows/tests.yml) [![Labeler](https://github.com/aviksaikat/bmt-py/actions/workflows/labeler.yml/badge.svg)](https://github.com/aviksaikat/bmt-py/actions/workflows/labeler.yml) [![pre-commit](https://img.shields.io/badge/pre--commit-enabled-brightgreen?logo=pre-commit\u0026logoColor=white)](https://github.com/pre-commit/pre-commit) [![codecov](https://codecov.io/gh/Aviksaikat/bmt-py/graph/badge.svg?token=ISTIW37DO6)](https://codecov.io/gh/Aviksaikat/bmt-py)                                                                                                                                                                                                           |\n| Docs          | [![Docs](https://github.com/Aviksaikat/bmt-py/actions/workflows/documentation.yml/badge.svg)](https://github.com/Aviksaikat/bmt-py/actions/workflows/build.yml)                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                               |\n| Package       | [![PyPI - Version](https://img.shields.io/pypi/v/bmt_py.svg)](https://pypi.org/project/bmt_py/) [![PyPI - Python Version](https://img.shields.io/pypi/pyversions/bmt_py)](https://pypi.org/project/bmt_py/) [![PyPI - 
License](https://img.shields.io/pypi/l/bmt_py)](https://pypi.org/project/bmt_py/)                                                                                                                                                                                                                                                                                                                                                                                                        |\n| Meta          | [![GitHub license](https://img.shields.io/github/license/aviksaikat/bmt-py?style=flat\u0026color=1573D5)](https://github.com/aviksaikat/bmt-py/blob/main/LICENSE) [![GitHub last commit](https://img.shields.io/github/last-commit/aviksaikat/bmt-py?style=flat\u0026color=1573D5)](https://github.com/aviksaikat/bmt-py/commits/main) [![GitHub commit activity](https://img.shields.io/github/commit-activity/m/aviksaikat/bmt-py?style=flat\u0026color=1573D5)](https://github.com/aviksaikat/bmt-py/graphs/commit-activity) [![GitHub top language](https://img.shields.io/github/languages/top/aviksaikat/bmt-py?style=flat\u0026color=1573D5)](https://github.com/aviksaikat/bmt-py) |\n\n\u003c/div\u003e\n\n# Installation\n- Install using `pip`\n```py\npip install bmt_py\n```\n\n\u003cdetails open\u003e\n\u003csummary\u003eUsage\u003c/summary\u003e\n\u003cbr\u003e\n\n# Usage\n\n```py\n\u003e\u003e\u003e from bmt_py import make_chunk\n\n\u003e\u003e\u003e payload = bytes([1, 2, 3])\n\u003e\u003e\u003e chunk = make_chunk(payload)\n\u003e\u003e\u003e result = chunk.address()\n\u003e\u003e\u003e print(bytes_to_hex(result, 64))\n# ca6357a08e317d15ec560fef34e4c45f8f19f01c372aa70f1da72bfa7f1a4338\n```\n\n- Chunking with Payload Lesser Than 4KB\n```py\nfrom bmt_py import make_chunked_file\npayload = bytes([1, 2, 3])\nchunked_file = make_chunked_file(payload)\n\nprint(len(chunked_file.leaf_chunks()))\n# 1\nonly_chunk = chunked_file.leaf_chunks()[0]\nonly_chunk.span() == chunked_file.span()\n# 
True\nonly_chunk.address() == chunked_file.address()\n# True\n```\n\n- Chunking with Payload Greater Than 4KB\n```py\nfrom bmt_py import make_chunked_file, get_span_value, bytes_to_hex\nwith open(\"The-Book-of-Swarm.pdf\", \"rb\") as f:\n    file_bytes = f.read()\nchunked_file = make_chunked_file(file_bytes)\n\nprint(get_span_value(chunked_file.span()))\n# 15726634\ntree = chunked_file.bmt()\nprint(len(tree))\n# 3\nprint(len(tree[2])) # last level only contains the root_chunk\n# 1\n\nroot_chunk = tree[2][0]\nsecond_level_first_chunk = tree[1][0]  # first intermediate chunk on the first intermediate chunk level\nroot_chunk.payload[:32] == second_level_first_chunk.address()\n# True\nprint(len(second_level_first_chunk.payload))\n# 4096\n\nprint(bytes_to_hex(chunked_file.address(), 64))\n# b8d17f296190ccc09a2c36b7a59d0f23c4479a3958c3bb02dc669466ec919c5d\n```\n\n\n```py\ndef test_collect_required_segments_for_inclusion_proof():\n    with open(\"carrier-chunk-blob\", \"rb\") as f:\n        file_bytes = f.read()\n    chunked_file = make_chunked_file(file_bytes)\n    file_hash = chunked_file.address()\n\n    # Segment to prove\n    segment_index = (len(file_bytes) - 1) // 32\n\n    # Check segment array length for carrierChunk inclusion proof\n    proof_chunks = file_inclusion_proof_bottom_up(chunked_file, segment_index)\n    assert len(proof_chunks) == 2  # 1 level is skipped because the segment was in a carrierChunk\n\n    def test_get_file_hash(segment_index):\n        proof_chunks = file_inclusion_proof_bottom_up(chunked_file, segment_index)\n        prove_segment = file_bytes[segment_index * SEGMENT_SIZE : segment_index * SEGMENT_SIZE + SEGMENT_SIZE]\n        # Padding\n        prove_segment += bytearray(SEGMENT_SIZE - len(prove_segment))\n\n        # Check the last segment has the correct span value.\n        file_size_from_proof = get_span_value(proof_chunks[-1].span)\n        assert file_size_from_proof == len(file_bytes)\n\n        return 
file_address_from_inclusion_proof(proof_chunks, prove_segment, segment_index)\n\n    # Edge case\n    hash1 = test_get_file_hash(segment_index)\n    assert hash1 == file_hash\n    hash2 = test_get_file_hash(1000)\n    assert hash2 == file_hash\n```\n\n\n```py\ndef test_collect_required_segments_for_inclusion_proof_2():\n    with open(\"The-Book-of-Swarm.pdf\", \"rb\") as f:\n        file_bytes = f.read()\n    chunked_file = make_chunked_file(file_bytes)\n    file_hash = chunked_file.address()\n\n    # Segment to prove\n    last_segment_index = (len(file_bytes) - 1) // 32\n\n    def test_get_file_hash(segment_index):\n        proof_chunks = file_inclusion_proof_bottom_up(chunked_file, segment_index)\n        prove_segment = file_bytes[segment_index * SEGMENT_SIZE : segment_index * SEGMENT_SIZE + SEGMENT_SIZE]\n        # Padding\n        prove_segment += bytearray(SEGMENT_SIZE - len(prove_segment))\n\n        # Check the last segment has the correct span value.\n        file_size_from_proof = get_span_value(proof_chunks[-1].span)\n        assert file_size_from_proof == len(file_bytes)\n\n        return file_address_from_inclusion_proof(proof_chunks, prove_segment, segment_index)\n\n    # Edge case\n    hash1 = test_get_file_hash(last_segment_index)\n    assert hash1 == file_hash\n    hash2 = test_get_file_hash(1000)\n    assert hash2 == file_hash\n    with pytest.raises(Exception, match=r\"^The given segment index\"):\n        test_get_file_hash(last_segment_index + 1)\n```\n\n\n```py\ndef test_collect_required_segments_for_inclusion_proof_3():\n    # the file's byte counts will cause carrier chunk in the intermediate BMT level\n    # 128 * 4096 * 128 = 67108864 \u003c- left tree is saturated on bmt level 1\n    # 67108864 + 2 * 4096 = 67117056 \u003c- add two full chunks at the end thereby\n    # the zero level won't have carrier chunk, but its parent will be that.\n    with open(\"carrier-chunk-blob-2\", \"rb\") as f:\n        carrier_chunk_file_bytes_2 = 
f.read()\n    assert len(carrier_chunk_file_bytes_2) == 67117056\n\n    file_bytes = carrier_chunk_file_bytes_2\n    chunked_file = make_chunked_file(file_bytes)\n    file_hash = chunked_file.address()\n    # segment to prove\n    last_segment_index = (len(file_bytes) - 1) // 32\n\n    def test_get_file_hash(segment_index):\n        proof_chunks = file_inclusion_proof_bottom_up(chunked_file, segment_index)\n        prove_segment = file_bytes[segment_index * SEGMENT_SIZE : (segment_index * SEGMENT_SIZE) + SEGMENT_SIZE]\n        # padding\n        prove_segment = prove_segment.ljust(SEGMENT_SIZE, b\"\\0\")\n\n        # check the last segment has the correct span value.\n        file_size_from_proof = get_span_value(proof_chunks[-1].span)\n        assert file_size_from_proof == len(file_bytes)\n\n        return file_address_from_inclusion_proof(proof_chunks, prove_segment, segment_index)\n\n    # edge case\n    hash1 = test_get_file_hash(last_segment_index)\n    assert hash1 == file_hash\n    hash2 = test_get_file_hash(1000)\n    assert hash2 == file_hash\n    with pytest.raises(Exception, match=r\"^The given segment index\"):\n        test_get_file_hash(last_segment_index + 1)\n```\n\n\n- More examples are [here](https://aviksaikat.github.io/bmt-py/reference/Usage/).\n\n\u003c/details\u003e\n\n\n\n\u003cdetails close\u003e\n\u003csummary\u003eHow it works\u003c/summary\u003e\n\u003cbr\u003e\n\n# How it works\n\nFirst, it splits the data into `chunks` with a maximum payload of 4KB by default. This limit can be modified, as can the byte length of the `span` (8 bytes by default), which indicates how many payload bytes are subsumed under the chunk.\n\nIf the payload length is not an exact multiple of the chunk size, the rightmost chunk's data is padded with zeros so that the BMT operations work on fixed-length data.\n\nThis basic unit is also required to effectively distribute data on decentralized storage systems with regard to _plausible deniability_, _garbage 
collection_, _load balancing_ and more.\nFor more details, please visit the [Ethereum Swarm]() webpage, which has a full implementation of this logic.\n\nThe hashing algorithm used is the `keccak256` function, which produces a 32-byte `segment`.\n\nPerforming BMT hashing on the chunk data defines the `BMT root hash` of the chunk.\nThen, for integrity, the chunk's span is prepended to the BMT root hash and the result is hashed, which yields the `Chunk address`.\n\n![BMT Hashing](./docs/bmt-hashing.png)\n\nTo refer to files with a single 32-byte segment as well, the chunk addresses of the payload have to be hashed in the same way until the `File address` is reached:\n\nChunks can encapsulate 128 chunk addresses on the subsequent BMT tree level by default. These kinds of chunks are called `Intermediate chunks`.\nBy the properties of a BMT tree, the chunks end in a `Root chunk` that refers to all chunks below (directly or indirectly), and its address is the `File address` as well.\n\nNote that a BMT level can have an orphan chunk on its rightmost side that cannot be hashed with a neighbour chunk, because it does not have one (e.g. 
129/129 chunk).\nWhen this occurs, it makes no sense to hash the orphan chunk on every BMT level, since it would be BMT hashed with zero data.\nThat's why the algorithm handles the orphan chunk as a `Carrier chunk` and tries to place it on the BMT tree level where its chunk address can be encapsulated with other addresses.\n\nThis BMT hashing of data makes it possible to reference any file with a uniform, unique 32-byte address, which is called _content addressing_.\n\n![File BMT calculation](./docs/file-bmt.png)\n\nIt also makes it possible to perform lightweight _compact inclusion proofs_ on data.\nSuch a proof requires only a small amount of data to prove whether a particular segment (32 bytes) of the data is present at a particular offset under the file address.\n\nThis feature makes it possible to build logic around data referenced by 32-byte file addresses where the data segment values have to meet some conditions.\nThe first and best use case for this is smart contracts that implement validation functions checking that the provided `inclusion proof segments` are indeed subsumed under the committed file references.\n\nTo get these inclusion segments, the library collects all required segments from the BMT tree, which can be used as input for smart contract validation parameters.\n\n![Inclusion proof](./docs/inclusion-proof.png)\n\n\u003c/details\u003e\n\n\n---\n\n**Documentation**: \u003ca href=\"https://aviksaikat.github.io/bmt-py/\" target=\"_blank\"\u003ehttps://aviksaikat.github.io/bmt-py/\u003c/a\u003e\n\n**Source Code**: \u003ca href=\"https://github.com/aviksaikat/bmt-py\" target=\"_blank\"\u003ehttps://github.com/Aviksaikat/bmt-py\u003c/a\u003e\n\n---\n\n\u003cdetails close\u003e\n\u003csummary\u003eDevelopment\u003c/summary\u003e\n\u003cbr\u003e\n\n## Development\n\n### Setup environment\n\nWe use [Hatch](https://hatch.pypa.io/latest/install/) to manage the development environment and production build. 
Ensure it's installed on your system.\n\n### Run unit tests\n\nYou can run all the tests with:\n\n```bash\nhatch run test\n```\n\n### Format the code\n\nExecute the following command to apply linting and check typing:\n\n```bash\nhatch run lint\n```\n\n### Publish a new version\n\nYou can bump the version, create a commit and an associated tag with one command:\n\n```bash\nhatch version patch\n```\n\n```bash\nhatch version minor\n```\n\n```bash\nhatch version major\n```\n\nYour default Git text editor will open so you can add information about the release.\n\nWhen you push the tag to GitHub, the workflow will automatically publish it on PyPI and a GitHub release will be created as a draft.\n\n### Serve the documentation\n\nYou can serve the MkDocs documentation with:\n\n```bash\nhatch run docs-serve\n```\n\n\u003c/details\u003e\n\n## License\n\nThis project is licensed under the terms of the [BSD-3-Clause](./LICENSE) license.\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Faviksaikat%2Fbmt-py","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Faviksaikat%2Fbmt-py","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Faviksaikat%2Fbmt-py/lists"}