{"id":15833683,"url":"https://github.com/itzmeanjan/merklize-sha","last_synced_at":"2025-03-15T07:32:05.049Z","repository":{"id":43044201,"uuid":"451434791","full_name":"itzmeanjan/merklize-sha","owner":"itzmeanjan","description":"SYCL accelerated Binary Merklization using SHA1, SHA2 \u0026 SHA3","archived":false,"fork":false,"pushed_at":"2022-03-22T05:25:04.000Z","size":249,"stargazers_count":5,"open_issues_count":0,"forks_count":0,"subscribers_count":3,"default_branch":"master","last_synced_at":"2024-10-06T13:41:34.359Z","etag":null,"topics":["binary-merklization","dpcpp","gpgpu","heterogeneous-computing","keccak256","merkle-tree","sha1","sha224","sha256","sha3","sha3-256","sha3-512","sha384","sha512","sha512-224","sha512-256","sycl"],"latest_commit_sha":null,"homepage":"","language":"C++","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"mit","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/itzmeanjan.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null}},"created_at":"2022-01-24T11:21:03.000Z","updated_at":"2024-01-30T07:36:44.000Z","dependencies_parsed_at":"2022-09-06T10:20:58.339Z","dependency_job_id":null,"html_url":"https://github.com/itzmeanjan/merklize-sha","commit_stats":null,"previous_names":[],"tags_count":0,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/itzmeanjan%2Fmerklize-sha","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/itzmeanjan%2Fmerklize-sha/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/itzmeanjan%2Fmerklize-sha/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/itzmeanjan%2Fmerklize-sha/manifests","owner_url":
"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/itzmeanjan","download_url":"https://codeload.github.com/itzmeanjan/merklize-sha/tar.gz/refs/heads/master","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":243701276,"owners_count":20333615,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["binary-merklization","dpcpp","gpgpu","heterogeneous-computing","keccak256","merkle-tree","sha1","sha224","sha256","sha3","sha3-256","sha3-512","sha384","sha512","sha512-224","sha512-256","sycl"],"created_at":"2024-10-05T13:41:35.029Z","updated_at":"2025-03-15T07:32:04.620Z","avatar_url":"https://github.com/itzmeanjan.png","language":"C++","funding_links":[],"categories":[],"sub_categories":[],"readme":"# merklize-sha\n\nSYCL accelerated Binary Merklization using SHA1, SHA2 \u0026 SHA3 ( along with keccak256 )\n\n## Motivation\n\nAfter implementing BLAKE3 using SYCL, I decided to accelerate 2-to-1 hash implementation of all variants of SHA1, SHA2 \u0026 SHA3 families of cryptographic hash functions ( along with keccak256 ). BLAKE3 lends itself pretty well to parallelization efforts, due to its inherent data parallel friendly algorithmic construction, where each 1024 -bytes chunk can be compressed independently ( read parallelly ) and finally it's a binary merklization problem with compressed chunks as leaf nodes of binary merkle tree. 
But none of the SHA1, SHA2 \u0026 SHA3 ( or keccak256 ) families of cryptographic hash functions are data parallel; they require processing each message block ( can be 512 -bit/ 1024 -bit or padded to 1600 -bit in case of SHA3 family ) sequentially. That is why I concentrated only on accelerating Binary Merklization, where SHA1/ SHA2/ SHA3 families of cryptographic ( 2-to-1 ) hash functions are used for computing all intermediate nodes of the tree when N -many leaf nodes are provided, where `N = 2 ^ i | i = {1, 2, 3 ...}`. Each of these N -many leaf nodes is a hash digest --- for example, when using the SHA2-256 variant for computing all intermediate nodes of the binary merkle tree, each provided leaf node is 32 -bytes wide, representing a SHA2-256 digest. First, the N -many leaf digests are merged into N/ 2 -many digests, which are the intermediate nodes living just above the leaf nodes. In the next phase, those N/ 2 -many intermediates are used for computing the N/ 4 -many intermediates living just above them. This process continues until the root of the merkle tree is computed. Notice that in each level of the tree, each consecutive pair of digests can be hashed independently --- that's the scope of parallelism I'd like to make use of during binary merklization. In the following depiction, when N ( = 4 ) nodes are provided as input, the two intermediates can be computed in parallel and, once they're computed, the root of the tree can be computed as a single task.\n\n```bash\n  ((a, b), (c, d))          \u003c --- [Level 1] [Root]\n     /       \\\n    /         \\\n (a, b)      (c, d)         \u003c --- [Level 2] [Intermediates]\n  / \\        /  \\\n /   \\      /    \\\na     b     c     d         \u003c --- [Level 3] [Leaves]\n```\n\nI'd also like you to note that computation of the nodes at level-i of the tree is data dependent on level-(i + 1).\n\nWhen N is a power of 2 and that many nodes are provided as input, (N - 1) -many intermediates are to be computed. 
For that reason, the memory allocated for output is the same size as the input. That means the very first few bytes ( = digest size of the hash function in use ) of the output memory allocation will not hold any node. To be more specific, if SHA2-224 is our choice of hash function, then the first 28 -bytes of the output memory allocation will not be of interest, but the next 28 -bytes chunk should hold the root of the tree once the offloaded computation finishes its execution.\n\n```bash\ninput   = [a, b, c, d]\noutput  = [0, ((a, b), (c, d)), (a, b), (c, d)]\n```\n\nIn this repository, I'm keeping binary merklization kernels, implemented in SYCL, which use SHA1/ SHA2/ SHA3 variants ( along with keccak256 ) as the 2-to-1 hash function; which one to use is a compile-time choice made using a pre-processor directive.\n\nIf you happen to be interested in Binary Merklization using Rescue Prime Hash/ BLAKE3, see the following links.\n\n- [Binary Merklization using Rescue Prime Hash](https://github.com/itzmeanjan/ff-gpu)\n- [Binary Merklization using BLAKE3](https://github.com/itzmeanjan/blake3)\n\n\u003e While implementing SHA1 \u0026 SHA2, I followed the Secure Hash Standard [specification](http://dx.doi.org/10.6028/NIST.FIPS.180-4).\n\n\u003e While implementing SHA3, I followed the SHA-3 Standard [specification](http://dx.doi.org/10.6028/NIST.FIPS.202).\n\n\u003e While implementing Keccak256, I took some inspiration from [here](https://keccak.team/files/Keccak-implementation-3.2.pdf); note that keccak256 \u0026 sha3-256 are very similar, differing only in the input message padding rule; see https://github.com/itzmeanjan/merklize-sha/pull/10 PR description.\n\n\u003e I'm also keeping an alternative implementation of keccak256 2-to-1 hash, where each 64 -bit lane of the keccak-p[1600, 24] state array is represented in terms of two 32 -bit unsigned integers ( in bit interleaved form ) and applying the rounds involves only 32 -bit bitwise operations. 
This implementation is motivated by section 2.1 of [this](https://keccak.team/files/Keccak-implementation-3.2.pdf) document. *If interested, see the keccak256 2-to-1 hash implementation using 32 -bit word size [here](https://github.com/itzmeanjan/merklize-sha/blob/12f61fa52b5eb2d674a4dafd124585a9a76dae52/include/keccak_256.hpp#L232-L257)*. **This same implementation guided me while writing the keccak256 2-to-1 hash function in Polygon Miden Assembly**; see [PR](https://github.com/maticnetwork/miden/pull/154).\n\n\u003e Using SHA1 for binary merklization may not be a good choice these days; see [here](https://csrc.nist.gov/Projects/Hash-Functions/NIST-Policy-on-Hash-Functions). Still, I'm keeping the SHA1 implementation as a reference.\n\n## Prerequisites\n\n- I'm using \n\n```bash\n$ lsb_release -d\n\nDescription:    Ubuntu 20.04.3 LTS\n```\n\n- You should have Intel's DPCPP compiler, an open-source LLVM-based implementation of the SYCL specification; see [here](https://www.intel.com/content/www/us/en/developer/tools/oneapi/base-toolkit-download.html)\n\n```bash\n$ dpcpp --version\n\nIntel(R) oneAPI DPC++/C++ Compiler 2022.0.0 (2022.0.0.20211123)\nTarget: x86_64-unknown-linux-gnu\nThread model: posix\nInstalledDir: /opt/intel/oneapi/compiler/2022.0.2/linux/bin-llvm\n```\n\n- If you're planning to target Nvidia GPU, I suggest you compile the aforementioned toolchain from source; see [here](https://intel.github.io/llvm-docs/GetStartedGuide.html#prerequisites)\n\n```bash\n$ clang++ --version\n\nclang version 14.0.0 (https://github.com/intel/llvm c690ac8d771e8bb1a1be651872b782f4044d936c)\nTarget: x86_64-unknown-linux-gnu\nThread model: posix\nInstalledDir: /home/ubuntu/sycl_workspace/llvm/build/bin\n```\n\n- You will also need the `make` utility for easily running the compilation flow; `clang-format` can also be helpful for source formatting.\n- Another useful tool to have is `sycl-info`, for quickly checking details of the available SYCL implementation; 
see [here](https://github.com/codeplaysoftware/sycl-info)\n\n## Usage\n\nIf you happen to be interested in 2-to-1 hash implementation of\n\n- [SHA1](https://github.com/itzmeanjan/merklize-sha/blob/fd76b7a/example/sha1.cpp)\n- [SHA2-224](https://github.com/itzmeanjan/merklize-sha/blob/fd76b7a/example/sha2_224.cpp)\n- [SHA2-256](https://github.com/itzmeanjan/merklize-sha/blob/fd76b7a/example/sha2_256.cpp)\n- [SHA2-384](https://github.com/itzmeanjan/merklize-sha/blob/fd76b7a/example/sha2_384.cpp)\n- [SHA2-512](https://github.com/itzmeanjan/merklize-sha/blob/fd76b7a/example/sha2_512.cpp)\n- [SHA2-512/224](https://github.com/itzmeanjan/merklize-sha/blob/fd76b7a/example/sha2_512_224.cpp)\n- [SHA2-512/256](https://github.com/itzmeanjan/merklize-sha/blob/fd76b7a/example/sha2_512_256.cpp)\n- [SHA3-224](https://github.com/itzmeanjan/merklize-sha/blob/8f9b168/example/sha3_224.cpp)\n- [SHA3-256](https://github.com/itzmeanjan/merklize-sha/blob/8f9b168/example/sha3_256.cpp)\n- [SHA3-384](https://github.com/itzmeanjan/merklize-sha/blob/8f9b168/example/sha3_384.cpp)\n- [SHA3-512](https://github.com/itzmeanjan/merklize-sha/blob/8f9b168/example/sha3_512.cpp)\n- [KECCAK-256](https://github.com/itzmeanjan/merklize-sha/blob/fb41136/example/keccak_256.cpp)\n\nwhere two digests of respective hash functions are input, in byte concatenated form, to `hash( ... 
)` function, consider taking a look at the examples hyperlinked above.\n\n\u003e Compile the above examples using `dpcpp -fsycl example/\u003cfile\u003e.cpp -I./include`\n\nYou may also want to see how the binary merklization kernels use these 2-to-1 hash functions; see [here](https://github.com/itzmeanjan/merklize-sha/blob/ddb7ac9/include/merklize.hpp)\n\n## Tests\n\nEach hash function implementation, along with binary merklization using it, is accompanied by test cases, which can be executed as\n\n```bash\nbash run.sh\n```\n\n## Benchmarks\n\nFor benchmarking binary merklization, I take randomly generated N -many leaf nodes as input and explicitly transfer them to the accelerator's memory, compute all (N - 1) -many intermediate nodes, and finally transfer those back to host memory. This flow is executed 8 times for a given N, and the kernel execution and host \u003c-\u003e device data transfer times are then averaged.\n\nI'm keeping binary merklization benchmark results of\n\n- SHA1\n  - [Nvidia GPU(s)](results/sha1/nvidia_gpu.md)\n  - [Intel CPU(s)](results/sha1/intel_cpu.md)\n  - [Intel GPU(s)](results/sha1/intel_gpu.md)\n- SHA2-224\n  - [Nvidia GPU(s)](results/sha2-224/nvidia_gpu.md)\n  - [Intel CPU(s)](results/sha2-224/intel_cpu.md)\n  - [Intel GPU(s)](results/sha2-224/intel_gpu.md)\n- SHA2-256\n  - [Nvidia GPU(s)](results/sha2-256/nvidia_gpu.md)\n  - [Intel CPU(s)](results/sha2-256/intel_cpu.md)\n  - [Intel GPU(s)](results/sha2-256/intel_gpu.md)\n- SHA2-384\n  - [Nvidia GPU(s)](results/sha2-384/nvidia_gpu.md)\n  - [Intel CPU(s)](results/sha2-384/intel_cpu.md)\n  - [Intel GPU(s)](results/sha2-384/intel_gpu.md)\n- SHA2-512\n  - [Nvidia GPU(s)](results/sha2-512/nvidia_gpu.md)\n  - [Intel CPU(s)](results/sha2-512/intel_cpu.md)\n  - [Intel GPU(s)](results/sha2-512/intel_gpu.md)\n- SHA2-512/224\n  - [Nvidia GPU(s)](results/sha2-512-224/nvidia_gpu.md)\n  - [Intel CPU(s)](results/sha2-512-224/intel_cpu.md)\n  - [Intel GPU(s)](results/sha2-512-224/intel_gpu.md)\n- 
SHA2-512/256\n  - [Nvidia GPU(s)](results/sha2-512-256/nvidia_gpu.md)\n  - [Intel CPU(s)](results/sha2-512-256/intel_cpu.md)\n  - [Intel GPU(s)](results/sha2-512-256/intel_gpu.md)\n- SHA3-256\n  - [Nvidia GPU(s)](results/sha3-256/nvidia_gpu.md)\n  - [Intel CPU(s)](results/sha3-256/intel_cpu.md)\n  - [Intel GPU(s)](results/sha3-256/intel_gpu.md)\n- SHA3-224\n  - [Nvidia GPU(s)](results/sha3-224/nvidia_gpu.md)\n  - [Intel CPU(s)](results/sha3-224/intel_cpu.md)\n  - [Intel GPU(s)](results/sha3-224/intel_gpu.md)\n- SHA3-384\n  - [Nvidia GPU(s)](results/sha3-384/nvidia_gpu.md)\n  - [Intel CPU(s)](results/sha3-384/intel_cpu.md)\n  - [Intel GPU(s)](results/sha3-384/intel_gpu.md)\n- SHA3-512\n  - [Nvidia GPU(s)](results/sha3-512/nvidia_gpu.md)\n  - [Intel CPU(s)](results/sha3-512/intel_cpu.md)\n  - [Intel GPU(s)](results/sha3-512/intel_gpu.md)\n- KECCAK-256\n  - [Nvidia GPU(s)](results/keccak-256/nvidia_gpu.md)\n  - [Intel CPU(s)](results/keccak-256/intel_cpu.md)\n  - [Intel GPU(s)](results/keccak-256/intel_gpu.md)\n\nobtained after executing them on multiple accelerators.\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fitzmeanjan%2Fmerklize-sha","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fitzmeanjan%2Fmerklize-sha","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fitzmeanjan%2Fmerklize-sha/lists"}