{"id":26493039,"url":"https://github.com/benalexau/hash-bench","last_synced_at":"2025-03-20T09:37:50.041Z","repository":{"id":49362163,"uuid":"43110529","full_name":"benalexau/hash-bench","owner":"benalexau","description":"Java Hashing, CRC and Checksum Benchmark (JMH)","archived":false,"fork":false,"pushed_at":"2021-03-31T20:49:59.000Z","size":9661,"stargazers_count":64,"open_issues_count":4,"forks_count":13,"subscribers_count":7,"default_branch":"master","last_synced_at":"2023-03-16T16:10:17.561Z","etag":null,"topics":["benchmark","crc","hash","java","jmh"],"latest_commit_sha":null,"homepage":"","language":"Java","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"mit","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/benalexau.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE.txt","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null}},"created_at":"2015-09-25T04:58:26.000Z","updated_at":"2022-12-23T15:09:32.000Z","dependencies_parsed_at":"2022-09-04T14:02:24.201Z","dependency_job_id":null,"html_url":"https://github.com/benalexau/hash-bench","commit_stats":null,"previous_names":[],"tags_count":null,"template":null,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/benalexau%2Fhash-bench","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/benalexau%2Fhash-bench/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/benalexau%2Fhash-bench/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/benalexau%2Fhash-bench/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/benalexau","download_url":"https://codeload.github.com/benalexau/hash-bench/tar.gz/refs/heads/master","host":{"name":"GitHub","url":"htt
ps://github.com","kind":"github","repositories_count":244588287,"owners_count":20477297,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["benchmark","crc","hash","java","jmh"],"created_at":"2025-03-20T09:37:49.517Z","updated_at":"2025-03-20T09:37:50.034Z","avatar_url":"https://github.com/benalexau.png","language":"Java","funding_links":[],"categories":[],"sub_categories":[],"readme":"## Overview\nHash-Bench provides a JMH (Java microbenchmark harness) and\n[published results](#results) for 114 Java implementations of major\nhash, CRC and checksum algorithms. These include:\n\n* [Adler32](https://en.wikipedia.org/wiki/Adler-32)\n* [BSD Checksum](https://en.wikipedia.org/wiki/BSD_checksum)\n* [CityHash](https://en.wikipedia.org/wiki/CityHash)\n* [cksum](https://en.wikipedia.org/wiki/Cksum)\n* [crc16](https://en.wikipedia.org/wiki/Crc16)\n* [crc24](https://en.wikipedia.org/wiki/Cyclic_redundancy_check#Standards_and_common_use)\n* [crc32](https://en.wikipedia.org/wiki/Crc32)\n* [crc64](https://en.wikipedia.org/wiki/Crc64)\n* [Ed2k Hash](https://en.wikipedia.org/wiki/Ed2k_URI_scheme#eD2k_hash_algorithm)\n* [ElfHash](https://en.wikipedia.org/wiki/PJW_hash_function)\n* [FarmHash](https://github.com/google/farmhash)\n* [Fast Frame Check Sequence (FCS-16)](http://www.ietf.org/rfc/rfc1331.txt)\n* [Good Fast Hash](https://github.com/google/guava/wiki/HashingExplained)\n* [GOST](https://en.wikipedia.org/wiki/GOST_(hash_function))\n* [HAS-160](https://en.wikipedia.org/wiki/HAS-160)\n* [HAVAL](https://en.wikipedia.org/wiki/HAVAL)\n* 
[MD2](https://en.wikipedia.org/wiki/MD2_(cryptography))\n* [MD4](https://en.wikipedia.org/wiki/MD4)\n* [MD5](https://en.wikipedia.org/wiki/MD5)\n* [MurmurHash](https://en.wikipedia.org/wiki/MurmurHash)\n* [RIPEMD](https://en.wikipedia.org/wiki/RIPEMD) (128, 160, 256 and 320)\n* [SHA-1](https://en.wikipedia.org/wiki/SHA-1) (including SHA-0)\n* [SHA-2](https://en.wikipedia.org/wiki/SHA-2) (SHA-224, 256, 384, 512, 512/t)\n* [SHA-3](https://en.wikipedia.org/wiki/SHA-3)\n* [SipHash](https://en.wikipedia.org/wiki/SipHash)\n* [Skein](https://en.wikipedia.org/wiki/Skein_(hash_function)) (256, 512, 1024)\n* [SM3](http://tools.ietf.org/html/draft-shen-sm3-hash-00)\n* [Sum](https://en.wikipedia.org/wiki/List_of_hash_functions#Checksums)\n* [SYSV Checksum](https://en.wikipedia.org/wiki/SYSV_checksum)\n* [Tiger](https://en.wikipedia.org/wiki/Tiger_(cryptography)) (including Tiger 2)\n* [Whirlpool](https://en.wikipedia.org/wiki/Whirlpool_(cryptography))\n* [xor8](https://en.wikipedia.org/wiki/Longitudinal_redundancy_check)\n* [xxHash](https://github.com/Cyan4973/xxHash) (both XXH32 and XXH64)\n\nImplementations tested:\n\n* [Bouncy Castle](http://bouncycastle.org/java.html)\n* Forward Engineering [SipHash_2_4](http://www.forward.com.au/pfod/SipHashJavaLibrary/index.html)\n* Google [Guava](https://github.com/google/guava/wiki/HashingExplained)\n* Inline [siphash-java-inline](https://github.com/nahi/siphash-java-inline)\n* Johann Löfflmann [Jacksum](http://www.jonelo.de/java/jacksum/)\n* JRE [Adler32](https://docs.oracle.com/javase/8/docs/api/java/util/zip/Adler32.html)\n* JRE [CRC32](https://docs.oracle.com/javase/8/docs/api/java/util/zip/CRC32.html)\n* Adrien Grand (@jpountz) [xxHash for Java](https://github.com/jpountz/lz4-java)\n* OpenHFT [Zero Allocation Hashing](https://github.com/OpenHFT/Zero-Allocation-Hashing)\n\n## Results\nA wide variety of plots are generated, including by byte slice length,\nby hash specification, and by implementation. 
This should let you determine the\nlowest latency hash for your target byte size, compare the different\nimplementations of that hash, and evaluate how well an implementation responds\nto different input types (eg ``byte[]`` vs ``ByteBuffer``) and lengths.\n\nAn example plot is below, but there are [many more](results/5/README.md):\n\n![Results](results/5/2048.png)\n\n| Date       | Processor     | JVM              | Results Link             |\n| ---------- | ------------- | ---------------- | ------------------------ |\n| 2015-09-24 | Xeon E5-2667  | OpenJDK 1.8.0_60 | [1](results/1/README.md) |\n| 2015-09-26 | Xeon E5-2667  | OpenJDK 1.8.0_60 | [2](results/2/README.md) |\n| 2015-09-30 | Xeon E5-2667  | OpenJDK 1.8.0_60 | [3](results/3/README.md) |\n| 2015-10-01 | Xeon E5-2667  | OpenJDK 1.8.0_60 | [4](results/4/README.md) |\n| 2015-12-04 | Xeon E5-2667  | OpenJDK 1.8.0_66 | [5](results/5/README.md) |\n\n## Scope\nThis project is focused on JVM performance.\n\nIt does not test the accuracy of the implementations.\n\nPlease remember that latency is only one consideration. There is considerable\nvariation between hash algorithms. Some common variations include:\n\n* Overall hash quality (eg see [SMHasher](http://code.google.com/p/smhasher/))\n* Lack of [cryptographic hash](https://en.wikipedia.org/wiki/Cryptographic_hash_function) support\n* Whether a hash remains consistent across process restarts or not\n* Guarantees around machine-dependence (eg byte order)\n* Output length (and associated storage and/or transmission costs)\n\n## Test Configuration\nImplementations vary considerably with respect to the inputs they are able to\nhash. The most basic support is hashing an entire ``byte[]``. 
Most\nimplementations permit an offset and length to be nominated for a ``byte[]``.\nSome implementations offer\n[ByteBuffer](http://docs.oracle.com/javase/8/docs/api/java/nio/ByteBuffer.html)\nawareness, others use ``Unsafe``, and some delegate to native code via JNI.\n\nMy motivating use case for developing this benchmark required hashing\nvariable-length messages from a proprietary framed IO stream. As such this\nbenchmark populates a 64 KB buffer with random bytes and then requires each\nimplementation to hash a particular slice of that buffer. In order to provide\neach implementation with the best opportunity to efficiently hash such\nIO-sourced slices, the following three scenarios are benchmarked:\n\n* ``byte[]`` from a given offset for a given length\n* Array-backed ``ByteBuffer`` from a given offset for a given length\n* Direct (native) ``ByteBuffer`` from a given offset for a given length\n\nThe [adapter pattern](https://en.wikipedia.org/wiki/Adapter_pattern) is used to\nabstract each implementation. This ensures each implementation is tested in the\nsame manner and by the same harness. Each adapter implementation contains the\nminimal logic required to support the above three scenarios. For some of the\nsimpler implementations it was necessary to copy bytes into a dedicated\n``byte[]`` or prepare a ``ByteBuffer`` view.\n\n## Preparation\nUntil [xxHash for Java](https://github.com/jpountz/lz4-java) 1.4 is released,\nplease clone and build it locally to access the latest buffer fixes. 
Then\nedit the ``hash-bench/pom.xml`` to reflect the locally-installed snapshot.\n\nHash-Bench also requires [Jacksum](http://www.jonelo.de/java/jacksum/).\nJacksum is not in any Maven repository, so download it, unzip it, then run:\n\n    mvn install:install-file -Dfile=jacksum.jar -DgroupId=jonelo.jacksum -DartifactId=jacksum -Dversion=1.7 -Dpackaging=jar\n    mv jacksum-src.zip jacksum-sources.jar\n    mvn deploy:deploy-file -Dfile=jacksum-sources.jar -DgroupId=jonelo.jacksum -DartifactId=jacksum -Dversion=1.7 -Dpackaging=jar -Dclassifier=sources -Durl=file://$HOME/.m2/repository/\n\n## Running\nYou'll need at least Java 8 and Maven 3 installed. Then:\n\n    cd hash-bench\n    mvn clean package\n    java -jar target/benchmarks.jar\n\nThis will run in default mode, testing all known libraries and input lengths.\nThis takes roughly 15 hours with server-grade (Xeon E5-2667) hardware.\n\nYou can append ``-h`` to the ``java -jar`` line for JMH help. For example, use:\n\n  * ``-wi 0`` to run zero warm-ups (not recommended)\n  * ``-i 1`` to run one iteration only (not recommended)\n  * ``-f 1`` to run one fork only (not recommended)\n  * ``-p length=8,1024`` to test input lengths of 8 and 1024 only\n  * ``-p algo=xxh64-zah,xxh64-jpountz-unsafe`` to test two XXH64 implementations\n  * ``-lp`` to list all available parameters (``-p`` keys and values)\n  * ``-rf csv`` to emit CSV output (for use with the ``plot`` command)\n  * ``-foe true`` to stop on any error (recommended)\n\n## Naming Convention\nAlgorithm names (such as ``xxh64-jpountz-unsafe``) are used in reports and\noptionally for the ``-p algo`` option. The naming convention is:\n\n    hash-implementation[-qualifier]\n\nThe ``hash`` portion denotes the underlying hash specification (and potential\nsize disambiguation). The ``implementation`` is a short abbreviation that\nidentifies the implementation from those listed at the top of this document. 
A\n``qualifier`` is used if the implementation has been tested in a specific mode.\n\n## License\nMIT License, as per [LICENSE.txt](LICENSE.txt).\n\nThis project uses [Jacksum](http://sourceforge.net/projects/jacksum/), which is\nGPLv2 licensed. Hash-Bench is not derived from Jacksum and is not\nincluding or redistributing any Jacksum files (you must manually download and\ninstall Jacksum yourself, as described above).\n\nTwo hash implementations (SipHash_2_4, Siphash-java-inline) are not available\nfrom any known Maven repository. As each implementation is a single file, they\nhave been placed in the ``thirdparty`` directory. Their licenses are shown\nin those files.\n\n## Contributing\nPlease send a pull request if you'd like to improve the project (eg use a\nparticular hash library in a more efficient manner, add new libraries, update\nto new library versions etc).\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fbenalexau%2Fhash-bench","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fbenalexau%2Fhash-bench","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fbenalexau%2Fhash-bench/lists"}