{"id":13466692,"url":"https://github.com/google/highwayhash","last_synced_at":"2025-05-15T14:05:08.034Z","repository":{"id":4087935,"uuid":"51937191","full_name":"google/highwayhash","owner":"google","description":"Fast strong hash functions: SipHash/HighwayHash","archived":false,"fork":false,"pushed_at":"2024-04-18T16:08:17.000Z","size":590,"stargazers_count":1576,"open_issues_count":9,"forks_count":186,"subscribers_count":49,"default_branch":"master","last_synced_at":"2025-04-07T18:09:11.497Z","etag":null,"topics":[],"latest_commit_sha":null,"homepage":"","language":"C++","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"apache-2.0","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/google.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":"CONTRIBUTING","funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null}},"created_at":"2016-02-17T16:29:40.000Z","updated_at":"2025-03-27T04:43:39.000Z","dependencies_parsed_at":"2024-01-13T16:23:41.316Z","dependency_job_id":"ccb9f8ad-986e-41b8-ba49-f8e841e17e25","html_url":"https://github.com/google/highwayhash","commit_stats":{"total_commits":125,"total_committers":19,"mean_commits":6.578947368421052,"dds":"0.28800000000000003","last_synced_commit":"c13d28517a4db259d738ea4886b1f00352a3cc33"},"previous_names":[],"tags_count":0,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/google%2Fhighwayhash","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/google%2Fhighwayhash/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/google%2Fhighwayhash/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/
repositories/google%2Fhighwayhash/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/google","download_url":"https://codeload.github.com/google/highwayhash/tar.gz/refs/heads/master","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":254355334,"owners_count":22057354,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":[],"created_at":"2024-07-31T15:00:48.822Z","updated_at":"2025-05-15T14:05:07.992Z","avatar_url":"https://github.com/google.png","language":"C++","readme":"Strong (well-distributed and unpredictable) hashes:\n\n*   Portable implementation of\n    [SipHash](https://www.131002.net/siphash/siphash.pdf)\n*   HighwayHash, a 5x faster SIMD hash with [security\n    claims](https://arxiv.org/abs/1612.06257)\n\n## Quick Start\n\nTo build on a Linux or Mac platform, simply run `make`. For Windows, we provide\na Visual Studio 2015 project in the `msvc` subdirectory.\n\nRun `benchmark` for speed measurements. 
`sip_hash_test` and `highwayhash_test`\nensure the implementations return known-good values for a given set of inputs.\n\n64-bit SipHash for any CPU:\n\n```\n    #include \"highwayhash/sip_hash.h\"\n    using namespace highwayhash;\n    HH_ALIGNAS(16) const HH_U64 key2[2] = {1234, 5678};\n    char in[8] = {1};\n    return SipHash(key2, in, 8);\n```\n\n64, 128 or 256 bit HighwayHash for the CPU determined by compiler flags:\n\n```\n    #include \"highwayhash/highwayhash.h\"\n    using namespace highwayhash;\n    HH_ALIGNAS(32) const HHKey key = {1, 2, 3, 4};\n    char in[8] = {1};\n    HHResult64 result;  // or HHResult128 or HHResult256\n    HHStateT\u003cHH_TARGET\u003e state(key);\n    HighwayHashT(\u0026state, in, 8, \u0026result);\n```\n\n64, 128 or 256 bit HighwayHash for the CPU on which we're currently running:\n\n```\n    #include \"highwayhash/highwayhash_target.h\"\n    #include \"highwayhash/instruction_sets.h\"\n    using namespace highwayhash;\n    HH_ALIGNAS(32) const HHKey key = {1, 2, 3, 4};\n    char in[8] = {1};\n    HHResult64 result;  // or HHResult128 or HHResult256\n    InstructionSets::Run\u003cHighwayHash\u003e(key, in, 8, \u0026result);\n```\n\nC-callable 64-bit HighwayHash for the CPU on which we're currently running:\n\n    #include \"highwayhash/c_bindings.h\"\n    const uint64_t key[4] = {1, 2, 3, 4};\n    char in[8] = {1};\n    return HighwayHash64(key, in, 8);\n\nPrinting a 256-bit result in a hexadecimal format similar to sha1sum:\n\n    HHResult256 result;\n    printf(\"%016\" PRIx64 \"%016\" PRIx64 \"%016\" PRIx64 \"%016\" PRIx64 \"\\n\",\n         result[3], result[2], result[1], result[0]);\n\n## Introduction\n\nHash functions are widely used, so it is desirable to increase their speed and\nsecurity. 
This package provides two 'strong' (well-distributed and\nunpredictable) hash functions: a faster version of SipHash, and an even faster\nalgorithm we call HighwayHash.\n\nSipHash is a fast but 'cryptographically strong' pseudo-random function by\nAumasson and Bernstein [https://www.131002.net/siphash/siphash.pdf].\n\nHighwayHash is a new way of mixing inputs which may inspire new\ncryptographically strong hashes. Large inputs are processed at a rate of 0.24\ncycles per byte, and latency remains low even for small inputs. HighwayHash is\nfaster than SipHash for all input sizes, with 5 times higher throughput at 1\nKiB. We discuss design choices and provide statistical analysis and preliminary\ncryptanalysis in https://arxiv.org/abs/1612.06257.\n\n## Applications\n\nUnlike prior strong hashes, these functions are fast enough to be recommended\nas safer replacements for weak hashes in many applications. The additional CPU\ncost appears affordable, based on profiling data indicating C++ hash functions\naccount for less than 0.25% of CPU usage.\n\nHash-based selection of random subsets is useful for A/B experiments and similar\napplications. Such random generators are idempotent (repeatable and\ndeterministic), which is helpful for parallel algorithms and testing. To avoid\nbias, it is important that the hash function be unpredictable and\nindistinguishable from a uniform random generator. We have verified the bit\ndistribution and avalanche properties of SipHash and HighwayHash.\n\n64-bit hashes are also useful for authenticating short-lived messages such as\nnetwork/RPC packets. This requires that the hash function withstand\ndifferential, length extension and other attacks. We have published a formal\nsecurity analysis for HighwayHash. 
New cryptanalysis tools may still need to be\ndeveloped for further analysis.\n\nStrong hashes are also important parts of methods for protecting hash tables\nagainst unacceptable worst-case behavior and denial of service attacks\n(see \"hash flooding\" below).\n\n128 and 256-bit hashes can be useful for verifying data integrity (checksums).\n\n## SipHash\n\nOur SipHash implementation is a fast and portable drop-in replacement for\nthe reference C code. Outputs are identical for the given test cases (messages\nbetween 0 and 63 bytes).\n\nInterestingly, it is about twice as fast as a SIMD implementation using SSE4.1\n(https://goo.gl/80GBSD). This is presumably due to the lack of SIMD bit rotate\ninstructions prior to AVX-512.\n\nSipHash13 is a faster but weaker variant with one mixing round per update and\nthree during finalization.\n\nWe also provide a data-parallel 'tree hash' variant that enables efficient SIMD\nwhile retaining safety guarantees. This is about twice as fast as SipHash, but\ndoes not return the same results.\n\n## HighwayHash\n\nWe have devised a new way of mixing inputs with SIMD multiply and permute\ninstructions. The multiplications are 32x32 -\u003e 64 bits and therefore infeasible\nto reverse. Permuting equalizes the distribution of the resulting bytes.\n\nThe internal state is quite large (1024 bits) but fits within SIMD registers.\nDue to limitations of the AVX2 instruction set, the registers are partitioned\ninto two 512-bit halves that remain independent until the reduce phase. The\nalgorithm outputs 64 bit digests or up to 256 bits at no extra cost.\n\nIn addition to high throughput, the algorithm is designed for low finalization\ncost. The result is more than twice as fast as SipTreeHash.\n\nWe also provide an SSE4.1 version (80% as fast for large inputs and 95% as fast\nfor short inputs), an implementation for VSX on POWER and a portable version\n(10% as fast). 
A third-party ARM implementation is referenced below.\n\nStatistical analyses and preliminary cryptanalysis are given in\nhttps://arxiv.org/abs/1612.06257.\n\n## Versioning and stability\n\nNow that 21 months have elapsed since their initial release, we have declared\nall (64/128/256 bit) variants of HighwayHash frozen, i.e. unchanging forever.\n\nSipHash and HighwayHash are 'fingerprint functions' whose input -\u003e hash\nmapping will not change. This is important for applications that write hashes to\npersistent storage.\n\n## Speed measurements\n\nTo measure the CPU cost of a hash function, we can either create an artificial\n'microbenchmark' (easier to control, but probably not representative of the\nactual runtime), or insert instrumentation directly into an application (risks\ninfluencing the results through observer overhead). We provide novel variants of\nboth approaches that mitigate their respective disadvantages.\n\nprofiler.h uses software write-combining to stream program traces to memory\nwith minimal overhead. These can be analyzed offline, or when memory is full,\nto learn how much time was spent in each (possibly nested) zone.\n\nnanobenchmark.h enables cycle-accurate measurements of very short functions.\nIt uses CPU fences and robust statistics to minimize variability, and also\navoids unrealistic branch prediction effects.\n\nWe compile the 64-bit C++ implementations with a patched GCC 4.9 and run on a\nsingle idle core of a Xeon E5-2690 v3 clocked at 2.6 GHz. 
CPU cost is measured\nas cycles per byte for various input sizes:\n\nAlgorithm        | 8     | 31   | 32   | 63   | 64   | 1024\n---------------- | ----- | ---- | ---- | ---- | ---- | ----\nHighwayHashAVX2  | 7.34  | 1.81 | 1.71 | 1.04 | 0.95 | 0.24\nHighwayHashSSE41 | 8.00  | 2.11 | 1.75 | 1.13 | 0.96 | 0.30\nSipTreeHash      | 16.51 | 4.57 | 4.09 | 2.22 | 2.29 | 0.57\nSipTreeHash13    | 12.33 | 3.47 | 3.06 | 1.68 | 1.63 | 0.33\nSipHash          | 8.13  | 2.58 | 2.73 | 1.87 | 1.93 | 1.26\nSipHash13        | 6.96  | 2.09 | 2.12 | 1.32 | 1.33 | 0.68\n\nSipTreeHash is slower than SipHash for small inputs because it processes blocks\nof 32 bytes. AVX2 and SSE4.1 HighwayHash are faster than SipHash for all input\nsizes due to their highly optimized handling of partial vectors.\n\nNote that previous measurements included the initialization of their input,\nwhich dramatically increased timings especially for small inputs.\n\n## CPU requirements\n\nSipTreeHash(13) requires an AVX2-capable CPU (e.g. Haswell). HighwayHash\nincludes a dispatcher that chooses the implementation (AVX2, SSE4.1, VSX or\nportable)  at runtime, as well as a directly callable function template that can\nonly run on the CPU for which it was built. SipHash(13) and\nScalarSipTreeHash(13) have no particular CPU requirements.\n\n### AVX2 vs SSE4\n\nWhen both AVX2 and SSE4 are available, the decision whether to use AVX2 is\nnon-obvious. AVX2 vectors are twice as wide, but require a higher power license\n(integer multiplications count as 'heavy' instructions) and can thus reduce the\nclock frequency of the core or entire socket(!) on Haswell systems. This\npartially explains the observed 1.25x (not 2x) speedup over SSE4. Moreover, it\nis inadvisable to only sporadically use AVX2 instructions because there is also\na ~56K cycle warmup period during which AVX2 operations are slower, and Haswell\ncan even stall during this period. 
Thus, we recommend avoiding AVX2 for\ninfrequent hashing if the rest of the application is also not using AVX2. For\nany input larger than 1 MiB, it is probably worthwhile to enable AVX2.\n\n### SIMD implementations\n\nOur x86 implementations use custom vector classes with overloaded operators\n(e.g. `const V4x64U a = b + c`) for type-safety and improved readability vs.\ncompiler intrinsics (e.g. `const __m256i a = _mm256_add_epi64(b, c)`).\nThe VSX implementation uses built-in vector types alongside Altivec intrinsics.\nA high-performance third-party ARM implementation is mentioned below.\n\n### Dispatch\n\nOur instruction_sets dispatcher avoids running newer instructions on older CPUs\nthat do not support them. However, intrinsics, and therefore also any vector\nclasses that use them, require (on GCC \u003c 4.9 or Clang \u003c 3.9) a compiler flag\nthat also allows the compiler to generate code for that CPU. This means the\nintrinsics must be placed in separate translation units that are compiled with\nthe required flags. It is important that these source files and their headers\nnot define any inline functions, because that might break the one definition\nrule and cause crashes.\n\nTo minimize dispatch overhead when hashes are computed often (e.g. in a loop),\nwe can inline the hash function into its caller using templates. The dispatch\noverhead will only be paid once (e.g. before the loop). The template mechanism\nalso avoids duplicating code in each CPU-specific implementation.\n\n## Defending against hash flooding\n\nTo mitigate hash flooding attacks, we need to take both the hash function and\nthe data structure into account.\n\nWe wish to defend (web) services that utilize hash sets/maps against\ndenial-of-service attacks. 
Such data structures assign attacker-controlled\ninput messages `m` to a hash table bin `b` by computing the hash `H(s, m)`\nusing a hash function `H` seeded by `s`, and mapping it to a bin with some\nnarrowing function `b = R(h)`, discussed below.\n\nAttackers may attempt to trigger 'flooding' (excessive work in insertions or\nlookups) by finding multiple `m` that map to the same bin. If the attacker has\nlocal access, they can do far worse, so we assume the attacker can only issue\nremote requests. If the attacker is able to send large numbers of requests,\nthey can already deny service, so we need only ensure the attacker's cost is\nsufficiently large compared to the service's provisioning.\n\nIf the hash function is 'weak', attackers can easily generate 'hash collisions'\n(inputs mapping to the same hash values) that are independent of the seed. In\nother words, certain input messages will cause collisions regardless of the seed\nvalue. The author of SipHash has published C++ programs to generate such\n'universal (key-independent) multicollisions' for CityHash and Murmur. Similar\n'differential' attacks are likely possible for any hash function consisting only\nof reversible operations (e.g. addition/multiplication/rotation) with a constant\noperand. `n` requests with such inputs cause `n^2` work for an unprotected hash\ntable, which is unacceptable.\n\nBy contrast, 'strong' hashes such as SipHash or HighwayHash require infeasible\nattacker effort to find a hash collision (an expected 2^32 guesses of `m` per\nthe birthday paradox) or recover the seed (2^63 requests). These security claims\nassume the seed is secret. It is reasonable to suppose `s` is initially unknown\nto attackers, e.g. generated on startup or even per-connection. A timing attack\nby Wool/Bar-Yosef recovers 13-bit seeds by testing all 8K possibilities using\nmillions of requests, which takes several days (even assuming unrealistic 150 us\nround-trip times). 
It appears infeasible to recover 64-bit seeds in this way.\n\nHowever, attackers are only looking for multiple `m` mapping to the same bin\nrather than identical hash values. We assume they know or are able to discover\nthe hash table size `p`. It is common to choose `p = 2^i` to enable an efficient\n`R(h) := h \u0026 (p - 1)`, which simply retains the lower hash bits. It may be\neasier for attackers to compute partial collisions where only the lower `i` bits\nmatch. This can be prevented by choosing a prime `p` so that `R(h) := h % p`\nincorporates all hash bits. The costly modulo operation can be avoided by\nmultiplying with the inverse (https://goo.gl/l7ASm8). An interesting alternative\nsuggested by Kyoung Jae Seo chooses a random subset of the `h` bits. Such an `R`\nfunction can be computed in just 3 cycles using PEXT from the BMI2 instruction\nset. This is expected to defend against SAT-solver attacks on the hash bits at a\nslightly lower cost than the multiplicative inverse method, and still allows\npower-of-two table sizes.\n\nSummary thus far: given a strong hash function and secret seed, it appears\ninfeasible for attackers to generate hash collisions because `s` and/or `R` are\nunknown. However, they can still observe the timings of data structure\noperations for various `m`. With typical table sizes of 2^10 to 2^17 entries,\nattackers can detect some 'bin collisions' (inputs mapping to the same bin).\nAlthough this will be costly for the attacker, they can then send many instances\nof such inputs, so we need to limit the resulting work for our data structure.\n\nHash tables with separate chaining typically store bin entries in a linked list,\nso worst-case inputs lead to unacceptable linear-time lookup cost. We instead\nseek optimal asymptotic worst-case complexity for each operation (insertion,\ndeletion and lookups), which is a constant factor times the logarithm of the\ndata structure size. 
This naturally leads to a tree-like data structure for each\nbin. The Java8 HashMap only replaces its linked list with trees when needed.\nThis leads to additional cost and complexity for deciding whether a bin is a\nlist or tree.\n\nOur first proposal (suggested by GitHub user funny-falcon) avoids this overhead\nby always storing one tree per bin. It may also be worthwhile to store the first\nentry directly in the bin, which avoids allocating any tree nodes in the common\ncase where bins are sparsely populated. What kind of tree should be used?\n\nGiven that SipHash and HighwayHash provide high-quality randomness, a simple\nnon-balancing binary search tree could perform reasonably well, depending on the\nexpected attack surface. [Wikipedia says](https://en.wikipedia.org/wiki/Binary_search_tree#Definition)\n\u003e After a long intermixed sequence of random insertion and deletion, the\n\u003e expected height of the tree approaches square root of the number of keys, √n,\n\u003e which grows much faster than log n.\n\nWhile `O(√n)` is much larger than `O(log n)`, it is still much smaller than `O(n)`.\nIt will also complicate timing attacks, because the cost of operations on a bin\nwith collisions grows more slowly.\n\nIf stronger safety guarantees are needed, then a balanced tree should be used.\nScapegoat and splay trees only offer amortized complexity guarantees, whereas\ntreaps require an entropy source and have higher constant factors in practice.\nSelf-balancing structures such as 2-3 or red-black trees require additional\nbookkeeping information. We can hope to reduce rebalancing cost by realizing\nthat the output bits of strong `H` functions are uniformly distributed. When\nusing them as keys instead of the original message `m`, recent relaxed balancing\nschemes such as left-leaning red-black or weak AVL trees may require fewer tree\nrotations to maintain their invariants. Note that `H` already determines the\nbin, so we should only use the remaining bits. 
64-bit hashes are likely\nsufficient for this purpose, and HighwayHash generates up to 256 bits. It seems\nunlikely that attackers can craft inputs resulting in worst cases for both the\nbin index and tree key without being able to generate hash collisions, which\nwould contradict the security claims of strong hashes. Even if they succeed, the\nrelaxed tree balancing still guarantees an upper bound on height and therefore\nthe worst-case operation cost. For the AVL variant, the constant factors are\nslightly lower than for red-black trees.\n\nThe second proposed approach uses augmented/de-amortized cuckoo hash tables\n(https://goo.gl/PFwwkx). These guarantee worst-case `log n` bounds for all\noperations, but only if the hash function is 'indistinguishable from random'\n(uniformly distributed regardless of the input distribution), which is claimed\nfor SipHash and HighwayHash but certainly not for weak hashes.\n\nBoth alternatives retain good average case performance and defend against\nflooding by limiting the amount of extra work an attacker can cause. The first\napproach guarantees an upper bound of `log n` additional work even if the hash\nfunction is compromised.\n\nIn summary, a strong hash function is not, by itself, sufficient to protect a\nchained hash table from flooding attacks. However, strong hash functions are\nimportant parts of two schemes for preventing denial of service. Using weak hash\nfunctions can slightly accelerate the best-case and average-case performance of\na service, but at the risk of greatly reduced attack costs and worst-case\nperformance.\n\n## Third-party implementations / bindings\n\nThanks to Damian Gryski and Frank Wessels for making us aware of these\nthird-party implementations or bindings. 
Please feel free to get in touch or\nraise an issue and we'll add yours as well.\n\nBy | Language | URL\n--- | --- | ---\nDamian Gryski | Go and x64 assembly | https://github.com/dgryski/go-highway/\nSimon Abdullah | NPM package | https://www.npmjs.com/package/highwayhash-nodejs\nLovell Fuller | node.js bindings | https://github.com/lovell/highwayhash\nAndreas Sonnleitner | [WebAssembly](https://github.com/asonnleitner/highwayhash-wasm) and NPM package | https://www.npmjs.com/package/highwayhash-wasm\nNick Babcock | Rust port | https://github.com/nickbabcock/highway-rs\nCaleb Zulawski | Rust portable SIMD | https://github.com/calebzulawski/autobahn-hash\nVinzent Steinberg | Rust bindings | https://github.com/vks/highwayhash-rs\nFrank Wessels \u0026 Andreas Auernhammer | Go and ARM assembly | https://github.com/minio/highwayhash\nPhil Demetriou | Python 3 bindings | https://github.com/kpdemetriou/highwayhash-cffi\nJonathan Beard | C++20 constexpr | https://gist.github.com/jonathan-beard/632017faa1d9d1936eb5948ac9186657\nJames Cook | Ruby bindings | https://github.com/jamescook/highwayhash\nJohn Platts | C++17 Google Highway port | https://github.com/johnplatts/simdhwyhash\n\n## Modules\n\n### Hashes\n\n*   c_bindings.h declares C-callable versions of SipHash/HighwayHash.\n*   sip_hash.cc is the compatible implementation of SipHash, and also provides\n    the final reduction for sip_tree_hash.\n*   sip_tree_hash.cc is the faster but incompatible SIMD j-lanes tree hash.\n*   scalar_sip_tree_hash.cc is a non-SIMD version.\n*   state_helpers.h simplifies the implementation of the SipHash variants.\n*   highwayhash.h is our new, fast hash function.\n*   hh_{avx2,sse41,vsx,portable}.h are its various implementations.\n*   highwayhash_target.h chooses the best available implementation at runtime.\n\n### Infrastructure\n\n*   arch_specific.h offers byte swapping and CPUID detection.\n*   compiler_specific.h defines some compiler-dependent language extensions.\n*   
data_parallel.h provides a C++11 ThreadPool and PerThread (similar to\n    OpenMP).\n*   instruction_sets.h and targets.h enable efficient CPU-specific dispatching.\n*   nanobenchmark.h measures elapsed times with \u003c 1 cycle variability.\n*   os_specific.h sets thread affinity and priority for benchmarking.\n*   profiler.h is a low-overhead, deterministic hierarchical profiler.\n*   tsc_timer.h obtains high-resolution timestamps without CPU reordering.\n*   vector256.h and vector128.h contain wrapper classes for AVX2 and SSE4.1.\n\nBy Jan Wassenberg \u003cjan.wassenberg@gmail.com\u003e and Jyrki Alakuijala\n\u003cjyrki.alakuijala@gmail.com\u003e, updated 2023-03-29\n\nThis is not an official Google product.\n","funding_links":[],"categories":["Miscellaneous","C++","Hashing","C++ (70)","散列"],"sub_categories":[],"project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fgoogle%2Fhighwayhash","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fgoogle%2Fhighwayhash","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fgoogle%2Fhighwayhash/lists"}