{"id":18843386,"url":"https://github.com/tpn/perfecthash","last_synced_at":"2026-03-14T03:18:26.287Z","repository":{"id":54968253,"uuid":"133875103","full_name":"tpn/perfecthash","owner":"tpn","description":"A performant, parallel, probabilistic, random acyclic-graph, low-latency, perfect hash generation library. ","archived":false,"fork":false,"pushed_at":"2026-03-07T17:17:00.000Z","size":25110,"stargazers_count":88,"open_issues_count":17,"forks_count":15,"subscribers_count":4,"default_branch":"main","last_synced_at":"2026-03-07T22:53:02.577Z","etag":null,"topics":["arm64","assembly","c","hypergraph","linux","macos","nt","perfect-hash","perfect-hashing","windows","x64"],"latest_commit_sha":null,"homepage":"","language":"C","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"bsd-3-clause","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/tpn.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null,"zenodo":null,"notice":null,"maintainers":null,"copyright":null,"agents":"AGENTS.md","dco":null,"cla":null}},"created_at":"2018-05-17T22:34:59.000Z","updated_at":"2026-03-07T17:17:04.000Z","dependencies_parsed_at":"2023-10-11T08:19:00.964Z","dependency_job_id":"9668bf0c-e6da-49cf-97d8-526408f69058","html_url":"https://github.com/tpn/perfecthash","commit_stats":{"total_commits":1008,"total_committers":1,"mean_commits":1008.0,"dds":0.0,"last_synced_commit":"6c2e2a09cf6eb1efbf2059f3ef484612bbe63445"},"previous_names":[],"tags_count":71,"template":false,"template_full_name":null,"purl":"pkg:github/tpn/perfecthash","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/tpn%2Fperfecthash","tags_url":"http
s://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/tpn%2Fperfecthash/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/tpn%2Fperfecthash/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/tpn%2Fperfecthash/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/tpn","download_url":"https://codeload.github.com/tpn/perfecthash/tar.gz/refs/heads/main","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/tpn%2Fperfecthash/sbom","scorecard":null,"host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":286080680,"owners_count":30283703,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2026-03-09T02:57:19.223Z","status":"ssl_error","status_checked_at":"2026-03-09T02:56:26.373Z","response_time":61,"last_error":"SSL_read: unexpected eof while reading","robots_txt_status":"success","robots_txt_updated_at":"2025-07-24T06:49:26.215Z","robots_txt_url":"https://github.com/robots.txt","online":false,"can_crawl_api":true,"host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["arm64","assembly","c","hypergraph","linux","macos","nt","perfect-hash","perfect-hashing","windows","x64"],"created_at":"2024-11-08T02:57:41.437Z","updated_at":"2026-03-14T03:18:26.264Z","avatar_url":"https://github.com/tpn.png","language":"C","readme":"# Perfect 
Hash\n\n[![macos](https://github.com/tpn/perfecthash/actions/workflows/macos.yml/badge.svg)](https://github.com/tpn/perfecthash/actions/workflows/macos.yml)\n[![linux](https://github.com/tpn/perfecthash/actions/workflows/linux.yml/badge.svg)](https://github.com/tpn/perfecthash/actions/workflows/linux.yml)\n[![windows](https://github.com/tpn/perfecthash/actions/workflows/windows.yml/badge.svg)](https://github.com/tpn/perfecthash/actions/workflows/windows.yml)\n\n[Helper Utility for Generating Command Line Syntax](https://tpn.github.io/perfecthash-ui/)\nThe `ui/` directory contains the companion web UI (submodule) for generating\ncommand line syntax. See [ui/README.md](ui/README.md) for details.\n\n## Project Status (2026)\n\n- Cross-platform CMake builds and CI are active for Linux, macOS, and Windows.\n- Tag-driven release automation is available via GitHub Actions (`.github/workflows/release.yml`).\n- CMake build profiles are supported:\n  - `full`\n  - `online-rawdog-jit`\n  - `online-rawdog-and-llvm-jit`\n  - `online-llvm-jit`\n- CMake package export/config support is available for downstream\n  `find_package(PerfectHash CONFIG REQUIRED)` consumers.\n\n\n## Overview\n\nThis project is a library for creating perfect hash tables from 32-bit key sets.\nIt is based on the acyclic random 2-part hypergraph algorithm.  
Its\nprimary goal is finding the fastest possible runtime solution.\n\nIt is geared toward offline table generation: a command line application is used\nto generate a small C library that implements an `Index()` routine, which, given\nan input key, will return the order-preserved index of that key within the\noriginal key set, e.g.:\n\n```c\n    uint32_t ix;\n    uint32_t key = 0x20190903;\n\n    ix = Index(key);\n```\n\nThis allows the efficient implementation of key-value tables, e.g.:\n\n```c\n    extern uint32_t Table[];\n\n    uint32_t\n    Lookup(uint32_t Key)\n    {\n        return Table[Index(Key)];\n    }\n\n    void\n    Insert(uint32_t Key, uint32_t Value)\n    {\n        Table[Index(Key)] = Value;\n    }\n\n    void\n    Delete(uint32_t Key)\n    {\n        Table[Index(Key)] = 0;\n    }\n```\n\nThe fastest `Index()` routine is the [MultiplyShiftRX](https://github.com/tpn/perfecthash/blob/main/src/CompiledPerfectHashTable/CompiledPerfectHashTableChm01IndexMultiplyShiftRXAnd.c)\nroutine, which clocks in at about 6 cycles in practice, and boils down to\nsomething like this:\n\n```c\nextern uint32_t Assigned[];\n\nuint32_t\nIndex(uint32_t Key)\n{\n    uint32_t Vertex1;\n    uint32_t Vertex2;\n\n    Vertex1 = ((Key * Seed1) \u003e\u003e Shift);\n    Vertex2 = ((Key * Seed2) \u003e\u003e Shift);\n\n    return ((Assigned[Vertex1] + Assigned[Vertex2]) \u0026 IndexMask);\n}\n```\nN.B. 
`Seed1`, `Seed2`, `Shift`, and `IndexMask` will all be literal\nconstants in the final source code, not variables.\n\nThis compiles down to:\n```assembly\nlea     r8, ptr [rip+0x23fc0]\nimul    edx, ecx, 0xe8d9cdf9\nshr     rdx, 0x10\nmovzx   eax, word ptr [r8+rdx*2]\nimul    edx, ecx, 0xc2e3c0b7\nshr     rdx, 0x10\nmovzx   ecx, word ptr [r8+rdx*2]\nadd     eax, ecx\nret\n```\n\nThe IACA profile reports 8 uops:\n\n```\nIntel(R) Architecture Code Analyzer Version -  v3.0-28-g1ba2cbb build date: 2017-10-23;17:30:24\nAnalyzed File -  .\\x64\\Release\\HologramWorld_31016_Chm01_MultiplyShiftRX_And.dll\nBinary Format - 64Bit\nArchitecture  -  SKL\nAnalysis Type - Throughput\n\nThroughput Analysis Report\n--------------------------\nBlock Throughput: 10.00 Cycles       Throughput Bottleneck: Backend\nLoop Count:  26\nPort Binding In Cycles Per Iteration:\n----------------------------------------------------------------------------\n| Port   |  0  - DV  |  1  |  2  - D   |  3  - D   |  4  |  5  |  6  |  7  |\n----------------------------------------------------------------------------\n| Cycles | 1.3   0.0 | 2.0 | 1.0   1.0 | 1.0   1.0 | 0.0 | 1.3 | 1.3 | 0.0 |\n----------------------------------------------------------------------------\n\n| # Of |         Ports pressure in cycles                     |\n| Uops |0 - DV | 1   | 2 - D   | 3 - D    | 4 | 5   | 6   | 7 |\n---------------------------------------------------------------\n|  1   |       |     |         |          |   | 1.0 |     |   | lea r8, ptr [rip+0x23fc0]\n|  1   |       | 1.0 |         |          |   |     |     |   | imul edx, ecx, 0xe8d9cdf9\n|  1   | 0.3   |     |         |          |   |     | 0.7 |   | shr rdx, 0x10\n|  1   |       |     | 1.0 1.0 |          |   |     |     |   | movzx eax, word ptr [r8+rdx*2]\n|  1   |       | 1.0 |         |          |   |     |     |   | imul edx, ecx, 0xc2e3c0b7\n|  1   | 0.7   |     |         |          |   |     | 0.3 |   | shr rdx, 0x10\n|  1   |       |     |   
      | 1.0  1.0 |   |     |     |   | movzx ecx, word ptr [r8+rdx*2]\n|  1   | 0.3   |     |         |          |   | 0.3 | 0.3 |   | add eax, ecx\nTotal Num of Uops: 8\n```\n\nThe \"cost\" behind the perfect hash table is the `Assigned` array.  The size of\nthis array will be the number of keys, rounded up to a power of two, and then\ndoubled.  E.g. `HologramWorld-31016.keys` has 31,016 keys.  Rounding up to a\npower of two gives 32,768; doubling that gives 65,536.\n\nThe data type used by the `Assigned` array is the smallest C data type that\ncan hold the number of keys rounded up to a power of two.  Thus, a 16-bit\n`unsigned short int` can be used for the `HologramWorld-31016.keys` array:\n\n```c\nunsigned short int Assigned[65536] = { ... };\n```\nThus, `sizeof(Assigned)` will be 131,072, or 128KB.\n\nThe `Index()` routine will perform two memory lookups into this array per call.\nNo pointer chasing or indirection is required.  The most frequent keys will have\nboth locations in L1 cache; the worst-case scenario, for cold or infrequent\nkeys, is that both locations must be fetched from main memory.\n\n## Quick Guide\n\nCurrent development is CMake-first and cross-platform. CI continuously builds\nLinux, macOS, and Windows configurations, and release automation is tag-driven.\nThe historical Visual Studio solution remains available for Windows workflows.\n\nThe generated compiled perfect hash tables are cross-platform, and will work on\nWindows, Mac, Linux, x86, x64, and ARM64.\n\n### Building\n#### Windows\n\n```\nmkdir c:\\src\ncd src\ngit clone https://github.com/tpn/perfecthash\ngit clone https://github.com/tpn/perfecthash-keys\n\ncd perfecthash/src\n```\nThe `PerfectHash.sln` file lives in `perfecthash/src`.  
You can either build\nthis directly via Visual Studio, use one of the `build-*.bat` files, or just\nuse `msbuild` from a Visual Studio 2022 command prompt:\n\n```\nmsbuild /nologo /m /t:Rebuild /p:Configuration=Release;Platform=x64\n```\n\nYou can also download the latest binaries from the [Releases](https://github.com/tpn/perfecthash/releases/)\npage.  The `PGO` zip files refer to profile-guided optimization builds, and are\ngenerally faster than the `Release` builds by up to 30-40%.\n\nOnce built or downloaded, there are two main command line executables:\n`PerfectHashCreate.exe`, and `PerfectHashBulkCreate.exe`.  The former is for\ncreating a single table, and it takes a single input key file.  The latter\ncan be pointed at a directory of keys, and it will create tables for all of\nthem.\n\n#### Linux\n\nPrerequisites: C compiler (GCC 10 tested), CMake.  Optional: Ninja.\n\nRecommended (mamba/conda) environment:\n\n```\n# x86_64 (pre-generated)\nmamba env create -f conda/environments/dev-linux_os-linux_arch-x86_64_py-313_cuda-none_compiler-llvm.yaml\nmamba activate dev-linux_os-linux_arch-x86_64_py-313_cuda-none_compiler-llvm\n\n# ARM64 / aarch64 (pre-generated)\nmamba env create -f conda/environments/dev-linux-arm64_os-linux_arch-aarch64_py-313_cuda-none_compiler-llvm.yaml\nmamba activate dev-linux-arm64_os-linux_arch-aarch64_py-313_cuda-none_compiler-llvm\n```\n\nGenerated environment files live under `conda/environments/` and are produced\nfrom `dependencies.yaml` using `rapids-dependency-file-generator`.\nPython package metadata is maintained separately in the repo-root\n`pyproject.toml`; `dependencies.yaml` is no longer used to generate the legacy\n`python/` package metadata.\n\nIf you need a different Python version, pick the matching `py-314` environment\nfile from `conda/environments/`.  
You can also create a minimal dev/test\nenvironment manually:\n\n```\nmamba create -y -n perfecthash-dev -c conda-forge \\\n  python=3.12 rust cmake ninja make pkg-config clang clangxx lld llvmdev pytest\nmamba activate perfecthash-dev\n```\n\n```\nmkdir -p ~/src \u0026\u0026 cd ~/src\ngit clone https://github.com/tpn/perfecthash\ngit clone https://github.com/tpn/perfecthash-keys\n\ncd perfecthash\ncmake -S . -B build -G\"Ninja Multi-Config\"\ncmake --build build --config Release\ncmake --build build --config Debug\n```\nNote: the default build enables `-march=native` for required SIMD intrinsics.\nUse `-DPERFECTHASH_ENABLE_NATIVE_ARCH=OFF` if you need a portable binary.\n\nBuild profile examples:\n\n```\ncmake -S . -B build-full -G\"Ninja Multi-Config\" -DPERFECTHASH_BUILD_PROFILE=full\ncmake -S . -B build-online-rawdog -G\"Ninja Multi-Config\" -DPERFECTHASH_BUILD_PROFILE=online-rawdog-jit\ncmake -S . -B build-online-rawdog-llvm -G\"Ninja Multi-Config\" -DPERFECTHASH_BUILD_PROFILE=online-rawdog-and-llvm-jit\ncmake -S . -B build-online-llvm -G\"Ninja Multi-Config\" -DPERFECTHASH_BUILD_PROFILE=online-llvm-jit\n```\n\nCUDA build (Ninja Multi-Config):\n\n```\ncmake -S . -B build-cuda -G\"Ninja Multi-Config\" \\\n    -DUSE_CUDA=ON -DCMAKE_CUDA_ARCHITECTURES=89\ncmake --build build-cuda --config Release\n```\nCUDA builds require CUDAToolkit on PATH. Set\n`CMAKE_CUDA_ARCHITECTURES` to your GPU (e.g., 86, 89, 90).\nFor normal Makefile support:\n\n```\ncd perfecthash\ncmake -S . -B build -G\"Unix Makefiles\"\ncmake --build build\n```\n\n#### Tests (Linux/macOS)\n\nUnit tests and CLI integration tests are available via CTest.  They require a\nbuild with tests enabled and use the `keys/HologramWorld-31016.keys` fixture.\nThe codegen tests require `cargo` (from the Rust toolchain) to be available.\n\n```\ncmake -S . 
-B build-tests -G Ninja -DPERFECTHASH_ENABLE_TESTS=ON -DBUILD_TESTING=ON\ncmake --build build-tests\nctest --test-dir build-tests --output-on-failure\n```\n\n#### Mac\n\nPrerequisites: Xcode Command Line Tools, CMake, Ninja. Optional: mamba/conda.\n\nRecommended (mamba/conda) environment:\n\n```\n# Apple Silicon (arm64)\nmamba env create -f conda/environments/dev-macos_os-macos_arch-arm64_py-313_cuda-none_compiler-llvm.yaml\nmamba activate dev-macos_os-macos_arch-arm64_py-313_cuda-none_compiler-llvm\n\n# Intel (x86_64)\nmamba env create -f conda/environments/dev-macos_os-macos_arch-x86_64_py-313_cuda-none_compiler-llvm.yaml\nmamba activate dev-macos_os-macos_arch-x86_64_py-313_cuda-none_compiler-llvm\n```\n\n```\nmkdir -p ~/src \u0026\u0026 cd ~/src\ngit clone https://github.com/tpn/perfecthash\ngit clone https://github.com/tpn/perfecthash-keys\n\ncd perfecthash\ncmake -S . -B build-macos -G\"Ninja Multi-Config\"\ncmake --build build-macos --config Release\ncmake --build build-macos --config Debug\n```\n\nFor Intel macOS CI (or cross-build on Apple Silicon), you can use the preset:\n\n```\ncmake --preset ninja-multi-macos-x86_64\ncmake --build --preset ninja-macos-x86_64-release\n```\n\nTests:\n\n```\nctest --test-dir build-macos --output-on-failure -C Release\n```\n\n### CMake Consumer Integration\n\nInstalled-package flow:\n\n```\ncmake -S . -B build -DCMAKE_INSTALL_PREFIX=$PWD/install\ncmake --build build --config Release\ncmake --install build --config Release\n```\n\nIn a downstream CMake project:\n\n```cmake\nfind_package(PerfectHash CONFIG REQUIRED)\ntarget_link_libraries(my_target PRIVATE PerfectHash::PerfectHashOnlineCore)\n```\n\nFetchContent/CPM-style consumer example:\n\n- `examples/cmake-fetchcontent-consumer`\n\n### Usage\n\nThe usage options are almost identical for both programs.  
If you run either one\nwithout arguments, it will print detailed usage instructions, also available\n[here](https://github.com/tpn/perfecthash/blob/main/USAGE.txt).\n\nThe main usage follows:\n\n```\nPerfectHashBulkCreate.exe Usage:\n    \u003cKeysDirectory\u003e \u003cOutputDirectory\u003e\n    \u003cAlgorithm\u003e \u003cHashFunction\u003e \u003cMaskFunction\u003e\n    \u003cMaximumConcurrency\u003e\n    [BulkCreateFlags] [KeysLoadFlags] [TableCreateFlags]\n    [TableCompileFlags] [TableCreateParameters]\n\nPerfectHashCreate.exe Usage:\n    \u003cKeysPath\u003e \u003cOutputDirectory\u003e\n    \u003cAlgorithm\u003e \u003cHashFunction\u003e \u003cMaskFunction\u003e\n    \u003cMaximumConcurrency\u003e\n    [CreateFlags] [KeysLoadFlags] [TableCreateFlags]\n    [TableCompileFlags] [TableCreateParameters]\n```\n\nAssuming you have built a `Release` version of the library, from a Visual Studio\n2022 command prompt (i.e. so the compiler is available in your `PATH`):\n\n```\nmkdir c:\\Temp\\ph.out\ncd c:\\src\\perfecthash\\src\n..\\bin\\timemem.exe x64\\Release\\PerfectHashCreate.exe c:\\src\\perfecthash\\keys\\HologramWorld-31016.keys c:\\Temp\\ph.out Chm01 MultiplyShiftR And 0 --Compile\n```\n\nOn Linux this would look like:\n```\nmkdir -p ~/tmp/ph.out\ncd ~/src/perfecthash/src\ntime ../x64/Release/PerfectHashCreateExe $HOME/src/perfecthash-keys/sys32/HologramWorld-31016.keys ~/tmp/ph.out Chm01 MultiplyShiftR And 0 --DisableCsvOutputFile\n```\n\nThis should result in some output that looks like this:\n```\nc:\\src\\perfecthash\\src\u003e..\\bin\\timemem x64\\Release\\PerfectHashCreate.exe c:\\src\\perfecthash\\keys\\HologramWorld-31016.keys c:\\Temp\\ph.out Chm01 MultiplyShiftR And 0 --Compile\n\nKeys File Name:                                    HologramWorld-31016.keys\nNumber of Keys:                                    31016\nNumber of Table Resize Events:                     0\nKeys to Edges Ratio:                               0.946533203125\nDuration:             
                             0 hours, 0 mins, 0 secs\nDuration Since Last Best Graph:\nAttempts:                                          8633\nAttempts Per Second:                               53956.250000000\nCurrent Attempts:                                  8633\nCurrent Attempts Per Second:                       53956.250000000\nSuccessful Attempts:                               1\nFailed Attempts:                                   8625\nFirst Attempt Solved:                              0\nMost Recent Attempt Solved:                        8562\nPredicted Attempts to Solve:                       8634\nPredicted Attempts Remaining until next Solve:     8563\nEstimated Seconds until next Solve:                0.15870265261206998\nNew Best Graph Count:                              0\nEqual Best Graph Count:                            0\nSolutions Found Ratio:                             0.00011583458820803892\nVertex Collision Failures:                         4294\nCyclic Graph Failures:                             4331\nVertex Collision to Cyclic Graph Failure Ratio:    0.99145693835142\nHighest Deleted Edges Count:                       31008\n[r] Refresh [f] Finish [e] Resize [c] Toggle Callback [?] More Help\n.\nExit code      : 0\nElapsed time   : 2.89\nKernel time    : 0.00 (0.0%)\nUser time      : 2.97 (102.7%)\npage fault #   : 6504\nWorking set    : 24164 KB\nPaged pool     : 167 KB\nNon-paged pool : 52 KB\nPage file size : 32280 KB\n```\nN.B. Console output isn't supported on Linux yet.\n\nN.B. If you get an error like this, it means `msbuild` couldn't be found on\nyour path; make sure to launch a Visual Studio 2022 command prompt:\n\n```\nC:\\src\\perfecthash\\src\\PerfectHash\\PerfectHashTableCompile.c: 217: CreateProcessW failed with error: 2 (0x2).  The system cannot find the file specified.\nC:\\src\\perfecthash\\src\\PerfectHash\\PerfectHashContextTableCreate.c: 492: PerfectHashTableCompile failed with error: 3758359076 (0xe0040224).  
System call failed.\n```\n\nIf you look in `C:\\Temp\\ph.out`, you'll see something along the following lines.\nIntermediate build files have been omitted for brevity.\n\n```\nC:\\Temp\\ph.out\u003etree /f\nFolder PATH listing for volume Windows\nVolume serial number is 2490-4F03\nC:.\n│   CompiledPerfectHash.h\n│   CompiledPerfectHash.props\n│   CompiledPerfectHashMacroGlue.h\n│   no_sal2.h\n│   PerfectHashTableCreate_10A0ED40.csv\n│\n├───HologramWorld_31016_Chm01_MultiplyShiftR_And\n│       HologramWorld_31016_Chm01_MultiplyShiftR_And.c\n│       HologramWorld_31016_Chm01_MultiplyShiftR_And.def\n│       HologramWorld_31016_Chm01_MultiplyShiftR_And.h\n│       HologramWorld_31016_Chm01_MultiplyShiftR_And.pht1\n│       HologramWorld_31016_Chm01_MultiplyShiftR_And.sln\n│       HologramWorld_31016_Chm01_MultiplyShiftR_And_BenchmarkFull.c\n│       HologramWorld_31016_Chm01_MultiplyShiftR_And_BenchmarkFull.mk\n│       HologramWorld_31016_Chm01_MultiplyShiftR_And_BenchmarkFullExe.c\n│       HologramWorld_31016_Chm01_MultiplyShiftR_And_BenchmarkFullExe.vcxproj\n│       HologramWorld_31016_Chm01_MultiplyShiftR_And_BenchmarkIndex.c\n│       HologramWorld_31016_Chm01_MultiplyShiftR_And_BenchmarkIndex.mk\n│       HologramWorld_31016_Chm01_MultiplyShiftR_And_BenchmarkIndexExe.c\n│       HologramWorld_31016_Chm01_MultiplyShiftR_And_BenchmarkIndexExe.vcxproj\n│       HologramWorld_31016_Chm01_MultiplyShiftR_And_Build.bat\n│       HologramWorld_31016_Chm01_MultiplyShiftR_And_Dll.vcxproj\n│       HologramWorld_31016_Chm01_MultiplyShiftR_And_Keys.c\n│       HologramWorld_31016_Chm01_MultiplyShiftR_And_Lib.mk\n│       HologramWorld_31016_Chm01_MultiplyShiftR_And_So.mk\n│       HologramWorld_31016_Chm01_MultiplyShiftR_And_StdAfx.c\n│       HologramWorld_31016_Chm01_MultiplyShiftR_And_StdAfx.h\n│       HologramWorld_31016_Chm01_MultiplyShiftR_And_Support.c\n│       HologramWorld_31016_Chm01_MultiplyShiftR_And_Support.h\n│       
HologramWorld_31016_Chm01_MultiplyShiftR_And_TableData.c\n│       HologramWorld_31016_Chm01_MultiplyShiftR_And_TableValues.c\n│       HologramWorld_31016_Chm01_MultiplyShiftR_And_Test.c\n│       HologramWorld_31016_Chm01_MultiplyShiftR_And_Test.mk\n│       HologramWorld_31016_Chm01_MultiplyShiftR_And_TestExe.c\n│       HologramWorld_31016_Chm01_MultiplyShiftR_And_TestExe.vcxproj\n│       HologramWorld_31016_Chm01_MultiplyShiftR_And_Types.h\n│       main.mk\n│       Makefile\n│\n└───x64\n    └───Release\n            BenchmarkFull_HologramWorld_31016_Chm01_MultiplyShiftR_And.exe\n            BenchmarkFull_HologramWorld_31016_Chm01_MultiplyShiftR_And.pdb\n            BenchmarkIndex_HologramWorld_31016_Chm01_MultiplyShiftR_And.exe\n            BenchmarkIndex_HologramWorld_31016_Chm01_MultiplyShiftR_And.pdb\n            HologramWorld_31016_Chm01_MultiplyShiftR_And.dll\n            HologramWorld_31016_Chm01_MultiplyShiftR_And.lib\n            HologramWorld_31016_Chm01_MultiplyShiftR_And.pdb\n            Test_HologramWorld_31016_Chm01_MultiplyShiftR_And.exe\n            Test_HologramWorld_31016_Chm01_MultiplyShiftR_And.pdb\n```\n\nThe main library implementing the perfect hash table is the `.dll` file.  
There\nare three helper `.exe` utilities also compiled for benchmarking and testing.\nYou can assess the performance of just the `Index()` routine via:\n`BenchmarkIndex_HologramWorld_31016_Chm01_MultiplyShiftR_And.exe`:\n\n```\ncd /d c:\\Temp\\ph.out\\x64\\Release\nc:\\src\\perfecthash\\bin\\timemem.exe BenchmarkIndex_HologramWorld_31016_Chm01_MultiplyShiftR_And.exe\n```\n\nOn Windows, the output will look like this:\n```\nC:\\Temp\\ph.out\\x64\\Release\u003ec:\\src\\perfecthash\\bin\\timemem.exe BenchmarkIndex_HologramWorld_31016_Chm01_MultiplyShiftR_And.exe\nExit code      : 7106\nElapsed time   : 0.01\nKernel time    : 0.00 (0.0%)\nUser time      : 0.00 (0.0%)\npage fault #   : 849\nWorking set    : 3220 KB\nPaged pool     : 22 KB\nNon-paged pool : 5 KB\nPage file size : 520 KB\n```\n\nThe exit code, 7106 in this case, is the minimum number of cycles it took, out\nof 1000 attempts, to do 1000 calls to the `Index()` routine.  So, you can\ndivide by 1000 to get the approximate number of cycles per call, in this case,\nabout 7.\n\nThe `BenchmarkFull` executable returns the minimum number of cycles it took, out\nof 100 attempts, to do 10 iterations of the following:\n\n- For each key, call `Insert(Key, RotateLeft(Key, 15))`.\n- For each key, call `Value = Lookup(Key)`.\n- For each key, call `Previous = Delete(Key)`.\n\n```\nC:\\Temp\\ph.out\\x64\\Release\u003ec:\\src\\perfecthash\\bin\\timemem.exe BenchmarkFull_HologramWorld_31016_Chm01_MultiplyShiftR_And.exe\nExit code      : 7321346\nElapsed time   : 0.28\nKernel time    : 0.00 (0.0%)\nUser time      : 0.16 (55.7%)\npage fault #   : 1036\nWorking set    : 3832 KB\nPaged pool     : 31 KB\nNon-paged pool : 5 KB\nPage file size : 564 KB\n```\n\nThe `.dll` files are compiled with special `IACA` versions of each `Index()`\nroutine (i.e. 
`IndexIaca()`), so you can call `iaca.exe` on them to get an\nanalysis of the generated code, e.g.:\n\n```\nC:\\Temp\\ph.out\\x64\\Release\u003ec:\\src\\perfecthash\\bin\\iaca.exe HologramWorld_31016_Chm01_MultiplyShiftR_And.dll\nIntel(R) Architecture Code Analyzer Version -  v3.0-28-g1ba2cbb build date: 2017-10-23;17:30:24\nAnalyzed File -  HologramWorld_31016_Chm01_MultiplyShiftR_And.dll\nBinary Format - 64Bit\nArchitecture  -  SKL\nAnalysis Type - Throughput\n\nThroughput Analysis Report\n--------------------------\nBlock Throughput: 8.93 Cycles       Throughput Bottleneck: Backend\nLoop Count:  30\nPort Binding In Cycles Per Iteration:\n--------------------------------------------------------------------------------------------------\n|  Port  |   0   -  DV   |   1   |   2   -  D    |   3   -  D    |   4   |   5   |   6   |   7   |\n--------------------------------------------------------------------------------------------------\n| Cycles |  1.0     0.0  |  1.0  |  1.0     1.0  |  1.0     1.0  |  0.0  |  1.0  |  1.0  |  0.0  |\n--------------------------------------------------------------------------------------------------\n\nDV - Divider pipe (on port 0)\nD - Data fetch pipe (on ports 2 and 3)\nF - Macro Fusion with the previous instruction occurred\n* - instruction micro-ops not bound to a port\n^ - Micro Fusion occurred\n# - ESP Tracking sync uop was issued\n@ - SSE instruction followed an AVX256/AVX512 instruction, dozens of cycles penalty is expected\nX - instruction not supported, was not accounted in Analysis\n\n| Num Of   |                    Ports pressure in cycles                         |      |\n|  Uops    |  0  - DV    |  1   |  2  -  D    |  3  -  D    |  4   |  5   |  6   |  7   |\n-----------------------------------------------------------------------------------------\n|   1      |             | 1.0  |             |             |      |      |      |      | imul ecx, ecx, 0xff8d672d\n|   1      | 1.0         |      |             |           
  |      |      |      |      | shr rax, 0x9\n|   1*     |             |      |             |             |      |      |      |      | movzx edx, ax\n|   1      |             |      |             |             |      |      | 1.0  |      | shr rcx, 0xd\n|   1      |             |      | 1.0     1.0 |             |      |      |      |      | movzx eax, word ptr [r8+rdx*2]\n|   1*     |             |      |             |             |      |      |      |      | movzx edx, cx\n|   1      |             |      |             | 1.0     1.0 |      |      |      |      | movzx ecx, word ptr [r8+rdx*2]\n|   1      |             |      |             |             |      | 1.0  |      |      | add eax, ecx\nTotal Num Of Uops: 8\nAnalysis Notes:\nBackend allocation was stalled due to unavailable allocation resources.\n```\n\nNote, however, that this isn't an exact science: sometimes the compiler\nreorders the `IACA_VC_START()` and `IACA_VC_END()` markers such that you end up\nmissing a couple of instructions in the analysis.  In the example above, the\nanalysis omits the first `imul`, the `lea`, and the final `and`; the actual\nassembly for the `Index()` routine is as follows:\n\n```\nC:\\Temp\\ph.out\\x64\\Release\u003edumpbin /disasm HologramWorld_31016_Chm01_MultiplyShiftR_And.dll\nMicrosoft (R) COFF/PE Dumper Version 14.34.31935.0\nCopyright (C) Microsoft Corporation.  
All rights reserved.\n\n\nDump of file HologramWorld_31016_Chm01_MultiplyShiftR_And.dll\n\nFile Type: DLL\n\nCompiledPerfectHash_HologramWorld_31016_Chm01_MultiplyShiftR_And_Index:\n  0000000180001000: 69 C1 DF 0E AD FF  imul        eax,ecx,0FFAD0EDFh\n  0000000180001006: 4C 8D 05 F3 3F 02  lea         r8,[HologramWorld_31016_Chm01_MultiplyShiftR_And_TableData]\n                    00\n  000000018000100D: 69 C9 2D 67 8D FF  imul        ecx,ecx,0FF8D672Dh\n  0000000180001013: 48 C1 E8 09        shr         rax,9\n  0000000180001017: 0F B7 D0           movzx       edx,ax\n  000000018000101A: 48 C1 E9 0D        shr         rcx,0Dh\n  000000018000101E: 41 0F B7 04 50     movzx       eax,word ptr [r8+rdx*2]\n  0000000180001023: 0F B7 D1           movzx       edx,cx\n  0000000180001026: 41 0F B7 0C 50     movzx       ecx,word ptr [r8+rdx*2]\n  000000018000102B: 03 C1              add         eax,ecx\n  000000018000102D: 25 FF 7F 00 00     and         eax,7FFFh\n  0000000180001032: C3                 ret\n```\n\n### Linux Compilation\nTo compile the hash table on Linux (using WSL1 and GCC 9 as an example):\n```\n% cd /mnt/c/Temp/ph.out/HologramWorld_31016_Chm01_MultiplyShiftR_And\n% make\n% export LD_LIBRARY_PATH=.\n% ./BenchmarkIndex_HologramWorld_31016_Chm01_MultiplyShiftR_And\n8094\n```\n\nWith clang (version 10):\n```\n% make clean\n% CC=clang make\n% ./BenchmarkIndex_HologramWorld_31016_Chm01_MultiplyShiftR_And\n7068\n```\n\n### Mac Compilation\nIdentical to Linux, except you don't need `export LD_LIBRARY_PATH=.`:\n```\n% make\n% ./BenchmarkIndex_HologramWorld_31016_Chm01_MultiplyShiftR_And\n```\n","funding_links":[],"categories":[],"sub_categories":[],"project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Ftpn%2Fperfecthash","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Ftpn%2Fperfecthash","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Ftpn%2Fperfecthash/lists"}