{"id":17179257,"url":"https://github.com/bytehamster/mphf-experiments","last_synced_at":"2026-01-04T16:18:29.876Z","repository":{"id":110323232,"uuid":"548898122","full_name":"ByteHamster/MPHF-Experiments","owner":"ByteHamster","description":"Comparison of different MPHF algorithms","archived":false,"fork":false,"pushed_at":"2025-04-02T15:24:20.000Z","size":2955,"stargazers_count":7,"open_issues_count":0,"forks_count":3,"subscribers_count":6,"default_branch":"main","last_synced_at":"2025-04-02T15:24:30.467Z","etag":null,"topics":[],"latest_commit_sha":null,"homepage":null,"language":"C++","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"gpl-3.0","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/ByteHamster.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null}},"created_at":"2022-10-10T11:04:58.000Z","updated_at":"2025-04-02T15:24:04.000Z","dependencies_parsed_at":null,"dependency_job_id":"f73ef593-3ea0-43dd-abb6-e28918d3ffc7","html_url":"https://github.com/ByteHamster/MPHF-Experiments","commit_stats":{"total_commits":225,"total_committers":1,"mean_commits":225.0,"dds":0.0,"last_synced_commit":"aef529ac6303e768bf142a214dd09971127c16d8"},"previous_names":[],"tags_count":1,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/ByteHamster%2FMPHF-Experiments","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/ByteHamster%2FMPHF-Experiments/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/ByteHamster%2FMPHF-Experiments/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/reposi
tories/ByteHamster%2FMPHF-Experiments/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/ByteHamster","download_url":"https://codeload.github.com/ByteHamster/MPHF-Experiments/tar.gz/refs/heads/main","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":248750108,"owners_count":21155686,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":[],"created_at":"2024-10-15T00:25:21.458Z","updated_at":"2026-01-04T16:18:29.871Z","avatar_url":"https://github.com/ByteHamster.png","language":"C++","funding_links":[],"categories":[],"sub_categories":[],"readme":"# MPHF-Experiments\n\nComparison of a wide range of different minimal perfect hash functions (MPHFs).\nFrom the measurements, the framework can generate comprehensive plots such as Pareto plots, as well as the simple comparison tables used in several papers.\n\n\u003cimg src=\"img/preview-dominance-map.png\" width=\"500\"/\u003e\n\nThe framework provides a unified interface to test nearly all modern MPHF constructions that are currently available, including:\n\n- Bucket Placement\n  - CHD ([Paper](https://doi.org/10.1007/978-3-642-04128-0_61), [Code](https://cmph.sourceforge.net/))\n  - PHOBIC ([Paper](https://doi.org/10.4230/LIPIcs.ESA.2024.69), [Code](https://github.com/jermp/pthash))\n  - FCH ([Paper](https://doi.org/10.1145/133160.133209), [Code](https://cmph.sourceforge.net/))\n  - FCH (Re-Implementation by Pibiri) ([Code](https://github.com/roberto-trani/mphf_benchmark/blob/main/include/fch.hpp))\n  - PHOBIC-GPU ([Paper](https://doi.org/10.4230/LIPIcs.ESA.2024.69), 
[Code](https://github.com/stefanfred/PHOBIC-GPU))\n  - PTHash ([Paper](https://doi.org/10.1145/3404835.3462849), [Code](https://github.com/jermp/pthash))\n  - PTHash-HEM ([Paper](https://doi.org/10.1109/TKDE.2023.3303341), [Code](https://github.com/jermp/pthash))\n  - PHast, PHast+ ([Paper](https://arxiv.org/pdf/2504.17918), [Code](https://github.com/beling/bsuccinct-rs/))\n  - PtrHash ([Paper](https://doi.org/10.48550/ARXIV.2502.15539), [Code](https://github.com/RagnarGrootKoerkamp/PTRHash))\n- Fingerprinting\n  - BBHash ([Paper](https://doi.org/10.4230/LIPICS.SEA.2017.25), [Code](https://github.com/rizkg/BBHash))\n  - FiPS ([Paper](https://doi.org/10.5445/IR/1000176432), [Code](https://github.com/ByteHamster/FiPS))\n  - FMPH ([Paper](https://doi.org/10.1145/3596453), [Code](https://github.com/beling/bsuccinct-rs/))\n  - FMPH-GO ([Paper](https://doi.org/10.1145/3596453), [Code](https://github.com/beling/bsuccinct-rs/))\n- RecSplit\n  - RecSplit ([Paper](https://doi.org/10.1137/1.9781611976007.14), [Code](https://github.com/vigna/sux/blob/master/sux/function/RecSplit.hpp))\n  - GpuRecSplit ([Paper](https://doi.org/10.4230/LIPICS.ESA.2023.19), [Code](https://github.com/ByteHamster/GpuRecSplit))\n  - SIMDRecSplit, RecSplit with rotation fitting ([Paper](https://doi.org/10.4230/LIPICS.ESA.2023.19), [Code](https://github.com/ByteHamster/GpuRecSplit))\n  - Consensus-RS ([Paper](https://doi.org/10.48550/ARXIV.2502.05613), [Code](https://github.com/ByteHamster/ConsensusRecSplit/))\n- Retrieval\n  - BDZ / BPZ ([Paper](https://doi.org/10.1145/1321440.1321532), [Code](https://cmph.sourceforge.net/))\n  - BMZ ([Paper](https://www.researchgate.net/publication/228715398_A_new_algorithm_for_constructing_minimal_perfect_hash_functions), [Code](https://cmph.sourceforge.net/))\n  - CHM ([Paper](https://doi.org/10.1016/0020-0190\\(92\\)90220-P), [Code](https://cmph.sourceforge.net/))\n  - WBPM ([Paper](https://doi.org/10.1609/AAAI.V34I02.5529), 
[Code](https://github.com/weaversa/MPHF-WBPM))\n  - SicHash ([Paper](https://doi.org/10.1137/1.9781611977561.CH15), [Code](https://github.com/ByteHamster/SicHash))\n- ShockHash\n  - ShockHash (+ SIMD version) ([Paper](https://doi.org/10.1137/1.9781611977929.15), [Code](https://github.com/ByteHamster/ShockHash))\n  - Bipartite ShockHash ([Paper](https://doi.org/10.48550/ARXIV.2310.14959), [Code](https://github.com/ByteHamster/ShockHash))\n  - Bipartite ShockHash-Flat ([Paper](https://doi.org/10.48550/ARXIV.2310.14959), [Code](https://github.com/ByteHamster/ShockHash))\n  - MorphisHash ([Paper](https://doi.org/10.48550/ARXIV.2503.10161), [Code](https://github.com/stefanfred/MorphisHash))\n  - MorphisHash-Flat ([Paper](https://doi.org/10.48550/ARXIV.2503.10161), [Code](https://github.com/stefanfred/MorphisHash))\n\n### Cloning the Repository\n\nThis repository contains submodules.\nTo clone the repository including submodules, use the following command.\n\n```\ngit clone --recursive https://github.com/ByteHamster/MPHF-Experiments.git\n```\n\n### Running the Experiments Directly\n\nCompilation works as with any CMake project.\n\n```\ncmake -B ./build -DCMAKE_BUILD_TYPE=Release\ncmake --build ./build -j\n```\n\nThis might take about 5-15 minutes because of the large number of competitors.\nYou can then run one of the benchmarks, for example `./build/TablePtrHash --help` or `./build/Comparison --help`.\n\n### Code Structure\n\nThe main comparison code can be found in the [`src` directory](/src).\nThis includes the tabular comparisons used in different papers, as well as the more general Pareto plot in [`src/Comparison.cpp`](/src/Comparison.cpp).\nTo add a new competitor to the framework, have a look at the [`contenders` directory](/contenders).\nFor each contender, there are two files. 
(1) A general wrapper header class that unifies the interface of the competitor, and (2) a cpp file that tests a wide range of configurations for the general Pareto plot.\nThe cpp file should contain all meaningful configurations to cover all possible trade-offs.\nAfter adding a contender, make sure to re-run cmake.\nIf you want to add a new comparison table, make sure to also adapt the [`CMakeLists.txt` file](/CMakeLists.txt) accordingly.\n\n### Running the Experiments with Docker\n\nFor easier reproducibility and less setup overhead, we provide a docker image to run the experiments.\nHowever, for the measurements in the papers, we run the code directly and with more data points.\nWe refer to [Docker.md](/Docker.md) for details on how to use this repository with Docker.\n\n### License\n\nThis code is licensed under the [GPLv3](/LICENSE).\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fbytehamster%2Fmphf-experiments","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fbytehamster%2Fmphf-experiments","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fbytehamster%2Fmphf-experiments/lists"}