{"id":21963048,"url":"https://github.com/kampersanda/poplar-trie","last_synced_at":"2025-04-23T22:28:11.529Z","repository":{"id":29676267,"uuid":"122721936","full_name":"kampersanda/poplar-trie","owner":"kampersanda","description":"C++17 implementation of memory-efficient dynamic tries","archived":false,"fork":false,"pushed_at":"2022-02-15T18:14:24.000Z","size":870,"stargazers_count":58,"open_issues_count":4,"forks_count":5,"subscribers_count":4,"default_branch":"master","last_synced_at":"2025-03-30T04:11:19.862Z","etag":null,"topics":["c-plus-plus-17","map","string","trie"],"latest_commit_sha":null,"homepage":"","language":"C++","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"mit","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/kampersanda.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null}},"created_at":"2018-02-24T08:51:29.000Z","updated_at":"2025-01-18T05:34:22.000Z","dependencies_parsed_at":"2022-08-07T14:30:35.556Z","dependency_job_id":null,"html_url":"https://github.com/kampersanda/poplar-trie","commit_stats":null,"previous_names":[],"tags_count":4,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/kampersanda%2Fpoplar-trie","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/kampersanda%2Fpoplar-trie/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/kampersanda%2Fpoplar-trie/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/kampersanda%2Fpoplar-trie/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/kampersanda","download_url":"https://codeload.github.com/kampersanda/poplar-trie/tar.gz/refs/heads/master","ho
st":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":250525705,"owners_count":21445070,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["c-plus-plus-17","map","string","trie"],"created_at":"2024-11-29T10:59:32.100Z","updated_at":"2025-04-23T22:28:11.506Z","avatar_url":"https://github.com/kampersanda.png","language":"C++","funding_links":[],"categories":[],"sub_categories":[],"readme":"# Poplar-trie: A C++17 implementation of memory-efficient dynamic tries\n\nPoplar-trie is a C++17 library of a memory-efficient associative array whose keys are strings. The data structure is based on a dynamic path-decomposed trie (DynPDT) described in the paper, Shunsuke Kanda, Dominik Köppl, Yasuo Tabei, Kazuhiro Morita, and Masao Fuketa: [Dynamic Path-decomposed Tries](https://arxiv.org/abs/1906.06015), *ACM Journal of Experimental Algorithmics (JEA)*, *25*(1): 1–28, 2020.\n\n## Implementation overview\n\nPoplar-trie is a memory-efficient updatable associative array implementation which maps key strings to values of any type like `std::map\u003cstd::string,anytype\u003e`.\nDynPDT is composed of two structures: dynamic trie and node label map (NLM) structures.\nThis library contains some implementations of those structures, as follows.\n\n### Implementations based on m-Bonsai\n\n- Classes [`plain_bonsai_trie`](https://github.com/kampersanda/poplar-trie/blob/master/include/poplar/plain_bonsai_trie.hpp) and [`compact_bonsai_trie`](https://github.com/kampersanda/poplar-trie/blob/master/include/poplar/compact_bonsai_trie.hpp) are dynamic trie implementations based 
on [m-Bonsai](https://github.com/Poyias/mBonsai).\n- Classes [`plain_bonsai_nlm`](https://github.com/kampersanda/poplar-trie/blob/master/include/poplar/plain_bonsai_nlm.hpp) and [`compact_bonsai_nlm`](https://github.com/kampersanda/poplar-trie/blob/master/include/poplar/compact_bonsai_nlm.hpp) are NLM implementations designed for these dynamic tries.\n\n### Implementations based on FK-hash\n\n- Classes [`plain_fkhash_trie`](https://github.com/kampersanda/poplar-trie/blob/master/include/poplar/plain_fkhash_trie.hpp) and [`compact_fkhash_trie`](https://github.com/kampersanda/poplar-trie/blob/master/include/poplar/compact_fkhash_trie.hpp) are dynamic trie implementations based on [HashTrie](https://github.com/tudocomp/tudocomp) developed by Fischer and Köppl.\n- Classes [`plain_fkhash_nlm`](https://github.com/kampersanda/poplar-trie/blob/master/include/poplar/plain_fkhash_nlm.hpp) and [`compact_fkhash_nlm`](https://github.com/kampersanda/poplar-trie/blob/master/include/poplar/compact_fkhash_nlm.hpp) are NLM implementations designed for these dynamic tries.\n\n### Aliases\n\nClass [`map`](https://github.com/kampersanda/poplar-trie/blob/master/include/poplar/map.hpp) takes these classes as the template arguments and implements the associative array.\nSo, there are some implementation combinations.\nIn [`poplar.hpp`](https://github.com/kampersanda/poplar-trie/blob/master/include/poplar.hpp), the following aliases are provided.\n\n| Alias                     | Trie Impl.            | NLM impl.            
|\n| :------------------------ | :-------------------- | :------------------- |\n| `plain_bonsai_map`        | `plain_bonsai_trie`   | `plain_bonsai_nlm`   |\n| `semi_compact_bonsai_map` | `plain_bonsai_trie`   | `compact_bonsai_nlm` |\n| `compact_bonsai_map`      | `compact_bonsai_trie` | `compact_bonsai_nlm` |\n| `plain_fkhash_map`        | `plain_fkhash_trie`   | `plain_fkhash_nlm`   |\n| `semi_compact_fkhash_map` | `plain_fkhash_trie`   | `compact_fkhash_nlm` |\n| `compact_fkhash_map`      | `compact_fkhash_trie` | `compact_fkhash_nlm` |\n\n\n## Install\n\nThis library consists of only header files.\nPlease add the directory [`poplar-trie/include`](https://github.com/kampersanda/poplar-trie/tree/master/include) to your include path.\n\n\n## Build instructions\n\nYou can download and compile Poplar-trie with the following commands.\n\n```\n$ git clone https://github.com/kampersanda/poplar-trie.git\n$ cd poplar-trie\n$ mkdir build\n$ cd build\n$ cmake ..\n$ make\n$ make install\n```\n\nThe library uses C++17, so please install g++ 7.0 (or greater) or clang 4.0 (or greater).\nIn addition, CMake 2.8 (or greater) has to be installed to compile the library.\n\nBy default, the library attempts to use `SSE4.2` for popcount primitives.\nIf you do not want to use it, please set `DISABLE_SSE4_2` at build time, e.g., `cmake .. 
-DDISABLE_SSE4_2=1`.\n\n## Easy example\n\nThe following code is a simple example of inserting and searching for key-value pairs.\n\n```c++\n#include \u003ciostream\u003e\n#include \u003cstring\u003e\n#include \u003cvector\u003e\n\n#include \u003cpoplar.hpp\u003e\n\nint main() {\n  std::vector\u003cstd::string\u003e keys = {\"Aoba\", \"Yun\",    \"Hajime\", \"Hihumi\", \"Kou\",\n                                   \"Rin\",  \"Hazuki\", \"Umiko\",  \"Nene\"};\n  const auto num_keys = static_cast\u003cint\u003e(keys.size());\n\n  poplar::plain_bonsai_map\u003cint\u003e map;\n\n  try {\n    for (int i = 0; i \u003c num_keys; ++i) {\n      int* ptr = map.update(keys[i]);\n      *ptr = i + 1;\n    }\n    for (int i = 0; i \u003c num_keys; ++i) {\n      const int* ptr = map.find(keys[i]);\n      if (ptr == nullptr or *ptr != i + 1) {\n        return 1;\n      }\n      std::cout \u003c\u003c keys[i] \u003c\u003c \": \" \u003c\u003c *ptr \u003c\u003c std::endl;\n    }\n    {\n      const int* ptr = map.find(\"Hotaru\");\n      if (ptr != nullptr) {\n        return 1;\n      }\n      std::cout \u003c\u003c \"Hotaru: \" \u003c\u003c -1 \u003c\u003c std::endl;\n    }\n  } catch (const poplar::exception\u0026 ex) {\n    std::cerr \u003c\u003c ex.what() \u003c\u003c std::endl;\n    return 1;\n  }\n\n  std::cout \u003c\u003c \"#keys = \" \u003c\u003c map.size() \u003c\u003c std::endl;\n\n  return 0;\n}\n```\n\nThe output will be\n\n```\nAoba: 1\nYun: 2\nHajime: 3\nHihumi: 4\nKou: 5\nRin: 6\nHazuki: 7\nUmiko: 8\nNene: 9\nHotaru: -1\n#keys = 9\n```\n\n### Note: Deletion implementation\n\nSince DynPDT cannot support garbage collection for deleted keys, Poplar-trie does not provide deletion functions. However, you can emulate deletion by setting the value associated with a deleted key to an invalid value. 
For example,\n\n```c++\nint* ptr = map.update(deleted_key);\n*ptr = -1; // invalid value\n```\n\nIn this approach, the memory used for deleted keys is not released, although it may be reused for keys inserted subsequently.\n\n## Benchmarks\n\nComparison experiments were conducted on one core of a quad-core Intel Xeon CPU E5-2680 v2 clocked at 2.80 GHz in a machine with 256 GB of RAM, running the 64-bit version of CentOS 6.10 based on Linux 2.6.\nThe source code was compiled with g++ (version 7.3.0) in optimization mode -O3.\n\nTo measure the performance, we inserted the strings of a dataset into a data structure in random order, and measured the maximum resident set size and insertion time.\nThe lookup time was measured by retrieving a million strings randomly extracted from the dataset.\n\nThe source code for the experiments is available at [dictionary_bench](https://github.com/kampersanda/dictionary_bench).\n\n### Page Titles of English Wikipedia\n\n- Dataset: All page titles from English Wikipedia in Sep. 
2018\n- Number of keys: 14,130,439\n- File size: 0.28 GiB\n\n| Implementation                                                                   | Space (GiB) | Insert (us/key) | Lookup (us/key) |\n| -------------------------------------------------------------------------------- | ----------: | --------------: | --------------: |\n| [`poplar::plain_bonsai_map`](https://github.com/kampersanda/poplar-trie)         |        0.64 |            0.98 |            0.68 |\n| [`poplar::semi_compact_bonsai_map`](https://github.com/kampersanda/poplar-trie)  |        0.28 |            1.60 |            0.96 |\n| [`poplar::compact_bonsai_map`](https://github.com/kampersanda/poplar-trie)       |        0.24 |            1.71 |            1.02 |\n| [`poplar::plain_fkhash_map`](https://github.com/kampersanda/poplar-trie)         |        0.67 |            0.79 |            0.86 |\n| [`poplar::semi_compact_fkhash_map`](https://github.com/kampersanda/poplar-trie)  |        0.31 |            0.96 |            1.15 |\n| [`poplar::compact_fkhash_map`](https://github.com/kampersanda/poplar-trie)       |        0.27 |            1.14 |            1.22 |\n| [`std::unordered_map`](http://en.cppreference.com/w/cpp/container/unordered_map) |        1.29 |            0.50 |            0.27 |\n| [`google::dense_hash_map`](https://github.com/sparsehash/sparsehash)             |        1.64 |            0.54 |            0.14 |\n| [`spp::sparse_hash_map`](https://github.com/greg7mdp/sparsepp)                   |        0.97 |            0.69 |            0.18 |\n| [`tsl::hopscotch_map`](https://github.com/Tessil/hopscotch-map)                  |        1.08 |            0.42 |            0.13 |\n| [`tsl::robin_map`](https://github.com/Tessil/robin-map)                          |        1.83 |            0.41 |            0.12 |\n| [`tsl::array_map`](https://github.com/Tessil/array-hash)                         |        0.69 |            0.73 |            0.14 |\n| 
[`tsl::htrie_map`](https://github.com/Tessil/hat-trie)                           |        0.43 |            0.60 |            0.27 |\n| [`JudySL`](http://judy.sourceforge.net)                                          |        0.66 |            0.92 |            0.74 |\n| [`libart`](https://github.com/armon/libart)                                      |        1.23 |            1.00 |            0.73 |\n| [`cedar::da`](http://www.tkl.iis.u-tokyo.ac.jp/~ynaga/cedar/) (reduced trie)     |        1.19 |            0.89 |            0.59 |\n| [`cedar::da`](http://www.tkl.iis.u-tokyo.ac.jp/~ynaga/cedar/) (prefix trie)      |        0.63 |            0.89 |            0.61 |\n\n### URLs of UK domain\n\n- Dataset: URLs obtained from a 2005 crawl of the `.uk` domain performed by UbiCrawler\n- Number of keys: 39,459,925\n- File size: 2.7 GiB\n\n| Implementation                                                                   | Space (GiB) | Insert (us/key) | Lookup (us/key) |\n| -------------------------------------------------------------------------------- | ----------: | --------------: | --------------: |\n| [`poplar::plain_bonsai_map`](https://github.com/kampersanda/poplar-trie)         |        2.32 |            1.45 |            0.94 |\n| [`poplar::semi_compact_bonsai_map`](https://github.com/kampersanda/poplar-trie)  |        1.26 |            2.76 |            1.44 |\n| [`poplar::compact_bonsai_map`](https://github.com/kampersanda/poplar-trie)       |        1.09 |            2.87 |            1.44 |\n| [`poplar::plain_fkhash_map`](https://github.com/kampersanda/poplar-trie)         |        2.32 |            1.27 |            1.24 |\n| [`poplar::semi_compact_fkhash_map`](https://github.com/kampersanda/poplar-trie)  |        1.38 |            1.74 |            1.93 |\n| [`poplar::compact_fkhash_map`](https://github.com/kampersanda/poplar-trie)       |        1.21 |            2.04 |            2.02 |\n| 
[`std::unordered_map`](http://en.cppreference.com/w/cpp/container/unordered_map) |        6.05 |            0.67 |            0.50 |\n| [`google::dense_hash_map`](https://github.com/sparsehash/sparsehash)             |       10.50 |            1.09 |            0.27 |\n| [`spp::sparse_hash_map`](https://github.com/greg7mdp/sparsepp)                   |        5.06 |            0.96 |            0.37 |\n| [`tsl::hopscotch_map`](https://github.com/Tessil/hopscotch-map)                  |        6.23 |            0.75 |            0.25 |\n| [`tsl::robin_map`](https://github.com/Tessil/robin-map)                          |        9.23 |            0.63 |            0.25 |\n| [`tsl::array_map`](https://github.com/Tessil/array-hash)                         |        5.91 |            1.16 |            0.28 |\n| [`tsl::htrie_map`](https://github.com/Tessil/hat-trie)                           |        2.68 |            1.08 |            0.51 |\n| [`JudySL`](http://judy.sourceforge.net)                                          |        2.21 |            1.88 |            1.59 |\n| [`libart`](https://github.com/armon/libart)                                      |        5.17 |            1.64 |            1.19 |\n| [`cedar::da`](http://www.tkl.iis.u-tokyo.ac.jp/~ynaga/cedar/) (reduced trie)     |        7.37 |            2.24 |            2.30 |\n| [`cedar::da`](http://www.tkl.iis.u-tokyo.ac.jp/~ynaga/cedar/) (prefix trie)      |        2.02 |            2.20 |            2.28 |\n\n## Todo\n\n- Add comments to the code\n- Create the API documentation\n\n## Licensing\n\nThis library is free software provided under the [MIT License](https://github.com/kampersanda/poplar-trie/blob/master/LICENSE).\n\nIf you use the library, please cite the following paper:\n\n```tex\n@article{kanda2020dynamic,\n  title={Dynamic Path-decomposed Tries},\n  author={Kanda, Shunsuke and K{\\\"o}ppl, Dominik and Tabei, Yasuo and Morita, Kazuhiro and Fuketa, Masao},\n  journal={Journal of Experimental 
Algorithmics (JEA)},\n  volume={25},\n  number={1},\n  pages={1--28},\n  year={2020},\n  publisher={ACM}\n}\n```\n\n## Related work\n\n- [compact\\_sparse\\_hash](https://github.com/tudocomp/compact_sparse_hash) is an efficient implementation of a compact associative array with integer keys.\n- [mBonsai](https://github.com/Poyias/mBonsai) is the original implementation of succinct dynamic tries.\n- [tudocomp](https://github.com/tudocomp/tudocomp) includes many dynamic trie implementations for LZ factorization.\n\n## Special thanks\n\nThanks to [Dr. Dominik Köppl](https://github.com/koeppl) I was able to create the bijective hash function in `bijective_hash.hpp`.\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fkampersanda%2Fpoplar-trie","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fkampersanda%2Fpoplar-trie","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fkampersanda%2Fpoplar-trie/lists"}