{"id":21963053,"url":"https://github.com/kampersanda/fast_succinct_trie","last_synced_at":"2025-04-23T22:28:11.083Z","repository":{"id":93201062,"uuid":"199809504","full_name":"kampersanda/fast_succinct_trie","owner":"kampersanda","description":"String map implementation through Fast Succinct Trie","archived":false,"fork":false,"pushed_at":"2021-07-09T15:57:47.000Z","size":654,"stargazers_count":21,"open_issues_count":0,"forks_count":7,"subscribers_count":4,"default_branch":"master","last_synced_at":"2025-03-30T04:11:22.193Z","etag":null,"topics":["map","succinct-data-structure","trie"],"latest_commit_sha":null,"homepage":null,"language":"C++","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"apache-2.0","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/kampersanda.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null}},"created_at":"2019-07-31T08:04:02.000Z","updated_at":"2024-10-17T20:40:06.000Z","dependencies_parsed_at":"2023-03-04T05:30:21.295Z","dependency_job_id":null,"html_url":"https://github.com/kampersanda/fast_succinct_trie","commit_stats":null,"previous_names":[],"tags_count":0,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/kampersanda%2Ffast_succinct_trie","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/kampersanda%2Ffast_succinct_trie/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/kampersanda%2Ffast_succinct_trie/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/kampersanda%2Ffast_succinct_trie/manifests","owner_url":"https://repos.
ecosyste.ms/api/v1/hosts/GitHub/owners/kampersanda","download_url":"https://codeload.github.com/kampersanda/fast_succinct_trie/tar.gz/refs/heads/master","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":250525704,"owners_count":21445070,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["map","succinct-data-structure","trie"],"created_at":"2024-11-29T10:59:37.889Z","updated_at":"2025-04-23T22:28:11.076Z","avatar_url":"https://github.com/kampersanda.png","language":"C++","funding_links":[],"categories":[],"sub_categories":[],"readme":"# fast\\_succinct\\_trie\n\nThis library provides a string map based on the Fast Succinct Trie (FST), proposed in [SIGMOD 2018](http://www.cs.cmu.edu/~huanche1/publications/surf_paper.pdf). The library is implemented by modifying the original FST implementation [efficient/SuRF](https://github.com/efficient/SuRF) and applying a compact minimal-prefix trie form to simulate a trie-based string map (cf. Section 2.2 of [KAIS 2017](https://kampersanda.github.io/pdf/KAIS2017.pdf)).\n\n## What is FST?\n\nFST is a succinct trie data structure proposed in the following paper:\n\n\u003e Zhang, Lim, Leis, Andersen, Kaminsky, Keeton and Pavlo: **SuRF: Practical Range Query Filtering with Fast Succinct Tries.** In *SIGMOD 2018*, pp. 323-336.\n\nBriefly, FST is a practical variant of the [LOUDS-trie](https://bitbucket.org/vsmirnov/memoria/wiki/LabeledTree). FST uses two LOUDS implementations: one is fast and the other is space-efficient. 
FST partitions a trie into two layers at a certain level and applies the fast one to the top layer and the space-efficient one to the bottom layer. More detailed explanations can be found in the [slides](http://www.cs.cmu.edu/~huanche1/slides/FST.pdf) by the author.\n\nSince FST was developed for succinct range query filtering, [the original implementation](https://github.com/efficient/SuRF) allows false positives in query results. [This library](https://github.com/kampersanda/fast_succinct_trie) modifies it and provides a string map based on FST.\n\n## Install\n\nThis library consists of only header files. Please add the directory [`include`](https://github.com/kampersanda/fast_succinct_trie/tree/master/include) to your compiler's include path.\n\n## Build instructions\n\nYou can download and compile this library with the following commands:\n\n```sh\n$ git clone https://github.com/kampersanda/fast_succinct_trie.git\n$ cd fast_succinct_trie\n$ mkdir build\n$ cd build\n$ cmake ..\n$ make -j\n```\n\n## Sample usage\n\n```cpp\n#include \u003ccstdio\u003e\n#include \u003cfstream\u003e\n#include \u003ciostream\u003e\n#include \u003cstring\u003e\n#include \u003cvector\u003e\n\n#include \u003cfst.hpp\u003e\n\nint main() {\n    std::vector\u003cstd::string\u003e keys = {\n        \"ACML\",  \"AISTATS\", \"DS\",    \"DSAA\",   \"ICDM\",   \"ICML\",  //\n        \"PAKDD\", \"SDM\",     \"SIGIR\", \"SIGKDD\", \"SIGMOD\",\n    };\n\n    // build a trie index from the sorted string keys\n    fst::Trie trie(keys);\n\n    // keys are mapped to unique integers in the range [0,#keys)\n    std::cout \u003c\u003c \"[searching]\" \u003c\u003c std::endl;\n    for (size_t i = 0; i \u003c keys.size(); ++i) {\n        fst::position_t key_id = trie.exactSearch(keys[i]);\n        std::cout \u003c\u003c \" - \" \u003c\u003c keys[i] \u003c\u003c \": \" \u003c\u003c key_id \u003c\u003c std::endl;\n    }\n\n    std::cout \u003c\u003c \"[statistics]\" \u003c\u003c std::endl;\n    std::cout \u003c\u003c \" - number of keys: \" \u003c\u003c trie.getNumKeys() \u003c\u003c std::endl;\n    
std::cout \u003c\u003c \" - number of nodes: \" \u003c\u003c trie.getNumNodes() \u003c\u003c std::endl;\n    std::cout \u003c\u003c \" - number of suffix bytes: \" \u003c\u003c trie.getSuffixBytes() \u003c\u003c std::endl;\n    std::cout \u003c\u003c \" - memory usage in bytes: \" \u003c\u003c trie.getMemoryUsage() \u003c\u003c std::endl;\n    std::cout \u003c\u003c \" - output file size in bytes: \" \u003c\u003c trie.getSizeIO() \u003c\u003c std::endl;\n\n    std::cout \u003c\u003c \"[configure]\" \u003c\u003c std::endl;\n    trie.debugPrint(std::cout);\n\n    // write the trie-index to a file\n    {\n        std::ofstream ofs(\"fst.idx\");\n        trie.save(ofs);\n    }\n\n    // read the trie-index from a file\n    {\n        fst::Trie other;\n        std::ifstream ifs(\"fst.idx\");\n        other.load(ifs);\n    }\n\n    std::remove(\"fst.idx\");\n    return 0;\n}\n```\nThe output will be\n\n```\n[searching]\n - ACML: 1\n - AISTATS: 2\n - DS: 4\n - DSAA: 5\n - ICDM: 6\n - ICML: 7\n - PAKDD: 0\n - SDM: 3\n - SIGIR: 8\n - SIGKDD: 9\n - SIGMOD: 10\n[statistics]\n - number of keys: 11\n - number of nodes: 19\n - number of suffix bytes: 24\n - memory usage in bytes: 587\n - output file size in bytes: 312\n[configure]\n-- LoudsDense (heigth=1) --\nLABEL: A D I P S | \nCHILD: 1 1 1 0 1 | \nPREFX: 0         |\n-- LoudsSparse --\nLABEL: C I S C D I ? A D M G I K M ? \nCHILD: 0 0 1 1 0 1 0 0 0 0 1 0 0 0 \nLOUDS: 1 0 1 1 1 0 1 0 1 0 1 1 0 0 \n-- Suffixes --\nPOINTERS: 17 11 1 9 0 22 9 12 7 19 14 \nSUFFIXES: ? S T A T S ? R ? M ? M L ? O D ? A K D D ? A ?  
\n```\n\n## Todo\n\n- Support more operations\n- Also implement a normal trie form\n\n## Licensing\n\nThis library is free software provided under the [Apache License 2.0](https://github.com/kampersanda/fast_succinct_trie/blob/master/LICENSE), following the license of [efficient/SuRF](https://github.com/efficient/SuRF).\nThe modifications are noted in each source file.\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fkampersanda%2Ffast_succinct_trie","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fkampersanda%2Ffast_succinct_trie","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fkampersanda%2Ffast_succinct_trie/lists"}