{"id":17154817,"url":"https://github.com/henrifroese/external_memory_fractal_tree","last_synced_at":"2025-07-23T04:34:06.352Z","repository":{"id":135872893,"uuid":"332419329","full_name":"henrifroese/external_memory_fractal_tree","owner":"henrifroese","description":"Implementation of External Memory Fractal Tree (a variant of the Buffered Repository Tree) in C++ through the STXXL library.","archived":false,"fork":false,"pushed_at":"2021-01-24T13:12:56.000Z","size":4064,"stargazers_count":2,"open_issues_count":0,"forks_count":1,"subscribers_count":1,"default_branch":"master","last_synced_at":"2025-03-24T13:44:46.290Z","etag":null,"topics":["b-tree-implementation","cpp","data-structures","external-memory"],"latest_commit_sha":null,"homepage":"","language":"C++","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":null,"status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/henrifroese.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":null,"code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null}},"created_at":"2021-01-24T10:22:34.000Z","updated_at":"2024-10-02T14:53:36.000Z","dependencies_parsed_at":null,"dependency_job_id":"21f87d2f-5b4c-4659-af16-fc9c4b4cf60c","html_url":"https://github.com/henrifroese/external_memory_fractal_tree","commit_stats":null,"previous_names":[],"tags_count":0,"template":false,"template_full_name":null,"purl":"pkg:github/henrifroese/external_memory_fractal_tree","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/henrifroese%2Fexternal_memory_fractal_tree","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/henrifroese%2Fexternal_memory_fractal_tree/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/Gi
tHub/repositories/henrifroese%2Fexternal_memory_fractal_tree/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/henrifroese%2Fexternal_memory_fractal_tree/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/henrifroese","download_url":"https://codeload.github.com/henrifroese/external_memory_fractal_tree/tar.gz/refs/heads/master","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/henrifroese%2Fexternal_memory_fractal_tree/sbom","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":266618818,"owners_count":23957273,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","status":"online","status_checked_at":"2025-07-23T02:00:09.312Z","response_time":66,"last_error":null,"robots_txt_status":null,"robots_txt_updated_at":null,"robots_txt_url":"https://github.com/robots.txt","online":true,"can_crawl_api":true,"host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["b-tree-implementation","cpp","data-structures","external-memory"],"created_at":"2024-10-14T21:50:02.830Z","updated_at":"2025-07-23T04:34:06.316Z","avatar_url":"https://github.com/henrifroese.png","language":"C++","funding_links":[],"categories":[],"sub_categories":[],"readme":"# External Memory Fractal Tree\n\nThis repository houses the (header-only) implementation of an [external-memory fractal tree](https://en.wikipedia.org/wiki/Fractal_tree_index)\nusing the C++ [STXXL library](https://stxxl.org/tags/master).\n\nA fractal tree (a variant of the buffered repository tree) is a tree data structure similar to a [B-Tree](https://en.wikipedia.org/wiki/B-tree). 
Each node occupies one block of memory of size B and has up to sqrt(B) keys\nthat function just like those in a B-Tree. However, each node additionally fills the rest of its block with a buffer of size\nB-sqrt(B). On insertion, new key-datum pairs are quickly inserted into the root node buffer (and later pushed down to the buffers in the\nnext level when a buffer is full), reducing the number of I/Os per insertion by a factor of sqrt(B)\ncompared to a B-Tree. Fractal Tree searches are a factor of 2 slower than B-Tree searches.\n\n## Performance\nWith a block size of B, N items in the tree, and (for range search) X items in the result, fractal trees\nand B-Trees need the following number of _disk accesses_ (I/Os):\n\n| Tree Type  | Insertion  | Search  | Range-Search  |\n|---|---|---|---|\n| Fractal Tree  | ![\\frac{1}{\\sqrt{B}}\\log_B (\\frac{N}{B})](https://latex.codecogs.com/gif.latex?%5Cfrac%7B1%7D%7B%5Csqrt%7BB%7D%7D%5Clog_B%20%28%5Cfrac%7BN%7D%7BB%7D%29)  | ![2\\log_B (\\frac{N}{B})](https://latex.codecogs.com/gif.latex?2%5Clog_B%20%28%5Cfrac%7BN%7D%7BB%7D%29)  | ![2\\log_B(\\frac{N}{B}) + \\frac{X}{B}](https://latex.codecogs.com/gif.latex?2%5Clog_B%28%5Cfrac%7BN%7D%7BB%7D%29%20\u0026plus;%20%5Cfrac%7BX%7D%7BB%7D)  |\n| B-Tree  | ![\\log_B(\\frac{N}{B})](https://latex.codecogs.com/gif.latex?%5Clog_B%28%5Cfrac%7BN%7D%7BB%7D%29)  | ![\\log_B(\\frac{N}{B})](https://latex.codecogs.com/gif.latex?%5Clog_B%28%5Cfrac%7BN%7D%7BB%7D%29)  | ![\\log_B(\\frac{N}{B}) + \\frac{X}{B}](https://latex.codecogs.com/gif.latex?%5Clog_B%28%5Cfrac%7BN%7D%7BB%7D%29%20\u0026plus;%20%5Cfrac%7BX%7D%7BB%7D)  |\n\nHere are some results of measurements I ran (for details, see [the report](report.pdf)):\n\n\u003cp float=\"left\"\u003e\n  \u003cimg src=\"/benchmarks/Random_Search.png\" width=\"32%\" /\u003e\n  \u003cimg src=\"/benchmarks/Random_Insertion.png\" width=\"32%\" /\u003e \n  \u003cimg src=\"/benchmarks/Range Search (amortized over 5 queries).png\" width=\"32%\" /\u003e\n\u003c/p\u003e\n\n## Example 
Usage\nSee the [run-fractal-tree.cpp](run-fractal-tree.cpp) file:\n```cpp\n#include \u003ccassert\u003e\n#include \u003cutility\u003e\n#include \u003cvector\u003e\n\n#include \"include/fractal_tree/fractal_tree.h\"\n\n\nint main()\n{\n    using key_type = unsigned int;\n    using data_type = unsigned int;\n    using value_type = std::pair\u003ckey_type, data_type\u003e;\n    constexpr unsigned block_size = 2u \u003c\u003c 12u; // 8 KiB blocks (2 \u003c\u003c 12 = 8192 bytes)\n    constexpr unsigned cache_size = 2u \u003c\u003c 15u; // 64 KiB cache (2 \u003c\u003c 15 = 65536 bytes)\n\n    using ftree_type = stxxl::ftree\u003ckey_type, data_type, block_size, cache_size\u003e;\n    ftree_type f;\n\n    // insert 2 MiB of data (2 \u003c\u003c 20 bytes), mapping each key k to 2*k\n    for (key_type k = 0; k \u003c (2u \u003c\u003c 20u) / sizeof(value_type); k++) {\n        f.insert(value_type(k, 2*k));\n    }\n\n    // find datum of given key; the bool reports whether the key was found\n    std::pair\u003cdata_type, bool\u003e datum_and_found = f.find(1);\n    assert(datum_and_found.second);\n    assert(datum_and_found.first == 2);\n\n    // find values in key range [100, 1000]\n    std::vector\u003cvalue_type\u003e range_values = f.range_find(100, 1000);\n    std::vector\u003cvalue_type\u003e correct_range_values {};\n    for (key_type k = 100; k \u003c= 1000; k++) {\n        correct_range_values.emplace_back(k, 2*k);\n    }\n    assert(range_values == correct_range_values);\n\n\n    return 0;\n}\n\n```\n## Building \u0026 Using\n1. clone the repo\n2. cd into the repo\n3. run `git submodule init`\n4. run `git submodule update --init --recursive`\n\n## Details\nMore implementation details, an introduction to external memory trees, and benchmarks can be found [in this report](report.pdf).\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fhenrifroese%2Fexternal_memory_fractal_tree","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fhenrifroese%2Fexternal_memory_fractal_tree","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fhenrifroese%2Fexternal_memory_fractal_tree/lists"}