https://github.com/henrifroese/external_memory_fractal_tree
Implementation of External Memory Fractal Tree (a variant of the Buffered Repository Tree) in C++ through the STXXL library.
b-tree-implementation cpp data-structures external-memory
- Host: GitHub
- URL: https://github.com/henrifroese/external_memory_fractal_tree
- Owner: henrifroese
- Created: 2021-01-24T10:22:34.000Z (over 4 years ago)
- Default Branch: master
- Last Pushed: 2021-01-24T13:12:56.000Z (over 4 years ago)
- Last Synced: 2025-01-29T18:46:24.572Z (4 months ago)
- Topics: b-tree-implementation, cpp, data-structures, external-memory
- Language: C++
- Homepage:
- Size: 3.88 MB
- Stars: 2
- Watchers: 1
- Forks: 1
- Open Issues: 0
Metadata Files:
- Readme: README.md
# External Memory Fractal Tree
This repository houses the (header-only) implementation of an [external-memory fractal tree](https://en.wikipedia.org/wiki/Fractal_tree_index)
using the C++ [STXXL library](https://stxxl.org/tags/master).

A fractal tree (a variant of the buffered repository tree) is a tree data structure similar to a [B-Tree](https://en.wikipedia.org/wiki/B-tree). Each node occupies one block of memory of size B and has up to sqrt(B) keys
that function just like those in a B-Tree. However, each node additionally fills the rest of its block with a buffer of size
B-sqrt(B). On insertion, new key-datum pairs are quickly inserted into the root node's buffer (and later pushed down to the buffers in the
next level when a buffer is full), which reduces the amortized I/O cost of an insertion by a factor of about sqrt(B)
compared to a B-Tree. Fractal tree searches are a factor of 2 slower than B-Tree searches, since a fanout of sqrt(B) rather than B roughly doubles the height of the tree.

## Performance
With a block size of B, N items in the tree, and X items in the result of a range search, fractal trees
and B-Trees need the following number of _disk accesses_ (I/Os):

| Tree Type | Insertion | Search | Range-Search |
|---|---|---|---|
| Fractal Tree | O(log_B(N) / sqrt(B)) amortized | O(log_B(N)) | O(log_B(N) + X/B) |
| B-Tree | O(log_B(N)) | O(log_B(N)) | O(log_B(N) + X/B) |

For the results of measurements I ran, including plots, see [the report](report.pdf).
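As a rough back-of-the-envelope illustration of the comparison above, the following sketch estimates I/O counts from the tree height alone. It assumes the commonly quoted bounds (a B-Tree with fanout about B, a fractal tree with fanout about sqrt(B) and an amortized insertion cost of about 1/sqrt(B) I/Os per level) and ignores all constant factors. The names `IoEstimate`, `btree_ios` and `ftree_ios` are hypothetical helpers, not part of this repository, and `B` is measured in items per block.

```cpp
#include <cmath>

// Hypothetical helper type: estimated I/Os per operation, constants ignored.
struct IoEstimate {
    double insert;
    double search;
};

// B-Tree with fanout ~B: insertion and search each walk one root-to-leaf
// path, so both cost about log_B(N) I/Os.
IoEstimate btree_ios(double B, double N) {
    const double height = std::log(N) / std::log(B);
    return {height, height};
}

// Fractal tree with fanout ~sqrt(B): the path is about twice as long, but
// buffered items move down in batches, so each level is assumed to cost
// about 1/sqrt(B) I/Os per inserted item (amortized).
IoEstimate ftree_ios(double B, double N) {
    const double fanout = std::sqrt(B);
    const double height = std::log(N) / std::log(fanout);
    return {height / fanout, height};
}
```

Under these assumptions, B = 1024 and N = 2^30 give about 3 I/Os per B-Tree insertion against about 0.19 amortized I/Os for the fractal tree, and 3 against 6 I/Os per search. Measured numbers depend on caching and constants that this estimate leaves out.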
## Example Usage
See the [run-fractal-tree.cpp](run-fractal-tree.cpp) file:
```cpp
#include <cassert>
#include <utility>
#include <vector>

#include "include/fractal_tree/fractal_tree.h"

int main()
{
    using key_type = unsigned int;
    using data_type = unsigned int;
    using value_type = std::pair<key_type, data_type>;
    constexpr unsigned block_size = 2u << 12u; // 8 KiB blocks (2u << 12u == 8192)
    constexpr unsigned cache_size = 2u << 15u; // 64 KiB cache (2u << 15u == 65536)

    // Template arguments assumed to be <key, datum, block size, cache size>;
    // check fractal_tree.h for the exact parameter list.
    using ftree_type = stxxl::ftree<key_type, data_type, block_size, cache_size>;
    ftree_type f;

    // insert 2 MiB of data (2u << 20u bytes)
    for (key_type k = 0; k < (2u << 20u) / sizeof(value_type); k++) {
        f.insert(value_type(k, 2*k));
    }

    // find datum of given key; the result is a (datum, found) pair
    std::pair<data_type, bool> datum_and_found = f.find(1);
    assert(datum_and_found.second);
    assert(datum_and_found.first == 2);

    // find values in key range [100, 1000]
    std::vector<value_type> range_values = f.range_find(100, 1000);
    std::vector<value_type> correct_range_values {};
    for (key_type k = 100; k <= 1000; k++) {
        correct_range_values.emplace_back(k, 2*k);
    }
    assert(range_values == correct_range_values);

    return 0;
}
```
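To make the buffering behind `insert` and `find` more concrete, here is a toy, purely in-memory sketch of the idea: one buffered root above a fixed set of leaves. It is not how this repository implements the tree (there are no disk blocks, no STXXL, only two levels, and no node splits), and `ToyBufferedTree` and all of its members are hypothetical names chosen for illustration.

```cpp
#include <algorithm>
#include <cstddef>
#include <map>
#include <utility>
#include <vector>

// Toy illustration only: a root with pivot keys and a buffer, plus leaves.
class ToyBufferedTree {
public:
    // pivots must be sorted; pivots.size() + 1 leaves are created.
    ToyBufferedTree(std::vector<unsigned> pivots, std::size_t buffer_capacity)
        : pivots_(std::move(pivots)),
          leaves_(pivots_.size() + 1),
          capacity_(buffer_capacity) {}

    // Inserts touch only the root buffer until the buffer is full.
    void insert(unsigned key, unsigned datum) {
        buffer_[key] = datum;
        if (buffer_.size() >= capacity_) flush();
    }

    // Returns (datum, found). The buffer is checked first because it
    // holds the most recent value for a key.
    std::pair<unsigned, bool> find(unsigned key) const {
        auto hit = buffer_.find(key);
        if (hit != buffer_.end()) return {hit->second, true};
        const auto& leaf = leaves_[child_index(key)];
        auto it = leaf.find(key);
        if (it != leaf.end()) return {it->second, true};
        return {0u, false};
    }

    std::size_t flush_count() const { return flushes_; }

private:
    // The first pivot strictly greater than key selects the child.
    std::size_t child_index(unsigned key) const {
        auto it = std::upper_bound(pivots_.begin(), pivots_.end(), key);
        return static_cast<std::size_t>(it - pivots_.begin());
    }

    // Push every buffered pair down to the leaf responsible for its key.
    void flush() {
        for (const auto& kv : buffer_) {
            leaves_[child_index(kv.first)][kv.first] = kv.second;
        }
        buffer_.clear();
        ++flushes_;
    }

    std::vector<unsigned> pivots_;
    std::vector<std::map<unsigned, unsigned>> leaves_;
    std::map<unsigned, unsigned> buffer_;
    std::size_t capacity_;
    std::size_t flushes_ = 0;
};
```

With a buffer capacity of 4, ten inserts of distinct keys trigger only two flushes, which is the batching effect that makes insertions cheap. Lookups stay correct because `find` consults the buffer before the leaf.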
## Building & Using
1. clone the repo
2. cd into the repo
3. run `git submodule init`
4. run `git submodule update --init --recursive`

## Details
More implementation details, an introduction to external memory trees, and benchmarks can be found [in this report](report.pdf).