{"id":13424476,"url":"https://github.com/frozenca/BTree","last_synced_at":"2025-03-15T18:35:06.552Z","repository":{"id":48183429,"uuid":"516673453","full_name":"frozenca/BTree","owner":"frozenca","description":"A general-purpose high-performance lightweight STL-like modern C++ B-Tree","archived":false,"fork":false,"pushed_at":"2024-09-04T00:00:45.000Z","size":118,"stargazers_count":226,"open_issues_count":2,"forks_count":19,"subscribers_count":2,"default_branch":"main","last_synced_at":"2024-10-26T23:55:16.950Z","etag":null,"topics":[],"latest_commit_sha":null,"homepage":"","language":"C++","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"apache-2.0","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/frozenca.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE.md","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null}},"created_at":"2022-07-22T08:35:59.000Z","updated_at":"2024-10-24T15:12:03.000Z","dependencies_parsed_at":"2023-11-13T02:31:01.332Z","dependency_job_id":"51c2fb2a-5ad0-4d0f-8301-5b5bc4362d2b","html_url":"https://github.com/frozenca/BTree","commit_stats":null,"previous_names":[],"tags_count":0,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/frozenca%2FBTree","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/frozenca%2FBTree/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/frozenca%2FBTree/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/frozenca%2FBTree/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/frozenca","download_url":"https://codeload.github.com/frozenc
a/BTree/tar.gz/refs/heads/main","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":243775850,"owners_count":20346276,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":[],"created_at":"2024-07-31T00:00:54.887Z","updated_at":"2025-03-15T18:35:01.542Z","avatar_url":"https://github.com/frozenca.png","language":"C++","readme":"# B-Tree\n\nThis library implements a general-purpose header-only STL-like B-Tree in C++, including supports for using it for memory-mapped disk files and fixed-size allocators.\n\nA B-Tree is a self-balancing tree data structure that maintains sorted data and allows searches, sequential access, insertions, and deletions in logarithmic time. 
Unlike self-balancing binary search trees, the [B-tree](https://en.wikipedia.org/wiki/B-tree) is well suited for storage systems that read and write relatively large blocks of data, such as databases and file systems.\n\nJust like the ordered associative containers in the C++ standard library, key-value pairs are supported and duplicates can be allowed.\n\nThere are four specialized B-Tree classes: ```frozenca::BTreeSet```, ```frozenca::BTreeMultiSet```, ```frozenca::BTreeMap``` and ```frozenca::BTreeMultiMap```, which correspond to ```std::set```, ```std::multiset```, ```std::map``` and ```std::multimap``` respectively.\n\n## How to use\n\nThis library is header-only, so no additional setup process is required beyond including the headers.\n\nOr\n\nFor CMake projects:\n\nInstall one of the packages BTree..rpm or BTree..deb, or include this project in yours, and then\n\n ```cmake\n find_package(BTree)\n #...\n target_link_libraries(${your_target} PRIVATE BTree::BTree)\n ```\n\n\n## Target OS/Compiler version\n\nThis library aggressively uses C++20 features, and has been verified to work with gcc 11.2 and MSVC 19.32.\n\nPOSIX and Windows operating systems are supported in order to use the memory-mapped disk file interface.\n\nThere are currently no plans to support C++17 and earlier.\n\n## Example usages\n\nUsage is very similar to the C++ standard library ordered associative containers (i.e. 
```std::set``` and its friends).\n\n```cpp\n#include \"fc/btree.h\"\n#include \u003ciostream\u003e\n#include \u003cstring\u003e\n\nint main() {\n  namespace fc = frozenca;\n  fc::BTreeSet\u003cint\u003e btree;\n  \n  btree.insert(3);\n  btree.insert(4);\n  btree.insert(2);\n  btree.insert(1);\n  btree.insert(5);\n  \n  // 1 2 3 4 5\n  for (auto num : btree) {\n    std::cout \u003c\u003c num \u003c\u003c ' ';\n  }\n  std::cout \u003c\u003c '\\n';\n  \n  fc::BTreeMap\u003cstd::string, int\u003e strtree;\n\n  strtree[\"asd\"] = 3;\n  strtree[\"a\"] = 6;\n  strtree[\"bbb\"] = 9;\n  strtree[\"asdf\"] = 8;\n  \n  for (const auto \u0026[k, v] : strtree) {\n    std::cout \u003c\u003c k \u003c\u003c ' ' \u003c\u003c v \u003c\u003c '\\n';\n  }\n\n  strtree[\"asdf\"] = 333;\n  \n  // 333\n  std::cout \u003c\u003c strtree[\"asdf\"] \u003c\u003c '\\n';\n\n  strtree.emplace(\"asdfgh\", 200);\n  for (const auto \u0026[k, v] : strtree) {\n    std::cout \u003c\u003c k \u003c\u003c ' ' \u003c\u003c v \u003c\u003c '\\n';\n  }\n}\n```\n\nYou can find more usage examples in ```test/unittest.cpp```.\n\nUsers can specify a fanout parameter for the B-Tree; the default is 64.\n\n```cpp\n  // btree with fanout 128\n  fc::BTreeSet\u003cint, 128\u003e btree;\n```\n\nThe smallest possible value for fanout is 2, where a B-Tree boils down to a [2-3-4 tree](https://en.wikipedia.org/wiki/2%E2%80%933%E2%80%934_tree).\n\n## Supported operations\n\nIn addition to the regular operations supported by ```std::set``` and its friends (```lower_bound()```, ```upper_bound()```, ```equal_range()```, etc.), the following operations are supported.\n\n```tree.count(const key_type\u0026 key)``` : Returns the number of elements in the tree whose key is equivalent to ```key```. Time complexity: ```O(log n)```\n\n```tree.kth(std::ptrdiff_t k)``` : Returns the k-th element in the tree, using a 0-based index. 
Time complexity: ```O(log n)```\n\n```tree.order(const_iterator_type iter)``` : Returns the 0-based rank in the tree of the element that the iterator points to. Time complexity: ```O(log n)```\n\n```tree.enumerate(const key_type\u0026 a, const key_type\u0026 b)``` : Range query. Returns the range of values whose keys lie in ```[a, b]```. Time complexity: ```O(log n)```\n\n```tree.insert_range(ForwardIter first, ForwardIter last)``` : Inserts the elements in ```[first, last)```. The range version also exists. Time complexity: ```O(k log k + log n)``` if all elements in the range fit between two adjacent elements in the tree, otherwise ```O(k log n)```\n\n```tree.erase_range(const key_type\u0026 a, const key_type\u0026 b)``` : Erases the elements whose keys lie in ```[a, b]```. Time complexity: ```O(log n) + O(k)``` (NOT ```O(k log n)```)\n\n```frozenca::join(Tree\u0026\u0026 tree1, Tree\u0026\u0026 tree2)``` : Joins two trees into a single tree. The largest key in ```tree1``` should be less than or equal to the smallest key in ```tree2```. Time complexity: ```O(log n)```\n\n```frozenca::join(Tree\u0026\u0026 tree1, value_type val, Tree\u0026\u0026 tree2)``` : Joins two trees into a single tree. The largest key in ```tree1``` should be less than or equal to the key of ```val``` and the smallest key in ```tree2``` should be greater than or equal to the key of ```val```. Time complexity: ```O(1 + diff_height)```\n\n```frozenca::split(Tree\u0026\u0026 tree, key_type key)``` : Splits a tree into two trees, so that the first tree contains keys less than ```key```, and the second tree contains keys greater than ```key```. Time complexity: ```O(log n)```\n\n```frozenca::split(Tree\u0026\u0026 tree, key_type key1, key_type key2)``` : Splits a tree into two trees, so that the first tree contains keys less than ```key1```, and the second tree contains keys greater than ```key2```. ```key2``` must be greater than or equal to ```key1```. 
Time complexity: ```O(log n) + O(k)```\n\n## Iterators\nSTL-compatible iterators (both ```const``` and non-```const```) are fully supported. However, unlike ```std::set``` and its friends, all insert and erase operations can invalidate iterators. This is because ```std::set``` and its friends are node-based containers where a single node can only have a single key, but a node in a B-Tree can have multiple keys.\n\n## Concurrency\n\nCurrently, thread safety is not guaranteed. Lock-free support is the first TODO, and contributions are welcome if you're interested.\n\n## Linear search vs Binary search\n\nThe core operation of a B-Tree is a search in the sorted key array of each node. For small arrays with primitive key types that have relatively cheap comparisons, linear search is often better than binary search. The threshold at which binary search becomes faster can vary widely between compilers.\n\nIf you use Clang, I recommend that you set this variable to 1. For gcc users, it seems better not to change the variable (this may change with future gcc optimizations)\nhttps://github.com/frozenca/BTree/blob/7083e8034b5905552cc6a3b8277452c56c05d587/fc_btree.h#L22\n\n## SIMD Operation\n\nWhen keys are signed integers or floating point types, if your machine supports AVX-512, you can activate SIMD intrinsics to speed up B-Tree operations, by setting this variable to 1:\nhttps://github.com/frozenca/BTree/blob/3498a53e75e916015561008cf91fecc3f7df69d1/fc_btree.h#L4\n(Inspired by: [Static B-Trees](https://en.algorithmica.org/hpc/data-structures/s-tree/))\n\n## Disk B-Tree\n\nYou can use a specialized variant that utilizes memory-mapped disk files and an associated fixed-size allocator. 
You have to include ```fc_disk_btree.h```, ```fc_disk_fixed_alloc.h``` and ```fc_mmfile.h``` to use it.\n\nFor this variant, supported types have stricter constraints: each type must satisfy ```std::is_trivially_copyable_v```, and its alignment must be at least the alignment of the pointer type on the machine (for both the key type and the value type for key-value pairs).\n\nThe following code initializes a ```frozenca::DiskBTreeSet```, which generates a memory-mapped disk file ```database.bin``` and uses it, with an initial byte size of 32 megabytes. If the third argument is ```true```, it will destroy the existing file and create a new one (default is ```false```). You can't extend the pool size of the memory-mapped disk file once it has been initialized (doing so invalidates all pointers in the associated allocator).\n\n```cpp\nfc::DiskBTreeSet\u003cstd::int64_t, 128\u003e btree(\"database.bin\", 1UL \u003c\u003c 25UL, true);\n```\n\n## Serialization and deserialization\n\nSerialization/deserialization of B-Trees via byte streams using ```operator\u003c\u003c``` and ```operator\u003e\u003e``` is also supported when key types (and value types, if present) meet the above requirements for disk B-Tree. 
You can refer how to do serialization/deserialization in ```test/rwtest.cpp```.\n\n## Performance\n\nUsing a performance test code similar with ```test/perftest.cpp```, that inserts/retrieves/erases 1 million ```std::int64_t``` in random order, I see the following results in my machine (gcc 11.2, -O3, 200 times repeated per each target), compared to ```std::set``` and Google's B-Tree implementation(https://code.google.com/archive/p/cpp-btree/):\n\n```\nBalanced tree test\nWarming up complete...\nfrozenca::BTreeSet test (fanout 64 - default, SIMD)\nTime to insert 1000000 elements: Average : 175.547ms, Stdev   : 8.65575ms, 95%     : 189.553ms,\nTime to lookup 1000000 elements: Average : 197.75ms, Stdev   : 7.75456ms, 95%     : 208.783ms,\nTime to erase 1000000 elements: Average : 211.274ms, Stdev   : 10.3499ms, 95%     : 225.221ms,\n\nfrozenca::BTreeSet test (fanout 96, SIMD)\nTime to insert 1000000 elements: Average : 176.432ms, Stdev   : 9.12931ms, 95%     : 192.688ms,\nTime to lookup 1000000 elements: Average : 194.997ms, Stdev   : 11.3563ms, 95%     : 205.048ms,\nTime to erase 1000000 elements: Average : 212.86ms, Stdev   : 11.3598ms, 95%     : 228.145ms,\n\nfrozenca::DiskBTreeSet test (fanout 128, SIMD)\nTime to insert 1000000 elements: Average : 187.797ms, Stdev   : 8.69872ms, 95%     : 202.318ms,\nTime to lookup 1000000 elements: Average : 200.799ms, Stdev   : 7.10905ms, 95%     : 211.436ms,\nTime to erase 1000000 elements: Average : 216.105ms, Stdev   : 6.83771ms, 95%     : 228.9ms,\n\nfrozenca::BTreeSet test (fanout 128, SIMD)\nTime to insert 1000000 elements: Average : 189.536ms, Stdev   : 15.3073ms, 95%     : 221.393ms,\nTime to lookup 1000000 elements: Average : 204.741ms, Stdev   : 17.8811ms, 95%     : 232.494ms,\nTime to erase 1000000 elements: Average : 219.17ms, Stdev   : 20.6449ms, 95%     : 244.232ms,\n\nfrozenca::BTreeSet test (fanout 64, uint64, don't use SIMD)\nTime to insert 1000000 elements: Average : 204.187ms, Stdev   : 57.3915ms, 95%     : 
222.939ms,\nTime to lookup 1000000 elements: Average : 221.049ms, Stdev   : 25.3429ms, 95%     : 245.708ms,\nTime to erase 1000000 elements: Average : 249.832ms, Stdev   : 52.1106ms, 95%     : 288.095ms,\n\nstd::set test\nTime to insert 1000000 elements: Average : 907.104ms, Stdev   : 43.7566ms, 95%     : 966.12ms,\nTime to lookup 1000000 elements: Average : 961.859ms, Stdev   : 30.1132ms, 95%     : 1019.59ms,\nTime to erase 1000000 elements: Average : 990.027ms, Stdev   : 37.1807ms, 95%     : 1049.58ms,\n\nGoogle btree::btree_set test (fanout 64)\nTime to insert 1000000 elements: Average : 425.071ms, Stdev   : 13.117ms, 95%     : 434.819ms,\nTime to lookup 1000000 elements: Average : 377.009ms, Stdev   : 15.2407ms, 95%     : 385.736ms,\nTime to erase 1000000 elements: Average : 421.514ms, Stdev   : 17.3882ms, 95%     : 432.955ms,\n\nGoogle btree::btree_set test (fanout 256 - default value)\nTime to insert 1000000 elements: Average : 251.597ms, Stdev   : 14.3492ms, 95%     : 289.579ms,\nTime to lookup 1000000 elements: Average : 235.204ms, Stdev   : 11.8999ms, 95%     : 255.495ms,\nTime to erase 1000000 elements: Average : 250.782ms, Stdev   : 12.1752ms, 95%     : 270.575ms,\n```\n\nFor 1 million ```std::string```s with length 1~50, I see the following results in my machine:\n```\nfrozenca::BTreeSet test (fanout 64 - default, std::string)\nTime to insert 1000000 elements: Average : 1519.62ms, Stdev   : 81.3793ms, 95%     : 1685.13ms,\nTime to lookup 1000000 elements: Average : 1188.33ms, Stdev   : 83.8154ms, 95%     : 1392.47ms,\nTime to erase 1000000 elements: Average : 1570.44ms, Stdev   : 93.771ms, 95%     : 1747.73ms,\n\nfrozenca::BTreeSet test (fanout 128, std::string)\nTime to insert 1000000 elements: Average : 1774.12ms, Stdev   : 41.601ms, 95%     : 1812.62ms,\nTime to lookup 1000000 elements: Average : 1089.02ms, Stdev   : 22.8206ms, 95%     : 1127.83ms,\nTime to erase 1000000 elements: Average : 1670.09ms, Stdev   : 24.2791ms, 95%     : 
1711.33ms,\n\nstd::set test (std::string)\nTime to insert 1000000 elements: Average : 1662.92ms, Stdev   : 178.644ms, 95%     : 1861.37ms,\nTime to lookup 1000000 elements: Average : 1666.16ms, Stdev   : 127.095ms, 95%     : 1845.49ms,\nTime to erase 1000000 elements: Average : 1639.79ms, Stdev   : 82.7256ms, 95%     : 1770.9ms,\n```\n\n\n## Sanity check and unit test\n\nIf you want to contribute and test the code, use the macro _CONTROL_IN_TEST, which performs full sanity checks on the entire tree:\n\nhttps://github.com/frozenca/BTree/blob/adf3c3309f45a65010d767df674c232c12f5c00a/fc_btree.h#L350\nhttps://github.com/frozenca/BTree/blob/adf3c3309f45a65010d767df674c232c12f5c00a/fc_btree.h#L531-#L532\n\nBy running ```test/unittest.cpp``` you can verify basic operations.\n\n\n## License\n\nThis library is licensed under either of Apache License Version 2.0 with LLVM Exceptions (LICENSE-Apache2-LLVM or https://llvm.org/foundation/relicensing/LICENSE.txt) or Boost Software License Version 1.0 (LICENSE-Boost or https://www.boost.org/LICENSE_1_0.txt).\n","funding_links":[],"categories":["C++","Containers and Algorithms","라이브러리"],"sub_categories":[],"project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Ffrozenca%2FBTree","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Ffrozenca%2FBTree","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Ffrozenca%2FBTree/lists"}