{"id":49989950,"url":"https://github.com/komozoi/libexcessive","last_synced_at":"2026-05-19T04:14:02.308Z","repository":{"id":347370343,"uuid":"1193709611","full_name":"komozoi/libexcessive","owner":"komozoi","description":"C++ On-Disk Datastructure library for performance and reliability, with lots of other goodies included.","archived":false,"fork":false,"pushed_at":"2026-05-19T02:23:29.000Z","size":501,"stargazers_count":0,"open_issues_count":5,"forks_count":0,"subscribers_count":0,"default_branch":"master","last_synced_at":"2026-05-19T03:53:12.007Z","etag":null,"topics":["bigint","biginteger-library","btree","btree-implementation","btree-indexes","concurrency","concurrent","cpp","lib","library","logging","mmap","parallel","persistence","persistent-memory","persistent-storage","raii","range-search"],"latest_commit_sha":null,"homepage":"","language":"C++","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"apache-2.0","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/komozoi.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":"CONTRIBUTING.md","funding":null,"license":"LICENSE","code_of_conduct":"CODE_OF_CONDUCT.md","threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":"SECURITY.md","support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null,"zenodo":null,"notice":null,"maintainers":null,"copyright":null,"agents":null,"dco":null,"cla":null}},"created_at":"2026-03-27T13:56:46.000Z","updated_at":"2026-05-19T02:19:21.000Z","dependencies_parsed_at":null,"dependency_job_id":null,"html_url":"https://github.com/komozoi/libexcessive","commit_stats":null,"previous_names":["komozoi/libexcessive"],"tags_count":4,"template":false,"template_full_name":null,"purl":"pkg:github/komozoi/libexcessive","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/komozoi%2Flibexcessive","t
ags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/komozoi%2Flibexcessive/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/komozoi%2Flibexcessive/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/komozoi%2Flibexcessive/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/komozoi","download_url":"https://codeload.github.com/komozoi/libexcessive/tar.gz/refs/heads/master","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/komozoi%2Flibexcessive/sbom","scorecard":null,"host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":286080680,"owners_count":33201543,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2026-05-18T09:27:30.708Z","status":"online","status_checked_at":"2026-05-19T02:00:06.763Z","response_time":58,"last_error":null,"robots_txt_status":"success","robots_txt_updated_at":"2025-07-24T06:49:26.215Z","robots_txt_url":"https://github.com/robots.txt","online":true,"can_crawl_api":true,"host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["bigint","biginteger-library","btree","btree-implementation","btree-indexes","concurrency","concurrent","cpp","lib","library","logging","mmap","parallel","persistence","persistent-memory","persistent-storage","raii","range-search"],"created_at":"2026-05-19T04:14:01.615Z","updated_at":"2026-05-19T04:14:02.293Z","avatar_url":"https://github.com/komozoi.png","language":"C++","funding_links":[],"categories":[],"sub_categories":[],"readme":"# LibExcessive\n\n[![Github Actions 
Status](https://github.com/komozoi/libexcessive/actions/workflows/ci.yml/badge.svg)](https://github.com/komozoi/excessive/actions)\n\n- High-performance, parallel file I/O with RAII semantics\n- High-performance on-disk data structures\n- A bunch of utility functions and classes\n\n## Try it in 5 minutes\n\nCreate a CMakeLists.txt:\n\n```cmake\ncmake_minimum_required(VERSION 3.14)\nproject(BTreeDemo)\n\n# Adds excessive to the project\ninclude(FetchContent)\n\nFetchContent_Declare(\n        excessive\n        GIT_REPOSITORY https://gitea.com/komozoi/excessive.git\n        GIT_TAG v0.3.0\n        GIT_SHALLOW TRUE\n        GIT_PROGRESS ON\n        SYSTEM\n)\n\nFetchContent_MakeAvailable(excessive)\n\n# Create the demo executable and link excessive into it\nadd_executable(demo demo.cpp)\ntarget_link_libraries(demo PRIVATE excessive)\n```\n\nAnd create a file `demo.cpp`:\n\n```c++\n#include \u003cfs/FdHandle.h\u003e\n#include \u003cfs/BTree.h\u003e\n#include \u003cfcntl.h\u003e\n#include \u003crandom\u003e\n#include \u003ccstdio\u003e\n\n\nstruct btree_entry_s {\n    int key;\n    int value;\n\n    static int compare(const btree_entry_s \u0026a, const btree_entry_s \u0026b) {\n        return a.key - b.key;\n    }\n};\n\nint main() {\n    FdHandle file = FdHandle::open(\"btree.bin\", O_RDWR | O_CREAT, 0644);\n    if (!file) {\n        printf(\"Failed to open file!\\n\");\n        return 1;\n    }\n\n    // It is easy to check if the file already existed or was just created\n    printf(\"File is %s.\\n\", file.isNew() ? 
\"newly created\" : \"existing\");\n\n    BTree\u003cbtree_entry_s\u003e tree(file, 0, btree_entry_s::compare);\n\n    // Add 100 random elements to the tree\n    std::random_device rd;\n    for (int i = 0; i \u003c 100; ++i) {\n        btree_entry_s new_entry{(int) (rd() % 5000), (int) rd()};\n        tree.insert(new_entry);\n    }\n\n    // See what the next highest values are for given inputs\n    // The output changes on each run as the BTree grows\n    for (int i = 0; i \u003c 5000; i += 500) {\n        btree_entry_s result{i, 0};\n        if (tree.findNext(result)) {\n            printf(\"Next highest entry from %i is (%i, %i)\\n\", i, result.key, result.value);\n        } else {\n            printf(\"No next highest entry found for %i\\n\", i);\n        }\n    }\n\n    // All data is already written to the file (although not necessarily flushed)\n    // For this reason, no cleanup is needed here for the BTree.\n\n    // File automatically closes when all references go out of scope,\n    // but can be closed manually with:\n    // file.close();\n\n    return 0;\n}\n```\n\nIf you are using an IDE, this may be enough to import the project and run it - very convenient!\n\nIf not, run these commands to compile and run:\n\n```bash\n# Setup\nmkdir build \u0026\u0026 cd build \u0026\u0026 cmake ..\n\n# Compile\nmake\n\n# Run the demo to create the data file\n./demo\n\n# Run the demo again to add more to the data file and see the effects\n./demo\n```\n\nAnd that's it - efficient and persistent data storage in fewer than 60 lines of code.  No extra installation steps\nor complex APIs.  It just works.\n\n## Overview\n\nLibExcessive is intended for large, data-heavy backend\napplications such as servers and data processing tools where speed and reliability matter. 
The\ndesign goals and features are:\n\n* Provide familiar, Java-like APIs and richer helper types, especially containers\n  * With less verbosity than Java APIs, thankfully.\n* Provide rich, threadsafe, and extremely efficient utilities for interacting with files\n  * Threadsafe file handles and transactions\n  * Mmap handles\n  * Open file reference counting\n  * Utilities for keeping data on-disk\n    * BTree for indexing various sortable datatypes\n    * DiskBytestringSearchTree for handling sorted bytestrings and anything they can encode\n    * Files with dynamically allocated regions\n* ThreadPool for efficient parallel task execution\n* (planned) Utilities for building on-disk indexes and databases\n* Favor explicit memory and performance control. Many components are designed to be friendly to\n  custom allocators and memory pools.\n  * Tracking memory separately for different components of an application, which helps\n    to find memory hogs\n  * Safer allocation and memory management with less heap fragmentation\n* Small, focused algorithms and helpers for string handling, byte buffers, serialization, and more\n* Support modern C++ compilers from C++17 and up.\n\nMuch of the code was originally written to run on the Teensy 4.1, which is extremely memory\nconstrained compared to our desktop computers, having a mere 1MiB of RAM.  For this reason,\nthere is a lot of code for tightly controlling memory usage and performance.\n\nEverything is tested on Debian Linux currently, although\nany Unix flavor should work.  
I do not plan on supporting Windows.\n\nThe main repository is on Gitea at https://gitea.com/komozoi/excessive, and it is mirrored to GitHub\nat https://github.com/komozoi/libexcessive.\n\n## Building\n\nI develop this with CLion, which imports the CMake project for me, but if you prefer to build on\nthe terminal, it's not hard:\n\n```bash\ngit clone https://gitea.com/komozoi/excessive.git\ncd excessive\nmkdir build\ncd build\ncmake -DCMAKE_BUILD_TYPE=Release ..\nmake\n```\n\n## Usage Examples\n\nTests are included for almost everything, which you can use as a larger reference if needed.\n\n### File Access with Mmap\n\n```cpp\n#include \"fs/FdHandle.h\"\n#include \u003cfcntl.h\u003e\n\n\nint main() {\n    const char* temp_filename = \"mmap_test.tmp\";\n\n    // All file handles are smart pointers!\n    FdHandle write_handle = FdHandle::open(temp_filename, O_RDWR | O_CREAT, 0660);\n    MmapHandle write_mmap = write_handle.getMmapHandle(0, sizeof(my_struct_t));\n\n    // Writing structs with mmap is easy and safe\n    my_struct_t value{1, true, {}};\n    write_mmap.write(value);\n\n    // File automatically closes when all references go out of scope,\n    // but can be closed manually with:\n    // write_handle.close();\n}\n```\n\n### On-Disk Search Tree with Bytestring Keys\n\nFor on-disk indexing by variable-length keys (like strings), `DiskBytestringSearchTree` provides an efficient O(log(n))\nlookup in the average case.\n\n```cpp\n#include \"fs/DiskBytestringSearchTree.h\"\n#include \"fs/FreeSpaceFile.h\"\n#include \u003cfcntl.h\u003e\n\n\nint main() {\n    FdHandle file = FdHandle::open(\"search_tree.bin\", O_RDWR | O_CREAT, 0644);\n    FreeSpaceFile fss(file);\n\n    uint64_t rootOffset;\n    if (file.isNew()) {\n        rootOffset = DiskBytestringSearchTree::initialize(fss);\n    } else {\n        // Root offset should be stored and retrieved from a known location.\n        // For this example, we assume it's right after the FreeSpaceFile header.\n        rootOffset = fss.getHeaderEnd();\n    
}\n\n    DiskBytestringSearchTree tree(fss, rootOffset);\n\n    // Insert keys\n    tree.insert(Bytestring(\"user_123\"), 0xDEADBEEF);\n\n    // Find keys\n    uint64_t value = tree.find(Bytestring(\"user_123\"));\n    if (value != 0) {\n        printf(\"Found value: %llx\\n\", (unsigned long long)value);\n    }\n\n    // FdHandle closes on its own\n\n    return 0;\n}\n```\n\n### Bigint\n\nUnlike in other libraries, the bigint implementation uses a fixed width.  The bigint code was originally designed for use\nin an EVM implementation, where most values are 256-bit.  This is designed to work like a typical register or even\nnormal fixed-width datatype, just bigger, and acts like you would expect with truncation and such.\n\nThere are plenty of great libraries that implement variable-width bigint; there is no reason to add that to this\nlibrary.  Originally I wasn't going to include my own bigint implementation as I figured existing libraries were\nsufficient, but I changed my mind when I saw that they were all variable-length, which does not work for my\ntypical applications.\n\nExamples:\n\n```c++\n#include \"bigint.h\"\n\nint main() {\n    uint256_t a = uint256_t(\"0xFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFF\");\n    uint256_t b(\"0x1234567890ABCDEF1234567890ABCDEF\");\n\n    uint256_t sum  = a + b;\n    uint256_t prod = a * b;\n    uint256_t pow  = b.pow(4);\n\n    printf(\"a     = %s\\n\", a.toHexString().c_str());\n    printf(\"b     = %s\\n\", b.toHexString().c_str());\n    printf(\"a + b = %s\\n\", sum.toHexString().c_str());\n    printf(\"a * b = %s\\n\", prod.toHexString().c_str());\n    printf(\"b**4  = %s\\n\", pow.toHexString().c_str());\n\n    // Output:\n    // a     = ffffffffffffffffffffffffffffffff\n    // b     = 1234567890abcdef1234567890abcdef\n    // a + b = 0000000000000000000000001234567890abcdef1234567890abcdefffffffff\n    // a * b = 1234567890abcdef1234567890abcdeeffffffffffffffedcba98765432111\n    // b**4  = 
33ee0e405772f4bd1fa6d7a4e8c14117ea371272c23e2b10\n\n    // Default sizes include uint128_t, uint192_t, and uint256_t.  Custom sizes are also possible.\n    uint192_t threeWordValue = \"0x6935282358963433459348abcdef1ee7\";\n    UnsignedFixedWidthBigInt\u003c7\u003e sevenWordValue = \"0x8b20159b1c579b1088048f054bedebfd02de6b23919371be36d872ec46fe9cebe684edd2675ab1101262b78877b3c09966366c07df0fcccf\";\n\n    // It is also possible to multiply directly with doubles\n    // The result of the multiplication is floored and almost exact\n    uint192_t productWithDouble = threeWordValue * 0.7311;\n    // productWithDouble would be something like 0x4cead4b7be06ccb79c0e92711d09028c\n\n    uint256_t x3(\"0x783924abc37678847777fcba\");\n    x3 = x3.root(3);  // 0xc6fc718e\n}\n```\n\n### Smart Pointer with Copy on Write Behavior\n\n```c++\n#include \u003calloc/pointer.h\u003e\n\nstruct Data {\n    int value;\n};\n\nint main() {\n    // UNIQUE (default-style ownership)\n    sp\u003cData\u003e x(SpPointerType::UNIQUE, Data{10});\n\n    // Copying a UNIQUE pointer does NOT immediately copy the data\n    sp\u003cData\u003e y = x;\n    // x stays UNIQUE\n    // y becomes COPY_ON_WRITE\n    // both point to the same underlying object (for now)\n\n    // First write triggers a deep copy\n    y.mut().value = 20;\n    // now:\n    // x-\u003evalue == 10\n    // y-\u003evalue == 20\n    // they no longer share memory\n\n\n    // You can keep copying before mutation is needed\n    sp\u003cData\u003e z = x;\n    // still sharing with x\n\n    z.mut().value = 30;\n    // z detaches and becomes independent\n    // x is still unchanged\n\n\n    // SHARED mode = always shared, no copy-on-write\n    sp\u003cData\u003e sharedA(SpPointerType::SHARED, Data{100});\n    sp\u003cData\u003e sharedB = sharedA;\n\n    sharedB.mut().value = 200;\n    // both see the change:\n    // sharedA-\u003evalue == 200\n    // sharedB-\u003evalue == 200\n\n\n    // Move = transfer ownership, no copies\n    
sp\u003cData\u003e moved = std::move(sharedA);\n    // sharedA is now null\n    // moved owns the data\n\n\n    // Scoped lifetime (RAII)\n    {\n        sp\u003cData\u003e temp(SpPointerType::UNIQUE, Data{5});\n        sp\u003cData\u003e alias = temp;\n        // alias is COPY_ON_WRITE\n\n        // object is destroyed exactly once when both go out of scope\n    }\n\n    // Polymorphic support (New in v0.3.0)\n    // Seamlessly convert from sp\u003cDerived\u003e to sp\u003cBase\u003e\n    // sp\u003cDerived\u003e derived(SpPointerType::UNIQUE, Derived{});\n    // sp\u003cBase\u003e base = derived;\n\n    return 0;\n}\n```\n\n### Thread Pool\n\nEfficient parallel task execution using a pool of worker threads.\n\n```cpp\n#include \"parallel/ThreadPool.h\"\n#include \u003ccstdio\u003e\n\n\nint main() {\n    // Create a pool with 8 worker threads\n    ThreadPool pool(8);\n\n    // Submit a lambda\n    pool.submit([]() {\n        printf(\"Parallel task running\\n\");\n    });\n\n    // Submit a function with arguments\n    pool.submit([](int x, int y) {\n        printf(\"Result: %d\\n\", x + y);\n    }, 10, 20);\n\n    // Graceful shutdown\n    pool.shutdown();\n    return 0;\n}\n```\n\n### Simple Containers\n\nSimple examples showing the style of usage that matches the library design:\n\n```cpp\n#include \"ds/ArrayList.h\"\n\n\nint main() {\n    ArrayList\u003cint\u003e list;\n    list.add(1);\n    list.add(2);\n    list.addCopies(5, 3); // add three copies of 5\n\n    for (int element: list)\n        printf(\"value %d\\n\", element);\n\n    return 0;\n}\n```\n\n```cpp\n#include \"ds/ArrayList.h\"\n\n\nint main() {\n    // Initialize list as {2, 3, 4}\n    ArrayList\u003cint\u003e list{2,3,4};\n\n    // Add 1 to the beginning\n    list.addFirst(1);\n\n    // 1, 2, 3, 4\n    for (int element: list)\n        printf(\"value %d\\n\", 
element);\n}\n```\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fkomozoi%2Flibexcessive","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fkomozoi%2Flibexcessive","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fkomozoi%2Flibexcessive/lists"}