{"id":18717363,"url":"https://github.com/rekola/radix-cpp","last_synced_at":"2025-08-14T21:21:08.401Z","repository":{"id":220002327,"uuid":"742627793","full_name":"rekola/radix-cpp","owner":"rekola","description":"Radix set/map implementation","archived":false,"fork":false,"pushed_at":"2024-03-15T12:42:15.000Z","size":122,"stargazers_count":16,"open_issues_count":7,"forks_count":0,"subscribers_count":2,"default_branch":"main","last_synced_at":"2025-04-12T13:51:48.033Z","etag":null,"topics":["c-plus-plus","cplusplus","data-structures","map","radix-sort","set"],"latest_commit_sha":null,"homepage":"","language":"C++","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"mit","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/rekola.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null,"zenodo":null}},"created_at":"2024-01-12T22:47:17.000Z","updated_at":"2024-10-27T16:23:35.000Z","dependencies_parsed_at":"2024-03-15T13:54:32.976Z","dependency_job_id":"4a3031e4-5391-4123-9659-f5f5c5ebc05c","html_url":"https://github.com/rekola/radix-cpp","commit_stats":null,"previous_names":["rekola/radix-cpp"],"tags_count":0,"template":false,"template_full_name":null,"purl":"pkg:github/rekola/radix-cpp","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/rekola%2Fradix-cpp","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/rekola%2Fradix-cpp/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/rekola%2Fradix-cpp/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/rekola%2Fradix-cpp/manifests","owner_url":"https://repos.ecosyste.
ms/api/v1/hosts/GitHub/owners/rekola","download_url":"https://codeload.github.com/rekola/radix-cpp/tar.gz/refs/heads/main","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/rekola%2Fradix-cpp/sbom","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":265167543,"owners_count":23721462,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["c-plus-plus","cplusplus","data-structures","map","radix-sort","set"],"created_at":"2024-11-07T13:15:54.274Z","updated_at":"2025-07-13T16:06:25.805Z","avatar_url":"https://github.com/rekola.png","language":"C++","funding_links":[],"categories":[],"sub_categories":[],"readme":"# radix-cpp\n\n[![GitHub License](https://img.shields.io/github/license/rekola/radix-cpp?logo=github\u0026logoColor=lightgrey\u0026color=yellow)](https://github.com/rekola/radix-cpp/blob/main/LICENSE)\n[![CI](https://github.com/rekola/radix-cpp/workflows/Ubuntu-CI/badge.svg)]()\n[![VS17-CI](https://github.com/rekola/radix-cpp/workflows/VS17-CI/badge.svg)]()\n[![PRs Welcome](https://img.shields.io/badge/PRs-welcome-brightgreen.svg?style=flat-square)](http://makeapullrequest.com)\n\n## Radix set and map implementation for C++\n\nradix-cpp is an experimental flat implementation of ordered set and\nmap. It uses a hash table with open addressing to implement a form of\nradix sort combined with prefix trees, thus providing Θ(1) search time\nas is usual for hash tables, but also quick in order traversal of the\nkeys. 
Ordinarily, hash tables are not ordered, and while in theory an\norder-preserving hash function could be used, it would lead to a large\nnumber of collisions. In this implementation the key is divided into\nmultiple 8-bit digits, and each digit is inserted separately into the\nhash table. The prefix key is also stored along with each digit to\nconstruct the prefix tree. According to benchmarks (given uint32_t\nkeys) the radix-cpp set construction is much faster than that of\nstd::set, and also faster than sorting the keys using std::sort. Both\nstd::sort and std::set use comparison sorting, so they have a time\ncomplexity of O(n log n).\n\nIterators are automatically repaired if the underlying table changes,\nso they are stable.\n\nCurrently only integers, floats, doubles and strings are supported as\nkeys, but more support is forthcoming.\n\n### Time Complexity\n\n| Operation | Average | Worst Case |\n| - | - | - |\n| Search | Θ(1) | O(n) |\n| Insert | Θ(w) | O(w*n) |\n| Delete | Θ(w) | O(w*n) |\n| upper_bound() | ? | ? |\n\n* w is the key length in bytes\n\nIterating over nodes in order can be somewhat expensive if the next\nnode has a different prefix. Also, it's unclear what the time complexity\nof the iteration operation is.\n\n### Benchmarks\n\nIn these initial benchmarks, radix_cpp::set has been compared with\nstd::set using uint32_t as the key type. For comparison, a test with\nstd::sort and a std::vector is also done, and they have a similar speed\nto std::flat_set. The test consists of constructing a set out of a\nshuffled array of N consecutive integers and doing an ordered\niteration over the entire set. 
Search speed comparison has been\nintentionally left out, since it would not be very useful given that\nradix-cpp has an average complexity of Θ(1).\n\n![Ordered Set Construction Time](https://github.com/rekola/radix-cpp/assets/6755525/ec1adb25-52dc-407c-86c5-af1b2d97eca9 \"Ordered Set Construction Time\")\n![Ordered Set Iteration Time](https://github.com/rekola/radix-cpp/assets/6755525/fe83baa4-7b15-4642-8f5e-1efed45f17a7 \"Ordered Set Iteration Time\")\n\n## Implementation\n\nradix-cpp uses Murmur3 as the hash function. The keys can be of\narbitrary size.\n\n### Node\n\nA Node contains the following data:\n\n| Datum | Description |\n| - | - |\n| payload | pointer to the key or the key/value pair, or 1 for tombstones |\n| prefix key | The prefix key of the node |\n| ordinal | The ordinal of the node (0-255) |\n| depth | The least significant byte of the depth of the node in the prefix tree (0 = empty key) |\n| value count | The number of entries stored in the tree under this node |\n\n### Inserting\n\nWhen inserting a key, each 8-bit digit is inserted along with its\nprefix. A prefix tree is thus created inside the hash table.\n\n### Search\n\nWhen searching for a known key, only the Node for the last digit needs\nto be found. The input key is split into a prefix of n-1 bytes and a\n1-byte ordinal, where n is the size of the key. If a Node with the\nprefix and ordinal is found, it is returned.\n\n### Deletion\n\nDeletion works by using tombstones.\n\n### Iteration\n\nAn iterator has four variables: the depth (in the prefix tree), the\nunordered prefix, the 8-bit ordinal value, and the offset. While the\nordinal is smaller than 255, we know that there are still nodes\navailable in the ordered range, and when advancing to the next stored\nvalue, we can check them all in order. When the range runs out, we\nfall back to the previous digit and advance that one. If the new node\nis not a final node, we go upwards in the tree and find the smallest\nfinal node. 
The offset is used for probing in case of collisions.\n\n### Limitations and Future Plans\n\n- Maximum number of elements on a 64-bit system is 2^56\n- NaNs are sorted as if they were larger than any other value\n- How to sort std::any?\n- Unordered iteration is needed (e.g. for set union and intersection)\n\n## Extending types\n\nTo implement set and map for a custom type, the following free functions must be defined:\n\n| Function | Description |\n| - | - |\n| key_type append(key_type key, size_t digit) | Returns a new key with digit appended as the new least significant digit |\n| std::pair\u003ckey_type, size_t\u003e deconstruct(key_type key) | Returns a pair with the numeric value of the least significant digit of the key and the key with the least significant digit removed |\n| size_t keysize(key_type key) | Returns the number of digits in the key |\n\nAdditionally, there must exist a specialization of std::hash for\nkey_type. Signed integers and floating point numbers are not naturally\nascending, and in such cases the initial deconstruct also converts the\ndata to an ordered binary type.\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Frekola%2Fradix-cpp","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Frekola%2Fradix-cpp","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Frekola%2Fradix-cpp/lists"}