{"id":23508723,"url":"https://github.com/lovasko/m_trie","last_synced_at":"2025-05-13T15:34:42.793Z","repository":{"id":32484227,"uuid":"36064541","full_name":"lovasko/m_trie","owner":"lovasko","description":"Trie/Prefix Tree Implementation for C99","archived":false,"fork":false,"pushed_at":"2019-10-20T11:03:57.000Z","size":271,"stargazers_count":1,"open_issues_count":1,"forks_count":0,"subscribers_count":2,"default_branch":"master","last_synced_at":"2025-02-16T19:48:24.013Z","etag":null,"topics":[],"latest_commit_sha":null,"homepage":"","language":"C","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"other","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/lovasko.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null}},"created_at":"2015-05-22T09:40:14.000Z","updated_at":"2019-10-21T09:30:47.000Z","dependencies_parsed_at":"2022-08-24T20:30:09.634Z","dependency_job_id":null,"html_url":"https://github.com/lovasko/m_trie","commit_stats":null,"previous_names":[],"tags_count":0,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/lovasko%2Fm_trie","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/lovasko%2Fm_trie/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/lovasko%2Fm_trie/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/lovasko%2Fm_trie/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/lovasko","download_url":"https://codeload.github.com/lovasko/m_trie/tar.gz/refs/heads/master","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":253970424,"owners_count":21992492,"icon_url":"https://
github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":[],"created_at":"2024-12-25T11:31:45.211Z","updated_at":"2025-05-13T15:34:42.652Z","avatar_url":"https://github.com/lovasko.png","language":"C","funding_links":[],"categories":[],"sub_categories":[],"readme":"# m_trie\n[![Build Status](https://travis-ci.org/lovasko/m_trie.svg?branch=master)](https://travis-ci.org/lovasko/m_trie)\n\n## Introduction\nThe `m_trie` library is a general-purpose implementation of the prefix\ntree data structure for the C99 language. The data structure stores\nkey/value pairs, where *key* is an array of bytes and *value* is an\narbitrary pointer to the associated data.\n\n## Example\nThe following example inserts all command-line arguments into the\ntrie and lets the user search for their presence:\n```c\n#include \u003cstdlib.h\u003e\n#include \u003cstdio.h\u003e\n#include \u003cstring.h\u003e\n#include \u003cm_trie.h\u003e\n\nint\nmain(int argc, char* argv[])\n{\n  m_trie tr;\n  char inp[256];\n  int i;\n  int ret;\n\n  m_trie_init(\u0026tr, m_trie_hash_alphabet, 0);\n  for (i = 1; i \u003c argc; i++)\n    m_trie_insert(\u0026tr, (uint8_t*)argv[i], (uint32_t)strlen(argv[i]), NULL);\n\n  while (1) {\n    /* Bound the read to the buffer size and stop on end of input. */\n    if (scanf(\"%255s\", inp) != 1)\n      break;\n    if (strcmp(inp, \"!\") == 0)\n      break;\n\n    ret = m_trie_search(\u0026tr, (uint8_t*)inp, (uint32_t)strlen(inp), NULL);\n    printf(\"%s\\n\", ret == M_TRIE_OK ? \"yes\" : \"no\");\n  }\n\n  /* Release all resources held by the trie. */\n  m_trie_free(\u0026tr);\n\n  return EXIT_SUCCESS;\n}\n```\n\nCompile with:\n`cc -Wall -Wextra -Werror -std=c99 example.c -lmtrie`\n\nPlease note that this example lacks error-checking for the sake of\nbrevity. 
All production code _should_ check the return values of all\n`m_trie` functions.\n\nMore examples can be found in the `examples/` folder.\n\n## API\n### General\nBefore calling any other function, the `m_trie` object needs to be\ninitialised via the `m_trie_init` function. This function does not\nallocate the `m_trie` object itself, which is left to the user of the\nlibrary.\n\nThe initialisation function expects three arguments: a pointer to a\n`m_trie` instance, a pointer to a hash function (see below) and a flags\nfield that is an ORed combination of the following constants:\n\n#### M_TRIE_OVERWRITE\nInserting an identical key twice overwrites the previously associated\ndata. Such behaviour is prevented by default.\n\n#### M_TRIE_CLEANUP\nTrigger the garbage-collecting procedure after each removal request. Note\nthat this can have a significant impact on the performance of the data\nstructure.\n\n#### M_TRIE_FREE\nCall the `free(3)` function on all value data pointers once the associated\nkey's removal is requested.\n\nTo free _all_ resources held by the data structure, call the `m_trie_free`\nfunction.\n\nPrototypes:\n```c\nint m_trie_init(m_trie* tr, int16_t (*hash)(uint8_t), uint8_t flags);\nint m_trie_free(m_trie* tr);\n```\n\n### Access\nIn order to insert a new key/value pair into the prefix tree, the\n`m_trie_insert` function can be used.  Once key/value pairs are inserted\ninto the data structure, it is possible to query the `m_trie` for the presence\nof a certain key with the `m_trie_search` function. 
The stored value is\nreturned via the last argument of the function, while the return value of\nthe function is used to report potential errors.\n\nPrototypes:\n```c\nint m_trie_insert(m_trie* tr, uint8_t* key, uint32_t len, void* val);\nint m_trie_search(m_trie* tr, uint8_t* key, uint32_t len, void** val);\n```\n\n### Removing\nThe library offers two functions to remove keys (and their associated\nvalues). The first, `m_trie_remove`, is a direct counterpart to the\n`m_trie_insert` function. It accepts a key and its length and marks the\ncorresponding data node for future removal. It is important to note that the\n`m_trie_remove` function *does not* perform deallocation of memory and\nsimply marks the node as if it did not contain any data. This decision was\ntaken for performance reasons, as cascading freeing of memory could\ncause significant delays and therefore uncontrollable jitter in\ncertain scenarios.\n\nThe last argument of the `m_trie_remove` function is its mode: either `0`\nfor normal or `1` for prefix. If the prefix mode is selected, the function\nremoves all keys that start with the specified key. The key itself need\nnot be a previously inserted key, e.g. it is possible to call the\n`m_trie_remove` function in the prefix mode with key `\"ca\"` and it would\ndelete keys such as `\"california\"` and `\"carpool\"`, even though `\"ca\"`\nitself was not in the trie.\n\nThe second removal function, `m_trie_remove_all`, simply marks all previously\ninserted keys as non-existent, so that an immediate call to the\n`m_trie_search` function with any argument would not succeed. It is still\npossible to use the `m_trie` instance afterwards, without needing to call\n`m_trie_init`. 
All internal structures remain allocated, which can serve\nas a performance feature in certain scenarios.\n\nPrototypes:\n```c\nint m_trie_remove(m_trie* tr, uint8_t* key, uint32_t len, uint8_t pfix);\nint m_trie_remove_all(m_trie* tr);\n```\n\n## Garbage-collection\nAs mentioned above, none of the removal functions actually deallocate\nresources, in order to have a more predictable run-time. The\ngarbage-collection can be performed by the `m_trie_trim` function that\ntraverses the data structure and frees the unused memory blocks. The\nperformance cost of this procedure depends on how often this function is\ncalled and may be significant.\n\nPrototype:\n```c\nint m_trie_trim(m_trie* tr);\n```\n\n### Hash functions\nThe `m_trie` data structure needs a user-supplied hash function to\nnavigate the levels of the tree. Each `m_trie` hash function takes a\n`uint8_t` value as input and produces an `int16_t` value. If the input\nbyte is not valid, a correct `m_trie` hash function must return\n`-1`. The function must uniquely map all valid byte inputs onto the\ninterval `0 .. (n-1)`, where `n` is the number of valid input bytes.  
Each\n`m_trie` hash function therefore must be a *minimal perfect* hash\nfunction.\n\n#### Predefined hash functions\nThe library provides a set of hash functions that are ubiquitous\nand general enough to be included in the library, such as the identity\nfunction `m_trie_hash_identity` or a hash function that supports only the\nlower-case letters of the English alphabet, `m_trie_hash_lower_alphabet`.\n\nPrototypes:\n```c\nint16_t m_trie_hash_identity(uint8_t key);\nint16_t m_trie_hash_alphabet(uint8_t key);\nint16_t m_trie_hash_digits(uint8_t key);\nint16_t m_trie_hash_base64(uint8_t key);\nint16_t m_trie_hash_alphanumeric(uint8_t key);\nint16_t m_trie_hash_lower_alphabet(uint8_t key);\nint16_t m_trie_hash_upper_alphabet(uint8_t key);\n```\n\n#### User-supplied hash functions\nUsers of the library are welcome to create custom hash functions that\nobey the rules described above. An example of a hash function that accepts\nonly characters `'0'` and `'1'`:\n\n```c\nint16_t\nhash_01(uint8_t c)\n{\n  if (c == '0') return 0;\n  if (c == '1') return 1;\n\n  return -1;\n}\n```\n\n## Time \u0026 space complexity\nThe table below makes an assumption that the hashing function operates in\nboth constant time and constant space.\n\n| Function              | Time   | Space  |\n|-----------------------|--------|--------|\n|`m_trie_init`          | `O(1)` | `O(1)` |\n|`m_trie_free`          | `O(n)` | `O(k)` |\n|`m_trie_insert`        | `O(k)` | `O(k)` |\n|`m_trie_search`        | `O(k)` | `O(1)` |\n|`m_trie_remove`        | `O(k)` | `O(1)` |\n|`m_trie_remove` (pfix) | `O(n)` | `O(k)` |\n|`m_trie_remove_all`    | `O(n)` | `O(k)` |\n|`m_trie_trim`          | `O(n)` | `O(k)` |\n|`m_trie_hash_*`        | `O(1)` | `O(1)` |\n\nWhere:\n* `k` is the length of the longest key\n* `n` is the overall number of allocated trie nodes\n\n## Documentation\nEach function that is part of the public API of the library has its own\nUNIX manual page. 
All pages belong to section 3 of the manual\ncatalogue and are automatically installed with the library.\n\n## Testing\nThe implementation is verified by a set of tests that each perform a list\nof operations - insertion, removal and search of keys - combined in ways\nthat aim to cover all edge cases. The testing framework consists of\ntwo files: `test.c` that implements the data structure operations and\n`test.sh` that serves as orchestration and verification of test results.\nThe process can be invoked by triggering the `test` target of the\nlibrary's `Makefile`.\n\n## Supported platforms\nThe library should compile under all C99 environments and standards-compliant\ncompilers. This means that the library is expected to compile and work under\nLinux, FreeBSD, OpenBSD, NetBSD, macOS, Windows, Haiku, Plan9 and more. Please\nreport any problems regarding a particular platform to the author.\n\nThe library is known to compile with Clang, GCC, and MSVC compilers.\n\n## Build \u0026 install\nThe `m_trie` library has zero dependencies and requires only a\nC99-compatible compiler.\n\nIn order to compile, test and install the library run the following 4\nsteps (some of which may require superuser privileges):\n```\n$ make\n$ make test\n$ make install\n$ make clean\n```\n\n## License\nThe `m_trie` library is licensed under the terms of the 2-clause BSD\nlicense.  For more information please consult the [LICENSE](LICENSE.md)\nfile. In case you need a different license, feel free to contact the\nauthor.\n\n## Author\nDaniel Lovasko (daniel.lovasko@gmail.com)\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Flovasko%2Fm_trie","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Flovasko%2Fm_trie","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Flovasko%2Fm_trie/lists"}