{"id":13619994,"url":"https://github.com/wolfgarbe/PruningRadixTrie","last_synced_at":"2025-04-14T19:30:43.890Z","repository":{"id":79249298,"uuid":"267663070","full_name":"wolfgarbe/PruningRadixTrie","owner":"wolfgarbe","description":"PruningRadixTrie - 1000x faster Radix trie for prefix search \u0026 auto-complete","archived":false,"fork":false,"pushed_at":"2024-06-27T08:32:15.000Z","size":27535,"stargazers_count":556,"open_issues_count":4,"forks_count":30,"subscribers_count":5,"default_branch":"master","last_synced_at":"2024-11-08T06:40:08.101Z","etag":null,"topics":["auto-complete","auto-completion","auto-suggest","autocomplete","patricia-tree","patricia-trie","prefix-search","radix-tree","radix-trie","tree","trees","trie","trie-tree"],"latest_commit_sha":null,"homepage":"https://seekstorm.com/blog/pruning-radix-trie/","language":"C#","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"mit","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/wolfgarbe.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null}},"created_at":"2020-05-28T18:15:03.000Z","updated_at":"2024-11-06T20:33:40.000Z","dependencies_parsed_at":"2024-08-01T21:53:46.902Z","dependency_job_id":null,"html_url":"https://github.com/wolfgarbe/PruningRadixTrie","commit_stats":null,"previous_names":[],"tags_count":2,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/wolfgarbe%2FPruningRadixTrie","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/wolfgarbe%2FPruningRadixTrie/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/wolfgarbe%2F
PruningRadixTrie/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/wolfgarbe%2FPruningRadixTrie/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/wolfgarbe","download_url":"https://codeload.github.com/wolfgarbe/PruningRadixTrie/tar.gz/refs/heads/master","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":248945721,"owners_count":21187374,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["auto-complete","auto-completion","auto-suggest","autocomplete","patricia-tree","patricia-trie","prefix-search","radix-tree","radix-trie","tree","trees","trie","trie-tree"],"created_at":"2024-08-01T21:00:50.926Z","updated_at":"2025-04-14T19:30:43.883Z","avatar_url":"https://github.com/wolfgarbe.png","language":"C#","funding_links":[],"categories":["高性能数据结构和算法","others","C#","C# #","C\\#"],"sub_categories":["FPS"],"readme":"PruningRadixTrie\u003cbr\u003e \n[![MIT License](https://img.shields.io/github/license/wolfgarbe/pruningradixtrie.png)](https://github.com/wolfgarbe/PruningRadixTrie/blob/master/LICENSE)\n========\n**PruningRadixTrie - 1000x faster Radix trie** for prefix search \u0026 auto-complete\n\nThe PruningRadixTrie is a novel data structure, derived from a radix trie - but 3 orders of magnitude faster.\n\nA [Radix Trie](https://en.wikipedia.org/wiki/Radix_tree) or Patricia Trie is a space-optimized trie (prefix tree).\u003cbr\u003e\nA **Pruning Radix trie** is a novel Radix trie algorithm, that allows pruning of the Radix trie and early termination of the lookup.\n\nIn many cases, we 
are not interested in a complete set of all children for a given prefix, but only in the top-k most relevant terms.\nEspecially for short prefixes, this results in a **massive reduction of lookup time** for the top-10 results.\nOn the other hand, a complete result set of millions of suggestions wouldn't be helpful at all for autocompletion.\u003cbr\u003e\u003cbr\u003e\nThe lookup acceleration is achieved by storing in each node the maximum rank of all its children. By comparing this maximum child rank with the lowest rank of the results retrieved so far, we can heavily prune the trie and do early termination of the lookup for non-promising branches with low child ranks.\n\n***\n\n### Performance\n![Benchmark](https://miro.medium.com/max/1400/1*tan-mb-atG_aFTrkrB7mXA.png \"Benchmark\")\n\u003cbr\u003e\u003cbr\u003e\nThe **Pruning Radix Trie** is up to **1000x faster** than an ordinary Radix Trie.\n\nWhile 37 ms for an autocomplete might seem fast enough for a single user, it becomes insufficient when we have to serve thousands of users in parallel. Then autocomplete lookups in large dictionaries are only feasible when powered by something much faster than an ordinary radix trie.\n\nWhile a prefix of length=1 is not very useful for the Latin alphabet, it does make sense for CJK languages. Also, there are many more application fields for a fast prefix search algorithm beyond character-wise word completion: Instead of characters, the prefix can be composed of arbitrary items, e.g. \nwhole words in a query completion, or towns in a long routing sequence.\n\n### Dictionary\n\n[Terms.txt](https://github.com/wolfgarbe/PruningRadixTrie/blob/master/PruningRadixTrie.Benchmark/terms.zip) contains 6 million unigrams and bigrams derived from English Wikipedia, with term frequency counts used for ranking. 
But you can use any frequency dictionary for any language and domain of your choice.\n\n### Blog Posts\n[The Pruning Radix Trie — a Radix trie on steroids](https://seekstorm.com/blog/pruning-radix-trie/)\u003cbr\u003e\n[1000x Faster Spelling Correction algorithm](https://seekstorm.com/blog/1000x-spelling-correction/)\u003cbr\u003e\n[SymSpell vs. BK-tree: 100x faster fuzzy string search \u0026 spell checking](https://seekstorm.com/blog/symspell-vs-bk-tree/)\n\n### Application:\nThe PruningRadixTrie is perfect for auto-completion, query completion or any other prefix search in large dictionaries.\nWhile 37 ms for an auto-complete might seem fast enough for a **single user**, it becomes a completely different story if we have to serve **thousands of users in parallel**. Then autocomplete lookups in large dictionaries become only feasible when powered by something much faster than an ordinary radix trie.\n\n### Usage: \n\n**Create Object**\n``` \nPruningRadixtrie pruningRadixTrie = new PruningRadixtrie();\n``` \n**AddTerm:** insert term and term frequency count into Pruning Radix Trie. 
Frequency counts for the same term are summed up.\n```\npruningRadixTrie.AddTerm(\"microsoft\", 1000);\n```\n**GetTopkTermsForPrefix:** retrieve the top-k most relevant terms for a given prefix from the Pruning Radix Trie.\n``` \nstring prefix=\"micro\";\nint topK=10;\nvar results = pruningRadixTrie.GetTopkTermsForPrefix(prefix, topK, out long termFrequencyCountPrefix);\nforeach ((string term,long termFrequencyCount) in results) Console.WriteLine(term+\" \"+termFrequencyCount);\n``` \n**ReadTermsFromFile:** Deserialise the Pruning Radix Trie from disk.\n``` \npruningRadixTrie.ReadTermsFromFile(\"terms.txt\");\n```\n**WriteTermsToFile:** Serialise the Pruning Radix Trie to disk for persistence.\n``` \npruningRadixTrie.WriteTermsToFile(\"terms.txt\");\n```\n\n\n### Ports\nI have not tested whether the following third-party ports or reimplementations in other programming languages are exact ports, error-free, produce identical results, or are as fast as the original algorithm. \n\n**Go**\u003cbr\u003e\nhttps://github.com/olympos-labs/pruning-radix-trie\n\n**Java**\u003cbr\u003e\nhttps://github.com/benldr/JPruningRadixTrie\u003cbr\u003e\n\n**Python**\u003cbr\u003e\nhttps://github.com/otto-de/PyPruningRadixTrie\u003cbr\u003e\n\n**Rust**\u003cbr\u003e\nhttps://github.com/peterall/pruning_radix_trie\u003cbr\u003e\n\n---\n\n**PruningRadixTrie** is contributed by [**SeekStorm** - the high performance Search as a Service \u0026 search API](https://seekstorm.com)\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fwolfgarbe%2FPruningRadixTrie","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fwolfgarbe%2FPruningRadixTrie","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fwolfgarbe%2FPruningRadixTrie/lists"}