{"id":29673783,"url":"https://github.com/otto-de/pypruningradixtrie","last_synced_at":"2025-07-22T22:08:20.977Z","repository":{"id":57746572,"uuid":"514221719","full_name":"otto-de/PyPruningRadixTrie","owner":"otto-de","description":"PyPruningRadixTrie - Python version of super fast Radix trie for prefix search \u0026 auto-complete","archived":false,"fork":false,"pushed_at":"2024-12-04T08:33:28.000Z","size":33,"stargazers_count":39,"open_issues_count":0,"forks_count":3,"subscribers_count":5,"default_branch":"main","last_synced_at":"2025-07-15T01:40:45.911Z","etag":null,"topics":["autocomplete","prefix-search","radix-trie","trie"],"latest_commit_sha":null,"homepage":"","language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"other","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/otto-de.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE-APACHE.txt","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null}},"created_at":"2022-07-15T10:05:17.000Z","updated_at":"2025-04-24T00:18:46.000Z","dependencies_parsed_at":"2023-01-21T18:04:01.586Z","dependency_job_id":null,"html_url":"https://github.com/otto-de/PyPruningRadixTrie","commit_stats":null,"previous_names":[],"tags_count":0,"template":false,"template_full_name":null,"purl":"pkg:github/otto-de/PyPruningRadixTrie","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/otto-de%2FPyPruningRadixTrie","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/otto-de%2FPyPruningRadixTrie/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/otto-de%2FPyPruningRadixTrie/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/otto-de%2FPyPruningRadixTrie/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/
hosts/GitHub/owners/otto-de","download_url":"https://codeload.github.com/otto-de/PyPruningRadixTrie/tar.gz/refs/heads/main","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/otto-de%2FPyPruningRadixTrie/sbom","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":266580854,"owners_count":23951303,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","status":"online","status_checked_at":"2025-07-22T02:00:09.085Z","response_time":66,"last_error":null,"robots_txt_status":null,"robots_txt_updated_at":null,"robots_txt_url":"https://github.com/robots.txt","online":true,"can_crawl_api":true,"host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["autocomplete","prefix-search","radix-trie","trie"],"created_at":"2025-07-22T22:08:19.168Z","updated_at":"2025-07-22T22:08:20.966Z","avatar_url":"https://github.com/otto-de.png","language":"Python","funding_links":[],"categories":[],"sub_categories":[],"readme":"# PyPruningRadixTrie\n![GitHub CI](https://github.com/otto-de/PyPruningRadixTrie/actions/workflows/pipeline.yml/badge.svg)\n[![PyPI version](https://badge.fury.io/py/pypruningradixtrie.svg)](https://badge.fury.io/py/pypruningradixtrie)\n![OSS Lifecycle](https://img.shields.io/osslifecycle?file_url=https%3A%2F%2Fgithub.com%2Fotto-de%2FPyPruningRadixTrie%2Fblob%2Fmain%2FOSSMETADATA)\n\n\nPython Port of [Pruning Radix Trie](https://github.com/wolfgarbe/PruningRadixTrie) by Wolf Garbe.\n\n**Changes compared to original version**\n* Removed parameter to disable pruning behavior.\n  * See `test/non_pruning_radix_trie.py` for a non-pruning version that you can use to see the speed 
improvement.\n* Added outline for more generic `InputProvider` and providers that read CSV or JSON as examples\n\n\n## What and Why\n\nA [**Trie**](https://en.wikipedia.org/wiki/Trie) is a tree data structure that is commonly used for searching terms\nthat start with a given prefix.  \nIt starts with an empty string at the base of the trie, the _root node_.        \nAdding a new entry to the trie creates a new branch. This branch shares already present characters with existing nodes\nand creates new nodes where its prefix diverges from the present entries.\n```text\n# trie containing flower \u0026 flowchart (1 char = 1 node)\n\n'' - f - l - o - w - e - r\n                 |\n                 c - h - a - r - t\n```\n\nA [**RadixTrie**](https://en.wikipedia.org/wiki/Radix_tree) is the space-optimized version of a Trie.   \nIt merges every node that has only one sub-node with that sub-node, so a single node can hold more than one character.\n\n```text\n# radix trie containing flower \u0026 flowchart\n\n'' - flow - er\n      |\n     chart\n```\n\nThe prefix **Pruning** refers to the algorithm used to query the RadixTrie.   \nFor the pruning to work, the nodes in the RadixTrie are stored in sorted order.     
\nThis structure allows **cutting off unpromising branches** while querying the trie, which **makes the algorithm much faster**\nthan a non-pruning trie.\n\n\n## Usage\n\nInstall from PyPI:\n```shell\npip install pypruningradixtrie\n```\n\n**Create the PRT:**\n```python\n# empty trie\ntrie = PruningRadixTrie()\n\n# fill with data from CSV file on creation\ntrie = PruningRadixTrie('./test_data.csv', CSVInputProvider(',', lambda x: float(x[1])))\n```\n\n**Add entries:**    \nCSV:\n```python\n# fill with data from CSV file, score is at position 1, term at position 0\nfill_trie_from_file(trie, './test_data.csv', CSVInputProvider(',', lambda x: float(x[1]), 0))\n```\n\nJSON:\n```python\n# Dict and Any come from the standard typing module\nfrom typing import Any, Dict\n\n# define a function that calculates the score from a JSON entry\ndef score_fun(json_entry: Dict[str, Any]) -\u003e float:\n  return json_entry[\"pages\"] * json_entry[\"year\"] / 10.0\n\n# \"title\" = key for term to insert into PRT\nfill_trie_from_file(trie, './test_data.json', JSONInputProvider(\"title\", score_fun))\n```\n\nSingle Entry:\n```python\n# insert single entry\ninsert_term(trie, term=\"flower\", score=20)\n```\n\n**Use the PRT:**\n```python\n# get the top 10 entries that start with 'flower'\ntrie.get_top_k_for_prefix('flower', 10)\n```\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fotto-de%2Fpypruningradixtrie","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fotto-de%2Fpypruningradixtrie","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fotto-de%2Fpypruningradixtrie/lists"}