Erlang Trie Implementation
- Host: GitHub
- URL: https://github.com/okeuday/trie
- Owner: okeuday
- License: mit
- Created: 2011-02-13T05:10:24.000Z (almost 14 years ago)
- Default Branch: master
- Last Pushed: 2023-10-26T18:30:00.000Z (about 1 year ago)
- Last Synced: 2025-01-03T17:17:31.794Z (8 days ago)
- Topics: data-structures, erlang
- Language: Erlang
- Size: 380 KB
- Stars: 132
- Watchers: 13
- Forks: 33
- Open Issues: 0
Metadata Files:
- Readme: README.markdown
- License: LICENSE
Awesome Lists containing this project
- freaking_awesome_elixir - Erlang - Erlang Trie Implementation. (Algorithms and Data structures)
- fucking-awesome-elixir - trie - Erlang Trie Implementation. (Algorithms and Data structures)
- awesome-elixir - trie - Erlang Trie Implementation. (Algorithms and Data structures)
README
Erlang Trie Implementation
==========================

The data structure only stores keys that are strings (lists of integers), but its key lookup performance is close to that of the process dictionary (based on [results here](http://okeuday.livejournal.com/20025.html) with [the benchmark here](http://github.com/okeuday/erlbench)). So, if you ignore the process dictionary (which many argue should never be used), this data structure is (currently) the quickest for lookups on key-value pairs where all keys are strings.

The implementation stores leaf nodes as the string suffix because it is a [PATRICIA trie](https://xlinux.nist.gov/dads/HTML/patriciatree.html) (PATRICIA - Practical Algorithm to Retrieve Information Coded in Alphanumeric, D. R. Morrison (1968)). Storing leaf nodes this way helps avoid chains of single-child nodes (compressing the tree a little bit).
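
The idea of keeping the remaining suffix in the leaf can be illustrated with a minimal sketch. It is not the library's node layout: the module name `suffix_trie_sketch`, the use of maps, and the `leaf`/`node` tuples are all assumptions made for illustration, and only non-empty string keys are handled.

```erlang
-module(suffix_trie_sketch).
-export([new/0, store/3, find/2]).

%% Hypothetical layout (not the trie library's): each level is a map
%% Char => {leaf, Suffix, Value} | {node, Own, Children}, where Own is
%% none | {value, V} and Children is a nested map of the same shape.
new() -> #{}.

store([C | Rest], Value, Map) ->
    case maps:find(C, Map) of
        error ->
            %% No branch yet: keep the whole remaining suffix in one leaf.
            Map#{C => {leaf, Rest, Value}};
        {ok, {leaf, Rest, _Old}} ->
            %% Same key as the existing leaf: overwrite its value.
            Map#{C => {leaf, Rest, Value}};
        {ok, {leaf, Other, OtherValue}} ->
            %% A different key shares this character: split the leaf.
            Map#{C => store_node(Rest, Value, split(Other, OtherValue))};
        {ok, {node, _, _} = Node} ->
            Map#{C => store_node(Rest, Value, Node)}
    end.

%% Turn a leaf into an inner node that holds the same key.
split([], Value) -> {node, {value, Value}, #{}};
split(Suffix, Value) -> {node, none, store(Suffix, Value, #{})}.

store_node([], Value, {node, _, Children}) ->
    {node, {value, Value}, Children};
store_node(Key, Value, {node, Own, Children}) ->
    {node, Own, store(Key, Value, Children)}.

find([C | Rest], Map) ->
    case maps:find(C, Map) of
        {ok, {leaf, Rest, Value}} -> {ok, Value};  % suffix must match exactly
        {ok, {leaf, _, _}} -> error;
        {ok, {node, Own, Children}} -> find_node(Rest, Own, Children);
        error -> error
    end.

find_node([], {value, Value}, _) -> {ok, Value};
find_node([], none, _) -> error;
find_node(Key, _, Children) -> find(Key, Children).
```

In this sketch, storing `"apple"` alone creates a single entry under `$a` with the suffix `"pple"`, rather than a chain of five one-child nodes. The leaf is only split when a second key such as `"apply"` shares its first character.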
The full OTP dict API is supported in addition to other functions. Functions like `foldl`, `iter`, `itera`, and `foreach` traverse in alphabetical order. Functions like `map` and `foldr` traverse in reverse alphabetical order. There are also functions like `find_prefix`, `is_prefix`, and `is_prefixed` that check if a prefix exists within the trie. The functions with a `"_similar"` suffix like `find_similar`, `foldl_similar`, and `foldr_similar` all operate on trie elements that share a common prefix with the supplied string.
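
Because the OTP dict API is supported, dict-style code should carry over when all keys are strings. The sketch below is parameterized on the module name so that it runs against the standard library `dict` as written. Calling `demo(trie)` assumes the `trie` module from this repository is compiled and on the code path, which is not verified here. The module name `dict_api_sketch` and the function `demo/1` are hypothetical.

```erlang
-module(dict_api_sketch).
-export([demo/1]).

%% Runs a few OTP dict-style calls against the module Mod.
%% demo(dict) needs only the standard library; demo(trie) is an
%% assumption based on the README's statement about dict API support.
demo(Mod) ->
    T0 = Mod:new(),
    T1 = Mod:store("apple", 1, T0),
    T2 = Mod:store("apply", 2, T1),
    {ok, 1} = Mod:find("apple", T2),   % key is present
    error = Mod:find("app", T2),       % a prefix alone is not a key
    2 = Mod:fetch("apply", T2),        % fetch returns the bare value
    true = Mod:is_key("apple", T2),
    T3 = Mod:erase("apple", T2),
    false = Mod:is_key("apple", T3),
    Mod:size(T3).                      % one key ("apply") remains
```

With the standard library, `dict_api_sketch:demo(dict)` returns `1`. The traversal-order guarantees described above are specific to the trie and are not shown here, since `dict` does not define a fold order.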
The trie data structure supports string patterns. The functions `find_match/2`, `fold_match/4`, and `pattern_parse/2` use patterns that contain one or more `"*"` wildcard characters (each equivalent to the regex `".+"`, while `"**"` is forbidden). The function `find_match/2` operates on a trie filled with patterns when supplied a non-pattern string, while the function `fold_match/4` operates on a trie without patterns when supplied a pattern string. The functions `find_match2/2` and `pattern2_parse/2` add `"?"` as an additional wildcard character (with `"**"`, `"??"`, `"*?"` and `"?*"` forbidden) that consumes greedily to the next character (`"?"` must not be the last character in the pattern).
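
The `"*"` semantics described above (one or more characters, `"**"` rejected) can be sketched for a single pattern and a single string. This is a plain backtracking matcher written for illustration. It does not use the trie, it is not the library's matching algorithm, and it does not cover the `"?"` wildcard. The module name `wildcard_sketch` and its functions are hypothetical.

```erlang
-module(wildcard_sketch).
-export([match/2]).

%% Returns true when String matches Pattern, where each "*" stands for
%% one or more characters (like the regex ".+" over the whole string).
%% Raises badarg when the pattern contains the forbidden "**".
match(Pattern, String) ->
    case has_double_star(Pattern) of
        true -> erlang:error(badarg);
        false -> do_match(Pattern, String)
    end.

has_double_star([$*, $* | _]) -> true;
has_double_star([_ | Rest]) -> has_double_star(Rest);
has_double_star([]) -> false.

do_match([], []) -> true;
do_match([$* | P], [_ | S]) -> star(P, S);      % "*" consumes at least one
do_match([C | P], [C | S]) -> do_match(P, S);   % literal character
do_match(_, _) -> false.

%% After the mandatory first character, "*" may consume more characters.
star(P, S) ->
    do_match(P, S) orelse (S =/= [] andalso star(P, tl(S))).
```

For example, `wildcard_sketch:match("/a/*/c", "/a/b/c")` is `true`, while `wildcard_sketch:match("/a/*/c", "/a//c")` is `false` because the wildcard has to consume at least one character.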
The btrie data structure was added because many people wanted a quick associative data structure for binary keys. However, other alternatives provide better efficiency, so the btrie is best used for functions that cannot be found elsewhere (or perhaps for extra-long keys). More testing would be needed to determine the best use cases of the btrie.

Tests
-----

    rebar compile
    ERL_LIBS="/path/to/proper" rebar eunit

Author
------

Michael Truog (mjtruog at protonmail dot com)

License
-------

MIT License