https://github.com/coinbase/fuzzy-trie
A trie-based solution to provide fuzzy searching
fuzzy-trie
- Host: GitHub
- URL: https://github.com/coinbase/fuzzy-trie
- Owner: coinbase
- License: other
- Created: 2023-12-14T18:40:04.000Z (over 2 years ago)
- Default Branch: main
- Last Pushed: 2025-03-13T00:29:31.000Z (about 1 year ago)
- Last Synced: 2025-03-24T10:38:47.572Z (about 1 year ago)
- Topics: fuzzy-trie
- Language: Go
- Homepage:
- Size: 32.2 KB
- Stars: 10
- Watchers: 2
- Forks: 0
- Open Issues: 1
Metadata Files:
- Readme: README.md
- Contributing: CONTRIBUTING.md
- License: LICENSE
README
# Fuzzy Trie
A [trie](https://en.wikipedia.org/wiki/Trie)-based implementation of fuzzy searching.
This uses a [Levenshtein distance](https://en.wikipedia.org/wiki/Levenshtein_distance) calculation to evaluate the closeness of an item in the tree to the search term.
## Usage
Items loaded into the tree must implement the `Fuzzable` interface defined in this library.
Once your item type implements that interface, you can build and search a tree as follows:
```go
ctx := context.Background()
// getMyItems returns your concrete type that implements Fuzzable, not the
// Fuzzable interface type itself, because the items must also satisfy ComparableFuzzable.
fuzzables := getMyItems()
// Build a tree keyed on the string the callback returns for each item.
tree := trie.LoadTree(ctx, fuzzables, func(ctx context.Context, item *myFuzzableImpl) (string, error) {
	return item.trieNodeKey, nil
})
// Wrap the tree so it can be searched.
searchableTree := trie.NewDistanceTrees([]*trie.Tree[*myFuzzableImpl]{tree})
searchResults, err := searchableTree.Search(ctx, "cat")
```
### Multi-Dimensional Trees
If you wish to evaluate _closeness_ to your search term across multiple fields on your items (e.g., searching by family name and using the relevance of the given name as a tie-breaker), you can pass multiple trees to `NewDistanceTrees`:
```go
ctx := context.Background()
people := getPeople() // struct defined as { familyName string, givenName string }
// First dimension: keyed on family name.
familyNameTree := trie.LoadTree(ctx, people, func(ctx context.Context, item *person) (string, error) {
	return item.familyName, nil
})
// Second dimension: keyed on given name, used as the tie-breaker.
givenNameTree := trie.LoadTree(ctx, people, func(ctx context.Context, item *person) (string, error) {
	return item.givenName, nil
})
// The order of the trees determines the order in which closeness is evaluated.
searchableTree := trie.NewDistanceTrees([]*trie.Tree[*person]{familyNameTree, givenNameTree})
searchResults, err := searchableTree.Search(ctx, "jo")
```
The search above first ranks results by how close the family name is to 'jo'. Where several people are equally close to that search term, it then ranks them by how close the given name is to 'jo' and returns the results in that order.
### Measurement
If you wish to measure the performance of this tree within your application, you can supply an implementation of the `trie.Timer` interface provided in this library and use the `SetTimer` method on the `DistanceTrees` struct to inject your implementation.
## Benchmarking
Refer to [benchmarking.md](./internal/benchmark/benchmarking.md) for more information.