An open API service indexing awesome lists of open source software.

https://github.com/ptytb/bk-tree

Burkhard-Keller Tree implementation in pure PowerShell
https://github.com/ptytb/bk-tree

dictionary fuzzy-search spell-checker string-matching

Last synced: 8 months ago
JSON representation

Burkhard-Keller Tree implementation in pure PowerShell

Awesome Lists containing this project

README

          

# BK-tree

This is a very fast string fuzzy-matching module written in PowerShell.

It uses *Damerau-Levenshtein* distance as metric function
and *BK-tree* structure to represent a search tree.

*BK-tree* can be flattened into arrays and loaded later for even faster search.

I wrote this primarily for [pips - Python package browser](https://github.com/ptytb/pips)

# Usage

```PowerShell
Import-Module .\bktree

$bktree = [BKTree]::new()

# Building a BK-tree

$bktree.add('fold')
$bktree.add('mold')
$bktree.add('hold')
$bktree.add('bold')
$bktree.add('fork')
$bktree.add('beer')
$bktree.add('hole')
$bktree.add('shim')

$candidates = $bktree.Search('cold', 2)

# We've built a dictionary, let's save it
$bktree.SaveArrays('dict.bin')

# Load a dictionary. This is supposed to be used in your app along with SearchFast()
$bktree.LoadArrays('dict.bin')

$candidates = $bktree.SearchFast('cold', 2)
```

Explore and try an example `.\test.ps1` to see it with time measurements.

# Hacking

Uncomment the line `Print-Result` in the `SaveArrays` function to see the internal
structure of the flattened tree.

In the arrays, node offset `-2` means it is a parent node, `-1` means node has no children.

# Todo

- [ ] Add method switch to return candidates along with distances

# License

Copyright, 2018, Ilya Pronin.
This code is released under the MIT license.