{"id":18015224,"url":"https://github.com/jchambers/jbktree","last_synced_at":"2025-04-04T15:16:35.815Z","repository":{"id":144776765,"uuid":"140914166","full_name":"jchambers/jbktree","owner":"jchambers","description":"A generic BK-tree implemented as a Java Collection","archived":false,"fork":false,"pushed_at":"2021-05-12T03:20:53.000Z","size":14,"stargazers_count":2,"open_issues_count":1,"forks_count":2,"subscribers_count":1,"default_branch":"master","last_synced_at":"2025-02-10T00:47:08.948Z","etag":null,"topics":[],"latest_commit_sha":null,"homepage":null,"language":"Java","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":null,"status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/jchambers.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":null,"code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null}},"created_at":"2018-07-14T03:54:21.000Z","updated_at":"2023-03-01T15:08:40.000Z","dependencies_parsed_at":null,"dependency_job_id":"7df99805-c346-49d0-aa32-fabc22ac1204","html_url":"https://github.com/jchambers/jbktree","commit_stats":null,"previous_names":[],"tags_count":0,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/jchambers%2Fjbktree","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/jchambers%2Fjbktree/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/jchambers%2Fjbktree/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/jchambers%2Fjbktree/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/jchambers","download_url":"https://codeload.github.com/jchambers/jbktree/tar.gz/refs/heads/master
","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":247198466,"owners_count":20900081,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":[],"created_at":"2024-10-30T04:12:58.685Z","updated_at":"2025-04-04T15:16:35.786Z","avatar_url":"https://github.com/jchambers.png","language":"Java","funding_links":[],"categories":[],"sub_categories":[],"readme":"# jbktree\n\n[![Build Status](https://travis-ci.org/jchambers/jbktree.svg?branch=master)](https://travis-ci.org/jchambers/jbktree)\n\njbktree provides a [generic](https://docs.oracle.com/javase/tutorial/java/generics/) [BK-tree](https://signal-to-noise.xyz/post/bk-tree/) implemented as a [`java.util.Collection`](https://docs.oracle.com/javase/8/docs/api/java/util/Collection.html). A BK-tree is a kind of [metric tree](https://en.wikipedia.org/wiki/Metric_tree) designed for use in discrete [metric spaces](https://en.wikipedia.org/wiki/Metric_space). BK-trees are generally used to efficiently conduct [_k_-nearest neighbor](https://en.wikipedia.org/wiki/K-nearest_neighbors_algorithm) searches.\n\nA common use case for BK-trees (but certainly not the only use case) is \"fuzzy matching\" for strings, or \"spell-checking.\" In this use case, we add a list of known words into a BK-tree, and can then search the tree for words that are within a certain [edit distance](https://en.wikipedia.org/wiki/Edit_distance) of a query term. 
To demonstrate this use case, we can start by loading a list of words into a `List`.\n\n```java\nfinal List\u003cString\u003e words;\n\n// Under macOS, /usr/share/dict/words contains a list of 235,886 English words\ntry (final BufferedReader reader = new BufferedReader(new FileReader(\"/usr/share/dict/words\"))) {\n    words = reader.lines().collect(Collectors.toList());\n}\n```\n\nNext, we define the distance function we'd like to use to calculate the edit distance between words in the list. In this case, we'll use the [Levenshtein distance](https://en.wikipedia.org/wiki/Levenshtein_distance) implementation from [Commons Text](https://commons.apache.org/proper/commons-text/):\n\n```java\nfinal DiscreteDistanceFunction\u003cString\u003e distanceFunction = (first, second) -\u003e\n        LevenshteinDistance.getDefaultInstance().apply(first, second);\n```\n\nWith a distance function and a collection of words, constructing a BK-tree is straightforward:\n\n```java\nfinal BKTree\u003cString\u003e bkTree = new BKTree\u003c\u003e(distanceFunction, words);\n```\n\nNow, let's say we have a word that we think is misspelled (`\"exaple\"`), and we want to find some possible replacements. 
We can search the BK-tree for all other words that are within a certain edit distance (\"radius\") of our query term to get some suggestions:\n\n```java\nfinal PriorityQueue\u003cString\u003e results = bkTree.getNearestNeighbors(\"exaple\", 2);\n```\n\nThat gives us the following results:\n\n| Neighbor | Distance |\n|----------|----------|\n| example  | 1        |\n| hexapla  | 2        |\n| vexable  | 2        |\n| exciple  | 2        |\n| exile    | 2        |\n| exhale   | 2        |\n| staple   | 2        |\n| epaule   | 2        |\n| enable   | 2        |\n| elapse   | 2        |\n| eagle    | 2        |\n| saple    | 2        |\n| maple    | 2        |\n| exalt    | 2        |\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fjchambers%2Fjbktree","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fjchambers%2Fjbktree","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fjchambers%2Fjbktree/lists"}