https://github.com/derhuerst/synchronous-autocomplete
Fast, simple autocompletion.
- Host: GitHub
- URL: https://github.com/derhuerst/synchronous-autocomplete
- Owner: derhuerst
- License: isc
- Created: 2018-01-07T03:11:37.000Z (almost 7 years ago)
- Default Branch: master
- Last Pushed: 2022-09-23T13:32:23.000Z (about 2 years ago)
- Last Synced: 2024-10-19T23:32:45.363Z (24 days ago)
- Topics: autocomplete, autocompletion, fuzzy, search
- Language: JavaScript
- Homepage: https://github.com/derhuerst/synchronous-autocomplete#synchronous-autocomplete
- Size: 250 KB
- Stars: 13
- Watchers: 2
- Forks: 2
- Open Issues: 2
Metadata Files:
- Readme: readme.md
- License: license.md
# synchronous-autocomplete
**Fast, simple [autocompletion](https://en.wikipedia.org/wiki/Autocomplete).** Also supports [Levenshtein](https://en.wikipedia.org/wiki/Levenshtein_distance)-based fuzzy search. Uses precomputed indexes to be fast.
[![npm version](https://img.shields.io/npm/v/synchronous-autocomplete.svg)](https://www.npmjs.com/package/synchronous-autocomplete)
![ISC-licensed](https://img.shields.io/github/license/derhuerst/synchronous-autocomplete.svg)
[![support me via GitHub Sponsors](https://img.shields.io/badge/support%20me-donate-fa7664.svg)](https://github.com/sponsors/derhuerst)
[![chat with me on Twitter](https://img.shields.io/badge/chat%20with%20me-on%20Twitter-1da1f2.svg)](https://twitter.com/derhuerst)

## Installing
```shell
npm install synchronous-autocomplete
```

## Usage

Let's build a simple search for our fruit stand. We assign a `weight` property to each fruit because some are bought more often than others, and we want those to rank higher in the search results.
```js
const items = [ {
	id: 'apple',
	name: 'Juicy sour Apple.',
	weight: 3
}, {
	id: 'banana',
	name: 'Sweet juicy Banana!',
	weight: 2
}, {
	id: 'pome',
	name: 'Sour Pomegranate',
	weight: 5
} ]
```

Let's understand the terminology used by this tool:
- *item*: A thing to search for. In our example, apple, banana and pomegranate each are an *item*.
- *weight*: How important an *item* is.
- *token*: A word from the fully normalized item name. For example, to find an item named `Hey There!`, you may process its name into the *tokens* `hey` & `there`.
- *fragment*: A word from the normalized search query, which may partially match a *token*. E.g. the *fragment* `ther` (from the search query `Hey Ther`) partially matches the *token* `there`.
- *relevance*: How well an item fits the search query.
- *score*: A combination of an item's *weight* and *relevance*. Used to rank search results.

In order to be as fast and disk-space-efficient as possible, `synchronous-autocomplete` requires five indexes to be prebuilt from the list of items. Check [the example code](example.js) for more details on how to build them. For our example, they would look like this:
```js
const tokens = { // internal item IDs, by token
	juicy: [0, 1],
	sour: [0, 2],
	apple: [0],
	sweet: [1],
	banana: [1],
	pomegranate: [2]
}
const weights = [ // item weights, by internal item ID
	3, // apple
	2, // banana
	5 // pome
]
const nrOfTokens = [ // nr of tokens, by internal item ID
	3, // apple
	3, // banana
	2 // pome
]
const scores = { // "uniqueness" of each token, by token
	juicy: 2 / 3, // 2 out of 3 items have the token "juicy"
	sour: 2 / 3,
	apple: 1 / 3,
	sweet: 1 / 3,
	banana: 1 / 3,
	pomegranate: 1 / 3
}
// In order to create smaller search indexes, we use numerical item IDs
// internally and maintain a mapping to their "real"/original IDs.
const originalIds = [
	'apple', // internal ID 0
	'banana', // internal ID 1
	'pome' // internal ID 2
]
```

Next, we must define a function that normalizes search input into a list of *fragments*. Consider using this simple function:
```js
import normalize from 'normalize-for-search'

// normalize the input, strip punctuation, then split on whitespace
const tokenize = (str) => {
	return normalize(str).replace(/[^\w\s]/g, '').split(/\s+/g)
}
```

Of course, you don't have to calculate the tokens & scores yourself! Instead, use `buildIndex` to generate the data:
```js
import {buildIndex} from 'synchronous-autocomplete/build.js'

const index = buildIndex(tokenize, items)
```

Now, we can query our index:
```js
import {createAutocomplete} from 'synchronous-autocomplete'

const autocomplete = createAutocomplete(index, tokenize)

autocomplete('bana')
// [ {
// 	id: 'banana',
// 	relevance: 0.6666665555555555,
// 	score: 0.8399472266053544,
// 	weight: 2,
// } ]

autocomplete('sour')
// [ {
// 	id: 'pome',
// 	relevance: 1.8333335,
// 	score: 3.134956187236602,
// 	weight: 5,
// }, {
// 	id: 'apple',
// 	relevance: 1.2222223333333333,
// 	score: 1.762749635070118,
// 	weight: 3,
// } ]

autocomplete('aplle', 3, true) // note the typo
// [ {
// 	id: 'apple',
// 	relevance: 0.22222216666666667,
// 	score: 0.3204998243877813,
// 	weight: 3,
// } ]
```

## API
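To make the shape of the index more concrete, here is a minimal sketch that derives the five structures from the usage example by hand. It is an illustration only, not the library's `buildIndex` implementation: `sketchIndex` and `simpleTokenize` are hypothetical names, and `simpleTokenize` is a dependency-free stand-in for the `normalize-for-search`-based tokenizer.

```js
// Hypothetical stand-in tokenizer: lower-cases, strips punctuation and
// splits on whitespace (no normalize-for-search involved).
const simpleTokenize = (str) => {
	return str.toLowerCase().replace(/[^\w\s]/g, '').split(/\s+/g).filter(Boolean)
}

// Sketch only: derives the five index structures shown in the usage section.
const sketchIndex = (tokenize, items) => {
	const tokens = Object.create(null) // internal item IDs, by token
	const weights = [] // item weights, by internal item ID
	const nrOfTokens = [] // nr of tokens, by internal item ID
	const originalIds = [] // original item IDs, by internal item ID

	// The position of an item in the list serves as its internal ID.
	items.forEach((item, internalId) => {
		const tokensOfItem = tokenize(item.name)
		for (const token of tokensOfItem) {
			if (!tokens[token]) tokens[token] = []
			if (!tokens[token].includes(internalId)) tokens[token].push(internalId)
		}
		weights[internalId] = item.weight
		nrOfTokens[internalId] = tokensOfItem.length
		originalIds[internalId] = item.id
	})

	// Share of items that contain each token, matching the `scores` example.
	const scores = Object.create(null)
	for (const token of Object.keys(tokens)) {
		scores[token] = tokens[token].length / items.length
	}

	return {tokens, scores, weights, nrOfTokens, originalIds}
}

const items = [
	{id: 'apple', name: 'Juicy sour Apple.', weight: 3},
	{id: 'banana', name: 'Sweet juicy Banana!', weight: 2},
	{id: 'pome', name: 'Sour Pomegranate', weight: 5}
]
const sketched = sketchIndex(simpleTokenize, items)
console.log(sketched.tokens.sour) // [ 0, 2 ]
console.log(sketched.nrOfTokens) // [ 3, 3, 2 ]
```

For real use, prefer `buildIndex()`; the sketch only shows how the internal numeric IDs tie the five structures together.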
```js
const index = buildIndex(tokenize, items)
const {tokens, scores, weights, nrOfTokens, originalIds} = index
```

- `tokenize` must be a function that, given a search query, returns an array of *fragments*.
- `items` must be an array of objects, each with `id`, `name` & `weight`.

```js
const autocomplete = createAutocomplete(index, tokenize)
autocomplete(query, limit = 6, fuzzy = false, completion = true)
```

- `tokens` must be an object with an array of internal *item* IDs per *token*.
- `scores` must be an object with a *token* score per *token*.
- `weights` must be an array with an *item* weight per internal *item* ID.
- `nrOfTokens` must be an array with the number of *tokens* per internal *item* ID.
- `originalIds` must be an array with the (real) *item* ID per internal *item* ID.
- `tokenize` is the same as with `buildIndex()`.

## Storing the index as protocol buffer

[Protocol buffers](https://developers.google.com/protocol-buffers/) (a.k.a. *protobuf*s) are a compact binary format for structured data serialization.
```js
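import {encodeIndex} from 'synchronous-autocomplete/encode.js'
// Assumption: the decoder is exported as `decodeIndex` from `decode.js`, mirroring `encode.js`.
import {decodeIndex} from 'synchronous-autocomplete/decode.js'
import {writeFileSync, readFileSync} from 'node:fs'

// encode & write the index
const encoded = encodeIndex(index)
writeFileSync('index.pbf', encoded)

// read & decode the index
const decoded = decodeIndex(readFileSync('index.pbf'))
```

## Contributing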
If you have a question or have difficulties using `synchronous-autocomplete`, please double-check your code and setup first. If you think you have found a bug or want to propose a feature, refer to [the issues page](https://github.com/derhuerst/synchronous-autocomplete/issues).