https://github.com/cshum/levi
Stream based full-text search for Node.js and browsers. Built on LevelDB.
- Host: GitHub
- URL: https://github.com/cshum/levi
- Owner: cshum
- License: mit
- Created: 2015-09-01T08:10:20.000Z (over 9 years ago)
- Default Branch: master
- Last Pushed: 2019-01-15T13:25:58.000Z (almost 6 years ago)
- Last Synced: 2024-04-14T11:56:15.550Z (8 months ago)
- Language: JavaScript
- Homepage:
- Size: 155 KB
- Stars: 373
- Watchers: 6
- Forks: 15
- Open Issues: 1
Metadata Files:
- Readme: README.md
- License: LICENSE
Awesome Lists containing this project
- awesome-starred - cshum/levi - Stream based full-text search for Node.js and browsers. Built on LevelDB. (others)
README
# Levi
Stream based full-text search for Node.js and browsers. Using LevelDB as storage backend.
[![Build Status](https://travis-ci.org/cshum/levi.svg?branch=master)](https://travis-ci.org/cshum/levi)
```
npm install levi
```

Full-text search using TF-IDF and cosine similarity, plus query-time field boost options.
Provided with a configurable text processing pipeline: Tokenizer, Porter Stemmer and Stopwords filter.

Levi is built on [LevelUP](https://github.com/Level/levelup) - a fast, asynchronous,
[transactional](https://github.com/cshum/level-transactions/) storage interface.
By default, it uses [LevelDB](https://github.com/Level/leveldown) on Node.js and [IndexedDB](https://github.com/maxogden/level.js) in the browser.
It also works with a variety of LevelDOWN compatible backends.

Using a stream based query mechanism with [Highland](http://highlandjs.org/), Levi is designed to be memory efficient, and extensible by combining multiple scoring mechanisms.
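
To make the scoring idea concrete, here is a minimal sketch of TF-IDF weighting combined with cosine similarity. It is not Levi's actual implementation: the function names, the smoothed IDF formula `log(1 + N / df)`, and the pre-tokenized sample documents are all assumptions chosen for illustration.

```js
// Minimal TF-IDF + cosine similarity sketch (illustrative only, not Levi's code).
// Documents are assumed to be already tokenized, stemmed and stopword-filtered.

// Count how often each term occurs in one document.
function termFrequencies (tokens) {
  var tf = Object.create(null)
  tokens.forEach(function (t) { tf[t] = (tf[t] || 0) + 1 })
  return tf
}

// Inverse document frequency: rarer terms get a larger weight.
// The smoothed form log(1 + N / df) is an assumption made for this sketch.
function inverseDocFrequencies (docs) {
  var df = Object.create(null)
  docs.forEach(function (tokens) {
    Object.keys(termFrequencies(tokens)).forEach(function (t) {
      df[t] = (df[t] || 0) + 1
    })
  })
  var idf = Object.create(null)
  Object.keys(df).forEach(function (t) {
    idf[t] = Math.log(1 + docs.length / df[t])
  })
  return idf
}

// Sparse TF-IDF vector; terms unknown to the corpus are dropped.
function tfidfVector (tokens, idf) {
  var tf = termFrequencies(tokens)
  var vec = Object.create(null)
  Object.keys(tf).forEach(function (t) {
    if (idf[t]) vec[t] = tf[t] * idf[t]
  })
  return vec
}

// Cosine of the angle between two sparse vectors.
function cosine (a, b) {
  var dot = 0
  var normA = 0
  var normB = 0
  Object.keys(a).forEach(function (t) {
    normA += a[t] * a[t]
    if (b[t]) dot += a[t] * b[t]
  })
  Object.keys(b).forEach(function (t) { normB += b[t] * b[t] })
  if (normA === 0 || normB === 0) return 0
  return dot / Math.sqrt(normA * normB)
}

var docs = [
  ['lorem', 'ipsum', 'dummy', 'text'],
  ['hello', 'world'],
  ['dummy', 'text', 'print', 'typeset', 'industri']
]
var idf = inverseDocFrequencies(docs)
var query = tfidfVector(['lorem', 'ipsum'], idf)

// One relevancy score per document; only the first shares terms with the query.
var scores = docs.map(function (tokens) {
  return cosine(query, tfidfVector(tokens, idf))
})
console.log(scores)
```

A field boost could be layered on top of this by multiplying each term weight by a per-field factor before computing the cosine, though how Levi applies boosts internally is not covered here.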
## API
### levi(path, [options])
### levi(db, [options])

Create a new Levi instance with a [LevelUP](https://github.com/Level/levelup#ctor) database path or instance,
or with a [SublevelUP](https://github.com/cshum/sublevelup) section.

```js
var levi = require('levi')

// levi instance of database path `db`
var lv = levi('db')
  .use(levi.tokenizer())
  .use(levi.stemmer())
  .use(levi.stopword())
```
The text processing pipeline plugins `levi.tokenizer()`, `levi.stemmer()` and `levi.stopword()` are required for indexing.
These are exposed as [ginga](https://github.com/cshum/ginga) plugins so that they can be swapped for different language configurations.

### .put(key, value, [options], [callback])

Index the document identified by `key`. `value` can be an object or a string.
Use object fields for `value` if you want field boost options for search.

All fields are indexed by default. Set the `options.fields` object to specify which fields are indexed.
Accepts an optional callback function or returns a promise.
```js
// string as value
lv.put('a', 'Lorem Ipsum is simply dummy text.', function (err) { ... })

// object fields as value
lv.put('b', {
  id: 'b',
  title: 'Lorem Ipsum',
  body: 'Dummy text of the printing and typesetting industry.'
}, function (err) { ... })

// options.fields
lv.put('c', {
  id: 'c',
  title: 'Hello World',
  body: 'Bla bla bla'
}, {
  fields: { title: true } // index title only
}).then(...).catch(...) // returns promise if no callback function
```

### .del(key, [options], [callback])

Delete document `key` from index.

Accepts optional callback function or returns a promise.
### .batch(array, [options], [callback])
Atomic bulk-write of `put` and `del` operations,
similar to LevelUP's array form of [`batch()`](https://github.com/Level/levelup#batch).

Accepts an optional callback function or returns a promise.
```js
lv.batch([
  { type: 'put', key: 'a', value: 'Lorem Ipsum is simply dummy text.' },
  { type: 'del', key: 'b' }
], function (err) { ... })
```

### .get(key, [options], [callback])

Fetch value from the store. Works exactly like LevelUP's [`get()`](https://github.com/Level/levelup#get).

Accepts optional callback function or returns a promise.
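
The "optional callback or promise" convention used by the methods above can be illustrated with a small wrapper. This is a sketch of the general pattern only, not how Levi implements it: `callbackOrPromise`, the in-memory `store`, and the stand-in `get` are hypothetical names, and the stand-in calls back synchronously to keep the example short.

```js
// Hypothetical helper: wraps a Node-style function (last argument is a
// callback) so that callers may either pass a callback or receive a promise.
function callbackOrPromise (fn) {
  return function () {
    var args = Array.prototype.slice.call(arguments)
    var self = this
    if (typeof args[args.length - 1] === 'function') {
      // Callback style: hand the call straight through.
      fn.apply(self, args)
      return
    }
    // Promise style: append a callback that settles the promise.
    return new Promise(function (resolve, reject) {
      args.push(function (err, result) {
        if (err) reject(err)
        else resolve(result)
      })
      fn.apply(self, args)
    })
  }
}

// In-memory stand-in for a document store (an assumption for this sketch).
var store = new Map([['a', 'Lorem Ipsum is simply dummy text.']])

var get = callbackOrPromise(function (key, callback) {
  if (!store.has(key)) return callback(new Error('NotFound: ' + key))
  callback(null, store.get(key))
})

// callback style
get('a', function (err, value) {
  if (err) throw err
  console.log('callback:', value)
})

// promise style
get('a').then(function (value) {
  console.log('promise:', value)
}).catch(function (err) {
  console.error(err)
})
```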
### .readStream([options])
Obtain a ReadStream of documents, lexicographically sorted by key.
Works exactly like LevelUP's [`readStream()`](https://github.com/Level/levelup#dbcreatereadstreamoptions).

### .searchStream(query, [options])

The main search interface of Levi is a Node compatible [highland](http://highlandjs.org/) object stream.
`query` can be a string or object fields.

Accepts the following options:
* `fields` controls field boosts. By default every field is weighted equally.
* `gt` (greater than), `gte` (greater than or equal) define the lower bound of the key range to be searched.
* `lt` (less than), `lte` (less than or equal) define the upper bound of the key range to be searched.
* `offset` number, offset results. Default 0.
* `limit` number, limit number of results. Default infinity.
* `expansions` number, maximum expansions of prefix matching for "search as you type" behaviour. Default 0.

A "more like this" query can be done by searching with the document itself.
```js
lv.searchStream('lorem ipsum').toArray(function (results) { ... }) // highland method

lv.searchStream('lorem ipsum', {
  fields: { title: 10, '*': 1 } // title field boost. '*' means any field
}).pipe(...)

lv.searchStream('lorem ipsum', {
  fields: { title: 1 } // title only
}).pipe(...)

// ltgt
lv.searchStream('lorem ipsum', {
  gt: '!posts!',
  lt: '!posts!~'
}).pipe(...)

// document as query
lv.searchStream({
  title: 'Lorem Ipsum',
  body: 'Dummy text of the printing and typesetting industry.'
}).pipe(...)

// maximum 10 expansions. 'ips' may also match 'ipso', 'ipsum' etc.
lv.searchStream('lorem ips', {
  expansions: 10
}).pipe(...)
```
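
The `gt`/`lt` example above relies on keys being compared lexicographically: `'~'` sorts after the other printable ASCII characters, so the range `'!posts!'` to `'!posts!~'` covers keys that begin with `'!posts!'`. The sketch below imitates that behaviour on plain arrays, together with `offset`/`limit` slicing and a capped prefix expansion. The sample keys, the term list, and the helpers `inRange` and `expand` are hypothetical and only approximate what the options describe.

```js
// Hypothetical keys using '!section!' style prefixes.
var keys = ['!users!bob', '!posts!b', '!comments!1', '!posts!zebra', '!posts!a'].sort()

// Plain string comparison matches bytewise ordering for ASCII keys.
function inRange (key, range) {
  if (range.gt !== undefined && !(key > range.gt)) return false
  if (range.gte !== undefined && !(key >= range.gte)) return false
  if (range.lt !== undefined && !(key < range.lt)) return false
  if (range.lte !== undefined && !(key <= range.lte)) return false
  return true
}

// Keys strictly between '!posts!' and '!posts!~'.
var posts = keys.filter(function (key) {
  return inRange(key, { gt: '!posts!', lt: '!posts!~' })
})
console.log(posts) // [ '!posts!a', '!posts!b', '!posts!zebra' ]

// offset and limit amount to skipping and truncating the result sequence.
var offset = 1
var limit = 2
var page = posts.slice(offset, offset + limit)
console.log(page) // [ '!posts!b', '!posts!zebra' ]

// Capped prefix expansion over a sorted list of indexed terms (assumed data).
function expand (prefix, terms, max) {
  return terms.filter(function (term) {
    return term.indexOf(prefix) === 0
  }).slice(0, max)
}

var terms = ['dummy', 'ipso', 'ipsum', 'lorem', 'text']
console.log(expand('ips', terms, 10)) // [ 'ipso', 'ipsum' ]
console.log(expand('ips', terms, 1)) // [ 'ipso' ]
```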
Each result emitted by `searchStream()` has the form:
```js
{
  key: 'b',
  score: 0.5972843431749838,
  value: {
    id: 'b',
    title: 'Lorem Ipsum',
    body: 'Dummy text of the printing and typesetting industry.'
  }
}
```

### .scoreStream(query, [options])

Underlying scoring mechanism of `searchStream()`. Calculates the relevancy score of documents against `query`, lexicographically sorted by key.
Accepts options `fields`, `gt`, `gte`, `lt`, `lte`, `expansions`.

Useful for combining multiple criteria or scoring mechanisms to build more advanced search functionality.
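
Because scores arrive sorted by key, two scoring passes can be combined with a single merge pass and then re-ranked by the combined score. The following sketch uses plain arrays as stand-ins for the streams. `combineScores`, the weights, and the sample `recencyScores` data are assumptions for illustration and are not part of Levi's API.

```js
// Merge two key-sorted arrays of { key, score } into one, adding weighted
// scores when a key appears in both. Arrays stand in for score streams.
function combineScores (a, b, weightA, weightB) {
  var out = []
  var i = 0
  var j = 0
  while (i < a.length || j < b.length) {
    var ka = i < a.length ? a[i].key : null
    var kb = j < b.length ? b[j].key : null
    if (kb === null || (ka !== null && ka < kb)) {
      // Key only present in a (so far).
      out.push({ key: ka, score: weightA * a[i].score })
      i++
    } else if (ka === null || kb < ka) {
      // Key only present in b (so far).
      out.push({ key: kb, score: weightB * b[j].score })
      j++
    } else {
      // Same key in both inputs: sum the weighted scores.
      out.push({ key: ka, score: weightA * a[i].score + weightB * b[j].score })
      i++
      j++
    }
  }
  return out
}

// Hypothetical inputs: text relevancy and a second, custom criterion.
var textScores = [{ key: 'a', score: 0.2 }, { key: 'b', score: 0.6 }]
var recencyScores = [{ key: 'b', score: 0.5 }, { key: 'c', score: 1 }]

var ranked = combineScores(textScores, recencyScores, 1, 0.5)
  .sort(function (x, y) { return y.score - x.score })

console.log(ranked.map(function (r) { return r.key })) // [ 'b', 'c', 'a' ]
```

Only the final ranking step needs all results in memory; the merge itself touches one entry from each input at a time, which is what makes a key-sorted score stream convenient to combine.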
### .pipeline(obj, [callback])
Underlying text processing pipeline of index and query, which extracts text tokens from a serializable `obj` object.
Accepts optional callback function or returns a promise.
```js
lv.pipeline({
  a: 'foo bar is a placeholder name',
  b: ['foo', 'bar'],
  c: 167,
  d: null,
  e: { ghjk: ['printing'] }
}, function (err, tokens) {
  // tokens
  // [ 'foo', 'bar', 'placehold', 'name', 'foo', 'bar', 'print' ]
})
```

### levi.destroy(path, [callback])
Completely remove an existing database at `path`,
which deletes the database directory on Node.js
or deletes the IndexedDB database in the browser.

If you are using a custom Level backend, you need to invoke its corresponding `destroy()` function to remove the database properly.
Accepts an optional callback function or returns a promise.
## License
MIT