Ecosyste.ms: Awesome
An open API service indexing awesome lists of open source software.
https://github.com/tinysearch/tinysearch
🔍 Tiny, full-text search engine for static websites built with Rust and Wasm
bloom-filter elasticlunr hacktoberfest lunrjs rust search-engine static-site wasm
- Host: GitHub
- URL: https://github.com/tinysearch/tinysearch
- Owner: tinysearch
- License: apache-2.0
- Created: 2018-01-28T15:54:10.000Z (almost 7 years ago)
- Default Branch: master
- Last Pushed: 2023-10-18T18:49:08.000Z (about 1 year ago)
- Last Synced: 2024-10-29T15:29:05.460Z (about 1 month ago)
- Topics: bloom-filter, elasticlunr, hacktoberfest, lunrjs, rust, search-engine, static-site, wasm
- Language: Rust
- Homepage: https://endler.dev/2019/tinysearch
- Size: 769 KB
- Stars: 2,728
- Watchers: 21
- Forks: 87
- Open Issues: 10
Metadata Files:
- Readme: README.md
- Funding: .github/FUNDING.yml
- License: LICENSE-APACHE
Awesome Lists containing this project
- stars - tinysearch/tinysearch - text search engine for static websites built with Rust and Wasm (HarmonyOS / Windows Manager)
- awesome-starred-test - tinysearch/tinysearch - 🔍 Tiny, full-text search engine for static websites built with Rust and Wasm (Rust)
- awesome-list - tinysearch - text search engine for static websites built with Rust and Wasm | tinysearch | 1601 | (Rust)
- jimsghstars - tinysearch/tinysearch - 🔍 Tiny, full-text search engine for static websites built with Rust and Wasm (Rust)
README
# tinysearch
![Logo](logo.svg)
![CI](https://github.com/mre/tinysearch/workflows/CI/badge.svg)
tinysearch is a lightweight, fast, full-text search engine. It is designed for
static websites.

tinysearch is written in Rust, and then compiled to WebAssembly to run in a
browser.\
It can be used together with static site generators such as
[Jekyll](https://jekyllrb.com/), [Hugo](https://gohugo.io/),
[Zola](https://www.getzola.org/),
[Cobalt](https://github.com/cobalt-org/cobalt.rs), or
[Pelican](https://getpelican.com).

![Demo](tinysearch.gif)

## Is it tiny?
The test index file of my blog with around 40 posts creates a WASM payload of
99 kB (49 kB gzipped, 40 kB brotli).\
That is smaller than the demo image above; so yes.

## How it works

tinysearch is a Rust/WASM port of the Python code from the article
["Writing a full-text
search engine using Bloom filters"](https://www.stavros.io/posts/bloom-filter-search-engine/).
It can be seen as an alternative to [lunr.js](https://lunrjs.com/) and
[elasticlunr](http://elasticlunr.com/), which are too heavy for smaller websites
and load a lot of JavaScript.

Under the hood it uses a [Xor Filter](https://arxiv.org/abs/1912.08258),
a data structure for fast approximation of set membership that is smaller than
Bloom and cuckoo filters. Each blog post gets converted into a filter that will
then be serialized to a binary blob using
[bincode](https://github.com/bincode-org/bincode). Please note that the
underlying technologies are subject to change.

## Limitations

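Both limitations listed in this section follow from the design described above: each post is reduced to a set-membership filter over its words. The following Rust sketch illustrates that idea only; it is not tinysearch's implementation. It assumes a small hand-rolled Bloom filter as a stand-in for the xor filter, and the names `BloomFilter`, `tokenize`, `build_index`, and `search` are hypothetical.

```rust
use std::collections::hash_map::DefaultHasher;
use std::hash::{Hash, Hasher};

/// A tiny Bloom filter, used here as a stand-in for the xor filter (assumption).
struct BloomFilter {
    bits: Vec<bool>,
    hashes: u64,
}

impl BloomFilter {
    fn new(num_bits: usize, hashes: u64) -> Self {
        BloomFilter { bits: vec![false; num_bits], hashes }
    }

    /// Map a word to a bit position using a seeded hash.
    fn position(&self, word: &str, seed: u64) -> usize {
        let mut hasher = DefaultHasher::new();
        seed.hash(&mut hasher);
        word.hash(&mut hasher);
        (hasher.finish() % self.bits.len() as u64) as usize
    }

    fn insert(&mut self, word: &str) {
        for seed in 0..self.hashes {
            let pos = self.position(word, seed);
            self.bits[pos] = true;
        }
    }

    /// May report a false positive, but never a false negative.
    fn contains(&self, word: &str) -> bool {
        (0..self.hashes).all(|seed| self.bits[self.position(word, seed)])
    }
}

/// Split text into lowercase whole words.
fn tokenize(text: &str) -> Vec<String> {
    text.split(|c: char| !c.is_alphanumeric())
        .filter(|w| !w.is_empty())
        .map(|w| w.to_lowercase())
        .collect()
}

/// Build one filter per post from (title, body) pairs.
fn build_index(posts: &[(&str, &str)]) -> Vec<(String, BloomFilter)> {
    posts
        .iter()
        .map(|&(title, body)| {
            let mut filter = BloomFilter::new(1024, 3);
            for word in tokenize(title).into_iter().chain(tokenize(body)) {
                filter.insert(&word);
            }
            (title.to_string(), filter)
        })
        .collect()
}

/// Return the titles of posts whose filter reports every query word.
fn search(index: &[(String, BloomFilter)], query: &str) -> Vec<String> {
    let words = tokenize(query);
    index
        .iter()
        .filter(|(_, filter)| !words.is_empty() && words.iter().all(|w| filter.contains(w)))
        .map(|(title, _)| title.clone())
        .collect()
}

fn main() {
    let index = build_index(&[
        ("Rust and WebAssembly", "compiling a search engine to wasm"),
        ("Static sites", "hugo and zola generate static pages"),
    ]);
    println!("{:?}", search(&index, "search engine"));
}
```

Because only whole words are inserted into each filter, a prefix such as `sear` will generally not match (barring a false positive, which filters of this kind permit by design). The total index size also grows with the number of posts, since every post contributes its own filter.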
- Only finds entire words. As a consequence there are no search suggestions
  (yet). This is a necessary tradeoff for reducing memory usage. A trie
  data structure was about 10x bigger than the xor filters. New research on
  compact data structures for prefix searches might lift this limitation in the
  future.
- Since we bundle all search indices for all articles into one static binary, we
  recommend using it only for small- to medium-size websites. Expect around 2
  kB uncompressed per article (~1 kB compressed).

## Installation

[wasm-pack](https://rustwasm.github.io/wasm-pack/) is required to build the WASM
module. Install it with

```sh
cargo install wasm-pack
```

To optimize the JavaScript output, you'll also need
[terser](https://github.com/terser/terser):

```sh
npm install terser -g
```

If you want to make the WebAssembly as small as possible, we recommend
installing [binaryen](https://github.com/WebAssembly/binaryen) as well. On macOS
you can install it with [homebrew](https://brew.sh/):

```sh
brew install binaryen
```

Alternatively, you can download the binary from the
[release page](https://github.com/WebAssembly/binaryen/releases) or use your OS
package manager.

After that, you can install tinysearch itself:

```
cargo install tinysearch
```

## Usage

A JSON file, which contains the content to index, is required as an input.
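
As a rough illustration of what such an input file might contain, the Rust sketch below prints a JSON array with one object per page. The `body` field is described in this README; the `title` and `url` field names, the `Post` struct, and the helper functions are assumptions made for illustration, so check them against the example file in the repository before relying on them.

```rust
/// Hypothetical in-memory representation of one page to index.
struct Post {
    title: String,
    url: String,
    /// Optional: leave as `None` to index the title only.
    body: Option<String>,
}

/// Escape a string so it can be embedded in a JSON string literal.
fn escape_json(input: &str) -> String {
    let mut out = String::with_capacity(input.len());
    for c in input.chars() {
        match c {
            '"' => out.push_str("\\\""),
            '\\' => out.push_str("\\\\"),
            '\n' => out.push_str("\\n"),
            '\r' => out.push_str("\\r"),
            '\t' => out.push_str("\\t"),
            // Remaining control characters use the \uXXXX form.
            c if (c as u32) < 0x20 => out.push_str(&format!("\\u{:04x}", c as u32)),
            c => out.push(c),
        }
    }
    out
}

/// Render the posts as a JSON array of objects (field names are assumed).
fn to_index_json(posts: &[Post]) -> String {
    let entries: Vec<String> = posts
        .iter()
        .map(|post| {
            let mut fields = vec![
                format!("\"title\": \"{}\"", escape_json(&post.title)),
                format!("\"url\": \"{}\"", escape_json(&post.url)),
            ];
            // Only emit `body` when the post has one.
            if let Some(body) = &post.body {
                fields.push(format!("\"body\": \"{}\"", escape_json(body)));
            }
            format!("  {{ {} }}", fields.join(", "))
        })
        .collect();
    format!("[\n{}\n]", entries.join(",\n"))
}

fn main() {
    let posts = vec![
        Post {
            title: "Hello tinysearch".to_string(),
            url: "https://example.com/hello/".to_string(),
            body: Some("A first post about static site search.".to_string()),
        },
        Post {
            title: "Title-only entry".to_string(),
            url: "https://example.com/title-only/".to_string(),
            body: None,
        },
    ];
    println!("{}", to_index_json(&posts));
}
```

In practice a static site generator template would usually produce this file; the sketch only shows the shape of the data under the stated assumptions.
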
Please take a look at the [example file](fixtures/index.json).

ℹ️ The `body` field in the JSON document is optional and can be skipped to just
index post titles.

Once you have created the JSON file, you can run

```
tinysearch fixtures/index.json
```

This will create a WASM module and the JavaScript glue code to integrate it into
your website. You can open the `demo.html` from any webserver to see the result.

For example, Python has a built-in webserver that can be used for a quick test:

```
python3 -m http.server
```

Then browse to http://0.0.0.0:8000/demo.html to run the demo.

You can also take a look at the
[code examples for different static site generators](https://github.com/mre/tinysearch/tree/master/howto).

## Advanced Usage

For advanced usage options, run
```
tinysearch --help
```

Please check what's required to
[host WebAssembly in production](https://rustwasm.github.io/book/reference/deploying-to-production.html):
you will need to explicitly set gzip MIME types.

## Docker

If you don't have a full Rust setup available, you can also use our
nightly-built Docker images.

Here is how to quickly try tinysearch with Docker:

```sh
# Download a sample blog index from endler.dev
curl -O https://raw.githubusercontent.com/tinysearch/tinysearch/master/fixtures/index.json
# Create the WASM output
docker run -v $PWD:/app tinysearch/cli --engine-version path=\"/engine\" --path /app/wasm_output /app/index.json
```

By default, the most recent stable Alpine Rust image is used. To get nightly,
run

```sh
docker build --build-arg RUST_IMAGE=rustlang/rust:nightly-alpine -t tinysearch/cli:nightly .
```

### Advanced Docker Build Args

- `WASM_REPO`: Overwrite the wasm-pack repository
- `WASM_BRANCH`: Overwrite the repository branch to use
- `TINY_REPO`: Overwrite repository of tinysearch
- `TINY_BRANCH`: Overwrite tinysearch branch

## Github action

To integrate tinysearch in continuous deployment pipelines, a
[GitHub Action](https://github.com/marketplace/actions/tinysearch-action) is
available.

```yaml
- name: Build tinysearch
  uses: leonhfr/tinysearch-action@v1
  with:
    index: public/index.json
    output_dir: public/wasm
    output_types: |
      wasm
```

## Users

The following websites use tinysearch:
- [Matthias Endler's blog](https://endler.dev/2019/tinysearch/)
- [OutOfCheeseError](https://out-of-cheese-error.netlify.app/)
- [Museum of Warsaw Archdiocese](https://maw.art.pl/cyfrowemaw/)

Are you using tinysearch, too? Add your site here!

## Maintainers
- Matthias Endler (@mre)
- Jorge-Luis Betancourt (@jorgelbg)
- Mad Mike (@fluential)

## License

tinysearch is licensed under either of

- Apache License, Version 2.0, (LICENSE-APACHE or
  http://www.apache.org/licenses/LICENSE-2.0)
- MIT license (LICENSE-MIT or http://opensource.org/licenses/MIT)

at your option.

[wasm-pack]: https://github.com/rustwasm/wasm-pack