An open API service indexing awesome lists of open source software.

https://github.com/nnethercott/hannoy

Production-ready KV-backed HNSW implementation in Rust using LMDB
https://github.com/nnethercott/hannoy

approximate-nearest-neighbor-search diskann hnsw lmdb python rust vector-database

Last synced: 2 months ago
JSON representation

Production-ready KV-backed HNSW implementation in Rust using LMDB

Awesome Lists containing this project

README

          


hannoy 🗼

[![License](https://img.shields.io/badge/license-MIT-green)](LICENSE)
[![Crates.io](https://img.shields.io/crates/v/hannoy)](https://crates.io/crates/hannoy)
[![dependency status](https://deps.rs/repo/github/nnethercott/hannoy/status.svg)](https://deps.rs/repo/github/nnethercott/hannoy)
[![Build](https://github.com/nnethercott/hannoy/actions/workflows/rust.yml/badge.svg?event=pull_request)](https://github.com/nnethercott/hannoy/actions/workflows/rust.yml)
[![CodSpeed Badge](https://img.shields.io/endpoint?url=https://codspeed.io/badge.json)](https://codspeed.io/nnethercott/hannoy)

hannoy is a key-value backed [HNSW](https://www.pinecone.io/learn/series/faiss/hnsw/) implementation based on [arroy](https://github.com/meilisearch/arroy).

## Motivation
Many popular HNSW libraries are built in memory, meaning you need enough RAM to store all the vectors you're indexing. Instead, `hannoy` uses [LMDB](https://en.wikipedia.org/wiki/Lightning_Memory-Mapped_Database) — a memory-mapped KV store — as a storage backend. This is more well-suited for machines running multiple programs, or cases where the dataset you're indexing won't fit in memory. LMDB also supports non-blocking concurrent reads by design, meaning its safe to query the index in multi-threaded environments.

## Features
- Supported metrics: [euclidean](https://en.wikipedia.org/wiki/Euclidean_distance#:~:text=In%20mathematics%2C%20the%20Euclidean%20distance,occasionally%20called%20the%20Pythagorean%20distance.), [cosine](https://en.wikipedia.org/wiki/Cosine_similarity#Cosine_distance), [manhattan](https://en.wikipedia.org/wiki/Taxicab_geometry), [hamming](https://en.wikipedia.org/wiki/Hamming_distance), as well as quantized counterparts.
- Python bindings with [maturin](https://github.com/PyO3/maturin) and [pyo3](https://github.com/PyO3/pyo3)
- Multithreaded builds using rayon
- Disk-backed storage to enable indexing datasets that won't fit in RAM using LMDB
- [Compressed bitmaps](https://github.com/RoaringBitmap/roaring-rs) to store graph edges with minimal overhead, adding ~200 bytes per vector
- Dynamic document insertions and deletions without full re-indexing

## Missing Features
- GPU-accelerated indexing

## Usage
### Rust 🦀
```rust
use hannoy::{distances::Cosine, Database, Reader, Result, Writer};
use heed::EnvOpenOptions;
use rand::{rngs::StdRng, SeedableRng};

fn main() -> Result<()> {
let env = unsafe {
EnvOpenOptions::new()
.map_size(1024 * 1024 * 1024) // 1GiB
.open("./")
}
.unwrap();

let mut wtxn = env.write_txn()?;
let db: Database = env.create_database(&mut wtxn, None)?;
let writer: Writer = Writer::new(db, 0, 3);

// build
writer.add_item(&mut wtxn, 0, &[1.0, 0.0, 0.0])?;
writer.add_item(&mut wtxn, 0, &[0.0, 1.0, 0.0])?;

let mut rng = StdRng::seed_from_u64(42);
let mut builder = writer.builder(&mut rng);
builder.ef_construction(100).build::<16,32>(&mut wtxn)?;
wtxn.commit()?;

// search
let rtxn = env.read_txn()?;
let reader = Reader::::open(&rtxn, 0, db)?;

let query = vec![0.0, 1.0, 0.0];
let nns = reader.nns(1).ef_search(10).by_vector(&rtxn, &query)?.into_nns();

dbg!("{:?}", &nns);
Ok(())
}
```

### Python 🐍
```python
import hannoy
from hannoy import Metric
import tempfile

tmp_dir = tempfile.gettempdir()
db = hannoy.Database(tmp_dir, Metric.COSINE)

with db.writer(3, m=4, ef=10) as writer:
writer.add_item(0, [1.0, 0.0, 0.0])
writer.add_item(1, [0.0, 1.0, 0.0])

reader = db.reader()
nns = reader.by_vec([0.0, 1.0, 0.0], n=2)

(closest, dist) = nns[0]
```

## Tips and tricks
### Reducing cold start latencies
Search in an hnsw always traverses from the top to bottom layers of the graph, so we know a priori some vectors will be needed. We can hint to the kernel that these vectors (and their neighbours) should be loaded into RAM using [`madvise`](https://man7.org/linux/man-pages/man2/madvise.2.html) to speed up search.

Doing so can reduce cold-start latencies by several milliseconds, and is configured through the `HANNOY_READER_PREFETCH_MEMORY` environment variable.

E.g. prefetching 10MiB of vectors into RAM.
```bash
export HANNOY_READER_PREFETCH_MEMORY=10485760
```