https://github.com/nnethercott/hannoy
Production-ready KV-backed HNSW implementation in Rust using LMDB
approximate-nearest-neighbor-search diskann hnsw lmdb python rust vector-database
- Host: GitHub
- URL: https://github.com/nnethercott/hannoy
- Owner: nnethercott
- License: mit
- Created: 2025-06-15T10:14:34.000Z (about 1 year ago)
- Default Branch: main
- Last Pushed: 2026-04-07T11:21:47.000Z (2 months ago)
- Last Synced: 2026-04-07T11:32:19.923Z (2 months ago)
- Topics: approximate-nearest-neighbor-search, diskann, hnsw, lmdb, python, rust, vector-database
- Language: Rust
- Homepage: https://docs.rs/hannoy
- Size: 1.8 MB
- Stars: 77
- Watchers: 5
- Forks: 9
- Open Issues: 4
Metadata Files:
- Readme: README.md
- Changelog: CHANGELOG.md
- License: LICENSE
# hannoy 🗼
hannoy is a key-value backed [HNSW](https://www.pinecone.io/learn/series/faiss/hnsw/) implementation based on [arroy](https://github.com/meilisearch/arroy).
## Motivation
Many popular HNSW libraries are built in memory, meaning you need enough RAM to store all the vectors you're indexing. Instead, `hannoy` uses [LMDB](https://en.wikipedia.org/wiki/Lightning_Memory-Mapped_Database), a memory-mapped KV store, as a storage backend. This is better suited to machines running multiple programs, or to cases where the dataset you're indexing won't fit in memory. LMDB also supports non-blocking concurrent reads by design, meaning it's safe to query the index in multi-threaded environments.
## Features
- Supported metrics: [euclidean](https://en.wikipedia.org/wiki/Euclidean_distance), [cosine](https://en.wikipedia.org/wiki/Cosine_similarity#Cosine_distance), [manhattan](https://en.wikipedia.org/wiki/Taxicab_geometry), [hamming](https://en.wikipedia.org/wiki/Hamming_distance), as well as quantized counterparts.
- Python bindings with [maturin](https://github.com/PyO3/maturin) and [pyo3](https://github.com/PyO3/pyo3)
- Multithreaded builds using rayon
- Disk-backed storage using LMDB, enabling indexing of datasets that won't fit in RAM
- [Compressed bitmaps](https://github.com/RoaringBitmap/roaring-rs) to store graph edges with minimal overhead, adding ~200 bytes per vector
- Dynamic document insertions and deletions without full re-indexing
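
For reference, the four supported metrics can be sketched in plain Python. These are the textbook definitions only, not `hannoy`'s code: the function names are hypothetical, and the library's internal scaling or normalization (for example of cosine distance, or of the quantized variants) may differ.

```python
import math


def euclidean(a, b):
    # Straight-line (L2) distance.
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def manhattan(a, b):
    # Sum of absolute coordinate differences (L1).
    return sum(abs(x - y) for x, y in zip(a, b))


def cosine_distance(a, b):
    # 1 - cosine similarity; undefined for zero-length vectors.
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        raise ValueError("cosine distance is undefined for a zero vector")
    return 1.0 - dot / (norm_a * norm_b)


def hamming(a, b):
    # Number of positions at which the two vectors differ.
    return sum(x != y for x, y in zip(a, b))


a = [1.0, 0.0, 0.0]
b = [0.0, 1.0, 0.0]
distances = {
    "euclidean": euclidean(a, b),
    "manhattan": manhattan(a, b),
    "cosine": cosine_distance(a, b),
    "hamming": hamming(a, b),
}
```

The two vectors are the same ones used in the usage examples below; since they are orthogonal unit vectors, their textbook cosine distance is 1.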
## Missing Features
- GPU-accelerated indexing
## Usage
### Rust 🦀
```rust
use hannoy::{distances::Cosine, Database, Reader, Result, Writer};
use heed::EnvOpenOptions;
use rand::{rngs::StdRng, SeedableRng};

fn main() -> Result<()> {
    // Opening an LMDB environment is unsafe because the file is memory-mapped.
    let env = unsafe {
        EnvOpenOptions::new()
            .map_size(1024 * 1024 * 1024) // 1GiB
            .open("./")
    }
    .unwrap();

    let mut wtxn = env.write_txn()?;
    let db: Database<Cosine> = env.create_database(&mut wtxn, None)?;
    // index 0, vectors of dimension 3
    let writer: Writer<Cosine> = Writer::new(db, 0, 3);

    // build: each item needs its own id
    writer.add_item(&mut wtxn, 0, &[1.0, 0.0, 0.0])?;
    writer.add_item(&mut wtxn, 1, &[0.0, 1.0, 0.0])?;

    let mut rng = StdRng::seed_from_u64(42);
    let mut builder = writer.builder(&mut rng);
    builder.ef_construction(100).build::<16, 32>(&mut wtxn)?;
    wtxn.commit()?;

    // search
    let rtxn = env.read_txn()?;
    let reader = Reader::<Cosine>::open(&rtxn, 0, db)?;

    let query = vec![0.0, 1.0, 0.0];
    let nns = reader.nns(1).ef_search(10).by_vector(&rtxn, &query)?.into_nns();

    dbg!(&nns);
    Ok(())
}
```
### Python 🐍
```python
import hannoy
from hannoy import Metric
import tempfile
tmp_dir = tempfile.gettempdir()
db = hannoy.Database(tmp_dir, Metric.COSINE)
with db.writer(3, m=4, ef=10) as writer:
    writer.add_item(0, [1.0, 0.0, 0.0])
    writer.add_item(1, [0.0, 1.0, 0.0])
reader = db.reader()
nns = reader.by_vec([0.0, 1.0, 0.0], n=2)
(closest, dist) = nns[0]
```
## Tips and tricks
### Reducing cold start latencies
Search in an HNSW always traverses the graph from the top layer down to the bottom layer, so we know a priori that some vectors will be needed. We can use [`madvise`](https://man7.org/linux/man-pages/man2/madvise.2.html) to hint to the kernel that these vectors (and their neighbours) should be loaded into RAM, which speeds up search.
Doing so can reduce cold-start latencies by several milliseconds, and is configured through the `HANNOY_READER_PREFETCH_MEMORY` environment variable.
For example, to prefetch 10 MiB of vectors into RAM:
```bash
export HANNOY_READER_PREFETCH_MEMORY=10485760
```
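
To illustrate the general mechanism, here is a minimal Python sketch of passing a byte budget to `madvise` with `MADV_WILLNEED` on a memory-mapped file. It is not `hannoy`'s implementation: the helpers `prefetch_budget` and `advise_willneed` are hypothetical, the mapped file is a throwaway temp file rather than an LMDB database, and reading the variable as a plain byte count is an assumption based on the example above.

```python
import mmap
import os
import tempfile


def prefetch_budget(default: int = 0) -> int:
    # Hypothetical helper: read the byte budget from the environment.
    raw = os.environ.get("HANNOY_READER_PREFETCH_MEMORY")
    return int(raw) if raw else default


def advise_willneed(mapped: mmap.mmap, budget: int) -> int:
    # Hint that the first `budget` bytes of the mapping will be needed soon.
    # Returns the number of bytes covered by the hint, or 0 when the
    # platform does not expose MADV_WILLNEED.
    length = min(budget, len(mapped))
    if length == 0 or not hasattr(mmap, "MADV_WILLNEED"):
        return 0
    mapped.madvise(mmap.MADV_WILLNEED, 0, length)
    return length


# 10 MiB expressed in bytes, matching the value exported above.
os.environ["HANNOY_READER_PREFETCH_MEMORY"] = str(10 * 1024 * 1024)

with tempfile.TemporaryFile() as f:
    f.write(b"\0" * (1 << 20))  # stand-in data file of 1 MiB
    f.flush()
    with mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_READ) as mapped:
        hinted = advise_willneed(mapped, prefetch_budget())
```

The hint is advisory: the kernel may page the data in ahead of time, but it is free to ignore the request. Because the budget here is larger than the file, the hint is clamped to the file size.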