https://github.com/kn0sys/valentinus

next generation vector db built with lmdb bindings

ai embeddings lmdb ml rust vector-database


Host: GitHub
URL: https://github.com/kn0sys/valentinus
Owner: kn0sys
License: apache-2.0
Created: 2024-07-08T21:36:26.000Z (10 months ago)
Default Branch: stable
Last Pushed: 2025-03-19T16:49:32.000Z (about 2 months ago)
Last Synced: 2025-03-25T06:22:30.876Z (about 1 month ago)
Topics: ai, embeddings, lmdb, ml, rust, vector-database
Language: Rust
Homepage: https://docs.rs/valentinus
Size: 338 KB
Stars: 12
Watchers: 2
Forks: 2
Open Issues: 2
Metadata Files:
- Readme: README.md
- License: LICENSE

Awesome Lists containing this project

awesome-rust - valentinus - Next generation vector database built with LMDB bindings [![Crates.io Version](https://img.shields.io/crates/v/valentinus)](https://crates.io/crates/valentinus) (Applications / Database)
fucking-awesome-rust - valentinus - Next generation vector database built with LMDB bindings [![Crates.io Version](https://img.shields.io/crates/v/valentinus)](https://crates.io/crates/valentinus) (Applications / Database)

README

[![build](https://github.com/kn0sys/valentinus/actions/workflows/rust.yml/badge.svg?branch=stable)](https://github.com/kn0sys/valentinus/actions/workflows/rust.yml) [![test](https://github.com/kn0sys/valentinus/actions/workflows/test.yml/badge.svg)](https://github.com/kn0sys/valentinus/actions/workflows/test.yml) [![Crates.io Version](https://img.shields.io/crates/v/valentinus)](https://crates.io/crates/valentinus) ![Crates.io Total Downloads](https://img.shields.io/crates/d/valentinus) [![docs.rs](https://img.shields.io/docsrs/valentinus)](https://docs.rs/valentinus) [![GitHub commit activity](https://img.shields.io/github/commit-activity/m/kn0sys/valentinus)](https://github.com/kn0sys/valentinus/commits/main/) [![Matrix](https://img.shields.io/matrix/valentinus%3Amatrix.org)](https://app.element.io/#/room/#valentinus:matrix.org)

![alt text](logo.png) 

# valentinus 

next generation vector db built with lmdb bindings

### dependencies

* bincode/serde  - serialize/deserialize

* lmdb-rs        - database bindings

* ndarray        - numpy equivalent

* ort/onnx       - embeddings

### getting started

```bash
git clone https://github.com/kn0sys/valentinus && cd valentinus
```

### optional environment variables

| var | usage | default |
|-----|-------|---------|
| `LMDB_USER` | user whose working directory holds the database | `$USER` |
| `LMDB_MAP_SIZE` | maximum environment size, i.e. the size in memory/on disk of all data | 20% of available memory |
| `ONNX_PARALLEL_THREADS` | parallel execution mode for this session | 1 |
| `VALENTINUS_CUSTOM_DIM` | embeddings dimensions for custom models | all-mini-lm-6 -> 384 |
| `VALENTINUS_LMDB_ENV` | environment for the database (i.e. test, prod) | test |
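
The variables in the table above could be resolved in Rust roughly as follows. This is a minimal sketch using only the standard library, not valentinus' actual implementation: `Settings` and `load_settings` are hypothetical names, the fallbacks mirror the defaults listed in the table, and `LMDB_MAP_SIZE` is kept as an unparsed string because the table does not state its units.

```rust
/// Hypothetical container for the optional settings; not part of the valentinus API.
#[derive(Debug, PartialEq)]
struct Settings {
    lmdb_user: String,
    lmdb_map_size: Option<String>,
    onnx_parallel_threads: usize,
    custom_dim: usize,
    lmdb_env: String,
}

/// Resolves each variable through `lookup`, falling back to the documented default.
fn load_settings<F>(lookup: F) -> Settings
where
    F: Fn(&str) -> Option<String>,
{
    // Parse a numeric variable, using `default` when it is unset or malformed.
    let number = |key: &str, default: usize| {
        lookup(key)
            .and_then(|value| value.parse::<usize>().ok())
            .unwrap_or(default)
    };
    Settings {
        // The table gives $USER as the default for LMDB_USER.
        lmdb_user: lookup("LMDB_USER")
            .or_else(|| lookup("USER"))
            .unwrap_or_default(),
        // Left unset when absent; the 20% default is assumed to be computed by the library.
        lmdb_map_size: lookup("LMDB_MAP_SIZE"),
        onnx_parallel_threads: number("ONNX_PARALLEL_THREADS", 1),
        custom_dim: number("VALENTINUS_CUSTOM_DIM", 384),
        lmdb_env: lookup("VALENTINUS_LMDB_ENV").unwrap_or_else(|| "test".to_string()),
    }
}
```

Passing `|key| std::env::var(key).ok()` as `lookup` reads the real process environment; taking the lookup as a parameter keeps the function easy to exercise without mutating global state.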

### tests

* Note: all tests currently require the `all-MiniLM-L6-v2_onnx` directory

* Get the `model.onnx` and `tokenizer.json` from Hugging Face or [build them](https://huggingface.co/docs/optimum/en/exporters/onnx/usage_guides/export_a_model)

```bash
mkdir all-MiniLM-L6-v2_onnx
cd all-MiniLM-L6-v2_onnx && wget https://huggingface.co/sentence-transformers/all-MiniLM-L6-v2/resolve/main/config.json
wget https://huggingface.co/sentence-transformers/all-MiniLM-L6-v2/resolve/main/onnx/model.onnx
wget https://huggingface.co/sentence-transformers/all-MiniLM-L6-v2/resolve/main/special_tokens_map.json
wget https://huggingface.co/sentence-transformers/all-MiniLM-L6-v2/resolve/main/tokenizer_config.json
wget https://huggingface.co/sentence-transformers/all-MiniLM-L6-v2/resolve/main/tokenizer.json
wget https://huggingface.co/sentence-transformers/all-MiniLM-L6-v2/resolve/main/vocab.txt
```

`RUST_TEST_THREADS=1 cargo test`
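
Because the tests depend on the downloaded model directory, a small pre-flight check can report which files are still missing. The sketch below is an illustration only: `MODEL_FILES` and `missing_model_files` are hypothetical names, and the file list simply mirrors the download commands above rather than anything the test suite is confirmed to verify.

```rust
use std::path::Path;

/// File names fetched by the download commands above.
const MODEL_FILES: [&str; 6] = [
    "config.json",
    "model.onnx",
    "special_tokens_map.json",
    "tokenizer_config.json",
    "tokenizer.json",
    "vocab.txt",
];

/// Returns the entries of `MODEL_FILES` that do not exist as regular files in `dir`.
fn missing_model_files(dir: &Path) -> Vec<&'static str> {
    MODEL_FILES
        .iter()
        .copied()
        .filter(|name| !dir.join(name).is_file())
        .collect()
}
```

Calling `missing_model_files(Path::new("all-MiniLM-L6-v2_onnx"))` returns an empty vector once every file is in place.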

### examples

see [examples](https://github.com/kn0sys/valentinus/tree/main/examples)

### reference

[inspired by this chromadb python tutorial](https://realpython.com/chromadb-vector-database/#what-is-a-vector-database)
