https://github.com/not-pizza/victor
Web-optimized vector database (written in Rust).
- Host: GitHub
- URL: https://github.com/not-pizza/victor
- Owner: not-pizza
- License: other
- Created: 2023-08-12T18:18:07.000Z (almost 3 years ago)
- Default Branch: main
- Last Pushed: 2024-06-24T08:21:17.000Z (almost 2 years ago)
- Last Synced: 2024-11-08T15:54:33.174Z (over 1 year ago)
- Language: Rust
- Homepage:
- Size: 3.69 MB
- Stars: 187
- Watchers: 6
- Forks: 8
- Open Issues: 18
Metadata Files:
- Readme: README.md
- License: LICENSE.md
Awesome Lists containing this project
- awesome-vector-databases - Victor - Web-optimized vector database written in Rust. It offers Rust and JavaScript APIs with efficient vector storage formats that consume significantly less space than JSON, and supports PCA compression for low-storage scenarios. Designed for native filesystem, in-memory, and web environments via WebAssembly. `Open Source` `Rust` `Embedded` `Lightweight` `No Server` `Wasm` (Embedded Vector Databases)
README
## Victor
Web-optimized vector database (written in Rust).
## Features
1. Rust API (using native filesystem, or a transient in-memory filesystem)
2. Web API (using the [Origin Private File System](https://web.dev/origin-private-file-system/))
3. Very efficient vector storage format
   1. For a vector with 1536 dimensions, our representation consumes 1.5 KB, while naively encoding with JSON would consume 20.6 KB.
4. PCA for vector compression when storage space is low
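To give a feel for what PCA compression means here, the following is a minimal sketch, not Victor's actual implementation (the README does not describe it). It finds only the first principal component of a set of vectors using power iteration and projects a vector onto it, so each vector collapses to a single number. The function names `column_means`, `first_component`, and `project` are hypothetical and chosen for illustration; a real compressor would keep many components rather than one.

```rust
/// Mean of each dimension across all rows.
/// Assumes `rows` is non-empty and every row has the same length.
fn column_means(rows: &[Vec<f64>]) -> Vec<f64> {
    let dims = rows[0].len();
    let mut means = vec![0.0; dims];
    for row in rows {
        for (m, x) in means.iter_mut().zip(row) {
            *m += x;
        }
    }
    for m in means.iter_mut() {
        *m /= rows.len() as f64;
    }
    means
}

fn dot(a: &[f64], b: &[f64]) -> f64 {
    a.iter().zip(b).map(|(x, y)| x * y).sum()
}

/// Returns (means, unit-length first principal component).
/// Uses power iteration on X^T X, where X is the mean-centered data,
/// without materializing the covariance matrix.
fn first_component(rows: &[Vec<f64>], iterations: usize) -> (Vec<f64>, Vec<f64>) {
    let means = column_means(rows);
    let centered: Vec<Vec<f64>> = rows
        .iter()
        .map(|r| r.iter().zip(&means).map(|(x, m)| x - m).collect())
        .collect();
    let dims = means.len();
    // Deterministic starting direction: all dimensions weighted equally.
    let mut v = vec![1.0 / (dims as f64).sqrt(); dims];
    for _ in 0..iterations {
        // next = X^T (X v)
        let mut next = vec![0.0; dims];
        for row in &centered {
            let s = dot(row, &v);
            for (n, x) in next.iter_mut().zip(row) {
                *n += s * x;
            }
        }
        let norm = dot(&next, &next).sqrt();
        if norm == 0.0 {
            break; // no variance along the current direction
        }
        v = next.iter().map(|x| x / norm).collect();
    }
    (means, v)
}

/// Compress one vector to its coordinate along the component.
fn project(vector: &[f64], means: &[f64], component: &[f64]) -> f64 {
    vector
        .iter()
        .zip(means)
        .zip(component)
        .map(|((x, m), c)| (x - m) * c)
        .sum()
}
```

For points that all lie on one line, such as `[0, 0]`, `[3, 4]`, and `[6, 8]`, the single projected coordinate is enough to reconstruct each point exactly; real embeddings lose some information, which is the trade-off when storage space is low.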
## JS Example
#### Installation
```
npm install victor-db
```
#### Usage
```ts
import { Db } from "victor-db";
const db = await Db.new();
const content = "My content!";
const tags = ["these", "are", "tags"];
const embedding = new Float64Array(/* your embedding here */);
// write to victor
await db.insert(content, embedding, tags);
// read the 10 closest results from victor that are tagged with "tags"
// (only 1 will be returned because we only inserted one embedding)
const result = await db.search(embedding, ["tags"], 10);
console.assert(result[0].content === content);
// clear database
await db.clear();
```
See `www/` for a more complete example, including fetching embeddings from OpenAI.
## Rust Example
#### Installation
```
cargo add victor-db
```
#### Usage
The Rust API can automatically create embeddings for you with [fastembed-rs](https://github.com/anush008/fastembed-rs?tab=readme-ov-file)'s default model (currently [BAAI/bge-small-en-v1.5](https://huggingface.co/BAAI/bge-small-en-v1.5)).
```rust
use std::path::PathBuf;
use victor_db::native::Db;
// Note: the `.await` calls below must run inside an async function.
let _ = std::fs::create_dir("./victor_test_data");
let mut victor = Db::new(PathBuf::from("./victor_test_data"));
victor.clear_db().await.unwrap();
victor
    .add(
        vec!["Pineapple", "Rocks"], // documents
        vec!["Pizza Toppings"],     // tags (only used for filtering)
    )
    .await;
victor
    .add_single("Cheese pizza", vec!["Pizza Flavors"])
    .await; // add another entry with a different tag
// read the 10 closest results from victor that are tagged with "Pizza Toppings"
// (only 2 will be returned because only two entries have that tag)
let nearest = victor
    .search("Hawaiian pizza", vec!["Pizza Toppings"], 10)
    .await
    .first()
    .unwrap()
    .content
    .clone();
assert_eq!(nearest, "Pineapple".to_string());
```
This example is also in the `/examples` directory. If you've cloned this repository, you can run it with `cargo run --example native_filesystem`.
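To make the role of tags in the example concrete, here is a rough sketch of a tag-filtered nearest-neighbor search. It is not Victor's implementation: the `Entry` struct, the `search` and `cosine_similarity` functions, the choice of cosine similarity as the metric, and the rule that an entry must carry every requested tag are all assumptions made for illustration.

```rust
/// Hypothetical in-memory record; Victor's real storage layout differs.
struct Entry {
    content: String,
    tags: Vec<String>,
    embedding: Vec<f32>,
}

/// Cosine similarity of two equal-length vectors (0.0 if either is all zeros).
fn cosine_similarity(a: &[f32], b: &[f32]) -> f32 {
    let dot: f32 = a.iter().zip(b).map(|(x, y)| x * y).sum();
    let norm_a = a.iter().map(|x| x * x).sum::<f32>().sqrt();
    let norm_b = b.iter().map(|x| x * x).sum::<f32>().sqrt();
    if norm_a == 0.0 || norm_b == 0.0 {
        0.0
    } else {
        dot / (norm_a * norm_b)
    }
}

/// Returns up to `top_n` entries that carry every tag in `required_tags`,
/// ordered from most to least similar to `query`.
fn search<'a>(
    entries: &'a [Entry],
    query: &[f32],
    required_tags: &[&str],
    top_n: usize,
) -> Vec<&'a Entry> {
    let mut scored: Vec<(f32, &Entry)> = entries
        .iter()
        // Tags only filter candidates; they do not affect the score.
        .filter(|e| {
            required_tags
                .iter()
                .all(|t| e.tags.iter().any(|et| et.as_str() == *t))
        })
        .map(|e| (cosine_similarity(&e.embedding, query), e))
        .collect();
    // Highest similarity first.
    scored.sort_by(|a, b| b.0.total_cmp(&a.0));
    scored.into_iter().take(top_n).map(|(_, e)| e).collect()
}
```

With three entries where only two are tagged "Pizza Toppings", asking for the 10 closest matches with that tag returns just those two, which mirrors the behavior described in the example above.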
## Hacking
1. Victor is written in Rust, and compiled to wasm with wasm-pack.
   **Install wasm-pack** with `cargo install wasm-pack` or `npm i -g wasm-pack`
   (https://rustwasm.github.io/wasm-pack/installer/)
2. **Build Victor** with `wasm-pack build --target web`
3. **Set up the example project**, which is in `www/`.
   If you use nvm, you can just run `cd www/ && nvm use`
   Then, `npm i`.
4. From `www/`, start the example project with `npm run start`.
## Architecture
Relevant code at `src/packed_vector.rs`.
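The 1.5 KB figure quoted under Features for a 1536-dimension vector works out to roughly one byte per dimension (1536 bytes). One common way to get there is linear quantization of each component to a `u8`. The sketch below illustrates that idea only; it is an assumption, not a description of what `src/packed_vector.rs` does, and the names `PackedVector`, `pack`, and `unpack` are hypothetical.

```rust
/// Hypothetical packed form: one byte per dimension plus the value range.
struct PackedVector {
    min: f32,
    max: f32,
    bytes: Vec<u8>,
}

/// Width of the value range, guarding against division by zero
/// when all components are equal (or the vector is empty).
fn range_of(min: f32, max: f32) -> f32 {
    if max > min {
        max - min
    } else {
        1.0
    }
}

/// Map each component linearly from [min, max] onto 0..=255.
fn pack(values: &[f32]) -> PackedVector {
    let min = values.iter().cloned().fold(f32::INFINITY, f32::min);
    let max = values.iter().cloned().fold(f32::NEG_INFINITY, f32::max);
    let range = range_of(min, max);
    let bytes = values
        .iter()
        .map(|v| (((v - min) / range) * 255.0).round() as u8)
        .collect();
    PackedVector { min, max, bytes }
}

/// Approximate inverse of `pack`; each component is off by at most
/// about half a quantization step, (max - min) / 510.
fn unpack(packed: &PackedVector) -> Vec<f32> {
    let range = range_of(packed.min, packed.max);
    packed
        .bytes
        .iter()
        .map(|&b| packed.min + (b as f32 / 255.0) * range)
        .collect()
}
```

Packing a 1536-dimension vector this way yields 1536 bytes of payload plus a few bytes for the range, at the cost of a small rounding error per component.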

---

## Us
[Sam Hall](https://twitter.com/Shmall27)
[Andre Popovitch](https://twitter.com/ChadNauseam)