https://github.com/anush008/fastembed-rs

Rust library for generating vector embeddings and reranking. Based on qdrant/fastembed.

Host: GitHub
URL: https://github.com/anush008/fastembed-rs
Owner: Anush008
License: apache-2.0
Created: 2023-10-01T16:13:02.000Z (about 2 years ago)
Default Branch: main
Last Pushed: 2025-05-05T05:34:17.000Z (5 months ago)
Last Synced: 2025-05-08T19:18:51.481Z (5 months ago)
Topics: embeddings, fastembed, rag, reranker, reranking, retrieval, retrieval-augmented-generation, vector-search
Language: Rust
Homepage: https://docs.rs/fastembed
Size: 593 KB
Stars: 494
Watchers: 5
Forks: 69
Open Issues: 4
Metadata Files:
- Readme: README.md
- License: LICENSE

# FastEmbed-rs 🦀

Rust library for generating vector embeddings and reranking locally!

## 🍕 Features

- Supports synchronous usage. No dependency on Tokio.

- Uses [@pykeio/ort](https://github.com/pykeio/ort) for performant ONNX inference.

- Uses [@huggingface/tokenizers](https://github.com/huggingface/tokenizers) for fast encodings.

- Supports batch embeddings generation with parallelism using [@rayon-rs/rayon](https://github.com/rayon-rs/rayon).

## 🔍 Not looking for Rust?

- Python 🐍: [fastembed](https://github.com/qdrant/fastembed)

- Go 🐳: [fastembed-go](https://github.com/Anush008/fastembed-go)

- JavaScript 🌐: [fastembed-js](https://github.com/Anush008/fastembed-js)

## 🤖 Models

### Text Embedding

- [**BAAI/bge-small-en-v1.5**](https://huggingface.co/BAAI/bge-small-en-v1.5) - Default

- [**sentence-transformers/all-MiniLM-L6-v2**](https://huggingface.co/sentence-transformers/all-MiniLM-L6-v2)

- [**mixedbread-ai/mxbai-embed-large-v1**](https://huggingface.co/mixedbread-ai/mxbai-embed-large-v1)

- [**Qdrant/clip-ViT-B-32-text**](https://huggingface.co/Qdrant/clip-ViT-B-32-text) - pairs with `clip-ViT-B-32-vision` for image-to-text search

- [**BAAI/bge-large-en-v1.5**](https://huggingface.co/BAAI/bge-large-en-v1.5)

- [**BAAI/bge-small-zh-v1.5**](https://huggingface.co/BAAI/bge-small-zh-v1.5)

- [**BAAI/bge-large-zh-v1.5**](https://huggingface.co/BAAI/bge-large-zh-v1.5)

- [**BAAI/bge-base-en-v1.5**](https://huggingface.co/BAAI/bge-base-en-v1.5)

- [**sentence-transformers/all-MiniLM-L12-v2**](https://huggingface.co/sentence-transformers/all-MiniLM-L12-v2)

- [**sentence-transformers/paraphrase-MiniLM-L12-v2**](https://huggingface.co/sentence-transformers/paraphrase-MiniLM-L12-v2)

- [**sentence-transformers/paraphrase-multilingual-mpnet-base-v2**](https://huggingface.co/sentence-transformers/paraphrase-multilingual-mpnet-base-v2)

- [**lightonai/ModernBERT-embed-large**](https://huggingface.co/lightonai/modernbert-embed-large)

- [**nomic-ai/nomic-embed-text-v1**](https://huggingface.co/nomic-ai/nomic-embed-text-v1)

- [**nomic-ai/nomic-embed-text-v1.5**](https://huggingface.co/nomic-ai/nomic-embed-text-v1.5) - pairs with `nomic-embed-vision-v1.5` for image-to-text search

- [**intfloat/multilingual-e5-small**](https://huggingface.co/intfloat/multilingual-e5-small)

- [**intfloat/multilingual-e5-base**](https://huggingface.co/intfloat/multilingual-e5-base)

- [**intfloat/multilingual-e5-large**](https://huggingface.co/intfloat/multilingual-e5-large)

- [**Alibaba-NLP/gte-base-en-v1.5**](https://huggingface.co/Alibaba-NLP/gte-base-en-v1.5)

- [**Alibaba-NLP/gte-large-en-v1.5**](https://huggingface.co/Alibaba-NLP/gte-large-en-v1.5)

### Sparse Text Embedding

- [**prithivida/Splade_PP_en_v1**](https://huggingface.co/prithivida/Splade_PP_en_v1) - Default

### Image Embedding

- [**Qdrant/clip-ViT-B-32-vision**](https://huggingface.co/Qdrant/clip-ViT-B-32-vision) - Default

- [**Qdrant/resnet50-onnx**](https://huggingface.co/Qdrant/resnet50-onnx)

- [**Qdrant/Unicom-ViT-B-16**](https://huggingface.co/Qdrant/Unicom-ViT-B-16)

- [**Qdrant/Unicom-ViT-B-32**](https://huggingface.co/Qdrant/Unicom-ViT-B-32)

- [**nomic-ai/nomic-embed-vision-v1.5**](https://huggingface.co/nomic-ai/nomic-embed-vision-v1.5)

### Reranking

- [**BAAI/bge-reranker-base**](https://huggingface.co/BAAI/bge-reranker-base) - Default

- [**BAAI/bge-reranker-v2-m3**](https://huggingface.co/BAAI/bge-reranker-v2-m3)

- [**jinaai/jina-reranker-v1-turbo-en**](https://huggingface.co/jinaai/jina-reranker-v1-turbo-en)

- [**jinaai/jina-reranker-v2-base-multilingual**](https://huggingface.co/jinaai/jina-reranker-v2-base-multilingual)

## 🚀 Installation

Run the following command in your project directory:

```bash
cargo add fastembed
```

Or add the following line to your Cargo.toml:

```toml
[dependencies]
fastembed = "4"
```

## 📖 Usage

### Text Embeddings

```rust
use fastembed::{TextEmbedding, InitOptions, EmbeddingModel};

// With default InitOptions
let model = TextEmbedding::try_new(Default::default())?;

// With custom InitOptions
let model = TextEmbedding::try_new(
    InitOptions::new(EmbeddingModel::AllMiniLML6V2).with_show_download_progress(true),
)?;

let documents = vec![
    "passage: Hello, World!",
    "query: Hello, World!",
    "passage: This is an example passage.",
    // You can leave out the prefix but it's recommended
    "fastembed-rs is licensed under Apache 2.0",
];

// Generate embeddings with the default batch size, 256
let embeddings = model.embed(documents, None)?;

println!("Embeddings length: {}", embeddings.len()); // -> Embeddings length: 4
println!("Embedding dimension: {}", embeddings[0].len()); // -> Embedding dimension: 384
```
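A common next step after generating embeddings is to compare two vectors by cosine similarity. The following is a minimal sketch that uses only the standard library. It assumes each embedding is available as a slice of `f32` values; `cosine_similarity` is a hypothetical helper written for illustration and is not part of fastembed.

```rust
/// Cosine similarity between two embedding vectors of equal length.
/// Returns `None` if the lengths differ or either vector has zero norm.
/// Hypothetical helper for illustration; not part of fastembed.
fn cosine_similarity(a: &[f32], b: &[f32]) -> Option<f32> {
    if a.len() != b.len() {
        return None;
    }
    // Dot product of the two vectors.
    let dot: f32 = a.iter().zip(b).map(|(x, y)| x * y).sum();
    // Euclidean (L2) norms.
    let norm_a: f32 = a.iter().map(|x| x * x).sum::<f32>().sqrt();
    let norm_b: f32 = b.iter().map(|x| x * x).sum::<f32>().sqrt();
    if norm_a == 0.0 || norm_b == 0.0 {
        return None;
    }
    Some(dot / (norm_a * norm_b))
}
```

Given the `embeddings` value from the example above, a call such as `cosine_similarity(&embeddings[0], &embeddings[1])` would compare the first two inputs.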

### Image Embeddings

```rust
use fastembed::{ImageEmbedding, ImageInitOptions, ImageEmbeddingModel};

// With default InitOptions
let model = ImageEmbedding::try_new(Default::default())?;

// With custom InitOptions
let model = ImageEmbedding::try_new(
    ImageInitOptions::new(ImageEmbeddingModel::ClipVitB32).with_show_download_progress(true),
)?;

let images = vec!["assets/image_0.png", "assets/image_1.png"];

// Generate embeddings with the default batch size, 256
let embeddings = model.embed(images, None)?;

println!("Embeddings length: {}", embeddings.len()); // -> Embeddings length: 2
println!("Embedding dimension: {}", embeddings[0].len()); // -> Embedding dimension: 512
```
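The model list pairs `Qdrant/clip-ViT-B-32-text` with `Qdrant/clip-ViT-B-32-vision` for image-to-text search. The search step itself could look roughly like the sketch below. It assumes the query embedding and the image embeddings come from such a matched pair and share the same dimension. `l2_normalize` and `rank_by_similarity` are hypothetical helpers, not fastembed functions.

```rust
/// Scale a vector to unit length; a zero vector is returned unchanged.
/// Hypothetical helper for illustration; not part of fastembed.
fn l2_normalize(v: &[f32]) -> Vec<f32> {
    let norm = v.iter().map(|x| x * x).sum::<f32>().sqrt();
    if norm == 0.0 {
        return v.to_vec();
    }
    v.iter().map(|x| x / norm).collect()
}

/// Return `(candidate index, score)` pairs ordered from most to least
/// similar to `query`, scoring by the dot product of unit-length vectors.
/// Assumes every candidate has the same dimension as `query`.
fn rank_by_similarity(query: &[f32], candidates: &[Vec<f32>]) -> Vec<(usize, f32)> {
    let q = l2_normalize(query);
    let mut scored: Vec<(usize, f32)> = candidates
        .iter()
        .enumerate()
        .map(|(i, c)| {
            let c = l2_normalize(c);
            let score: f32 = q.iter().zip(&c).map(|(x, y)| x * y).sum();
            (i, score)
        })
        .collect();
    // total_cmp defines a total order on f32, so NaN scores cannot panic the sort.
    scored.sort_by(|a, b| b.1.total_cmp(&a.1));
    scored
}
```

The first entry of the returned list is the index of the image that best matches the text query under this scoring.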

### Candidates Reranking

```rust
use fastembed::{TextRerank, RerankInitOptions, RerankerModel};

let model = TextRerank::try_new(
    RerankInitOptions::new(RerankerModel::BGERerankerBase).with_show_download_progress(true),
)?;

let documents = vec![
    "hi",
    "The giant panda (Ailuropoda melanoleuca), sometimes called a panda bear, is a bear species endemic to China.",
    "panda is animal",
    "i dont know",
    "kind of mammal",
];

// Rerank with the default batch size, 256 and return document contents
let results = model.rerank("what is panda?", documents, true, None)?;

println!("Rerank result: {:?}", results);
```
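In a retrieval pipeline, reranked candidates are often trimmed to the few most relevant ones before further use. The sketch below shows one way to do that with the standard library. `ScoredDocument` is a hypothetical stand-in defined here for illustration, not the type returned by `rerank`, and `top_k` is likewise not a fastembed function; consult the crate documentation for the actual result type and its fields.

```rust
/// Hypothetical stand-in for one reranked candidate; not the fastembed type.
#[derive(Debug, Clone, PartialEq)]
struct ScoredDocument {
    /// Position of the document in the original input list.
    index: usize,
    /// Relevance score; higher means more relevant.
    score: f32,
    /// Document contents.
    text: String,
}

/// Keep at most `k` documents whose score is at least `min_score`,
/// ordered from highest to lowest score.
fn top_k(mut results: Vec<ScoredDocument>, k: usize, min_score: f32) -> Vec<ScoredDocument> {
    // Drop candidates below the relevance threshold.
    results.retain(|r| r.score >= min_score);
    // Sort by descending score; total_cmp avoids panics on NaN.
    results.sort_by(|a, b| b.score.total_cmp(&a.score));
    // Keep only the best `k`.
    results.truncate(k);
    results
}
```

The threshold and `k` are application choices; a sensible `min_score` depends on the score range of the reranker model in use.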

Alternatively, local model files can be used for inference via the `try_new_from_user_defined(...)` methods of the respective structs.

## ✊ Support

To support the library, please donate to our primary upstream dependency, [`ort`](https://github.com/pykeio/ort?tab=readme-ov-file#-sponsor-ort) - The Rust wrapper for the ONNX runtime.

## 📄 LICENSE

Apache 2.0 © [2024](https://github.com/Anush008/fastembed-rs/blob/main/LICENSE)
