https://github.com/analyticsinmotion/symrank
🚀📦 High-performance cosine similarity ranking for Retrieval-Augmented Generation (RAG) pipelines.
- Host: GitHub
- URL: https://github.com/analyticsinmotion/symrank
- Owner: analyticsinmotion
- License: apache-2.0
- Created: 2025-05-22T05:15:32.000Z (10 months ago)
- Default Branch: main
- Last Pushed: 2025-05-29T13:22:25.000Z (10 months ago)
- Last Synced: 2025-05-29T14:12:59.292Z (10 months ago)
- Topics: cosine-similarity, python-rust, rag, ranking-system, reranking, retrieval-augmented-generation
- Language: Python
- Homepage:
- Size: 660 KB
- Stars: 2
- Watchers: 0
- Forks: 0
- Open Issues: 0
Metadata Files:
- Readme: README.md
- Changelog: CHANGELOG.md
- License: LICENSE

Similarity ranking for Retrieval-Augmented Generation
## ✨ What is SymRank?
**SymRank** is a blazing-fast Python library for top-k cosine similarity ranking, designed for vector search, retrieval-augmented generation (RAG), and embedding-based matching.
Built with a Rust + SIMD backend, it offers the speed of native code with the ease of Python.
## 🚀 Why SymRank?
- ⚡ **Fast**: SIMD-accelerated cosine scoring with adaptive parallelism
- 🧠 **Smart**: automatically selects serial or parallel mode based on workload
- 🔢 **Top-K optimized**: efficient inlined heap selection (no full-sort overhead)
- 🐍 **Pythonic**: easy-to-use Python API
- 🦀 **Powered by Rust**: safe, high-performance core engine
- 📉 **Memory efficient**: supports batching to reduce memory footprint without sacrificing speed
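The "Top-K optimized" point refers to selecting the k best matches with a bounded heap instead of sorting every score. A minimal Python sketch of that idea (illustrative only, not SymRank's Rust internals):

```python
import heapq

def top_k_by_score(scored, k):
    """Select the k highest-scoring (id, score) pairs.

    heapq.nlargest keeps at most k items in its internal heap, so this
    runs in O(N log k) rather than the O(N log N) cost of a full sort.
    """
    return heapq.nlargest(k, scored, key=lambda pair: pair[1])

scores = [("doc_1", 0.99), ("doc_2", 0.41), ("doc_3", 0.73), ("doc_4", 0.10)]
print(top_k_by_score(scores, 2))  # [('doc_1', 0.99), ('doc_3', 0.73)]
```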
## 📦 Installation
You can install SymRank with `uv` or, alternatively, with `pip`.
### Recommended (with uv):
```bash
uv pip install symrank
```
### Alternatively (using pip):
```bash
pip install symrank
```
## 🧪 Usage
### Basic Example (using Python lists)
```python
import symrank as sr
query = [0.1, 0.2, 0.3, 0.4]
candidates = [
    ("doc_1", [0.1, 0.2, 0.3, 0.5]),
    ("doc_2", [0.9, 0.1, 0.2, 0.1]),
    ("doc_3", [0.0, 0.0, 0.0, 1.0]),
]
results = sr.cosine_similarity(query, candidates, k=2)
print(results)
```
*Output*
```python
[{'id': 'doc_1', 'score': 0.9939991235733032}, {'id': 'doc_3', 'score': 0.7302967309951782}]
```
### Basic Example (using NumPy arrays)
```python
import symrank as sr
import numpy as np
query = np.array([0.1, 0.2, 0.3, 0.4], dtype=np.float32)
candidates = [
    ("doc_1", np.array([0.1, 0.2, 0.3, 0.5], dtype=np.float32)),
    ("doc_2", np.array([0.9, 0.1, 0.2, 0.1], dtype=np.float32)),
    ("doc_3", np.array([0.0, 0.0, 0.0, 1.0], dtype=np.float32)),
]
results = sr.cosine_similarity(query, candidates, k=2)
print(results)
```
*Output*
```python
[{'id': 'doc_1', 'score': 0.9939991235733032}, {'id': 'doc_3', 'score': 0.7302967309951782}]
```
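The scores in the output are plain cosine similarities. As a sanity check, the same numbers can be reproduced with NumPy (an illustrative re-implementation, not how SymRank computes them internally):

```python
import numpy as np

query = np.array([0.1, 0.2, 0.3, 0.4], dtype=np.float32)
candidates = {
    "doc_1": np.array([0.1, 0.2, 0.3, 0.5], dtype=np.float32),
    "doc_2": np.array([0.9, 0.1, 0.2, 0.1], dtype=np.float32),
    "doc_3": np.array([0.0, 0.0, 0.0, 1.0], dtype=np.float32),
}

def cosine(a, b):
    # Cosine similarity: dot product divided by the product of the norms.
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

ranked = sorted(
    ({"id": doc_id, "score": cosine(query, vec)} for doc_id, vec in candidates.items()),
    key=lambda r: r["score"],
    reverse=True,
)
print(ranked[:2])  # doc_1 first (~0.9940), then doc_3 (~0.7303), as shown above
```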
## 🧩 API: `cosine_similarity(...)`
```python
cosine_similarity(
    query_vector,       # list[float] or np.ndarray
    candidate_vectors,  # list[tuple[str, list[float] | np.ndarray]]
    k=5,                # number of top results to return
    batch_size=None,    # optional: set for memory-efficient batching
)
```
### `cosine_similarity(...)` Parameters
| Parameter | Type | Default | Description |
|-------------------|----------------------------------------------------|-------------|-------------|
| `query_vector` | `list[float]` or `np.ndarray` | _required_ | The query vector you want to compare against the candidate vectors. |
| `candidate_vectors`| `list[tuple[str, list[float] or np.ndarray]]` | _required_ | List of `(id, vector)` pairs. Each vector can be a list or NumPy array. |
| `k` | `int` | `5` | Number of top results to return, sorted by descending similarity. |
| `batch_size` | `int` or `None` | `None` | Optional batch size to reduce memory usage. If `None`, all candidates are scored with SIMD in a single pass. |
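Setting `batch_size` bounds peak memory: candidates are scored chunk by chunk while only the running top-k is retained. A rough NumPy sketch of that chunked pattern (an assumption about the behaviour, not SymRank's actual implementation):

```python
import heapq
import numpy as np

def batched_top_k(query, ids, matrix, k=5, batch_size=1024):
    """Score candidates in fixed-size chunks, keeping only the running top-k."""
    q = query / np.linalg.norm(query)
    best = []  # min-heap of (score, id), capped at k entries
    for start in range(0, len(ids), batch_size):
        chunk = matrix[start:start + batch_size]
        # With a normalised query, cosine similarity reduces to a
        # matrix-vector product divided by the per-row norms.
        norms = np.linalg.norm(chunk, axis=1)
        scores = (chunk @ q) / norms
        for doc_id, score in zip(ids[start:start + batch_size], scores):
            item = (float(score), doc_id)
            if len(best) < k:
                heapq.heappush(best, item)
            elif item > best[0]:
                heapq.heapreplace(best, item)
    return [{"id": i, "score": s} for s, i in sorted(best, reverse=True)]

rng = np.random.default_rng(0)
ids = [f"doc_{i}" for i in range(10_000)]
matrix = rng.normal(size=(10_000, 64)).astype(np.float32)
query = rng.normal(size=64).astype(np.float32)
print(batched_top_k(query, ids, matrix, k=3, batch_size=1000))
```

Only one `batch_size`-row chunk of scores is live at a time, so peak memory stays flat as the candidate set grows.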
### Returns
List of dictionaries with `id` and `score` (cosine similarity), sorted by descending similarity:
```python
[{"id": "doc_42", "score": 0.8763}, {"id": "doc_17", "score": 0.8451}, ...]
```
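In a RAG pipeline, the returned `id` values are typically used to look the winning passages back up and assemble a prompt context. A small sketch, where the `documents` store and its texts are entirely made up for illustration:

```python
# Hypothetical document store keyed by the same ids passed to cosine_similarity.
documents = {
    "doc_42": "Rust SIMD intrinsics overview",
    "doc_17": "Top-k selection with heaps",
}

# Results in the shape documented above (id + cosine score, descending).
results = [
    {"id": "doc_42", "score": 0.8763},
    {"id": "doc_17", "score": 0.8451},
]

# Assemble the retrieved passages for an LLM prompt, best match first.
context = "\n".join(f"[{r['score']:.4f}] {documents[r['id']]}" for r in results)
print(context)
```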
## 📄 License
This project is licensed under the Apache License 2.0.