https://github.com/0xDebabrata/citrus
(distributed) vector database
approximate-nearest-neighbor-search embeddings hnsw nearest-neighbor-search semantic-search semantic-search-engine similarity-search vector-database vector-search-engine vectors
- Host: GitHub
- URL: https://github.com/0xDebabrata/citrus
- Owner: 0xDebabrata
- License: apache-2.0
- Created: 2023-04-16T21:00:38.000Z (over 2 years ago)
- Default Branch: main
- Last Pushed: 2024-08-18T05:47:30.000Z (about 1 year ago)
- Last Synced: 2025-06-26T16:18:56.389Z (3 months ago)
- Topics: approximate-nearest-neighbor-search, embeddings, hnsw, nearest-neighbor-search, semantic-search, semantic-search-engine, similarity-search, vector-database, vector-search-engine, vectors
- Language: Python
- Homepage: https://searchcitrus.com
- Size: 935 KB
- Stars: 104
- Watchers: 3
- Forks: 13
- Open Issues: 2
Metadata Files:
- Readme: README.md
- Funding: FUNDING.yml
- License: LICENSE
Awesome Lists containing this project
- awesome-vector-databases - citrus - A distributed vector database designed for scalable and efficient vector similarity search. It is purpose-built for handling large-scale vector data and search workloads. ([Read more](/details/citrus.md)) `open-source` `distributed` `vector search` `scalable` (Vector Database Engines)
README
# 🍋 citrus.
### open-source (distributed) vector database

## Installation
```bash
pip install citrusdb
```

## Getting started
#### 1. Create index
```py
import citrusdb

# Initialize client
citrus = citrusdb.Client()

# Create index
citrus.create_index(
    name="example",
    max_elements=1000,  # increases dynamically as you insert more vectors
)
```

#### 2. Insert elements
```py
ids = [1, 2, 3]
documents = [
    "Your time is limited, so don't waste it living someone else's life",
    "I'd rather be optimistic and wrong than pessimistic and right.",
    "Running a start-up is like chewing glass and staring into the abyss."
]

citrus.add(index="example", ids=ids, documents=documents)
```
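
The `ids` and `documents` lists in the call above line up by position: `ids[i]` labels `documents[i]`. A mismatched batch is easy to produce by accident, so a small pre-insert check can help. The sketch below is an illustration only: `check_batch` is a hypothetical helper, not part of citrusdb, and the uniqueness requirement is an assumption rather than documented citrus behavior.

```python
def check_batch(ids, documents):
    """Hypothetical pre-insert check; not part of citrusdb."""
    # Every document needs exactly one id.
    if len(ids) != len(documents):
        raise ValueError(f"got {len(ids)} ids but {len(documents)} documents")
    # Assumption: ids should not repeat within a single batch.
    if len(set(ids)) != len(ids):
        raise ValueError("ids must be unique within a batch")
    # Pair each id with its document so the mapping is explicit.
    return list(zip(ids, documents))


ids = [1, 2, 3]
documents = [
    "Your time is limited, so don't waste it living someone else's life",
    "I'd rather be optimistic and wrong than pessimistic and right.",
    "Running a start-up is like chewing glass and staring into the abyss.",
]
pairs = check_batch(ids, documents)
print(len(pairs))
```

Running the check before `citrus.add(...)` surfaces a bad batch as a local `ValueError` instead of a failure somewhere inside the insert.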
You can also pass vector embeddings directly. If you pass a list of strings, as shown here, make sure `OPENAI_API_KEY` is set in your environment: by default, citrus uses OpenAI to generate the embeddings. Please reach out if you're looking for support for a different provider!

#### 3. Search
```py
results = citrus.query(
    index="example",
    documents=["What is it like to launch a startup"],
    k=1,
    include=["document", "metadata"]
)

print(results)
```
You can specify whether the associated text documents and metadata should be returned. By default, only the IDs are returned.

Go launch a repl on [Replit](https://replit.com) and see what result you get after running the query! `results` will contain the `ids` of the top `k` search hits.
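
To make "top `k` search hits" concrete, here is a minimal brute-force sketch of nearest-neighbor search in plain Python. It is not how citrus works internally: the repository topics mention HNSW, an approximate method, while this version compares the query against every stored vector. The function names, the toy 3-dimensional vectors, and the choice of cosine similarity are all assumptions made for illustration.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two non-zero vectors of equal length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def top_k(query, vectors, k):
    """Return the ids of the k stored vectors most similar to the query.

    `vectors` maps id -> list of floats. Every vector is scored, so this
    is exact search with cost linear in the number of stored vectors.
    """
    scored = sorted(
        vectors.items(),
        key=lambda item: cosine_similarity(query, item[1]),
        reverse=True,  # most similar first
    )
    return [vec_id for vec_id, _ in scored[:k]]


# Toy vectors standing in for real embeddings (assumed values).
vectors = {
    1: [1.0, 0.0, 0.0],
    2: [0.0, 1.0, 0.0],
    3: [0.9, 0.1, 0.0],
}
print(top_k([1.0, 0.0, 0.0], vectors, k=2))  # prints [1, 3]
```

An approximate index trades a little accuracy for speed by avoiding the full scan, which is what makes it practical for large collections.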
## Example
[chat w/ replit ai podcast](https://replit.searchcitrus.com)

[pokedex search](https://replit.com/@debabratajr/pokedex-search)
## Facing issues?
Feel free to open issues on this repository! Discord server coming soon!

*PS: citrus isn't fully distributed just yet. We're getting there though ;)*
---
Special thanks to
DevKit - The Essential Developer Toolkit
DSoC 2023