https://github.com/timothyckl/iota
a minimal local embedding database.
document-retrieval embeddings python vector-database vector-search
Last synced: 5 months ago
- Host: GitHub
- URL: https://github.com/timothyckl/iota
- Owner: timothyckl
- License: mit
- Created: 2024-03-17T04:43:02.000Z (about 2 years ago)
- Default Branch: main
- Last Pushed: 2024-03-26T16:38:28.000Z (about 2 years ago)
- Last Synced: 2025-12-19T00:20:47.670Z (6 months ago)
- Topics: document-retrieval, embeddings, python, vector-database, vector-search
- Language: Python
- Homepage:
- Size: 668 KB
- Stars: 2
- Watchers: 1
- Forks: 0
- Open Issues: 4
Metadata Files:
- Readme: README.md
- Contributing: CONTRIBUTING.md
- License: LICENSE
# iota
A minimal local embedding database.
## Motivation
This project aims to reproduce some of my favourite features from existing vector stores while staying minimal and simple.
> [!IMPORTANT]
> This is by no means scalable, but should suffice for smaller projects.
## Installation
Install the package via PyPI:
```bash
pip install iotadb
```
## Usage
Here is a very simple example:
```python
from iotadb import IotaDB, Document

# Define a list of documents
docs = [
    Document(text="That is a happy dog"),
    Document(text="That is a very happy person"),
    Document(text="Today is a sunny day"),
]

# Create a collection
db = IotaDB()
db.create_collection(name="my_collection", documents=docs)

# Query documents within your collection
results = db.search("That is a happy person", return_similarities=True)

# Each result pairs a document with its similarity score
for doc, score in results:
    print(f"Text: {doc.text}")
    print(f"similarity: {score:.3f}\n")
```
More examples can be found in the `/examples` directory.
## Features
- **Simple interface**: Easy-to-use API for database operations.
- **Lightweight implementation**: Minimal resource utilization.
- **Local storage**: Stores embeddings locally for fast retrieval.
- **Fast indexing**: Efficient embedding indexing for storage and retrieval.
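
For readers new to embedding stores, retrieval over locally stored embeddings can be pictured as a brute-force cosine-similarity scan. The sketch below is an illustration only and is not iota's actual implementation: the `cosine_similarity` and `top_k` helpers and the toy 3-dimensional vectors are assumptions made up for this example (a real setup would obtain vectors from an embedding model).

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # treat a zero vector as similar to nothing
    return dot / (norm_a * norm_b)


def top_k(query, store, k=2):
    """Return the k (text, score) pairs most similar to the query."""
    scored = [(text, cosine_similarity(query, vec)) for text, vec in store.items()]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:k]


# Hypothetical, hand-made embeddings; not produced by any real model.
store = {
    "That is a happy dog": [0.9, 0.1, 0.0],
    "That is a very happy person": [0.8, 0.6, 0.0],
    "Today is a sunny day": [0.0, 0.1, 0.9],
}
query = [0.7, 0.7, 0.0]  # stand-in for "That is a happy person"

for text, score in top_k(query, store):
    print(f"{score:.3f}  {text}")
```

A linear scan like this costs one similarity computation per stored document, which is consistent with the note above that the approach suits smaller projects rather than large collections.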
## Use cases
- **Query with Natural Language**: Search for relevant documents using simple natural language queries.
- **Contextual Summarization**: Feed retrieved documents into the context of an LLM such as GPT-3 for data-augmented tasks.
- **Similarity Search**: Find similar items/documents based on their embeddings.
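
As a rough sketch of the contextual summarization use case, search results can be folded into a prompt before it is sent to an LLM. The `build_prompt` helper, its `min_score` threshold, and the prompt wording are hypothetical and not part of iotadb; the input is assumed to be `(text, score)` pairs like those printed in the usage example, and the scores shown are made up.

```python
def build_prompt(question, results, min_score=0.5):
    """Assemble an LLM prompt from (text, score) search results.

    Results scoring below min_score are dropped so that weak matches
    do not crowd the context.
    """
    relevant = [text for text, score in results if score >= min_score]
    context = "\n".join(f"- {text}" for text in relevant)
    return (
        "Answer the question using only the context below.\n\n"
        f"Context:\n{context}\n\n"
        f"Question: {question}"
    )


# Made-up scores standing in for real search output.
results = [
    ("That is a very happy person", 0.93),
    ("That is a happy dog", 0.71),
    ("Today is a sunny day", 0.12),
]
prompt = build_prompt("Who is happy?", results)
print(prompt)
```

The resulting string can then be passed to whichever LLM client a project already uses; the threshold is a tuning knob, not a recommended value.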
## Contributing
Interested in contributing? Head over to the [Contribution Guide](CONTRIBUTING.md) for more details.