https://github.com/ggozad/haiku.rag
Opinionated agentic RAG powered by LanceDB, Pydantic AI, and Docling
ai docling lancedb mcp mcp-server ml pydantic-ai rag
- Host: GitHub
- URL: https://github.com/ggozad/haiku.rag
- Owner: ggozad
- License: mit
- Created: 2024-10-13T15:12:23.000Z (over 1 year ago)
- Default Branch: main
- Last Pushed: 2026-02-05T21:12:18.000Z (4 months ago)
- Last Synced: 2026-02-06T00:38:16.008Z (4 months ago)
- Topics: ai, docling, lancedb, mcp, mcp-server, ml, pydantic-ai, rag
- Language: Python
- Homepage:
- Size: 17.4 MB
- Stars: 482
- Watchers: 4
- Forks: 28
- Open Issues: 5
Metadata Files:
- Readme: README.md
- Changelog: CHANGELOG.md
- Funding: .github/FUNDING.yml
- License: LICENSE
- Agents: docs/agents.md
Awesome Lists containing this project
- awesome-mcp - ggozad/haiku.rag - Haiku SQLite RAG is a self-contained Retrieval-Augmented Generation library using SQLite that supports hybrid semantic and full-text search, multiple embedding and QA providers, and integrates with AI assistants via an MCP server. (MCP Servers / Knowledge & Memory)
- awesome-sqlite - haiku.rag - Retrieval-Augmented Generation library/CLI & server built on SQLite with hybrid vector+FTS search, file-system monitoring and Python client. (Tools)
README
# Haiku RAG
Agentic RAG built on [LanceDB](https://lancedb.com/), [Pydantic AI](https://ai.pydantic.dev/), and [Docling](https://docling-project.github.io/docling/).
> **New: vision and multimodal search.** Picture-aware ingestion captures embedded figure bytes; vision-capable QA models receive them alongside text. Multimodal embedders put picture vectors in the same space as text, enabling text-as-query → figure hits and image-as-query retrieval.
## Features
- **Hybrid search** — Vector + full-text with Reciprocal Rank Fusion
- **Multimodal & cross-modal search** — Multimodal embedders (vLLM) put picture vectors in the same space as text; supports text-as-query → figure hits and image-as-query
- **Question answering** — RAG skill with citations (page numbers, section headings)
- **Vision QA** — Vision-capable models receive figure bytes alongside chunk text
- **Reranking** — MxBAI, Cohere, Zero Entropy, or vLLM
- **Analysis skill** — Complex analytical tasks via sandboxed Python code execution (aggregation, computation, multi-document analysis)
- **Conversational RAG** — Chat TUI and web application for multi-turn conversations with session memory
- **Document structure** — Stores full [DoclingDocument](https://docling-project.github.io/docling/concepts/docling_document/), enabling structure-aware context expansion
- **Multiple providers** — Embeddings: Ollama, OpenAI, VoyageAI, LM Studio, vLLM (multimodal). QA: any model supported by Pydantic AI
- **Local-first** — Embedded LanceDB, no servers required. Also supports S3, GCS, Azure, and LanceDB Cloud
- **CLI & Python API** — Full functionality from command line or code
- **MCP server** — Expose as tools for AI assistants (Claude Desktop, etc.)
- **Visual grounding** — View chunks highlighted on original page images
- **File monitoring** — Watch directories and auto-index on changes
- **Time travel** — Query the database at any historical point with `--before`
- **Inspector** — TUI for browsing documents, chunks, and search results
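As a rough illustration of the Reciprocal Rank Fusion mentioned under hybrid search, here is a minimal sketch of the general technique. It is not taken from the haiku.rag or LanceDB code base: the function name, the chunk IDs, and the smoothing constant `k = 60` (a commonly used default) are all assumptions made for this example.

```python
from collections import defaultdict


def reciprocal_rank_fusion(
    rankings: list[list[str]], k: int = 60
) -> list[tuple[str, float]]:
    """Fuse several ranked lists of IDs into one list, best first.

    Each ID scores 1 / (k + rank) in every list it appears in (rank starts
    at 1), and the per-list scores are summed.
    """
    scores: dict[str, float] = defaultdict(float)
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    # Highest fused score first; ties broken by ID so the output is deterministic.
    return sorted(scores.items(), key=lambda item: (-item[1], item[0]))


# Hypothetical result lists from a vector search and a full-text search.
vector_hits = ["chunk-a", "chunk-b", "chunk-c"]
fts_hits = ["chunk-b", "chunk-d", "chunk-a"]

fused = reciprocal_rank_fusion([vector_hits, fts_hits])
print([doc_id for doc_id, _ in fused])
```

Chunks that rank well in both lists rise to the top, while a chunk found by only one search still keeps a place in the fused ranking.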
## Installation
**Python 3.12 or newer required**
### Full Package (Recommended)
```bash
pip install haiku.rag
```
Includes all features: document processing, all embedding providers, and rerankers.
Using [uv](https://docs.astral.sh/uv/)? `uv pip install haiku.rag`
### Slim Package (Minimal Dependencies)
```bash
pip install haiku.rag-slim
```
Install only the extras you need. See the [Installation](https://ggozad.github.io/haiku.rag/installation/) documentation for available options.
## Quick Start
> **Note**: Requires an embedding provider (Ollama, OpenAI, etc.). See the [Tutorial](https://ggozad.github.io/haiku.rag/tutorial/) for setup instructions.
```bash
# Index a PDF
haiku-rag add-src paper.pdf
# Search
haiku-rag search "attention mechanism"
# Ask questions with citations
haiku-rag ask "What datasets were used for evaluation?" --cite
# Analyze — complex analytical tasks via code execution
haiku-rag analyze "How many documents mention transformers?"
# Interactive chat — multi-turn conversations with memory
haiku-rag chat
# Watch a directory for changes
haiku-rag serve --monitor
```
See [Configuration](https://ggozad.github.io/haiku.rag/configuration/) for customization options.
## Python API
```python
import asyncio

from haiku.rag.client import HaikuRAG


async def main():
    async with HaikuRAG("knowledge.lancedb", create=True) as rag:
        # Index documents
        await rag.create_document_from_source("paper.pdf")
        await rag.create_document_from_source("https://arxiv.org/pdf/1706.03762")

        # Search — returns chunks with provenance
        results = await rag.search("self-attention")
        for result in results:
            print(f"{result.score:.2f} | p.{result.page_numbers} | {result.content[:100]}")

        # QA with citations
        answer, citations = await rag.ask("What is the complexity of self-attention?")
        print(answer)
        for cite in citations:
            print(f"  [{cite.chunk_id}] p.{cite.page_numbers}: {cite.content[:80]}")


# The client is async, so run it inside an event loop
asyncio.run(main())
```
For details on the skills the client wraps, see the [Skills docs](https://ggozad.github.io/haiku.rag/skills/).
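If you want to render citations as a numbered reference list rather than printing them inline, a small helper along these lines may be enough. This is only a sketch: `Citation` is a stand-in dataclass that models just the three attributes used in the example above (`chunk_id`, `page_numbers`, `content`), `page_numbers` is assumed to be a list of integers, and `format_citations` and the sample text are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Citation:
    """Stand-in for a citation object; not the library's own class."""

    chunk_id: str
    page_numbers: list[int]  # assumed shape, see the note above
    content: str


def format_citations(citations: list[Citation], width: int = 80) -> str:
    """Return one numbered line per citation with pages and a short snippet."""
    lines = []
    for number, cite in enumerate(citations, start=1):
        pages = ", ".join(str(page) for page in cite.page_numbers) or "n/a"
        # Collapse newlines and repeated spaces before truncating the snippet.
        snippet = " ".join(cite.content.split())[:width]
        lines.append(f"[{number}] p.{pages} ({cite.chunk_id}): {snippet}")
    return "\n".join(lines)


# Made-up sample data, used only to exercise the helper.
example = [
    Citation("chunk-1", [3, 4], "Self-attention connects all positions\nin a sequence.")
]
print(format_citations(example))
```

Because the helper only reads attributes, it should work on any object exposing the same three fields.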
## MCP Server
Use with AI assistants like Claude Desktop:
```bash
haiku-rag serve --mcp --stdio
```
Add to your Claude Desktop configuration:
```json
{
  "mcpServers": {
    "haiku-rag": {
      "command": "haiku-rag",
      "args": ["serve", "--mcp", "--stdio"]
    }
  }
}
```
Provides tools for document management, search, QA, and analysis directly in your AI assistant.
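If your configuration file already lists other MCP servers, you may prefer to merge the entry in rather than paste it by hand. The sketch below does that with the standard library only. The helper name `add_haiku_rag_server` is hypothetical, and the path is passed in by the caller because the location of the Claude Desktop configuration file depends on your platform and is not covered here. The demo writes to a temporary file and includes a made-up `other-tool` entry.

```python
import json
import tempfile
from pathlib import Path


def add_haiku_rag_server(config_path: Path) -> dict:
    """Add (or overwrite) the haiku-rag entry, leaving other servers untouched."""
    config = json.loads(config_path.read_text()) if config_path.exists() else {}
    servers = config.setdefault("mcpServers", {})
    servers["haiku-rag"] = {
        "command": "haiku-rag",
        "args": ["serve", "--mcp", "--stdio"],
    }
    config_path.write_text(json.dumps(config, indent=2) + "\n")
    return config


# Demo against a throwaway file that already contains another server entry.
with tempfile.TemporaryDirectory() as tmp:
    demo_path = Path(tmp) / "config.json"
    demo_path.write_text(json.dumps({"mcpServers": {"other-tool": {"command": "other"}}}))
    merged = add_haiku_rag_server(demo_path)
    print(sorted(merged["mcpServers"]))
```

Back up the real file before running something like this against it; the sketch assumes the file, if present, contains valid JSON.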
## Examples
See the [examples directory](examples/) for working examples:
- **[Docker Setup](examples/docker/)** - Complete Docker deployment with file monitoring and MCP server
- **[Web Application](app/)** - Full-stack conversational RAG with CopilotKit frontend
## Documentation
Full documentation at: https://ggozad.github.io/haiku.rag/
- [Installation](https://ggozad.github.io/haiku.rag/installation/) - Provider setup
- [Architecture](https://ggozad.github.io/haiku.rag/architecture/) - System overview
- [Configuration](https://ggozad.github.io/haiku.rag/configuration/) - YAML configuration
- [CLI](https://ggozad.github.io/haiku.rag/cli/) - Command reference
- [Python API](https://ggozad.github.io/haiku.rag/python/) - Complete API docs
- [Skills](https://ggozad.github.io/haiku.rag/skills/) - The RAG and analysis skills the client wraps
- [Analysis](https://ggozad.github.io/haiku.rag/agents/analysis/) - Complex analytical tasks via code execution
- [Applications](https://ggozad.github.io/haiku.rag/apps/) - Chat TUI, web app, and inspector
- [Server](https://ggozad.github.io/haiku.rag/server/) - File monitoring and MCP
- [MCP](https://ggozad.github.io/haiku.rag/mcp/) - Model Context Protocol integration
- [Benchmarks](https://ggozad.github.io/haiku.rag/benchmarks/) - Performance benchmarks
- [Changelog](https://ggozad.github.io/haiku.rag/changelog/) - Version history
## License
This project is licensed under the [MIT License](LICENSE).
mcp-name: io.github.ggozad/haiku-rag