An open API service indexing awesome lists of open source software.

https://github.com/vectifyai/condb

ConDB: The KV-Cache Native Context Database
https://github.com/vectifyai/condb

agents ai context-database kv-cache llm long-context rag reasoning retrieval tree-search

Last synced: about 17 hours ago
JSON representation

ConDB: The KV-Cache Native Context Database

Awesome Lists containing this project

README

          

ConDB Banner


# ConDB: The KV-Cache Native Context Database

A new context database for reasoning-driven retrieval via tree search.

Fast, context-aware retrieval at scale with up to 70% less token cost.

---

## 🌲 What is ConDB?

**ConDB** (Context Database) is a tree-structured context database that uses LLM-powered **reasoning-based retrieval** via tree search instead of vector similarity — no vector DB, no chunking. It accepts [PageIndex](https://github.com/VectifyAI/PageIndex)-compatible document trees, [ChatIndex](https://github.com/VectifyAI/ChatIndex) conversation trees, filesystem trees, and custom hierarchical JSON — with no runtime dependency on either. The LLM reasons over the tree, like a human expert using a table of contents, to locate relevant content.

### Why not vector search?

- **Similarity ≠ relevance** — vector search retrieves what looks similar, not what is truly relevant. Similar-looking chunks may differ in intent (low accuracy), while truly relevant information may be expressed in very different language and get missed entirely (low recall). True relevance requires reasoning
- **Chunking breaks semantic continuity** — documents must be split into fixed-size segments to fit embedding models, causing context fragmentation that destroys their natural structure and cross-section relationships
- **Retrieval is blind to context** — embedding models encode the query alone, ignoring conversational history, user intent, and other contextual signals

ConDB replaces this with **reasoning-based tree search**: the LLM performs node-level relevance classification over a hierarchical index, incorporating full context — making retrieval adaptive, explainable, and traceable.

### What makes ConDB different

- **Fast tree search at scale** — reasoning-driven tree search with block partitioning and parallel processing, supporting complex, context-aware retrieval over large hierarchical structures
- **KV-cache native** — the first database designed around LLM KV-cache reuse. By caching intermediate results during tree search, ConDB reduces token usage by up to 70% with no loss in accuracy. The same efficiency gains extend to memory systems for long-context reasoning at scale
- **Unified long-context infrastructure** — a single system for both static and dynamic long-context workloads

### Static long context
Structured, persistent knowledge — documents (via [PageIndex](https://github.com/VectifyAI/PageIndex)), file systems, and codebases. Scalable retrieval within large, organized hierarchies.

### Dynamic long context
Evolving, runtime context — agent memory, long conversations (via [ChatIndex](https://github.com/VectifyAI/ChatIndex)), and autoresearch. Systems can continuously update, retrieve, and reason over newly generated information.

### Key capabilities

- **Hierarchical storage** — document trees, chat trees, and custom hierarchical JSON in SQLite
- **Multiple retrieval strategies** — beam search for small trees, block retrieval for large documents
- **Multi-provider LLM support** — Anthropic (Claude) and OpenAI (GPT) out of the box
- **Extensible** — plug in custom storage backends, LLM providers, or retrieval strategies

---

## 🚀 Getting Started

### Install

```bash
pip install -r requirements.txt
```

### Basic Usage

```python
import contextdb

# Open database
db = contextdb.open("my_docs.sqlite")

# Configure LLM
db.set_llm(provider="anthropic", model="claude-sonnet-4-6")

# Store a document tree
tree_id = db.store(document_tree_json, format="document")

# Query with LLM reasoning
result = db.query(tree_id, "What are the key findings?")
print(result.contents)
```

### Index from files with an external tree builder

```python
from contextdb import ContextTree

def build_markdown_tree(path: str) -> dict:
...

ct = ContextTree("context.sqlite")

tree_id = ct.index_markdown_file("doc.md", tree_builder=build_markdown_tree)

# You can also generate a tree out of process and call:
# tree_id = ct.index_document_tree(document_tree_json)

ct.close()
```

### Configuration

Create a `.env` file with your API keys:

```
ANTHROPIC_API_KEY=sk-...
OPENAI_API_KEY=sk-...
```

Model and provider settings live in `contextdb/config/config.yaml`:

```yaml
llm:
provider: anthropic # anthropic or openai
model: claude-sonnet-4-6 # any model the provider supports
context_limit: 100000
max_concurrent: 10

retriever:
beam_size: 3
max_turns: 5
```

Override at runtime with environment variables:

```bash
LLM_MODEL=claude-opus-4-6 python your_script.py
```

---

## 🔍 Retrieval Strategies

ConDB automatically selects the best retrieval strategy based on tree size:

| Strategy | Best for | How it works |
|----------|----------|--------------|
| **Beam** | Small trees
(< 50 nodes) | LLM evaluates and selects promising branches at each depth level |
| **Block** | Large documents
(50+ nodes) | Splits tree into token-bounded blocks, LLM reasons over each block. KV-cache native — caches intermediate block results to cut token usage by up to 70% |

You can also specify a strategy explicitly:

```python
result = db.query(tree_id, "question", strategy="block", beam_size=3)
```

---

## 📈 Benchmark Snapshot

Two benchmarks live under `bench/`.

### Filesystem mode — SWEBench-FileTree

Runs on [`AmuroEita/SWEBench-FileTree`](https://huggingface.co/datasets/AmuroEita/SWEBench-FileTree),
a path-only version of SWE-bench code retrieval:

- 500 GitHub issues as queries
- 475 `(repo, commit)` repository snapshots as independent retrieval universes
- 58,058 file paths; no source code, no file summaries

Given an issue and one snapshot's file tree, return the file(s) the fix
touches. Specification: `notes/condb_swebench_filetree_bench.md`.

```bash
export ANTHROPIC_API_KEY=sk-ant-...
python bench/run_swebench_filetree.py --tier medium
```

Tiers (by retriever difficulty; lower difficulty = more path signal in query):

```
easy 107 queries gold path appears in query text (sanity check)
medium 133 queries gold filename appears in query (main report)
hard 261 queries gold module stem appears (fuzzy matching)
all 500 queries no filter, includes ~48% path-signal-less queries
```

Output goes to `bench/runs/__/`: `report.md`, `summary.json`,
`per_query.jsonl`.

Block mode can optionally rerank only the cross-block merge candidates before
the file/directory split:

```bash
python bench/run_swebench_filetree.py --tier medium --strategy block --ranker vector
```

Available rankers are `none`, `bm25`, and `vector`. The vector ranker uses
LiteLLM embeddings (`--embedding-provider`, `--embedding-model`) and leaves
the default `ranker=none` unchanged.

#### Latest Run

Claude Sonnet 4.6, `--ranker none`, 500 queries, 0 failures. The retriever
returns the file set it deems relevant — no fixed top-K cutoff. Metrics
compare the returned set against the gold set:

- `precision = |returned ∩ gold| / |returned|`
- `recall = |returned ∩ gold| / |gold|`
- `f1`, `exact_match` (set equality), `MRR` (rank of first hit)

Block (ConDB) is compared against a **Vertical baseline** — a per-beam
variant that expands each parent's children into separate subtree blocks
(`A→B`, `A→C`), one LLM call per branch.

| variant | precision | recall | F1 | exact_match | MRR | avg returned | avg latency |
|---|---:|---:|---:|---:|---:|---:|---:|
| Vertical (baseline) | 0.262 | 0.560 | 0.319 | 0.130 | 0.466 | 3.00 | ~24 s |
| **Block (ConDB)** | **0.410** | **0.903** | **0.534** | 0.106 | **0.849** | 2.86 | ~8 s |

Block lifts recall from 0.56 to **0.90** at ~3× lower latency. Both runs
return ~3 candidates per query against an `avg_gold = 1.24` — explaining
the low `exact_match`: the retrievers tend to return one extra plausible
file alongside the actual gold.

Block per-gold-count breakdown:

| gold files | queries | precision | recall | F1 | exact_match | avg returned |
|---|---:|---:|---:|---:|---:|---:|
| 1 | 430 | 0.399 | 0.951 | 0.537 | 0.112 | 2.82 |
| 2 | 48 | 0.469 | 0.677 | 0.544 | 0.062 | 3.12 |
| 3 | 13 | 0.477 | 0.487 | 0.481 | 0.154 | 3.00 |
| 4 | 6 | 0.581 | 0.500 | 0.532 | 0.000 | 3.50 |
| 5 | 1 | 0.333 | 0.200 | 0.250 | 0.000 | 3.00 |
| 6+ | 2 | 0.500 | 0.190 | 0.264 | 0.000 | 3.00 |

Reproduce:

```bash
python bench/run_swebench_filetree.py --tier all --strategy block --ranker none
python bench/run_swebench_filetree.py --tier all --strategy vertical --ranker none
```

### Document mode — single long document

Compares retriever algorithms (Block / Beam / Vertical / ...) on one
hierarchical document. Reports time, LLM calls, token usage with prompt
caching, and USD cost.

```bash
python bench/run_document_bench.py \
--doc examples/large_doc.json \
--config bench/queries.json
```

Queries live in the config JSON as `{"queries": ["...", "..."]}`. Swap in
any `--doc` and any `--config` to benchmark a different document.

---

## 🧩 Learn More

### Architecture

```
contextdb/
├── api/
│ ├── condb.py # ConDB — main entry point
│ └── context_tree.py # ContextTree — tree indexing + query API
├── core/
│ └── storage.py # TreeDB (SQLite), StorageProtocol
├── adapter/
│ └── base.py # DocumentTree, ChatIndex, Generic adapters
├── retriever/
│ ├── base.py # Retriever protocols
│ └── algorithm/ # Beam, Block retrieval strategies
├── llm.py # LLMClient (Anthropic, OpenAI)
├── config/ # YAML configs for retrievers
└── prompts/ # Jinja2 prompt templates
```

### Extending

**Custom Storage Backend**

```python
from contextdb import StorageProtocol

class MyStorage:
def get_node(self, tree_id, node_id): ...
def get_children(self, tree_id, node_id): ...
# implement StorageProtocol methods

ct = ContextTree(storage=MyStorage())
```

**Custom LLM Provider**

```python
from contextdb import LLMProtocol

class MyLLM:
def chat(self, messages, system="", tools=None):
return {"content": [...], "stop_reason": "..."}

ct = ContextTree("db.sqlite", llm=MyLLM())
```

### Testing

```bash
./run_tests.sh all
```

---

## 💬 Community

### Related Projects

- [**PageIndex**](https://github.com/VectifyAI/PageIndex) — vectorless, reasoning-based RAG that builds hierarchical tree indexes from long documents
- [**ChatIndex**](https://github.com/VectifyAI/ChatIndex) — tree indexing for long conversations, enabling reasoning-based retrieval over chat histories
- [**AgentFS**](https://github.com/anthropics/agentfs) — filesystem for AI agents

### Connect with Us

[![Twitter](https://img.shields.io/badge/Twitter-000000?style=for-the-badge&logo=x&logoColor=white)](https://x.com/PageIndexAI) 
[![LinkedIn](https://img.shields.io/badge/LinkedIn-0077B5?style=for-the-badge&logo=linkedin&logoColor=white)](https://www.linkedin.com/company/vectify-ai/) 
[![Discord](https://img.shields.io/badge/Discord-5865F2?style=for-the-badge&logo=discord&logoColor=white)](https://discord.com/invite/VuXuf29EUj) 
[![Contact Us](https://img.shields.io/badge/Contact_Us-3B82F6?style=for-the-badge&logo=envelope&logoColor=white)](https://ii2abc2jejf.typeform.com/to/tK3AXl8T)

---

Licensed under [Apache 2.0](LICENSE).

© 2026 [Vectify AI](https://vectify.ai)