https://github.com/harf-promo/indexa

The local context engine for AI. Indexa reads your code or your disk once, on your machine, and serves any AI tool — Claude Code, Cursor, or a local Ollama model — the relevant slice on demand. Saves cloud tokens; lets local models punch above their context window. Local-first, MCP-native. Apache-2.0.

claude-code context-engine developer-tools embeddings local-first mcp ollama rag retrieval rust

Host: GitHub
URL: https://github.com/harf-promo/indexa
Owner: harf-promo
License: other
Created: 2026-05-27T18:24:23.000Z (27 days ago)
Default Branch: main
Last Pushed: 2026-06-13T19:06:09.000Z (10 days ago)
Last Synced: 2026-06-13T20:00:28.182Z (10 days ago)
Topics: claude-code, context-engine, developer-tools, embeddings, local-first, mcp, ollama, rag, retrieval, rust
Language: Rust
Size: 1.96 MB
Stars: 1
Watchers: 0
Forks: 0
Open Issues: 1
Metadata Files:
- Readme: README.md
- Changelog: CHANGELOG.md
- Contributing: CONTRIBUTING.md
- Funding: .github/FUNDING.yml
- License: LICENSE
- Code of conduct: CODE_OF_CONDUCT.md
- Security: SECURITY.md
- Roadmap: ROADMAP.md
- Notice: NOTICE

README

# Indexa

**The local context engine for AI.**

Your AI meets your codebase cold on every session — burning paid tokens to relearn what it knew yesterday, or choking on a context window that can't hold your repo at all. Indexa reads your code or your entire disk **once, on your machine**, builds a persistent hierarchical context store, and serves any AI tool the precise, ranked slice it needs — instantly, every time.

```bash
indexa index ~/code/my-repo # scan + embed + summarize in one command
indexa ask "where is auth handled?" # grounded answer with sources
indexa export ~/code/my-repo --format xml > .context.xml
claude "given @.context.xml, find the auth flow and add MFA"
# the model spends its budget on the work you asked for — not on re-reading your tree
```

*The index is the substrate; context is the product. Local-first · model-agnostic · Apache-2.0.*

> Indexa is production-ready for daily use on one or more repos. Whole-disk indexing is fast, but the storage format may still change until it stabilises ahead of 1.0. New here? Start with the **[Usage Guide](USAGE.md)** or **[Quickstart](docs/quickstart.md)**. Something broken? **[Troubleshooting](docs/TROUBLESHOOTING.md)**.

---

## See it work

Four commands. One context. Every session.

```console
$ indexa index ~/code/my-repo
Scanning ~/code/my-repo
1,284 entries indexed.
Embedded 6,470 new chunks.
318 summaries generated.
Context is ready.

$ indexa ask "where is auth handled?"
Searching 6,470 indexed chunks...

Answer:
Authentication is handled in src/auth/middleware.rs (the `require_auth` route
guard) and src/auth/login.rs (the `login` entry point). Session tokens are
minted and validated in src/auth/session.rs. [1, 2, 3]

Sources:
[1] src/auth/middleware.rs — require_auth
[2] src/auth/login.rs — login
[3] src/auth/session.rs — mint_token
```

Then hand the context to your AI tool — or let your agent pull it live over MCP:

```bash
indexa export ~/code/my-repo --format xml > .context.xml # the artifact, built on your machine
indexa serve # open the web workspace at :7620
indexa mcp # expose the live index to any agent
```

---

## Why Indexa

**Stop paying to re-teach your AI your own codebase.** Every coding assistant wakes up amnesiac. Before it helps, it reads its way back to orientation — burning context window, paid tokens, and your patience on a lesson it learned five minutes ago. Indexa teaches it *once*: it builds a persistent, hierarchical context store on your machine and serves a small ranked slice on demand, so the model spends its budget on the work you asked for.

> **What that's worth — a worked example** *(illustrative, ~4 chars/token — run `indexa status`
> after a few sessions to see your own measured usage and estimated savings)*: an agent orienting itself in a medium repo
> typically reads ~40 files at ~2,000 tokens each ≈ **80K tokens — every session**. The same
> orientation through Indexa: one `search` (~300 tokens) + ten L0 one-line abstracts (~30 each) +
> the two files it actually needs in full (~4K) ≈ **5K tokens**. That's roughly **94% less**,
> before the session's real work starts — and the index was built locally, for zero tokens.
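
The arithmetic in the worked example can be reproduced with a few lines of Rust. This is only a sketch of the illustrative figures quoted above; the function names are hypothetical and are not part of Indexa. Note that the exact sum for the indexed path is 4,600 tokens, which the example rounds to 5K.

```rust
/// Cold start: the agent reads every file in full.
pub fn cold_start_tokens(files: u32, tokens_per_file: u32) -> u32 {
    files * tokens_per_file
}

/// Indexed: one search call, a handful of L0 abstracts,
/// and the tokens for the few files that are read in full.
pub fn indexed_tokens(
    search: u32,
    abstracts: u32,
    tokens_per_abstract: u32,
    full_file_tokens: u32,
) -> u32 {
    search + abstracts * tokens_per_abstract + full_file_tokens
}

/// Percentage of tokens saved by going from `before` to `after`.
pub fn percent_saved(before: u32, after: u32) -> f64 {
    100.0 * (1.0 - after as f64 / before as f64)
}
```

With the numbers from the example, `cold_start_tokens(40, 2_000)` is 80,000 and `indexed_tokens(300, 10, 30, 4_000)` is 4,600, so `percent_saved` comes out at about 94%, matching the figure quoted.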

**There are two kinds of context, and almost everyone conflates them.** *Working context* is what's in the model's window right now — scarce, paid, gone when the session ends. *Searchable context* is everything your AI could know: the store on disk. Indexa separates them. The model never holds your repo; it holds the ~2–4K characters that actually matter, retrieved from a store that can be gigabytes.
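
One way to picture the split between the two kinds of context is a budgeted packer: the store can be arbitrarily large, but only the best-ranked chunks that fit a fixed character budget reach the model. The sketch below is a generic greedy packer written for illustration. `pack_slice` is a hypothetical name, and Indexa's actual selection logic may differ.

```rust
/// Greedily take ranked chunks (best first) until the character budget is spent.
/// Chunks that do not fit are skipped, so a smaller, lower-ranked chunk can
/// still use the remaining space.
pub fn pack_slice<'a>(ranked: &[&'a str], budget_chars: usize) -> Vec<&'a str> {
    let mut used = 0;
    let mut slice = Vec::new();
    for &chunk in ranked {
        let len = chunk.chars().count();
        if used + len > budget_chars {
            continue;
        }
        used += len;
        slice.push(chunk);
    }
    slice
}
```

However large the ranked list grows, the returned slice never exceeds `budget_chars`, which is the property the paragraph above describes.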

**Local isn't the compromise — it's the unlock.** Your data never leaves the machine unless *you* point it at a cloud model, and zero tokens leave while Indexa builds context. A small local model stops being small: feed it a retrieved slice instead of the whole project and a 4K-window model reasons over a 100 MB codebase — fast, even on CPU, with a KV-cache sized by your choice, not your repo. One engine, two wins: it saves cloud tools their tokens **and** gives local models the context they can't hold.

*(How retrieval keeps that slice relevant — hybrid search, the ANN index, the honest trade-offs — is in [docs/methodology.md](docs/methodology.md).)*

---

## The only tool of its kind

**No other tool does this.** Indexa is the first local context engine that builds a persistent, hierarchically-summarised, locally-queryable index over your entire disk — and exposes it simultaneously through a CLI, a live web workspace, a native desktop app, and an MCP server that any AI agent can call in real time.

This is not a repo-to-prompt converter. It is not a document chat app. It is not an IDE extension. It is a persistent, on-device intelligence layer that any AI tool can reach — without re-reading your files, without sending your data to a server, and without starting from zero every session.

**Indexa ends the cold-start problem, permanently.**

---

## What Indexa does

- **One-command context** — `indexa index` scans, embeds, and summarises in a single step. Context is ready immediately; no pipeline to manage.
- **Two-phase context** — an instant surface scan (zero AI, classifies code vs media vs build artifacts) then deep context: parse → chunk → embed → LLM file summaries rolled up bottom-up into a hierarchical context graph, with **L0 / L1 / L2 progressive disclosure** (one-line abstract → full summary → raw content).
- **Hybrid retrieval** — keyword (BM25) + semantic (vector) fused with RRF, plus an **opt-in ANN index** that keeps dense search fast on 50K-plus-chunk corpora.
- **Local multimodal** *(opt-in, on-device)* — caption images with a local vision model and transcribe audio with a local whisper CLI, so you can find media by what's *in* it, not just its filename.
- **Code intelligence** — a code-relationship graph (imports + defined symbols + call edges) across Rust, Python, JS/TS, Go, and Java, plus `who_calls`, `blast_radius`, and an interactive **call-graph visualization** in the web Map tab — all queryable over MCP.
- **Four ways to reach the index** — a CLI, a live web workspace with an Engine status bar, a native macOS desktop app (menu-bar, auto-updating), and an MCP server (42 tools) that AI agents call directly.
- **Resource-aware** — a memory watchdog that won't freeze your machine, and a hardware-aware model picker that annotates every model with its memory footprint, fit against your live RAM, and a per-job ETA.
- **Use your Claude subscription** — the `claude-code` provider runs summaries and answers on your Claude Pro/Max plan (no per-token billing); embeddings always stay local.
- **Export** — XML (Anthropic's prompting docs recommend XML tags for structuring long context), Markdown, or JSON. **Watch** keeps the context current as files change.
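
For readers unfamiliar with RRF (Reciprocal Rank Fusion), named in the hybrid-retrieval bullet above, here is a minimal sketch of the standard technique: each document scores the sum of `1 / (k + rank)` across the input rankings. This is not Indexa's code. The function name is hypothetical, and `k = 60` is the constant from the original RRF paper, assumed here rather than taken from the project.

```rust
use std::collections::HashMap;

/// Reciprocal Rank Fusion over any number of ranked id lists.
/// score(d) = sum over rankings of 1 / (k + rank(d)), with ranks starting at 1.
pub fn rrf_fuse(rankings: &[Vec<&str>], k: f64) -> Vec<(String, f64)> {
    let mut scores: HashMap<String, f64> = HashMap::new();
    for ranking in rankings {
        for (i, id) in ranking.iter().enumerate() {
            let rank = i as f64 + 1.0;
            *scores.entry((*id).to_string()).or_insert(0.0) += 1.0 / (k + rank);
        }
    }
    let mut fused: Vec<(String, f64)> = scores.into_iter().collect();
    // Highest score first; ties are broken by id so the output is deterministic.
    fused.sort_by(|a, b| b.1.total_cmp(&a.1).then_with(|| a.0.cmp(&b.0)));
    fused
}
```

A document that ranks well in both the keyword and the vector list ends up ahead of one that ranks first in only one of them. RRF needs no score normalisation between the two retrievers, since it uses only ranks.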

---

## Four ways to use it

- **CLI** — `index · ask · export · scan · deep · summarize · watch · serve · mcp · doctor · classify · completion · update`. Scriptable, pipeable, zero services required (`indexa completion` emits tab-completions).
- **Web workspace** — `indexa serve` → `http://localhost:7620`. Responsive (collapses to a drawer on a phone), keyboard-navigable, and honest about impact — a **token-savings panel** shows what retrieval saved your AI tools this week. A live Engine status bar shows what the machine is doing while it builds:
```
Engine Building · 42 files/s · ETA 1m12s · gemma3:4b CPU 38% RAM 9.1 / 16 GB pressure: ok
```
- **Desktop app** — a native macOS app that lives in the menu bar. Auto-updates silently. Bundles the web workspace — no separate `indexa serve` needed.
- **MCP server** — `indexa mcp` exposes the live index to any [MCP](https://modelcontextprotocol.io) client (Claude Desktop, Cursor, Claude Code) over **42 tools** — retrieval (`search · browse_tree · get_summary · read_file · get_chunk_context · ask`), code graph (`dependencies · who_imports · who_calls · blast_radius · code_graph`), Context Packs, importance weights, smart classification, insights, decision review, config introspection, and indexing triggers.
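
MCP clients invoke tools with a JSON-RPC 2.0 request whose method is `tools/call` and whose params carry the tool `name` and an `arguments` object. The sketch below builds such a request for the `search` tool using only the standard library. The `query` argument name is an assumption made for illustration, as is the helper itself. Check the tool's published input schema (for example via the MCP `tools/list` method) for the real parameter names.

```rust
/// Escape a string for embedding inside a JSON string literal.
fn json_escape(s: &str) -> String {
    let mut out = String::with_capacity(s.len());
    for c in s.chars() {
        match c {
            '"' => out.push_str("\\\""),
            '\\' => out.push_str("\\\\"),
            '\n' => out.push_str("\\n"),
            '\r' => out.push_str("\\r"),
            '\t' => out.push_str("\\t"),
            c if (c as u32) < 0x20 => out.push_str(&format!("\\u{:04x}", c as u32)),
            c => out.push(c),
        }
    }
    out
}

/// Build an MCP `tools/call` request.
/// ASSUMPTION: the tool takes a single string argument named `query`.
pub fn tools_call_request(id: u64, tool: &str, query: &str) -> String {
    format!(
        r#"{{"jsonrpc":"2.0","id":{},"method":"tools/call","params":{{"name":"{}","arguments":{{"query":"{}"}}}}}}"#,
        id,
        json_escape(tool),
        json_escape(query)
    )
}
```

In practice an MCP client library handles this framing. The point of the sketch is only to show how small the request is compared with the files an agent would otherwise read.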

---

## Code intelligence

Deep-indexing records each code file's graph edges — what it **imports**, the symbols it **defines**, and what it **calls** — across Rust, Python, JS/TS, Go, and Java. Query it over MCP, so your agent reasons about structure without reading every file:

```text
dependencies("src/auth/session.rs")
→ imports: crate::store::Db
→ defines: Session, mint_token, validate

who_imports("crate::store::Db")
→ src/auth/session.rs

blast_radius("mint_token")
→ src/auth/session.rs, src/auth/login.rs, src/auth/middleware.rs
```

Or **see** the whole call graph: `indexa graph` on the CLI, or the **Graph** sub-view in the
web Map tab, an interactive force-directed view of which files call into which.
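
A query like `blast_radius` can be thought of as a walk over reversed call edges: start at a symbol and collect everything that calls it, directly or transitively. The sketch below shows that idea with a breadth-first search. It is an illustration under assumptions, not Indexa's implementation. It works on symbol names, whereas the example output above lists files, and the `callers` map is a hypothetical input.

```rust
use std::collections::{BTreeSet, HashMap, VecDeque};

/// Return every symbol that directly or transitively calls `start`.
/// `callers` maps a symbol to the symbols that call it directly.
pub fn blast_radius<'a>(
    callers: &HashMap<&'a str, Vec<&'a str>>,
    start: &'a str,
) -> BTreeSet<&'a str> {
    let mut seen: BTreeSet<&'a str> = BTreeSet::new();
    let mut queue: VecDeque<&'a str> = VecDeque::from([start]);
    while let Some(sym) = queue.pop_front() {
        // Symbols with no recorded callers simply contribute nothing.
        for &caller in callers.get(sym).into_iter().flatten() {
            // `insert` returns false for already-visited symbols, which
            // keeps the walk finite even when the call graph has cycles.
            if seen.insert(caller) {
                queue.push_back(caller);
            }
        }
    }
    seen
}
```

Mapping the resulting symbols back to the files that define them would give a file-level answer like the one shown in the example.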

---

## Install

Download a pre-built binary from [Releases](../../releases):

```bash
# macOS (Apple Silicon); prefix the commands with sudo if /usr/local/bin is not writable
curl -fL -o /usr/local/bin/indexa \
  https://github.com/harf-promo/indexa/releases/latest/download/indexa-aarch64-apple-darwin
chmod +x /usr/local/bin/indexa
xattr -d com.apple.quarantine /usr/local/bin/indexa # bypass Gatekeeper if prompted

# macOS (Intel): indexa-x86_64-apple-darwin
# Linux x86_64: indexa-x86_64-linux-gnu · Linux arm64: indexa-aarch64-linux-gnu
# Windows x64: indexa-x86_64-windows.exe
```

Pull the local models (one-time, ~11 GB total; everything runs offline after this):

```bash
ollama pull nomic-embed-text # embeddings (~270 MB)
ollama pull gemma3:4b # file summaries (~2.5 GB)
ollama pull gemma3:12b # answers + directory roll-ups (~8 GB)
```

Or build from source (Rust ≥ 1.82): `git clone … && cargo build --release` → `target/release/indexa`.

---

## Bring your own model

No model is bundled — Indexa works with whatever you run. Configure in `~/.indexa/config.toml`:

| Adapter | How it runs |
|---|---|
| **Ollama** | Local, fully offline (default). Point elsewhere with `OLLAMA_HOST`. |
| **Claude subscription** | `provider = "claude-code"` — synthesis on your Claude Pro/Max plan, no per-token billing. Embeddings stay local. |
| **llama.cpp** | Local via its HTTP server. |
| **Google Gemini · OpenAI · Anthropic** | Cloud — data leaves your device; API key required. |

Defaults: `nomic-embed-text` (embeddings) · `gemma3:4b` (file context) · `gemma3:12b` (answers + roll-ups). Optional cross-encoder reranking *fails open* — a model hiccup falls back to the original order, so it can never make `ask` worse.
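
The fail-open behaviour described above can be sketched as a wrapper that applies the reranker's ordering only when the call succeeds and the result is a valid permutation. In every other case it returns the hits in their original order. This is a minimal illustration of the pattern with hypothetical names, not the project's code.

```rust
/// Reorder `hits` using the index order returned by `rerank`.
/// On any error, or if the result is not a permutation of `0..hits.len()`,
/// fall back to the original order.
pub fn rerank_fail_open<T: Clone, E>(
    hits: &[T],
    rerank: impl FnOnce(&[T]) -> Result<Vec<usize>, E>,
) -> Vec<T> {
    match rerank(hits) {
        Ok(order) if is_permutation(&order, hits.len()) => {
            order.into_iter().map(|i| hits[i].clone()).collect()
        }
        _ => hits.to_vec(),
    }
}

/// True if `order` contains each index in `0..n` exactly once.
fn is_permutation(order: &[usize], n: usize) -> bool {
    let mut seen = vec![false; n];
    order.len() == n
        && order
            .iter()
            .all(|&i| i < n && !std::mem::replace(&mut seen[i], true))
}
```

Validating the permutation matters as much as catching the error: a reranker that silently drops or duplicates results would otherwise degrade the answer without ever failing.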

---

## What's coming

- **Mobile companion** — browse your index from a phone on the same network. The read-only API is already LAN-ready (`indexa serve --host 0.0.0.0`); a native client is next.
- **Plugin marketplace** — the parser SDK is shipped (`indexa_parsers::Registry`); next is discovery and distribution of third-party parsers.

Recently shipped: **agentic `ask`** (multi-hop plan→search→refine), **PageRank centrality** (hub files in the code graph), **signature graph visualization**, **universal macOS desktop build**, **Importance weighting**, **Insights** (duplicates / stale / weekly diff), **video captioning**, **Plugin SDK**, **LAN serve**.

Ideas and votes in [Discussions](../../discussions/categories/ideas). Full detail in [ROADMAP.md](ROADMAP.md).

---

## Contributing

Indexa is built in the open and welcomes contributors of every level. Read [CONTRIBUTING.md](CONTRIBUTING.md), browse [`good first issue`](../../issues?q=label%3A%22good+first+issue%22), and join [Discussions](../../discussions). Commits sign off with the [DCO](https://developercertificate.org/) (`git commit -s`); no CLA.

## License

Apache-2.0. See [LICENSE](LICENSE) and [NOTICE](NOTICE).
