https://github.com/0xsyncroot/codewiki

Code intelligence over a tree-sitter knowledge graph — MCP server + CLI + interactive graph UI for AI agents. 18 languages, 16 frameworks, i18n search. Rust.

# CodeWiki

**Local code knowledge graph for AI agents. Tree-sitter parsed, SQLite stored, MCP served.**

[![CI](https://github.com/0xsyncroot/codewiki/actions/workflows/ci.yml/badge.svg)](https://github.com/0xsyncroot/codewiki/actions/workflows/ci.yml)
[![License: MIT](https://img.shields.io/badge/license-MIT-blue.svg)](LICENSE)
[![Version](https://img.shields.io/badge/version-0.2.1-green.svg)](CHANGELOG.md)
[![Rust](https://img.shields.io/badge/rust-stable-orange.svg)](https://www.rust-lang.org)

Instead of reading whole source files, your AI agent asks the graph:

```
codewiki_callers("OrderService")    → 3 ms, exact answer
codewiki_impact("AuthService")      → 6 ms, full blast radius
codewiki_context("basket checkout") → 4 ms, ranked entry points + key code
```

Measured across **11 real-world repos and 157 tasks** (Python, Rust, TypeScript,
JavaScript, C#, Java, C++, Go — from 83-file libraries up to the 2k-file jellyfin .NET
server, the 17k-file kubernetes, and the 39k-file TypeScript compiler), CodeWiki uses
**~74% fewer tool-calls and ~98% fewer tokens** than grepping and reading files, at
**near-identical recall (0.88 vs 0.92)**. 100% local, single static binary — no cloud,
no telemetry, no API keys.

---

## Table of contents

- [What it is](#what-it-is)
- [Why it pays off](#why-it-pays-off)
- [Install](#install)
- [Quick start](#quick-start)
- [MCP tools](#mcp-tools)
- [Editor and agent support](#editor-and-agent-support)
- [CLI commands](#cli-commands)
- [Graph UI](#graph-ui)
- [Maintenance and incremental sync](#maintenance-and-incremental-sync)
- [Performance](#performance)
- [Languages and frameworks](#languages-and-frameworks)
- [Enterprise](#enterprise)
- [Architecture](#architecture)
- [Roadmap](#roadmap)
- [Contributing](#contributing)
- [License](#license)
- [Credits](#credits)

---

## What it is

CodeWiki parses your codebase with [tree-sitter](https://tree-sitter.github.io/tree-sitter/)
and stores the result in a local SQLite knowledge graph: nodes (functions, classes,
interfaces, enums, structs, routes, components) and typed edges (calls, imports,
inherits, implements, route-to-handler, DI bindings).
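
As a rough illustration of the node-and-typed-edge model described above, the sketch below builds a tiny in-memory SQLite graph and answers a "who calls X" question with a single join. The table names, column names, and sample symbols are assumptions made for this example; they are not CodeWiki's actual schema.

```python
import sqlite3

# Hypothetical two-table layout: NOT CodeWiki's real schema.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE nodes (id INTEGER PRIMARY KEY, kind TEXT, name TEXT);
    CREATE TABLE edges (src INTEGER, dst INTEGER, kind TEXT);
""")
conn.executemany(
    "INSERT INTO nodes (id, kind, name) VALUES (?, ?, ?)",
    [
        (1, "class", "OrderService"),
        (2, "function", "checkout"),
        (3, "function", "refund"),
        (4, "interface", "IRepository"),
        (5, "class", "EfRepository"),
    ],
)
conn.executemany(
    "INSERT INTO edges (src, dst, kind) VALUES (?, ?, ?)",
    [
        (2, 1, "calls"),       # checkout -> OrderService
        (3, 1, "calls"),       # refund -> OrderService
        (5, 4, "implements"),  # EfRepository -> IRepository
    ],
)

def callers(name):
    """Return the names of nodes with a 'calls' edge into `name`, sorted."""
    rows = conn.execute(
        """
        SELECT caller.name
        FROM edges
        JOIN nodes AS caller ON caller.id = edges.src
        JOIN nodes AS callee ON callee.id = edges.dst
        WHERE edges.kind = 'calls' AND callee.name = ?
        ORDER BY caller.name
        """,
        (name,),
    ).fetchall()
    return [row[0] for row in rows]

print(callers("OrderService"))
```

Because edges carry a kind, the same two tables can serve call, import, and inheritance questions by changing only the filter in the `WHERE` clause.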

The graph is exposed to AI coding agents via a
[Model Context Protocol](https://modelcontextprotocol.io) (MCP) server —
9 `codewiki_*` tools that Claude Code, Cursor, Codex, and other MCP-capable agents call
over stdio JSON-RPC. Agents get sub-millisecond structural answers instead of reading
files, and the index is built once then maintained automatically (see
[*Maintenance*](#maintenance-and-incremental-sync)).

Key properties:

- **18 languages** — C, C++, C#, Dart, Go, Java, JavaScript, Kotlin, Lua, Pascal, PHP,
Python, Ruby, Rust, Scala, Swift, TypeScript, and Vue
(plus additional component/template grammars; see
[*Languages and frameworks*](#languages-and-frameworks)).
- **16 framework resolvers** — Angular, ASP.NET / Razor, Django, Express, FastAPI, Flask,
NestJS, Rails, Laravel, React, Vue, Svelte, Spring, Gin (Go web), Vapor (Swift web),
and Cargo workspaces.
- **FTS5 full-text search** — Unicode-aware BM25 ranking with hybrid graph-path scoring.
- **Fully incremental** — a 1-file change re-syncs in tens of milliseconds; the rest of
the graph is untouched (see [*Maintenance*](#maintenance-and-incremental-sync)).
- **Docstring extraction** — Python, Rust, Go, TypeScript/JavaScript, C#.
- **100% local** — grammars are bundled in the binary; no network access after install.

---

## Why it pays off

An MCP-driven agent-savings harness runs **6 task archetypes** (locate, callers, callees,
feature, impl, blast) over **11 real repos across four size tiers** — **157 cases** — and
compares CodeWiki to a **grep + read-full-files baseline**, the surface a tool-less agent
actually uses. Every answer is scored for recall against a frozen ground-truth oracle (the
baseline is scored on the *identical* oracle). Tokens = output bytes ÷ 4 (conservative).
Pricing: Claude Sonnet $3.00 / 1M input tokens. Fresh cold run on `codewiki 0.2.0`. Full
methodology, recall caveat, and raw data: [`benchmark/README.md`](benchmark/README.md)
(source: [`results-savings.tsv`](benchmark/results-savings.tsv)).

| Language | Repo | Files | Tool-call reduction | Token reduction |
|----------|------|------:|:-------------------:|:---------------:|
| C++ | nlohmann/json | 499 | 69% | **98%** |
| Go | gin-gonic/gin | 99 | 75% | **96%** |
| TypeScript | colinhacks/zod | 408 | 62% | **99%** |
| Rust | BurntSushi/ripgrep | 101 | 75% | **98%** |
| Python | pallets/flask | 83 | 65% | **95%** |
| Java | google/gson | 262 | 77% | **95%** |
| JavaScript | expressjs/express | 141 | 60% | **95%** |
| C# | dotnet-architecture/eShopOnWeb | 269 | 78% | **79%** |
| **Overall** | **157 cases / 11 repos** | — | **74%** | **98%** |

Across all 157 cases (155 scored) CodeWiki uses **74% fewer tool-calls and 98% fewer
tokens** than grep+read, saving **~$16.98** of input tokens over the run. The token win is
so large because the baseline reads whole candidate files while CodeWiki returns a small,
ranked, structurally-resolved slice. (C#'s 79% on eShopOnWeb is a small-repo artefact: its
files are already cheap to read, so there is little to save — on the large .NET repo
jellyfin the same six archetypes hit **98%**; recall stays on par, 0.87 vs 0.94.)

**The saving grows with repo size.** CodeWiki's per-query answer stays bounded
(~0.6–1.4k tokens) no matter how big the codebase, while the grep+read baseline explodes:

| Size tier | Repos | Avg CW tokens/query | Avg baseline tokens/query | Token reduction |
|-----------|-------|--------------------:|--------------------------:|:---------------:|
| Small (<300 files) | flask, express, gin, eShopOnWeb, ripgrep | 601 | 16,142 | **96%** |
| Medium (300–700) | zod, gson, json | 935 | 51,079 | **98%** |
| Large .NET (~2k .cs) | jellyfin | 1,442 | 69,365 | **98%** |
| Enterprise (>15k) | kubernetes (17k), microsoft/TypeScript (39k) | 865 | **108,573** | **99%** |

On a 17k–39k-file monorepo a single scoped grep plus the file reads it forces costs
**~109k tokens per query**; CodeWiki answers the same task in ~900. The biggest codebases
— where agent token budgets hurt most — are where CodeWiki saves the most.

**Honest recall note.** CodeWiki now **nearly matches** the baseline's completeness while
keeping the token/call win: mean recall is **0.88 for CodeWiki vs 0.92 for the grep+read
baseline** (0.877 vs 0.921 over the 155 scored cases). The oracle is not rigged for
CodeWiki — the baseline still *edges* recall, and CodeWiki delivers *near-baseline answers
at ~2% of the tokens*. CodeWiki *wins* recall on 30 of the 155 scored cases (structural
`callees`/`impl`/`blast` tasks), loses on 29 (almost all `feature`-comprehension queries —
the residual gap is concentrated there), and ties on the rest. Two cases (C# `EfRepository`
blast, Go `Reconciler` impl) are kept **UNSCORABLE** on purpose — we will not score
CodeWiki against an oracle derived from its own newly-synthesized edges. The suite keeps
hard and expected-to-lose cases (overloaded names, deep `feature` tasks, high-fanout blast
radius) — no cherry-picking. Full methodology, weakness analysis, per-case breakdown, and
the disclosure of what changed since the prior report:
[`benchmark/README.md`](benchmark/README.md).
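
For readers unfamiliar with the metric, here is a minimal sketch of recall under the usual set-based definition: the fraction of oracle items that the answer contains. The benchmark's exact scoring rules live in `benchmark/README.md` and may differ in detail; the symbol names below are invented for illustration.

```python
def recall(answer, oracle):
    """Fraction of oracle items that appear in the answer (set-based)."""
    expected = set(oracle)
    if not expected:
        raise ValueError("oracle must not be empty")
    return len(expected & set(answer)) / len(expected)

# Hypothetical oracle and tool answer for a feature-comprehension task.
oracle = {"BasketService", "BasketController", "CheckoutModel", "OrderService"}
tool_answer = {"BasketService", "CheckoutModel", "OrderService", "Unrelated"}

print(recall(tool_answer, oracle))  # 3 of the 4 oracle symbols were found
```

Note that extra items in the answer (`Unrelated` here) do not lower recall; they only cost tokens, which is the other axis the benchmark measures.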

The savings compound: the index is built once and maintained automatically at a per-edit
cost in milliseconds (see [*Maintenance*](#maintenance-and-incremental-sync)), so every
subsequent query is effectively free.

### Worked example — 5 .NET tasks

> **Different baseline.** This per-task trace uses a more economical
> **grep-then-read-*minimal*-files** baseline (read only the files a task strictly needs),
> not the read-*whole*-files baseline of the cross-language table above. That is why the
> token reduction here (~83%) is lower than the headline 98%. Both are honest; they
> simply measure against different baselines. Full trace:
> [`benchmark/README.md`](benchmark/README.md) (§4 .NET worked example).

Measured across 5 realistic tasks on eShopOnWeb (254 .cs) and jellyfin (2,065 .cs), freshly
re-run on `codewiki 0.2.0`. Byte counts from real CLI output and file sizes; tokens =
bytes ÷ 4 (conservative). Pricing: Claude Sonnet $3.00 / 1M input tokens.

| Task | CW calls | Baseline calls | Call reduction | CW tokens | Baseline tokens | Token reduction | $ saved |
|------|:--------:|:--------------:|:--------------:|:---------:|:---------------:|:--------------:|:-------:|
| DI consumers (`IBasketService`) | 2 | 6 | **67%** | 380 | 3,476 | **89%** | $0.009 |
| Feature comprehension (basket checkout) | 1 | 6 | **83%** | 1,017 | 6,842 | **85%** | $0.018 |
| Interface→impls (`IRepository`) | 2 | 6 | **67%** | 624 | 4,219 | **85%** | $0.011 |
| Blast radius (`OrderService` refactor) | 1 | 5 | **80%** | 246 | 1,959 | **87%** | $0.005 |
| Cross-cutting: auth config (jellyfin) | 2 | 4 | **50%** | 2,034 | 8,158 | **75%** | $0.018 |
| **Average** | **1.6** | **5.4** | **70%** | **860** | **4,930** | **83%** | **$0.012** |

Interface→impl queries traverse real `implements` edges — `codewiki impact IRepository`
returns the concrete `EfRepository` implementation directly (Task 3).
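
The percentage columns in the table above follow from simple arithmetic on the token counts. The sketch below reproduces it from the stated rules (tokens = bytes ÷ 4, $3.00 per 1M input tokens). The function names are illustrative, and rounding byte counts down to whole tokens is an assumption.

```python
PRICE_PER_MILLION_INPUT_TOKENS = 3.00  # USD, the Claude Sonnet rate quoted above

def tokens_from_bytes(n_bytes):
    """Estimate tokens with the bytes / 4 rule (integer division assumed)."""
    return n_bytes // 4

def reduction_pct(cw, baseline):
    """Percentage saved by CodeWiki relative to the baseline, rounded."""
    return round((1 - cw / baseline) * 100)

def dollars_saved(cw_tokens, baseline_tokens):
    """Input-token cost avoided for one task, in USD."""
    saved_tokens = baseline_tokens - cw_tokens
    return saved_tokens * PRICE_PER_MILLION_INPUT_TOKENS / 1_000_000

# (CW tokens, baseline tokens) for the five tasks, copied from the table.
tasks = [(380, 3476), (1017, 6842), (624, 4219), (246, 1959), (2034, 8158)]

per_task = [reduction_pct(cw, base) for cw, base in tasks]
print(per_task)

# Overall reduction computed from the summed token counts.
total_cw = sum(cw for cw, _ in tasks)
total_base = sum(base for _, base in tasks)
print(reduction_pct(total_cw, total_base))

# Dollar saving for the first task (DI consumers).
print(round(dollars_saved(380, 3476), 3))
```

The per-task values come out as 89, 85, 85, 87, and 75, and the overall figure as 83, matching the Token reduction column.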

---

## Install

CodeWiki ships as a **single self-contained binary** — all tree-sitter grammars are
bundled, with no runtime dependencies. Every GitHub Release attaches CI-built binaries
for four targets, each with a matching SHA-256 checksum. **No toolchain, no compiling,
no cloning required.**

### 1. One-liner (recommended)

The install scripts auto-detect your platform, download the matching pre-built binary
from the **latest** GitHub Release, verify its SHA-256 checksum, and place `codewiki`
on your PATH. Nothing is compiled.

**Linux / macOS:**

```sh
curl -fsSL https://raw.githubusercontent.com/0xsyncroot/codewiki/main/install.sh | sh
```

**Windows (PowerShell):**

```powershell
irm https://raw.githubusercontent.com/0xsyncroot/codewiki/main/install.ps1 | iex
```

**Install location:** `~/.local/bin` on Unix, `%LOCALAPPDATA%\Programs\codewiki` on
Windows (added to your user PATH automatically on Windows). Override with the `--dir`
option (sh) or the `-Dir` parameter (PowerShell).

**Pin a version** instead of taking the latest:

```sh
# Linux / macOS
curl -fsSL https://raw.githubusercontent.com/0xsyncroot/codewiki/main/install.sh | sh -s -- --version v0.2.1
```

```powershell
# Windows PowerShell
& ([scriptblock]::Create((irm https://raw.githubusercontent.com/0xsyncroot/codewiki/main/install.ps1))) -Version v0.2.1
```

### 2. Manual download

Prefer not to pipe a script to your shell? Download the asset for your platform from the
[Releases page](https://github.com/0xsyncroot/codewiki/releases), verify it, extract, and
put `codewiki` on your PATH.

| OS | Arch | Release asset |
|----|------|---------------|
| Linux | x86_64 | `codewiki-x86_64-unknown-linux-gnu.tar.gz` |
| Linux | ARM64 | `codewiki-aarch64-unknown-linux-gnu.tar.gz` |
| macOS | Apple Silicon | `codewiki-aarch64-apple-darwin.tar.gz` |
| Windows | x86_64 | `codewiki-x86_64-pc-windows-msvc.zip` |

> macOS Intel (`x86_64`): build from source for now (see below) — a pre-built Intel binary will return in a later release.

Each asset has a matching `.sha256`. Download both, then verify and install:

**Linux / macOS:**

```sh
# Set ASSET to the filename for your platform from the table above.
ASSET=codewiki-x86_64-unknown-linux-gnu.tar.gz
sha256sum -c "$ASSET.sha256"   # macOS: shasum -a 256 -c "$ASSET.sha256"
tar -xzf "$ASSET"              # extracts ./codewiki
install -m 755 codewiki ~/.local/bin/codewiki
```

**Windows (PowerShell):**

```powershell
# Verify: the printed hash must match the contents of the .sha256 file.
Get-FileHash codewiki-x86_64-pc-windows-msvc.zip -Algorithm SHA256
Expand-Archive codewiki-x86_64-pc-windows-msvc.zip -DestinationPath $env:LOCALAPPDATA\Programs\codewiki
# Add %LOCALAPPDATA%\Programs\codewiki to your PATH.
```
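
On any platform with Python available, the same check can be scripted. This is a minimal sketch, not part of the CodeWiki installers. It assumes the `.sha256` file uses the common `sha256sum` layout, with the hex digest as the first whitespace-separated token; the function names are illustrative.

```python
import hashlib
from pathlib import Path

def sha256_of(path, chunk_size=1 << 20):
    """Stream the file through SHA-256 and return the hex digest."""
    digest = hashlib.sha256()
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()

def matches_checksum(asset_path, checksum_path):
    """Compare the asset's digest with the first token of the checksum file.

    Assumes the checksum file starts with the hex digest, optionally
    followed by whitespace and the filename.
    """
    expected = Path(checksum_path).read_text().split()[0].lower()
    return sha256_of(asset_path) == expected

# Example call (paths are placeholders for your downloaded files):
# matches_checksum("codewiki-x86_64-pc-windows-msvc.zip",
#                  "codewiki-x86_64-pc-windows-msvc.zip.sha256")
```

Only extract and install the archive when the function returns `True`.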

### Verify the install

```sh
codewiki --version
# codewiki 0.2.1
```

The binary includes every feature — graph UI, FTS5 search, full language support — by
default. No extra flags needed.

### 3. Build from source (optional, for development only)

End users do not need this; the installers above ship a verified pre-built binary.
Building from source requires a recent stable Rust toolchain (CI builds on current
`stable`):

```sh
git clone https://github.com/0xsyncroot/codewiki
cd codewiki
cargo build --release # binary: ./target/release/codewiki
```

---

## Quick start

CodeWiki is **wired into your agent once, machine-wide**, and then works in
every project you index. Two steps get you productive:

**1. Wire your agent once (machine-wide).** Run this a single time per machine —
not per project:

```sh
codewiki install --target claude
# (or: cursor | codex | opencode | hermes | all)
```

This registers `codewiki serve --mcp` as an MCP server in your agent's global
config. One registration serves **every** project you open.

**2. Index each project.** In each repository you want code intelligence for:

```sh
codewiki init
# Creates + indexes ./.codewiki/ (DB + git hooks).
# Indexed 141 files, 2,291 nodes, 6,374 edges in 0.3s.
```

That's it — restart your agent once after step 1, and from then on every
`codewiki init`'d project is instantly available.

**Why this works.** The MCP server is **cwd-aware**: it resolves the
`.codewiki/` index from the project the agent is working in. So a single global
registration transparently serves all your projects. Open a project you haven't
indexed yet and the tools stay live — they simply prompt you to run
`codewiki init` to enable code intelligence there. After you `codewiki init` a
project that was already open, restart your agent (or reconnect the MCP server)
so it picks up the new index.
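
One plausible way to implement a cwd-aware lookup is to walk upward from the working directory until a `.codewiki/` directory is found. The sketch below shows that idea only. The upward walk and the function name are assumptions; the README states only that the index is resolved from the project the agent is working in.

```python
from pathlib import Path

INDEX_DIR_NAME = ".codewiki"  # directory created by `codewiki init`

def find_index_dir(start):
    """Return the nearest `.codewiki/` at or above `start`, or None.

    The upward walk is an assumption made for illustration, not a
    description of CodeWiki's actual lookup.
    """
    current = Path(start).resolve()
    for candidate in (current, *current.parents):
        index_dir = candidate / INDEX_DIR_NAME
        if index_dir.is_dir():
            return index_dir
    return None

# None here corresponds to the "run `codewiki init`" prompt described above.
print(find_index_dir(Path.cwd()))
```

A lookup of this shape is what lets one global registration serve many projects: the server needs no per-project configuration, only a starting directory.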

**Shortcut.** To do both steps for the current project in one command:

```sh
codewiki setup
# Indexes this project and wires your detected agent(s).
```

Verify everything is healthy at any time:

```sh
codewiki doctor
```

`codewiki doctor` runs 6 checks: binary on PATH, index initialized, index non-empty,
freshness, agent wired (global or local), git hook installed.

---

## MCP tools

All 9 tools are served by `codewiki serve --mcp` (wired automatically by `codewiki install`).
The MCP server keeps the SQLite connection open — query latency is sub-millisecond.

| Tool | Description |
|------|-------------|
| `codewiki_search` | Find a symbol by name — exact, fuzzy, or namespace-qualified |
| `codewiki_context` | AI-focused context for a task description — ranked entry points + key code |
| `codewiki_callers` | All call sites of a symbol |
| `codewiki_callees` | Everything a symbol calls |
| `codewiki_impact` | Transitive blast radius of changing a symbol (depth 3) |
| `codewiki_node` | Symbol signature, source span, and docstring |
| `codewiki_explore` | Several related symbols' source in one capped call |
| `codewiki_files` | Indexed files under a directory path with metadata |
| `codewiki_status` | Index health: file count, node/edge counts, DB size |
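
To make "transitive blast radius (depth 3)" concrete, the sketch below runs a breadth-first walk over reverse dependency edges and stops expanding at a depth cap. It illustrates the general technique on a made-up graph; it is not CodeWiki's implementation, and the symbol names are hypothetical.

```python
from collections import deque

def impact(reverse_edges, symbol, max_depth=3):
    """Breadth-first walk over reverse dependency edges, capped at max_depth.

    `reverse_edges` maps a symbol to the symbols that directly depend on it.
    Returns {dependant: distance} for everything reached within the cap.
    """
    seen = {symbol: 0}
    queue = deque([symbol])
    while queue:
        current = queue.popleft()
        depth = seen[current]
        if depth == max_depth:
            continue  # do not expand beyond the depth cap
        for dependant in reverse_edges.get(current, ()):
            if dependant not in seen:
                seen[dependant] = depth + 1
                queue.append(dependant)
    del seen[symbol]  # the changed symbol itself is not part of its radius
    return seen

# Hypothetical graph: each key lists the symbols that depend on it.
reverse_edges = {
    "AuthService": ["LoginController", "TokenMiddleware"],
    "LoginController": ["Router"],
    "Router": ["App"],
    "App": ["main"],
}
print(impact(reverse_edges, "AuthService"))
```

In this example `main` sits four hops away, so it falls outside the depth-3 radius. The cap bounds the size of the answer on high-fanout symbols.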

---

## Editor and agent support

CodeWiki ships a standard [Model Context Protocol](https://modelcontextprotocol.io)
stdio server (`codewiki serve --mcp`, JSON-RPC over stdin/stdout, protocol `2024-11-05`),
so it works with **any MCP-compatible agent**. For the agents below, a one-command
installer writes the MCP config (and agent instructions) for you:

| Agent | Install command | Notes |
|-------|-----------------|-------|
| Claude Code | `codewiki install --target claude` | Writes `~/.claude.json` (global) or `./.mcp.json` (local); adds a CodeWiki block to `CLAUDE.md`. |
| Cursor | `codewiki install --target cursor` | Writes `~/.cursor/mcp.json` (global) or `./.cursor/mcp.json` (local); local installs also add `./.cursor/rules/codewiki.mdc`. |
| Codex CLI | `codewiki install --target codex` | Global only — writes `[mcp_servers.codewiki]` to `~/.codex/config.toml` and a block to `~/.codex/AGENTS.md`. |
| opencode | `codewiki install --target opencode` | Writes `~/.config/opencode/opencode.jsonc` (global) or `./opencode.jsonc` (local); adds a block to `AGENTS.md`. |
| Hermes Agent | `codewiki install --target hermes` | Global only — writes `mcp_servers.codewiki` to `$HERMES_HOME/config.yaml` (defaults to `~/.hermes/config.yaml`). |

Pass `--location local` to wire the current project instead of your user-wide config
(Codex CLI and Hermes Agent are global-only). `codewiki install --target all` configures
every detected agent at once, and `codewiki setup` indexes the project and wires agents
in a single step. Run `codewiki uninstall --target` with an agent name to cleanly remove a config.

### Any other MCP client (manual)

Any MCP-capable client — including editors and assistants without a first-class
installer above — can use CodeWiki by registering the stdio server directly. The
canonical invocation is `codewiki serve --mcp`. A typical `mcpServers` entry:

```json
{
  "mcpServers": {
    "codewiki": {
      "type": "stdio",
      "command": "codewiki",
      "args": ["serve", "--mcp"]
    }
  }
}
```

If your client launches MCP servers without inheriting the project working directory,
add `--path` so CodeWiki finds the right index, e.g.
`"args": ["serve", "--mcp", "--path", "/abs/path/to/project"]`. Place the entry in
whatever config file your client reads (the schema varies by client) and restart it.
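
Once registered, a client talks to the server with JSON-RPC 2.0 messages over stdin/stdout. The sketch below builds one MCP `tools/call` request as a single newline-terminated JSON line. The argument name `symbol` is a guess made for illustration: the real input schema for each tool is what the server returns from `tools/list`. A real client also performs the MCP `initialize` handshake before calling tools.

```python
import json

def tools_call_request(request_id, tool_name, arguments):
    """Serialise one MCP `tools/call` request as a single JSON line."""
    message = {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "tools/call",
        "params": {"name": tool_name, "arguments": arguments},
    }
    # The stdio transport frames messages as newline-delimited JSON.
    return json.dumps(message) + "\n"

# "symbol" is a hypothetical argument name; check the tool's input schema
# (returned by `tools/list`) for the real one.
line = tools_call_request(1, "codewiki_callers", {"symbol": "OrderService"})
print(line, end="")
```

Most users never write this by hand, since the agent's MCP client does it, but the shape is useful when debugging a manual registration.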

---

## CLI commands

**Common — start here:**

```
codewiki setup      Index + wire MCP in one step  [START HERE]
codewiki status     Index statistics
codewiki doctor     Diagnostics and health checks
codewiki query      Search a symbol by name
codewiki context    AI-focused context for a task
```

**Advanced:**

```
codewiki init       Initialize .codewiki/ for this project
codewiki index      Re-index all files (or a given --path)
codewiki sync       Sync changed files into the index
codewiki serve      MCP server (--mcp for stdio JSON-RPC)
codewiki files      List indexed files
codewiki callers    Callers of a symbol
codewiki callees    Callees of a symbol
codewiki impact     Blast radius of a symbol change
codewiki affected   Symbols affected by a set of changed files
```

**Management:**

```
codewiki install    Wire agent configs (subset of setup)
codewiki uninstall  Remove from agent configs
codewiki uninit     Remove .codewiki/ from this project
codewiki snapshot   Export index to a portable SQLite file
codewiki restore    Restore index from a snapshot
```

Run `codewiki --help` for full option details.

---

## Graph UI

`codewiki graph` launches a local web viewer for interactive exploration of the
knowledge graph:

- **Force-directed graph** — nodes (functions, classes, routes, components) and edges
(calls, imports, inherits, implements) rendered in the browser
- **Neighbourhood explorer** — click any node to focus on its immediate call graph
- **Filter by kind and language** — narrow the view to classes, interfaces, routes, or
a specific language
- **Node detail panel** — signature, file location, docstring, and related symbols
- **Impact view** — highlight the transitive blast radius of a selected symbol

The graph UI is included in every default build:

```sh
# Launch (opens http://localhost:7007 by default):
codewiki graph

# Custom port, no auto-open:
codewiki graph --port 8080 --no-open
```

For a minimal binary without the graph UI (rare, e.g. server-side MCP-only deploys):

```sh
cargo build --release --no-default-features --features bundled-sqlite,wasmtime-grammars
```

---

## Maintenance and incremental sync

**The graph stays current automatically — at negligible cost.** This is the single
authoritative description of how the index is kept fresh; the savings sections above
cross-link here rather than restate it.

`codewiki init` installs two complementary auto-sync paths:

- **Git hooks** (`post-commit`, `post-merge`, `post-checkout`) run `codewiki sync`
detached in the background after a commit, merge, or branch switch, so git is never
blocked. They work on **every** filesystem, including WSL `/mnt` drives.
- **Live file watcher** inside `codewiki serve --mcp` re-syncs on file saves, using a
**2-second debounce** (`WatcherConfig::default()`) that coalesces a multi-file burst
into one sync cycle. The watcher does not fire on WSL `/mnt` (drvfs/9p) drives — inotify
events don't cross the 9p protocol — which is exactly why the git hooks exist as the
fallback there.
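
The coalescing effect of the debounce can be pictured with a small simulation. The sketch below groups save timestamps into sync batches, closing a batch once no new event arrives within the debounce window. It assumes trailing-edge debounce semantics, which the README does not spell out, and it is not the watcher's actual code.

```python
def coalesce(event_times, debounce=2.0):
    """Group save events (timestamps in seconds) into sync batches.

    A batch closes once no new event arrives within `debounce` seconds of
    the previous one (trailing-edge debounce, assumed for illustration).
    """
    batches = []
    for t in sorted(event_times):
        if batches and t - batches[-1][-1] < debounce:
            batches[-1].append(t)  # still inside the quiet window
        else:
            batches.append([t])    # quiet period elapsed: start a new batch
    return batches

# Three saves in quick succession, then one more a few seconds later.
print(coalesce([0.0, 0.5, 1.9, 5.0]))
```

Here four save events produce two sync cycles rather than four, which is the behaviour a formatter or refactoring tool touching many files would benefit from.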

Incremental sync is truly incremental: only changed files and their direct dependants are
re-extracted and re-resolved, so sync time stays proportional to the change size, not the
repo size. A 2,065-file .NET repo re-syncs a single edit in 66 ms.
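
The "changed files and their direct dependants" rule can be sketched as a one-hop expansion over a reverse dependency map. The file names and the map below are hypothetical, and the real pipeline works on graph nodes and edges rather than a plain dictionary.

```python
def files_to_resync(changed, dependants):
    """Changed files plus the files that directly depend on them.

    `dependants` maps a file to the files that import or reference it.
    Only one hop is followed, matching "direct dependants" in the text.
    """
    to_sync = set(changed)
    for path in changed:
        to_sync.update(dependants.get(path, ()))
    return sorted(to_sync)

# Hypothetical project layout.
dependants = {
    "models/order.py": ["services/order_service.py", "api/orders.py"],
    "services/order_service.py": ["api/orders.py", "api/checkout.py"],
    "utils/dates.py": ["models/order.py"],
}
print(files_to_resync(["models/order.py"], dependants))
```

`api/checkout.py` is two hops from the edited file and is left alone, which is why the work tracks the size of the change and not the size of the repository.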

**Measured 1-file sync (tool-reported parse/DB-write time, fresh on `codewiki 0.2.0`):**

| Repo | Files | Sync time |
|------|------:|:---------:|
| flask (Python) | 83 | 22 ms |
| express (JavaScript) | 141 | 21 ms |
| zod (TypeScript) | 408 | 32 ms |
| json (C++) | 499 | 43 ms |
| jellyfin (C#, 2,065 .cs) | 2,065 | 66 ms |
| kubernetes (Go, 17,176 files) | 17,176 | 74 ms |
| microsoft/TypeScript (39,296 files) | 39,296 | 202 ms |

**The cost story:** index once (~0.2–2.1 s for the per-language repos, ~1 min for the
17k–39k-file monorepos), then maintain at tens of milliseconds per change. Use
`codewiki index` instead of `sync` for whole-repo rewrites (a mass reformat is ~2× slower
through the incremental path). Every subsequent query reuses the same graph for sub-ms
responses, so the ~98% token / ~74% tool-call savings (see
[*Why it pays off*](#why-it-pays-off)) repeat every session.

---

## Performance

Full tables, scale analysis, and the agent-savings study live in
[`benchmark/README.md`](benchmark/README.md). All figures below are a fresh run on
`codewiki 0.2.0`. Machine: Intel Core i7-14700KF (20 cores / 28 threads), 31 GiB RAM,
WSL2.

**Cross-language results — 8 real repos (one per language, shallow clones of `main`):**

This table is a **performance** benchmark — cold-index speed, search latency, and
incremental sync. The token / tool-call savings for these same 8 repos are in
[*Why it pays off*](#why-it-pays-off).

| Repo | Language | Files | Cold-index | files/s | Search p50 | Sync (1 file) |
|------|----------|------:|:----------:|--------:|:----------:|:-------------:|
| pallets/flask | Python | 83 | 0.19 s | 430 | 3 ms | 22 ms |
| BurntSushi/ripgrep | Rust | 101 | 0.36 s | 278 | 3 ms | 24 ms |
| expressjs/express | JavaScript | 141 | 0.26 s | 542 | 2 ms | 21 ms |
| colinhacks/zod | TypeScript | 408 | 0.57 s | 720 | 2 ms | 32 ms |
| dotnet-architecture/eShopOnWeb | C# | 269 | 0.15 s | 1,793 | 2 ms | 27 ms |
| google/gson | Java | 262 | 0.49 s | 538 | 3 ms | 33 ms |
| nlohmann/json | C++ | 499 | 1.01 s | 494 | 4 ms | 43 ms |
| gin-gonic/gin | Go | 99 | 0.26 s | 376 | 2 ms | 22 ms |

All 8 index cleanly — zero crashes, zero FK errors — including C++ (`nlohmann/json`,
499 files, 13,438 graph nodes). "Search p50" is the `query` latency; `callers`/`callees`
are comparable, `impact`/`context` run 2–33 ms depending on fan-out.

**Scale:** performance holds at enterprise size. **kubernetes** (Go, 17,176 files) cold-indexes in
**55 s** into a 208k-node / 971k-edge graph, and **microsoft/TypeScript** (the TS
compiler, 39,296 files) in **62 s** — both clean, exit 0, ~1.1–1.5 GB peak RSS. The
100k-file cold-index extrapolates to **~14 min** (O(n^1.5), target ≤ 20 min) at ~4 GB
peak RSS. Full scale table and the enterprise breakdown: [*Enterprise*](#enterprise) and
[`benchmark/README.md`](benchmark/README.md).

**Search latency** (CLI p50, includes binary cold-start):
- Exact / fuzzy / callers / callees: **2–4 ms** across all 8 languages
- Impact / context query: **2–33 ms** (varies with graph fan-out)
- Via persistent MCP server: sub-millisecond

---

## Languages and frameworks

**18 languages:**

C, C++, C#, Dart, Go, Java, JavaScript, Kotlin, Lua, Pascal, PHP, Python,
Ruby, Rust, Scala, Swift, TypeScript, and Vue.

Plus additional grammars for component, template, and form dialects — Svelte, Luau,
Liquid, Razor, and Delphi form files (DFM).

**16 framework resolvers:**

| Framework | Extracted |
|-----------|-----------|
| **Angular** | `@Component` / `@Directive` / `@Pipe` / `@Injectable` / `@NgModule` nodes; DI constructor injection; routing (`loadComponent`, lazy routes, guards); standalone `imports:[]` bindings |
| **ASP.NET / Razor** | HTTP routes (`[HttpGet/Post/Delete]`, minimal API, `MapGroup`, SignalR hubs); DI registrations; namespace-qualified symbols; interface / struct / enum / record discrimination; method signatures; `is_async` detection |
| **Django** | URL patterns, views, models |
| **Express** | Route handlers, middleware chains |
| **FastAPI** | Route decorators, dependency injection |
| **Flask** | Blueprint routes, view functions |
| **NestJS** | Controllers, guards, interceptors, modules |
| **Rails** | Routes (`resources`, `get/post`), controllers |
| **Laravel** | Route facades, controllers |
| **React** | Component exports, hook calls |
| **Vue** | SFC components, ``, Composition API |
| **Svelte** | Component exports, stores |
| **Spring** | `@RequestMapping`, `@GetMapping`, beans |
| **Gin (Go web)** | `r.GET/POST/…` routes for Gin / chi / Echo / gorilla mux / net/http → route nodes + handler resolution |
| **Vapor (Swift web)** | `app/routes/router .get/post/…` routes; controller, Fluent model, and middleware resolution |
| **Cargo workspace** | Workspace members → module nodes; inter-crate `use` resolution; path-dependency edges |

---

## Enterprise

CodeWiki is production-tested on enterprise-scale codebases. All figures below are fresh
`codewiki 0.2.0` runs on shallow clones:

| Metric | Result |
|--------|--------|
| kubernetes (Go, 17,176 files) | 55 s index, 208,425 nodes / 970,801 edges, 779,688 refs resolved, 1.49 GB RSS, exit 0 |
| microsoft/TypeScript (39,296 files) | 62 s index, 312,521 nodes / 442,014 edges, 1.11 GB RSS, exit 0 (clean, no crash) |
| 100k-file cold index | ~14 min (extrapolated; on-curve with the measured 17k / 39k real-repo runs) |
| 100k-file peak RAM | ~4 GB (extrapolated; sub-linear O(n^0.62) memory scaling) |
| OrchardCore (5,203 .cs) | 9.0 s index, 72,345 nodes / 227,444 edges, 505 interfaces, 97 enums, 3,663 implements edges, fully qualified names |
| jellyfin (2,065 .cs) | 2.1 s index, 19,911 nodes / 46,648 edges, 209 interfaces, 385 routes, 5,852 import edges, 1,019 async methods |
| ABP framework (3,497 .cs) | 1.9 s index, 26,133 nodes / 50,632 edges, 623 interfaces, namespace-qualified (`Volo.Abp.Domain.Services::DomainService`) |
| .NET enterprise verdict | **READY** — interfaces/enums/structs correctly classified, namespaces qualified, signatures stored, async detected, `implements` edges resolved |

Full benchmark suite and .NET audit: [`benchmark/README.md`](benchmark/README.md).

**Privacy:** all indexing and querying runs locally. No data leaves the machine.
No embedding model, no external API, no telemetry. It ships as a single static binary:
copy it anywhere and run it.

---

## Architecture

Cargo workspace under `crates/` — eight product crates plus a test-only helper
(`codewiki-testutil`):

| Crate | Role |
|-------|------|
| `codewiki-cli` | Binary entry point, `clap` CLI, installer, onboarding UI |
| `codewiki-mcp` | MCP server (rmcp, stdio JSON-RPC), 9 tool handlers |
| `codewiki-extraction` | tree-sitter AST walker, 18-language extractors, docstring extraction |
| `codewiki-storage` | SQLite schema, FTS5 Unicode search, WAL mode, graph query API |
| `codewiki-resolution` | Import resolver, 16 framework resolvers, name matcher, incremental pipeline |
| `codewiki-sync` | File watcher (`notify`), gitignore walk, git hook installation |
| `codewiki-core` | Shared types: `Node`, `Edge`, `CodeWikiError`, `Config` |
| `codewiki-graph` | Graph web UI (axum HTTP, embedded force-graph frontend) |

Data flow: **tree-sitter extraction** → **SQLite / FTS5** → **reference resolution** → **graph edges** → **MCP tools** / **graph UI**.

---

## Roadmap

- **wiki-md generation** — generate structured Markdown documentation from the graph
(per-module API docs, architecture diagrams, call-graph narratives) as a first-class
output format. This is the next major capability after the graph itself.

---

## Contributing

Contributions are welcome. See [CONTRIBUTING.md](CONTRIBUTING.md) for build, test, and
pull-request guidelines, and the [code of conduct](CODE_OF_CONDUCT.md) for community
expectations. To report a security issue, follow the process in [SECURITY.md](SECURITY.md).
Benchmark methodology and how to reproduce the numbers live in
[`benchmark/README.md`](benchmark/README.md).

---

## License

MIT licensed (© 2026 0xsyncroot). See [LICENSE](LICENSE). As a derivative work of
CodeGraph, the CodeGraph attribution and a verbatim copy of its original MIT copyright
notice (© 2026 Colby McHenry) are reproduced in [NOTICE](NOTICE), as the MIT License
requires. Third-party crate and tree-sitter grammar attributions are listed in
[LICENSE-THIRD-PARTY.md](LICENSE-THIRD-PARTY.md) (generated by `cargo about`).

---

## Credits

CodeWiki is a Rust port and derivative of [CodeGraph](https://github.com/colbymchenry/codegraph)
by Colby McHenry (MIT). It reimplements CodeGraph's architecture in Rust — including
the SQLite schema, resolution algorithms, search/scoring, and MCP tool interface —
with added optimisations, framework resolvers (Angular, full ASP.NET / .NET), a graph
web UI, and i18n / Unicode search. See [NOTICE](NOTICE) for the full attribution,
which reproduces CodeGraph's original MIT copyright notice as required.