{"id":47597712,"url":"https://github.com/optave/ops-codegraph-tool","last_synced_at":"2026-04-30T06:01:09.391Z","repository":{"id":339750272,"uuid":"1163218957","full_name":"optave/ops-codegraph-tool","owner":"optave","description":"Code intelligence CLI — function-level dependency graph across 11 languages, 30-tool MCP server for AI agents, complexity metrics, architecture boundary enforcement, CI quality gates, git diff impact with co-change analysis, hybrid semantic search. Fully local, zero API keys required.","archived":false,"fork":false,"pushed_at":"2026-04-07T08:30:52.000Z","size":10187,"stargazers_count":36,"open_issues_count":3,"forks_count":5,"subscribers_count":1,"default_branch":"main","last_synced_at":"2026-04-07T08:31:23.028Z","etag":null,"topics":["ai-agents","architecture","ci-cd","cli","code-analysis","code-quality","codeowners","complexity-metrics","dependency-graph","impact-analysis","incremental-builds","mcp-server","semantic-search","sqlite","static-analysis","tree-sitter"],"latest_commit_sha":null,"homepage":null,"language":"TypeScript","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"apache-2.0","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/optave.png","metadata":{"files":{"readme":"README.md","changelog":"CHANGELOG.md","contributing":"CONTRIBUTING.md","funding":".github/FUNDING.yml","license":"LICENSE","code_of_conduct":"CODE_OF_CONDUCT.md","threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":"SECURITY.md","support":"SUPPORT.md","governance":null,"roadmap":"docs/roadmap/BACKLOG.md","authors":null,"dei":null,"publiccode":null,"codemeta":null,"zenodo":null,"notice":null,"maintainers":null,"copyright":null,"agents":null,"dco":null,"cla":"CLA.md"},"funding":{"github":["optave"]}},"created_at":"2026-02-21T09:31:55.000Z","updated_at":"2026-04-07T08:23:05.000Z","dependencies_parsed_at":"2026-02-27T09:00:54.837Z","dependency_job_id":nul
l,"html_url":"https://github.com/optave/ops-codegraph-tool","commit_stats":null,"previous_names":["optave/codegraph","optave/ops-codegraph-tool"],"tags_count":257,"template":false,"template_full_name":null,"purl":"pkg:github/optave/ops-codegraph-tool","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/optave%2Fops-codegraph-tool","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/optave%2Fops-codegraph-tool/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/optave%2Fops-codegraph-tool/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/optave%2Fops-codegraph-tool/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/optave","download_url":"https://codeload.github.com/optave/ops-codegraph-tool/tar.gz/refs/heads/main","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/optave%2Fops-codegraph-tool/sbom","scorecard":null,"host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":286080680,"owners_count":31676652,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2026-04-11T08:18:19.405Z","status":"ssl_error","status_checked_at":"2026-04-11T08:17:08.892Z","response_time":54,"last_error":"SSL_read: unexpected eof while 
reading","robots_txt_status":"success","robots_txt_updated_at":"2025-07-24T06:49:26.215Z","robots_txt_url":"https://github.com/robots.txt","online":false,"can_crawl_api":true,"host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["ai-agents","architecture","ci-cd","cli","code-analysis","code-quality","codeowners","complexity-metrics","dependency-graph","impact-analysis","incremental-builds","mcp-server","semantic-search","sqlite","static-analysis","tree-sitter"],"created_at":"2026-04-01T18:27:40.858Z","updated_at":"2026-04-30T06:01:09.373Z","avatar_url":"https://github.com/optave.png","language":"TypeScript","readme":"\u003cp align=\"center\"\u003e\n \u003cimg src=\"https://img.shields.io/badge/codegraph-dependency%20intelligence-blue?style=for-the-badge\u0026logo=graphql\u0026logoColor=white\" alt=\"codegraph\" /\u003e\n\u003c/p\u003e\n\n\u003ch1 align=\"center\"\u003ecodegraph\u003c/h1\u003e\n\n\u003cp align=\"center\"\u003e\n \u003cstrong\u003eGive your AI the map before it starts exploring.\u003c/strong\u003e\n\u003c/p\u003e\n\n\u003cp align=\"center\"\u003e\n \u003ca href=\"https://www.npmjs.com/package/@optave/codegraph\"\u003e\u003cimg src=\"https://img.shields.io/npm/v/@optave/codegraph?style=flat-square\u0026logo=npm\u0026logoColor=white\u0026label=npm\" alt=\"npm version\" /\u003e\u003c/a\u003e\n \u003ca href=\"https://github.com/optave/ops-codegraph-tool/blob/main/LICENSE\"\u003e\u003cimg src=\"https://img.shields.io/github/license/optave/ops-codegraph-tool?style=flat-square\u0026logo=opensourceinitiative\u0026logoColor=white\" alt=\"Apache-2.0 License\" /\u003e\u003c/a\u003e\n \u003ca href=\"https://github.com/optave/ops-codegraph-tool/actions\"\u003e\u003cimg 
src=\"https://img.shields.io/github/actions/workflow/status/optave/ops-codegraph-tool/codegraph-impact.yml?style=flat-square\u0026logo=githubactions\u0026logoColor=white\u0026label=CI\" alt=\"CI\" /\u003e\u003c/a\u003e\n \u003cimg src=\"https://img.shields.io/badge/node-%3E%3D22.6-339933?style=flat-square\u0026logo=node.js\u0026logoColor=white\" alt=\"Node \u003e= 22.6\" /\u003e\n\u003c/p\u003e\n\n\u003cp align=\"center\"\u003e\n \u003ca href=\"#the-problem\"\u003eThe Problem\u003c/a\u003e \u0026middot;\n \u003ca href=\"#what-codegraph-does\"\u003eWhat It Does\u003c/a\u003e \u0026middot;\n \u003ca href=\"#-quick-start\"\u003eQuick Start\u003c/a\u003e \u0026middot;\n \u003ca href=\"#-commands\"\u003eCommands\u003c/a\u003e \u0026middot;\n \u003ca href=\"#-language-support\"\u003eLanguages\u003c/a\u003e \u0026middot;\n \u003ca href=\"#-ai-agent-integration-core\"\u003eAI Integration\u003c/a\u003e \u0026middot;\n \u003ca href=\"#-how-it-works\"\u003eHow It Works\u003c/a\u003e \u0026middot;\n \u003ca href=\"#-recommended-practices\"\u003ePractices\u003c/a\u003e \u0026middot;\n \u003ca href=\"#-roadmap\"\u003eRoadmap\u003c/a\u003e\n\u003c/p\u003e\n\n---\n\n## The Problem\n\nAI agents face an impossible trade-off. They either spend thousands of tokens reading files to understand a codebase's structure — blowing up their context window until quality degrades — or they assume how things work, and the assumptions are often wrong. Either way, things break. The larger the codebase, the worse it gets.\n\nAn agent modifies a function without knowing 9 files import it. It misreads what a helper does and builds logic on top of that misunderstanding. It leaves dead code behind after a refactor. The PR gets opened, and your reviewer — human or automated — flags the same structural issues again and again: _\"this breaks 14 callers,\"_ _\"that function already exists,\"_ _\"this export is now dead.\"_ If the reviewer catches it, that's multiple rounds of back-and-forth. 
If they don't, it can ship to production. Multiply that by every PR, every developer, every repo.\n\nThe information to prevent these issues exists — it's in the code itself. But without a structured map, agents lack the context to get it right consistently, reviewers waste cycles on preventable issues, and architecture degrades one unreviewed change at a time.\n\n## What Codegraph Does\n\nCodegraph builds a function-level dependency graph of your entire codebase — every function, every caller, every dependency — and keeps it current with sub-second incremental rebuilds.\n\nIt parses your code with [tree-sitter](https://tree-sitter.github.io/) (native Rust or WASM), stores the graph in SQLite, and exposes it where it matters most:\n\n- **MCP server** — AI agents query the graph directly through 30 tools — one call instead of 30 `grep`/`find`/`cat` invocations\n- **CLI** — developers and agents explore, query, and audit code from the terminal\n- **CI gates** — `check` and `manifesto` commands enforce quality thresholds with exit codes\n- **Programmatic API** — embed codegraph in your own tools via `npm install`\n\nInstead of an agent editing code without structural context and letting reviewers catch the fallout, it knows _\"this function has 14 callers across 9 files\"_ before it touches anything. Dead exports, circular dependencies, and boundary violations surface during development — not during review. The result: PRs that need fewer review rounds.\n\n**Free. Open source. Fully local.** Zero network calls, zero telemetry. Your code stays on your machine. When you want deeper intelligence, bring your own LLM provider — your code only goes where you choose to send it.\n\n**Three commands to a queryable graph:**\n\n```bash\nnpm install -g @optave/codegraph\ncd your-project\ncodegraph build\n```\n\nNo config files, no Docker, no JVM, no API keys, no accounts. 
Point your agent at the MCP server and it has structural awareness of your codebase.\n\n### Why it matters\n\n| | Without codegraph | With codegraph |\n|---|---|---|\n| **Code review** | Reviewers flag broken callers, dead code, and boundary violations round after round | Structural issues are caught during development — PRs pass review with fewer rounds |\n| **AI agents** | Modify `parseConfig()` without knowing 9 files import it — reviewer catches it | `fn-impact parseConfig` shows every caller before the edit — agent fixes it proactively |\n| **AI agents** | Leave dead exports and duplicate helpers behind after refactors | Dead code, cycles, and duplicates surface in real time via hooks and MCP queries |\n| **AI agents** | Produce code that works but doesn't fit the codebase structure | `context \u003cname\u003e -T` returns source, deps, callers, and tests — the agent writes code that fits |\n| **CI pipelines** | Catch test failures but miss structural degradation | `check --staged` fails the build when blast radius or complexity thresholds are exceeded |\n| **Developers** | Inherit a codebase and grep for hours to understand what calls what | `context handleAuth -T` gives the same structured view agents use |\n| **Architects** | Draw boundary rules that erode within weeks | `manifesto` and `boundaries` enforce architecture rules on every commit |\n\n### Feature comparison\n\n\u003csub\u003eComparison last verified: March 2026. Claims verified against each repo's README/docs. 
Full analysis: \u003ca href=\"generated/competitive/COMPETITIVE_ANALYSIS.md\"\u003eCOMPETITIVE_ANALYSIS.md\u003c/a\u003e\u003c/sub\u003e\n\n| Capability | codegraph | [joern](https://github.com/joernio/joern) | [narsil-mcp](https://github.com/postrv/narsil-mcp) | [cpg](https://github.com/Fraunhofer-AISEC/cpg) | [axon](https://github.com/harshkedia177/axon) | [GitNexus](https://github.com/abhigyanpatwari/GitNexus) |\n|---|:---:|:---:|:---:|:---:|:---:|:---:|\n| Languages | **34** | ~12 | **32** | ~10 | 3 | 13 |\n| MCP server | **Yes** | — | **Yes** | **Yes** | **Yes** | **Yes** |\n| Dataflow + CFG + AST querying | **Yes** | **Yes** | **Yes**¹ | **Yes** | — | — |\n| Hybrid search (BM25 + semantic) | **Yes** | — | — | — | **Yes** | **Yes** |\n| Git-aware (diff impact, co-change, branch diff) | **All 3** | — | — | — | **All 3** | — |\n| Dead code / role classification | **Yes** | — | **Yes** | — | **Yes** | — |\n| Incremental rebuilds | **O(changed)** | — | O(n) | — | **Yes** | Commit-level⁴ |\n| Architecture rules + CI gate | **Yes** | — | — | — | — | — |\n| Security scanning (SAST / vuln detection) | Intentionally out of scope² | **Yes** | **Yes** | **Yes** | — | — |\n| Zero config, `npm install` | **Yes** | — | **Yes** | — | **Yes** | **Yes** |\n| Graph export (GraphML / Neo4j / DOT) | **Yes** | **Yes** | — | — | — | — |\n| Open source + commercial use | **Yes** (Apache-2.0) | **Yes** (Apache-2.0) | **Yes** (MIT/Apache-2.0) | **Yes** (Apache-2.0) | Source-available³ | Non-commercial⁵ |\n\n\u003csup\u003e¹ narsil-mcp added CFG and dataflow in recent versions. ² Codegraph focuses on structural understanding, not vulnerability detection — use dedicated SAST tools (Semgrep, CodeQL, Snyk) for that. ³ axon claims MIT in pyproject.toml but has no LICENSE file in the repo. ⁴ GitNexus skips re-index if the git commit hasn't changed, but re-processes the entire repo when it does — no per-file incremental parsing. 
⁵ GitNexus uses the PolyForm Noncommercial 1.0.0 license.\u003c/sup\u003e\n\n### What makes codegraph different\n\n| | Differentiator | In practice |\n|---|---|---|\n| **🤖** | **AI-first architecture** | 30-tool [MCP server](https://modelcontextprotocol.io/) — agents query the graph directly instead of scraping the filesystem. One call replaces 20+ grep/find/cat invocations |\n| **🏷️** | **Role classification** | Every symbol auto-tagged as `entry`/`core`/`utility`/`adapter`/`dead`/`leaf` — agents understand a symbol's architectural role without reading surrounding code |\n| **🔬** | **Function-level, not just files** | Traces `handleAuth()` → `validateToken()` → `decryptJWT()` and shows 14 callers across 9 files break if `decryptJWT` changes |\n| **⚡** | **Always-fresh graph** | Three-tier change detection: journal (O(changed)) → mtime+size (O(n) stats) → hash (O(changed) reads). Sub-second rebuilds — agents work with current data |\n| **💥** | **Git diff impact** | `codegraph diff-impact` shows changed functions, their callers, and full blast radius — enriched with historically coupled files from git co-change analysis. Ships with a GitHub Actions workflow |\n| **🌐** | **Multi-language, one graph** | 34 languages in a single graph — JS/TS, Python, Go, Rust, Java, C#, PHP, Ruby, C/C++, Kotlin, Swift, Scala, Bash, HCL, Elixir, Lua, Dart, Zig, Haskell, OCaml, F#, Gleam, Clojure, Julia, R, Erlang, Solidity, Objective-C, CUDA, Groovy, Verilog — agents don't need per-language tools |\n| **🧠** | **Hybrid search** | BM25 keyword + semantic embeddings fused via RRF — `hybrid` (default), `semantic`, or `keyword` mode; multi-query via `\"auth; token; JWT\"` |\n| **🔬** | **Dataflow + CFG** | Track how data flows through functions (`flows_to`, `returns`, `mutates`) and visualize intraprocedural control flow graphs for all 34 languages |\n| **🔓** | **Fully local, zero cost** | No API keys, no accounts, no network calls. 
Optionally bring your own LLM provider — your code only goes where you choose |\n\n---\n\n## 🚀 Quick Start\n\n```bash\nnpm install -g @optave/codegraph\ncd your-project\ncodegraph build # → .codegraph/graph.db created\n```\n\nThat's it. The graph is ready. Now connect your AI agent.\n\n### For AI agents (primary use case)\n\nConnect directly via MCP — your agent gets 30 tools to query the graph:\n\n```bash\ncodegraph mcp # 33-tool MCP server — AI queries the graph directly\n```\n\nOr add codegraph to your agent's instructions (e.g. `CLAUDE.md`):\n\n```markdown\nBefore modifying code, always:\n1. `codegraph where \u003cname\u003e` — find where the symbol lives\n2. `codegraph context \u003cname\u003e -T` — get full context (source, deps, callers)\n3. `codegraph fn-impact \u003cname\u003e -T` — check blast radius before editing\n\nAfter modifying code:\n4. `codegraph diff-impact --staged -T` — verify impact before committing\n```\n\nFull agent setup: [AI Agent Guide](docs/guides/ai-agent-guide.md) \u0026middot; [CLAUDE.md template](docs/guides/ai-agent-guide.md#claudemd-template)\n\n### For developers\n\nThe same graph is available via CLI:\n\n```bash\ncodegraph map # see most-connected files\ncodegraph query myFunc # find any function, see callers \u0026 callees\ncodegraph deps src/index.ts # file-level import/export map\n```\n\nOr install from source:\n\n```bash\ngit clone https://github.com/optave/ops-codegraph-tool.git\ncd ops-codegraph-tool \u0026\u0026 npm install \u0026\u0026 npm link\n```\n\n\u003e **Dev builds:** Pre-release tarballs are attached to [GitHub Releases](https://github.com/optave/ops-codegraph-tool/releases). Install with `npm install -g \u003cpath-to-tarball\u003e`. 
Note that `npm install -g \u003ctarball-url\u003e` does not work because npm cannot resolve optional platform-specific dependencies from a URL — download the `.tgz` first, then install from the local file.\n\n---\n\n## ✨ Features\n\n| | Feature | Description |\n|---|---|---|\n| 🤖 | **MCP server** | 33-tool MCP server for AI assistants; single-repo by default, opt-in multi-repo |\n| 🎯 | **Deep context** | `context` gives agents source, deps, callers, signature, and tests for a function in one call; `audit --quick` gives structural summaries |\n| 🏷️ | **Node role classification** | Every symbol auto-tagged as `entry`/`core`/`utility`/`adapter`/`dead`/`leaf` based on connectivity — agents instantly know architectural role |\n| 📦 | **Batch querying** | Accept a list of targets and return all results in one JSON payload — enables multi-agent parallel dispatch |\n| 💥 | **Impact analysis** | Trace every file affected by a change (transitive) |\n| 🧬 | **Function-level tracing** | Call chains, caller trees, function-level impact, and A→B pathfinding with qualified call resolution |\n| 📍 | **Fast lookup** | `where` shows exactly where a symbol is defined and used — minimal, fast |\n| 🔍 | **Symbol search** | Find any function, class, or method by name — exact match priority, relevance scoring, `--file` and `--kind` filters |\n| 📁 | **File dependencies** | See what a file imports and what imports it |\n| 📊 | **Diff impact** | Parse `git diff`, find overlapping functions, trace their callers |\n| 🔗 | **Co-change analysis** | Analyze git history for files that always change together — surfaces hidden coupling the static graph can't see; enriches `diff-impact` with historically coupled files |\n| 🗺️ | **Module map** | Bird's-eye view of your most-connected files |\n| 🏗️ | **Structure \u0026 hotspots** | Directory cohesion scores, fan-in/fan-out hotspot detection, module boundaries |\n| 🔄 | **Cycle detection** | Find circular dependencies at file or function level |\n| 📤 | 
**Export** | DOT, Mermaid, JSON, GraphML, GraphSON, and Neo4j CSV graph export |\n| 🧠 | **Semantic search** | Embeddings-powered natural language search with multi-query RRF ranking |\n| 👀 | **Watch mode** | Incrementally update the graph as files change |\n| ⚡ | **Always fresh** | Three-tier incremental detection — sub-second rebuilds even on large codebases |\n| 🔬 | **Data flow analysis** | Intraprocedural parameter tracking, return consumers, argument flows, and mutation detection — all 34 languages |\n| 🧮 | **Complexity metrics** | Cognitive, cyclomatic, nesting depth, Halstead, and Maintainability Index per function |\n| 🏘️ | **Community detection** | Leiden clustering to discover natural module boundaries and architectural drift |\n| 📜 | **Manifesto rule engine** | Configurable pass/fail rules with warn/fail thresholds for CI gates via `check` (exit code 1 on fail) |\n| 👥 | **CODEOWNERS integration** | Map graph nodes to CODEOWNERS entries — see who owns each function, ownership boundaries in `diff-impact` |\n| 💾 | **Graph snapshots** | `snapshot save`/`restore` for instant DB backup and rollback — checkpoint before refactoring, restore without rebuilding |\n| 🔎 | **Hybrid BM25 + semantic search** | FTS5 keyword search + embedding-based semantic search fused via Reciprocal Rank Fusion — `hybrid`, `semantic`, or `keyword` modes |\n| 📄 | **Pagination \u0026 NDJSON streaming** | Universal `--limit`/`--offset` pagination on all MCP tools and CLI commands; `--ndjson` for newline-delimited JSON streaming |\n| 🔀 | **Branch structural diff** | Compare code structure between two git refs — added/removed/changed symbols with transitive caller impact |\n| 🛡️ | **Architecture boundaries** | User-defined dependency rules between modules with onion architecture preset — violations flagged in manifesto and CI |\n| ✅ | **CI validation predicates** | `check` command with configurable gates: complexity, blast radius, cycles, boundary violations — exit code 0/1 for CI |\n| 📋 | 
**Composite audit** | Single `audit` command combining explain + impact + health metrics per function — one call instead of 3-4 |\n| 🚦 | **Triage queue** | `triage` merges connectivity, hotspots, roles, and complexity into a ranked audit priority queue |\n| 🔬 | **Dataflow analysis** | Track how data moves through functions with `flows_to`, `returns`, and `mutates` edges — all 34 languages, included by default, skip with `--no-dataflow` |\n| 🧩 | **Control flow graph** | Intraprocedural CFG construction for all 34 languages — `cfg` command with text/DOT/Mermaid output, included by default, skip with `--no-cfg` |\n| 🔎 | **AST node querying** | Stored queryable AST nodes (calls, `new`, string, regex, throw, await) — `ast` command with SQL GLOB pattern matching |\n| 🧬 | **Expanded node/edge types** | `parameter`, `property`, `constant` node kinds with `parent_id` for sub-declaration queries; `contains`, `parameter_of`, `receiver` edge kinds |\n| 📊 | **Exports analysis** | `exports \u003cfile\u003e` shows all exported symbols with per-symbol consumers, re-export detection, and counts |\n| 📈 | **Interactive viewer** | `codegraph plot` generates an interactive HTML graph viewer with hierarchical/force/radial layouts, complexity overlays, and drill-down |\n| 🏷️ | **Stable JSON schema** | `normalizeSymbol` utility ensures consistent 7-field output (name, kind, file, line, endLine, role, fileHash) across all commands |\n\nSee [docs/examples](docs/examples) for real-world CLI and MCP usage examples.\n\n## 📦 Commands\n\n### Build \u0026 Watch\n\n```bash\ncodegraph build [dir] # Parse and build the dependency graph\ncodegraph build --no-incremental # Force full rebuild\ncodegraph build --dataflow # Extract data flow edges (flows_to, returns, mutates)\ncodegraph build --engine wasm # Force WASM engine (skip native)\ncodegraph watch [dir] # Watch for changes, update graph incrementally\n```\n\n### Query \u0026 Explore\n\n```bash\ncodegraph query \u003cname\u003e # Find a symbol — 
shows callers and callees\ncodegraph deps \u003cfile\u003e # File imports/exports\ncodegraph map # Top 20 most-connected files\ncodegraph map -n 50 --no-tests # Top 50, excluding test files\ncodegraph where \u003cname\u003e # Where is a symbol defined and used?\ncodegraph where --file src/db.js # List symbols, imports, exports for a file\ncodegraph stats # Graph health: nodes, edges, languages, quality score\ncodegraph roles # Node role classification (entry, core, utility, adapter, dead, leaf)\ncodegraph roles --role dead -T # Find dead code (unreferenced, non-exported symbols)\ncodegraph roles --role core --file src/ # Core symbols in src/\ncodegraph exports src/queries.js # Per-symbol consumer analysis (who calls each export)\ncodegraph children \u003cname\u003e # List parameters, properties, constants of a symbol\n```\n\n### Deep Context (designed for AI agents)\n\n```bash\ncodegraph context \u003cname\u003e # Full context: source, deps, callers, signature, tests\ncodegraph context \u003cname\u003e --depth 2 --no-tests # Include callee source 2 levels deep\ncodegraph brief \u003cfile\u003e # Token-efficient file summary: symbols, roles, risk tiers\ncodegraph audit \u003cfile\u003e --quick # Structural summary: public API, internals, data flow\ncodegraph audit \u003cfunction\u003e --quick # Function summary: signature, calls, callers, tests\n```\n\n### Impact Analysis\n\n```bash\ncodegraph impact \u003cfile\u003e # Transitive reverse dependency trace\ncodegraph query \u003cname\u003e # Function-level: callers, callees, call chain\ncodegraph query \u003cname\u003e --no-tests --depth 5\ncodegraph fn-impact \u003cname\u003e # What functions break if this one changes\ncodegraph path \u003cfrom\u003e \u003cto\u003e # Shortest path between two symbols (A calls...calls B)\ncodegraph path \u003cfrom\u003e \u003cto\u003e --reverse # Follow edges backward\ncodegraph path \u003cfrom\u003e \u003cto\u003e --depth 5 --kinds calls,imports\ncodegraph diff-impact # Impact of 
unstaged git changes\ncodegraph diff-impact --staged # Impact of staged changes\ncodegraph diff-impact HEAD~3 # Impact vs a specific ref\ncodegraph diff-impact main --format mermaid -T # Mermaid flowchart of blast radius\ncodegraph branch-compare main feature-branch # Structural diff between two refs\ncodegraph branch-compare main HEAD --no-tests # Symbols added/removed/changed vs main\ncodegraph branch-compare v2.4.0 v2.5.0 --json # JSON output for programmatic use\ncodegraph branch-compare main HEAD --format mermaid # Mermaid diagram of structural changes\n```\n\n### Co-Change Analysis\n\nAnalyze git history to find files that always change together — surfaces hidden coupling the static graph can't see. Requires a git repository.\n\n```bash\ncodegraph co-change --analyze # Scan git history and populate co-change data\ncodegraph co-change src/queries.js # Show co-change partners for a file\ncodegraph co-change # Show top co-changing file pairs globally\ncodegraph co-change --since 6m # Limit to last 6 months of history\ncodegraph co-change --min-jaccard 0.5 # Only show strong coupling (Jaccard \u003e= 0.5)\ncodegraph co-change --min-support 5 # Minimum co-commit count\ncodegraph co-change --full # Include all details\n```\n\nCo-change data also enriches `diff-impact` — historically coupled files appear in a `historicallyCoupled` section alongside the static dependency analysis.\n\n### Structure \u0026 Hotspots\n\n```bash\ncodegraph structure # Directory overview with cohesion scores\ncodegraph triage --level file # Files with extreme fan-in, fan-out, or density\ncodegraph triage --level directory --sort coupling --no-tests\n```\n\n### Code Health \u0026 Architecture\n\n```bash\ncodegraph complexity # Per-function cognitive, cyclomatic, nesting, MI\ncodegraph complexity --health -T # Full Halstead health view (volume, effort, bugs, MI)\ncodegraph complexity --sort mi -T # Sort by worst maintainability index\ncodegraph complexity --above-threshold -T # Only 
functions exceeding warn thresholds\ncodegraph communities # Leiden community detection — natural module boundaries\ncodegraph communities --drift -T # Drift analysis only — split/merge candidates\ncodegraph communities --functions # Function-level community detection\ncodegraph check # Pass/fail rule engine (exit code 1 on fail)\ncodegraph check -T # Exclude test files from rule evaluation\n```\n\n### Dataflow, CFG \u0026 AST\n\n```bash\ncodegraph dataflow \u003cname\u003e # Data flow edges for a function (flows_to, returns, mutates)\ncodegraph dataflow \u003cname\u003e --impact # Transitive data-dependent blast radius\ncodegraph cfg \u003cname\u003e # Control flow graph (text format)\ncodegraph cfg \u003cname\u003e --format dot # CFG as Graphviz DOT\ncodegraph cfg \u003cname\u003e --format mermaid # CFG as Mermaid diagram\ncodegraph ast # List all stored AST nodes\ncodegraph ast \"handleAuth\" # Search AST nodes by pattern (GLOB)\ncodegraph ast -k call # Filter by kind: call, new, string, regex, throw, await\ncodegraph ast -k throw --file src/ # Combine kind and file filters\n```\n\n\u003e **Note:** Dataflow and CFG are included by default for all 34 languages. Use `--no-dataflow` / `--no-cfg` for faster builds.\n\n\n### Audit, Triage \u0026 Batch\n\nComposite commands for risk-driven workflows and multi-agent dispatch.\n\n```bash\ncodegraph audit \u003cfile-or-function\u003e # Combined structural summary + impact + health in one report\ncodegraph audit \u003ctarget\u003e --quick # Structural summary only (skip impact and health)\ncodegraph audit src/queries.js -T # Audit all functions in a file\ncodegraph triage # Ranked audit priority queue (connectivity + hotspots + roles)\ncodegraph triage -T --limit 20 # Top 20 riskiest functions, excluding tests\ncodegraph triage --level file -T # File-level hotspot analysis\ncodegraph triage --level directory -T # Directory-level hotspot analysis\ncodegraph batch target1 target2 ... 
# Batch query multiple targets in one call\ncodegraph batch --json targets.json # Batch from a JSON file\n```\n\n### CI Validation\n\n`codegraph check` provides configurable pass/fail predicates for CI gates and state machines. Exit code 0 = pass, 1 = fail.\n\n```bash\ncodegraph check # Run manifesto rules on whole codebase\ncodegraph check --staged # Check staged changes (diff predicates)\ncodegraph check --staged --rules # Run both diff predicates AND manifesto rules\ncodegraph check --no-new-cycles # Fail if staged changes introduce cycles\ncodegraph check --max-complexity 30 # Fail if any function exceeds complexity threshold\ncodegraph check --max-blast-radius 50 # Fail if blast radius exceeds limit\ncodegraph check --no-boundary-violations # Fail on architecture boundary violations\ncodegraph check main # Check current branch vs main\n```\n\n### CODEOWNERS\n\nMap graph symbols to CODEOWNERS entries. Shows who owns each function and surfaces ownership boundaries.\n\n```bash\ncodegraph owners # Show ownership for all symbols\ncodegraph owners src/queries.js # Ownership for symbols in a specific file\ncodegraph owners --boundary # Show ownership boundaries between modules\ncodegraph owners --owner @backend # Filter by owner\n```\n\nOwnership data also enriches `diff-impact` — affected owners and suggested reviewers appear alongside the static dependency analysis.\n\n### Snapshots\n\nLightweight SQLite DB backup and restore — checkpoint before refactoring, instantly rollback without rebuilding.\n\n```bash\ncodegraph snapshot save before-refactor # Save a named snapshot\ncodegraph snapshot list # List all snapshots\ncodegraph snapshot restore before-refactor # Restore a snapshot\ncodegraph snapshot delete before-refactor # Delete a snapshot\n```\n\n### Export \u0026 Visualization\n\n```bash\ncodegraph export -f dot # Graphviz DOT format\ncodegraph export -f mermaid # Mermaid diagram\ncodegraph export -f json # JSON graph\ncodegraph export -f graphml # GraphML (XML 
standard)\ncodegraph export -f graphson # GraphSON (TinkerPop v3 / Gremlin)\ncodegraph export -f neo4j # Neo4j CSV (bulk import, separate nodes/relationships files)\ncodegraph export --functions -o graph.dot # Function-level, write to file\ncodegraph plot # Interactive HTML viewer with force/hierarchical/radial layouts\ncodegraph cycles # Detect circular dependencies\ncodegraph cycles --functions # Function-level cycles\n```\n\n### Semantic Search\n\nLocal embeddings for every function, method, and class — search by natural language. Everything runs locally using [@huggingface/transformers](https://huggingface.co/docs/transformers.js) — no API keys needed.\n\n```bash\ncodegraph embed # Build embeddings (default: nomic-v1.5)\ncodegraph embed --model nomic # Use a different model\ncodegraph search \"handle authentication\"\ncodegraph search \"parse config\" --min-score 0.4 -n 10\ncodegraph search \"parseConfig\" --mode keyword # BM25 keyword-only (exact names)\ncodegraph search \"auth flow\" --mode semantic # Embedding-only (conceptual)\ncodegraph search \"auth flow\" --mode hybrid # BM25 + semantic RRF fusion (default)\ncodegraph models # List available models\n```\n\n#### Multi-query search\n\nSeparate queries with `;` to search from multiple angles at once. Results are ranked using [Reciprocal Rank Fusion (RRF)](https://plg.uwaterloo.ca/~gvcormac/cormacksigir09-rrf.pdf) — items that rank highly across multiple queries rise to the top.\n\n```bash\ncodegraph search \"auth middleware; JWT validation\"\ncodegraph search \"parse config; read settings; load env\" -n 20\ncodegraph search \"error handling; retry logic\" --kind function\ncodegraph search \"database connection; query builder\" --rrf-k 30\n```\n\nA single trailing semicolon is ignored (falls back to single-query mode). 
The `--rrf-k` flag controls the RRF smoothing constant (default 60) — lower values give more weight to top-ranked results.\n\n#### Available Models\n\n| Flag | Model | Dimensions | Size | License | Notes |\n|---|---|---|---|---|---|\n| `minilm` | all-MiniLM-L6-v2 | 384 | ~23 MB | Apache-2.0 | Fastest, good for quick iteration |\n| `jina-small` | jina-embeddings-v2-small-en | 512 | ~33 MB | Apache-2.0 | Better quality, still small |\n| `jina-base` | jina-embeddings-v2-base-en | 768 | ~137 MB | Apache-2.0 | High quality, 8192 token context |\n| `jina-code` | jina-embeddings-v2-base-code | 768 | ~137 MB | Apache-2.0 | Best for code search, trained on code+text (requires HF token) |\n| `nomic` | nomic-embed-text-v1 | 768 | ~137 MB | Apache-2.0 | Good quality, 8192 context |\n| `nomic-v1.5` (default) | nomic-embed-text-v1.5 | 768 | ~137 MB | Apache-2.0 | **Improved nomic, Matryoshka dimensions** |\n| `bge-large` | bge-large-en-v1.5 | 1024 | ~335 MB | MIT | Best general retrieval, top MTEB scores |\n\nThe model used during `embed` is stored in the database, so `search` auto-detects it — no need to pass `--model` when searching.\n\n### Multi-Repo Registry\n\nManage a global registry of codegraph-enabled projects. 
The registry stores paths to your built graphs so the MCP server can query them when multi-repo mode is enabled.\n\n```bash\ncodegraph registry list # List all registered repos\ncodegraph registry list --json # JSON output\ncodegraph registry add \u003cdir\u003e # Register a project directory\ncodegraph registry add \u003cdir\u003e -n my-name # Custom name\ncodegraph registry remove \u003cname\u003e # Unregister\n```\n\n`codegraph build` auto-registers the project — no manual setup needed.\n\n### Common Flags\n\n| Flag | Description |\n|---|---|\n| `-d, --db \u003cpath\u003e` | Custom path to `graph.db` |\n| `-T, --no-tests` | Exclude `.test.`, `.spec.`, `__test__` files (available on most query commands including `query`, `fn-impact`, `path`, `context`, `where`, `diff-impact`, `search`, `map`, `roles`, `co-change`, `deps`, `impact`, `complexity`, `communities`, `branch-compare`, `audit`, `triage`, `check`, `dataflow`, `cfg`, `ast`, `exports`, `children`) |\n| `--depth \u003cn\u003e` | Transitive trace depth (default varies by command) |\n| `-j, --json` | Output as JSON |\n| `-v, --verbose` | Enable debug output |\n| `--engine \u003cengine\u003e` | Parser engine: `native`, `wasm`, or `auto` (default: `auto`) |\n| `-k, --kind \u003ckind\u003e` | Filter by kind: `function`, `method`, `class`, `interface`, `type`, `struct`, `enum`, `trait`, `record`, `module`, `parameter`, `property`, `constant` |\n| `-f, --file \u003cpath\u003e` | Scope to a specific file (`fn`, `context`, `where`) |\n| `--mode \u003cmode\u003e` | Search mode: `hybrid` (default), `semantic`, or `keyword` (`search`) |\n| `--ndjson` | Output as newline-delimited JSON (one object per line) |\n| `--table` | Output as auto-column aligned table |\n| `--csv` | Output as CSV (RFC 4180, nested objects flattened) |\n| `--limit \u003cn\u003e` | Limit number of results |\n| `--offset \u003cn\u003e` | Skip first N results (pagination) |\n| `--rrf-k \u003cn\u003e` | RRF smoothing constant for multi-query search 
(default 60) |\n\n## 🌐 Language Support\n\n| Language | Extensions | Imports | Exports | Call Sites | Heritage¹ | Type Inference² | Dataflow |\n|---|---|:---:|:---:|:---:|:---:|:---:|:---:|\n| ![JavaScript](https://img.shields.io/badge/-JavaScript-F7DF1E?style=flat-square\u0026logo=javascript\u0026logoColor=black) | `.js`, `.jsx`, `.mjs`, `.cjs` | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ |\n| ![TypeScript](https://img.shields.io/badge/-TypeScript-3178C6?style=flat-square\u0026logo=typescript\u0026logoColor=white) | `.ts`, `.tsx` | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ |\n| ![Python](https://img.shields.io/badge/-Python-3776AB?style=flat-square\u0026logo=python\u0026logoColor=white) | `.py`, `.pyi` | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ |\n| ![Go](https://img.shields.io/badge/-Go-00ADD8?style=flat-square\u0026logo=go\u0026logoColor=white) | `.go` | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ |\n| ![Rust](https://img.shields.io/badge/-Rust-000000?style=flat-square\u0026logo=rust\u0026logoColor=white) | `.rs` | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ |\n| ![Java](https://img.shields.io/badge/-Java-ED8B00?style=flat-square\u0026logo=openjdk\u0026logoColor=white) | `.java` | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ |\n| ![C#](https://img.shields.io/badge/-C%23-512BD4?style=flat-square\u0026logo=dotnet\u0026logoColor=white) | `.cs` | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ |\n| ![PHP](https://img.shields.io/badge/-PHP-777BB4?style=flat-square\u0026logo=php\u0026logoColor=white) | `.php`, `.phtml` | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ |\n| ![Ruby](https://img.shields.io/badge/-Ruby-CC342D?style=flat-square\u0026logo=ruby\u0026logoColor=white) | `.rb`, `.rake`, `.gemspec` | ✓ | ✓ | ✓ | ✓ | —³ | ✓ |\n| ![C](https://img.shields.io/badge/-C-A8B9CC?style=flat-square\u0026logo=c\u0026logoColor=black) | `.c`, `.h` | ✓ | ✓ | ✓ | —⁴ | —⁴ | ✓ |\n| ![C++](https://img.shields.io/badge/-C++-00599C?style=flat-square\u0026logo=cplusplus\u0026logoColor=white) | `.cpp`, `.hpp`, `.cc`, `.cxx` | ✓ | ✓ | ✓ | ✓ | — | ✓ |\n| 
![Kotlin](https://img.shields.io/badge/-Kotlin-7F52FF?style=flat-square\u0026logo=kotlin\u0026logoColor=white) | `.kt`, `.kts` | ✓ | ✓ | ✓ | ✓ | — | ✓ |\n| ![Swift](https://img.shields.io/badge/-Swift-F05138?style=flat-square\u0026logo=swift\u0026logoColor=white) | `.swift` | ✓ | ✓ | ✓ | ✓ | — | ✓ |\n| ![Scala](https://img.shields.io/badge/-Scala-DC322F?style=flat-square\u0026logo=scala\u0026logoColor=white) | `.scala`, `.sc` | ✓ | ✓ | ✓ | ✓ | — | ✓ |\n| ![Bash](https://img.shields.io/badge/-Bash-4EAA25?style=flat-square\u0026logo=gnubash\u0026logoColor=white) | `.sh`, `.bash` | ✓ | ✓ | ✓ | —⁴ | —⁴ | ✓ |\n| ![Elixir](https://img.shields.io/badge/-Elixir-4B275F?style=flat-square\u0026logo=elixir\u0026logoColor=white) | `.ex`, `.exs` | ✓ | ✓ | ✓ | — | — | ✓ |\n| ![Lua](https://img.shields.io/badge/-Lua-2C2D72?style=flat-square\u0026logo=lua\u0026logoColor=white) | `.lua` | ✓ | ✓ | ✓ | — | — | ✓ |\n| ![Dart](https://img.shields.io/badge/-Dart-0175C2?style=flat-square\u0026logo=dart\u0026logoColor=white) | `.dart` | ✓ | ✓ | ✓ | ✓ | — | ✓ |\n| ![Zig](https://img.shields.io/badge/-Zig-F7A41D?style=flat-square\u0026logo=zig\u0026logoColor=white) | `.zig` | ✓ | ✓ | ✓ | — | — | ✓ |\n| ![Haskell](https://img.shields.io/badge/-Haskell-5D4F85?style=flat-square\u0026logo=haskell\u0026logoColor=white) | `.hs` | ✓ | ✓ | ✓ | — | — | ✓ |\n| ![OCaml](https://img.shields.io/badge/-OCaml-EC6813?style=flat-square\u0026logo=ocaml\u0026logoColor=white) | `.ml`, `.mli` | ✓ | ✓ | ✓ | — | — | ✓ |\n| ![F#](https://img.shields.io/badge/-F%23-378BBA?style=flat-square\u0026logo=fsharp\u0026logoColor=white) | `.fs`, `.fsx`, `.fsi` | ✓ | ✓ | ✓ | — | — | ✓ |\n| ![Gleam](https://img.shields.io/badge/-Gleam-FFAFF3?style=flat-square\u0026logoColor=black) | `.gleam` | ✓ | ✓ | ✓ | — | — | ✓ |\n| ![Clojure](https://img.shields.io/badge/-Clojure-5881D8?style=flat-square\u0026logo=clojure\u0026logoColor=white) | `.clj`, `.cljs`, `.cljc` | ✓ | ✓ | ✓ | — | — | ✓ |\n| 
![Julia](https://img.shields.io/badge/-Julia-9558B2?style=flat-square\u0026logo=julia\u0026logoColor=white) | `.jl` | ✓ | ✓ | ✓ | — | — | ✓ |\n| ![R](https://img.shields.io/badge/-R-276DC3?style=flat-square\u0026logo=r\u0026logoColor=white) | `.r`, `.R` | ✓ | ✓ | ✓ | — | — | ✓ |\n| ![Erlang](https://img.shields.io/badge/-Erlang-A90533?style=flat-square\u0026logo=erlang\u0026logoColor=white) | `.erl`, `.hrl` | ✓ | ✓ | ✓ | — | — | ✓ |\n| ![Solidity](https://img.shields.io/badge/-Solidity-363636?style=flat-square\u0026logo=solidity\u0026logoColor=white) | `.sol` | ✓ | ✓ | ✓ | ✓ | — | ✓ |\n| ![Objective-C](https://img.shields.io/badge/-Objective--C-438EFF?style=flat-square\u0026logoColor=white) | `.m` | ✓ | ✓ | ✓ | ✓ | — | ✓ |\n| ![CUDA](https://img.shields.io/badge/-CUDA-76B900?style=flat-square\u0026logo=nvidia\u0026logoColor=white) | `.cu`, `.cuh` | ✓ | ✓ | ✓ | ✓ | — | ✓ |\n| ![Groovy](https://img.shields.io/badge/-Groovy-4298B8?style=flat-square\u0026logo=apachegroovy\u0026logoColor=white) | `.groovy`, `.gvy` | ✓ | ✓ | ✓ | ✓ | — | ✓ |\n| ![Verilog](https://img.shields.io/badge/-Verilog-848484?style=flat-square\u0026logoColor=white) | `.v`, `.sv` | ✓ | ✓ | ✓ | — | — | ✓ |\n| ![Terraform](https://img.shields.io/badge/-Terraform-844FBA?style=flat-square\u0026logo=terraform\u0026logoColor=white) | `.tf`, `.hcl` | ✓ | —³ | —³ | —³ | —³ | —³ |\n\n\u003e ¹ **Heritage** = `extends`, `implements`, `include`/`extend` (Ruby), trait `impl` (Rust), receiver methods (Go).\n\u003e ² **Type Inference** extracts a per-file type map from annotations (`const x: Router`, `MyType x`, `x: MyType`) and `new` expressions, enabling the edge resolver to connect `x.method()` → `Type.method()`.\n\u003e ³ Not applicable — Ruby is dynamically typed; Terraform/HCL is declarative (no functions, classes, or type system).\n\u003e ⁴ Not applicable — C and Bash have no class/inheritance system.\n\u003e All languages have full **parity** between the native Rust engine and the WASM fallback.\n\n## ⚙️ 
How It Works\n\n```\n┌──────────┐ ┌───────────┐ ┌───────────┐ ┌──────────┐ ┌─────────┐\n│ Source │──▶│ tree-sitter│──▶│ Extract │──▶│ Resolve │──▶│ SQLite │\n│ Files │ │ Parse │ │ Symbols │ │ Imports │ │ DB │\n└──────────┘ └───────────┘ └───────────┘ └──────────┘ └─────────┘\n │\n ▼\n ┌─────────┐\n │ Query │\n └─────────┘\n```\n\n1. **Parse** — tree-sitter parses every source file into an AST (native Rust engine or WASM fallback)\n2. **Extract** — Functions, classes, methods, interfaces, imports, exports, call sites, parameters, properties, and constants are extracted\n3. **Resolve** — Imports are resolved to actual files (handles ESM conventions, `tsconfig.json` path aliases, `baseUrl`)\n4. **Store** — Everything goes into SQLite as nodes + edges with tree-sitter node boundaries, plus structural edges (`contains`, `parameter_of`, `receiver`)\n5. **Analyze** (opt-in) — Complexity metrics, control flow graphs (`--cfg`), dataflow edges (`--dataflow`), and AST node storage\n6. **Query** — All queries run locally against the SQLite DB — typically under 100ms\n\n### Incremental Rebuilds\n\nThe graph stays current without re-parsing your entire codebase. Three-tier change detection ensures rebuilds are proportional to what changed, not the size of the project:\n\n1. **Tier 0 — Journal (O(changed)):** If `codegraph watch` was running, a change journal records exactly which files were touched. The next build reads the journal and only processes those files — zero filesystem scanning\n2. **Tier 1 — mtime+size (O(n) stats, O(changed) reads):** No journal? Codegraph stats every file and compares mtime + size against stored values. Matching files are skipped without reading a single byte\n3. **Tier 2 — Hash (O(changed) reads):** Files that fail the mtime/size check are read and MD5-hashed. Only files whose hash actually changed get re-parsed and re-inserted\n\n**Result:** change one file in a 3,000-file project and the rebuild completes in under a second. 
Put it in a commit hook, a file watcher, or let your AI agent trigger it.\n\n#### What incremental rebuilds refresh — and what they don't\n\nIncremental builds re-parse changed files and rebuild their edges, structure metrics, and role classifications. But some data is **only fully refreshed on a full rebuild:**\n\n| Data | Incremental | Full rebuild |\n|------|:-----------:|:------------:|\n| Symbols \u0026 edges for changed files | Yes | Yes |\n| Reverse-dependency cascade (importers of changed files) | Yes | Yes |\n| AST nodes, complexity, CFG, dataflow for changed files | Yes | Yes |\n| Directory-level cohesion metrics | Partial (skipped for ≤5 files) | Yes |\n| Advisory checks (orphaned embeddings, stale embeddings, unused exports) | Skipped | Yes |\n| Build metadata persistence | Skipped for ≤3 files | Yes |\n| Incremental drift detection | Skipped | Yes |\n\n**When to run a full rebuild:**\n\n```bash\ncodegraph build --no-incremental # Force full rebuild\n```\n\n- **After large refactors** (renames, moves, deleted files) — the reverse-dependency cascade handles most cases, but a full rebuild ensures nothing is stale\n- **If you suspect stale analysis data** — complexity or dataflow results for files you didn't directly edit won't update incrementally\n- **Periodically** — if you rely heavily on `complexity`, `dataflow`, `roles --role dead`, or `communities` queries, run a full rebuild weekly or after major merges\n- **After upgrading codegraph** — engine, schema, or version changes trigger an automatic full rebuild, but if you skip versions you may want to force one\n\nCodegraph auto-detects and forces a full rebuild when the engine, schema version, or codegraph version changes between builds. 
For everything else, incremental is the safe default — a full rebuild is a correctness guarantee, not a frequent necessity.\n\n\u003e **Detailed guide:** See [docs/guides/incremental-builds.md](docs/guides/incremental-builds.md) for a complete breakdown of what each build mode refreshes and recommended rebuild schedules.\n\n### Dual Engine\n\nCodegraph ships with two parsing engines:\n\n| Engine | How it works | When it's used |\n|--------|-------------|----------------|\n| **Native** (Rust) | napi-rs addon built from `crates/codegraph-core/` — parallel multi-core parsing via rayon | Auto-selected when the prebuilt binary is available |\n| **WASM** | `web-tree-sitter` with pre-built `.wasm` grammars in `grammars/` | Fallback when the native addon isn't installed |\n\nBoth engines produce identical output. Use `--engine native|wasm|auto` to control selection (default: `auto`).\n\nOn the native path, Rust handles the entire hot pipeline end-to-end:\n\n| Phase | What Rust does |\n|-------|---------------|\n| **Parse** | Parallel multi-file tree-sitter parsing via rayon (3.5× faster than WASM) |\n| **Extract** | Symbols, imports, calls, classes, type maps, AST nodes — all in one pass |\n| **Analyze** | Complexity (cognitive, cyclomatic, Halstead), CFG, and dataflow pre-computed per function during parse |\n| **Resolve** | Import resolution with 6-level priority system and confidence scoring |\n| **Edges** | Call, receiver, extends, and implements edge inference |\n| **DB writes** | All inserts (nodes, edges, AST nodes, complexity, CFG, dataflow) via rusqlite — `better-sqlite3` is lazy-loaded only for the WASM fallback path |\n\nThe Rust crate (`crates/codegraph-core/`) exposes a `NativeDatabase` napi-rs class that holds a persistent `rusqlite::Connection` for the full build lifecycle, eliminating JS↔SQLite round-trips on every operation.\n\n### Call Resolution\n\nCalls are resolved with **qualified resolution** — method calls (`obj.method()`) are distinguished from 
standalone function calls, and built-in receivers (`console`, `Math`, `JSON`, `Array`, `Promise`, etc.) are filtered out automatically. Import scope is respected: a call to `foo()` only resolves to functions that are actually imported or defined in the same file, eliminating false positives from name collisions.\n\n| Priority | Source | Confidence |\n|---|---|---|\n| 1 | **Import-aware** — `import { foo } from './bar'` → link to `bar` | `1.0` |\n| 2 | **Same-file** — definitions in the current file | `1.0` |\n| 3 | **Same directory** — definitions in sibling files (standalone calls only) | `0.7` |\n| 4 | **Same parent directory** — definitions in sibling dirs (standalone calls only) | `0.5` |\n| 5 | **Method hierarchy** — resolved through `extends`/`implements` | varies |\n\nMethod calls on unknown receivers skip global fallback entirely — `stmt.run()` will never resolve to a standalone `run` function in another file. Duplicate caller/callee edges are deduplicated automatically. Dynamic patterns like `fn.call()`, `fn.apply()`, `fn.bind()`, and `obj[\"method\"]()` are also detected on a best-effort basis.\n\nCodegraph also extracts symbols from common callback patterns: Commander `.command().action()` callbacks (as `command:build`), Express route handlers (as `route:GET /api/users`), and event emitter listeners (as `event:data`).\n\n## 📊 Performance\n\nSelf-measured on every release via CI ([build benchmarks](generated/benchmarks/BUILD-BENCHMARKS.md) | [embedding benchmarks](generated/benchmarks/EMBEDDING-BENCHMARKS.md) | [query benchmarks](generated/benchmarks/QUERY-BENCHMARKS.md) | [incremental benchmarks](generated/benchmarks/INCREMENTAL-BENCHMARKS.md) | [resolution precision/recall](tests/benchmarks/resolution/)):\n\n*Last updated: v3.9.4 (2026-04-18)*\n\n| Metric | Native | WASM |\n|---|---|---|\n| Build speed | **3.2 ms/file** | **16.3 ms/file** |\n| Query time | **29ms** | **44ms** |\n| No-op rebuild | **10ms** | **21ms** |\n| 1-file rebuild | **400ms** | 
**64ms** |\n| Query: fn-deps | **2.5ms** | **2.2ms** |\n| Query: path | **2.4ms** | **2.2ms** |\n| ~50,000 files (est.) | **~160.0s build** | **~815.0s build** |\n| Resolution precision | **90.3%** | — |\n| Resolution recall | **42.9%** | — |\n\nMetrics are normalized per file for cross-version comparability. Times above are for a full initial build — incremental rebuilds only re-parse changed files.\n\n\u003cdetails\u003e\u003csummary\u003ePer-language resolution precision/recall\u003c/summary\u003e\n\n| Language | Precision | Recall | TP | FP | FN | Edges | Dynamic |\n|----------|----------:|-------:|---:|---:|---:|------:|--------:|\n| javascript | 100.0% | 66.7% | 12 | 0 | 6 | 18 | 14/28 |\n| typescript | 93.8% | 75.0% | 15 | 1 | 5 | 20 | — |\n| bash | 100.0% | 100.0% | 12 | 0 | 0 | 12 | 0/1 |\n| c | 100.0% | 100.0% | 9 | 0 | 0 | 9 | — |\n| clojure | 80.0% | 26.7% | 4 | 1 | 11 | 15 | — |\n| cpp | 100.0% | 57.1% | 8 | 0 | 6 | 14 | — |\n| csharp | 100.0% | 52.6% | 10 | 0 | 9 | 19 | — |\n| cuda | 50.0% | 33.3% | 4 | 4 | 8 | 12 | — |\n| dart | 0.0% | 0.0% | 0 | 0 | 18 | 18 | — |\n| elixir | 0.0% | 0.0% | 0 | 0 | 15 | 15 | — |\n| erlang | 100.0% | 100.0% | 12 | 0 | 0 | 12 | — |\n| fsharp | 0.0% | 0.0% | 0 | 9 | 12 | 12 | — |\n| gleam | 100.0% | 26.7% | 4 | 0 | 11 | 15 | — |\n| go | 100.0% | 69.2% | 9 | 0 | 4 | 13 | 13/14 |\n| groovy | 100.0% | 7.7% | 1 | 0 | 12 | 13 | — |\n| haskell | 100.0% | 33.3% | 4 | 0 | 8 | 12 | — |\n| hcl | 0.0% | 0.0% | 0 | 0 | 2 | 2 | — |\n| java | 100.0% | 52.9% | 9 | 0 | 8 | 17 | — |\n| julia | 0.0% | 0.0% | 0 | 0 | 15 | 15 | — |\n| kotlin | 92.3% | 63.2% | 12 | 1 | 7 | 19 | — |\n| lua | 100.0% | 15.4% | 2 | 0 | 11 | 13 | — |\n| objc | 0.0% | 0.0% | 0 | 1 | 12 | 12 | — |\n| ocaml | 100.0% | 8.3% | 1 | 0 | 11 | 12 | — |\n| php | 100.0% | 31.6% | 6 | 0 | 13 | 19 | — |\n| python | 100.0% | 60.0% | 9 | 0 | 6 | 15 | 15/15 |\n| r | 100.0% | 100.0% | 11 | 0 | 0 | 11 | — |\n| ruby | 100.0% | 100.0% | 11 | 0 | 0 | 11 | 11/11 |\n| rust | 100.0% | 
35.7% | 5 | 0 | 9 | 14 | — |\n| scala | 100.0% | 71.4% | 5 | 0 | 2 | 7 | — |\n| solidity | 33.3% | 7.7% | 1 | 2 | 12 | 13 | — |\n| swift | 75.0% | 42.9% | 6 | 2 | 8 | 14 | 9/9 |\n| tsx | 100.0% | 100.0% | 13 | 0 | 0 | 13 | — |\n| verilog | 0.0% | 0.0% | 0 | 0 | 4 | 4 | — |\n| zig | 0.0% | 0.0% | 0 | 0 | 15 | 15 | — |\n\n**By resolution mode (all languages):**\n\n| Mode | Resolved | Expected | Recall |\n|------|--------:|---------:|-------:|\n| module-function | 16 | 106 | 15.1% |\n| receiver-typed | 17 | 104 | 16.3% |\n| static | 66 | 93 | 71.0% |\n| same-file | 48 | 86 | 55.8% |\n| interface-dispatched | 7 | 12 | 58.3% |\n| class-inheritance | 0 | 4 | 0.0% |\n| trait-dispatch | 0 | 2 | 0.0% |\n| package-function | 1 | 1 | 100.0% |\n\n\u003c/details\u003e\n\n### Lightweight Footprint\n\n\u003ca href=\"https://www.npmjs.com/package/@optave/codegraph\"\u003e\u003cimg src=\"https://img.shields.io/npm/unpacked-size/@optave/codegraph?style=flat-square\u0026label=unpacked%20size\" alt=\"npm unpacked size\" /\u003e\u003c/a\u003e\n\nOnly **3 runtime dependencies** — everything else is optional or a devDependency:\n\n| Dependency | What it does | | |\n|---|---|---|---|\n| [better-sqlite3](https://github.com/WiseLibs/better-sqlite3) | SQLite driver (WASM engine; lazy-loaded, not used for native-engine reads) | ![GitHub stars](https://img.shields.io/github/stars/WiseLibs/better-sqlite3?style=flat-square\u0026label=%E2%AD%90) | ![npm downloads](https://img.shields.io/npm/dw/better-sqlite3?style=flat-square\u0026label=%F0%9F%93%A5%2Fwk) |\n| [commander](https://github.com/tj/commander.js) | CLI argument parsing | ![GitHub stars](https://img.shields.io/github/stars/tj/commander.js?style=flat-square\u0026label=%E2%AD%90) | ![npm downloads](https://img.shields.io/npm/dw/commander?style=flat-square\u0026label=%F0%9F%93%A5%2Fwk) |\n| [web-tree-sitter](https://github.com/tree-sitter/tree-sitter) | WASM tree-sitter bindings | ![GitHub 
stars](https://img.shields.io/github/stars/tree-sitter/tree-sitter?style=flat-square\u0026label=%E2%AD%90) | ![npm downloads](https://img.shields.io/npm/dw/web-tree-sitter?style=flat-square\u0026label=%F0%9F%93%A5%2Fwk) |\n\nOptional: `@huggingface/transformers` (semantic search), `@modelcontextprotocol/sdk` (MCP server) — lazy-loaded only when needed.\n\n## 🤖 AI Agent Integration (Core)\n\n### MCP Server\n\nCodegraph is built around a [Model Context Protocol](https://modelcontextprotocol.io/) server with 30 tools (31 in multi-repo mode) — the primary way agents consume the graph:\n\n```bash\ncodegraph mcp # Single-repo mode (default) — only local project\ncodegraph mcp --multi-repo # Enable access to all registered repos\ncodegraph mcp --repos a,b # Restrict to specific repos (implies --multi-repo)\n```\n\n**Single-repo mode (default):** Tools operate only on the local `.codegraph/graph.db`. The `repo` parameter and `list_repos` tool are not exposed to the AI agent.\n\n**Multi-repo mode (`--multi-repo`):** All tools gain an optional `repo` parameter to target any registered repository, and `list_repos` becomes available. Use `--repos` to restrict which repos the agent can access.\n\n### CLAUDE.md / Agent Instructions\n\nAdd this to your project's `CLAUDE.md` to help AI agents use codegraph. Full template with all commands in the [AI Agent Guide](docs/guides/ai-agent-guide.md#claudemd-template).\n\n```markdown\n## Codegraph\n\nThis project uses codegraph for dependency analysis. The graph is at `.codegraph/graph.db`.\n\n### Before modifying code:\n1. `codegraph where \u003cname\u003e` — find where the symbol lives\n2. `codegraph audit --quick \u003ctarget\u003e` — understand the structure\n3. `codegraph context \u003cname\u003e -T` — get full context (source, deps, callers)\n4. `codegraph fn-impact \u003cname\u003e -T` — check blast radius before editing\n\n### After modifying code:\n5. 
`codegraph diff-impact --staged -T` — verify impact before committing\n\n### Other useful commands\n- `codegraph build .` — rebuild graph (incremental by default)\n- `codegraph map` — module overview · `codegraph stats` — graph health\n- `codegraph query \u003cname\u003e -T` — call chain · `codegraph path \u003cfrom\u003e \u003cto\u003e -T` — shortest path\n- `codegraph deps \u003cfile\u003e` — file deps · `codegraph exports \u003cfile\u003e -T` — export consumers\n- `codegraph audit \u003ctarget\u003e -T` — full risk report · `codegraph triage -T` — priority queue\n- `codegraph check --staged` — CI gate · `codegraph batch t1 t2 -T --json` — batch query\n- `codegraph search \"\u003cquery\u003e\"` — semantic search · `codegraph cycles` — cycle detection\n- `codegraph roles --role dead -T` — dead code · `codegraph complexity -T` — metrics\n- `codegraph dataflow \u003cname\u003e -T` — data flow · `codegraph cfg \u003cname\u003e -T` — control flow\n\n### Flags\n- `-T` — exclude test files (use by default) · `-j` — JSON output\n- `-f, --file \u003cpath\u003e` — scope to file · `-k, --kind \u003ckind\u003e` — filter kind\n```\n\n## 📋 Recommended Practices\n\nSee **[docs/guides/recommended-practices.md](docs/guides/recommended-practices.md)** for integration guides:\n\n- **Git hooks** — auto-rebuild on commit, impact checks on push, commit message enrichment\n- **CI/CD** — PR impact comments, threshold gates, graph caching\n- **AI agents** — MCP server, CLAUDE.md templates, Claude Code hooks\n- **Developer workflow** — watch mode, explore-before-you-edit, semantic search\n- **Secure credentials** — `apiKeyCommand` with 1Password, Bitwarden, Vault, macOS Keychain, `pass`\n\nFor AI-specific integration, see the **[AI Agent Guide](docs/guides/ai-agent-guide.md)** — a comprehensive reference covering the 6-step agent workflow, complete command-to-MCP mapping, Claude Code hooks, and token-saving patterns.\n\n## 🔁 CI / GitHub Actions\n\nCodegraph ships with a ready-to-use 
GitHub Actions workflow that comments impact analysis on every pull request.\n\nCopy `.github/workflows/codegraph-impact.yml` to your repo, and every PR will get a comment like:\n\n\u003e **3 functions changed** → **12 callers affected** across **7 files**\n\n## 🛠️ Configuration\n\nCreate a `.codegraphrc.json` in your project root to customize behavior:\n\n```json\n{\n \"include\": [\"src/**\", \"lib/**\"],\n \"exclude\": [\"**/*.test.js\", \"**/__mocks__/**\"],\n \"ignoreDirs\": [\"node_modules\", \".git\", \"dist\"],\n \"extensions\": [\".js\", \".ts\", \".tsx\", \".py\"],\n \"aliases\": {\n \"@/\": \"./src/\",\n \"@utils/\": \"./src/utils/\"\n },\n \"build\": {\n \"incremental\": true\n },\n \"query\": {\n \"excludeTests\": true\n }\n}\n```\n\n\u003e **Tip:** `excludeTests` can also be set at the top level as a shorthand — `{ \"excludeTests\": true }` is equivalent to nesting it under `query`. If both are present, the nested `query.excludeTests` takes precedence.\n\n### Manifesto rules\n\nConfigure pass/fail thresholds for `codegraph check` (manifesto mode):\n\n```json\n{\n \"manifesto\": {\n \"rules\": {\n \"cognitive_complexity\": { \"warn\": 15, \"fail\": 30 },\n \"cyclomatic_complexity\": { \"warn\": 10, \"fail\": 20 },\n \"nesting_depth\": { \"warn\": 4, \"fail\": 6 },\n \"maintainability_index\": { \"warn\": 40, \"fail\": 20 },\n \"halstead_bugs\": { \"warn\": 0.5, \"fail\": 1.0 }\n }\n }\n}\n```\n\nWhen any function exceeds a `fail` threshold, `codegraph check` exits with code 1 — perfect for CI gates.\n\n### LLM credentials\n\nCodegraph supports an `apiKeyCommand` field for secure credential management. Instead of storing API keys in config files or environment variables, you can shell out to a secret manager at runtime:\n\n```json\n{\n \"llm\": {\n \"provider\": \"openai\",\n \"apiKeyCommand\": \"op read op://vault/openai/api-key\"\n }\n}\n```\n\nThe command is split on whitespace and executed with `execFileSync` (no shell injection risk). 
Priority: **command output \u003e `CODEGRAPH_LLM_API_KEY` env var \u003e file config**. On failure, codegraph warns and falls back to the next source.\n\nWorks with any secret manager: 1Password CLI (`op`), Bitwarden (`bw`), `pass`, HashiCorp Vault, macOS Keychain (`security`), AWS Secrets Manager, etc.\n\n## 📖 Programmatic API\n\nCodegraph also exports a full API for use in your own tools:\n\n```js\nimport { buildGraph, queryNameData, findCycles, exportDOT, normalizeSymbol } from '@optave/codegraph';\n\n// Build the graph\nbuildGraph('/path/to/project');\n\n// Query programmatically\nconst results = queryNameData('myFunction', '/path/to/.codegraph/graph.db');\n// All query results use normalizeSymbol for a stable 7-field schema\n```\n\n```js\nimport { parseFileAuto, getActiveEngine, isNativeAvailable } from '@optave/codegraph';\n\n// Check which engine is active\nconsole.log(getActiveEngine()); // 'native' or 'wasm'\nconsole.log(isNativeAvailable()); // true if Rust addon is installed\n\n// Parse a single file (uses auto-selected engine)\nconst symbols = await parseFileAuto('/path/to/file.ts');\n```\n\n```js\nimport { searchData, multiSearchData, buildEmbeddings } from '@optave/codegraph';\n\n// Build embeddings (one-time)\nawait buildEmbeddings('/path/to/project');\n\n// Single-query search\nconst { results } = await searchData('handle auth', dbPath);\n\n// Multi-query search with RRF ranking\nconst { results: fused } = await multiSearchData(\n ['auth middleware', 'JWT validation'],\n dbPath,\n { limit: 10, minScore: 0.3 }\n);\n// Each result has: { name, kind, file, line, rrf, queryScores[] }\n```\n\n## ⚠️ Limitations\n\n- **No TypeScript type-checker integration** — type inference resolves annotations, `new` expressions, and assignment chains, but does not invoke `tsc` for overload resolution or complex generics\n- **Dynamic calls are best-effort** — complex computed property access and `eval` patterns are not resolved\n- **Python imports** — resolves relative 
imports but doesn't follow `sys.path` or virtual environment packages\n- **Dataflow analysis** — intraprocedural (single-function scope), not interprocedural\n\n## 🗺️ Roadmap\n\nSee **[ROADMAP.md](docs/roadmap/ROADMAP.md)** for the full development roadmap and **[STABILITY.md](STABILITY.md)** for the stability policy and versioning guarantees. Current plan:\n\n1. ~~**Rust Core**~~ — **Complete** (v1.3.0) — native tree-sitter parsing via napi-rs, parallel multi-core parsing, incremental re-parsing, import resolution \u0026 cycle detection in Rust\n2. ~~**Foundation Hardening**~~ — **Complete** (v1.5.0) — parser registry, complete MCP, test coverage, enhanced config, multi-repo MCP\n3. ~~**Analysis Expansion**~~ — **Complete** (v2.7.0) — complexity metrics, community detection, flow tracing, co-change, manifesto, boundary rules, check, triage, audit, batch, hybrid search\n4. ~~**Deep Analysis \u0026 Graph Enrichment**~~ — **Complete** (v3.0.0) — dataflow analysis, intraprocedural CFG, AST node storage, expanded node/edge types, interactive viewer, exports command\n5. ~~**Architectural Refactoring**~~ — **Complete** (v3.1.5) — unified AST analysis, composable MCP, domain errors, builder pipeline, graph model, qualified names, presentation layer, CLI composability\n6. ~~**Resolution Accuracy**~~ — **Complete** (v3.3.1) — type inference, receiver type tracking, dead role sub-categories, resolution benchmarks, `package.json` exports, monorepo workspace resolution\n7. ~~**TypeScript Migration**~~ — **Complete** (v3.4.0) — all 271 source files migrated from JS to TS, zero `.js` remaining\n8. ~~**Native Analysis Acceleration**~~ — **Complete** (v3.5.0) — all build phases in Rust/rusqlite, sub-100ms incremental rebuilds, better-sqlite3 lazy-loaded as fallback only\n9. ~~**Expanded Language Support**~~ — **Complete** (v3.8.0) — 23 new languages in 4 batches (11 → 34), dual-engine WASM + Rust support for all\n10. 
**Analysis Depth** — TypeScript-native resolution, inter-procedural type propagation, field-based points-to analysis\n11. **Runtime \u0026 Extensibility** — event-driven pipeline, plugin system, query caching, pagination\n12. **Quality, Security \u0026 Technical Debt** — supply-chain security (SBOM, SLSA), CI coverage gates, timer cleanup, tech debt kill list\n13. **Intelligent Embeddings** — LLM-generated descriptions, enhanced embeddings, module summaries\n14. **Natural Language Queries** — `codegraph ask` command, conversational sessions\n15. **GitHub Integration \u0026 CI** — reusable GitHub Action, LLM-enhanced PR review, SARIF output\n16. **Advanced Features** — dead code detection, monorepo support, agentic search\n\n## 🤝 Contributing\n\nContributions are welcome! See **[CONTRIBUTING.md](CONTRIBUTING.md)** for the full guide — setup, workflow, commit convention, testing, and architecture notes.\n\n```bash\ngit clone https://github.com/optave/ops-codegraph-tool.git\ncd ops-codegraph-tool\nnpm install\nnpm test\n```\n\nLooking to add a new language? Check out **[Adding a New Language](docs/contributing/adding-a-language.md)**.\n\n## 📄 License\n\n[Apache-2.0](LICENSE)\n\n---\n\n\u003cp align=\"center\"\u003e\n \u003csub\u003eBuilt with \u003ca href=\"https://tree-sitter.github.io/\"\u003etree-sitter\u003c/a\u003e and \u003ca href=\"https://github.com/WiseLibs/better-sqlite3\"\u003ebetter-sqlite3\u003c/a\u003e. Your code stays on your machine.\u003c/sub\u003e\n\u003c/p\u003e\n","funding_links":["https://github.com/sponsors/optave"],"categories":[],"sub_categories":[],"project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Foptave%2Fops-codegraph-tool","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Foptave%2Fops-codegraph-tool","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Foptave%2Fops-codegraph-tool/lists"}