https://github.com/aimasteracc/tree-sitter-analyzer
A scalable, multi-language code analysis framework based on Tree-sitter, usable both as a CLI tool and an MCP server.
ai ai-coding llms mcp-server programing toon tree-sitter tree-sitter-analyzer vibe-coding
- Host: GitHub
- URL: https://github.com/aimasteracc/tree-sitter-analyzer
- Owner: aimasteracc
- License: mit
- Created: 2025-07-30T06:42:41.000Z (10 months ago)
- Default Branch: main
- Last Pushed: 2026-03-30T02:16:55.000Z (2 months ago)
- Last Synced: 2026-03-30T04:27:09.272Z (2 months ago)
- Topics: ai, ai-coding, llms, mcp-server, programing, toon, tree-sitter, tree-sitter-analyzer, vibe-coding
- Language: Python
- Homepage:
- Size: 15.5 MB
- Stars: 27
- Watchers: 0
- Forks: 6
- Open Issues: 5
Metadata Files:
- Readme: README.md
- Changelog: CHANGELOG.md
- Contributing: docs/CONTRIBUTING.md
- Funding: .github/FUNDING.yml
- License: LICENSE
- Code of conduct: CODE_OF_CONDUCT.md
- Codeowners: .github/CODEOWNERS
- Security: .github/SECURITY.md
README
# 🌳 Tree-sitter Analyzer
**English** | **[日本語](README_ja.md)** | **[简体中文](README_zh.md)**
> **The MCP code-intelligence server for AI agents: fewer tokens, fewer tool calls, 100 % local.**
> Pre-indexed AST cache + 50 MCP tools + 13 curated agent skills + TOON-compressed output.
> Beats CodeGraph on 6-repo head-to-head median (**−11 % cost vs CodeGraph's −4 %**), with a strict CLI superset.
---
## Get Started
One-line install for **Claude Code**:
```bash
claude mcp add tree-sitter-analyzer \
--env TREE_SITTER_PROJECT_ROOT="$PWD" \
-- uvx --from "tree-sitter-analyzer[mcp]" tree-sitter-analyzer-mcp
```
Restart your agent, then say: *"Set the project root to my repo and run codegraph_status."*
[Other agents (Cursor, Copilot, Cline, Continue, Claude Desktop, Roo Code) →](#-supported-agents)
---
## Why Tree-sitter Analyzer
* **Token-efficient by default.** Every MCP response uses **TOON**, a tabular JSON variant that cuts payload by ~50-70 % vs raw JSON.
* **Verdict envelopes.** Every response carries `verdict: SAFE | CAUTION | UNSAFE | INFO | WARN | ERROR | NOT_FOUND`, so orchestrators branch on outcomes without re-prompting.
* **Project health grading (A-F).** No other open-source tool grades your whole project on size / complexity / coverage / duplication / dependencies / git-hotspots in one call.
* **13 curated workflows (Skills).** Pre-baked tool subsets for "find symbol", "trace call chain", "score health", "safe-to-edit before refactor", "PR review", etc.
* **5 layers of safety.** `safe_to_edit` + `modification_guard` + constraint DSL + `change_impact` + verdict envelopes, designed so agents *know* before they touch.
* **Beats the leading competitor (CodeGraph) on multiple head-to-head benchmarks.** See below.
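
To make the verdict-envelope point concrete, here is a minimal sketch of how an orchestrator might branch on the `verdict` field. The verdict values come from the list above; the action names and the mapping from verdict to action are hypothetical policy choices for illustration, not something the server prescribes.

```python
from typing import Any

# Verdict values as listed in this README.
VERDICTS = {"SAFE", "CAUTION", "UNSAFE", "INFO", "WARN", "ERROR", "NOT_FOUND"}


def next_action(envelope: dict[str, Any]) -> str:
    """Map a response envelope to a coarse orchestrator action.

    The action names are hypothetical; adapt them to your own agent loop.
    """
    verdict = envelope.get("verdict")
    if verdict not in VERDICTS:
        return "abort"  # unknown or missing verdict: fail closed
    if verdict in {"SAFE", "INFO"}:
        return "proceed"
    if verdict in {"CAUTION", "WARN"}:
        return "proceed_with_review"
    if verdict == "NOT_FOUND":
        return "retry_with_different_query"
    return "abort"  # UNSAFE or ERROR


if __name__ == "__main__":
    sample = {"verdict": "CAUTION", "agent_summary": "3 callers affected", "data": {}}
    print(next_action(sample))  # proceed_with_review
```

Failing closed on an unrecognised verdict keeps the agent from editing code when the response shape is not what it expected.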
---
## Benchmark Results
Headless Claude Code (Haiku 4.5) asked one architecture question per repo. 3 arms: no-MCP / CodeGraph MCP / Tree-sitter Analyzer MCP. Single run per arm: indicative, not statistically settled.
| Codebase | Lang / files | Baseline | CodeGraph | **TSA** | Winner |
|---|---|---|---|---|---|
| **Gin** | Go / 99 | $0.164 | $0.094 (−43 %) | **$0.080 (−51 %)** | **TSA** |
| **Alamofire** | Swift / 98 | $0.201 | $0.219 (+9 %) | **$0.147 (−27 %)** | **TSA** |
| **Excalidraw** | TS / 603 | $0.204 | **$0.179 (−12 %)** | $0.212 (+4 %) | CodeGraph |
| **Django** | Py / 2 910 | $0.162 | **$0.106 (−35 %)** | $0.205 (+27 %) | CodeGraph |
| **Tokio** | Rust / 778 | **$0.214** | $0.285 (+33 %) | $0.303 (+42 %) | both lose |
| **OkHttp** | Java / 596 | **$0.169** | $0.200 (+18 %) | $0.178 (+5 %) | both lose |
| **Median Δ vs baseline** | | | **−4 %** | **−11 %** | **TSA** |
TSA wins outright on **2 of 6 repos** (Gin and Alamofire) and reports the better **median cost change (−11 % vs CodeGraph's −4 %)**. CodeGraph wins on Excalidraw and Django, and on Tokio and OkHttp both tools cost more than the no-MCP baseline.
> Why the median diverges from CodeGraph's published −35 % claim: we used Haiku for cost control; they used Opus + 4-run median. See `docs/internal/CODEGRAPH_BENCHMARK_FINAL_2026-05-24.md` for raw envelopes + reproducer scripts.
---
## Key Features
### Pre-indexed code intelligence (CodeGraph parity + superset)
| Capability | TSA tool | Status |
|---|---|---|
| Symbol search (FTS5) | `codegraph_symbol_search` | parity |
| Go-to-def / find-refs / call hierarchy in one call | `codegraph_navigate` | PRIMARY entry point |
| Bulk-fetch N related symbols + relationship map | `codegraph_explore` | parity |
| Function-level blast radius + risk score | `codegraph_impact` | parity + risk score |
| Who-calls-X / what-X-calls | `codegraph_callers` / `codegraph_callees` | parity |
| Index health at-a-glance | `codegraph_status` | parity |
| Pre-built call graph cache | `codegraph_autoindex` / `codegraph_full_index` / `codegraph_incremental_sync` | parity |
| Tests affected by a change (CLI) | `--affected FILE...` | parity |
### Tree-sitter Analyzer exclusive
| Capability | TSA tool | Note |
|---|---|---|
| **Project A-F health grading** | `check_project_health` | 6 dimensions, no competitor offers this |
| **TOON output** | every tool, `output_format: "toon"` (default) | 50-70 % token saving |
| **Verdict envelopes** | every tool | `SAFE/CAUTION/UNSAFE/INFO/WARN/ERROR/NOT_FOUND` |
| **Safe-to-edit gate** | `safe_to_edit` + `modification_guard` | refuses high-risk edits before they happen |
| **Architectural constraint DSL** | `check_constraints` | "module A cannot import B", enforced |
| **Code health (file-level)** | `check_file_health` | block/long-method/smell detection |
| **Class hierarchy** | `codegraph_class_hierarchy` | type-inheritance tree |
| **Dependency matrix** | `codegraph_dependency_matrix` | module-coupling matrix |
| **Dead code** | `codegraph_dead_code` | transitive unreachable analysis |
| **Complexity heatmap** | `codegraph_complexity_heatmap` | per-fn cyclomatic + project view |
| **AST-structural clone detection** | `codegraph_similarity` | beyond text similarity |
| **Mermaid call-graph export** | `codegraph_visualize` | paste-ready in docs |
| **PR review** | `codegraph_pr_review` | AST-diff + semantic classify + blast radius |
| **agent_summary** | every response | next-step hint baked into the envelope |
| **Synapse cross-file resolver** | internal | import-aware, beats regex guessing |
| **Temporal activation** | `symbol_lineage` | per-symbol git-modification frequency |
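
As an illustration of what a rule like "module A cannot import B" means in practice, the sketch below checks one Python source string against a set of forbidden packages. It uses Python's standard-library `ast` module as a stand-in; it is not the project's constraint DSL, and it does not use Tree-sitter. The function name `forbidden_imports` and the sample package names are hypothetical.

```python
import ast


def forbidden_imports(source: str, forbidden: set[str]) -> list[str]:
    """Return imported module names in `source` that fall under a forbidden package."""
    hits = []
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.Import):
            names = [alias.name for alias in node.names]
        elif isinstance(node, ast.ImportFrom) and node.module and node.level == 0:
            names = [node.module]  # absolute "from x import y" only
        else:
            continue
        for name in names:
            # Match the package itself or any of its submodules.
            if any(name == pkg or name.startswith(pkg + ".") for pkg in forbidden):
                hits.append(name)
    return hits


if __name__ == "__main__":
    # Hypothetical rule: UI code must not import the database layer.
    ui_code = "import app.db.models\nfrom app.ui import widgets\nimport json\n"
    print(forbidden_imports(ui_code, {"app.db"}))  # ['app.db.models']
```

Relative imports are skipped here to keep the sketch short; a real checker would resolve them against the importing module's package first.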
### Skills (13 curated workflows)
CodeGraph has zero skills. We ship 13 under `.claude/skills/tsa-*/`:
`tsa-landing`, `tsa-find`, `tsa-graph`, `tsa-structure`, `tsa-deps`, `tsa-index`, `tsa-health-watch`, `tsa-edit-safety`, `tsa-edit-then-verify`, `tsa-constraints`, `tsa-pr-review`, `tsa-refactor-queue`, `tsa-temporal`.
Each skill ships an `allowed-tools` subset + procedure recipe + decision-surface schema, so the agent doesn't have to triage 50 tools on every question.
### 248 CLI flags
Strict superset of CodeGraph's 15-command CLI. Highlights:
```bash
tree-sitter-analyzer --table full # method/signature/complexity table
tree-sitter-analyzer --partial-read --start-line N --end-line M
tree-sitter-analyzer --project-health # A-F grade across the project
tree-sitter-analyzer --callers # who-calls
tree-sitter-analyzer --codegraph-impact # blast radius + risk
tree-sitter-analyzer --affected # tests transitively affected
tree-sitter-analyzer --dead-code # transitive unreachable
tree-sitter-analyzer --check-constraints # architectural rules
tree-sitter-analyzer --safe-to-edit # refuse if risky
```
See [`docs/CODEMAPS/cli.md`](docs/CODEMAPS/cli.md) for the full surface.
---
## Quick Start
### 1. Install dependencies
```bash
# uv (required)
curl -LsSf https://astral.sh/uv/install.sh | sh # macOS / Linux
powershell -ExecutionPolicy ByPass -c "irm https://astral.sh/uv/install.ps1 | iex" # Windows
# fd + ripgrep (required for search)
brew install fd ripgrep # macOS
winget install sharkdp.fd BurntSushi.ripgrep.MSVC # Windows
```
### 2. Install Tree-sitter Analyzer
```bash
uv add "tree-sitter-analyzer[all,mcp]"
```
### 3. Hook it into your agent
See **[Supported Agents](#-supported-agents)**. Most clients want this MCP server entry:
```json
{
"mcpServers": {
"tree-sitter-analyzer": {
"command": "uvx",
"args": ["--from", "tree-sitter-analyzer[mcp]", "tree-sitter-analyzer-mcp"],
"env": { "TREE_SITTER_PROJECT_ROOT": "/absolute/path/to/your/project" }
}
}
}
```
After restart: *"Set the project root to my repo and call codegraph_status."*
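
If you generate this entry from a script instead of editing it by hand, a small helper can reject a relative project root before the client ever sees it. This is a sketch: the helper name `mcp_entry` is hypothetical, while the keys and arguments mirror the JSON above.

```python
import json
from pathlib import Path


def mcp_entry(project_root: str) -> dict:
    """Build the `mcpServers` entry shown above for an absolute project root."""
    root = Path(project_root).expanduser()
    if not root.is_absolute():
        raise ValueError(f"project root must be absolute: {project_root!r}")
    return {
        "mcpServers": {
            "tree-sitter-analyzer": {
                "command": "uvx",
                "args": ["--from", "tree-sitter-analyzer[mcp]", "tree-sitter-analyzer-mcp"],
                "env": {"TREE_SITTER_PROJECT_ROOT": str(root)},
            }
        }
    }


if __name__ == "__main__":
    # Print an entry for the current directory; paste it into the client config.
    print(json.dumps(mcp_entry(str(Path.cwd())), indent=2))
```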
---
## How It Works
```
Source code → tree-sitter parse → SQLite + FTS5 index (.ast-cache/index.db)
                    ↓
codegraph_navigate / codegraph_explore / codegraph_callers / ...
                    ↓
TOON-compressed envelope
(verdict + agent_summary + data)
                    ↓
MCP client / CLI consumer
```
The index is built lazily on the first query and refreshed on file change via a content-hash diff (`codegraph_incremental_sync`). All 50 tools read from the same `.ast-cache/`, so a query and its follow-up share work.
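
The idea behind a content-hash diff can be sketched in a few lines: fingerprint each file, compare against the fingerprints stored at the last index, and re-parse only what differs. The code below is an illustration under assumptions, not the project's implementation; the function names, the choice of SHA-256, and the in-memory `known` mapping (standing in for the SQLite index) are all hypothetical.

```python
import hashlib
import tempfile
from pathlib import Path


def file_digest(path: Path) -> str:
    """SHA-256 of a file's bytes, used here as the content fingerprint."""
    return hashlib.sha256(path.read_bytes()).hexdigest()


def diff_against_index(root: Path, known: dict[str, str], pattern: str = "*.py"):
    """Compare files under `root` with previously stored digests.

    Returns (stale, deleted): paths that are new or changed, and paths
    that were indexed before but no longer exist.
    """
    current = {
        p.relative_to(root).as_posix(): file_digest(p)
        for p in root.rglob(pattern)
        if p.is_file()
    }
    stale = sorted(rel for rel, digest in current.items() if known.get(rel) != digest)
    deleted = sorted(rel for rel in known if rel not in current)
    return stale, deleted


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        root = Path(tmp)
        (root / "a.py").write_text("x = 1\n")
        (root / "b.py").write_text("y = 2\n")
        known = {"a.py": file_digest(root / "a.py")}  # pretend only a.py was indexed
        print(diff_against_index(root, known))  # (['b.py'], [])
```

Hashing content rather than comparing modification times means a `git checkout` that touches timestamps without changing bytes does not trigger a re-parse.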
---
## Supported Agents
### Claude Code (recommended)
```bash
claude mcp add tree-sitter-analyzer \
--env TREE_SITTER_PROJECT_ROOT="$PWD" \
-- uvx --from "tree-sitter-analyzer[mcp]" tree-sitter-analyzer-mcp
```
Verify: `claude mcp list`. The 13 `tsa-*` skills auto-discover from `.claude/skills/`.
### Claude Desktop
Edit `claude_desktop_config.json` (macOS: `~/Library/Application Support/Claude/`, Windows: `%APPDATA%\Claude\`, Linux: `~/.config/Claude/`):
```json
{
"mcpServers": {
"tree-sitter-analyzer": {
"command": "uvx",
"args": ["--from", "tree-sitter-analyzer[mcp]", "tree-sitter-analyzer-mcp"],
"env": { "TREE_SITTER_PROJECT_ROOT": "/absolute/path/to/your/project" }
}
}
}
```
### GitHub Copilot (VS Code)
Create `.vscode/mcp.json` (note: `servers`, not `mcpServers`):
```json
{
"servers": {
"tree-sitter-analyzer": {
"type": "stdio",
"command": "uvx",
"args": ["--from", "tree-sitter-analyzer[mcp]", "tree-sitter-analyzer-mcp"],
"env": { "TREE_SITTER_PROJECT_ROOT": "${workspaceFolder}" }
}
}
}
```
### Cursor / Cline / Continue / Roo Code
All read the same `mcpServers` schema as Claude Desktop. Cursor: **Settings → MCP**. Cline: MCP panel → Edit settings. Continue: `~/.continue/config.json` under `experimental.modelContextProtocolServers`. Roo Code: MCP panel → Edit MCP Settings.
> ⚠️ `TREE_SITTER_PROJECT_ROOT` must be **absolute**. The server's `SecurityBoundaryManager` enforces a security boundary that blocks paths escaping the project root.
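
For readers wondering what such a boundary check involves, here is a minimal sketch of path containment with `pathlib`. It is not the logic of `SecurityBoundaryManager`, whose internals are not described here; the function name `resolve_inside` is hypothetical.

```python
from pathlib import Path


def resolve_inside(root: str, requested: str) -> Path:
    """Resolve `requested` against `root` and refuse anything outside it."""
    base = Path(root)
    if not base.is_absolute():
        raise ValueError("project root must be an absolute path")
    base = base.resolve()
    # resolve() collapses ".." segments and follows symlinks before the check.
    target = (base / requested).resolve()
    if target != base and base not in target.parents:
        raise PermissionError(f"{requested!r} escapes the project root")
    return target


if __name__ == "__main__":
    root = str(Path.cwd() / "proj")  # hypothetical project directory
    print(resolve_inside(root, "src/app.py"))
    try:
        resolve_inside(root, "../outside.txt")
    except PermissionError as exc:
        print("blocked:", exc)
```

Resolving before comparing matters: a plain string-prefix test would accept `proj/../outside.txt` and would also confuse `proj` with a sibling directory named `proj-old`.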
---
## Supported Languages
21 language plugins: 13 fully wired into the indexer, 5 (data/markup) reachable via the single-file CLI path, and 3 scaffolded plugins whose indexer wiring is pending. The 2026-05-24 patch unblocked Swift / Kotlin / Ruby / PHP / C#, which had been silently skipped for months.
| Tier | Languages |
|---|---|
| **Full index + symbol + call graph** | Python · Java · JavaScript · TypeScript · Go · Rust · C · C++ · C# · Swift · Kotlin · Ruby · PHP |
| **Single-file analysis (CLI)** | HTML · CSS · Markdown · SQL · YAML |
| **Scaffold (plugin exists, indexer wiring pending)** | bash · scala · json |
CodeGraph supports a similar set; the only popular code languages neither tool ships yet are **Dart, Vue, Svelte, Lua** (next-sprint backlog).
---
## Configuration
Mostly nothing. The defaults are designed so you can hook it into your agent and forget:
* **Output format**: TOON. Override per-call with `output_format: "json"`.
* **Project root**: `TREE_SITTER_PROJECT_ROOT` (env var, MCP) or `--project-root` (CLI).
* **Cache location**: `/.ast-cache/`. Safe to delete; it rebuilds automatically.
* **Optional**: `TREE_SITTER_OUTPUT_PATH` for large-output write target.
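
To see why a tabular default output tends to be smaller than JSON, consider the simplified encoder below. It is only an illustration of the general idea (state the field names once, then emit one row per record); it is not this project's encoder and does not implement the full TOON format. The function name `to_tabular` and the sample records are hypothetical.

```python
import json


def to_tabular(name: str, rows: list[dict]) -> str:
    """Encode uniform dict rows as one header line plus comma-separated rows.

    Simplified on purpose: no quoting or escaping, so values must not
    contain commas or newlines.
    """
    if not rows:
        return name + "[0]:"
    fields = list(rows[0])
    if any(list(row) != fields for row in rows):
        raise ValueError("all rows must share the same keys in the same order")
    header = name + "[" + str(len(rows)) + "]{" + ",".join(fields) + "}:"
    lines = ["  " + ",".join(str(row[f]) for f in fields) for row in rows]
    return "\n".join([header, *lines])


if __name__ == "__main__":
    symbols = [
        {"name": "parse", "line": 10, "kind": "function"},
        {"name": "Lexer", "line": 42, "kind": "class"},
    ]
    encoded = to_tabular("symbols", symbols)
    print(encoded)
    # Character counts are only a rough proxy for token counts.
    print(len(encoded), "chars vs", len(json.dumps(symbols)), "chars as JSON")
```

The saving grows with the number of rows, because JSON repeats every key in every record while the tabular form states them once.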
---
## Quality & Testing
| Metric | Value |
|---|---|
| Tests passed | 16,154 ✅ |
| Coverage | [Codecov report](https://codecov.io/gh/aimasteracc/tree-sitter-analyzer) |
| Type safety | 100 % mypy |
| Platforms | macOS · Linux · Windows |
| Pre-commit gates | bandit · mypy · pyupgrade · detect-secrets · codemap-sync · smell-ratchet |
```bash
uv run pytest -q # full suite
uv run python check_quality.py --new-code-only # quality gate
```
---
## Troubleshooting
| Symptom | Fix |
|---|---|
| `unsupported language` on `.swift / .kt / .rb / .php / .cs` | Update to ≥ 1.12.x; the 5-language gap was patched in commit `50e99a8f`. |
| MCP server doesn't appear in client | `TREE_SITTER_PROJECT_ROOT` must be **absolute**; restart the client after config edit. |
| `database is locked` | Stop any other process holding `.ast-cache/index.db`; if persistent, `rm -rf .ast-cache && tree-sitter-analyzer --autoindex`. |
| Slow first call | First call builds the index. Subsequent calls are sub-second. Run `--full-index` upfront to amortise. |
| Agent picks the wrong tool | Use a `tsa-*` skill (`/tsa-graph`, `/tsa-find`, ...); each skill restricts the visible tool set to one workflow. |
---
## Development
```bash
git clone https://github.com/aimasteracc/tree-sitter-analyzer.git
cd tree-sitter-analyzer
uv sync --extra all --extra mcp
uv run pytest -q
```
See **[`docs/CONTRIBUTING.md`](docs/CONTRIBUTING.md)** for the development guide.
---
## Contributing & License
* ⭐ A GitHub star helps surface this tool to other AI-agent users.
* [Sponsor](https://github.com/sponsors/aimasteracc): supports continued MCP / Skills development.
* Lead sponsor: **[@o93](https://github.com/o93)**.
* MIT licensed; see [LICENSE](LICENSE).
* Release history: [CHANGELOG.md](CHANGELOG.md).