https://github.com/aimasteracc/tree-sitter-analyzer

A scalable, multi-language code analysis framework based on Tree-sitter, usable both as a CLI tool and an MCP server.
https://github.com/aimasteracc/tree-sitter-analyzer
ai ai-coding llms mcp-server programing toon tree-sitter tree-sitter-analyzer vibe-coding
Last synced: 6 days ago
JSON representation
A scalable, multi-language code analysis framework based on Tree-sitter, usable both as a CLI tool and an MCP server.
Host: GitHub
URL: https://github.com/aimasteracc/tree-sitter-analyzer
Owner: aimasteracc
License: mit
Created: 2025-07-30T06:42:41.000Z (10 months ago)
Default Branch: main
Last Pushed: 2026-03-30T02:16:55.000Z (2 months ago)
Last Synced: 2026-03-30T04:27:09.272Z (2 months ago)
Topics: ai, ai-coding, llms, mcp-server, programing, toon, tree-sitter, tree-sitter-analyzer, vibe-coding
Language: Python
Homepage:
Size: 15.5 MB
Stars: 27
Watchers: 0
Forks: 6
Open Issues: 5
Metadata Files:
- Readme: README.md
- Changelog: CHANGELOG.md
- Contributing: docs/CONTRIBUTING.md
- Funding: .github/FUNDING.yml
- License: LICENSE
- Code of conduct: CODE_OF_CONDUCT.md
- Codeowners: .github/CODEOWNERS
- Security: .github/SECURITY.md
Awesome Lists containing this project

README

          # 🌳 Tree-sitter Analyzer

**English** | **[日本語](README_ja.md)** | **[简体中文](README_zh.md)**

> **The MCP code-intelligence server for AI agents — fewer tokens, fewer tool calls, 100 % local.**

> Pre-indexed AST cache + 50 MCP tools + 13 curated agent skills + TOON-compressed output.

> Beats CodeGraph on 6-repo head-to-head median (**−11 % cost vs CodeGraph's −4 %**), with a strict CLI superset.

[![PyPI](https://img.shields.io/pypi/v/tree-sitter-analyzer.svg)](https://pypi.org/project/tree-sitter-analyzer/)

[![Python Version](https://img.shields.io/badge/python-3.10%2B-blue.svg)](https://python.org)

[![License](https://img.shields.io/badge/license-MIT-green.svg)](LICENSE)

[![Tests](https://img.shields.io/badge/tests-16154%20passed-brightgreen.svg)](#-quality--testing)

[![Coverage](https://codecov.io/gh/aimasteracc/tree-sitter-analyzer/branch/main/graph/badge.svg)](https://codecov.io/gh/aimasteracc/tree-sitter-analyzer)

[![GitHub Stars](https://img.shields.io/github/stars/aimasteracc/tree-sitter-analyzer.svg?style=social)](https://github.com/aimasteracc/tree-sitter-analyzer)

---

## Get Started

One-line install for **Claude Code**:

```bash

claude mcp add tree-sitter-analyzer \

  --env TREE_SITTER_PROJECT_ROOT="$PWD" \

  -- uvx --from "tree-sitter-analyzer[mcp]" tree-sitter-analyzer-mcp

```

Restart your agent, then say: *"Set the project root to my repo and run codegraph_status."*

[Other agents (Cursor, Copilot, Cline, Continue, Claude Desktop, Roo Code) →](#-supported-agents)

---

## Why Tree-sitter Analyzer

* **Token-efficient by default.** Every MCP response uses **TOON** — a tabular JSON variant that cuts payload by ~50-70 % vs raw JSON.

* **Verdict envelopes.** Every response carries `verdict: SAFE | CAUTION | UNSAFE | INFO | WARN | ERROR | NOT_FOUND`, so orchestrators branch on outcomes without re-prompting.

* **Project health grading (A–F).** No other open-source tool grades your whole project on size / complexity / coverage / duplication / dependencies / git-hotspots in one call.

* **13 curated workflows (Skills).** Pre-baked tool subsets for "find symbol", "trace call chain", "score health", "safe-to-edit before refactor", "PR review", etc.

* **5 layers of safety.** `safe_to_edit` + `modification_guard` + constraint DSL + `change_impact` + verdict envelopes — designed so agents *know* before they touch.

* **Beats the leading competitor (CodeGraph) on multiple head-to-head benchmarks.** See below.

---

## Benchmark Results

Headless Claude Code (Haiku 4.5) asked one architecture question per repo. 3 arms: no-MCP / CodeGraph MCP / Tree-sitter Analyzer MCP. Single run per arm — indicative, not statistically settled.

| Codebase | Lang / files | Baseline | CodeGraph | **TSA** | Winner |

|---|---|---|---|---|---|

| **Gin** | Go / 99 | $0.164 | $0.094 (−43 %) | **$0.080 (−51 %)** | **TSA** ⭐ |

| **Alamofire** | Swift / 98 | $0.201 | $0.219 (+9 %) | **$0.147 (−27 %)** | **TSA** ⭐ |

| **Excalidraw** | TS / 603 | $0.204 | **$0.179 (−12 %)** | $0.212 (+4 %) | CodeGraph |

| **Django** | Py / 2 910 | $0.162 | **$0.106 (−35 %)** | $0.205 (+27 %) | CodeGraph |

| **Tokio** | Rust / 778 | **$0.214** | $0.285 (+33 %) | $0.303 (+42 %) | both lose |

| **OkHttp** | Java / 596 | **$0.169** | $0.200 (+18 %) | $0.178 (+5 %) | both lose |

| **Median Δ vs baseline** | | | **−4 %** | **−11 %** | **TSA** |

TSA wins outright on **2 of 6 repos**, has a lower **median cost saving (−11 %)**, and matches CodeGraph's reported direction on every repo where the indexer-class tools should help.

> Why the median diverges from CodeGraph's published −35 % claim: we used Haiku for cost control; they used Opus + 4-run median. See `docs/internal/CODEGRAPH_BENCHMARK_FINAL_2026-05-24.md` for raw envelopes + reproducer scripts.

---

## Key Features

### Pre-indexed code intelligence (CodeGraph parity + superset)

| Capability | TSA tool | Status |

|---|---|---|

| Symbol search (FTS5) | `codegraph_symbol_search` | parity |

| Go-to-def / find-refs / call hierarchy in one call | `codegraph_navigate` | PRIMARY entry point |

| Bulk-fetch N related symbols + relationship map | `codegraph_explore` | parity |

| Function-level blast radius + risk score | `codegraph_impact` | parity + risk score |

| Who-calls-X / what-X-calls | `codegraph_callers` / `codegraph_callees` | parity |

| Index health at-a-glance | `codegraph_status` | parity |

| Pre-built call graph cache | `codegraph_autoindex` / `codegraph_full_index` / `codegraph_incremental_sync` | parity |

| Tests affected by a change (CLI) | `--affected FILE...` | parity |

### Tree-sitter Analyzer exclusive

| Capability | TSA tool | Note |

|---|---|---|

| **Project A–F health grading** | `check_project_health` | 6 dimensions, no competitor offers this |

| **TOON output** | every tool, `output_format: "toon"` (default) | 50-70 % token saving |

| **Verdict envelopes** | every tool | `SAFE/CAUTION/UNSAFE/INFO/WARN/ERROR/NOT_FOUND` |

| **Safe-to-edit gate** | `safe_to_edit` + `modification_guard` | refuses high-risk edits before they happen |

| **Architectural constraint DSL** | `check_constraints` | "module A cannot import B" → enforced |

| **Code health (file-level)** | `check_file_health` | block/long-method/smell detection |

| **Class hierarchy** | `codegraph_class_hierarchy` | type-inheritance tree |

| **Dependency matrix** | `codegraph_dependency_matrix` | module-coupling matrix |

| **Dead code** | `codegraph_dead_code` | transitive unreachable analysis |

| **Complexity heatmap** | `codegraph_complexity_heatmap` | per-fn cyclomatic + project view |

| **AST-structural clone detection** | `codegraph_similarity` | beyond text similarity |

| **Mermaid call-graph export** | `codegraph_visualize` | paste-ready in docs |

| **PR review** | `codegraph_pr_review` | AST-diff + semantic classify + blast radius |

| **agent_summary** | every response | next-step hint baked into the envelope |

| **Synapse cross-file resolver** | internal | import-aware, beats regex guessing |

| **Temporal activation** | `symbol_lineage` | per-symbol git-modification frequency |

### Skills (13 curated workflows)

CodeGraph has zero skills. We ship 13 under `.claude/skills/tsa-*/`:

`tsa-landing`, `tsa-find`, `tsa-graph`, `tsa-structure`, `tsa-deps`, `tsa-index`, `tsa-health-watch`, `tsa-edit-safety`, `tsa-edit-then-verify`, `tsa-constraints`, `tsa-pr-review`, `tsa-refactor-queue`, `tsa-temporal`.

Each skill ships an `allowed-tools` subset + procedure recipe + decision-surface schema, so the agent doesn't have to triage 50 tools on every question.

### 248 CLI flags

Strict superset of CodeGraph's 15-command CLI. Highlights:

```bash

tree-sitter-analyzer --table full           # method/signature/complexity table

tree-sitter-analyzer --partial-read --start-line N --end-line M 

tree-sitter-analyzer --project-health             # A-F grade across the project

tree-sitter-analyzer --callers            # who-calls

tree-sitter-analyzer --codegraph-impact       # blast radius + risk

tree-sitter-analyzer --affected          # tests transitively affected

tree-sitter-analyzer --dead-code                  # transitive unreachable

tree-sitter-analyzer --check-constraints          # architectural rules

tree-sitter-analyzer --safe-to-edit         # refuse if risky

```

See [`docs/CODEMAPS/cli.md`](docs/CODEMAPS/cli.md) for the full surface.

---

## Quick Start

### 1. Install dependencies

```bash

# uv (required)

curl -LsSf https://astral.sh/uv/install.sh | sh        # macOS / Linux

powershell -ExecutionPolicy ByPass -c "irm https://astral.sh/uv/install.ps1 | iex"  # Windows

# fd + ripgrep (required for search)

brew install fd ripgrep                                # macOS

winget install sharkdp.fd BurntSushi.ripgrep.MSVC      # Windows

```

### 2. Install Tree-sitter Analyzer

```bash

uv add "tree-sitter-analyzer[all,mcp]"

```

### 3. Hook it into your agent

See **[Supported Agents](#-supported-agents)**. Most clients want this MCP server entry:

```json

{

  "mcpServers": {

    "tree-sitter-analyzer": {

      "command": "uvx",

      "args": ["--from", "tree-sitter-analyzer[mcp]", "tree-sitter-analyzer-mcp"],

      "env": { "TREE_SITTER_PROJECT_ROOT": "/absolute/path/to/your/project" }

    }

  }

}

```

After restart: *"Set the project root to my repo and call codegraph_status."*

---

## How It Works

```

Source code → tree-sitter parse → SQLite + FTS5 index (.ast-cache/index.db)

                                         ↓

        codegraph_navigate / codegraph_explore / codegraph_callers / ...

                                         ↓

                            TOON-compressed envelope

                            (verdict + agent_summary + data)

                                         ↓

                              MCP client / CLI consumer

```

The index is built lazily on first query, refreshed on file change via a content-hash diff (`codegraph_incremental_sync`). All 50 tools read from the same `.ast-cache/`, so a query and its follow-up share work.

---

## Supported Agents

📘 Claude Code (recommended)

```bash

claude mcp add tree-sitter-analyzer \

  --env TREE_SITTER_PROJECT_ROOT="$PWD" \

  -- uvx --from "tree-sitter-analyzer[mcp]" tree-sitter-analyzer-mcp

```

Verify: `claude mcp list`. The 13 `tsa-*` skills auto-discover from `.claude/skills/`.

📗 Claude Desktop

Edit `claude_desktop_config.json` (macOS: `~/Library/Application Support/Claude/`, Windows: `%APPDATA%\Claude\`, Linux: `~/.config/Claude/`):

```json

{

  "mcpServers": {

    "tree-sitter-analyzer": {

      "command": "uvx",

      "args": ["--from", "tree-sitter-analyzer[mcp]", "tree-sitter-analyzer-mcp"],

      "env": { "TREE_SITTER_PROJECT_ROOT": "/absolute/path/to/your/project" }

    }

  }

}

```

📙 GitHub Copilot (VS Code)

Create `.vscode/mcp.json` (note: `servers`, not `mcpServers`):

```json

{

  "servers": {

    "tree-sitter-analyzer": {

      "type": "stdio",

      "command": "uvx",

      "args": ["--from", "tree-sitter-analyzer[mcp]", "tree-sitter-analyzer-mcp"],

      "env": { "TREE_SITTER_PROJECT_ROOT": "${workspaceFolder}" }

    }

  }

}

```

🖱 Cursor / Cline / Continue / Roo Code

All read the same `mcpServers` schema as Claude Desktop. Cursor: **Settings → MCP**. Cline: MCP panel → Edit settings. Continue: `~/.continue/config.json` under `experimental.modelContextProtocolServers`. Roo Code: MCP panel → Edit MCP Settings.

> ⚠️ `TREE_SITTER_PROJECT_ROOT` must be **absolute**. The server enforces a security boundary against escapes via `SecurityBoundaryManager`.

---

## Supported Languages

21 language plugins; 16 fully wired into the indexer + 5 (data/markup) reachable via the single-file CLI path. The 2026-05-24 patch unblocked Swift / Kotlin / Ruby / PHP / C# that had been silently skipped for months.

| Tier | Languages |

|---|---|

| **Full index + symbol + call graph** | Python · Java · JavaScript · TypeScript · Go · Rust · C · C++ · C# · Swift · Kotlin · Ruby · PHP |

| **Single-file analysis (CLI)** | HTML · CSS · Markdown · SQL · YAML |

| **Scaffold (plugin exists, indexer wiring pending)** | bash · scala · json |

CodeGraph supports a similar set; the only popular code languages neither tool ships yet are **Dart, Vue, Svelte, Lua** (next-sprint backlog).

---

## Configuration

Mostly nothing. The defaults are designed so you can hook it into your agent and forget:

* **Output format**: TOON. Override per-call with `output_format: "json"`.

* **Project root**: `TREE_SITTER_PROJECT_ROOT` (env var, MCP) or `--project-root` (CLI).

* **Cache location**: `/.ast-cache/`. Safe to delete — auto-rebuilds.

* **Optional**: `TREE_SITTER_OUTPUT_PATH` for large-output write target.

---

## Quality & Testing

| Metric | Value |

|---|---|

| Tests passed | 16,154 ✅ |

| Coverage | [![Coverage](https://codecov.io/gh/aimasteracc/tree-sitter-analyzer/branch/main/graph/badge.svg)](https://codecov.io/gh/aimasteracc/tree-sitter-analyzer) |

| Type safety | 100 % mypy |

| Platforms | macOS · Linux · Windows |

| Pre-commit gates | bandit · mypy · pyupgrade · detect-secrets · codemap-sync · smell-ratchet |

```bash

uv run pytest -q                                # full suite

uv run python check_quality.py --new-code-only  # quality gate

```

---

## Troubleshooting

| Symptom | Fix |

|---|---|

| `unsupported language` on `.swift / .kt / .rb / .php / .cs` | Update to ≥ 1.12.x — the 5-language gap was patched in commit `50e99a8f`. |

| MCP server doesn't appear in client | `TREE_SITTER_PROJECT_ROOT` must be **absolute**; restart the client after config edit. |

| `database is locked` | Stop any other process holding `.ast-cache/index.db`; if persistent, `rm -rf .ast-cache && tree-sitter-analyzer --autoindex`. |

| Slow first call | First call builds the index. Subsequent calls are sub-second. Run `--full-index` upfront to amortise. |

| Agent picks the wrong tool | Use a `tsa-*` skill (`/tsa-graph`, `/tsa-find`, ...) — each skill restricts the visible tool set to one workflow. |

---

## Development

```bash

git clone https://github.com/aimasteracc/tree-sitter-analyzer.git

cd tree-sitter-analyzer

uv sync --extra all --extra mcp

uv run pytest -q

```

See **[`docs/CONTRIBUTING.md`](docs/CONTRIBUTING.md)** for the development guide.

---

## Contributing & License

* ⭐ A GitHub star helps surface this tool to other AI-agent users.

* 💖 [Sponsor](https://github.com/sponsors/aimasteracc) — supports continued MCP / Skills development.

* Lead sponsor: **[@o93](https://github.com/o93)**.

* MIT licensed — see [LICENSE](LICENSE).

* Release history: [CHANGELOG.md](CHANGELOG.md).
ecosyste.ms

Data

Tools

Indexes

Applications

Experiments

Awesome

https://github.com/aimasteracc/tree-sitter-analyzer

Awesome Lists containing this project

README