https://github.com/ralscha/tree-sitter-mcp
- Host: GitHub
- URL: https://github.com/ralscha/tree-sitter-mcp
- Owner: ralscha
- License: mit
- Created: 2026-05-31T05:11:16.000Z (26 days ago)
- Default Branch: main
- Last Pushed: 2026-06-06T09:28:15.000Z (20 days ago)
- Last Synced: 2026-06-06T11:13:36.910Z (20 days ago)
- Language: Go
- Size: 62.5 KB
- Stars: 0
- Watchers: 0
- Forks: 0
- Open Issues: 0
Metadata Files:
- Readme: README.md
- License: LICENSE
# tree-sitter-mcp
An [MCP (Model Context Protocol)](https://modelcontextprotocol.io/) server that gives AI assistants structured access to codebases via [tree-sitter](https://tree-sitter.github.io/). It can parse source files into ASTs, extract symbols, run S-expression queries, find structurally similar code, and analyze project structure, all through a standardized MCP interface.
## Features
- **20+ MCP tools** covering file ops, AST inspection, symbol extraction, text/regex search, tree-sitter queries, complexity analysis, and more
- **Bundled parsers** for C, C++, Go, HTML, Java, JavaScript, JSON, PHP, Python, Ruby, and Rust
- **Extension detection** for additional file types in project summaries and file filtering
- **Project registry** to register multiple project directories and scope operations to each
- **AST as JSON** with full or depth-limited abstract syntax trees
- **Symbol extraction** using built-in query templates
- **S-expression queries** with direct tree-sitter query execution
- **Query builder** for combining query templates and adapting queries across languages
- **Structural search** using AST fingerprinting and Jaccard similarity
- **Parse tree caching** with configurable in-memory cache size and TTL
- **Pre-parsing** via `--pre-parse` to warm the parse cache at startup
- **YAML configuration** for cache size, file security limits, excluded dirs, and more
- **Diagnostics** with `diagnose_config` for troubleshooting YAML config loading
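To make the structural search feature more concrete, here is a minimal sketch of Jaccard similarity over AST fingerprints. How the server actually builds its fingerprints is not described in this README; the sketch assumes a fingerprint is the set of n-grams of node types from a pre-order walk, and the names `fingerprint` and `jaccard` are hypothetical.

```go
package main

import "fmt"

// fingerprint returns the set of n-grams of node types.
// Assumption: nodeTypes is a pre-order walk of an AST, flattened to type names.
func fingerprint(nodeTypes []string, n int) map[string]struct{} {
	set := make(map[string]struct{})
	for i := 0; i+n <= len(nodeTypes); i++ {
		key := ""
		for _, t := range nodeTypes[i : i+n] {
			key += t + "/"
		}
		set[key] = struct{}{}
	}
	return set
}

// jaccard returns |A ∩ B| / |A ∪ B|. Two empty sets count as identical.
func jaccard(a, b map[string]struct{}) float64 {
	if len(a) == 0 && len(b) == 0 {
		return 1
	}
	inter := 0
	for k := range a {
		if _, ok := b[k]; ok {
			inter++
		}
	}
	union := len(a) + len(b) - inter
	return float64(inter) / float64(union)
}

func main() {
	x := []string{"function_declaration", "parameter_list", "block", "return_statement"}
	y := []string{"function_declaration", "parameter_list", "block", "if_statement"}
	// The two walks share 2 of 4 distinct bigrams.
	fmt.Printf("%.2f\n", jaccard(fingerprint(x, 2), fingerprint(y, 2))) // prints 0.50
}
```

A score of 1 means the two fingerprints are identical and 0 means they share nothing, so a similarity threshold between those bounds trades recall for precision.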
## Installation
Build from source with Go and Task:
```sh
task build
```
The binary is written to `bin/tree-sitter-mcp` (`bin/tree-sitter-mcp.exe` on Windows). The build requires CGO to compile the bundled tree-sitter parsers, so a C compiler must be installed and configured.
## Usage
### Command-line Flags
```text
tree-sitter-mcp [flags]

Flags:
  --config string      Path to YAML configuration file
  --debug              Enable debug logging
  --disable-cache      Disable parse tree caching
  --pre-parse string   Pre-parse all source files in a directory at startup
  --transport string   MCP transport: stdio or sse (default "stdio")
  --http-addr string   HTTP listen address when using SSE (default ":8080")
  --sse-path string    SSE endpoint path when using SSE (default "/sse")
  --version            Show version and exit
```
### Running as an MCP Server
By default, the server communicates over stdio. Configure your MCP client to launch it:
```json
{
  "mcpServers": {
    "tree-sitter": {
      "command": "/path/to/bin/tree-sitter-mcp",
      "args": ["--config", "/path/to/config.yaml"]
    }
  }
}
```
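Most MCP clients handle the protocol for you, but it can help to see what travels over stdio. The sketch below builds the JSON-RPC 2.0 `tools/list` request that a client sends, after the MCP `initialize` handshake, to discover the server's tools. The `request` type and `listToolsRequest` helper are hypothetical names for illustration; only the envelope fields and the method name come from the MCP specification.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// request is a minimal JSON-RPC 2.0 request envelope as used by MCP.
type request struct {
	JSONRPC string `json:"jsonrpc"`
	ID      int    `json:"id"`
	Method  string `json:"method"`
	Params  any    `json:"params,omitempty"` // omitted when nil
}

// listToolsRequest encodes a tools/list request with the given id.
func listToolsRequest(id int) ([]byte, error) {
	return json.Marshal(request{JSONRPC: "2.0", ID: id, Method: "tools/list"})
}

func main() {
	b, err := listToolsRequest(2)
	if err != nil {
		fmt.Println("encode failed:", err)
		return
	}
	// Over stdio, each message is written as a single line of JSON.
	fmt.Println(string(b)) // prints {"jsonrpc":"2.0","id":2,"method":"tools/list"}
}
```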
To serve MCP over SSE instead, run the server as an HTTP process:
```sh
tree-sitter-mcp --transport sse --http-addr :8080 --sse-path /sse
```
Then configure an SSE-capable MCP client to connect to:
```text
http://localhost:8080/sse
```
The same settings can be provided through the `MCP_TS_TRANSPORT`, `MCP_TS_HTTP_ADDR`, and `MCP_TS_SSE_PATH` environment variables.
### Pre-parsing a Project
Use `--pre-parse` to walk a directory and parse every source file that has a bundled parser into the cache before the MCP server starts accepting requests. Later `get_ast`, `run_query`, and `get_symbols` calls on those files then hit a warm cache instead of paying the parse cost on first use.
```sh
tree-sitter-mcp --debug --pre-parse /path/to/project
```
Hidden files/directories (`.git`, `.vscode`, etc.) and configured `excluded_dirs` are skipped. The server logs a summary after pre-parsing completes:
```text
Pre-parsing project at /path/to/project ...
Pre-parse complete: 142 files scanned, 130 parsed, 10 skipped, 2 errors in 2.3s
go: 85 files
python: 30 files
Starting tree-sitter MCP server (cache: true, max_file_size: 5MB, max_depth: 5)
```
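The directory walk described above can be sketched with the standard library. This is an illustration under assumptions, not the server's implementation: `shouldSkipDir` and `collectSources` are hypothetical helpers, and the extension set stands in for "has a bundled parser".

```go
package main

import (
	"fmt"
	"io/fs"
	"path/filepath"
	"strings"
)

// shouldSkipDir reports whether a directory is hidden or explicitly excluded.
func shouldSkipDir(name string, excluded map[string]bool) bool {
	return strings.HasPrefix(name, ".") || excluded[name]
}

// collectSources walks root and returns non-hidden files whose extension is in exts.
func collectSources(root string, excluded, exts map[string]bool) ([]string, error) {
	var files []string
	err := filepath.WalkDir(root, func(path string, d fs.DirEntry, err error) error {
		if err != nil {
			return err
		}
		name := d.Name()
		if d.IsDir() {
			// Never skip the root itself, even if it is "." or hidden.
			if path != root && shouldSkipDir(name, excluded) {
				return filepath.SkipDir
			}
			return nil
		}
		if strings.HasPrefix(name, ".") {
			return nil // hidden file
		}
		if exts[strings.ToLower(filepath.Ext(name))] {
			files = append(files, path)
		}
		return nil
	})
	return files, err
}

func main() {
	excluded := map[string]bool{"node_modules": true, "venv": true}
	exts := map[string]bool{".go": true, ".py": true}
	files, err := collectSources(".", excluded, exts)
	if err != nil {
		fmt.Println("walk failed:", err)
		return
	}
	fmt.Printf("%d candidate files\n", len(files))
}
```

Returning `filepath.SkipDir` prunes the whole subtree, so large excluded directories such as `node_modules` are never descended into.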
## Configuration
Create a YAML config file and pass its path with `--config`. Defaults are used when no config file is supplied.
```yaml
cache:
  enabled: true
  max_size_mb: 100
  ttl_seconds: 300
security:
  max_file_size_mb: 5
  allowed_extensions: []
  excluded_dirs:
    - .git
    - node_modules
    - __pycache__
    - .venv
    - venv
    - .tox
language:
  default_max_depth: 5
  preferred_languages:
    - go
    - python
log_level: INFO
max_results_default: 100
```
Environment variable overrides currently supported:
- `MCP_TS_LOG_LEVEL`
- `MCP_TS_CACHE_MAX_SIZE_MB`
- `MCP_TS_TRANSPORT`
- `MCP_TS_HTTP_ADDR`
- `MCP_TS_SSE_PATH`
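The override pattern can be sketched as "use the environment variable if it is set and valid, otherwise keep the configured value". The helpers `envString` and `envInt` below are hypothetical, and falling back silently on an unparsable integer is an assumption; the server may instead report an error.

```go
package main

import (
	"fmt"
	"os"
	"strconv"
)

// envString returns the value of key if it is set and non-empty, else fallback.
func envString(key, fallback string) string {
	if v, ok := os.LookupEnv(key); ok && v != "" {
		return v
	}
	return fallback
}

// envInt returns key parsed as an integer, or fallback if unset or invalid.
func envInt(key string, fallback int) int {
	v, ok := os.LookupEnv(key)
	if !ok {
		return fallback
	}
	n, err := strconv.Atoi(v)
	if err != nil {
		return fallback
	}
	return n
}

func main() {
	// The fallbacks here mirror the values in the YAML example.
	logLevel := envString("MCP_TS_LOG_LEVEL", "INFO")
	cacheMB := envInt("MCP_TS_CACHE_MAX_SIZE_MB", 100)
	fmt.Println(logLevel, cacheMB)
}
```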
## MCP Tools
### Project Management
| Tool | Description |
|------|-------------|
| `register_project` | Register a project directory for code exploration |
| `list_projects` | List all registered projects |
| `remove_project` | Remove a registered project |
| `analyze_project` | Analyze project structure: file counts, languages, top-level files |
### File Operations
| Tool | Description |
|------|-------------|
| `list_files` | List files in a project, filtered by basename/path glob, depth, and extensions |
| `get_file` | Get file content with optional line range limits |
| `get_file_metadata` | Get file metadata (size, modification time, language) |
### AST & Parsing
| Tool | Description |
|------|-------------|
| `get_ast` | Get the full AST for a file as nested JSON |
| `get_node_at_position` | Find the AST node at a specific row/column |
| `list_languages` | List available tree-sitter languages |
| `check_language` | Check if a language parser is available |
### Symbols
| Tool | Description |
|------|-------------|
| `get_symbols` | Extract symbols from a file |
| `find_usage` | Find usages of a symbol/identifier across project files |
| `get_dependencies` | Find the dependencies/imports/includes of a file |
### Search
| Tool | Description |
|------|-------------|
| `find_text` | Search for text/regex in project files with file-pattern and context-line support |
| `find_similar_code` | Find structurally similar code using AST fingerprinting |
| `run_query` | Run a raw tree-sitter S-expression query on project files |
### Queries
| Tool | Description |
|------|-------------|
| `get_query_template` | Get a predefined tree-sitter query template |
| `list_query_templates` | List available tree-sitter query templates |
| `build_query` | Combine multiple templates/patterns into a compound query |
| `adapt_query` | Adapt a query from one language to another by translating node types |
| `get_node_types` | Get descriptions of common AST node types for a language |
### Analysis
| Tool | Description |
|------|-------------|
| `analyze_complexity` | Analyze line count, function count, and average function length |
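As a rough illustration of these three metrics, the sketch below computes them for Go source. It uses the standard library's `go/parser` as a stand-in for tree-sitter, so it only handles Go; the `complexity` type and `analyze` function are hypothetical. Counting only top-level function and method declarations, not function literals, is an assumption.

```go
package main

import (
	"fmt"
	"go/ast"
	"go/parser"
	"go/token"
	"strings"
)

type complexity struct {
	Lines      int
	Functions  int
	AvgFuncLen float64
}

// analyze reports line count, function count, and average function length in lines.
func analyze(src string) (complexity, error) {
	fset := token.NewFileSet()
	file, err := parser.ParseFile(fset, "input.go", src, 0)
	if err != nil {
		return complexity{}, err
	}
	c := complexity{Lines: strings.Count(src, "\n")}
	if len(src) > 0 && !strings.HasSuffix(src, "\n") {
		c.Lines++ // final line without a trailing newline
	}
	total := 0
	ast.Inspect(file, func(n ast.Node) bool {
		if fn, ok := n.(*ast.FuncDecl); ok {
			c.Functions++
			start := fset.Position(fn.Pos()).Line
			end := fset.Position(fn.End()).Line
			total += end - start + 1
		}
		return true
	})
	if c.Functions > 0 {
		c.AvgFuncLen = float64(total) / float64(c.Functions)
	}
	return c, nil
}

func main() {
	src := "package p\n\nfunc a() {\n\treturn\n}\n\nfunc b() {}\n"
	c, err := analyze(src)
	if err != nil {
		fmt.Println("parse failed:", err)
		return
	}
	fmt.Printf("lines=%d functions=%d avg=%.1f\n", c.Lines, c.Functions, c.AvgFuncLen)
}
```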
### Utilities
| Tool | Description |
|------|-------------|
| `configure` | Dynamically reconfigure server settings at runtime |
| `clear_cache` | Clear the parse tree cache, optionally scoped to project/file |
| `diagnose_config` | Diagnose YAML configuration loading issues |
## Language Support
Bundled tree-sitter parsers are available for AST, query, symbol, dependency, complexity, and similarity operations:
| Language | Extensions |
|----------|-----------|
| C | `.c`, `.h` |
| C++ | `.cpp`, `.cc`, `.hpp` |
| Go | `.go` |
| HTML | `.html` |
| Java | `.java` |
| JavaScript | `.js`, `.jsx` |
| JSON | `.json` |
| PHP | `.php` |
| Python | `.py` |
| Ruby | `.rb` |
| Rust | `.rs` |
The server also recognizes these extensions for detection and project summaries, but parser-backed tools require a bundled or manually registered parser:
| Language | Extensions |
|----------|-----------|
| C# | `.cs` |
| TypeScript | `.ts`, `.tsx` |
| Kotlin | `.kt` |
| Swift | `.swift` |
| Dart | `.dart` |
| Scala | `.scala` |
| Lua | `.lua` |
| Haskell | `.hs` |
| OCaml | `.ml` |
| Elixir | `.ex`, `.exs` |
| Clojure | `.clj` |
| Elm | `.elm` |
| Bash | `.sh` |
| SQL | `.sql` |
| YAML | `.yaml`, `.yml` |
| CSS | `.css` |
| SCSS | `.scss`, `.sass` |
| Markdown | `.md` |
| Protobuf | `.proto` |
| XML | `.xml` |
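The distinction between the two tables can be sketched as a lookup from file extension to a language name plus a flag saying whether a parser is bundled. The `langInfo` type, the `byExt` map, and `detect` are hypothetical, only a few rows from the tables are included, and case-insensitive matching is an assumption.

```go
package main

import (
	"fmt"
	"path/filepath"
	"strings"
)

// langInfo describes a detected language and whether a parser ships with the server.
type langInfo struct {
	Name    string
	Bundled bool
}

// byExt holds a small subset of the extensions listed in the tables above.
var byExt = map[string]langInfo{
	".go":   {Name: "Go", Bundled: true},
	".py":   {Name: "Python", Bundled: true},
	".rs":   {Name: "Rust", Bundled: true},
	".ts":   {Name: "TypeScript", Bundled: false},
	".tsx":  {Name: "TypeScript", Bundled: false},
	".yaml": {Name: "YAML", Bundled: false},
	".yml":  {Name: "YAML", Bundled: false},
}

// detect looks up a path's extension, ignoring case.
func detect(path string) (langInfo, bool) {
	info, ok := byExt[strings.ToLower(filepath.Ext(path))]
	return info, ok
}

func main() {
	for _, p := range []string{"cmd/main.go", "web/app.tsx", "notes.txt"} {
		info, ok := detect(p)
		switch {
		case !ok:
			fmt.Printf("%s: unknown\n", p)
		case info.Bundled:
			fmt.Printf("%s: %s (parser bundled)\n", p, info.Name)
		default:
			fmt.Printf("%s: %s (detection only)\n", p, info.Name)
		}
	}
}
```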
## Development
```bash
# Build
task build
# Run tests
task test
# Format code
task format
# Lint (requires Docker)
task lint
# Tidy dependencies
task tidy
```
## Demo: Eino Architecture Detective
The repository includes a runnable Eino demo in `demos/eino-architecture-detective`. It launches this MCP server over stdio, uses the MCP tools to gather tree-sitter evidence from a target codebase, and asks an OpenAI-compatible Eino chat model to produce an architecture report.
Create `.env` from `.env.example`, fill in `OPENAI_API_KEY`, then run:
```bash
task demo:eino
```
You can also point it at another project and add a focus question:
```bash
task demo:eino TARGET=/path/to/project FOCUS="Where is the core domain logic?"
```
## License
MIT