https://github.com/probelabs/probe

AI-friendly semantic code search engine for large codebases. Combines ripgrep speed with tree-sitter AST parsing. Powers AI coding assistants with precise, context-aware code understanding.
https://github.com/probelabs/probe
ai ai-coder ast cli code-search mcp nodejs-sdk ripgrep rust search-engine semantic-search tree-sitter
Last synced: 5 months ago
JSON representation
AI-friendly semantic code search engine for large codebases. Combines ripgrep speed with tree-sitter AST parsing. Powers AI coding assistants with precise, context-aware code understanding.
Host: GitHub
URL: https://github.com/probelabs/probe
Owner: probelabs
License: apache-2.0
Created: 2025-03-05T16:09:15.000Z (over 1 year ago)
Default Branch: main
Last Pushed: 2026-03-06T12:46:43.000Z (5 months ago)
Last Synced: 2026-03-06T12:48:29.781Z (5 months ago)
Topics: ai, ai-coder, ast, cli, code-search, mcp, nodejs-sdk, ripgrep, rust, search-engine, semantic-search, tree-sitter
Language: Rust
Homepage: https://probelabs.com/probe
Size: 19.7 MB
Stars: 480
Watchers: 4
Forks: 47
Open Issues: 11
Metadata Files:
- Readme: README.md
- Contributing: CONTRIBUTING.md
- License: LICENSE
- Code of conduct: CODE_OF_CONDUCT.md
- Security: SECURITY.md
Awesome Lists containing this project

README

          


  



# Probe

**We read code 10x more than we write it.** Probe is a code and markdown context engine, with a built-in agent, made to work on enterprise-scale codebases.

Today's AI coding tools use a caveman approach: grep some files, read random lines, hope for the best. It works on toy projects. It falls apart on real codebases.

**Probe is a context engine built for reading and reasoning.** It treats your code as code—not text. AST parsing understands structure. Semantic search finds what matters. You get complete, meaningful context in a single call.

**The Probe Agent** is purpose-built for code understanding. It knows how to wield the Probe engine expertly—searching, extracting, and reasoning across your entire codebase. Perfect for spec-driven development, code reviews, onboarding, and any task where understanding comes before writing.

**One Probe call captures what takes other tools 10+ agentic loops**—deeper, cleaner, and far less noise.

---

## Table of Contents

- [Why Probe?](#why-probe)

- [Quick Start](#quick-start)

- [Features](#features)

- [Usage Modes](#usage-modes)

  - [Probe Agent (MCP)](#probe-agent-mcp)

  - [Raw MCP Tools](#raw-mcp-tools)

  - [CLI Agent](#cli-agent)

  - [Direct CLI Commands](#direct-cli-commands)

  - [Node.js SDK](#nodejs-sdk)

- [LLM Script](#llm-script)

- [Installation](#installation)

- [Supported Languages](#supported-languages)

- [Documentation](#documentation)

- [Environment Variables](#environment-variables)

- [Contributing](#contributing)

- [License](#license)

---

## Why Probe?

| Traditional Approach | Probe |

|---------------------|-------|

| Grep + read random lines | Semantic search with Elasticsearch syntax |

| Treats code as text | Understands code structure via tree-sitter AST |

| Returns fragments | Returns complete functions, classes, structs |

| Requires indexing | Zero setup, instant results |

| 10+ loops to gather context | One call, complete picture |

| Struggles at scale | Built for million-line codebases |

---

## Quick Start

### Option 1: Probe Agent via MCP (Recommended)

Our built-in agent natively integrates with Claude Code, using its authentication—no extra API keys needed.

Add to `~/.claude/claude_desktop_config.json`:

```json

{

  "mcpServers": {

    "probe": {

      "command": "npx",

      "args": ["-y", "@probelabs/probe@latest", "agent", "--mcp"]

    }

  }

}

```

The Probe Agent is purpose-built to read and reason about code. It piggybacks on Claude Code's auth (or Codex auth), or works with any model via your own API key (e.g., `GOOGLE_API_KEY`).

### Option 2: Raw Probe Tools via MCP

If you prefer direct access to search/query/extract tools without the agent layer:

```json

{

  "mcpServers": {

    "probe": {

      "command": "npx",

      "args": ["-y", "@probelabs/probe@latest", "mcp"]

    }

  }

}

```

### Option 3: Direct CLI (No MCP)

Use Probe directly from your terminal—no AI editor required:

```bash

# Semantic search with Elasticsearch syntax

npx -y @probelabs/probe search "authentication AND login" ./src

# Extract code block at line 42

npx -y @probelabs/probe extract src/main.rs:42

# AST pattern matching

npx -y @probelabs/probe query "fn $NAME($$$) -> Result<$RET>" --language rust

```

### Option 4: CLI Agent

Ask questions about any codebase directly from your terminal:

```bash

# One-shot question (works with any LLM provider)

npx -y @probelabs/probe@latest agent "How is authentication implemented?"

# With code editing capabilities

npx -y @probelabs/probe@latest agent "Refactor the login function" --allow-edit

```

---

## Features

- **Code-Aware**: Tree-sitter AST parsing understands your code's actual structure

- **Semantic Search**: Elasticsearch-style queries (`AND`, `OR`, `NOT`, phrases, filters)

- **Complete Context**: Returns entire functions, classes, or structs—not fragments

- **One Call, Full Context**: Captures what takes other tools 10+ loops to gather

- **Zero Indexing**: Instant results on any codebase, no setup required

- **Fully Local**: Your code never leaves your machine

- **Blazing Fast**: Ripgrep-powered scanning handles million-line codebases

- **Smart Ranking**: BM25, TF-IDF, and hybrid algorithms surface what matters

- **Multi-Language**: Rust, Python, JavaScript, TypeScript, Go, C/C++, Java, and more

---

## Usage Modes

### Probe Agent (MCP)

The recommended way to use Probe with AI editors. The Probe Agent is a specialized coding assistant that reasons about your code—not just pattern matches.

```json

{

  "mcpServers": {

    "probe": {

      "command": "npx",

      "args": ["-y", "@probelabs/probe@latest", "agent", "--mcp"]

    }

  }

}

```

**Why use the agent?**

- Purpose-built to understand and reason about code

- Piggybacks on Claude Code / Codex authentication (or use your own API key)

- Smarter multi-step reasoning for complex questions

- Built-in code editing, task delegation, and more

**Agent options:**

| Option | Description |

|--------|-------------|

| `--path ` | Search directory (default: current) |

| `--provider ` | AI provider: `anthropic`, `openai`, `google` |

| `--model ` | Override model name |

| `--prompt ` | Persona: `code-explorer`, `engineer`, `code-review`, `architect` |

| `--allow-edit` | Enable code modification |

| `--enable-delegate` | Enable task delegation to subagents |

| `--enable-bash` | Enable bash command execution |

| `--max-iterations ` | Max tool iterations (default: 30) |

---

### Raw MCP Tools

Direct access to Probe's search, query, and extract tools—without the agent layer. Use this when you want your AI editor to call Probe tools directly.

```json

{

  "mcpServers": {

    "probe": {

      "command": "npx",

      "args": ["-y", "@probelabs/probe@latest", "mcp"]

    }

  }

}

```

**Available tools:**

- `search` - Semantic code search with Elasticsearch-style queries

- `query` - AST-based structural pattern matching

- `extract` - Extract code blocks by line number or symbol name

---

### CLI Agent

Run the Probe Agent directly from your terminal:

```bash

# One-shot question

npx -y @probelabs/probe@latest agent "How does the ranking algorithm work?"

# Specify search path

npx -y @probelabs/probe@latest agent "Find API endpoints" --path ./src

# Enable code editing

npx -y @probelabs/probe@latest agent "Add error handling to login()" --allow-edit

# Use custom persona

npx -y @probelabs/probe@latest agent "Review this code" --prompt code-review

```

---

### Direct CLI Commands

For scripting and direct code analysis.

#### Search Command

```bash

probe search  [PATH] [OPTIONS]

```

**Examples:**

```bash

# Basic search

probe search "authentication" ./src

# Boolean operators (Elasticsearch syntax)

probe search "error AND handling" ./

probe search "login OR auth" ./src

probe search "database NOT sqlite" ./

# Search hints (file filters)

probe search "function AND ext:rs" ./           # Only .rs files

probe search "class AND file:src/**/*.py" ./    # Python files in src/

probe search "error AND dir:tests" ./           # Files in tests/

# Limit results for AI context windows

probe search "API" ./ --max-tokens 10000

```

**Key options:**

| Option | Description |

|--------|-------------|

| `--max-tokens ` | Limit total tokens returned |

| `--max-results ` | Limit number of results |

| `--reranker ` | Ranking: `bm25`, `tfidf`, `hybrid`, `hybrid2` |

| `--allow-tests` | Include test files |

| `--format ` | Output: `markdown`, `json`, `xml` |

#### Extract Command

```bash

probe extract  [OPTIONS]

```

**Examples:**

```bash

# Extract function at line 42

probe extract src/main.rs:42

# Extract by symbol name

probe extract src/main.rs#authenticate

# Extract line range

probe extract src/main.rs:10-50

# From compiler output

go test | probe extract

```

#### Query Command (AST Patterns)

```bash

probe query  [PATH] [OPTIONS]

```

**Examples:**

```bash

# Find all async functions in Rust

probe query "async fn $NAME($$$)" --language rust

# Find React components

probe query "function $NAME($$$) { return <$$$> }" --language javascript

# Find Python classes with specific method

probe query "class $CLASS: def __init__($$$)" --language python

```

---

### Node.js SDK

Use Probe programmatically in your applications.

```javascript

import { ProbeAgent } from '@probelabs/probe/agent';

// Create agent

const agent = new ProbeAgent({

  path: './src',

  provider: 'anthropic'

});

await agent.initialize();

// Ask questions

const response = await agent.answer('How does authentication work?');

console.log(response);

// Get token usage

console.log(agent.getTokenUsage());

```

**Direct functions:**

```javascript

import { search, extract, query } from '@probelabs/probe';

// Semantic search

const results = await search({

  query: 'authentication',

  path: './src',

  maxTokens: 10000

});

// Extract code

const code = await extract({

  files: ['src/auth.ts:42'],

  format: 'markdown'

});

// AST pattern query

const matches = await query({

  pattern: 'async function $NAME($$$)',

  path: './src',

  language: 'typescript'

});

```

**Vercel AI SDK integration:**

```javascript

import { tools } from '@probelabs/probe';

const { searchTool, queryTool, extractTool } = tools;

// Use with Vercel AI SDK

const result = await generateText({

  model: anthropic('claude-sonnet-4-6'),

  tools: {

    search: searchTool({ defaultPath: './src' }),

    query: queryTool({ defaultPath: './src' }),

    extract: extractTool({ defaultPath: './src' })

  },

  prompt: 'Find authentication code'

});

```

---

## LLM Script

Probe Agent can use the `execute_plan` tool to run deterministic, multi-step code analysis tasks. LLM Script is a sandboxed JavaScript DSL where the AI generates executable plans combining search, extraction, and LLM reasoning in a single pipeline.

```javascript

// AI-generated LLM Script example (await is auto-injected, don't write it)

const files = search("authentication login")

const chunks = chunk(files)

const analysis = map(chunks, c => LLM("Summarize auth patterns", c))

return analysis.join("\n")

```

**Key features:**

- **Agent integration** - Probe Agent calls `execute_plan` tool to run scripts

- **Auto-await** - Async calls are automatically awaited (don't write `await`)

- **All tools available** - `search()`, `query()`, `extract()`, `LLM()`, `map()`, `chunk()`, plus any MCP tools

- **Sandboxed execution** - Safe, isolated JavaScript environment with timeout protection

See the full [LLM Script Documentation](./docs/llm-script.md) for syntax and examples.

---

## Installation

### NPM (Recommended)

```bash

npm install -g @probelabs/probe

```

### curl (macOS/Linux)

```bash

curl -fsSL https://raw.githubusercontent.com/probelabs/probe/main/install.sh | bash

```

### PowerShell (Windows)

```powershell

iwr -useb https://raw.githubusercontent.com/probelabs/probe/main/install.ps1 | iex

```

### From Source

```bash

git clone https://github.com/probelabs/probe.git

cd probe

cargo build --release

cargo install --path .

```

---

## Supported Languages

| Language | Extensions |

|----------|------------|

| Rust | `.rs` |

| JavaScript/JSX | `.js`, `.jsx` |

| TypeScript/TSX | `.ts`, `.tsx` |

| Python | `.py` |

| Go | `.go` |

| C/C++ | `.c`, `.h`, `.cpp`, `.cc`, `.hpp` |

| Java | `.java` |

| Ruby | `.rb` |

| PHP | `.php` |

| Swift | `.swift` |

| C# | `.cs` |

| Markdown | `.md` |

---

## Documentation

Full documentation available at [probelabs.com/probe](https://probelabs.com/probe) or browse locally in [`docs/`](./docs/).

### Getting Started

- [Quick Start](./docs/quick-start.md) - Get up and running in 5 minutes

- [Installation](./docs/installation.md) - NPM, curl, Docker, and building from source

- [Features Overview](./docs/features.md) - Core capabilities

### Probe CLI

- [Search Command](./docs/probe-cli/search.md) - Elasticsearch-style semantic search

- [Extract Command](./docs/probe-cli/extract.md) - Extract code blocks with full AST context

- [Query Command](./docs/probe-cli/query.md) - AST-based structural pattern matching

- [CLI Reference](./docs/probe-cli/cli-reference.md) - Complete command-line reference

### Probe Agent

- [Agent Overview](./docs/probe-agent/overview.md) - What is Probe Agent and when to use it

- [API Reference](./docs/probe-agent/sdk/api-reference.md) - ProbeAgent class documentation

- [Node.js SDK](./docs/probe-agent/sdk/nodejs-sdk.md) - Full Node.js SDK reference

- [MCP Integration](./docs/probe-agent/protocols/mcp-integration.md) - Editor integration guide

- [LLM Script](./docs/llm-script.md) - Programmable orchestration DSL

### Guides & Reference

- [Query Patterns](./docs/guides/query-patterns.md) - Effective search strategies

- [Architecture](./docs/reference/architecture.md) - System design and internals

- [Environment Variables](./docs/reference/environment-variables.md) - All configuration options

- [FAQ](./docs/reference/faq.md) - Frequently asked questions

---

## Environment Variables

```bash

# AI Provider Keys

ANTHROPIC_API_KEY=sk-ant-...

OPENAI_API_KEY=sk-...

GOOGLE_API_KEY=...

# Provider Selection

FORCE_PROVIDER=anthropic

MODEL_NAME=claude-sonnet-4-6

# Custom Endpoints

ANTHROPIC_API_URL=https://your-proxy.com

OPENAI_API_URL=https://your-proxy.com

# Debug

DEBUG=1

```

---

## Contributing

We welcome contributions! See our [Contributing Guide](https://github.com/probelabs/probe/blob/main/CONTRIBUTING.md).

For questions or support:

- [GitHub Issues](https://github.com/probelabs/probe/issues)

- [Discord Community](https://discord.gg/hBN4UsTZ)

---

## License

Apache 2.0 - See [LICENSE](LICENSE) for details.
ecosyste.ms

Data

Tools

Indexes

Applications

Experiments

Awesome

https://github.com/probelabs/probe

Awesome Lists containing this project

README