{"id":27107769,"url":"https://github.com/buger/probe","last_synced_at":"2025-08-12T06:51:36.529Z","repository":{"id":280852619,"uuid":"943383028","full_name":"buger/probe","owner":"buger","description":"Probe is an AI-friendly, fully local, semantic code search engine which which works with for large codebases. The final missing building block for next generation of AI coding tools.","archived":false,"fork":false,"pushed_at":"2025-04-05T08:27:21.000Z","size":2134,"stargazers_count":113,"open_issues_count":13,"forks_count":7,"subscribers_count":3,"default_branch":"main","last_synced_at":"2025-04-05T08:27:33.075Z","etag":null,"topics":["ai","ai-coder","search-engine"],"latest_commit_sha":null,"homepage":"https://probeai.dev/","language":"Rust","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"apache-2.0","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/buger.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":"CONTRIBUTING.md","funding":null,"license":"LICENSE","code_of_conduct":"CODE_OF_CONDUCT.md","threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":"SECURITY.md","support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null}},"created_at":"2025-03-05T16:09:15.000Z","updated_at":"2025-04-05T08:26:48.000Z","dependencies_parsed_at":null,"dependency_job_id":"36d9da9b-6aff-441a-be8d-7990e4275bd9","html_url":"https://github.com/buger/probe","commit_stats":null,"previous_names":["buger/probe"],"tags_count":74,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/buger%2Fprobe","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/buger%2Fprobe/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/buger%2Fprobe/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/re
positories/buger%2Fprobe/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/buger","download_url":"https://codeload.github.com/buger/probe/tar.gz/refs/heads/main","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":247545201,"owners_count":20956119,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["ai","ai-coder","search-engine"],"created_at":"2025-04-06T21:02:43.198Z","updated_at":"2025-04-06T21:03:19.037Z","avatar_url":"https://github.com/buger.png","language":"Rust","readme":"\u003cp align=\"center\"\u003e\n  \u003cimg src=\"logo.png?2\" alt=\"Probe Logo\" width=\"400\"\u003e\n\u003c/p\u003e\n\n# Probe\n\nProbe is an **AI-friendly, fully local, semantic code search** tool designed to power the next generation of AI coding assistants. 
By combining the speed of [ripgrep](https://github.com/BurntSushi/ripgrep) with the code-aware parsing of [tree-sitter](https://tree-sitter.github.io/tree-sitter/), Probe delivers precise results with complete code blocks—perfect for large codebases and AI-driven development workflows.\n\n---\n\n## Table of Contents\n\n- [Quick Start](#quick-start)\n- [Features](#features)\n- [Installation](#installation)\n  - [Quick Installation](#quick-installation)\n  - [Requirements](#requirements)\n  - [Manual Installation](#manual-installation)\n  - [Building from Source](#building-from-source)\n  - [Verifying the Installation](#verifying-the-installation)\n  - [Troubleshooting](#troubleshooting)\n  - [Uninstalling](#uninstalling)\n- [Usage](#usage)\n  - [CLI Mode](#cli-mode)\n  - [MCP Server Mode](#mcp-server-mode)\n  - [AI Chat Mode](#ai-chat-mode) (Example in examples/chat)\n  - [Web Interface](#web-interface)\n- [Supported Languages](#supported-languages)\n- [How It Works](#how-it-works)\n- [Adding Support for New Languages](#adding-support-for-new-languages)\n- [Releasing New Versions](#releasing-new-versions)\n\n---\n\n## Quick Start\n\n**NPM Installation**\nThe easiest way to install Probe is via npm, which also installs the binary:\n\n~~~bash\nnpm install -g @buger/probe\n~~~\n\n**Basic Search Example**\nSearch for code containing the phrase \"llm pricing\" in the current directory:\n\n~~~bash\nprobe search \"llm pricing\" ./\n~~~\n\n**Advanced Search (with Token Limiting)**\nSearch for \"prompt injection\" in the current directory, limiting the total output to 10000 tokens (useful for AI tools with context window constraints):\n\n~~~bash\nprobe search \"prompt injection\" ./ --max-tokens 10000\n~~~\n\n**Elastic Search Queries**\nUse advanced query syntax for more powerful searches:\n\n~~~bash\n# Use AND operator for terms that must appear together\nprobe search \"error AND handling\" ./\n\n# Use OR operator for alternative terms\nprobe search \"login OR 
authentication OR auth\" ./src\n\n# Group terms with parentheses for complex queries\nprobe search \"(error OR exception) AND (handle OR process)\" ./\n\n# Use wildcards for partial matching\nprobe search \"auth* connect*\" ./\n\n# Exclude terms with NOT operator\nprobe search \"database NOT sqlite\" ./\n~~~\n\n**Extract Code Blocks**\nExtract a specific function or code block containing line 42 in main.rs:\n\n~~~bash\nprobe extract src/main.rs:42\n~~~\n\nYou can even paste failing test output and it will extract needed files and AST out of it, like:\n\n~~~bash\ngo test | probe extract\n~~~\n\n**Interactive AI Chat**\nUse the AI assistant to ask questions about your codebase:\n\n~~~bash\n# Run directly with npx (no installation needed)\nnpx -y @buger/probe-chat\n\n# Set your API key first\nexport ANTHROPIC_API_KEY=your_api_key\n# Or for OpenAI\n# export OPENAI_API_KEY=your_api_key\n\n# Specify a directory to search (optional)\nnpx -y @buger/probe-chat /path/to/your/project\n~~~\n\n**Node.js SDK Usage**\nUse Probe programmatically in your Node.js applications with the Vercel AI SDK:\n\n~~~javascript\nimport { ProbeChat } from '@buger/probe-chat';\nimport { StreamingTextResponse } from 'ai';\n\n// Create a chat instance\nconst chat = new ProbeChat({\n  model: 'claude-3-sonnet-20240229',\n  anthropicApiKey: process.env.ANTHROPIC_API_KEY,\n  allowedFolders: ['/path/to/your/project']\n});\n\n// In an API route or Express handler\nexport async function POST(req) {\n  const { messages } = await req.json();\n  const userMessage = messages[messages.length - 1].content;\n  \n  // Get a streaming response from the AI\n  const stream = await chat.chat(userMessage, { stream: true });\n  \n  // Return a streaming response\n  return new StreamingTextResponse(stream);\n}\n\n// Or use it in a non-streaming way\nconst response = await chat.chat('How is authentication implemented?');\nconsole.log(response);\n~~~\n\n**MCP server**\n\nIntegrate with any AI editor:\n\n  ~~~json\n  {\n   
 \"mcpServers\": {\n      \"memory\": {\n        \"command\": \"npx\",\n        \"args\": [\n          \"-y\",\n          \"@buger/probe-mcp\"\n        ]\n      }\n    }\n  }\n  ~~~\n\nExample queries:\n\u003e \"Do the probe and search my codebase for implementations of the ranking algorithm\"\n\u003e\n\u003e \"Using probe find all functions related to error handling in the src directory\"\n\n---\n\n## Features\n\n- **AI-Friendly**: Extracts **entire functions, classes, or structs** so AI models get full context.\n- **Fully Local**: Keeps your code on your machine—no external APIs.\n- **Powered by ripgrep**: Extremely fast scanning of large codebases.\n- **Tree-sitter Integration**: Parses and understands code structure accurately.\n- **Re-Rankers \u0026 NLP**: Uses tokenization, stemming, BM25, TF-IDF, or hybrid ranking methods for better search results.\n- **Code Extraction**: Extract specific code blocks or entire files with the `extract` command.\n- **Multi-Language**: Works with popular languages like Rust, Python, JavaScript, TypeScript, Java, Go, C/C++, Swift, C#, and more.\n- **Interactive AI Chat**: AI assistant example in the examples directory that can answer questions about your codebase using Claude or GPT models.\n- **Flexible**: Run as a CLI tool, an MCP server, or an interactive AI chat.\n\n---\n\n## Installation\n\n### Quick Installation\n\nYou can install Probe with a single command using npm, curl, or PowerShell:\n\n**Using npm (Recommended for Node.js users)**\n~~~bash\nnpm install -g @buger/probe\n~~~\n\n**Using curl (For macOS and Linux)**\n~~~bash\ncurl -fsSL https://raw.githubusercontent.com/buger/probe/main/install.sh | bash\n~~~\n\n**What the curl script does**:\n\n1. Detects your operating system and architecture\n2. Fetches the latest release from GitHub\n3. Downloads the appropriate binary for your system\n4. Verifies the checksum for security\n5. 
Installs the binary to `/usr/local/bin`\n\n**Using PowerShell (For Windows)**\n~~~powershell\niwr -useb https://raw.githubusercontent.com/buger/probe/main/install.ps1 | iex\n~~~\n\n**What the PowerShell script does**:\n\n1. Detects your system architecture (x86_64 or ARM64)\n2. Fetches the latest release from GitHub\n3. Downloads the appropriate Windows binary\n4. Verifies the checksum for security\n5. Installs the binary to your user directory (`%LOCALAPPDATA%\\Probe`) by default\n6. Provides instructions to add the binary to your PATH if needed\n\n**Installation options**:\n\nThe PowerShell script supports several options:\n\n~~~powershell\n# Install for current user (default)\niwr -useb https://raw.githubusercontent.com/buger/probe/main/install.ps1 | iex\n\n# Install system-wide (requires admin privileges)\niwr -useb https://raw.githubusercontent.com/buger/probe/main/install.ps1 | iex -args \"--system\"\n\n# Install to a custom directory\niwr -useb https://raw.githubusercontent.com/buger/probe/main/install.ps1 | iex -args \"--dir\", \"C:\\Tools\\Probe\"\n\n# Show help\niwr -useb https://raw.githubusercontent.com/buger/probe/main/install.ps1 | iex -args \"--help\"\n~~~\n\n### Requirements\n\n- **Operating Systems**: macOS, Linux, or Windows\n- **Architectures**: x86_64 (all platforms) or ARM64 (macOS and Windows)\n- **Tools**:\n  - For macOS/Linux: `curl`, `bash`, and `sudo`/root privileges\n  - For Windows: PowerShell 5.1 or later\n\n### Manual Installation\n\n1. Download the appropriate binary for your platform from the [GitHub Releases](https://github.com/buger/probe/releases) page:\n   - `probe-x86_64-linux.tar.gz` for Linux (x86_64)\n   - `probe-x86_64-darwin.tar.gz` for macOS (Intel)\n   - `probe-aarch64-darwin.tar.gz` for macOS (Apple Silicon)\n   - `probe-x86_64-windows.zip` for Windows (x86_64)\n   - `probe-aarch64-windows.zip` for Windows (ARM64)\n2. 
Extract the archive:\n   ~~~bash\n   # For Linux/macOS\n   tar -xzf probe-*-*.tar.gz\n   \n   # For Windows (using PowerShell)\n   Expand-Archive -Path probe-*-windows.zip -DestinationPath .\\probe\n   ~~~\n3. Move the binary to a location in your PATH:\n   ~~~bash\n   # For Linux/macOS\n   sudo mv probe /usr/local/bin/\n   \n   # For Windows (using PowerShell)\n   # Create a directory for the binary (if it doesn't exist)\n   $installDir = \"$env:LOCALAPPDATA\\Probe\"\n   New-Item -ItemType Directory -Path $installDir -Force\n   \n   # Move the binary\n   Move-Item -Path .\\probe\\probe.exe -Destination $installDir\n   \n   # Add to PATH (optional)\n   [Environment]::SetEnvironmentVariable('PATH', [Environment]::GetEnvironmentVariable('PATH', 'User') + \";$installDir\", 'User')\n   ~~~\n\n### Building from Source\n\n1. Install Rust and Cargo (if not already installed):\n   \n   For macOS/Linux:\n   ~~~bash\n   curl --proto '=https' --tlsv1.2 -sSf https://sh.rustup.rs | sh\n   ~~~\n   \n   For Windows:\n   ~~~powershell\n   # Download and run the Rust installer\n   Invoke-WebRequest -Uri https://win.rustup.rs/x86_64 -OutFile rustup-init.exe\n   .\\rustup-init.exe\n   # Follow the on-screen instructions\n   ~~~\n\n2. Clone this repository:\n   ~~~bash\n   git clone https://github.com/buger/probe.git\n   cd probe\n   ~~~\n3. Build the project:\n   ~~~bash\n   cargo build --release\n   ~~~\n4. 
(Optional) Install globally:\n   ~~~bash\n   cargo install --path .\n   ~~~\n   \n   This will install the binary to your Cargo bin directory, which is typically:\n   - `$HOME/.cargo/bin` on macOS/Linux\n   - `%USERPROFILE%\\.cargo\\bin` on Windows\n\n### Verifying the Installation\n\nFor macOS/Linux:\n~~~bash\nprobe --version\n~~~\n\nFor Windows:\n~~~powershell\nprobe --version\n~~~\n\nIf you get a \"command not found\" error on Windows, make sure the installation directory is in your PATH or use the full path to the executable:\n~~~powershell\n# If installed to the default user location\n\u0026 \"$env:LOCALAPPDATA\\Probe\\probe.exe\" --version\n\n# If installed to the default system location\n\u0026 \"$env:ProgramFiles\\Probe\\probe.exe\" --version\n~~~\n\n### Troubleshooting\n\n- **Permissions**:\n  - For macOS/Linux: Ensure you can write to `/usr/local/bin`\n  - For Windows: Ensure you have write permissions to the installation directory\n- **System Requirements**: Double-check your OS/architecture compatibility\n- **PATH Issues**:\n  - For Windows: Restart your terminal after adding the installation directory to PATH\n  - For macOS/Linux: Verify that `/usr/local/bin` is in your PATH\n- **Manual Install**: If the quick install script fails, try [Manual Installation](#manual-installation)\n- **GitHub Issues**: Report issues on the [GitHub repository](https://github.com/buger/probe/issues)\n\n### Uninstalling\n\nFor macOS/Linux:\n~~~bash\n# If installed via npm\nnpm uninstall -g @buger/probe\n\n# If installed via curl script or manually\nsudo rm /usr/local/bin/probe\n~~~\n\nFor Windows:\n~~~powershell\n# If installed via PowerShell script to default user location\nRemove-Item -Path \"$env:LOCALAPPDATA\\Probe\\probe.exe\" -Force\n\n# If installed via PowerShell script to system location\nRemove-Item -Path \"$env:ProgramFiles\\Probe\\probe.exe\" -Force\n\n# If you added the installation directory to PATH, you may want to remove it\n# For user PATH:\n$userPath = 
[Environment]::GetEnvironmentVariable('PATH', 'User')\n$userPath = ($userPath -split ';' | Where-Object { $_ -ne \"$env:LOCALAPPDATA\\Probe\" }) -join ';'\n[Environment]::SetEnvironmentVariable('PATH', $userPath, 'User')\n~~~\n\n---\n## Usage\n\nProbe can be used in three main modes:\n\n1. **CLI Mode**: Direct code search and extraction from the command line\n2. **MCP Server Mode**: Run as a server exposing search functionality via MCP\n3. **Web Interface**: Browser-based UI for code exploration\n\nAdditionally, there are example implementations in the examples directory:\n\n- **AI Chat Example**: Interactive AI assistant for code exploration (in examples/chat)\n- **Web Interface Example**: Browser-based UI for code exploration (in examples/web)\n\n### CLI Mode\n\n#### Search Command\n\n~~~bash\nprobe search \u003cSEARCH_PATTERN\u003e [OPTIONS]\n~~~\n\n##### Key Options\n\n- `\u003cSEARCH_PATTERN\u003e`: Pattern to search for (required)\n- `--files-only`: Skip AST parsing; only list files with matches\n- `--ignore`: Custom ignore patterns (in addition to `.gitignore`)\n- `--exclude-filenames, -n`: Exclude files whose names match query words (filename matching is enabled by default)\n- `--reranker, -r`: Choose a re-ranking algorithm (`hybrid`, `hybrid2`, `bm25`, `tfidf`)\n- `--frequency, -s`: Frequency-based search (tokenization, stemming, stopword removal)\n- `--max-results`: Maximum number of results to return\n- `--max-bytes`: Maximum total bytes of code to return\n- `--max-tokens`: Maximum total tokens of code to return (useful for AI)\n- `--allow-tests`: Include test files and test code blocks\n- `--any-term`: Match files containing **any** query terms (default behavior)\n- `--no-merge`: Disable merging of adjacent code blocks after ranking (merging enabled by default)\n- `--merge-threshold`: Max lines between code blocks to consider them adjacent for merging (default: 5)\n\n##### Examples\n\n~~~bash\n# 1) Search for \"setTools\" in the current 
directory with frequency-based search\nprobe search \"setTools\"\n\n# 2) Search for \"impl\" in ./src\nprobe search \"impl\"  ./src\n\n# 3) Search for \"keyword\" returning only the top 5 results\nprobe search \"keyword\" --max-results 5\n\n# 4) Search for \"function\" and disable merging of adjacent code blocks\nprobe search \"function\" --no-merge\n~~~\n\n#### Extract Command\n\nThe extract command allows you to extract code blocks from files. When a line number is specified, it uses tree-sitter to find the closest suitable parent node (function, struct, class, etc.) for that line. You can also specify a symbol name to extract the code block for that specific symbol.\n\n~~~bash\nprobe extract \u003cFILES\u003e [OPTIONS]\n~~~\n\n##### Key Options\n\n- `\u003cFILES\u003e`: Files to extract from (can include line numbers with colon, e.g., `file.rs:10`, or symbol names with hash, e.g., `file.rs#function_name`)\n- `--allow-tests`: Include test files and test code blocks in results\n- `-c, --context \u003cLINES\u003e`: Number of context lines to include before and after the extracted block (default: 0)\n- `-f, --format \u003cFORMAT\u003e`: Output format (`markdown`, `plain`, `json`) (default: `markdown`)\n\n##### Examples\n\n~~~bash\n# 1) Extract a function containing line 42 from main.rs\nprobe extract src/main.rs:42\n\n# 2) Extract multiple files or blocks\nprobe extract src/main.rs:42 src/lib.rs:15 src/cli.rs\n\n# 3) Extract with JSON output format\nprobe extract src/main.rs:42 --format json\n\n# 4) Extract with 5 lines of context around the specified line\nprobe extract src/main.rs:42 --context 5\n\n# 5) Extract a specific function by name (using # symbol syntax)\nprobe extract src/main.rs#handle_extract\n\n# 6) Extract a specific line range (using : syntax)\nprobe extract src/main.rs:10-20\n\n# 7) Extract from stdin (useful with error messages or compiler output)\ncat error_log.txt | probe extract\n~~~\n\nThe extract command can also read file paths from stdin, 
making it useful for processing compiler errors or log files:\n\n~~~bash\n# Extract code blocks from files mentioned in error logs\ngrep -r \"error\" ./logs/ | probe extract\n~~~\n\n### MCP Server Mode\n\nAdd the following to your AI editor's MCP configuration file:\n  \n  ~~~json\n  {\n    \"mcpServers\": {\n      \"memory\": {\n        \"command\": \"npx\",\n        \"args\": [\n          \"-y\",\n          \"@buger/probe-mcp\"\n        ]\n      }\n    }\n  }\n  ~~~\n  \n- **Example Usage in AI Editors**:\n  \n  Once configured, you can ask your AI assistant to search your codebase with natural language queries like:\n  \n  \u003e \"Do the probe and search my codebase for implementations of the ranking algorithm\"\n  \u003e\n  \u003e \"Using probe find all functions related to error handling in the src directory\"\n\n### AI Chat Mode\n\nThe AI chat functionality is available as a standalone npm package that can be run directly with npx.\n\n#### Using npx (Recommended)\n\n~~~bash\n# Run directly with npx (no installation needed)\nnpx -y @buger/probe-chat\n\n# Set your API key\nexport ANTHROPIC_API_KEY=your_api_key\n# Or for OpenAI\n# export OPENAI_API_KEY=your_api_key\n\n# Or specify a directory to search\nnpx -y @buger/probe-chat /path/to/your/project\n~~~\n\n#### Using the npm package\n\n~~~bash\n# Install globally\nnpm install -g @buger/probe-chat\n\n# Start the chat interface\nprobe-chat\n~~~\n\n#### Using the example code\n\n~~~bash\n# Navigate to the examples directory\ncd examples/chat\n\n# Install dependencies\nnpm install\n\n# Set your API key\nexport ANTHROPIC_API_KEY=your_api_key\n# Or for OpenAI\n# export OPENAI_API_KEY=your_api_key\n\n# Start the chat interface\nnode index.js\n~~~\n\nThis starts an interactive CLI interface where you can ask questions about your codebase and get AI-powered responses.\n\n#### Features\n\n- **AI-Powered Search**: Uses LLMs to understand your questions and search the codebase intelligently\n- **Multi-Model Support**: Works 
with both Anthropic's Claude and OpenAI's GPT models\n- **Token Tracking**: Monitors token usage for both requests and responses\n- **Conversation History**: Maintains context across multiple interactions\n- **Colored Output**: Provides a user-friendly terminal interface with syntax highlighting\n\n#### Configuration\n\nConfigure the chat using environment variables:\n\n~~~bash\n# Use Claude models (recommended)\nexport ANTHROPIC_API_KEY=your_api_key\n\n# Or use OpenAI models\nexport OPENAI_API_KEY=your_api_key\n\n# Override the default model\nexport MODEL_NAME=claude-3-opus-20240229\n\n# Override API URLs (useful for proxies or enterprise deployments)\nexport ANTHROPIC_API_URL=https://your-anthropic-proxy.com\nexport OPENAI_API_URL=https://your-openai-proxy.com/v1\n\n# Enable debug mode for detailed logging\nexport DEBUG=1\n~~~\n\n#### Example Usage\n\n~~~\n❯ How does the ranking algorithm work?\n\n─ Response ─────────────────────────────────────────────────────────\nI'll explain how the ranking algorithm works in the codebase.\n\nThe ranking system in Probe uses multiple algorithms to sort search results by relevance. The main ranking algorithms are:\n\n1. TF-IDF (Term Frequency-Inverse Document Frequency)\n2. BM25 (Best Matching 25)\n3. 
Hybrid (a combination approach)\n\nLet me break down each approach:\n\n## TF-IDF Ranking\n\nTF-IDF weighs terms based on how frequently they appear in a document versus how common they are across all documents.\n\nKey implementation details:\n- Term frequency (TF) measures how often a term appears in a document\n- Inverse document frequency (IDF) measures how rare a term is across all documents\n- Final score is calculated as TF × IDF\n\n## BM25 Ranking\n\nBM25 is an advanced ranking function that improves upon TF-IDF by adding document length normalization.\n\nKey implementation details:\n- Uses parameters k1 (term frequency saturation) and b (document length normalization)\n- Handles edge cases like empty documents and rare terms\n- Provides better results for longer documents\n\n## Hybrid Ranking\n\nThe hybrid approach combines multiple ranking signals for better results:\n\n1. Combines scores from both TF-IDF and BM25\n2. Considers document length and term positions\n3. Applies normalization to ensure fair comparison\n\nThe default reranker is \"hybrid\" which provides the best overall results for code search.\n\nThe ranking implementation can be found in `src/search/result_ranking.rs`.\n─────────────────────────────────────────────────────────────────────\nToken Usage: Request: 1245 Response: 1532 (Current message only: ~1532)\nTotal: 2777 tokens (Cumulative for entire session)\n─────────────────────────────────────────────────────────────────────\n~~~\n\n### Web Interface\n\nProbe includes a web-based chat interface that provides a user-friendly way to interact with your codebase using AI. 
You can run it directly with npx or set it up manually.\n\n#### Quick Start with npx\n\n~~~bash\n# Run directly with npx (no installation needed)\nnpx -y @buger/probe-web\n\n# Set your API key first\nexport ANTHROPIC_API_KEY=your_api_key\n\n# Configure allowed folders (optional)\nexport ALLOWED_FOLDERS=/path/to/folder1,/path/to/folder2\n~~~\n\n#### Manual Setup and Configuration\n\n1. **Navigate to the web directory**:\n   ```bash\n   cd web\n   ```\n\n2. **Install dependencies**:\n   ```bash\n   npm install\n   ```\n\n3. **Configure environment variables**:\n   Create or edit the `.env` file in the web directory:\n   ```\n   ANTHROPIC_API_KEY=your_anthropic_api_key\n   PORT=8080\n   ALLOWED_FOLDERS=/path/to/folder1,/path/to/folder2\n   ```\n\n4. **Start the server**:\n   ```bash\n   npm start\n   ```\n\n5. **Access the web interface**:\n   Open your browser and navigate to `http://localhost:8080`\n\n#### Technical Details\n\n- Built with vanilla JavaScript and Node.js\n- Uses the Vercel AI SDK for Claude integration\n- Executes Probe commands via the probeTool.js module\n- Renders markdown with Marked.js and syntax highlighting with Highlight.js\n- Supports Mermaid.js for diagram generation and visualization\n\n---\n\n## Supported Languages\n\nProbe currently supports:\n\n- **Rust** (`.rs`)\n- **JavaScript / JSX** (`.js`, `.jsx`)\n- **TypeScript / TSX** (`.ts`, `.tsx`)\n- **Python** (`.py`)\n- **Go** (`.go`)\n- **C / C++** (`.c`, `.h`, `.cpp`, `.cc`, `.cxx`, `.hpp`, `.hxx`)\n- **Java** (`.java`)\n- **Ruby** (`.rb`)\n- **PHP** (`.php`)\n- **Swift** (`.swift`)\n- **C#** (`.cs`)\n- **Markdown** (`.md`, `.markdown`)\n\n---\n\n## How It Works\n\nProbe combines **fast file scanning** with **deep code parsing** to provide highly relevant, context-aware results:\n\n1. **Ripgrep Scanning**  \n   Probe uses ripgrep to quickly search across your files, identifying lines that match your query. 
Ripgrep's efficiency allows it to handle massive codebases at lightning speed.\n\n2. **AST Parsing with Tree-sitter**  \n   For each file containing matches, Probe uses tree-sitter to parse the file into an Abstract Syntax Tree (AST). This process ensures that code blocks (functions, classes, structs) can be identified precisely.\n\n3. **NLP \u0026 Re-Rankers**  \n   Next, Probe applies classical NLP methods—tokenization, stemming, and stopword removal—alongside re-rankers such as **BM25**, **TF-IDF**, or the **hybrid** approach (combining multiple ranking signals). This step elevates the most relevant code blocks to the top, especially helpful for AI-driven searches.\n\n4. **Block Extraction**  \n   Probe identifies the smallest complete AST node containing each match (e.g., a full function or class). It extracts these code blocks and aggregates them into search results.\n\n5. **Context for AI**  \n   Finally, these structured blocks can be returned directly or fed into an AI system. By providing the full context of each code segment, Probe helps AI models navigate large codebases and produce more accurate insights.\n\n---\n\n## Adding Support for New Languages\n\n1. **Tree-sitter Grammar**: In `Cargo.toml`, add the tree-sitter parser for the new language.  \n2. **Language Module**: Create a new file in `src/language/` for parsing logic.  \n3. **Implement Language Trait**: Adapt the parse method for the new language constructs.  \n4. **Factory Update**: Register your new language in Probe's detection mechanism.\n\n---\n\n## Releasing New Versions\n\nProbe uses GitHub Actions for multi-platform builds and releases.\n\n1. **Update `Cargo.toml`** with the new version.  \n2. **Create a new Git tag**:\n   ~~~bash\n   git tag -a vX.Y.Z -m \"Release vX.Y.Z\"\n   git push origin vX.Y.Z\n   ~~~\n3. 
**GitHub Actions** will build, package, and draft a new release with checksums.\n\nEach release includes:\n- Linux binary (x86_64)\n- macOS binaries (x86_64 and aarch64)\n- Windows binaries (x86_64 and aarch64)\n- SHA256 checksums\n\n---\n\nWe believe that **local, privacy-focused, semantic code search** is essential for the future of AI-assisted development. Probe is built to empower developers and AI alike to navigate and comprehend large codebases more effectively.\n\nFor questions or contributions, please open an issue on [GitHub](https://github.com/buger/probe/issues) or join our [Discord community](https://discord.gg/hBN4UsTZ) for discussions and support. Happy coding—and searching!\n","funding_links":[],"categories":["Rust"],"sub_categories":[],"project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fbuger%2Fprobe","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fbuger%2Fprobe","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fbuger%2Fprobe/lists"}