{"id":31026169,"url":"https://github.com/muvon/octocode","last_synced_at":"2026-06-08T01:03:18.405Z","repository":{"id":297009068,"uuid":"994165011","full_name":"Muvon/octocode","owner":"Muvon","description":"Semantic code searcher and codebase utility with AI memory onboard","archived":false,"fork":false,"pushed_at":"2025-09-01T10:34:02.000Z","size":1678,"stargazers_count":170,"open_issues_count":7,"forks_count":7,"subscribers_count":0,"default_branch":"master","last_synced_at":"2025-09-01T12:40:45.332Z","etag":null,"topics":["ai","ai-memory","cli","cli-app","code-search","developer-tools","doc-search","graphrag","mcp-server","rag","semantic-search"],"latest_commit_sha":null,"homepage":"https://octocode.muvon.io","language":"Rust","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"apache-2.0","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/Muvon.png","metadata":{"files":{"readme":"README.md","changelog":"CHANGELOG.md","contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null,"zenodo":null,"notice":null,"maintainers":null,"copyright":null,"agents":null,"dco":null,"cla":null}},"created_at":"2025-06-01T11:05:16.000Z","updated_at":"2025-08-31T10:51:39.000Z","dependencies_parsed_at":"2025-07-05T11:21:04.978Z","dependency_job_id":"7b6dd50f-739e-4580-bfc0-9867dcb94ee8","html_url":"https://github.com/Muvon/octocode","commit_stats":null,"previous_names":["muvon/octocode"],"tags_count":16,"template":false,"template_full_name":null,"purl":"pkg:github/Muvon/octocode","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/Muvon%2Foctocode","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/Muvon%2Foctocode/tags","releases_url":"https://repos.eco
syste.ms/api/v1/hosts/GitHub/repositories/Muvon%2Foctocode/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/Muvon%2Foctocode/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/Muvon","download_url":"https://codeload.github.com/Muvon/octocode/tar.gz/refs/heads/master","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/Muvon%2Foctocode/sbom","scorecard":null,"host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":275004571,"owners_count":25389192,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","status":"online","status_checked_at":"2025-09-13T02:00:10.085Z","response_time":70,"last_error":null,"robots_txt_status":"success","robots_txt_updated_at":"2025-07-24T06:49:26.215Z","robots_txt_url":"https://github.com/robots.txt","online":true,"can_crawl_api":true,"host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["ai","ai-memory","cli","cli-app","code-search","developer-tools","doc-search","graphrag","mcp-server","rag","semantic-search"],"created_at":"2025-09-13T17:58:49.558Z","updated_at":"2026-06-08T01:03:18.397Z","avatar_url":"https://github.com/Muvon.png","language":"Rust","funding_links":[],"categories":[],"sub_categories":[],"readme":"\u003cdiv align=\"center\"\u003e\n\n\u003cimg src=\"https://raw.githubusercontent.com/Muvon/octocode/master/logo.svg\" width=\"240\" alt=\"Octocode\"\u003e\n\n### **Structural Code Intelligence for AI Agents — MCP Server + Knowledge Graph + Semantic Search**\n\n[![GitHub 
stars](https://img.shields.io/github/stars/Muvon/octocode?style=social)](https://github.com/Muvon/octocode/stargazers)\n[![License](https://img.shields.io/badge/License-Apache%202.0-blue.svg)](https://opensource.org/licenses/Apache-2.0)\n[![Rust](https://img.shields.io/badge/Rust-1.82%2B-orange.svg)](https://www.rust-lang.org)\n[![Release](https://img.shields.io/github/v/release/Muvon/octocode)](https://github.com/Muvon/octocode/releases)\n\n**Give your AI assistant a brain for your codebase.** Octocode transforms your project into a navigable knowledge graph that Claude, Cursor, and other AI agents can search, understand, and navigate.\n\n[🚀 Quick Start](#-quick-start) • [🤖 MCP Integration](#-mcp-server-integration) • [📖 Documentation](#-documentation) • [🌐 Website](https://octocode.muvon.io)\n\n\u003ca href=\"https://glama.ai/mcp/servers/Muvon/octocode\"\u003e\n  \u003cimg width=\"300\" src=\"https://glama.ai/mcp/servers/Muvon/octocode/badge\" alt=\"Octocode MCP server\" /\u003e\n\u003c/a\u003e\n\n\u003c/div\u003e\n\n---\n\n## 🤖 Built for AI Agents\n\n**The Problem:** AI assistants are blind to your codebase. 
They can't search your files, understand dependencies, or remember context across sessions.\n\n**The Solution:** Octocode's MCP server gives AI agents:\n- 🔍 **Semantic search** — Find code by meaning, not keywords\n- 🕸️ **Knowledge graph** — Navigate imports, calls, and dependencies\n- 📝 **Code signatures** — View structure without reading entire files\n- 🧠 **Persistent memory** — Remember decisions across conversations\n\n**Works with:** Claude Desktop • Cursor • Windsurf • Any MCP-compatible AI\n\nAdd this to your AI assistant's MCP config:\n\n```json\n{\n  \"mcpServers\": {\n    \"octocode\": {\n      \"command\": \"octocode\",\n      \"args\": [\"mcp\", \"--path\", \"/your/project\"]\n    }\n  }\n}\n```\n\nNow your AI assistant can:\n```\nYou: \"Where is authentication handled?\"\nAI: *searches your codebase* \"Authentication is in src/middleware/auth.rs,\n    which imports jwt.rs for token validation and calls user_store.rs for lookup.\"\n\nYou: \"What files depend on the payment module?\"\nAI: *queries knowledge graph* \"src/api/handlers/payment.rs imports payment/mod.rs,\n    which is also used by src/workers/refund.rs and src/cron/billing.rs\"\n\nYou: \"Remember this bug fix for future reference\"\nAI: *stores in memory* \"Got it. I'll remember this authentication bypass fix\n    and apply similar patterns when reviewing security code.\"\n```\n\n## 🤔 Why Octocode?\n\n**Standard RAG treats your code as flat text chunks.** It finds similar-sounding snippets but has no idea that `auth_middleware.rs` imports `jwt.rs`, calls `user_store.rs`, and is wired into `router.rs`. 
Octocode understands *structure*.\n\n```\n# Semantic search finds the right code\noctocode search \"authentication middleware\"\n→ src/middleware/auth.rs | Similarity 0.923\n\n# GraphRAG reveals the full dependency chain\noctocode graphrag get-relationships --node_id src/middleware/auth.rs\nOutgoing:\n  imports → jwt (src/auth/jwt.rs): token validation logic\n  calls   → user_store (src/db/user_store.rs): user lookup by token\nIncoming:\n  imports ← router (src/router.rs): wires auth into the request pipeline\n```\n\nOctocode uses **tree-sitter AST parsing** to extract real symbols (functions, imports, dependencies), builds a **GraphRAG knowledge graph** of relationships between files, and exposes everything via **MCP** — so AI tools can *navigate* your project architecture, not just search it.\n\n## 🔬 How It Works\n\n```\nSource Code → Tree-sitter AST → Symbols \u0026 Relationships → Knowledge Graph\n                                        ↓\n                    Embeddings + Hybrid Search + Reranking → MCP Server\n```\n\n1. **AST Parsing** — tree-sitter extracts real code symbols (functions, classes, imports), not arbitrary text chunks\n2. **Knowledge Graph** — GraphRAG maps relationships between files: `imports`, `calls`, `implements`, `extends`, `configures`, and 9 more types — each with importance weighting\n3. **Hybrid Search** — semantic similarity + BM25 full-text search + reranking — not just vector embeddings\n4. **MCP Server** — exposes `semantic_search`, `view_signatures`, `graphrag`, and `structural_search` tools to any MCP-compatible client\n\n## ✨ What Makes It Different\n\n| | Standard RAG | Doc Lookup Tools | **Octocode** |\n|---|---|---|---|\n| **Indexes** | Text chunks | External library docs | Your codebase structure (AST) |\n| **Understands** | Similar text | API specs \u0026 usage | Functions, imports, dependencies |\n| **Cross-file** | No | No | Yes — navigates the dependency graph |\n| **Relationships** | No | No | `imports`, `calls`, `implements`, `extends`... 
|\n| **AI integration** | Varies | MCP | Native MCP server + LSP |\n\n\u003e **Doc tools give AI the manual for libraries you use. Octocode gives AI the blueprint of how you put them together.**\n\n**Built with Rust** for performance. **Local-first** for privacy. **Open source** (Apache 2.0) for transparency.\n\n## 🚀 Quick Start\n\n### 1. Install\n\n```bash\n# Universal installer (Linux, macOS, Windows)\ncurl -fsSL https://raw.githubusercontent.com/Muvon/octocode/master/install.sh | sh\n\n# macOS with Homebrew\nbrew install muvon/tap/octocode\n```\n\n\u003cdetails\u003e\n\u003csummary\u003e\u003cstrong\u003eOther installation methods\u003c/strong\u003e\u003c/summary\u003e\n\n```bash\n# Cargo (build from source)\ncargo install --git https://github.com/Muvon/octocode\n\n# Download binary from releases\n# https://github.com/Muvon/octocode/releases\n```\n\nSee [Installation Guide](INSTALL.md) for platform-specific instructions.\n\u003c/details\u003e\n\n### 2. Set Up API Keys\n\n```bash\n# Required: Embedding provider (Voyage AI has 200M free tokens/month)\nexport VOYAGE_API_KEY=\"your-voyage-api-key\"\n\n# Optional: LLM for commit messages, code review\nexport OPENROUTER_API_KEY=\"your-openrouter-api-key\"\n```\n\n**Get your Voyage API key:** [voyageai.com](https://www.voyageai.com/) (free tier available)\n\n\u003cdetails\u003e\n\u003csummary\u003e\u003cstrong\u003eOther embedding providers\u003c/strong\u003e\u003c/summary\u003e\n\nOctocode supports multiple embedding providers:\n\n```bash\n# OpenAI\nexport OPENAI_API_KEY=\"your-key\"\noctocode config --code-embedding-model \"openai:text-embedding-3-small\"\n\n# Jina AI\nexport JINA_API_KEY=\"your-key\"\noctocode config --code-embedding-model \"jina:jina-embeddings-v3\"\n\n# Google\nexport GOOGLE_API_KEY=\"your-key\"\noctocode config --code-embedding-model \"google:text-embedding-005\"\n```\n\nSee [API Keys guide](doc/API_KEYS.md) for all supported providers.\n\u003c/details\u003e\n\n### 3. 
Index Your Codebase\n\n```bash\ncd /your/project\noctocode index\n# → Indexed 12,847 blocks across 342 files\n```\n\n### 4. Search Your Code\n\n```bash\n# Natural language search\noctocode search \"authentication middleware\"\n\n# Multi-query for broader results\noctocode search \"auth\" \"middleware\" \"session\"\n\n# Filter by language\noctocode search \"database connection pool\" --lang rust\n\n# Search commit history\noctocode search \"authentication refactor\" --mode commits\n```\n\n### 5. Connect Your AI Assistant\n\nAdd to your MCP client config (Claude Desktop, Cursor, Windsurf):\n\n```json\n{\n  \"mcpServers\": {\n    \"octocode\": {\n      \"command\": \"octocode\",\n      \"args\": [\"mcp\", \"--path\", \"/your/project\"]\n    }\n  }\n}\n```\n\nDone! Your AI assistant now understands your codebase structure.\n\n## 🔌 MCP Server Integration\n\nOctocode includes a **built-in MCP server** that exposes your codebase as tools to AI assistants. This is the primary way to use Octocode — give your AI assistant direct access to search and navigate your code.\n\n### Available Tools\n\n| Tool | What It Does |\n|------|--------------|\n| `semantic_search` | Find code by meaning — \"authentication flow\", \"error handling\", \"database queries\" |\n| `view_signatures` | View file structure — function signatures, class definitions, imports |\n| `graphrag` | Query relationships — \"what calls this function?\", \"what does this module import?\" |\n| `structural_search` | AST pattern matching — find `.unwrap()` calls, `new` instantiations, specific patterns |\n\n### Conversational AI Examples\n\nOnce connected, your AI assistant can answer questions about your codebase:\n\n```\nYou: \"Where is user authentication implemented?\"\nAI: *uses semantic_search* \"Found in src/auth/login.rs. 
The authenticate() function\n    validates credentials against the database, generates a JWT token, and stores\n    the session in Redis.\"\n\nYou: \"What files depend on the payment module?\"\nAI: *uses graphrag* \"src/api/handlers/payment.rs imports payment/mod.rs, which is also\n    used by src/workers/refund.rs and src/cron/billing.rs. The payment module exports\n    process_payment() and validate_transaction() functions.\"\n\nYou: \"Show me all error handling in the API layer\"\nAI: *uses structural_search* \"Found 26 error handling patterns in src/api/:\n    - 15 use Result\u003cT, ApiError\u003e with explicit error types\n    - 8 use .unwrap() (potential panics in handlers/user.rs:42, handlers/auth.rs:87)\n    - 3 use .expect() with custom messages\"\n```\n\n### Quick Setup\n\n**Octomind (Recommended)** — Zero setup, Octocode pre-configured:\n```bash\ncurl -fsSL https://raw.githubusercontent.com/muvon/octomind/master/install.sh | bash\noctomind run developer:rust\n```\n\n**Claude Code (CLI)** — Command-line setup:\n```bash\nclaude mcp add octocode -- octocode mcp --path /path/to/your/project\n```\n\n**Claude Desktop / Cursor / Windsurf** — Add to config:\n```json\n{\n  \"mcpServers\": {\n    \"octocode\": {\n      \"command\": \"octocode\",\n      \"args\": [\"mcp\", \"--path\", \"/path/to/your/project\"]\n    }\n  }\n}\n```\n\n**Config locations:**\n- Claude Desktop: `~/Library/Application Support/Claude/claude_desktop_config.json` (macOS)\n- Cursor: `~/.cursor/mcp.json` or Settings → MCP Servers\n- Windsurf: Settings → MCP\n\n📖 **[Complete MCP Client Setup Guide](doc/MCP_CLIENTS.md)** — Detailed instructions for 15+ clients including VS Code (Cline/Continue), Zed, Replit, and more.\n\n## 🎯 What Can You Do With It?\n\n**New developer onboarding:**\n```\nYou: \"How does the authentication system work?\"\nAI: *searches and navigates* \"Authentication starts in src/middleware/auth.rs which\n    validates JWT tokens. 
It calls src/auth/jwt.rs for token verification, which uses\n    the public key from config. Failed auth returns 401 via src/errors/auth_error.rs.\n    Sessions are stored in Redis via src/cache/session.rs.\"\n```\n\n**Code archaeology:**\n```\nYou: \"Find all places we handle database errors\"\nAI: *structural search* \"Found 47 error handling patterns:\n    - 32 use Result\u003cT, DbError\u003e with proper error types\n    - 15 use .unwrap() (potential issues in src/db/user.rs:23, src/db/order.rs:156)\n    - Recommend adding proper error handling to those locations\"\n```\n\n**Refactoring with confidence:**\n```\nYou: \"What depends on the PaymentProcessor trait?\"\nAI: *queries graph* \"src/api/handlers/checkout.rs, src/workers/refund_worker.rs,\n    and src/cron/billing.rs all depend on PaymentProcessor. The trait is defined\n    in src/domain/payment.rs and implemented by src/infrastructure/stripe.rs\n    and src/infrastructure/paypal.rs.\"\n```\n\n**Code review assistance:**\n```\nYou: \"Review this PR for security issues\"\nAI: *analyzes changes* \"The PR adds password hashing in src/auth/hash.rs. However,\n    it uses SHA256 which is fast and vulnerable to brute force. Recommend using\n    bcrypt or argon2 instead. 
Also found 3 instances of .unwrap() that could panic\n    in production.\"\n```\n\n## 🌐 Supported Languages\n\n| Language | Extensions | Features |\n|----------|------------|----------|\n| **Rust** | `.rs` | Full AST parsing, pub/use detection, module structure |\n| **Python** | `.py` | Import/class/function extraction, docstring parsing |\n| **TypeScript/JavaScript** | `.ts`, `.tsx`, `.js`, `.jsx` | ES6 imports/exports, type definitions |\n| **Go** | `.go` | Package/import analysis, struct/interface parsing |\n| **PHP** | `.php` | Class/function extraction, namespace support |\n| **C++** | `.cpp`, `.cc`, `.cxx`, `.c++`, `.c`, `.h`, `.hpp`, `.hxx`, `.cppm`, `.ixx`, `.mxx`, `.ccm`, `.cxxm` | Include analysis, class/function extraction, C++20 module support |\n| **Ruby** | `.rb` | Class/module extraction, method definitions |\n| **Java** | `.java` | Import analysis, class/method extraction |\n| **JSON** | `.json` | Structure analysis, key extraction |\n| **Bash** | `.sh`, `.bash` | Function and variable extraction |\n| **Markdown** | `.md` | Document section indexing, header extraction |\n\n*Plus: CSS, Lua, Svelte, and more via tree-sitter*\n\n## 📚 Documentation\n\n- **[Getting Started](doc/GETTING_STARTED.md)** — First steps and basic workflow\n- **[Installation Guide](INSTALL.md)** — Detailed methods and building from source\n- **[MCP Client Setup](doc/MCP_CLIENTS.md)** — Connect to Claude, Cursor, Windsurf, and 15+ clients\n- **[MCP Integration](doc/MCP_INTEGRATION.md)** — MCP server details and advanced configuration\n- **[Commands Reference](doc/COMMANDS.md)** — Complete CLI reference\n- **[Configuration](doc/CONFIGURATION.md)** — Templates and customization\n- **[API Keys](doc/API_KEYS.md)** — Provider setup guide\n- **[Architecture](doc/ARCHITECTURE.md)** — How it works under the hood\n- **[Contributing](doc/CONTRIBUTING.md)** — Development setup\n\n## 🔒 Privacy \u0026 Security\n\n- **🏠 Local-first** — local embedding models available on supported platforms 
(macOS ARM default builds); cloud providers on all platforms\n- **🔐 Secure** — API keys stored locally, env vars supported\n- **🚫 Respects .gitignore** — Never indexes sensitive files\n- **🛡️ MCP security** — Local-only server, no external network for search\n- **📤 Cloud-safe** — Embeddings process only metadata, never source code\n\n\u003cdetails\u003e\n\u003csummary\u003e\u003cstrong\u003e📊 Retrieval Quality Benchmark\u003c/strong\u003e\u003c/summary\u003e\n\nWe measure semantic search quality using a hand-annotated ground truth dataset of 254 queries (127 code + 127 docs) with precise line-range annotations. Each query has 1–3 expected results scored by relevance.\n\nTested on commit [`b1771ba`](https://github.com/Muvon/octocode/commit/b1771ba) with [benchmark config](benchmark/config.toml) (contextual retrieval, Voyage reranker, RaBitQ quantization).\n\n\u003cdetails\u003e\n\u003csummary\u003e\u003cstrong\u003eDocumentation search\u003c/strong\u003e (\u003ccode\u003e--mode docs\u003c/code\u003e) — Hit@10: 0.953, MRR: 0.776\u003c/summary\u003e\n\n| Metric | Score |\n|--------|-------|\n| Hit@5 | 0.929 (118/127) |\n| Hit@10 | 0.953 (121/127) |\n| MRR | 0.776 |\n| NDCG@10 | 0.801 |\n| Recall@5 | 0.902 |\n| Recall@10 | 0.921 |\n\n**Missed queries** (6 of 127):\n\n| # | Query | Expected | Got (top 1) |\n|---|-------|----------|-------------|\n| 43 | how to set up MCP proxy for managing multiple repositories | `doc/MCP_INTEGRATION.md:286-311` | `doc/MCP_INTEGRATION.md:286-4` |\n| 51 | what are the prerequisites before using octocode | `doc/GETTING_STARTED.md:6-12` | `doc/CONTRIBUTING.md:7-33` |\n| 59 | what to do when hitting API rate limits | `doc/GETTING_STARTED.md:209-216` | `doc/PERFORMANCE.md:304-356` |\n| 75 | typical performance metrics for small medium and large projects | `doc/PERFORMANCE.md:4-13` | `doc/PERFORMANCE.md:414-14` |\n| 112 | how to install octocode on different operating systems | `INSTALL.md:4-14` | `INSTALL.md:49-70` |\n| 115 | how to fix 
macOS Gatekeeper blocking the binary | `INSTALL.md:199-206` | `INSTALL.md:198-119` |\n\n\u003c/details\u003e\n\n\u003cdetails\u003e\n\u003csummary\u003e\u003cstrong\u003eCode search\u003c/strong\u003e (\u003ccode\u003e--mode code\u003c/code\u003e) — Hit@10: 0.992, MRR: 0.895\u003c/summary\u003e\n\n| Metric | Score |\n|--------|-------|\n| Hit@5 | 0.992 (126/127) |\n| Hit@10 | 0.992 (126/127) |\n| MRR | 0.895 |\n| NDCG@10 | 0.906 |\n| Recall@5 | 0.962 |\n| Recall@10 | 0.974 |\n\n**Missed queries** (1 of 127):\n\n| # | Query | Expected | Got (top 1) |\n|---|-------|----------|-------------|\n| 105 | how does the system ensure two developers get the same database path | `src/storage.rs:60-83` | `src/mcp/proxy.rs:631-644` |\n\n\u003c/details\u003e\n\nMetrics: **Hit@k** (did the answer appear?), **MRR** (how high?), **NDCG@10** (are best results ranked first?), **Recall@k** (how many found?). See [benchmark/](benchmark/) for methodology, scoring script, and the full dataset.\n\n\u003c/details\u003e\n\n## 🤝 Community \u0026 Support\n\n- ⭐ **Star us on GitHub** — It really helps!\n- 🐛 [Report Issues](https://github.com/Muvon/octocode/issues)\n- 💬 [Discussions](https://github.com/Muvon/octocode/discussions)\n- 📧 [opensource@muvon.io](mailto:opensource@muvon.io)\n- 🌐 [muvon.io](https://muvon.io)\n\n## ⚖️ License\n\nApache License 2.0 — See [LICENSE](LICENSE) for details.\n\n---\n\n\u003cdiv align=\"center\"\u003e\n\n**Built with 🦀 Rust by [Muvon](https://muvon.io) in Hong Kong**\n\n[⭐ Star](https://github.com/Muvon/octocode) • [🍴 Fork](https://github.com/Muvon/octocode/fork) • [📣 
Share](https://twitter.com/intent/tweet?text=Octocode%20-%20AI-powered%20code%20intelligence%20with%20built-in%20MCP%20server\u0026url=https://github.com/Muvon/octocode)\n\n\u003c/div\u003e\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fmuvon%2Foctocode","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fmuvon%2Foctocode","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fmuvon%2Foctocode/lists"}