{"id":29462057,"url":"https://github.com/ctoth/mcp_server_code_extractor","last_synced_at":"2026-04-11T08:05:13.397Z","repository":{"id":304261998,"uuid":"1018226046","full_name":"ctoth/mcp_server_code_extractor","owner":"ctoth","description":"🎯 Precise code extraction for AI assistants - MCP server using tree-sitter to extract functions, classes \u0026 snippets from 30+ languages without manual parsing","archived":false,"fork":false,"pushed_at":"2025-07-12T00:20:23.000Z","size":27,"stargazers_count":0,"open_issues_count":0,"forks_count":0,"subscribers_count":0,"default_branch":"master","last_synced_at":"2025-07-12T02:28:50.934Z","etag":null,"topics":["ai-tools","claude","code-extraction","mcp","model-context-protocol","python","tree-sitter"],"latest_commit_sha":null,"homepage":null,"language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":null,"status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/ctoth.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":null,"code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null,"zenodo":null}},"created_at":"2025-07-11T20:41:10.000Z","updated_at":"2025-07-12T00:20:26.000Z","dependencies_parsed_at":"2025-07-12T02:29:30.304Z","dependency_job_id":"de04bb85-6442-4ae7-a603-29f77ca4729d","html_url":"https://github.com/ctoth/mcp_server_code_extractor","commit_stats":null,"previous_names":["ctoth/mcp_server_code_extractor"],"tags_count":3,"template":false,"template_full_name":null,"purl":"pkg:github/ctoth/mcp_server_code_extractor","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/ctoth%2Fmcp_server_code_extractor","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/ctoth%2Fmcp_server_code_extractor/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/ctoth%2Fmcp_server_code_extractor/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/ctoth%2Fmcp_server_code_extractor/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/ctoth","download_url":"https://codeload.github.com/ctoth/mcp_server_code_extractor/tar.gz/refs/heads/master","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/ctoth%2Fmcp_server_code_extractor/sbom","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":265238169,"owners_count":23732579,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["ai-tools","claude","code-extraction","mcp","model-context-protocol","python","tree-sitter"],"created_at":"2025-07-14T04:00:46.589Z","updated_at":"2026-04-11T08:05:13.357Z","avatar_url":"https://github.com/ctoth.png","language":"Python","readme":"# MCP Server Code Extractor\r\n\r\nA Model Context Protocol (MCP) server that provides precise code extraction tools using tree-sitter parsing. Extract functions, classes, and code snippets from 30+ programming languages without manual parsing.\r\n\r\n## Why MCP Server Code Extractor?\r\n\r\nWhen working with AI coding assistants like Claude, you often need to:\r\n- Extract specific functions or classes from large codebases\r\n- Get an overview of what's in a file without reading the entire thing\r\n- Retrieve precise code snippets with accurate line numbers\r\n- Avoid manual parsing and grep/sed/awk gymnastics\r\n\r\nMCP Server Code Extractor solves these problems by providing structured, tree-sitter-powered code extraction tools directly within your AI assistant.\r\n\r\n## Features\r\n\r\n- **🎯 Precise Extraction**: Uses tree-sitter parsing for accurate code boundary detection\r\n- **🔍 Semantic Search**: Search for function calls and code patterns across files and directories\r\n- **🌍 30+ Languages**: Supports Python, JavaScript, TypeScript, Go, Rust, Java, C/C++, and many more\r\n- **📍 Line Numbers**: Every extraction includes precise line number information\r\n- **🗂️ Directory Search**: Search entire codebases with file pattern filtering and exclusions\r\n- **📊 Depth Control**: Extract at different levels (top-level only, classes+methods, everything)\r\n- **🌐 URL Support**: Fetch and extract code from GitHub, GitLab, and direct file URLs\r\n- **🔄 Git Integration**: Extract code from any git revision, branch, or tag\r\n- **⚡ Fast \u0026 Lightweight**: Efficient caching and minimal dependencies\r\n- **🤖 AI-Optimized**: Designed specifically for use with AI coding assistants\r\n\r\n## Installation\r\n\r\n### Quick Start with uvx (Recommended)\r\n\r\n```bash\r\n# Install and run directly with uvx\r\nuvx mcp-server-code-extractor\r\n```\r\n\r\n### Alternative Installation Methods\r\n\r\n#### Using UV\r\n```bash\r\n# Install UV if you haven't already\r\ncurl -LsSf https://astral.sh/uv/install.sh | sh\r\n\r\n# Run as package with UV\r\nuv run mcp-server-code-extractor\r\n```\r\n\r\n#### Using pip\r\n```bash\r\npip install mcp-server-code-extractor\r\nmcp-server-code-extractor\r\n```\r\n\r\n#### Development Installation\r\n```bash\r\n# Clone this repository\r\ngit clone https://github.com/ctoth/mcp_server_code_extractor\r\ncd mcp_server_code_extractor\r\n\r\n# Install development dependencies\r\nuv add --dev pytest black flake8 mypy\r\n\r\n# Run as Python module\r\nuv run python -m code_extractor\r\n```\r\n\r\n### Configure with Claude Desktop\r\n\r\nAdd to your Claude Desktop configuration:\r\n\r\n#### Using uvx (Recommended)\r\n```json\r\n{\r\n  \"mcpServers\": {\r\n    \"mcp-server-code-extractor\": {\r\n      \"command\": \"uvx\",\r\n      \"args\": [\"mcp-server-code-extractor\"]\r\n    }\r\n  }\r\n}\r\n```\r\n\r\n#### Using UV\r\n```json\r\n{\r\n  \"mcpServers\": {\r\n    \"mcp-server-code-extractor\": {\r\n      \"command\": \"uv\",\r\n      \"args\": [\"run\", \"mcp-server-code-extractor\"]\r\n    }\r\n  }\r\n}\r\n```\r\n\r\n#### Using pip installation\r\n```json\r\n{\r\n  \"mcpServers\": {\r\n    \"mcp-server-code-extractor\": {\r\n      \"command\": \"mcp-server-code-extractor\"\r\n    }\r\n  }\r\n}\r\n```\r\n\r\n### Testing with MCP Inspector\r\n\r\n```bash\r\n# Test the server with MCP Inspector\r\nnpx @modelcontextprotocol/inspector uvx mcp-server-code-extractor\r\n\r\n# Or with other installation methods\r\nnpx @modelcontextprotocol/inspector uv run mcp-server-code-extractor\r\nnpx @modelcontextprotocol/inspector mcp-server-code-extractor\r\n```\r\n\r\n## Available Tools\r\n\r\n### 1. `get_symbols` - Discover Code Structure\r\nList all functions, classes, and other symbols in a file with depth control.\r\n\r\n```\r\nParameters:\r\n- path_or_url: Path to source file or URL\r\n- git_revision: Optional git revision (branch, tag, commit)\r\n- depth: Symbol extraction depth (0=everything, 1=top-level only, 2=classes+methods)\r\n\r\nReturns:\r\n- name: Symbol name\r\n- type: function/class/method/etc\r\n- start_line/end_line: Line numbers\r\n- preview: First line of the symbol\r\n- parent: Parent class name (for methods)\r\n```\r\n\r\n### 2. `search_code` - Semantic Code Search\r\nSearch for code patterns using tree-sitter parsing. Supports both single-file and directory-wide searches.\r\n\r\n```\r\nParameters:\r\n- search_type: Type of search (\"function-calls\")\r\n- target: What to search for (e.g., \"requests.get\", \"logger.error\", \"validateData\")\r\n- scope: File path, directory path, or URL to search in\r\n- language: Programming language (auto-detected if not specified)\r\n- git_revision: Optional git revision (commit, branch, tag) - not supported for URLs\r\n- max_results: Maximum number of results to return (default: 100)\r\n- include_context: Include surrounding code lines for context (default: true)\r\n- file_patterns: File patterns for directory search (e.g., [\"*.py\", \"*.js\"])\r\n- exclude_patterns: File patterns to exclude (e.g., [\"*.pyc\", \"node_modules/*\"])\r\n- max_files: Maximum number of files to search in directory mode (default: 1000)\r\n- follow_symlinks: Whether to follow symbolic links in directory search (default: false)\r\n\r\nReturns:\r\n- file_path: Path to file containing the match\r\n- start_line/end_line: Line numbers of the match\r\n- match_text: The matching code\r\n- context_before/context_after: Surrounding code lines\r\n- language: Detected programming language\r\n- metadata: Additional search information\r\n```\r\n\r\n### 3. `get_function` - Extract Complete Functions\r\nExtract a complete function with all its code.\r\n\r\n```\r\nParameters:\r\n- path_or_url: Path to source file or URL\r\n- function_name: Name of the function to extract\r\n- git_revision: Optional git revision (branch, tag, commit)\r\n\r\nReturns:\r\n- code: Complete function code\r\n- start_line/end_line: Precise boundaries\r\n- language: Detected language\r\n```\r\n\r\n### 4. `get_class` - Extract Complete Classes\r\nExtract an entire class definition including all methods.\r\n\r\n```\r\nParameters:\r\n- path_or_url: Path to source file or URL\r\n- class_name: Name of the class to extract\r\n- git_revision: Optional git revision (branch, tag, commit)\r\n\r\nReturns:\r\n- code: Complete class code\r\n- start_line/end_line: Precise boundaries\r\n- language: Detected language\r\n```\r\n\r\n### 5. `get_lines` - Extract Specific Line Ranges\r\nGet exact line ranges when you know the line numbers.\r\n\r\n```\r\nParameters:\r\n- path_or_url: Path to source file or URL\r\n- start_line: Starting line (1-based)\r\n- end_line: Ending line (inclusive)\r\n- git_revision: Optional git revision (branch, tag, commit)\r\n\r\nReturns:\r\n- code: Extracted lines\r\n- line numbers and metadata\r\n```\r\n\r\n### 6. `get_signature` - Get Function Signatures\r\nQuickly get just the function signature without the body.\r\n\r\n```\r\nParameters:\r\n- path_or_url: Path to source file or URL\r\n- function_name: Name of the function\r\n- git_revision: Optional git revision (branch, tag, commit)\r\n\r\nReturns:\r\n- signature: Function signature only\r\n- start_line: Where the function starts\r\n```\r\n\r\n## Usage Examples\r\n\r\n### Example 1: Exploring Local Files\r\n\r\n```python\r\n# First, see what's in the file\r\nsymbols = get_symbols(\"src/main.py\")\r\n# Returns: List of all functions and classes with line numbers\r\n\r\n# Extract a specific function\r\nresult = get_function(\"src/main.py\", \"process_data\")\r\n# Returns: Complete function code with line numbers\r\n\r\n# Get just a function signature\r\nsig = get_signature(\"src/main.py\", \"process_data\")\r\n# Returns: \"def process_data(input_file: str, output_dir: Path) -\u003e Dict[str, Any]:\"\r\n```\r\n\r\n### Example 2: Working with URLs and Git Revisions\r\n\r\n```python\r\n# Explore a GitHub file (current version)\r\nsymbols = get_symbols(\"https://raw.githubusercontent.com/user/repo/main/src/api.py\")\r\n\r\n# Extract function from GitLab\r\nresult = get_function(\"https://gitlab.com/user/project/-/raw/main/utils.py\", \"helper_func\")\r\n\r\n# Work with git revisions (local files only)\r\nsymbols_old = get_symbols(\"src/api.py\", git_revision=\"HEAD~1\")\r\nfunction_from_branch = get_function(\"src/utils.py\", \"helper_func\", git_revision=\"feature-branch\")\r\nclass_from_tag = get_class(\"src/models.py\", \"User\", git_revision=\"v1.0.0\")\r\n\r\n# Get lines from any URL\r\nlines = get_lines(\"https://example.com/code/script.py\", 10, 25)\r\n```\r\n\r\n### Example 3: Progressive Code Discovery\r\n\r\n```python\r\n# 1. Start with overview - just see the main structure\r\noverview = get_symbols(\"models/user.py\", depth=1)\r\n# Shows: class User, class Admin, def create_user, etc.\r\n\r\n# 2. Explore a specific class and its methods\r\nclass_methods = get_symbols(\"models/user.py\", depth=2)\r\n# Shows: class User with its methods like __init__, validate, save\r\n\r\n# 3. Extract the full class when you need implementation details\r\nuser_class = get_class(\"models/user.py\", \"User\")\r\n# Returns: Complete User class with all methods\r\n\r\n# 4. Or get just a specific method signature for quick reference\r\ninit_sig = get_signature(\"models/user.py\", \"__init__\")\r\n# Returns: \"def __init__(self, username: str, email: str, **kwargs):\"\r\n\r\n# 5. Extract specific lines when you know exactly what you need\r\nlines = get_lines(\"models/user.py\", 10, 25)\r\n# Returns: Lines 10-25 of the file\r\n```\r\n\r\n### Example 4: Semantic Code Search\r\n\r\n```python\r\n# Search for specific function calls in a single file\r\nresults = search_code(\r\n    search_type=\"function-calls\",\r\n    target=\"requests.get\",\r\n    scope=\"src/api.py\"\r\n)\r\n# Returns: All requests.get() calls with line numbers and context\r\n\r\n# Search across an entire directory\r\nresults = search_code(\r\n    search_type=\"function-calls\", \r\n    target=\"logger.error\",\r\n    scope=\"src/\",\r\n    file_patterns=[\"*.py\"],\r\n    exclude_patterns=[\"test_*\", \"__pycache__/*\"]\r\n)\r\n# Returns: All logger.error() calls across Python files, excluding tests\r\n\r\n# Cross-language search in frontend code\r\nresults = search_code(\r\n    search_type=\"function-calls\",\r\n    target=\"fetchData\", \r\n    scope=\"frontend/\",\r\n    file_patterns=[\"*.js\", \"*.ts\", \"*.jsx\"],\r\n    max_results=50\r\n)\r\n# Returns: All fetchData() calls in JavaScript/TypeScript files\r\n```\r\n\r\n### Example 5: Multi-Language Support\r\n\r\n```javascript\r\n// Works with JavaScript/TypeScript\r\nsymbols = get_symbols(\"app.ts\")\r\nfunc = get_function(\"app.ts\", \"handleRequest\")\r\n```\r\n\r\n```go\r\n// Works with Go\r\nsymbols = get_symbols(\"main.go\")\r\nmethod = get_function(\"main.go\", \"ServeHTTP\")\r\n```\r\n\r\n## Supported Languages\r\n\r\n- Python, JavaScript, TypeScript, JSX/TSX\r\n- Go, Rust, C, C++, C#, Java\r\n- Ruby, PHP, Swift, Kotlin, Scala\r\n- Bash, PowerShell, SQL\r\n- Haskell, OCaml, Elixir, Clojure\r\n- And many more...\r\n\r\n## Best Practices\r\n\r\n### Progressive Discovery Workflow\r\n1. **Start with `search_code`** to find relevant functions and patterns across the codebase\r\n2. **Use `get_symbols`** with `depth=1` to see file structure of interesting files\r\n3. **Use depth control** - `depth=2` for classes+methods, `depth=0` for everything\r\n4. **Extract specific items** with `get_function/get_class` for implementation details\r\n5. **Use `get_signature`** for quick API exploration without full code\r\n6. **Use `get_lines`** when you know exact line numbers\r\n\r\n### Semantic Search Tips\r\n- Use **directory search** to find patterns across your entire codebase\r\n- Apply **file patterns** to focus on specific languages or file types\r\n- Use **exclusion patterns** to skip test files, build artifacts, and dependencies\r\n- Set appropriate **max_results** and **max_files** limits for large codebases\r\n- Enable **context** to understand the surrounding code\r\n\r\n### Git Integration Tips\r\n- Use git revisions to compare implementations across versions\r\n- Extract from feature branches to review changes\r\n- Use tags to get stable API versions\r\n\r\n### URL Usage\r\n- GitHub/GitLab URLs work great for exploring open source code\r\n- Combine with local git revisions for comprehensive analysis\r\n- Note: git revisions only work with local files, not URLs\r\n\r\n## Advantages Over Traditional Tools\r\n\r\n**Traditional file reading:**\r\n- Reads entire files (inefficient for large files)\r\n- Requires manual parsing to find functions/classes\r\n- Manual line counting for extraction\r\n- Complex syntax edge cases\r\n\r\n**MCP Server Code Extractor:**\r\n- ✅ Extracts exactly what you need\r\n- ✅ Provides structured data with metadata\r\n- ✅ Handles complex syntax automatically\r\n- ✅ Works across 30+ languages consistently\r\n- ✅ Depth control for efficient exploration\r\n- ✅ Git integration for version comparison\r\n\r\n## Contributing\r\n\r\nContributions are welcome! Please feel free to submit a Pull Request.\r\n\r\n## License\r\n\r\nMIT License - see LICENSE file for details.\r\n\r\n## Acknowledgments\r\n\r\n- Built on [tree-sitter](https://tree-sitter.github.io/) for robust parsing\r\n- Uses [tree-sitter-languages](https://github.com/grantjenks/py-tree-sitter-languages) for language support\r\n- Implements the [Model Context Protocol](https://modelcontextprotocol.io/) specification\r\n","funding_links":[],"categories":[],"sub_categories":[],"project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fctoth%2Fmcp_server_code_extractor","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fctoth%2Fmcp_server_code_extractor","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fctoth%2Fmcp_server_code_extractor/lists"}