{"id":31580549,"url":"https://github.com/peteretelej/diffchunk","last_synced_at":"2025-10-05T21:14:56.416Z","repository":{"id":301414034,"uuid":"1008982649","full_name":"peteretelej/diffchunk","owner":"peteretelej","description":"diffchunk - A local MCP server that gives LLMs the ability to work with large diff files. Essential for working with large repos.","archived":false,"fork":false,"pushed_at":"2025-07-19T13:01:52.000Z","size":6016,"stargazers_count":8,"open_issues_count":0,"forks_count":0,"subscribers_count":0,"default_branch":"main","last_synced_at":"2025-09-30T12:33:30.000Z","etag":null,"topics":["code-review","diff-analysis","llm-tools","mcp-server"],"latest_commit_sha":null,"homepage":"","language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"mit","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/peteretelej.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":"docs/CONTRIBUTING.md","funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null,"zenodo":null}},"created_at":"2025-06-26T11:48:31.000Z","updated_at":"2025-09-10T17:44:19.000Z","dependencies_parsed_at":"2025-07-18T01:58:21.690Z","dependency_job_id":null,"html_url":"https://github.com/peteretelej/diffchunk","commit_stats":null,"previous_names":["peteretelej/diffchunk"],"tags_count":9,"template":false,"template_full_name":null,"purl":"pkg:github/peteretelej/diffchunk","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/peteretelej%2Fdiffchunk","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/peteretelej%2Fdiffchunk/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/peteretelej%2Fdiffchunk/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/peteretelej%2Fdiffchunk/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/peteretelej","download_url":"https://codeload.github.com/peteretelej/diffchunk/tar.gz/refs/heads/main","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/peteretelej%2Fdiffchunk/sbom","scorecard":null,"host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":278520443,"owners_count":26000482,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","status":"online","status_checked_at":"2025-10-05T02:00:06.059Z","response_time":54,"last_error":null,"robots_txt_status":"success","robots_txt_updated_at":"2025-07-24T06:49:26.215Z","robots_txt_url":"https://github.com/robots.txt","online":true,"can_crawl_api":true,"host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["code-review","diff-analysis","llm-tools","mcp-server"],"created_at":"2025-10-05T21:14:50.951Z","updated_at":"2025-10-05T21:14:56.411Z","avatar_url":"https://github.com/peteretelej.png","language":"Python","funding_links":[],"categories":[],"sub_categories":[],"readme":"# diffchunk\n\n[![CI](https://github.com/peteretelej/diffchunk/actions/workflows/ci.yml/badge.svg)](https://github.com/peteretelej/diffchunk/actions/workflows/ci.yml)\n[![codecov](https://codecov.io/gh/peteretelej/diffchunk/branch/main/graph/badge.svg)](https://codecov.io/gh/peteretelej/diffchunk)\n[![PyPI version](https://img.shields.io/pypi/v/diffchunk.svg)](https://pypi.org/project/diffchunk/)\n[![Python 3.10+](https://img.shields.io/badge/python-3.10+-blue.svg)](https://www.python.org/downloads/)\n[![License: MIT](https://img.shields.io/badge/License-MIT-yellow.svg)](https://opensource.org/licenses/MIT)\n[![Ruff](https://img.shields.io/endpoint?url=https://raw.githubusercontent.com/astral-sh/ruff/main/assets/badge/v2.json)](https://github.com/astral-sh/ruff)\n[![uv](https://img.shields.io/endpoint?url=https://raw.githubusercontent.com/astral-sh/uv/main/assets/badge/v0.json)](https://github.com/astral-sh/uv)\n\nMCP server that enables LLMs to navigate large diff files efficiently. Instead of reading entire diffs sequentially, LLMs can jump directly to relevant changes using pattern-based navigation.\n\n## Problem\n\nLarge diffs exceed LLM context limits and waste tokens on irrelevant changes. A 50k+ line diff can't be processed directly and manual splitting loses file relationships.\n\n## Solution\n\nMCP server with 4 navigation tools:\n\n- `load_diff` - Parse diff file with custom settings (optional)\n- `list_chunks` - Show chunk overview with file mappings (auto-loads)\n- `get_chunk` - Retrieve specific chunk content (auto-loads)\n- `find_chunks_for_files` - Locate chunks by file patterns (auto-loads)\n\n## Setup\n\n**Prerequisite:** Install [uv](https://docs.astral.sh/uv/getting-started/installation/) (an extremely fast Python package manager) which provides the `uvx` command.\n\nAdd to your MCP client configuration:\n\n```json\n{\n  \"mcpServers\": {\n    \"diffchunk\": {\n      \"command\": \"uvx\",\n      \"args\": [\"--from\", \"diffchunk\", \"diffchunk-mcp\"]\n    }\n  }\n}\n```\n\n## Usage\n\nYour AI assistant can now handle massive changesets that previously caused failures in Cline, Roocode, Cursor, and other tools.\n\n### Using with AI Assistant\n\nOnce configured, your AI assistant can analyze large commits, branches, or diffs using diffchunk.\n\nHere are some example use cases:\n\n**Branch comparisons:**\n\n- _\"Review all changes in develop not in the main branch for any bugs\"_\n- _\"Tell me about all the changes I have yet to merge\"_\n- _\"What new features were added to the staging branch?\"_\n- _\"Summarize all changes to this repo in the last 2 weeks\"_\n\n**Code review:**\n\n- _\"Use diffchunk to check my feature branch for security vulnerabilities\"_\n- _\"Use diffchunk to find any breaking changes before I merge to production\"_\n- _\"Use diffchunk to review this large refactor for potential issues\"_\n\n**Change analysis:**\n\n- _\"Use diffchunk to show me all database migrations that need to be run\"_\n- _\"Use diffchunk to find what API changes might affect our mobile app\"_\n- _\"Use diffchunk to analyze all new dependencies added recently\"_\n\n**Direct file analysis:**\n\n- _\"Use diffchunk to analyze the diff at /tmp/changes.diff and find any bugs\"_\n- _\"Create a diff of my uncommitted changes and review it\"_\n- _\"Compare my local branch with origin and highlight conflicts\"_\n\n### Tip: AI Assistant Rules\n\nAdd to your AI assistant's custom instructions for automatic usage:\n\n```\nWhen reviewing large changesets or git commits, use diffchunk to handle large diff files.\nCreate temporary diff files and tracking files as needed and clean up after analysis.\n```\n\n## How It Works\n\nWhen you ask your AI assistant to analyze changes, it uses diffchunk's tools strategically:\n\n1. **Creates the diff file** (e.g., `git diff main..develop \u003e /tmp/changes.diff`) based on your question\n2. **Uses `list_chunks`** to get an overview of the diff structure and total scope\n3. **Uses `find_chunks_for_files`** to locate relevant sections when you ask about specific file types\n4. **Uses `get_chunk`** to examine specific sections without loading the entire diff into context\n5. **Tracks progress systematically** through large changesets, analyzing chunk by chunk\n6. **Cleans up temporary files** after completing the analysis\n\nThis lets your AI assistant handle massive diffs that would normally crash other tools, while providing thorough analysis without losing context.\n\n### Tool Usage Patterns\n\n**Overview first:**\n\n```python\nlist_chunks(\"/tmp/changes.diff\")\n# → 5 chunks across 12 files, 3,847 total lines\n```\n\n**Target specific files:**\n\n```python\nfind_chunks_for_files(\"/tmp/changes.diff\", \"*.py\")\n# → [1, 3, 5] - Python file chunks\n\nget_chunk(\"/tmp/changes.diff\", 1)\n# → Content of first Python chunk\n```\n\n**Systematic analysis:**\n\n```python\n# Process each chunk in sequence\nget_chunk(\"/tmp/changes.diff\", 1)\nget_chunk(\"/tmp/changes.diff\", 2)\n# ... continue through all chunks\n```\n\n## Configuration\n\n### Path Requirements\n\n- **Absolute paths only**: `/home/user/project/changes.diff`\n- **Cross-platform**: Windows (`C:\\path`) and Unix (`/path`)\n- **Home expansion**: `~/project/changes.diff`\n\n### Auto-Loading Defaults\n\nTools auto-load with optimized settings:\n\n- `max_chunk_lines`: 1000\n- `skip_trivial`: true (whitespace-only)\n- `skip_generated`: true (lock files, build artifacts)\n\n### Custom Settings\n\nUse `load_diff` for non-default behavior:\n\n```python\nload_diff(\n    \"/tmp/large.diff\",\n    max_chunk_lines=2000,\n    include_patterns=\"*.py,*.js\",\n    exclude_patterns=\"*test*\"\n)\n```\n\n## Supported Formats\n\n- Git diff output (`git diff`, `git show`)\n- Unified diff format (`diff -u`)\n- Multiple files in single diff\n- Binary file change indicators\n\n## Performance\n\n- Efficiently handles 100k+ line diffs\n- Memory efficient streaming\n- Auto-reload on file changes\n\n## Documentation\n\n- [Design](docs/design.md) - Architecture and implementation details\n- [Contributing](docs/CONTRIBUTING.md) - Development setup and workflows\n\n## License\n\n[MIT](./LICENSE)\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fpeteretelej%2Fdiffchunk","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fpeteretelej%2Fdiffchunk","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fpeteretelej%2Fdiffchunk/lists"}