{"id":47297153,"url":"https://github.com/symgraph/GhidrAssist","last_synced_at":"2026-03-30T23:00:49.893Z","repository":{"id":268565722,"uuid":"862642098","full_name":"symgraph/GhidrAssist","owner":"symgraph","description":"An LLM extension for Ghidra to enable AI assistance in RE.","archived":false,"fork":false,"pushed_at":"2026-03-28T15:30:57.000Z","size":4891,"stargazers_count":590,"open_issues_count":3,"forks_count":46,"subscribers_count":9,"default_branch":"master","last_synced_at":"2026-03-28T17:49:44.727Z","etag":null,"topics":["chatgpt","ghidra","ghidra-extension","ghidra-plugin","llm","ml","openai","re","reverse-engineering"],"latest_commit_sha":null,"homepage":"","language":"Java","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"mit","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/symgraph.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null,"zenodo":null,"notice":null,"maintainers":null,"copyright":null,"agents":null,"dco":null,"cla":null},"funding":{"github":["jtang613"],"patreon":"jtang613","open_collective":null,"ko_fi":null,"tidelift":null,"community_bridge":null,"liberapay":null,"issuehunt":null,"lfx_crowdfunding":null,"polar":null,"buy_me_a_coffee":"jtang613","thanks_dev":null,"custom":null}},"created_at":"2024-09-25T00:14:05.000Z","updated_at":"2026-03-28T16:44:07.000Z","dependencies_parsed_at":"2025-12-14T04:03:54.937Z","dependency_job_id":null,"html_url":"https://github.com/symgraph/GhidrAssist","commit_stats":null,"previous_names":["jtang613/ghidrassist","symgraph/ghidrassist"],"tags_count":63,"template":false,"template_full_name":null,"purl":"pkg:github/symgraph/GhidrAssist","reposit
ory_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/symgraph%2FGhidrAssist","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/symgraph%2FGhidrAssist/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/symgraph%2FGhidrAssist/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/symgraph%2FGhidrAssist/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/symgraph","download_url":"https://codeload.github.com/symgraph/GhidrAssist/tar.gz/refs/heads/master","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/symgraph%2FGhidrAssist/sbom","scorecard":null,"host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":286080680,"owners_count":31213707,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2026-03-30T15:24:02.938Z","status":"ssl_error","status_checked_at":"2026-03-30T15:23:44.804Z","response_time":138,"last_error":"SSL_connect returned=1 errno=0 peeraddr=140.82.121.5:443 state=error: unexpected eof while 
reading","robots_txt_status":"success","robots_txt_updated_at":"2025-07-24T06:49:26.215Z","robots_txt_url":"https://github.com/robots.txt","online":false,"can_crawl_api":true,"host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["chatgpt","ghidra","ghidra-extension","ghidra-plugin","llm","ml","openai","re","reverse-engineering"],"created_at":"2026-03-16T18:00:30.558Z","updated_at":"2026-03-30T23:00:49.886Z","avatar_url":"https://github.com/symgraph.png","language":"Java","funding_links":["https://github.com/sponsors/jtang613","https://patreon.com/jtang613","https://buymeacoffee.com/jtang613"],"categories":["Java"],"sub_categories":[],"readme":"# GhidrAssist\nAuthor: **Jason Tang**\n\n_An advanced LLM-powered plugin for interactive reverse engineering assistance in Ghidra._\n\n## Description\n\nGhidrAssist integrates Large Language Models (LLMs) into Ghidra to provide intelligent assistance for binary exploration and reverse engineering. 
It supports any OpenAI v1-compatible API, including local models (Ollama, LM-Studio, Open-WebUI) and cloud providers (OpenAI, Anthropic, Azure).\n\n### Key Features\n\n**Core Functionality:**\n* **Code Explanation** - Explain functions and instructions in both disassembly and decompiled pseudo-C\n  - Security analysis panel showing risk level, activity profile, and API usage\n  - Editable summaries with user-edit protection from auto-overwrite\n* **Interactive Chat** - Multi-turn conversational queries with persistent chat history\n* **Custom Queries** - Direct LLM queries with optional context from current function/location\n\n**Graph-RAG Knowledge System:**\n* **Semantic Knowledge Graph** - Hierarchical representation of binary analysis\n  - 5-level semantic hierarchy: Statement → Block → Function → Module → Binary\n  - Pre-computed LLM summaries enable fast, LLM-free queries\n  - SQLite persistence with JGraphT graph algorithms\n  - Full-text search (FTS5) on summaries and security annotations\n* **Community Detection** - Automatic module discovery via Leiden algorithm\n  - Groups related functions into logical modules\n  - Hierarchical community structure with summaries\n  - Visual graph exploration with configurable depth\n* **Security Feature Extraction** - Comprehensive security analysis\n  - Network APIs: POSIX sockets, WinSock, DNS, SSL/TLS, WinHTTP, WinINet\n  - File I/O APIs: POSIX, Windows, C library functions\n  - Crypto APIs: OpenSSL, Windows crypto, platform-specific\n  - String patterns: IP addresses, URLs, domains, file paths, registry keys\n  - Risk level classification (LOW/MEDIUM/HIGH) and activity profiling\n* **Semantic Graph Tab** - Visual knowledge graph interface\n  - Graph view with N-hop depth exploration\n  - List view of all indexed functions\n  - Semantic search across summaries\n  - One-click re-indexing and security analysis\n\n**Advanced Capabilities:**\n* **Extended Thinking/Reasoning Control** - Adjust LLM reasoning depth for 
quality vs. speed trade-offs\n  - Support for OpenAI o1/o3/o4, Claude with extended thinking, and local reasoning models\n  - Configurable effort levels: Low (fast), Medium (balanced), High (thorough)\n  - Per-program persistence - different binaries can use different reasoning levels\n  - Provider-agnostic implementation (Anthropic, OpenAI, Azure, LiteLLM, LMStudio, Ollama)\n* **ReAct Agentic Mode** - Autonomous investigation using structured reasoning (Think-Act-Observe)\n  - LLM proposes investigation steps based on your query\n  - Systematic tool execution with progress tracking via todo lists\n  - Iteration history preservation showing all investigation steps\n  - Final synthesis with comprehensive answer and key findings\n  - Accurate metrics (iterations, tool calls, duration)\n* **MCP Integration** - Model Context Protocol client for tool-based analysis\n  - Works with [GhidrAssistMCP](https://github.com/jtang613/GhidrAssistMCP) for Ghidra-specific tools\n  - Conversational tool calling with automatic function execution\n  - Support for SSE (Server-Sent Events) transport\n* **Function Calling** - LLM can autonomously navigate binaries and modify analysis\n  - Rename functions and variables\n  - Navigate to addresses and cross-references\n  - Execute Ghidra commands\n* **Actions Tab** - Propose and apply bulk analysis improvements\n  - Security vulnerability detection\n  - Code quality analysis\n  - Automated refactoring suggestions\n* **RAG (Retrieval Augmented Generation)** - Enhance queries with contextual documents\n  - Add custom documentation, exploit notes, architecture references\n  - Lucene-based full-text search\n  - Context injection into queries\n* **RLHF Dataset Generation** - Collect feedback for model fine-tuning\n\n\n### Architecture\n\nThe plugin uses a modular, service-oriented architecture:\n\n**Core Services:**\n- **Query Modes**: Regular queries, MCP-enhanced queries, or full agentic investigation\n- **ReAct Orchestrator**: Manages 
autonomous investigation loops with todo tracking and findings accumulation\n- **Conversational Tool Handler**: Manages multi-turn tool calling sessions\n- **MCPToolManager**: Interfaces with external MCP servers for specialized tools\n\n**Graph-RAG Backend:**\n- **BinaryKnowledgeGraph**: Hybrid SQLite + JGraphT storage for semantic knowledge\n- **GraphRAGEngine**: LLM-free query engine using pre-computed summaries\n- **SemanticExtractor**: LLM-powered function summarization with batch processing\n- **SecurityFeatureExtractor**: Static analysis for network, file I/O, and crypto APIs\n- **CommunityDetector**: Leiden algorithm implementation for module discovery\n\n**Data Layer:**\n- **AnalysisDB**: SQLite database for chat history, RLHF feedback, and knowledge graphs\n- **SchemaMigrationRunner**: Versioned database migrations for transparent upgrades\n- **RAGEngine**: Lucene-powered document search for custom context injection\n\n**UI Components:**\n- Tab-based interface: Explain, Query, Actions, Semantic Graph, RAG Management, MCP Servers\n- Service orchestration via TabController\n\nFuture Roadmap:\n* Model fine-tuning using collected RLHF dataset\n* Additional MCP tool integrations\n* Enhanced agentic capabilities, multi-agent collaboration\n* Embedding-based similarity search\n\n## Screenshots\n\n![Screenshot](https://github.com/user-attachments/assets/f5476e0d-5e30-4855-90a9-e0dbf39e16c7)\n\n\nhttps://github.com/user-attachments/assets/bd79474a-c82f-4083-b432-96625fef1387\n\n\n## Quickstart\n\n* If necessary, copy the binary release ZIP archive to the Ghidra_Install/Extensions/Ghidra directory.\n* Launch Ghidra -\u003e File -\u003e Install Extension -\u003e Enable GhidrAssist.\n* Load a binary and launch the CodeBrowser.\n* CodeBrowser -\u003e File -\u003e Configure -\u003e Miscellaneous -\u003e Enable GhidrAssist.\n* CodeBrowser -\u003e Tools -\u003e GhidrAssist Settings.\n* Ensure the RLHF and RAG database paths are appropriate for your environment.\n* Point 
the API host to your preferred API provider and set the API key.\n* (Optional) In the Analysis Options tab, set the Reasoning Effort level (None/Low/Medium/High) for models that support extended thinking.\n* Open GhidrAssist with the GhidrAssist option in the Window menu and start exploring.\n\n## LLM Setup\n\nGhidrAssist works with any OpenAI v1-compatible API. Setup details are provider-specific - here are some helpful resources:\n\n**Local LLM Providers:**\n- [LM Studio](https://lmstudio.ai/docs/basics) - Easy local model hosting with GUI\n- [Ollama](https://github.com/ollama/ollama#running-local-builds) - Command-line local model management\n- Open-WebUI - Web interface for local models\n\n**Cloud Providers:**\n- [OpenAI API](https://help.openai.com/en/articles/4936850-where-do-i-find-my-openai-api-key)\n- [Anthropic Claude](https://docs.anthropic.com/en/docs/initial-setup)\n- Azure OpenAI\n\n**LiteLLM Proxy (Multi-Provider Gateway):**\n- [LiteLLM](https://docs.litellm.ai/) - Unified API for 100+ LLM providers\n- Supports AWS Bedrock, Google Vertex AI, Azure, and many others\n- Select \"LiteLLM\" as provider type in GhidrAssist settings\n- Automatic model family detection for proper message formatting\n\n### Recommended Models\n\n**For Agentic Mode (requires strong reasoning and tool use):**\n- **Cloud**: GPT-5.1, Claude Sonnet 4.5\n- **Local**: GPT-OSS, Llama 3.3 70B, DeepSeek-R1 70B, Qwen2.5 72B\n\n**Models with Extended Thinking/Reasoning Support:**\n- **OpenAI**: o1-preview, o1-mini, o3-mini, o4-mini, gpt-5 (use `reasoning_effort` parameter)\n- **Anthropic**: Claude Sonnet 4.5, Claude Opus 4.5, Claude Haiku 4.5, Claude Opus 4.1/4, Claude Sonnet 4 (use `thinking.budget_tokens` parameter)\n- **Local**: openai/gpt-oss-20b via Ollama/LMStudio (supports effort levels)\n\n**Reasoning Effort Guidelines:**\n- **Low**: Quick analysis, minimal thinking tokens (~5-10s, lower cost)\n- **Medium**: Balanced reasoning depth (~15-30s, moderate cost)\n- **High**: Deep 
security analysis (~30-60s, 2x cost, recommended for vulnerability hunting)\n\n**Note**: Agentic mode requires models with strong function calling and multi-step reasoning capabilities. Smaller models may struggle with complex investigations. Extended thinking is optional but can significantly improve analysis quality for complex reverse engineering tasks.\n\n## Using GhidrAssistMCP for Tool-Based Analysis\n\n[GhidrAssistMCP](https://github.com/jtang613/GhidrAssistMCP) provides MCP tools that allow the LLM to interact directly with Ghidra's analysis capabilities.\n\n### Setup\n\n1. **Start the MCP Server**\n\n2. **Configure GhidrAssist:**\n   - Open Tools → GhidrAssist Settings → MCP Servers tab\n   - Add server: `http://127.0.0.1:8081` as `GhidrAssistMCP` with transport type `SSE`\n\n3. **Enable MCP in queries:**\n   - In the Custom Query tab, check \"Use MCP\"\n   - Optionally enable \"Agentic\" for autonomous investigation mode\n\n### Usage Modes\n\n**Regular MCP Queries:**\n- Enable \"Use MCP\" checkbox\n- Ask questions like \"What does the current function do?\"\n- LLM can call tools to get decompilation, cross-references, etc.\n\n**Agentic Mode (Recommended):**\n- Enable both \"Use MCP\" and \"Agentic\" checkboxes\n- Ask complex questions like \"Find vulnerabilities in this function\" or \"Analyze the call graph\"\n- The ReAct agent will:\n  1. Propose investigation steps as a todo list\n  2. Systematically execute tools to gather information\n  3. Track progress and accumulate findings\n  4. 
Synthesize a comprehensive answer with evidence\n\n**Example Queries:**\n- \"What security vulnerabilities exist in this function?\"\n- \"Trace the data flow from user input to this call\"\n- \"Find all functions that modify global variable X\"\n- \"Analyze the error handling in the current function\"\n\n## Using the Semantic Graph (Graph-RAG)\n\nThe Semantic Graph tab provides a knowledge graph interface for exploring binary analysis results without requiring LLM calls for every query.\n\n### Getting Started\n\n1. **Index the Binary:**\n   - Open the Semantic Graph tab\n   - Click \"ReIndex Binary\" to extract structural relationships\n   - Click \"Semantic Analysis\" to generate LLM summaries (requires API)\n   - Progress is shown in the status bar\n\n2. **Explore the Graph:**\n   - **List View**: Browse all indexed functions with summaries and security flags\n   - **Graph View**: Visualize call relationships with configurable N-hop depth\n   - **Search View**: Full-text search across summaries and security annotations\n\n3. 
**Security Analysis:**\n   - Click \"Security Analysis\" to scan for security-relevant features\n   - Results include: network APIs, file I/O, crypto usage, string patterns\n   - Risk levels (LOW/MEDIUM/HIGH) are assigned based on detected features\n\n### Explain Tab Integration\n\nWhen viewing a function in the Explain tab:\n- If the function is indexed, the pre-computed summary is shown instantly\n- Security panel displays: risk level, activity profile, APIs used\n- Click \"Edit\" to modify summaries (protected from auto-overwrite)\n- Use \"Refresh\" to re-generate the summary with the LLM\n\n### Benefits\n\n- **Fast Queries**: Pre-computed summaries eliminate LLM latency for repeat queries\n- **Offline Analysis**: Browse indexed data without API connectivity\n- **Security Focus**: Automatic detection of security-relevant code patterns\n- **Module Discovery**: Community detection groups related functions automatically\n\n## Homepage\nhttps://symgraph.ai\n\n\n## Minimum Version\n\nThis plugin requires the following minimum version of Ghidra:\n\n* 11.0\n\n## License\n\nThis plugin is released under an MIT license.\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fsymgraph%2FGhidrAssist","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fsymgraph%2FGhidrAssist","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fsymgraph%2FGhidrAssist/lists"}