{"id":24108246,"url":"https://github.com/vxcontrol/pentagi","last_synced_at":"2026-01-06T19:16:26.553Z","repository":{"id":271317551,"uuid":"913030762","full_name":"vxcontrol/pentagi","owner":"vxcontrol","description":"✨ Fully autonomous AI Agents system capable of performing complex penetration testing tasks","archived":false,"fork":false,"pushed_at":"2025-03-23T13:32:08.000Z","size":28550,"stargazers_count":105,"open_issues_count":8,"forks_count":19,"subscribers_count":7,"default_branch":"master","last_synced_at":"2025-03-23T13:48:04.423Z","etag":null,"topics":["ai-agents","ai-security-tool","anthropic","autonomous-agents","golang","gpt","graphql","multi-agent-system","offensive-security","open-source","openai","penetration-testing","penetration-testing-tools","react","security-automation","security-testing","security-tools","self-hosted"],"latest_commit_sha":null,"homepage":"https://pentagi.com","language":"Go","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"mit","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/vxcontrol.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null}},"created_at":"2025-01-06T22:22:48.000Z","updated_at":"2025-03-23T13:32:09.000Z","dependencies_parsed_at":"2025-02-28T03:00:14.764Z","dependency_job_id":"dba3033b-79c8-49a1-8db7-f64c23d071c9","html_url":"https://github.com/vxcontrol/pentagi","commit_stats":null,"previous_names":["vxcontrol/pentagi"],"tags_count":2,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/vxcontrol%2Fpentagi","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/vxcontrol%
2Fpentagi/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/vxcontrol%2Fpentagi/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/vxcontrol%2Fpentagi/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/vxcontrol","download_url":"https://codeload.github.com/vxcontrol/pentagi/tar.gz/refs/heads/master","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":246457967,"owners_count":20780676,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["ai-agents","ai-security-tool","anthropic","autonomous-agents","golang","gpt","graphql","multi-agent-system","offensive-security","open-source","openai","penetration-testing","penetration-testing-tools","react","security-automation","security-testing","security-tools","self-hosted"],"created_at":"2025-01-10T23:26:40.835Z","updated_at":"2026-01-06T19:16:26.545Z","avatar_url":"https://github.com/vxcontrol.png","language":"Go","readme":"# PentAGI\n\n\u003cdiv align=\"center\" style=\"font-size: 1.5em; margin: 20px 0;\"\u003e\n    \u003cstrong\u003eP\u003c/strong\u003eenetration testing \u003cstrong\u003eA\u003c/strong\u003ertificial \u003cstrong\u003eG\u003c/strong\u003eeneral \u003cstrong\u003eI\u003c/strong\u003entelligence\n\u003c/div\u003e\n\u003cbr\u003e\n\u003cdiv align=\"center\"\u003e\n\n\u003e 🚀 **Join the Community!** Connect with security researchers, AI enthusiasts, and fellow ethical hackers. 
Get support, share insights, and stay updated with the latest PentAGI developments.\n\n[![Discord](https://img.shields.io/badge/Discord-7289DA?logo=discord\u0026logoColor=white)](https://discord.gg/2xrMh7qX6m)⠀[![Telegram](https://img.shields.io/badge/Telegram-2CA5E0?logo=telegram\u0026logoColor=white)](https://t.me/+Ka9i6CNwe71hMWQy)\n\n\u003c/div\u003e\n\n## 📖 Table of Contents\n\n- [Overview](#-overview)\n- [Features](#-features)\n- [Quick Start](#-quick-start)\n- [Advanced Setup](#-advanced-setup)\n- [Development](#-development)\n- [Testing LLM Agents](#-testing-llm-agents)\n- [Embedding Configuration and Testing](#-embedding-configuration-and-testing)\n- [Function Testing with ftester](#-function-testing-with-ftester)\n- [Building](#%EF%B8%8F-building)\n- [Credits](#-credits)\n- [License](#-license)\n\n## 🎯 Overview\n\nPentAGI is an innovative tool for automated security testing that leverages cutting-edge artificial intelligence technologies. The project is designed for information security professionals, researchers, and enthusiasts who need a powerful and flexible solution for conducting penetration tests.\n\nYou can watch the video **PentAGI overview**:\n[![PentAGI Overview Video](https://github.com/user-attachments/assets/0828dc3e-15f1-4a1d-858e-9696a146e478)](https://youtu.be/R70x5Ddzs1o)\n\n## ✨ Features\n\n- 🛡️ Secure \u0026 Isolated. All operations are performed in a sandboxed Docker environment with complete isolation.\n- 🤖 Fully Autonomous. AI-powered agent that automatically determines and executes penetration testing steps.\n- 🔬 Professional Pentesting Tools. Built-in suite of 20+ professional security tools including nmap, metasploit, sqlmap, and more.\n- 🧠 Smart Memory System. Long-term storage of research results and successful approaches for future use.\n- 📚 Knowledge Graph Integration. Graphiti-powered knowledge graph using Neo4j for semantic relationship tracking and advanced context understanding.\n- 🔍 Web Intelligence. 
Built-in browser via [scraper](https://hub.docker.com/r/vxcontrol/scraper) for gathering the latest information from web sources.\n- 🔎 External Search Systems. Integration with advanced search APIs including [Tavily](https://tavily.com), [Traversaal](https://traversaal.ai), [Perplexity](https://www.perplexity.ai), [DuckDuckGo](https://duckduckgo.com/), [Google Custom Search](https://programmablesearchengine.google.com/), and [Searxng](https://searxng.org) for comprehensive information gathering.\n- 👥 Team of Specialists. Delegation system with specialized AI agents for research, development, and infrastructure tasks.\n- 📊 Comprehensive Monitoring. Detailed logging and integration with Grafana/Prometheus for real-time system observation.\n- 📝 Detailed Reporting. Generation of thorough vulnerability reports with exploitation guides.\n- 📦 Smart Container Management. Automatic Docker image selection based on specific task requirements.\n- 📱 Modern Interface. Clean and intuitive web UI for system management and monitoring.\n- 🔌 API Integration. Support for REST and GraphQL APIs for seamless external system integration.\n- 💾 Persistent Storage. All commands and outputs are stored in PostgreSQL with the [pgvector](https://hub.docker.com/r/vxcontrol/pgvector) extension.\n- 🎯 Scalable Architecture. Microservices-based design supporting horizontal scaling.\n- 🏠 Self-Hosted Solution. Complete control over your deployment and data.\n- 🔑 Flexible Authentication. Support for various LLM providers ([OpenAI](https://platform.openai.com/), [Anthropic](https://www.anthropic.com/), [Ollama](https://ollama.com/), [AWS Bedrock](https://aws.amazon.com/bedrock/), [Google AI/Gemini](https://ai.google.dev/), [Deep Infra](https://deepinfra.com/), [OpenRouter](https://openrouter.ai/), [DeepSeek](https://www.deepseek.com/en), [Moonshot](https://platform.moonshot.ai/)) and custom configurations.\n- ⚡ Quick Deployment. 
Easy setup through [Docker Compose](https://docs.docker.com/compose/) with comprehensive environment configuration.\n\n## 🏗️ Architecture\n\n### System Context\n\n```mermaid\nflowchart TB\n    classDef person fill:#08427B,stroke:#073B6F,color:#fff\n    classDef system fill:#1168BD,stroke:#0B4884,color:#fff\n    classDef external fill:#666666,stroke:#0B4884,color:#fff\n\n    pentester[\"👤 Security Engineer\n    (User of the system)\"]\n\n    pentagi[\"✨ PentAGI\n    (Autonomous penetration testing system)\"]\n\n    target[\"🎯 target-system\n    (System under test)\"]\n    llm[\"🧠 llm-provider\n    (OpenAI/Anthropic/Ollama/Bedrock/Gemini/Custom)\"]\n    search[\"🔍 search-systems\n    (Google/DuckDuckGo/Tavily/Traversaal/Perplexity/Searxng)\"]\n    langfuse[\"📊 langfuse-ui\n    (LLM Observability Dashboard)\"]\n    grafana[\"📈 grafana\n    (System Monitoring Dashboard)\"]\n\n    pentester --\u003e |Uses HTTPS| pentagi\n    pentester --\u003e |Monitors AI HTTPS| langfuse\n    pentester --\u003e |Monitors System HTTPS| grafana\n    pentagi --\u003e |Tests Various protocols| target\n    pentagi --\u003e |Queries HTTPS| llm\n    pentagi --\u003e |Searches HTTPS| search\n    pentagi --\u003e |Reports HTTPS| langfuse\n    pentagi --\u003e |Reports HTTPS| grafana\n\n    class pentester person\n    class pentagi system\n    class target,llm,search,langfuse,grafana external\n\n    linkStyle default stroke:#ffffff,color:#ffffff\n```\n\n\u003cdetails\u003e\n\u003csummary\u003e\u003cb\u003e🔄 Container Architecture\u003c/b\u003e (click to expand)\u003c/summary\u003e\n\n```mermaid\ngraph TB\n    subgraph Core Services\n        UI[Frontend UI\u003cbr/\u003eReact + TypeScript]\n        API[Backend API\u003cbr/\u003eGo + GraphQL]\n        DB[(Vector Store\u003cbr/\u003ePostgreSQL + pgvector)]\n        MQ[Task Queue\u003cbr/\u003eAsync Processing]\n        Agent[AI Agents\u003cbr/\u003eMulti-Agent System]\n    end\n\n    subgraph Knowledge Graph\n        
Graphiti[Graphiti\u003cbr/\u003eKnowledge Graph API]\n        Neo4j[(Neo4j\u003cbr/\u003eGraph Database)]\n    end\n\n    subgraph Monitoring\n        Grafana[Grafana\u003cbr/\u003eDashboards]\n        VictoriaMetrics[VictoriaMetrics\u003cbr/\u003eTime-series DB]\n        Jaeger[Jaeger\u003cbr/\u003eDistributed Tracing]\n        Loki[Loki\u003cbr/\u003eLog Aggregation]\n        OTEL[OpenTelemetry\u003cbr/\u003eData Collection]\n    end\n\n    subgraph Analytics\n        Langfuse[Langfuse\u003cbr/\u003eLLM Analytics]\n        ClickHouse[ClickHouse\u003cbr/\u003eAnalytics DB]\n        Redis[Redis\u003cbr/\u003eCache + Rate Limiter]\n        MinIO[MinIO\u003cbr/\u003eS3 Storage]\n    end\n\n    subgraph Security Tools\n        Scraper[Web Scraper\u003cbr/\u003eIsolated Browser]\n        PenTest[Security Tools\u003cbr/\u003e20+ Pro Tools\u003cbr/\u003eSandboxed Execution]\n    end\n\n    UI --\u003e |HTTP/WS| API\n    API --\u003e |SQL| DB\n    API --\u003e |Events| MQ\n    MQ --\u003e |Tasks| Agent\n    Agent --\u003e |Commands| PenTest\n    Agent --\u003e |Queries| DB\n    Agent --\u003e |Knowledge| Graphiti\n    Graphiti --\u003e |Graph| Neo4j\n\n    API --\u003e |Telemetry| OTEL\n    OTEL --\u003e |Metrics| VictoriaMetrics\n    OTEL --\u003e |Traces| Jaeger\n    OTEL --\u003e |Logs| Loki\n\n    Grafana --\u003e |Query| VictoriaMetrics\n    Grafana --\u003e |Query| Jaeger\n    Grafana --\u003e |Query| Loki\n\n    API --\u003e |Analytics| Langfuse\n    Langfuse --\u003e |Store| ClickHouse\n    Langfuse --\u003e |Cache| Redis\n    Langfuse --\u003e |Files| MinIO\n\n    classDef core fill:#f9f,stroke:#333,stroke-width:2px,color:#000\n    classDef knowledge fill:#ffa,stroke:#333,stroke-width:2px,color:#000\n    classDef monitoring fill:#bbf,stroke:#333,stroke-width:2px,color:#000\n    classDef analytics fill:#bfb,stroke:#333,stroke-width:2px,color:#000\n    classDef tools fill:#fbb,stroke:#333,stroke-width:2px,color:#000\n\n    class UI,API,DB,MQ,Agent core\n    class 
Graphiti,Neo4j knowledge\n    class Grafana,VictoriaMetrics,Jaeger,Loki,OTEL monitoring\n    class Langfuse,ClickHouse,Redis,MinIO analytics\n    class Scraper,PenTest tools\n```\n\n\u003c/details\u003e\n\n\u003cdetails\u003e\n\u003csummary\u003e\u003cb\u003e📊 Entity Relationship\u003c/b\u003e (click to expand)\u003c/summary\u003e\n\n```mermaid\nerDiagram\n    Flow ||--o{ Task : contains\n    Task ||--o{ SubTask : contains\n    SubTask ||--o{ Action : contains\n    Action ||--o{ Artifact : produces\n    Action ||--o{ Memory : stores\n\n    Flow {\n        string id PK\n        string name \"Flow name\"\n        string description \"Flow description\"\n        string status \"active/completed/failed\"\n        json parameters \"Flow parameters\"\n        timestamp created_at\n        timestamp updated_at\n    }\n\n    Task {\n        string id PK\n        string flow_id FK\n        string name \"Task name\"\n        string description \"Task description\"\n        string status \"pending/running/done/failed\"\n        json result \"Task results\"\n        timestamp created_at\n        timestamp updated_at\n    }\n\n    SubTask {\n        string id PK\n        string task_id FK\n        string name \"Subtask name\"\n        string description \"Subtask description\"\n        string status \"queued/running/completed/failed\"\n        string agent_type \"researcher/developer/executor\"\n        json context \"Agent context\"\n        timestamp created_at\n        timestamp updated_at\n    }\n\n    Action {\n        string id PK\n        string subtask_id FK\n        string type \"command/search/analyze/etc\"\n        string status \"success/failure\"\n        json parameters \"Action parameters\"\n        json result \"Action results\"\n        timestamp created_at\n    }\n\n    Artifact {\n        string id PK\n        string action_id FK\n        string type \"file/report/log\"\n        string path \"Storage path\"\n        json metadata \"Additional info\"\n        
timestamp created_at\n    }\n\n    Memory {\n        string id PK\n        string action_id FK\n        string type \"observation/conclusion\"\n        vector embedding \"Vector representation\"\n        text content \"Memory content\"\n        timestamp created_at\n    }\n```\n\n\u003c/details\u003e\n\n\u003cdetails\u003e\n\u003csummary\u003e\u003cb\u003e🤖 Agent Interaction\u003c/b\u003e (click to expand)\u003c/summary\u003e\n\n```mermaid\nsequenceDiagram\n    participant O as Orchestrator\n    participant R as Researcher\n    participant D as Developer\n    participant E as Executor\n    participant VS as Vector Store\n    participant KB as Knowledge Base\n\n    Note over O,KB: Flow Initialization\n    O-\u003e\u003eVS: Query similar tasks\n    VS--\u003e\u003eO: Return experiences\n    O-\u003e\u003eKB: Load relevant knowledge\n    KB--\u003e\u003eO: Return context\n\n    Note over O,R: Research Phase\n    O-\u003e\u003eR: Analyze target\n    R-\u003e\u003eVS: Search similar cases\n    VS--\u003e\u003eR: Return patterns\n    R-\u003e\u003eKB: Query vulnerabilities\n    KB--\u003e\u003eR: Return known issues\n    R-\u003e\u003eVS: Store findings\n    R--\u003e\u003eO: Research results\n\n    Note over O,D: Planning Phase\n    O-\u003e\u003eD: Plan attack\n    D-\u003e\u003eVS: Query exploits\n    VS--\u003e\u003eD: Return techniques\n    D-\u003e\u003eKB: Load tools info\n    KB--\u003e\u003eD: Return capabilities\n    D--\u003e\u003eO: Attack plan\n\n    Note over O,E: Execution Phase\n    O-\u003e\u003eE: Execute plan\n    E-\u003e\u003eKB: Load tool guides\n    KB--\u003e\u003eE: Return procedures\n    E-\u003e\u003eVS: Store results\n    E--\u003e\u003eO: Execution status\n```\n\n\u003c/details\u003e\n\n\u003cdetails\u003e\n\u003csummary\u003e\u003cb\u003e🧠 Memory System\u003c/b\u003e (click to expand)\u003c/summary\u003e\n\n```mermaid\ngraph TB\n    subgraph \"Long-term Memory\"\n        VS[(Vector Store\u003cbr/\u003eEmbeddings DB)]\n        KB[Knowledge 
Base\u003cbr/\u003eDomain Expertise]\n        Tools[Tools Knowledge\u003cbr/\u003eUsage Patterns]\n    end\n\n    subgraph \"Working Memory\"\n        Context[Current Context\u003cbr/\u003eTask State]\n        Goals[Active Goals\u003cbr/\u003eObjectives]\n        State[System State\u003cbr/\u003eResources]\n    end\n\n    subgraph \"Episodic Memory\"\n        Actions[Past Actions\u003cbr/\u003eCommands History]\n        Results[Action Results\u003cbr/\u003eOutcomes]\n        Patterns[Success Patterns\u003cbr/\u003eBest Practices]\n    end\n\n    Context --\u003e |Query| VS\n    VS --\u003e |Retrieve| Context\n\n    Goals --\u003e |Consult| KB\n    KB --\u003e |Guide| Goals\n\n    State --\u003e |Record| Actions\n    Actions --\u003e |Learn| Patterns\n    Patterns --\u003e |Store| VS\n\n    Tools --\u003e |Inform| State\n    Results --\u003e |Update| Tools\n\n    VS --\u003e |Enhance| KB\n    KB --\u003e |Index| VS\n\n    classDef ltm fill:#f9f,stroke:#333,stroke-width:2px,color:#000\n    classDef wm fill:#bbf,stroke:#333,stroke-width:2px,color:#000\n    classDef em fill:#bfb,stroke:#333,stroke-width:2px,color:#000\n\n    class VS,KB,Tools ltm\n    class Context,Goals,State wm\n    class Actions,Results,Patterns em\n```\n\n\u003c/details\u003e\n\n\u003cdetails\u003e\n\u003csummary\u003e\u003cb\u003e🔄 Chain Summarization\u003c/b\u003e (click to expand)\u003c/summary\u003e\n\nThe chain summarization system manages conversation context growth by selectively summarizing older messages. 
This is critical for preventing token limits from being exceeded while maintaining conversation coherence.\n\n```mermaid\nflowchart TD\n    A[Input Chain] --\u003e B{Needs Summarization?}\n    B --\u003e|No| C[Return Original Chain]\n    B --\u003e|Yes| D[Convert to ChainAST]\n    D --\u003e E[Apply Section Summarization]\n    E --\u003e F[Process Oversized Pairs]\n    F --\u003e G[Manage Last Section Size]\n    G --\u003e H[Apply QA Summarization]\n    H --\u003e I[Rebuild Chain with Summaries]\n    I --\u003e J{Is New Chain Smaller?}\n    J --\u003e|Yes| K[Return Optimized Chain]\n    J --\u003e|No| C\n\n    classDef process fill:#bbf,stroke:#333,stroke-width:2px,color:#000\n    classDef decision fill:#bfb,stroke:#333,stroke-width:2px,color:#000\n    classDef output fill:#fbb,stroke:#333,stroke-width:2px,color:#000\n\n    class A,D,E,F,G,H,I process\n    class B,J decision\n    class C,K output\n```\n\nThe algorithm operates on a structured representation of conversation chains (ChainAST) that preserves message types including tool calls and their responses. 
All summarization operations maintain critical conversation flow while reducing context size.\n\n### Global Summarizer Configuration Options\n\n| Parameter | Environment Variable | Default | Description |\n|-----------|----------------------|---------|-------------|\n| Preserve Last | `SUMMARIZER_PRESERVE_LAST` | `true` | Whether to keep all messages in the last section intact |\n| Use QA Pairs | `SUMMARIZER_USE_QA` | `true` | Whether to use QA pair summarization strategy |\n| Summarize Human in QA | `SUMMARIZER_SUM_MSG_HUMAN_IN_QA` | `false` | Whether to summarize human messages in QA pairs |\n| Last Section Size | `SUMMARIZER_LAST_SEC_BYTES` | `51200` | Maximum byte size for last section (50KB) |\n| Max Body Pair Size | `SUMMARIZER_MAX_BP_BYTES` | `16384` | Maximum byte size for a single body pair (16KB) |\n| Max QA Sections | `SUMMARIZER_MAX_QA_SECTIONS` | `10` | Maximum QA pair sections to preserve |\n| Max QA Size | `SUMMARIZER_MAX_QA_BYTES` | `65536` | Maximum byte size for QA pair sections (64KB) |\n| Keep QA Sections | `SUMMARIZER_KEEP_QA_SECTIONS` | `1` | Number of recent QA sections to keep without summarization |\n\n### Assistant Summarizer Configuration Options\n\nAssistant instances can use customized summarization settings to fine-tune context management behavior:\n\n| Parameter | Environment Variable | Default | Description |\n|-----------|----------------------|---------|-------------|\n| Preserve Last | `ASSISTANT_SUMMARIZER_PRESERVE_LAST` | `true` | Whether to preserve all messages in the assistant's last section |\n| Last Section Size | `ASSISTANT_SUMMARIZER_LAST_SEC_BYTES` | `76800` | Maximum byte size for assistant's last section (75KB) |\n| Max Body Pair Size | `ASSISTANT_SUMMARIZER_MAX_BP_BYTES` | `16384` | Maximum byte size for a single body pair in assistant context (16KB) |\n| Max QA Sections | `ASSISTANT_SUMMARIZER_MAX_QA_SECTIONS` | `7` | Maximum QA sections to preserve in assistant context |\n| Max QA Size | 
`ASSISTANT_SUMMARIZER_MAX_QA_BYTES` | `76800` | Maximum byte size for assistant's QA sections (75KB) |\n| Keep QA Sections | `ASSISTANT_SUMMARIZER_KEEP_QA_SECTIONS` | `3` | Number of recent QA sections to preserve without summarization |\n\nThe assistant summarizer configuration provides more memory for context retention compared to the global settings, preserving more recent conversation history while still ensuring efficient token usage.\n\n### Summarizer Environment Configuration\n\n```bash\n# Default values for global summarizer logic\nSUMMARIZER_PRESERVE_LAST=true\nSUMMARIZER_USE_QA=true\nSUMMARIZER_SUM_MSG_HUMAN_IN_QA=false\nSUMMARIZER_LAST_SEC_BYTES=51200\nSUMMARIZER_MAX_BP_BYTES=16384\nSUMMARIZER_MAX_QA_SECTIONS=10\nSUMMARIZER_MAX_QA_BYTES=65536\nSUMMARIZER_KEEP_QA_SECTIONS=1\n\n# Default values for assistant summarizer logic\nASSISTANT_SUMMARIZER_PRESERVE_LAST=true\nASSISTANT_SUMMARIZER_LAST_SEC_BYTES=76800\nASSISTANT_SUMMARIZER_MAX_BP_BYTES=16384\nASSISTANT_SUMMARIZER_MAX_QA_SECTIONS=7\nASSISTANT_SUMMARIZER_MAX_QA_BYTES=76800\nASSISTANT_SUMMARIZER_KEEP_QA_SECTIONS=3\n```\n\n\u003c/details\u003e\n\nThe architecture of PentAGI is designed to be modular, scalable, and secure. Here are the key components:\n\n1. **Core Services**\n   - Frontend UI: React-based web interface with TypeScript for type safety\n   - Backend API: Go-based REST and GraphQL APIs for flexible integration\n   - Vector Store: PostgreSQL with pgvector for semantic search and memory storage\n   - Task Queue: Async task processing system for reliable operation\n   - AI Agent: Multi-agent system with specialized roles for efficient testing\n\n2. **Knowledge Graph**\n   - Graphiti: Knowledge graph API for semantic relationship tracking and contextual understanding\n   - Neo4j: Graph database for storing and querying relationships between entities, actions, and outcomes\n   - Automatic capturing of agent responses and tool executions for building comprehensive knowledge base\n\n3. 
**Monitoring Stack**\n   - OpenTelemetry: Unified observability data collection and correlation\n   - Grafana: Real-time visualization and alerting dashboards\n   - VictoriaMetrics: High-performance time-series metrics storage\n   - Jaeger: End-to-end distributed tracing for debugging\n   - Loki: Scalable log aggregation and analysis\n\n4. **Analytics Platform**\n   - Langfuse: Advanced LLM observability and performance analytics\n   - ClickHouse: Column-oriented analytics data warehouse\n   - Redis: High-speed caching and rate limiting\n   - MinIO: S3-compatible object storage for artifacts\n\n5. **Security Tools**\n   - Web Scraper: Isolated browser environment for safe web interaction\n   - Pentesting Tools: Comprehensive suite of 20+ professional security tools\n   - Sandboxed Execution: All operations run in isolated containers\n\n6. **Memory Systems**\n   - Long-term Memory: Persistent storage of knowledge and experiences\n   - Working Memory: Active context and goals for current operations\n   - Episodic Memory: Historical actions and success patterns\n   - Knowledge Base: Structured domain expertise and tool capabilities\n   - Context Management: Intelligently manages growing LLM context windows using chain summarization\n\nThe system uses Docker containers for isolation and easy deployment, with separate networks for core services, monitoring, and analytics to ensure proper security boundaries. Each component is designed to scale horizontally and can be configured for high availability in production environments.\n\n## 🚀 Quick Start\n\n### System Requirements\n\n- Docker and Docker Compose\n- Minimum 2 vCPU\n- Minimum 4GB RAM\n- 20GB free disk space\n- Internet access for downloading images and updates\n\n### Using Installer (Recommended)\n\nPentAGI provides an interactive installer with a terminal-based UI for streamlined configuration and deployment. 
The installer guides you through system checks, LLM provider setup, search engine configuration, and security hardening.\n\n**Supported Platforms:**\n- **Linux**: amd64 [download](https://pentagi.com/downloads/linux/amd64/installer-latest.zip) | arm64 [download](https://pentagi.com/downloads/linux/arm64/installer-latest.zip)\n- **Windows**: amd64 [download](https://pentagi.com/downloads/windows/amd64/installer-latest.zip)\n- **macOS**: amd64 (Intel) [download](https://pentagi.com/downloads/darwin/amd64/installer-latest.zip) | arm64 (M-series) [download](https://pentagi.com/downloads/darwin/arm64/installer-latest.zip)\n\n**Quick Installation (Linux amd64):**\n\n```bash\n# Create installation directory\nmkdir -p pentagi \u0026\u0026 cd pentagi\n\n# Download installer\nwget -O installer.zip https://pentagi.com/downloads/linux/amd64/installer-latest.zip\n\n# Extract\nunzip installer.zip\n\n# Run interactive installer\n./installer\n```\n\n**Prerequisites \u0026 Permissions:**\n\nThe installer requires appropriate privileges to interact with the Docker API for proper operation. By default, it uses the Docker socket (`/var/run/docker.sock`) which requires either:\n\n- **Option 1 (Recommended for production):** Run the installer as root:\n  ```bash\n  sudo ./installer\n  ```\n\n- **Option 2 (Development environments):** Grant your user access to the Docker socket by adding them to the `docker` group:\n  ```bash\n  # Add your user to the docker group\n  sudo usermod -aG docker $USER\n  \n  # Log out and log back in, or activate the group immediately\n  newgrp docker\n  \n  # Verify Docker access (should run without sudo)\n  docker ps\n  ```\n\n  ⚠️ **Security Note:** Adding a user to the `docker` group grants root-equivalent privileges. Only do this for trusted users in controlled environments. For production deployments, consider using rootless Docker mode or running the installer with sudo.\n\nThe installer will:\n1. 
**System Checks**: Verify Docker, network connectivity, and system requirements\n2. **Environment Setup**: Create and configure `.env` file with optimal defaults\n3. **Provider Configuration**: Set up LLM providers (OpenAI, Anthropic, Gemini, Bedrock, Ollama, Custom)\n4. **Search Engines**: Configure DuckDuckGo, Google, Tavily, Traversaal, Perplexity, Searxng\n5. **Security Hardening**: Generate secure credentials and configure SSL certificates\n6. **Deployment**: Start PentAGI with docker-compose\n\n**For Production \u0026 Enhanced Security:**\n\nFor production deployments or security-sensitive environments, we **strongly recommend** using a distributed two-node architecture where worker operations are isolated on a separate server. This prevents untrusted code execution and network access issues on your main system.\n\n👉 **See detailed guide**: [Worker Node Setup](examples/guides/worker_node.md)\n\nThe two-node setup provides:\n- **Isolated Execution**: Worker containers run on dedicated hardware\n- **Network Isolation**: Separate network boundaries for penetration testing\n- **Security Boundaries**: Docker-in-Docker with TLS authentication\n- **OOB Attack Support**: Dedicated port ranges for out-of-band techniques\n\n### Manual Installation\n\n1. Create a working directory or clone the repository:\n\n```bash\nmkdir pentagi \u0026\u0026 cd pentagi\n```\n\n2. Copy `.env.example` to `.env` or download it:\n\n```bash\ncurl -o .env https://raw.githubusercontent.com/vxcontrol/pentagi/master/.env.example\n```\n\n3. 
Fill in the required API keys in the `.env` file.\n\n```bash\n# Required: At least one of these LLM providers\nOPEN_AI_KEY=your_openai_key\nANTHROPIC_API_KEY=your_anthropic_key\nGEMINI_API_KEY=your_gemini_key\n\n# Optional: AWS Bedrock provider (enterprise-grade models)\nBEDROCK_REGION=us-east-1\nBEDROCK_ACCESS_KEY_ID=your_aws_access_key\nBEDROCK_SECRET_ACCESS_KEY=your_aws_secret_key\n\n# Optional: Local LLM provider (zero-cost inference)\nOLLAMA_SERVER_URL=http://localhost:11434\n\n# Optional: Additional search capabilities\nDUCKDUCKGO_ENABLED=true\nGOOGLE_API_KEY=your_google_key\nGOOGLE_CX_KEY=your_google_cx\nTAVILY_API_KEY=your_tavily_key\nTRAVERSAAL_API_KEY=your_traversaal_key\nPERPLEXITY_API_KEY=your_perplexity_key\nPERPLEXITY_MODEL=sonar-pro\nPERPLEXITY_CONTEXT_SIZE=medium\n\n# Searxng meta search engine (aggregates results from multiple sources)\nSEARXNG_URL=http://your-searxng-instance:8080\nSEARXNG_CATEGORIES=general\nSEARXNG_LANGUAGE=\nSEARXNG_SAFESEARCH=0\nSEARXNG_TIME_RANGE=\n\n# Graphiti knowledge graph settings\nGRAPHITI_ENABLED=true\nGRAPHITI_TIMEOUT=30\nGRAPHITI_URL=http://graphiti:8000\nGRAPHITI_MODEL_NAME=gpt-5-mini\n\n# Neo4j settings (used by Graphiti stack)\nNEO4J_USER=neo4j\nNEO4J_DATABASE=neo4j\nNEO4J_PASSWORD=devpassword\nNEO4J_URI=bolt://neo4j:7687\n\n# Assistant configuration\nASSISTANT_USE_AGENTS=false         # Default value for agent usage when creating new assistants\n```\n\n4. Change all security-related environment variables in the `.env` file to improve security.\n\n\u003cdetails\u003e\n    \u003csummary\u003eSecurity-related environment variables\u003c/summary\u003e\n\n### Main Security Settings\n- `COOKIE_SIGNING_SALT` - Salt for cookie signing, change to a random value\n- `PUBLIC_URL` - Public URL of your server (e.g. 
`https://pentagi.example.com`)\n- `SERVER_SSL_CRT` and `SERVER_SSL_KEY` - Custom paths to your existing SSL certificate and key for HTTPS (these paths should be mounted as volumes in the docker-compose.yml file)\n\n### Scraper Access\n- `SCRAPER_PUBLIC_URL` - Public URL for the scraper if you want to use a different scraper server for public URLs\n- `SCRAPER_PRIVATE_URL` - Private URL for the scraper (the local scraper server in the docker-compose.yml file, used to access local URLs)\n\n### Access Credentials\n- `PENTAGI_POSTGRES_USER` and `PENTAGI_POSTGRES_PASSWORD` - PostgreSQL credentials\n- `NEO4J_USER` and `NEO4J_PASSWORD` - Neo4j credentials (for the Graphiti knowledge graph)\n\n\u003c/details\u003e\n\n5. Remove all inline comments from the `.env` file if you want to use it in VSCode or other IDEs as an `envFile` option:\n\n```bash\nperl -i -pe 's/\\s+#.*$//' .env\n```\n\n6. Run the PentAGI stack:\n\n```bash\ncurl -O https://raw.githubusercontent.com/vxcontrol/pentagi/master/docker-compose.yml\ndocker compose up -d\n```\n\nVisit [localhost:8443](https://localhost:8443) to access the PentAGI Web UI (default credentials: `admin@pentagi.com` / `admin`)\n\n\u003e [!NOTE]\n\u003e If you encounter an error about `pentagi-network`, `observability-network`, or `langfuse-network`, run `docker-compose.yml` first to create these networks, and after that run `docker-compose-langfuse.yml`, `docker-compose-graphiti.yml`, and `docker-compose-observability.yml` to use the Langfuse, Graphiti, and Observability services.\n\u003e\n\u003e You must configure at least one language model provider (OpenAI, Anthropic, Gemini, AWS Bedrock, or Ollama) to use PentAGI. AWS Bedrock provides enterprise-grade access to multiple foundation models from leading AI companies, while Ollama provides zero-cost local inference if you have sufficient computational resources. 
Additional API keys for search engines are optional but recommended for better results.\n\u003e\n\u003e The `LLM_SERVER_*` environment variables are an experimental feature and may change in the future. For now, you can use them to specify a custom LLM server URL and a single model for all agent types.\n\u003e\n\u003e `PROXY_URL` is a global proxy URL for all LLM providers and external search systems. You can use it to isolate PentAGI from external networks.\n\u003e\n\u003e The `docker-compose.yml` file runs the PentAGI service as the root user because it needs access to docker.sock for container management. If you're using a TCP/IP network connection to Docker instead of the socket file, you can remove root privileges and use the default `pentagi` user for better security.\n\n### Assistant Configuration\n\nPentAGI allows you to configure the default behavior for assistants:\n\n| Variable | Default | Description |\n|----------|---------|-------------|\n| `ASSISTANT_USE_AGENTS` | `false` | Controls the default value for agent usage when creating new assistants |\n\nThe `ASSISTANT_USE_AGENTS` setting affects the initial state of the \"Use Agents\" toggle when creating a new assistant in the UI:\n- `false` (default): New assistants are created with agent delegation disabled by default\n- `true`: New assistants are created with agent delegation enabled by default\n\nNote that users can always override this setting by toggling the \"Use Agents\" button in the UI when creating or editing an assistant. 
This environment variable only controls the initial default state.\n\n### Custom LLM Provider Configuration\n\nWhen using custom LLM providers with the `LLM_SERVER_*` variables, you can fine-tune the reasoning format used in requests:\n\n| Variable | Default | Description |\n|----------|---------|-------------|\n| `LLM_SERVER_URL` | | Base URL for the custom LLM API endpoint |\n| `LLM_SERVER_KEY` | | API key for the custom LLM provider |\n| `LLM_SERVER_MODEL` | | Default model to use (can be overridden in provider config) |\n| `LLM_SERVER_CONFIG_PATH` | | Path to the YAML configuration file for agent-specific models |\n| `LLM_SERVER_LEGACY_REASONING` | `false` | Controls reasoning format in API requests |\n\nThe `LLM_SERVER_LEGACY_REASONING` setting affects how reasoning parameters are sent to the LLM:\n- `false` (default): Uses modern format where reasoning is sent as a structured object with `max_tokens` parameter\n- `true`: Uses legacy format with string-based `reasoning_effort` parameter\n\nThis setting is important when working with different LLM providers as they may expect different reasoning formats in their API requests. 
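To make the two formats concrete, here is a minimal sketch of how such request payloads might differ. The field names below (`reasoning`, `max_tokens`, `reasoning_effort`) follow the description above and are illustrative only; the exact payloads PentAGI emits may differ.

```python
# Illustrative sketch of the two reasoning formats described above.
# The field names are assumptions based on the description, not the
# exact wire format PentAGI uses.

def build_payload(model: str, legacy: bool) -> dict:
    payload = {"model": model, "messages": [{"role": "user", "content": "hi"}]}
    if legacy:
        # LLM_SERVER_LEGACY_REASONING=true: string-based effort parameter
        payload["reasoning_effort"] = "medium"
    else:
        # LLM_SERVER_LEGACY_REASONING=false: structured reasoning object
        payload["reasoning"] = {"max_tokens": 4096}
    return payload

modern = build_payload("my-model", legacy=False)
legacy = build_payload("my-model", legacy=True)
print("reasoning" in modern, "reasoning_effort" in legacy)  # True True
```

A provider that expects one shape will typically reject the other with a validation error, which is why toggling this flag can resolve reasoning-related failures.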
If you encounter reasoning-related errors with custom providers, try changing this setting.\n\n### Local LLM Provider Configuration\n\nPentAGI supports Ollama for local LLM inference, providing zero-cost operation and enhanced privacy:\n\n| Variable | Default | Description |\n|----------|---------|-------------|\n| `OLLAMA_SERVER_URL` | | URL of your Ollama server |\n| `OLLAMA_SERVER_CONFIG_PATH` | | Path to a custom agent configuration file |\n\nConfiguration examples:\n\n```bash\n# Basic Ollama setup\nOLLAMA_SERVER_URL=http://localhost:11434\n\n# Remote Ollama server\nOLLAMA_SERVER_URL=http://ollama-server:11434\n\n# Custom configuration with agent-specific models\nOLLAMA_SERVER_CONFIG_PATH=/path/to/ollama-config.yml\n\n# Default configuration file inside the docker container (just for example)\nOLLAMA_SERVER_CONFIG_PATH=/opt/pentagi/conf/ollama-llama318b.provider.yml\n```\n\nThe system automatically discovers available models from your Ollama server and provides zero-cost inference for penetration testing workflows.\n\n#### Creating Custom Ollama Models with Extended Context\n\nPentAGI requires models with larger context windows than the default Ollama configurations provide. You need to create custom models with an increased `num_ctx` parameter via Modelfiles. 
While typical agent workflows consume around 64K tokens, PentAGI uses 110K context size for safety margin and handling complex penetration testing scenarios.\n\n**Important**: The `num_ctx` parameter can only be set during model creation via Modelfile - it cannot be changed after model creation or overridden at runtime.\n\n##### Example: Qwen3 32B FP16 with Extended Context\n\nCreate a Modelfile named `Modelfile_qwen3_32b_fp16_tc`:\n\n```dockerfile\nFROM qwen3:32b-fp16\nPARAMETER num_ctx 110000\nPARAMETER temperature 0.3\nPARAMETER top_p 0.8\nPARAMETER min_p 0.0\nPARAMETER top_k 20\nPARAMETER repeat_penalty 1.1\n```\n\nBuild the custom model:\n\n```bash\nollama create qwen3:32b-fp16-tc -f Modelfile_qwen3_32b_fp16_tc\n```\n\n##### Example: QwQ 32B FP16 with Extended Context\n\nCreate a Modelfile named `Modelfile_qwq_32b_fp16_tc`:\n\n```dockerfile\nFROM qwq:32b-fp16\nPARAMETER num_ctx 110000\nPARAMETER temperature 0.2\nPARAMETER top_p 0.7\nPARAMETER min_p 0.0\nPARAMETER top_k 40\nPARAMETER repeat_penalty 1.2\n```\n\nBuild the custom model:\n\n```bash\nollama create qwq:32b-fp16-tc -f Modelfile_qwq_32b_fp16_tc\n```\n\n\u003e **Note**: The QwQ 32B FP16 model requires approximately **71.3 GB VRAM** for inference. 
Ensure your system has sufficient GPU memory before attempting to use this model.\n\nThese custom models are referenced in the pre-built provider configuration files (`ollama-qwen332b-fp16-tc.provider.yml` and `ollama-qwq32b-fp16-tc.provider.yml`) that are included in the Docker image at `/opt/pentagi/conf/`.\n\n### OpenAI Provider Configuration\n\nPentAGI supports OpenAI's advanced language models, including the latest reasoning-capable o-series models designed for complex analytical tasks:\n\n| Variable | Default | Description |\n|----------|---------|-------------|\n| `OPEN_AI_KEY` | | API key for OpenAI services |\n| `OPEN_AI_SERVER_URL` | `https://api.openai.com/v1` | OpenAI API endpoint |\n\nConfiguration examples:\n\n```bash\n# Basic OpenAI setup\nOPEN_AI_KEY=your_openai_api_key\nOPEN_AI_SERVER_URL=https://api.openai.com/v1\n\n# Using with proxy for enhanced security\nOPEN_AI_KEY=your_openai_api_key\nPROXY_URL=http://your-proxy:8080\n```\n\nThe OpenAI provider offers cutting-edge capabilities including:\n\n- **Reasoning Models**: Advanced o-series models (o1, o3, o4-mini) with step-by-step analytical thinking\n- **Latest GPT-4.1 Series**: Flagship models optimized for complex security research and exploit development\n- **Cost-Effective Options**: From nano models for high-volume scanning to powerful reasoning models for deep analysis\n- **Versatile Performance**: Fast, intelligent models perfect for multi-step security analysis and penetration testing\n- **Proven Reliability**: Industry-leading models with consistent performance across diverse security scenarios\n\nThe system automatically selects appropriate OpenAI models based on task complexity, optimizing for both performance and cost-effectiveness.\n\n### Anthropic Provider Configuration\n\nPentAGI integrates with Anthropic's Claude models, known for their exceptional safety, reasoning capabilities, and sophisticated understanding of complex security contexts:\n\n| Variable | Default | Description 
|\n|----------|---------|-------------|\n| `ANTHROPIC_API_KEY` | | API key for Anthropic services |\n| `ANTHROPIC_SERVER_URL` | `https://api.anthropic.com/v1` | Anthropic API endpoint |\n\nConfiguration examples:\n\n```bash\n# Basic Anthropic setup\nANTHROPIC_API_KEY=your_anthropic_api_key\nANTHROPIC_SERVER_URL=https://api.anthropic.com/v1\n\n# Using with proxy for secure environments\nANTHROPIC_API_KEY=your_anthropic_api_key\nPROXY_URL=http://your-proxy:8080\n```\n\nThe Anthropic provider delivers superior capabilities including:\n\n- **Advanced Reasoning**: Claude 4 series with exceptional reasoning for sophisticated penetration testing\n- **Extended Thinking**: Claude 3.7 with step-by-step thinking capabilities for methodical security research\n- **High-Speed Performance**: Claude 3.5 Haiku for blazing-fast vulnerability scans and real-time monitoring\n- **Comprehensive Analysis**: Claude Sonnet models for complex security analysis and threat hunting\n- **Safety-First Design**: Built-in safety mechanisms ensuring responsible security testing practices\n\nThe system leverages Claude's advanced understanding of security contexts to provide thorough and responsible penetration testing guidance.\n\n### Google AI (Gemini) Provider Configuration\n\nPentAGI supports Google's Gemini models through the Google AI API, offering state-of-the-art reasoning capabilities and multimodal features:\n\n| Variable | Default | Description |\n|----------|---------|-------------|\n| `GEMINI_API_KEY` | | API key for Google AI services |\n| `GEMINI_SERVER_URL` | `https://generativelanguage.googleapis.com` | Google AI API endpoint |\n\nConfiguration examples:\n\n```bash\n# Basic Gemini setup\nGEMINI_API_KEY=your_gemini_api_key\nGEMINI_SERVER_URL=https://generativelanguage.googleapis.com\n\n# Using with proxy\nGEMINI_API_KEY=your_gemini_api_key\nPROXY_URL=http://your-proxy:8080\n```\n\nThe Gemini provider offers advanced features including:\n\n- **Thinking Capabilities**: Advanced 
reasoning models (Gemini 2.5 series) with step-by-step analysis\n- **Multimodal Support**: Text and image processing for comprehensive security assessments\n- **Large Context Windows**: Up to 2M tokens for analyzing extensive codebases and documentation\n- **Cost-Effective Options**: From high-performance pro models to economical flash variants\n- **Security-Focused Models**: Specialized configurations optimized for penetration testing workflows\n\nThe system automatically selects appropriate Gemini models based on agent requirements, balancing performance, capabilities, and cost-effectiveness.\n\n### AWS Bedrock Provider Configuration\n\nPentAGI integrates with Amazon Bedrock, offering access to a wide range of foundation models from leading AI companies including Anthropic, AI21, Cohere, Meta, and Amazon's own models:\n\n| Variable | Default | Description |\n|----------|---------|-------------|\n| `BEDROCK_REGION` | `us-east-1` | AWS region for Bedrock service |\n| `BEDROCK_ACCESS_KEY_ID` | | AWS access key ID for authentication |\n| `BEDROCK_SECRET_ACCESS_KEY` | | AWS secret access key for authentication |\n| `BEDROCK_SESSION_TOKEN` | | AWS session token as an alternative authentication method |\n| `BEDROCK_SERVER_URL` | | Optional custom Bedrock endpoint URL |\n\nConfiguration examples:\n\n```bash\n# Basic AWS Bedrock setup with credentials\nBEDROCK_REGION=us-east-1\nBEDROCK_ACCESS_KEY_ID=your_aws_access_key\nBEDROCK_SECRET_ACCESS_KEY=your_aws_secret_key\n\n# Using with proxy for enhanced security\nBEDROCK_REGION=us-east-1\nBEDROCK_ACCESS_KEY_ID=your_aws_access_key\nBEDROCK_SECRET_ACCESS_KEY=your_aws_secret_key\nPROXY_URL=http://your-proxy:8080\n\n# Using custom endpoint (for VPC endpoints or testing)\nBEDROCK_REGION=us-east-1\nBEDROCK_ACCESS_KEY_ID=your_aws_access_key\nBEDROCK_SECRET_ACCESS_KEY=your_aws_secret_key\nBEDROCK_SERVER_URL=https://bedrock-runtime.us-east-1.amazonaws.com\n```\n\n\u003e [!IMPORTANT]\n\u003e **AWS Bedrock Rate Limits 
Warning**\n\u003e\n\u003e The default PentAGI configuration for AWS Bedrock uses two primary models:\n\u003e - `us.anthropic.claude-sonnet-4-20250514-v1:0` (for most agents) - **2 requests per minute** for new AWS accounts\n\u003e - `us.anthropic.claude-3-5-haiku-20241022-v1:0` (for simple tasks) - **20 requests per minute** for new AWS accounts\n\u003e\n\u003e These default rate limits are **extremely restrictive** for comfortable penetration testing scenarios and will significantly impact your workflow. We **strongly recommend**:\n\u003e\n\u003e 1. **Request quota increases** for your AWS Bedrock models through the AWS Service Quotas console\n\u003e 2. **Use provisioned throughput models** with hourly billing for higher throughput requirements\n\u003e 3. **Switch to alternative models** with higher default quotas (e.g., Amazon Nova series, Meta Llama models)\n\u003e 4. **Consider using a different LLM provider** (OpenAI, Anthropic, Gemini) if you need immediate high-throughput access\n\u003e\n\u003e Without adequate rate limits, you may experience frequent delays, timeouts, and degraded testing performance.\n\nThe AWS Bedrock provider delivers comprehensive capabilities including:\n\n- **Multi-Provider Access**: Access to models from Anthropic (Claude), AI21 (Jamba), Cohere (Command), Meta (Llama), Amazon (Nova, Titan), and DeepSeek (R1) through a single interface\n- **Advanced Reasoning**: Support for Claude 4 and other reasoning-capable models with step-by-step thinking\n- **Multimodal Models**: Amazon Nova series supporting text, image, and video processing for comprehensive security analysis\n- **Enterprise Security**: AWS-native security controls, VPC integration, and compliance certifications\n- **Cost Optimization**: Wide range of model sizes and capabilities for cost-effective penetration testing\n- **Regional Availability**: Deploy models in your preferred AWS region for data residency and performance\n- **High Performance**: Low-latency inference 
through AWS's global infrastructure\n\nThe system automatically selects appropriate Bedrock models based on task complexity and requirements, leveraging the full spectrum of available foundation models for optimal security testing results.\n\n\u003e [!WARNING]\n\u003e **Converse API Requirements**\n\u003e\n\u003e PentAGI uses the **Amazon Bedrock Converse API** for model interactions, which requires models to support the following features:\n\u003e - ✅ **Converse** - Basic conversation API support\n\u003e - ✅ **ConverseStream** - Streaming response support\n\u003e - ✅ **Tool use** - Function calling capabilities for penetration testing tools\n\u003e - ✅ **Streaming tool use** - Real-time tool execution feedback\n\u003e\n\u003e **Before selecting models**, verify their feature support at: [Supported models and model features](https://docs.aws.amazon.com/bedrock/latest/userguide/conversation-inference-supported-models-features.html)\n\u003e\n\u003e ⚠️ **Important**: Some models like AI21 Jurassic-2 and Cohere Command (Text) have **limited chat support** and may not work properly with PentAGI's multi-turn conversation workflows.\n\n\u003e **Note**: AWS credentials can also be provided through IAM roles, environment variables, or AWS credential files following standard AWS SDK authentication patterns. Ensure your AWS account has appropriate permissions for Amazon Bedrock service access.\n\nFor advanced configuration options and detailed setup instructions, please visit our [documentation](https://docs.pentagi.com).\n\n## 🔧 Advanced Setup\n\n### Langfuse Integration\n\nLangfuse provides advanced capabilities for monitoring and analyzing AI agent operations.\n\n1. 
Configure the Langfuse environment variables in the existing `.env` file.\n\n\u003cdetails\u003e\n    \u003csummary\u003eValuable Langfuse environment variables\u003c/summary\u003e\n\n### Database Credentials\n- `LANGFUSE_POSTGRES_USER` and `LANGFUSE_POSTGRES_PASSWORD` - Langfuse PostgreSQL credentials\n- `LANGFUSE_CLICKHOUSE_USER` and `LANGFUSE_CLICKHOUSE_PASSWORD` - ClickHouse credentials\n- `LANGFUSE_REDIS_AUTH` - Redis password\n\n### Encryption and Security Keys\n- `LANGFUSE_SALT` - Salt for hashing in the Langfuse Web UI\n- `LANGFUSE_ENCRYPTION_KEY` - Encryption key (32 bytes in hex)\n- `LANGFUSE_NEXTAUTH_SECRET` - Secret key for NextAuth\n\n### Admin Credentials\n- `LANGFUSE_INIT_USER_EMAIL` - Admin email\n- `LANGFUSE_INIT_USER_PASSWORD` - Admin password\n- `LANGFUSE_INIT_USER_NAME` - Admin username\n\n### API Keys and Tokens\n- `LANGFUSE_INIT_PROJECT_PUBLIC_KEY` - Project public key (also used by PentAGI)\n- `LANGFUSE_INIT_PROJECT_SECRET_KEY` - Project secret key (also used by PentAGI)\n\n### S3 Storage\n- `LANGFUSE_S3_ACCESS_KEY_ID` - S3 access key ID\n- `LANGFUSE_S3_SECRET_ACCESS_KEY` - S3 secret access key\n\n\u003c/details\u003e\n\n2. Enable the Langfuse integration for the PentAGI service in the `.env` file.\n\n```bash\nLANGFUSE_BASE_URL=http://langfuse-web:3000\nLANGFUSE_PROJECT_ID= # default: value from ${LANGFUSE_INIT_PROJECT_ID}\nLANGFUSE_PUBLIC_KEY= # default: value from ${LANGFUSE_INIT_PROJECT_PUBLIC_KEY}\nLANGFUSE_SECRET_KEY= # default: value from ${LANGFUSE_INIT_PROJECT_SECRET_KEY}\n```\n\n3. 
Run the Langfuse stack:\n\n```bash\ncurl -O https://raw.githubusercontent.com/vxcontrol/pentagi/master/docker-compose-langfuse.yml\ndocker compose -f docker-compose.yml -f docker-compose-langfuse.yml up -d\n```\n\nVisit [localhost:4000](http://localhost:4000) to access the Langfuse Web UI with the credentials from the `.env` file:\n\n- `LANGFUSE_INIT_USER_EMAIL` - Admin email\n- `LANGFUSE_INIT_USER_PASSWORD` - Admin password\n\n### Monitoring and Observability\n\nFor detailed tracking of system operation, integration with monitoring tools is available.\n\n1. Enable the integration with OpenTelemetry and all observability services for PentAGI in the `.env` file.\n\n```bash\nOTEL_HOST=otelcol:8148\n```\n\n2. Run the observability stack:\n\n```bash\ncurl -O https://raw.githubusercontent.com/vxcontrol/pentagi/master/docker-compose-observability.yml\ndocker compose -f docker-compose.yml -f docker-compose-observability.yml up -d\n```\n\nVisit [localhost:3000](http://localhost:3000) to access the Grafana Web UI.\n\n\u003e [!NOTE]\n\u003e To use the Observability stack together with Langfuse, set `LANGFUSE_OTEL_EXPORTER_OTLP_ENDPOINT` to `http://otelcol:4318` in the `.env` file.\n\u003e\n\u003e To run all available stacks together (Langfuse, Graphiti, and Observability):\n\u003e\n\u003e ```bash\n\u003e docker compose -f docker-compose.yml -f docker-compose-langfuse.yml -f docker-compose-graphiti.yml -f docker-compose-observability.yml up -d\n\u003e ```\n\u003e\n\u003e You can also register aliases for these commands in your shell to run them faster:\n\u003e\n\u003e ```bash\n\u003e alias pentagi=\"docker compose -f docker-compose.yml -f docker-compose-langfuse.yml -f docker-compose-graphiti.yml -f docker-compose-observability.yml\"\n\u003e alias pentagi-up=\"docker compose -f docker-compose.yml -f docker-compose-langfuse.yml -f docker-compose-graphiti.yml -f docker-compose-observability.yml up -d\"\n\u003e alias pentagi-down=\"docker compose -f docker-compose.yml -f 
docker-compose-langfuse.yml -f docker-compose-graphiti.yml -f docker-compose-observability.yml down\"\n\u003e ```\n\n### Knowledge Graph Integration (Graphiti)\n\nPentAGI integrates with [Graphiti](https://github.com/vxcontrol/pentagi-graphiti), a temporal knowledge graph system powered by Neo4j, to provide advanced semantic understanding and relationship tracking for AI agent operations. The vxcontrol fork provides custom entity and edge types that are specific to pentesting purposes.\n\n#### What is Graphiti?\n\nGraphiti automatically extracts and stores structured knowledge from agent interactions, building a graph of entities, relationships, and temporal context. This enables:\n\n- **Semantic Memory**: Store and recall relationships between tools, targets, vulnerabilities, and techniques\n- **Contextual Understanding**: Track how different pentesting actions relate to each other over time\n- **Knowledge Reuse**: Learn from past penetration tests and apply insights to new assessments\n- **Advanced Querying**: Search for complex patterns like \"What tools were effective against similar targets?\"\n\n#### Enabling Graphiti\n\nThe Graphiti knowledge graph is **optional** and disabled by default. To enable it:\n\n1. Configure Graphiti environment variables in `.env` file:\n\n```bash\n## Graphiti knowledge graph settings\nGRAPHITI_ENABLED=true\nGRAPHITI_TIMEOUT=30\nGRAPHITI_URL=http://graphiti:8000\nGRAPHITI_MODEL_NAME=gpt-5-mini\n\n# Neo4j settings (used by Graphiti stack)\nNEO4J_USER=neo4j\nNEO4J_DATABASE=neo4j\nNEO4J_PASSWORD=devpassword\nNEO4J_URI=bolt://neo4j:7687\n\n# OpenAI API key (required by Graphiti for entity extraction)\nOPEN_AI_KEY=your_openai_api_key\n```\n\n2. 
Run the Graphiti stack along with the main PentAGI services:\n\n```bash\n# Download the Graphiti compose file if needed\ncurl -O https://raw.githubusercontent.com/vxcontrol/pentagi/master/docker-compose-graphiti.yml\n\n# Start PentAGI with Graphiti\ndocker compose -f docker-compose.yml -f docker-compose-graphiti.yml up -d\n```\n\n3. Verify Graphiti is running:\n\n```bash\n# Check service health\ndocker compose -f docker-compose.yml -f docker-compose-graphiti.yml ps graphiti neo4j\n\n# View Graphiti logs\ndocker compose -f docker-compose.yml -f docker-compose-graphiti.yml logs -f graphiti\n\n# Access Neo4j Browser (optional)\n# Visit http://localhost:7474 and login with NEO4J_USER/NEO4J_PASSWORD\n\n# Access Graphiti API (optional, for debugging)\n# Visit http://localhost:8000/docs for Swagger API documentation\n```\n\n\u003e [!NOTE]\n\u003e The Graphiti service is defined in `docker-compose-graphiti.yml` as a separate stack. You must run both compose files together to enable the knowledge graph functionality. The pre-built Docker image `vxcontrol/graphiti:latest` is used by default.\n\n#### What Gets Stored\n\nWhen enabled, PentAGI automatically captures:\n\n- **Agent Responses**: All agent reasoning, analysis, and decisions\n- **Tool Executions**: Commands executed, tools used, and their results\n- **Context Information**: Flow, task, and subtask hierarchy\n\n### GitHub and Google OAuth Integration\n\nOAuth integration with GitHub and Google allows users to authenticate using their existing accounts on these platforms. 
This provides several benefits:\n\n- Simplified login process without the need to create separate credentials\n- Enhanced security through trusted identity providers\n- Access to user profile information from GitHub/Google accounts\n- Seamless integration with existing development workflows\n\nTo use GitHub OAuth, create a new OAuth application in your GitHub account and set `GITHUB_CLIENT_ID` and `GITHUB_CLIENT_SECRET` in the `.env` file.\n\nTo use Google OAuth, create a new OAuth application in your Google account and set `GOOGLE_CLIENT_ID` and `GOOGLE_CLIENT_SECRET` in the `.env` file.\n\n### Docker Image Configuration\n\nPentAGI allows you to configure Docker image selection for executing various tasks. The system automatically chooses the most appropriate image based on the task type, but you can constrain this selection by specifying your preferred images:\n\n| Variable | Default | Description |\n|----------|---------|-------------|\n| `DOCKER_DEFAULT_IMAGE` | `debian:latest` | Default Docker image for general tasks and ambiguous cases |\n| `DOCKER_DEFAULT_IMAGE_FOR_PENTEST` | `vxcontrol/kali-linux` | Default Docker image for security/penetration testing tasks |\n\nWhen these environment variables are set, AI agents will be limited to the image choices you specify. 
This is particularly useful for:\n\n- **Security Enforcement**: Restricting usage to only verified and trusted images\n- **Environment Standardization**: Using corporate or customized images across all operations\n- **Performance Optimization**: Utilizing pre-built images with necessary tools already installed\n\nConfiguration examples:\n\n```bash\n# Using a custom image for general tasks\nDOCKER_DEFAULT_IMAGE=mycompany/custom-debian:latest\n\n# Using a specialized image for penetration testing\nDOCKER_DEFAULT_IMAGE_FOR_PENTEST=mycompany/pentest-tools:v2.0\n```\n\n\u003e [!NOTE]\n\u003e If a user explicitly specifies a particular Docker image in their task, the system will try to use that exact image, ignoring these settings. These variables only affect the system's automatic image selection process.\n\n## 💻 Development\n\n### Development Requirements\n\n- golang\n- nodejs\n- docker\n- postgres\n- commitlint\n\n### Environment Setup\n\n#### Backend Setup\n\nRun `cd backend \u0026\u0026 go mod download` once to install the needed packages.\n\nTo generate swagger files, first install the `swag` package:\n\n```bash\ngo install github.com/swaggo/swag/cmd/swag@v1.8.7\n```\n\nthen run:\n\n```bash\nswag init -g ../../pkg/server/router.go -o pkg/server/docs/ --parseDependency --parseInternal --parseDepth 2 -d cmd/pentagi\n```\n\nTo generate GraphQL resolver files, run:\n\n```bash\ngo run github.com/99designs/gqlgen --config ./gqlgen/gqlgen.yml\n```\n\nThe generated files will appear in the `pkg/graph` folder.\n\nTo generate ORM methods (the database package) from the sqlc configuration:\n\n```bash\ndocker run --rm -v $(pwd):/src -w /src --network pentagi-network -e DATABASE_URL=\"{URL}\" sqlc/sqlc generate -f sqlc/sqlc.yml\n```\n\nTo generate the Langfuse SDK from the OpenAPI specification, install the fern CLI first:\n\n```bash\nnpm install -g fern-api\n```\n\nthen run:\n\n```bash\nfern generate --local\n```\n\n#### Testing\n\nTo run tests: `cd backend \u0026\u0026 go test 
-v ./...`\n\n#### Frontend Setup\n\nRun `cd frontend \u0026\u0026 npm install` once to install the needed packages.\n\nTo generate GraphQL files, run `npm run graphql:generate`, which uses the `graphql-codegen.ts` file.\n\nMake sure you have `graphql-codegen` installed globally:\n\n```bash\nnpm install -g graphql-codegen\n```\n\nAfter that you can run:\n* `npm run prettier` to check if your code is formatted correctly\n* `npm run prettier:fix` to fix it\n* `npm run lint` to check if your code is linted correctly\n* `npm run lint:fix` to fix it\n\nTo generate SSL certificates, run `npm run ssl:generate`, which uses the `generate-ssl.ts` file; they will also be generated automatically when you run `npm run dev`.\n\n#### Backend Configuration\n\nEdit the configuration for `backend` in the `.vscode/launch.json` file:\n- `DATABASE_URL` - PostgreSQL database URL (e.g. `postgres://postgres:postgres@localhost:5432/pentagidb?sslmode=disable`)\n- `DOCKER_HOST` - Docker SDK API (e.g. for macOS `DOCKER_HOST=unix:///Users/\u003cmy-user\u003e/Library/Containers/com.docker.docker/Data/docker.raw.sock`) [more info](https://stackoverflow.com/a/62757128/5922857)\n\nOptional:\n- `SERVER_PORT` - Port to run the server (default: `8443`)\n- `SERVER_USE_SSL` - Enable SSL for the server (default: `false`)\n\n#### Frontend Configuration\n\nEdit the configuration for `frontend` in the `.vscode/launch.json` file:\n- `VITE_API_URL` - Backend API URL. 
*Omit* the URL scheme (e.g., `localhost:8080` *NOT* `http://localhost:8080`)\n- `VITE_USE_HTTPS` - Enable SSL for the server (default: `false`)\n- `VITE_PORT` - Port to run the server (default: `8000`)\n- `VITE_HOST` - Host to run the server (default: `0.0.0.0`)\n\n### Running the Application\n\n#### Backend\n\nRun the following in the `backend` folder:\n- Use the `.env` file to set environment variables, e.g. `source .env`\n- Run `go run cmd/pentagi/main.go` to start the server\n\n\u003e [!NOTE]\n\u003e The first run can take a while as dependencies and docker images need to be downloaded to set up the backend environment.\n\n#### Frontend\n\nRun the following in the `frontend` folder:\n- Run `npm install` to install the dependencies\n- Run `npm run dev` to run the web app\n- Run `npm run build` to build the web app\n\nOpen your browser and visit the web app URL.\n\n## 🧪 Testing LLM Agents\n\nPentAGI includes a powerful utility called `ctester` for testing and validating LLM agent capabilities. This tool helps ensure your LLM provider configurations work correctly with different agent types, allowing you to optimize model selection for each specific agent role.\n\nThe utility features parallel testing of multiple agents, detailed reporting, and flexible configuration options.\n\n### Key Features\n\n- **Parallel Testing**: Tests multiple agents simultaneously for faster results\n- **Comprehensive Test Suite**: Evaluates basic completion, JSON responses, function calling, and penetration testing knowledge\n- **Detailed Reporting**: Generates markdown reports with success rates and performance metrics\n- **Flexible Configuration**: Test specific agents or test groups as needed\n- **Specialized Test Groups**: Includes domain-specific tests for cybersecurity and penetration testing scenarios\n\n### Usage Scenarios\n\n#### For Developers (with local Go environment)\n\nIf you've cloned the repository and have Go installed:\n\n```bash\n# Default configuration with .env file\ncd 
backend\ngo run cmd/ctester/*.go -verbose\n\n# Custom provider configuration\ngo run cmd/ctester/*.go -config ../examples/configs/openrouter.provider.yml -verbose\n\n# Generate a report file\ngo run cmd/ctester/*.go -config ../examples/configs/deepinfra.provider.yml -report ../test-report.md\n\n# Test specific agent types only\ngo run cmd/ctester/*.go -agents simple,simple_json,primary_agent -verbose\n\n# Test specific test groups only\ngo run cmd/ctester/*.go -groups basic,advanced -verbose\n```\n\n#### For Users (using Docker image)\n\nIf you prefer to use the pre-built Docker image without setting up a development environment:\n\n```bash\n# Using Docker to test with default environment\ndocker run --rm -v $(pwd)/.env:/opt/pentagi/.env vxcontrol/pentagi /opt/pentagi/bin/ctester -verbose\n\n# Test with your custom provider configuration\ndocker run --rm \\\n  -v $(pwd)/.env:/opt/pentagi/.env \\\n  -v $(pwd)/my-config.yml:/opt/pentagi/config.yml \\\n  vxcontrol/pentagi /opt/pentagi/bin/ctester -config /opt/pentagi/config.yml -agents simple,primary_agent,coder -verbose\n\n# Generate a detailed report\ndocker run --rm \\\n  -v $(pwd)/.env:/opt/pentagi/.env \\\n  -v $(pwd):/opt/pentagi/output \\\n  vxcontrol/pentagi /opt/pentagi/bin/ctester -report /opt/pentagi/output/report.md\n```\n\n#### Using Pre-configured Providers\n\nThe Docker image comes with built-in support for major providers (OpenAI, Anthropic, Gemini, Ollama) and pre-configured provider files for additional services (OpenRouter, DeepInfra, DeepSeek, Moonshot):\n\n```bash\n# Test with OpenRouter configuration\ndocker run --rm \\\n  -v $(pwd)/.env:/opt/pentagi/.env \\\n  vxcontrol/pentagi /opt/pentagi/bin/ctester -config /opt/pentagi/conf/openrouter.provider.yml\n\n# Test with DeepInfra configuration\ndocker run --rm \\\n  -v $(pwd)/.env:/opt/pentagi/.env \\\n  vxcontrol/pentagi /opt/pentagi/bin/ctester -config /opt/pentagi/conf/deepinfra.provider.yml\n\n# Test with DeepSeek configuration\ndocker run --rm 
\\\n  -v $(pwd)/.env:/opt/pentagi/.env \\\n  vxcontrol/pentagi /opt/pentagi/bin/ctester -config /opt/pentagi/conf/deepseek.provider.yml\n\n# Test with Moonshot configuration\ndocker run --rm \\\n  -v $(pwd)/.env:/opt/pentagi/.env \\\n  vxcontrol/pentagi /opt/pentagi/bin/ctester -config /opt/pentagi/conf/moonshot.provider.yml\n\n# Test with OpenAI configuration\ndocker run --rm \\\n  -v $(pwd)/.env:/opt/pentagi/.env \\\n  vxcontrol/pentagi /opt/pentagi/bin/ctester -type openai\n\n# Test with Anthropic configuration\ndocker run --rm \\\n  -v $(pwd)/.env:/opt/pentagi/.env \\\n  vxcontrol/pentagi /opt/pentagi/bin/ctester -type anthropic\n\n# Test with Gemini configuration\ndocker run --rm \\\n  -v $(pwd)/.env:/opt/pentagi/.env \\\n  vxcontrol/pentagi /opt/pentagi/bin/ctester -type gemini\n\n# Test with AWS Bedrock configuration\ndocker run --rm \\\n  -v $(pwd)/.env:/opt/pentagi/.env \\\n  vxcontrol/pentagi /opt/pentagi/bin/ctester -type bedrock\n\n# Test with Custom OpenAI configuration\ndocker run --rm \\\n  -v $(pwd)/.env:/opt/pentagi/.env \\\n  vxcontrol/pentagi /opt/pentagi/bin/ctester -config /opt/pentagi/conf/custom-openai.provider.yml\n\n# Test with Ollama configuration (local inference)\ndocker run --rm \\\n  -v $(pwd)/.env:/opt/pentagi/.env \\\n  vxcontrol/pentagi /opt/pentagi/bin/ctester -config /opt/pentagi/conf/ollama-llama318b.provider.yml\n\n# Test with Ollama Qwen3 32B configuration (requires custom model creation)\ndocker run --rm \\\n  -v $(pwd)/.env:/opt/pentagi/.env \\\n  vxcontrol/pentagi /opt/pentagi/bin/ctester -config /opt/pentagi/conf/ollama-qwen332b-fp16-tc.provider.yml\n\n# Test with Ollama QwQ 32B configuration (requires custom model creation and 71.3GB VRAM)\ndocker run --rm \\\n  -v $(pwd)/.env:/opt/pentagi/.env \\\n  vxcontrol/pentagi /opt/pentagi/bin/ctester -config /opt/pentagi/conf/ollama-qwq32b-fp16-tc.provider.yml\n```\n\nTo use these configurations, your `.env` file only needs to 
contain:\n\n```\nLLM_SERVER_URL=https://openrouter.ai/api/v1      # or https://api.deepinfra.com/v1/openai or https://api.deepseek.com or https://api.openai.com/v1 or https://api.moonshot.ai/v1\nLLM_SERVER_KEY=your_api_key\nLLM_SERVER_MODEL=                                # Leave empty, as models are specified in the config\nLLM_SERVER_CONFIG_PATH=/opt/pentagi/conf/openrouter.provider.yml  # or deepinfra.provider.yml or deepseek.provider.yml or custom-openai.provider.yml or moonshot.provider.yml\nLLM_SERVER_LEGACY_REASONING=false                # Controls reasoning format, for OpenAI must be true (default: false)\n\n# For OpenAI (official API)\nOPEN_AI_KEY=your_openai_api_key                  # Your OpenAI API key\nOPEN_AI_SERVER_URL=https://api.openai.com/v1     # OpenAI API endpoint\n\n# For Anthropic (Claude models)\nANTHROPIC_API_KEY=your_anthropic_api_key         # Your Anthropic API key\nANTHROPIC_SERVER_URL=https://api.anthropic.com/v1  # Anthropic API endpoint\n\n# For Gemini (Google AI)\nGEMINI_API_KEY=your_gemini_api_key               # Your Google AI API key\nGEMINI_SERVER_URL=https://generativelanguage.googleapis.com  # Google AI API endpoint\n\n# For AWS Bedrock (enterprise foundation models)\nBEDROCK_REGION=us-east-1                         # AWS region for Bedrock service\nBEDROCK_ACCESS_KEY_ID=your_aws_access_key        # AWS access key ID\nBEDROCK_SECRET_ACCESS_KEY=your_aws_secret_key    # AWS secret access key\nBEDROCK_SESSION_TOKEN=your_aws_session_token     # AWS session token (alternative auth method)\nBEDROCK_SERVER_URL=                              # Optional custom Bedrock endpoint\n\n# For Ollama (local inference)\nOLLAMA_SERVER_URL=http://localhost:11434         # Your Ollama server URL (only for test run)\nOLLAMA_SERVER_CONFIG_PATH=/opt/pentagi/conf/ollama-llama318b.provider.yml  # or ollama-qwen332b-fp16-tc.provider.yml or ollama-qwq32b-fp16-tc.provider.yml\n```\n\n#### Using OpenAI with Unverified Organizations\n\nFor OpenAI accounts 
with unverified organizations that don't have access to the latest reasoning models (o1, o3, o4-mini), you need to use a custom configuration.\n\nTo use OpenAI with unverified organization accounts, configure your `.env` file as follows:\n\n```bash\nLLM_SERVER_URL=https://api.openai.com/v1\nLLM_SERVER_KEY=your_openai_api_key\nLLM_SERVER_MODEL=                                # Leave empty, models are specified in config\nLLM_SERVER_CONFIG_PATH=/opt/pentagi/conf/custom-openai.provider.yml\nLLM_SERVER_LEGACY_REASONING=true                 # Required for OpenAI reasoning format\n```\n\nThis configuration uses the pre-built `custom-openai.provider.yml` file that maps all agent types to models available for unverified organizations, using `o3-mini` instead of models like `o1`, `o3`, and `o4-mini`.\n\nYou can test this configuration using:\n\n```bash\n# Test with custom OpenAI configuration for unverified accounts\ndocker run --rm \\\n  -v $(pwd)/.env:/opt/pentagi/.env \\\n  vxcontrol/pentagi /opt/pentagi/bin/ctester -config /opt/pentagi/conf/custom-openai.provider.yml\n```\n\n\u003e [!NOTE]\n\u003e The `LLM_SERVER_LEGACY_REASONING=true` setting is crucial for OpenAI compatibility as it ensures reasoning parameters are sent in the format expected by OpenAI's API.\n\n#### Running Tests in a Production Environment\n\nIf you already have a running PentAGI container and want to test the current configuration:\n\n```bash\n# Run ctester in an existing container using current environment variables\ndocker exec -it pentagi /opt/pentagi/bin/ctester -verbose\n\n# Test specific agent types with deterministic ordering\ndocker exec -it pentagi /opt/pentagi/bin/ctester -agents simple,primary_agent,pentester -groups basic,knowledge -verbose\n\n# Generate a report file inside the container\ndocker exec -it pentagi /opt/pentagi/bin/ctester -report /opt/pentagi/data/agent-test-report.md\n\n# Access the report from the host\ndocker cp pentagi:/opt/pentagi/data/agent-test-report.md 
./\n```\n\n### Command-line Options\n\nThe utility accepts several options:\n\n- `-env \u003cpath\u003e` - Path to environment file (default: `.env`)\n- `-type \u003cprovider\u003e` - Provider type: `custom`, `openai`, `anthropic`, `ollama`, `bedrock`, `gemini` (default: `custom`)\n- `-config \u003cpath\u003e` - Path to custom provider config (default: from `LLM_SERVER_CONFIG_PATH` env variable)\n- `-tests \u003cpath\u003e` - Path to custom tests YAML file (optional)\n- `-report \u003cpath\u003e` - Path to write the report file (optional)\n- `-agents \u003clist\u003e` - Comma-separated list of agent types to test (default: `all`)\n- `-groups \u003clist\u003e` - Comma-separated list of test groups to run (default: `all`)\n- `-verbose` - Enable verbose output with detailed test results for each agent\n\n### Available Agent Types\n\nAgents are tested in the following deterministic order:\n\n1. **simple** - Basic completion tasks\n2. **simple_json** - JSON-structured responses\n3. **primary_agent** - Main reasoning agent\n4. **assistant** - Interactive assistant mode\n5. **generator** - Content generation\n6. **refiner** - Content refinement and improvement\n7. **adviser** - Expert advice and consultation\n8. **reflector** - Self-reflection and analysis\n9. **searcher** - Information gathering and search\n10. **enricher** - Data enrichment and expansion\n11. **coder** - Code generation and analysis\n12. **installer** - Installation and setup tasks\n13. 
**pentester** - Penetration testing and security assessment\n\n### Available Test Groups\n\n- **basic** - Fundamental completion and prompt response tests\n- **advanced** - Complex reasoning and function calling tests\n- **json** - JSON format validation and structure tests (specifically designed for `simple_json` agent)\n- **knowledge** - Domain-specific cybersecurity and penetration testing knowledge tests\n\n\u003e **Note**: The `json` test group is specifically designed for the `simple_json` agent type, while all other agents are tested with `basic`, `advanced`, and `knowledge` groups. This specialization ensures optimal testing coverage for each agent's intended purpose.\n\n### Example Provider Configuration\n\nProvider configuration defines which models to use for different agent types:\n\n```yaml\nsimple:\n  model: \"provider/model-name\"\n  temperature: 0.7\n  top_p: 0.95\n  n: 1\n  max_tokens: 4000\n\nsimple_json:\n  model: \"provider/model-name\"\n  temperature: 0.7\n  top_p: 1.0\n  n: 1\n  max_tokens: 4000\n  json: true\n\n# ... other agent types ...\n```\n\n### Optimization Workflow\n\n1. **Create a baseline**: Run tests with default configuration to establish benchmark performance\n2. **Analyze agent-specific performance**: Review the deterministic agent ordering to identify underperforming agents\n3. **Test specialized configurations**: Experiment with different models for each agent type using provider-specific configs\n4. **Focus on domain knowledge**: Pay special attention to knowledge group tests for cybersecurity expertise\n5. **Validate function calling**: Ensure tool-based tests pass consistently for critical agent types\n6. **Compare results**: Look for the best success rate and performance across all test groups\n7. 
**Deploy optimal configuration**: Use in production with your optimized setup\n\nThis tool helps ensure your AI agents are using the most effective models for their specific tasks, improving reliability while optimizing costs.\n\n## 🧮 Embedding Configuration and Testing\n\nPentAGI uses vector embeddings for semantic search, knowledge storage, and memory management. The system supports multiple embedding providers that can be configured according to your needs and preferences.\n\n### Supported Embedding Providers\n\nPentAGI supports the following embedding providers:\n\n- **OpenAI** (default): Uses OpenAI's text embedding models\n- **Ollama**: Local embedding model through Ollama\n- **Mistral**: Mistral AI's embedding models\n- **Jina**: Jina AI's embedding service\n- **HuggingFace**: Models from HuggingFace\n- **GoogleAI**: Google's embedding models\n- **VoyageAI**: VoyageAI's embedding models\n\n\u003cdetails\u003e\n\u003csummary\u003e\u003cb\u003eEmbedding Provider Configuration\u003c/b\u003e (click to expand)\u003c/summary\u003e\n\n### Environment Variables\n\nTo configure the embedding provider, set the following environment variables in your `.env` file:\n\n```bash\n# Primary embedding configuration\nEMBEDDING_PROVIDER=openai       # Provider type (openai, ollama, mistral, jina, huggingface, googleai, voyageai)\nEMBEDDING_MODEL=text-embedding-3-small  # Model name to use\nEMBEDDING_URL=                  # Optional custom API endpoint\nEMBEDDING_KEY=                  # API key for the provider (if required)\nEMBEDDING_BATCH_SIZE=100        # Number of documents to process in a batch\nEMBEDDING_STRIP_NEW_LINES=true  # Whether to remove new lines from text before embedding\n\n# Advanced settings\nPROXY_URL=                      # Optional proxy for all API calls\n\n# SSL/TLS Certificate Configuration (for external communication with LLM backends and tool servers)\nEXTERNAL_SSL_CA_PATH=           # Path to custom CA certificate file (PEM format) inside the 
container\n                                # Must point to /opt/pentagi/ssl/ directory (e.g., /opt/pentagi/ssl/ca-bundle.pem)\nEXTERNAL_SSL_INSECURE=false     # Skip certificate verification (use only for testing)\n```\n\n\u003cdetails\u003e\n\u003csummary\u003e\u003cb\u003eHow to Add Custom CA Certificates\u003c/b\u003e (click to expand)\u003c/summary\u003e\n\nIf you see this error: `tls: failed to verify certificate: x509: certificate signed by unknown authority`\n\n**Step 1:** Get your CA certificate bundle in PEM format (it can contain multiple certificates)\n\n**Step 2:** Place the file in the SSL directory on your host machine:\n```bash\n# Default location (if PENTAGI_SSL_DIR is not set)\ncp ca-bundle.pem ./pentagi-ssl/\n\n# Or custom location (if using PENTAGI_SSL_DIR in docker-compose.yml)\ncp ca-bundle.pem /path/to/your/ssl/dir/\n```\n\n**Step 3:** Set the path in the `.env` file (the path must be inside the container):\n```bash\n# The volume pentagi-ssl is mounted to /opt/pentagi/ssl inside the container\nEXTERNAL_SSL_CA_PATH=/opt/pentagi/ssl/ca-bundle.pem\nEXTERNAL_SSL_INSECURE=false\n```\n\n**Step 4:** Restart PentAGI:\n```bash\ndocker compose restart pentagi\n```\n\n**Notes:**\n- The `pentagi-ssl` volume is mounted to `/opt/pentagi/ssl` inside the container\n- You can change the host directory using the `PENTAGI_SSL_DIR` variable in docker-compose.yml\n- The file supports multiple certificates and intermediate CAs in one PEM file\n- Use `EXTERNAL_SSL_INSECURE=true` only for testing (not recommended for production)\n\n\u003c/details\u003e\n\n### Provider-Specific Limitations\n\nEach provider has specific limitations and supported features:\n\n- **OpenAI**: Supports all configuration options\n- **Ollama**: Does not support `EMBEDDING_KEY`, as it uses local models\n- **Mistral**: Does not support `EMBEDDING_MODEL` or a custom HTTP client\n- **Jina**: Does not support a custom HTTP client\n- **HuggingFace**: Requires `EMBEDDING_KEY` and supports all other options\n- **GoogleAI**: 
Does not support `EMBEDDING_URL`, requires `EMBEDDING_KEY`\n- **VoyageAI**: Supports all configuration options\n\nIf `EMBEDDING_URL` and `EMBEDDING_KEY` are not specified, the system will attempt to use the corresponding LLM provider settings (e.g., `OPEN_AI_KEY` when `EMBEDDING_PROVIDER=openai`).\n\n### Why Consistent Embedding Providers Matter\n\nIt's crucial to use the same embedding provider consistently because:\n\n1. **Vector Compatibility**: Different providers produce vectors with different dimensions and mathematical properties\n2. **Semantic Consistency**: Changing providers can break semantic similarity between previously embedded documents\n3. **Memory Corruption**: Mixed embeddings can lead to poor search results and broken knowledge base functionality\n\nIf you change your embedding provider, you should flush and reindex your entire knowledge base (see `etester` utility below).\n\n\u003c/details\u003e\n\n### Embedding Tester Utility (etester)\n\nPentAGI includes a specialized `etester` utility for testing, managing, and debugging embedding functionality. 
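The vector-compatibility point above is easy to see concretely. Below is a small, self-contained Go sketch (illustrative only, not part of PentAGI's codebase): cosine similarity, the measure that underlies pgvector-style semantic search, is only defined for vectors of equal dimension, so documents embedded with one provider cannot be meaningfully compared with query vectors from another. The dimensions used in the example (1536 vs. 768) are merely typical values, not PentAGI specifics.

```go
package main

import (
	"fmt"
	"math"
)

// cosineSimilarity returns the cosine of the angle between two vectors.
// It errors out on dimension mismatch, which is exactly what happens
// conceptually when embeddings from different providers are mixed.
func cosineSimilarity(a, b []float64) (float64, error) {
	if len(a) != len(b) {
		return 0, fmt.Errorf("dimension mismatch: %d vs %d", len(a), len(b))
	}
	var dot, normA, normB float64
	for i := range a {
		dot += a[i] * b[i]
		normA += a[i] * a[i]
		normB += b[i] * b[i]
	}
	if normA == 0 || normB == 0 {
		return 0, fmt.Errorf("zero-length vector")
	}
	return dot / (math.Sqrt(normA) * math.Sqrt(normB)), nil
}

func main() {
	// Parallel vectors score 1.0; orthogonal vectors score 0.0.
	same, _ := cosineSimilarity([]float64{1, 2, 3}, []float64{2, 4, 6})
	orth, _ := cosineSimilarity([]float64{1, 0}, []float64{0, 1})
	fmt.Printf("same=%.2f orth=%.2f\n", same, orth)

	// Vectors from different embedding models are simply not comparable:
	_, err := cosineSimilarity(make([]float64, 1536), make([]float64, 768))
	fmt.Println("mixed providers:", err)
}
```

This is also why a provider change requires a full `flush` or `reindex`: even two models with the same output dimension place texts at unrelated positions in their respective vector spaces, so similarity scores across them are meaningless.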
This tool is essential for diagnosing and resolving issues related to vector embeddings and knowledge storage.\n\n\u003cdetails\u003e\n\u003csummary\u003e\u003cb\u003eEtester Commands\u003c/b\u003e (click to expand)\u003c/summary\u003e\n\n```bash\n# Test embedding provider and database connection\ncd backend\ngo run cmd/etester/main.go test -verbose\n\n# Show statistics about the embedding database\ngo run cmd/etester/main.go info\n\n# Delete all documents from the embedding database (use with caution!)\ngo run cmd/etester/main.go flush\n\n# Recalculate embeddings for all documents (after changing provider)\ngo run cmd/etester/main.go reindex\n\n# Search for documents in the embedding database\ngo run cmd/etester/main.go search -query \"How to install PostgreSQL\" -limit 5\n```\n\n### Using Docker\n\nIf you're running PentAGI in Docker, you can use etester from within the container:\n\n```bash\n# Test embedding provider\ndocker exec -it pentagi /opt/pentagi/bin/etester test\n\n# Show detailed database information\ndocker exec -it pentagi /opt/pentagi/bin/etester info -verbose\n```\n\n### Advanced Search Options\n\nThe `search` command supports various filters to narrow down results:\n\n```bash\n# Filter by document type\ndocker exec -it pentagi /opt/pentagi/bin/etester search -query \"Security vulnerability\" -doc_type guide -threshold 0.8\n\n# Filter by flow ID\ndocker exec -it pentagi /opt/pentagi/bin/etester search -query \"Code examples\" -doc_type code -flow_id 42\n\n# All available search options\ndocker exec -it pentagi /opt/pentagi/bin/etester search -help\n```\n\nAvailable search parameters:\n- `-query STRING`: Search query text (required)\n- `-doc_type STRING`: Filter by document type (answer, memory, guide, code)\n- `-flow_id NUMBER`: Filter by flow ID (positive number)\n- `-answer_type STRING`: Filter by answer type (guide, vulnerability, code, tool, other)\n- `-guide_type STRING`: Filter by guide type (install, configure, use, pentest, development, 
other)\n- `-limit NUMBER`: Maximum number of results (default: 3)\n- `-threshold NUMBER`: Similarity threshold (0.0-1.0, default: 0.7)\n\n### Common Troubleshooting Scenarios\n\n1. **After changing embedding provider**: Always run `flush` or `reindex` to ensure consistency\n2. **Poor search results**: Try adjusting the similarity threshold or check if embeddings are correctly generated\n3. **Database connection issues**: Verify PostgreSQL is running with pgvector extension installed\n4. **Missing API keys**: Check environment variables for your chosen embedding provider\n\n\u003c/details\u003e\n\n## 🔍 Function Testing with ftester\n\nPentAGI includes a versatile utility called `ftester` for debugging, testing, and developing specific functions and AI agent behaviors. While `ctester` focuses on testing LLM model capabilities, `ftester` allows you to directly invoke individual system functions and AI agent components with precise control over execution context.\n\n### Key Features\n\n- **Direct Function Access**: Test individual functions without running the entire system\n- **Mock Mode**: Test functions without a live PentAGI deployment using built-in mocks\n- **Interactive Input**: Fill function arguments interactively for exploratory testing\n- **Detailed Output**: Color-coded terminal output with formatted responses and errors\n- **Context-Aware Testing**: Debug AI agents within the context of specific flows, tasks, and subtasks\n- **Observability Integration**: All function calls are logged to Langfuse and Observability stack\n\n### Usage Modes\n\n#### Command Line Arguments\n\nRun ftester with specific function and arguments directly from the command line:\n\n```bash\n# Basic usage with mock mode\ncd backend\ngo run cmd/ftester/main.go [function_name] -[arg1] [value1] -[arg2] [value2]\n\n# Example: Test terminal command in mock mode\ngo run cmd/ftester/main.go terminal -command \"ls -la\" -message \"List files\"\n\n# Using a real flow context\ngo run 
cmd/ftester/main.go -flow 123 terminal -command \"whoami\" -message \"Check user\"\n\n# Testing AI agent in specific task/subtask context\ngo run cmd/ftester/main.go -flow 123 -task 456 -subtask 789 pentester -message \"Find vulnerabilities\"\n```\n\n#### Interactive Mode\n\nRun ftester without arguments for a guided interactive experience:\n\n```bash\n# Start interactive mode\ngo run cmd/ftester/main.go [function_name]\n\n# For example, to interactively fill browser tool arguments\ngo run cmd/ftester/main.go browser\n```\n\n\u003cdetails\u003e\n\u003csummary\u003e\u003cb\u003eAvailable Functions\u003c/b\u003e (click to expand)\u003c/summary\u003e\n\n### Environment Functions\n- **terminal**: Execute commands in a container and return the output\n- **file**: Perform file operations (read, write, list) in a container\n\n### Search Functions\n- **browser**: Access websites and capture screenshots\n- **google**: Search the web using Google Custom Search\n- **duckduckgo**: Search the web using DuckDuckGo\n- **tavily**: Search using Tavily AI search engine\n- **traversaal**: Search using Traversaal AI search engine\n- **perplexity**: Search using Perplexity AI\n- **searxng**: Search using Searxng meta search engine (aggregates results from multiple engines)\n\n### Vector Database Functions\n- **search_in_memory**: Search for information in vector database\n- **search_guide**: Find guidance documents in vector database\n- **search_answer**: Find answers to questions in vector database\n- **search_code**: Find code examples in vector database\n\n### AI Agent Functions\n- **advice**: Get expert advice from an AI agent\n- **coder**: Request code generation or modification\n- **maintenance**: Run system maintenance tasks\n- **memorist**: Store and organize information in vector database\n- **pentester**: Perform security tests and vulnerability analysis\n- **search**: Complex search across multiple sources\n\n### Utility Functions\n- **describe**: Show information about 
flows, tasks, and subtasks\n\n\u003c/details\u003e\n\n\u003cdetails\u003e\n\u003csummary\u003e\u003cb\u003eDebugging Flow Context\u003c/b\u003e (click to expand)\u003c/summary\u003e\n\nThe `describe` function provides detailed information about tasks and subtasks within a flow. This is particularly useful for diagnosing issues when PentAGI encounters problems or gets stuck.\n\n```bash\n# List all flows in the system\ngo run cmd/ftester/main.go describe\n\n# Show all tasks and subtasks for a specific flow\ngo run cmd/ftester/main.go -flow 123 describe\n\n# Show detailed information for a specific task\ngo run cmd/ftester/main.go -flow 123 -task 456 describe\n\n# Show detailed information for a specific subtask\ngo run cmd/ftester/main.go -flow 123 -task 456 -subtask 789 describe\n\n# Show verbose output with full descriptions and results\ngo run cmd/ftester/main.go -flow 123 describe -verbose\n```\n\nThis function allows you to identify the exact point where a flow might be stuck and resume processing by directly invoking the appropriate agent function.\n\n\u003c/details\u003e\n\n\u003cdetails\u003e\n\u003csummary\u003e\u003cb\u003eFunction Help and Discovery\u003c/b\u003e (click to expand)\u003c/summary\u003e\n\nEach function has a help mode that shows available parameters:\n\n```bash\n# Get help for a specific function\ngo run cmd/ftester/main.go [function_name] -help\n\n# Examples:\ngo run cmd/ftester/main.go terminal -help\ngo run cmd/ftester/main.go browser -help\ngo run cmd/ftester/main.go describe -help\n```\n\nYou can also run ftester without arguments to see a list of all available functions:\n\n```bash\ngo run cmd/ftester/main.go\n```\n\n\u003c/details\u003e\n\n\u003cdetails\u003e\n\u003csummary\u003e\u003cb\u003eOutput Format\u003c/b\u003e (click to expand)\u003c/summary\u003e\n\nThe `ftester` utility uses color-coded output to make interpretation easier:\n\n- **Blue headers**: Section titles and key names\n- **Cyan [INFO]**: General information 
messages\n- **Green [SUCCESS]**: Successful operations\n- **Red [ERROR]**: Error messages\n- **Yellow [WARNING]**: Warning messages\n- **Yellow [MOCK]**: Indicates mock mode operation\n- **Magenta values**: Function arguments and results\n\nJSON and Markdown responses are automatically formatted for readability.\n\n\u003c/details\u003e\n\n\u003cdetails\u003e\n\u003csummary\u003e\u003cb\u003eAdvanced Usage Scenarios\u003c/b\u003e (click to expand)\u003c/summary\u003e\n\n### Debugging Stuck AI Flows\n\nWhen PentAGI gets stuck in a flow:\n\n1. Pause the flow through the UI\n2. Use `describe` to identify the current task and subtask\n3. Directly invoke the agent function with the same task/subtask IDs\n4. Examine the detailed output to identify the issue\n5. Resume the flow or manually intervene as needed\n\n### Testing Environment Variables\n\nVerify that API keys and external services are configured correctly:\n\n```bash\n# Test Google search API configuration\ngo run cmd/ftester/main.go google -query \"pentesting tools\"\n\n# Test browser access to external websites\ngo run cmd/ftester/main.go browser -url \"https://example.com\"\n```\n\n### Developing New AI Agent Behaviors\n\nWhen developing new prompt templates or agent behaviors:\n\n1. Create a test flow in the UI\n2. Use ftester to directly invoke the agent with different prompts\n3. Observe responses and adjust prompts accordingly\n4. 
Check Langfuse for detailed traces of all function calls\n\n### Verifying Docker Container Setup\n\nEnsure containers are properly configured:\n\n```bash\ngo run cmd/ftester/main.go -flow 123 terminal -command \"env | grep -i proxy\" -message \"Check proxy settings\"\n```\n\n\u003c/details\u003e\n\n\u003cdetails\u003e\n\u003csummary\u003e\u003cb\u003eDocker Container Usage\u003c/b\u003e (click to expand)\u003c/summary\u003e\n\nIf you have PentAGI running in Docker, you can use ftester from within the container:\n\n```bash\n# Run ftester inside the running PentAGI container\ndocker exec -it pentagi /opt/pentagi/bin/ftester [arguments]\n\n# Examples:\ndocker exec -it pentagi /opt/pentagi/bin/ftester -flow 123 describe\ndocker exec -it pentagi /opt/pentagi/bin/ftester -flow 123 terminal -command \"ps aux\" -message \"List processes\"\n```\n\nThis is particularly useful for production deployments where you don't have a local development environment.\n\n\u003c/details\u003e\n\n\u003cdetails\u003e\n\u003csummary\u003e\u003cb\u003eIntegration with Observability Tools\u003c/b\u003e (click to expand)\u003c/summary\u003e\n\nAll function calls made through ftester are logged to:\n\n1. **Langfuse**: Captures the entire AI agent interaction chain, including prompts, responses, and function calls\n2. **OpenTelemetry**: Records metrics, traces, and logs for system performance analysis\n3. 
**Terminal Output**: Provides immediate feedback on function execution\n\nTo access detailed logs:\n\n- Check the Langfuse UI for AI agent traces (typically at `http://localhost:4000`)\n- Use Grafana dashboards for system metrics (typically at `http://localhost:3000`)\n- Examine terminal output for immediate function results and errors\n\n\u003c/details\u003e\n\n### Command-line Options\n\nThe main utility accepts several options:\n\n- `-env \u003cpath\u003e` - Path to environment file (optional, default: `.env`)\n- `-provider \u003ctype\u003e` - Provider type to use (default: `custom`, options: `openai`, `anthropic`, `ollama`, `bedrock`, `gemini`, `custom`)\n- `-flow \u003cid\u003e` - Flow ID for testing (0 means mock mode, default: `0`)\n- `-task \u003cid\u003e` - Task ID for agent context (optional)\n- `-subtask \u003cid\u003e` - Subtask ID for agent context (optional)\n\nFunction-specific arguments are passed after the function name using the `-name value` format.\n\n## 🏗️ Building\n\n### Building Docker Image\n\n```bash\ndocker build -t local/pentagi:latest .\n```\n\n\u003e [!NOTE]\n\u003e You can use `docker buildx` to build the image for other platforms, for example: `docker buildx build --platform linux/amd64 -t local/pentagi:latest .`\n\u003e\n\u003e You need to change the image name in the docker-compose.yml file to `local/pentagi:latest` and run `docker compose up -d` to start the server, or use the `build` option in the [docker-compose.yml](docker-compose.yml) file.\n\n## 👏 Credits\n\nThis project is made possible thanks to the following research and developments:\n- [LLM Powered Autonomous Agents](https://lilianweng.github.io/posts/2023-06-23-agent)\n- [A Survey of Autonomous LLM Agents](https://arxiv.org/abs/2403.08299)\n\n## 📄 License\n\n### PentAGI Core License\n\n**PentAGI Core**: Licensed under the [MIT License](LICENSE)  \nCopyright (c) 2025 PentAGI Development Team\n\n### VXControl Cloud SDK Integration\n\n**VXControl Cloud SDK Integration**: This 
repository integrates [VXControl Cloud SDK](https://github.com/vxcontrol/cloud) under a **special licensing exception** that applies **ONLY** to the official PentAGI project.\n\n#### ✅ Official PentAGI Project\n- This official repository: `https://github.com/vxcontrol/pentagi`\n- Official releases distributed by VXControl LLC\n- Code used under direct authorization from VXControl LLC\n\n#### ⚠️ Important for Forks and Third-Party Use\n\nIf you fork this project or create derivative works, the VXControl SDK components are subject to **AGPL-3.0** license terms. You must do one of the following:\n\n1. **Remove the VXControl SDK integration**\n2. **Open source your entire application** (comply with AGPL-3.0 copyleft terms)\n3. **Obtain a commercial license** from VXControl LLC\n\n#### Commercial Licensing\n\nFor commercial use of the VXControl Cloud SDK in proprietary applications, contact:\n- **Email**: info@vxcontrol.com  \n- **Subject**: \"VXControl Cloud SDK Commercial License\"\n