{"id":35860151,"url":"https://github.com/qualifire-dev/rogue","last_synced_at":"2026-04-19T09:06:35.849Z","repository":{"id":314086630,"uuid":"997037661","full_name":"qualifire-dev/rogue","owner":"qualifire-dev","description":"AI Agent Evaluator \u0026 Red Team Platform","archived":false,"fork":false,"pushed_at":"2026-02-24T09:57:08.000Z","size":42636,"stargazers_count":1008,"open_issues_count":45,"forks_count":158,"subscribers_count":7,"default_branch":"main","last_synced_at":"2026-02-24T15:44:45.575Z","etag":null,"topics":["agents","ai","ai-agents","e2e-testing","llm","testing","testing-framework"],"latest_commit_sha":null,"homepage":"https://qualifire.ai","language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"other","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/qualifire-dev.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":"CONTRIBUTING.md","funding":null,"license":"LICENSE.md","code_of_conduct":"CODE_OF_CONDUCT.md","threat_model":null,"audit":null,"citation":null,"codeowners":".github/CODEOWNERS","security":"SECURITY.md","support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null,"zenodo":null,"notice":null,"maintainers":null,"copyright":null,"agents":"AGENTS.md","dco":null,"cla":null}},"created_at":"2025-06-05T21:16:31.000Z","updated_at":"2026-02-24T09:11:34.000Z","dependencies_parsed_at":"2025-09-10T15:19:55.002Z","dependency_job_id":"d03553c2-9365-4c6e-9ffe-3b2a6d1c3463","html_url":"https://github.com/qualifire-dev/rogue","commit_stats":null,"previous_names":["qualifire-dev/rogue"],"tags_count":30,"template":false,"template_full_name":null,"purl":"pkg:github/qualifire-dev/rogue","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/qualifire-dev%2Frogue","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/qualifire-dev%2Frogue/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/qualifire-dev%2Frogue/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/qualifire-dev%2Frogue/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/qualifire-dev","download_url":"https://codeload.github.com/qualifire-dev/rogue/tar.gz/refs/heads/main","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/qualifire-dev%2Frogue/sbom","scorecard":null,"host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":286080680,"owners_count":32000744,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2026-04-18T20:23:30.271Z","status":"online","status_checked_at":"2026-04-19T02:00:07.110Z","response_time":55,"last_error":null,"robots_txt_status":"success","robots_txt_updated_at":"2025-07-24T06:49:26.215Z","robots_txt_url":"https://github.com/robots.txt","online":true,"can_crawl_api":true,"host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["agents","ai","ai-agents","e2e-testing","llm","testing","testing-framework"],"created_at":"2026-01-08T12:14:52.600Z","updated_at":"2026-04-19T09:06:35.842Z","avatar_url":"https://github.com/qualifire-dev.png","language":"Python","readme":"# Rogue — AI Agent Evaluator \u0026 Red Team Platform\n\n![](https://pixel.qualifire.ai/api/record/rogue)\n\n\u003cdiv align=\"center\"\u003e\n\n\u003ca href=\"https://trendshift.io/repositories/15191\" target=\"_blank\"\u003e\u003cimg src=\"https://trendshift.io/api/badge/repositories/15191\" alt=\"qualifire-dev%2Frogue | Trendshift\" style=\"width: 250px; height: 55px;\" width=\"250\" height=\"55\"/\u003e\u003c/a\u003e\n\n![Tests](https://github.com/qualifire-dev/rogue/actions/workflows/test.yml/badge.svg?branch=main)\n\n\u003cimg src=\"./freddy-rogue.png\" width=\"200\"/\u003e\n\n**Stress-test your AI agents before attackers do.**\n\n[Discord Community](https://discord.gg/EUfAt7ZDeK) · [Quick Start](#-quick-start) · [Documentation](./docs/)\n\n\u003c/div\u003e\n\n---\n\n## Two Ways to Harden Your Agent\n\n\u003ctable\u003e\n\u003ctr\u003e\n\u003ctd width=\"50%\" valign=\"top\"\u003e\n\n### 🎯 Automatic Evaluation\n\nTest your agent against **business policies** and expected behaviors.\n\n- Define scenarios \u0026 expected outcomes\n- Verify compliance with business rules\n- Watch live conversations as Rogue probes your agent\n- Get detailed pass/fail reports with reasoning\n\n**Best for:** Regression testing, behavior validation, policy compliance\n\n\u003c/td\u003e\n\u003ctd width=\"50%\" valign=\"top\"\u003e\n\n### 🔴 Red Teaming\n\nSimulate **adversarial attacks** to find security vulnerabilities.\n\n- 75+ vulnerabilities across 12 security categories\n- 20 attack techniques (encoding, social engineering, injection)\n- CVSS-based risk scoring\n- 8 compliance frameworks (OWASP, MITRE, NIST, GDPR, EU AI Act)\n\n**Best for:** Security audits, penetration testing, compliance reporting\n\n\u003c/td\u003e\n\u003c/tr\u003e\n\u003c/table\u003e\n\n---\n\n## Architecture\n\nRogue operates on a **client-server architecture** with multiple interfaces:\n\n| Component  | Description                                 |\n| ---------- | ------------------------------------------- |\n| **Server** | Core evaluation \u0026 red team logic            |\n| **TUI**    | Modern terminal interface (Go + Bubble Tea) |\n| **CLI**    | Non-interactive mode for CI/CD pipelines    |\n\nhttps://github.com/user-attachments/assets/b5c04772-6916-4aab-825b-6a7476d77787\n\n### Supported Protocols\n\n| Protocol   | Transport            | Description                                                                        |\n| ---------- | -------------------- | ---------------------------------------------------------------------------------- |\n| **A2A**    | HTTP                 | [Google's Agent-to-Agent](https://a2a-protocol.org/latest/) protocol               |\n| **MCP**    | SSE, STREAMABLE_HTTP | [Model Context Protocol](https://modelcontextprotocol.io/) via `send_message` tool |\n| **Python** | —                    | Direct Python function calls (no network protocol)                                 |\n\nSee examples in [`examples/`](./examples/) for reference implementations.\n\n#### Python Entrypoint\n\nFor agents implemented as Python functions without A2A or MCP:\n\n1. Create a Python file with a `call_agent` function:\n\n```python\ndef call_agent(messages: list[dict]) -\u003e str:\n    \"\"\"\n    Process conversation and return response.\n\n    Args:\n        messages: List of {\"role\": \"user\"|\"assistant\", \"content\": \"...\"}\n\n    Returns:\n        Agent's response as a string\n    \"\"\"\n    # Your agent logic here\n    latest = messages[-1][\"content\"]\n    return f\"Response to: {latest}\"\n```\n\n2. Run Rogue with Python protocol:\n\n```bash\nuvx rogue-ai cli \\\n  --protocol python \\\n  --python-entrypoint-file ./my_agent.py \\\n  --judge-llm openai/gpt-4o-mini\n```\n\nOr via TUI: select \"Python\" as the protocol and enter the file path.\n\nSee [`examples/python_entrypoint_stub.py`](./examples/python_entrypoint_stub.py) for a complete example.\n\n---\n\n## 🔥 Quick Start\n\n### Prerequisites\n\n- `uvx` — [Install uv](https://docs.astral.sh/uv/getting-started/installation/)\n- Python 3.10+\n- LLM API key (OpenAI, Anthropic, or Google)\n\n### Installation\n\n```bash\n# TUI (recommended)\nuvx rogue-ai\n\n# CLI / CI/CD\nuvx rogue-ai cli\n```\n\n### Try It With the Example Agent\n\n```bash\n# All-in-one: starts both Rogue and a sample T-shirt store agent\nuvx rogue-ai --example=tshirt_store\n```\n\nConfigure in the UI:\n\n- **Agent URL**: `http://localhost:10001`\n- **Mode**: Choose `Automatic Evaluation` or `Red Teaming`\n\n---\n\n## Running Modes\n\n| Mode    | Command               | Description             |\n| ------- | --------------------- | ----------------------- |\n| Default | `uvx rogue-ai`        | Server + TUI            |\n| Server  | `uvx rogue-ai server` | Backend only            |\n| TUI     | `uvx rogue-ai tui`    | Terminal client         |\n| CLI     | `uvx rogue-ai cli`    | Non-interactive (CI/CD) |\n\n### Server Options\n\n```bash\nuvx rogue-ai server --host 0.0.0.0 --port 8000 --debug\n```\n\n### CLI Options\n\n```bash\nuvx rogue-ai cli \\\n  --evaluated-agent-url http://localhost:10001 \\\n  --judge-llm openai/gpt-4o-mini \\\n  --business-context-file ./.rogue/business_context.md\n```\n\n| Option                   | Description                                 |\n| ------------------------ | ------------------------------------------- |\n| `--config-file`          | Path to config JSON                         |\n| `--evaluated-agent-url`  | Agent endpoint (required)                   |\n| `--judge-llm`            | LLM for evaluation (required)               |\n| `--business-context`     | Context string or `--business-context-file` |\n| `--input-scenarios-file` | Scenarios JSON                              |\n| `--output-report-file`   | Report output path                          |\n| `--deep-test-mode`       | Extended testing                            |\n\n---\n\n## Red Teaming\n\n### Scan Types\n\n| Type       | Vulnerabilities | Attacks       | Time       |\n| ---------- | --------------- | ------------- | ---------- |\n| **Basic**  | 5 curated       | 6             | ~2-3 min   |\n| **Full**   | 75+             | 40+           | ~30-45 min |\n| **Custom** | User-selected   | User-selected | Varies     |\n\n### Compliance Frameworks\n\n- **OWASP LLM Top 10** — Prompt injection, sensitive data exposure, excessive agency\n- **MITRE ATLAS** — Adversarial threat landscape for AI systems\n- **NIST AI RMF** — AI risk management framework\n- **ISO/IEC 42001** — AI management system standard\n- **EU AI Act** — European AI regulation compliance\n- **GDPR** — Data protection requirements\n- **OWASP API Top 10** — API security best practices\n\n### Attack Categories\n\n| Category           | Examples                                |\n| ------------------ | --------------------------------------- |\n| Encoding           | Base64, ROT13, Leetspeak                |\n| Social Engineering | Roleplay, trust building                |\n| Injection          | Prompt injection, SQL injection         |\n| Semantic           | Goal redirection, context poisoning     |\n| Technical          | Gray-box probing, permission escalation |\n\n### Risk Scoring (CVSS-based)\n\nEach vulnerability receives a **0-10 risk score** based on:\n\n- **Impact** — Severity if exploited\n- **Exploitability** — Success rate likelihood\n- **Human Factor** — Manual exploitation potential\n- **Complexity** — Attack difficulty\n\n### Reproducible Scans\n\n```bash\n# Use random seeds for reproducible results\nuvx rogue-ai cli --random-seed 42\n```\n\nPerfect for regression testing and validating security fixes.\n\n---\n\n## Configuration\n\n### Environment Variables\n\n```bash\nOPENAI_API_KEY=\"sk-...\"\nANTHROPIC_API_KEY=\"sk-...\"\nGOOGLE_API_KEY=\"...\"\n```\n\n### Config File (`.rogue/user_config.json`)\n\n```json\n{\n  \"evaluated_agent_url\": \"http://localhost:10001\",\n  \"judge_llm\": \"openai/gpt-4o-mini\"\n}\n```\n\n---\n\n## Key Features\n\n| Feature                  | Description                                  |\n| ------------------------ | -------------------------------------------- |\n| 🔄 Dynamic Scenarios     | Auto-generate tests from business context    |\n| 👀 Live Monitoring       | Watch agent conversations in real-time       |\n| 📊 Comprehensive Reports | Markdown, CSV, JSON exports                  |\n| 🔍 Multi-Faceted Testing | Policy compliance + security vulnerabilities |\n| 🤖 Model Support         | OpenAI, Anthropic, Google (via LiteLLM)      |\n| 🛡️ CVSS Scoring          | Industry-standard risk assessment            |\n| 🔁 Reproducible          | Deterministic scans with random seeds        |\n\n---\n\n## Documentation\n\n- **[Quick Reference](./docs/QUICK_REFERENCE.md)** — One-page cheat sheet\n- **[Red Team Workflow](./docs/RED_TEAM_WORKFLOW.md)** — Technical deep-dive\n- **[Implementation Status](./docs/IMPLEMENTATION_STATUS.md)** — Feature breakdown\n- **[Attack Mapping](./docs/ATTACK_VULNERABILITY_MAPPING.md)** — Vulnerability coverage\n\n---\n\n## Contributing\n\n1. Fork the repository\n2. Create a branch (`git checkout -b feature/amazing-feature`)\n3. Commit changes (`git commit -m 'Add amazing feature'`)\n4. Push (`git push origin feature/amazing-feature`)\n5. Open a Pull Request\n\n---\n\n## License\n\nLicensed under a proprietary license — see [LICENSE](LICENSE.md).\n\nFree for personal and internal use. Commercial hosting requires licensing.\nContact: `hello@rogue.security`\n","funding_links":[],"categories":["Agent Security","AI Red Teaming (Testing AI Targets)"],"sub_categories":[],"project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fqualifire-dev%2Frogue","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fqualifire-dev%2Frogue","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fqualifire-dev%2Frogue/lists"}