{"id":47969203,"url":"https://github.com/homeofe/ai-security-arena","last_synced_at":"2026-04-04T10:41:43.170Z","repository":{"id":345263803,"uuid":"1185151151","full_name":"homeofe/ai-security-arena","owner":"homeofe","description":"AI Security Arena: Interactive web interface for AI-powered Red Team vs Blue Team security battles. Model selection, live battle logs, scoring, and scenario building.","archived":false,"fork":false,"pushed_at":"2026-03-30T17:37:26.000Z","size":1034,"stargazers_count":0,"open_issues_count":1,"forks_count":0,"subscribers_count":0,"default_branch":"main","last_synced_at":"2026-03-30T18:29:33.166Z","etag":null,"topics":["ai","arena","blue-team","cybersecurity","elvatis","llm","multi-model","nextjs","red-team","security"],"latest_commit_sha":null,"homepage":null,"language":"TypeScript","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"apache-2.0","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/homeofe.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":"CONTRIBUTING.md","funding":null,"license":"LICENSE","code_of_conduct":"CODE_OF_CONDUCT.md","threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":"SECURITY.md","support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null,"zenodo":null,"notice":null,"maintainers":null,"copyright":null,"agents":null,"dco":null,"cla":null}},"created_at":"2026-03-18T09:39:26.000Z","updated_at":"2026-03-30T17:37:29.000Z","dependencies_parsed_at":null,"dependency_job_id":null,"html_url":"https://github.com/homeofe/ai-security-arena","commit_stats":null,"previous_names":["homeofe/ai-security-arena"],"tags_count":0,"template":false,"template_full_name":null,"purl":"pkg:github/homeofe/ai-security-arena","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/homeofe%2Fai-security-arena","tags_url":"https://repos.ecos
yste.ms/api/v1/hosts/GitHub/repositories/homeofe%2Fai-security-arena/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/homeofe%2Fai-security-arena/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/homeofe%2Fai-security-arena/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/homeofe","download_url":"https://codeload.github.com/homeofe/ai-security-arena/tar.gz/refs/heads/main","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/homeofe%2Fai-security-arena/sbom","scorecard":null,"host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":286080680,"owners_count":31397055,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2026-04-04T10:20:44.708Z","status":"ssl_error","status_checked_at":"2026-04-04T10:20:06.846Z","response_time":60,"last_error":"SSL_read: unexpected eof while reading","robots_txt_status":"success","robots_txt_updated_at":"2025-07-24T06:49:26.215Z","robots_txt_url":"https://github.com/robots.txt","online":false,"can_crawl_api":true,"host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["ai","arena","blue-team","cybersecurity","elvatis","llm","multi-model","nextjs","red-team","security"],"created_at":"2026-04-04T10:41:43.095Z","updated_at":"2026-04-04T10:41:43.158Z","avatar_url":"https://github.com/homeofe.png","language":"TypeScript","funding_links":[],"categories":[],"sub_categories":[],"readme":"# AI Security Arena\n\n\u003e Interactive web interface for AI-powered Red Team vs Blue Team security battles.\n\u003e Combines [ai-red-team](https://github.com/homeofe/ai-red-team) and 
[ai-blue-team](https://github.com/homeofe/ai-blue-team) in one place.\n\n---\n\n## Screenshots\n\n### Arena Setup\n![Arena Setup](docs/screenshots/arena-setup-final.jpg)\n\n### Battle Report (IoT Factory Floor: Claude Opus 4.6 vs Claude Opus 4.6)\n![Battle Report](docs/screenshots/battle-report-full.jpg)\n\n### Match History (with saved battles)\n![Match History](docs/screenshots/history-with-match.jpg)\n\n### Replay Viewer (VCR controls, step-by-step playback)\n![Replay Viewer](docs/screenshots/replay-viewer.jpg)\n\n### Leaderboard (model rankings after battles)\n![Leaderboard](docs/screenshots/leaderboard-with-data.jpg)\n\n### Scenario Builder\n![Scenario Builder](docs/screenshots/scenario-builder.jpg)\n\n---\n\n## Concept\n\n```\n┌──────────────────────────────────────────────────────────────┐\n│                    AI SECURITY ARENA                         │\n│                                                              │\n│   ┌──────────────┐                    ┌──────────────┐      │\n│   │   RED TEAM   │    ← sandbox →     │  BLUE TEAM   │      │\n│   │  (Attacker)  │                    │  (Defender)   │      │\n│   │              │    live battle     │              │      │\n│   │  Model: ...  │ ←──────────────→  │  Model: ...  
│      │\n│   └──────────────┘                    └──────────────┘      │\n│                                                              │\n│   ┌──────────────────────────────────────────────────────┐  │\n│   │                  WEB INTERFACE                        │  │\n│   │  - Model selection per team                          │  │\n│   │  - Custom prompts + example library                  │  │\n│   │  - Live battle log (WebSocket)                       │  │\n│   │  - Scoreboard + leaderboard                          │  │\n│   │  - Scenario builder                                  │  │\n│   │  - Round-by-round replay                             │  │\n│   │  - AAHP evolution tracker                            │  │\n│   │  - Match export (JSON/PDF)                           │  │\n│   └──────────────────────────────────────────────────────┘  │\n└──────────────────────────────────────────────────────────────┘\n```\n\n## Features\n\n### Core\n- **Model Selection:** Choose any LLM for each team (Claude, GPT, Gemini, Grok, local models)\n- **Custom Prompts:** Write your own attack/defense strategies or use curated examples\n- **Live Battle Log:** Real-time WebSocket stream, Red (left) vs Blue (right), every action timestamped\n- **Scenario Builder:** Pick environments (web-server, database, cloud-infra, IoT) or create custom ones\n\n### Scoring \u0026 History\n- **Scoreboard:** Per-match scoring (attacks landed, attacks blocked, time to detect, etc.)\n- **Leaderboard:** Which model pairs perform best? Historical rankings across all matches\n- **Round-by-Round Replay:** Step through completed matches forensic-style\n\n### Intelligence\n- **AAHP Evolution Tracker:** Both teams self-improve via GitHub Issues. Track the evolution over time.\n- **Match Export:** Download reports as JSON or PDF for documentation\n\n### Safety\n- **Sandbox Isolation:** All battles run in isolated sandboxes. 
Clear visual indicator.\n- **Budget Limiter:** Cap API costs per match (both teams make LLM calls simultaneously)\n- **Prompt Sanitization:** Custom prompts are validated to prevent sandbox escape\n\n## Tech Stack\n\n| Component | Technology |\n|-----------|-----------|\n| Frontend | Next.js 15 (App Router, React Server Components) |\n| Styling | Tailwind CSS |\n| Realtime | WebSocket (ws) + Server-Sent Events fallback |\n| Backend | Next.js API Routes + ai-red-team/ai-blue-team as libraries |\n| Database | SQLite (via better-sqlite3) for match history |\n| Testing | Vitest |\n\n## Getting Started\n\n### Prerequisites\n\n- Node.js 22+\n- pnpm\n\n### Installation\n\n```bash\ngit clone https://github.com/homeofe/ai-security-arena.git\ncd ai-security-arena\npnpm install\npnpm dev\n```\n\nOpen http://localhost:3000\n\n### API Key Configuration\n\n**Option A: Via the Settings UI (recommended)**\n\nNavigate to http://localhost:3000/settings and enter your API keys directly.\nKeys are stored locally in `.data/config.json` (gitignored) and never exposed to the browser.\n\n**Option B: Via environment variables**\n\n```env\n# At least one LLM provider key required\nANTHROPIC_API_KEY=sk-...\nOPENAI_API_KEY=sk-...\nGOOGLE_API_KEY=AIza...\n\n# Optional: budget limit per match (USD)\nMATCH_BUDGET_LIMIT=1.00\n```\n\nKeys set via the Settings UI take priority over environment variables.\n\n### CLI Tools (for CLI mode)\n\nCLI mode requires the respective CLI tools to be installed on your system:\n\n| Provider | CLI Tool | Install |\n|----------|---------|---------|\n| Anthropic (Claude) | `claude` | `npm install -g @anthropic-ai/claude-code` |\n| Google (Gemini) | `gemini` | `npm install -g @google/gemini-cli` |\n| OpenAI (GPT/Codex) | `codex` | `npm install -g @openai/codex` |\n\nCheck the **Status** page (http://localhost:3000/status) to verify all connectors are healthy.\n\nMock mode works without any API keys or CLI tools installed.\n\n## Project 
Structure\n\n```\nai-security-arena/\n├── src/\n│   ├── app/                  # Next.js App Router pages\n│   │   ├── page.tsx          # Landing / dashboard\n│   │   ├── arena/\n│   │   │   └── page.tsx      # Battle setup + live view\n│   │   ├── history/\n│   │   │   └── page.tsx      # Match history + replays\n│   │   ├── leaderboard/\n│   │   │   └── page.tsx      # Model rankings\n│   │   ├── settings/\n│   │   │   └── page.tsx      # API key management\n│   │   ├── status/\n│   │   │   └── page.tsx      # System health dashboard\n│   │   └── api/\n│   │       ├── battle/       # Start/stop battles\n│   │       ├── settings/     # API key CRUD\n│   │       ├── status/       # System health check\n│   │       ├── ws/           # WebSocket endpoint\n│   │       └── matches/      # Match CRUD\n│   ├── components/\n│   │   ├── BattleLog.tsx         # Split-screen live log\n│   │   ├── BattleHeader.tsx      # Round counter, timer, cost\n│   │   ├── ModelPicker.tsx       # Model selection per team\n│   │   ├── PromptEditor.tsx      # Tabbed prompt editor (examples + custom)\n│   │   ├── ScenarioSelector.tsx  # Horizontal scroll scenario cards\n│   │   ├── ScoreBar.tsx          # Animated score comparison bar\n│   │   ├── PhaseIcon.tsx         # Color-coded phase badges\n│   │   ├── ReportHeader.tsx      # Classified intel briefing header\n│   │   ├── ScoreOverview.tsx     # Large score display + stats\n│   │   ├── ScoreChart.tsx        # Pure CSS/SVG bar + line charts\n│   │   ├── DecisionTimeline.tsx  # Vertical alternating timeline\n│   │   ├── ReasoningViewer.tsx   # Expandable LLM reasoning per round\n│   │   ├── StrategyBreakdown.tsx # Side-by-side strategy analysis\n│   │   ├── TurningPoints.tsx     # Momentum shift highlights\n│   │   └── ExportButtons.tsx     # JSON/PDF/share export\n│   ├── lib/\n│   │   ├── arena.ts              # Arena controller (mock/cli/api modes)\n│   │   ├── cli-provider.ts       # CLI-based LLM provider (claude/gemini/codex)\n│   │ 
  ├── config.ts             # Server-side config store (API keys)\n│   │   ├── prompt-builder.ts     # Battle prompt construction per round\n│   │   ├── response-parser.ts    # Parse LLM responses into BattleEvents\n│   │   ├── report-generator.ts   # Post-match analysis and report generation\n│   │   ├── mock-battle.ts        # Realistic mock battle events\n│   │   ├── models.ts             # Available model registry\n│   │   ├── prompts.ts            # Example prompt library\n│   │   ├── scenarios.ts          # Built-in scenarios\n│   │   └── scoring.ts            # Scoring engine\n│   └── types/\n│       └── index.ts              # Shared type definitions\n├── .ai/handoff/                  # AAHP protocol files\n├── docs/screenshots/             # UI screenshots\n├── package.json\n├── next.config.ts\n├── postcss.config.mjs\n├── tsconfig.json\n└── vitest.config.ts\n```\n\n## Roadmap\n\n| # | Feature | Status |\n|---|---------|--------|\n| 1 | Project setup (Next.js + Tailwind + dependencies) | ✅ Done |\n| 2 | Model picker + scenario selector UI | ✅ Done |\n| 3 | Arena controller (mock + CLI + API modes) | ✅ Done |\n| 4 | Split-screen battle view (Red vs Blue) | ✅ Done |\n| 5 | Scoring engine | ✅ Done |\n| 6 | Custom prompt editor + example library | ✅ Done |\n| 7 | CLI provider integration (claude/gemini/codex) | ✅ Done |\n| 8 | Battle Report page (timeline, reasoning, strategy, export) | ✅ Done |\n| 9 | Export (JSON/PDF) | ✅ Done |\n| 10 | SSE real-time battle events | ✅ Done |\n| 11 | ai-red-team SDK integration | ✅ Done |\n| 12 | ai-blue-team SDK integration | ✅ Done |\n| 13 | Match history + SQLite persistence | ✅ Done |\n| 14 | Round-by-round replay viewer | ✅ Done |\n| 15 | Leaderboard with model rankings | ✅ Done |\n| 16 | Scenario builder | ✅ Done |\n| 17 | Deployment (Docker + Vercel) | ✅ Done |\n| 18 | System status page (health monitoring) | ✅ Done |\n| 19 | Settings page (API key management via UI) | ✅ Done |\n| 20 | Windows compatibility (CLI 
spawning, native modules) | ✅ Done |\n\n## License\n\nApache-2.0. Copyright 2026 Elvatis - Emre Kohler.\n\n## Related Projects\n\n- [ai-red-team](https://github.com/homeofe/ai-red-team) - Offensive security agent\n- [ai-blue-team](https://github.com/homeofe/ai-blue-team) - Defensive security agent\n- [AAHP](https://github.com/homeofe/AAHP) - Agent Handoff Protocol (self-evolution)\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fhomeofe%2Fai-security-arena","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fhomeofe%2Fai-security-arena","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fhomeofe%2Fai-security-arena/lists"}