https://github.com/azex-ai/hive
AI Agent Control Plane — orchestrate coding agents (Claude Code, Codex) with automated quality gates, repair loops, and workspace management
ai-agents claude-code codex developer-tools multi-agent nextjs orchestration pipeline quality-gates task-orchestration typescript
- Host: GitHub
- URL: https://github.com/azex-ai/hive
- Owner: azex-ai
- License: mit
- Created: 2026-03-18T22:15:28.000Z (3 months ago)
- Default Branch: main
- Last Pushed: 2026-05-28T11:57:56.000Z (26 days ago)
- Last Synced: 2026-05-28T13:25:56.213Z (26 days ago)
- Topics: ai-agents, claude-code, codex, developer-tools, multi-agent, nextjs, orchestration, pipeline, quality-gates, task-orchestration, typescript
- Language: TypeScript
- Homepage: https://github.com/azex-ai/hive
- Size: 376 KB
- Stars: 0
- Watchers: 0
- Forks: 0
- Open Issues: 0
Metadata Files:
- Readme: README.md
- License: LICENSE
# Hive
**AI Agent Control Plane** — Orchestrate coding agents in parallel with automated quality gates.
Hive turns AI coding assistants (Claude Code, Codex) from single-conversation tools into a managed production pipeline. Describe what you want, and Hive breaks it down, dispatches agents, runs quality checks at every stage, and auto-repairs failures — like an assembly line for code.
## The Problem
AI coding agents are powerful but hard to manage at scale:
- **No orchestration** — You can only talk to one agent at a time
- **No quality gates** — Generated code goes straight to review with no automated checks
- **No repair loop** — When something fails, you manually re-prompt
- **No context persistence** — Switch projects and lose all context
## How Hive Works
```
You: "Implement user authentication"
                       │
                       ▼
┌─ Supervisor (Opus) ──────────────────────────┐
│  Parse intent → Design spec → Decompose      │
│  into independent subtasks                   │
└──────────────────────┬───────────────────────┘
                       │
            ┌──────────┼──────────┐
            ▼          ▼          ▼
         Agent A    Agent B    Agent C  ← Parallel execution in git worktrees
        (Claude)    (Codex)   (Claude)
            │          │          │
            ▼          ▼          ▼
┌─ Quality Gates (automatic) ──────────────────┐
│  lint → build → test → review → integrate    │
│                                              │
│  Gate fails? → New agent repairs → Re-check  │
│  3 rounds failed? → Escalate to human        │
└──────────────────────────────────────────────┘
                       │
                       ▼
                    ✅ Done
```
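As a rough illustration of the "parallel execution in git worktrees" step, the sketch below builds the arguments for `git worktree add -b <branch> <path> <base>`, which creates a new branch and checks it out in its own directory. The helper name `planWorktree`, the `hive/<slug>` branch scheme, and the `.hive/worktrees` directory are assumptions made for this example, not Hive's actual layout.

```typescript
// Hypothetical helper: naming scheme and directory layout are illustrative only.
interface WorktreePlan {
  branch: string;
  path: string;
  args: string[]; // arguments to pass to the `git` executable
}

function planWorktree(
  taskId: string,
  baseBranch = "main",
  root = ".hive/worktrees", // assumed location, not taken from the README
): WorktreePlan {
  // Keep only characters that are safe in both branch names and directory names.
  const slug = taskId
    .toLowerCase()
    .replace(/[^a-z0-9]+/g, "-")
    .replace(/^-+|-+$/g, "");
  if (slug.length === 0) {
    throw new Error(`cannot derive a branch name from "${taskId}"`);
  }
  const branch = `hive/${slug}`;
  const path = `${root}/${slug}`;
  // One branch and one directory per task, so agents never share a checkout.
  return { branch, path, args: ["worktree", "add", "-b", branch, path, baseBranch] };
}
```

Passing `args` to a process spawner as an array, rather than concatenating a shell string, avoids quoting problems when task identifiers contain spaces or punctuation.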
### Key Concepts
- **Assembly Line** — Tasks flow through stages automatically. Human intervention is the exception, not the norm.
- **Quality Gates** — Each stage (lint, build, test, review, integrate) runs independently and produces command-level evidence. No trusting agent self-reports.
- **Repair by Fresh Eyes** — When a gate fails, a *new* agent fixes it (avoids attention blindness). Like space capsule docking — fix independently, then re-integrate.
- **Workspace Blueprints** — Each project gets a `.hive/blueprint.json` with project type, structure, dependencies, and progress checkpoints.
- **Model Routing** — Deep reasoning (Opus) for architecture and review, faster models for execution. Benchmark-driven dynamic routing is planned; benchmark data is currently being collected.
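The gate and repair concepts above can be sketched as a small loop. This is a minimal sketch under stated assumptions, not Hive's orchestrator: the names `runPipeline`, `GateResult`, and `repairWithFreshAgent` are hypothetical, the repair budget is assumed to apply per stage, only the failing gate is re-checked after a repair, and the callbacks are synchronous to keep the example short (real gates would run commands asynchronously).

```typescript
type Stage = "lint" | "build" | "test" | "review" | "integrate";

interface GateResult {
  passed: boolean;
  evidence: string; // command-level evidence, e.g. the command and its output
}

type PipelineOutcome =
  | { status: "done"; repairsUsed: number }
  | { status: "escalated"; failedStage: Stage; evidence: string };

function runPipeline(
  stages: Stage[],
  runGate: (stage: Stage) => GateResult,
  repairWithFreshAgent: (stage: Stage, evidence: string) => void,
  maxRepairRounds = 3,
): PipelineOutcome {
  let repairsUsed = 0;
  for (const stage of stages) {
    let result = runGate(stage);
    let rounds = 0;
    while (!result.passed) {
      if (rounds === maxRepairRounds) {
        // Repair budget exhausted: hand the evidence to a human.
        return { status: "escalated", failedStage: stage, evidence: result.evidence };
      }
      // A new agent receives only the gate evidence, not the previous agent's context.
      repairWithFreshAgent(stage, result.evidence);
      rounds += 1;
      repairsUsed += 1;
      result = runGate(stage); // the gate's verdict, not the agent's self-report, decides
    }
  }
  return { status: "done", repairsUsed };
}
```

The outcome is a discriminated union so that callers have to handle the escalation case explicitly.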
## Quick Start
```bash
git clone https://github.com/azex-ai/hive.git
cd hive
npm install
npm run dev
```
Open http://localhost:58080, select a workspace, and start describing tasks.
### Requirements
- Node.js 20+
- [Claude Code CLI](https://docs.anthropic.com/en/docs/claude-code) (`claude` command available)
- Optional: [Codex CLI](https://github.com/openai/codex) for multi-agent
## Architecture
```
src/
├── app/                        Next.js 16 App Router
│   ├── api/                    REST API
│   │   ├── tasks/              Task lifecycle (CRUD, run, approve, reject, pipeline)
│   │   ├── chat/               Supervisor chat (streaming + status)
│   │   ├── workspace/          Workspace init, browse, blueprint
│   │   ├── benchmarks/         Model performance tracking
│   │   └── events/             SSE real-time event stream
│   ├── tasks/[id]/             Task detail page (pipeline view, diff, terminal)
│   └── setup/                  Workspace selector with directory browser
├── components/                 React components (shadcn/ui, dark mode)
│   ├── pipeline-view.tsx       Real-time pipeline stage visualization
│   ├── chat-input.tsx          Streaming chat with supervisor
│   └── ...
└── lib/
    ├── pipeline/               Pipeline automation engine
    │   ├── orchestrator.ts     ★ Core: runs tasks through quality gates
    │   ├── gates.ts            Quality gate implementations (lint/build/test)
    │   └── model-router.ts     Benchmark-driven model selection
    ├── runtime/                Pluggable agent runtimes
    │   ├── types.ts            AgentRuntime interface
    │   ├── claude.ts           Claude Code SDK implementation
    │   └── codex.ts            Codex CLI implementation
    ├── blueprint.ts            Workspace scanning + checkpoint system
    ├── scheduler.ts            Task DB + lifecycle (SQLite)
    ├── executor.ts             Orchestrator: worktree → runtime → pipeline
    ├── supervisor.ts           Chat supervisor (streaming, session pool)
    └── worktree.ts             Git worktree isolation
```
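To make the "pluggable agent runtimes" idea concrete, here is one possible shape for a runtime interface and a registry that looks runtimes up by name. The README only states that `types.ts` defines an `AgentRuntime` interface; every field, method, and class below (`AgentTask`, `AgentResult`, `RuntimeRegistry`, `EchoRuntime`) is a guess for illustration and may differ from the real definitions.

```typescript
// Hypothetical shapes: the actual AgentRuntime interface in src/lib/runtime/types.ts may differ.
interface AgentTask {
  id: string;
  prompt: string;
  workdir: string; // the task's isolated worktree
  model?: string;
}

interface AgentResult {
  ok: boolean;
  output: string;
}

interface AgentRuntime {
  readonly name: string;
  run(task: AgentTask): Promise<AgentResult>;
}

class RuntimeRegistry {
  private readonly runtimes = new Map<string, AgentRuntime>();

  register(runtime: AgentRuntime): void {
    if (this.runtimes.has(runtime.name)) {
      throw new Error(`runtime "${runtime.name}" is already registered`);
    }
    this.runtimes.set(runtime.name, runtime);
  }

  get(name: string): AgentRuntime {
    const runtime = this.runtimes.get(name);
    if (runtime === undefined) throw new Error(`unknown runtime "${name}"`);
    return runtime;
  }

  names(): string[] {
    return Array.from(this.runtimes.keys()); // Map preserves registration order
  }
}

// Stand-in runtime that echoes the prompt; useful for exercising the pipeline without a CLI.
class EchoRuntime implements AgentRuntime {
  constructor(readonly name: string) {}

  async run(task: AgentTask): Promise<AgentResult> {
    return { ok: true, output: `[${this.name}] ${task.prompt}` };
  }
}
```

With this shape, adding another agent means implementing one `run` method and registering it; the executor only depends on the interface.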
## Tech Stack
| Layer | Technology |
|-------|-----------|
| Framework | Next.js 16 (App Router, Turbopack) |
| UI | shadcn/ui + Tailwind CSS v4 + Geist fonts |
| Database | SQLite (better-sqlite3, zero config) |
| Agent Runtime | Claude Code SDK, Codex CLI (pluggable) |
| Real-time | Server-Sent Events (SSE) |
| Isolation | Git worktrees (one branch per task) |
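
For the real-time layer, a Server-Sent Events stream is plain text: each event is a set of `field: value` lines terminated by a blank line. The sketch below formats one pipeline update as an SSE frame. The `PipelineEvent` shape and the event name are assumptions for this example; the README does not document Hive's actual event payloads.

```typescript
// Hypothetical payload: Hive's real event schema is not documented in the README.
interface PipelineEvent {
  taskId: string;
  stage: string;
  status: "running" | "passed" | "failed";
}

function formatSseEvent(eventName: string, payload: PipelineEvent, id?: string): string {
  const lines: string[] = [];
  if (id !== undefined) lines.push(`id: ${id}`); // lets clients resume via Last-Event-ID
  lines.push(`event: ${eventName}`);
  // JSON.stringify escapes newlines, so the payload always fits on one `data:` line.
  lines.push(`data: ${JSON.stringify(payload)}`);
  return lines.join("\n") + "\n\n"; // the blank line terminates the event
}
```

On the browser side, an `EventSource` would receive this frame as a named event whose `data` field can be passed to `JSON.parse`.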
## Pipeline Stages
Every task automatically flows through:
| Stage | What it does | Model |
|-------|-------------|-------|
| **code** | Agent implements the task in isolated worktree | sonnet |
| **lint** | Auto-detects project type (Node/Go), runs linter | machine |
| **build** | Runs build command | machine |
| **test** | Runs test suite | machine |
| **review** | Spec ↔ output consistency check | opus |
| **integrate** | Merge verification (build + test on merged code) | machine |
Gates detect project type automatically — Node.js (`npm run lint/build/test`), Go (`go vet/build/test`), or skip if not applicable.
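
The mapping from project type to gate command described above could look roughly like this. It is a sketch under assumptions: the function name `gateCommand` is hypothetical, the `./...` package pattern for Go is a common convention rather than something the README specifies, and treating a missing npm script as "not applicable" is one plausible reading of the skip rule.

```typescript
type ProjectType = "node" | "go" | "unknown";
type MachineGate = "lint" | "build" | "test";

const GATE_COMMANDS: Record<"node" | "go", Record<MachineGate, string[]>> = {
  node: {
    lint: ["npm", "run", "lint"],
    build: ["npm", "run", "build"],
    test: ["npm", "run", "test"],
  },
  go: {
    // "./..." (all packages) is an assumed default, not taken from the README.
    lint: ["go", "vet", "./..."],
    build: ["go", "build", "./..."],
    test: ["go", "test", "./..."],
  },
};

// Returns the command to run, or null when the gate should be skipped.
function gateCommand(
  type: ProjectType,
  gate: MachineGate,
  npmScripts: string[] = [], // script names found in package.json
): string[] | null {
  if (type === "unknown") return null;
  // Assumption: a Node project without the matching script skips that gate.
  if (type === "node" && !npmScripts.includes(gate)) return null;
  return GATE_COMMANDS[type][gate];
}
```

Returning `null` for a skipped gate keeps "skipped" distinct from "passed", which matters if gate results are recorded as evidence.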
## Workspace Management
```
/setup → Browse directories → Select project
                  ↓
          Blueprint scan:
            - Project type (Node/Go/mixed/unknown)
            - Config files, dependencies, scripts
            - Git state (branch, commit, dirty files)
            - Progress checkpoint
                  ↓
          Tasks scoped to workspace
          Chat history isolated per workspace
          Supervisor context includes blueprint
```
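The blueprint scan above can be pictured as a function from a directory listing and git state to a small record. The field names below are illustrative guesses rather than the real `.hive/blueprint.json` schema, and using `package.json` and `go.mod` as the marker files is an assumption based on each ecosystem's conventions.

```typescript
type BlueprintProjectType = "node" | "go" | "mixed" | "unknown";

interface GitState {
  branch: string;
  commit: string;
  dirtyFiles: string[];
}

// Hypothetical schema: the real blueprint.json may use different fields.
interface Blueprint {
  projectType: BlueprintProjectType;
  configFiles: string[];
  git: GitState;
  checkpoint: string | null; // last recorded progress checkpoint, if any
}

const MARKER_FILES = ["package.json", "go.mod"]; // assumed detection markers

function detectProjectType(rootFiles: string[]): BlueprintProjectType {
  const hasNode = rootFiles.includes("package.json");
  const hasGo = rootFiles.includes("go.mod");
  if (hasNode && hasGo) return "mixed";
  if (hasNode) return "node";
  if (hasGo) return "go";
  return "unknown";
}

function scanBlueprint(rootFiles: string[], git: GitState): Blueprint {
  return {
    projectType: detectProjectType(rootFiles),
    configFiles: rootFiles.filter((file) => MARKER_FILES.includes(file)),
    git,
    checkpoint: null, // a fresh scan has no progress yet
  };
}
```

Keeping the scan a pure function of its inputs makes it straightforward to test without touching the filesystem.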
## Configuration
Create `hive.yaml` in the project root (optional — defaults are sensible):
```yaml
repo: .
agents:
  claude:
    command: claude
    max_concurrent: 3
  codex:
    command: codex
    max_concurrent: 2
supervisor:
  agent: claude
  model: opus
pipeline:
  max_repair_rounds: 3
  self_review_probability: 0.2
  gates: [lint, build, test, review, integrate]
model_routing:
  default:
    design: opus
    code: sonnet
    review: opus
    repair: sonnet
```
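Since the file is optional, the loader has to merge whatever the user supplies with built-in defaults. The sketch below shows one way to do that for the `pipeline` and `model_routing.default` sections, using the values from the example above as defaults. The function names, the validation rules, and the `sonnet` fallback for unrouted stages are assumptions, not documented Hive behavior.

```typescript
interface PipelineConfig {
  max_repair_rounds: number;
  self_review_probability: number;
  gates: string[];
}

type ModelRouting = Record<string, string>; // stage name -> model name

// Defaults copied from the example hive.yaml above.
const DEFAULT_PIPELINE: PipelineConfig = {
  max_repair_rounds: 3,
  self_review_probability: 0.2,
  gates: ["lint", "build", "test", "review", "integrate"],
};

const DEFAULT_ROUTING: ModelRouting = {
  design: "opus",
  code: "sonnet",
  review: "opus",
  repair: "sonnet",
};

function resolvePipeline(overrides: Partial<PipelineConfig> = {}): PipelineConfig {
  const merged: PipelineConfig = { ...DEFAULT_PIPELINE, ...overrides };
  if (!Number.isInteger(merged.max_repair_rounds) || merged.max_repair_rounds < 0) {
    throw new Error("max_repair_rounds must be a non-negative integer");
  }
  if (merged.self_review_probability < 0 || merged.self_review_probability > 1) {
    throw new Error("self_review_probability must be between 0 and 1");
  }
  return merged;
}

// The "sonnet" fallback for stages without a route is an assumption.
function modelFor(stage: string, routing: ModelRouting = DEFAULT_ROUTING, fallback = "sonnet"): string {
  return routing[stage] ?? fallback;
}
```

Validating at load time means a typo such as `self_review_probability: 20` fails immediately instead of silently changing how often reviews run.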
## Design Philosophy
> Code is a byproduct of probabilistic generators. Constraints and tests are the real assets.
Based on three theoretical anchors:
- **Control Theory** (Wiener) — Tests are control signals, not just verification
- **Bounded Rationality** (Simon) — Don't expect perfect output; iterate fast with feedback
- **Stigmergy** — Agents coordinate through shared environment (git, DB), not direct messaging
Inspired by Toyota Production System: WIP limits, station-level QA, andon cords, takt time.
Full design document: [`docs/DESIGN.md`](docs/DESIGN.md)
## Status
Early development. The core pipeline works end-to-end:
- [x] Task creation via chat or API
- [x] Auto-dispatch to available agents
- [x] Git worktree isolation per task
- [x] Automated quality gates (lint → build → test → review → integrate)
- [x] Repair loop with fresh agents
- [x] Pipeline visualization (real-time)
- [x] Workspace management + blueprint scanning
- [x] Token-level streaming in chat
- [x] Model benchmarking infrastructure
- [ ] Design spec generation (Layer 1 decompose)
- [ ] Cross-review rounds
- [ ] Benchmark-driven model routing (data collection phase)
- [ ] Task dependency visualization
- [ ] Workspace config persistence to YAML
## License
MIT