https://github.com/withkynam/vibecode-pro-max-kit

Your AI forgets. This remembers. A spec-driven coding harness for vibecoders, product owners, CEOs, and real builders: self-improving context memory, 15 agents, and 33 skills that work with /goal, agent-team, and workflow on autopilot loops with no need for a human gate. Kills context rot; ships features, not spaghetti. Claude Code & Codex. Any stack.



          


English | 简体中文 | 日本語 | 한국어 | Tiếng Việt | Português | Español | Deutsch | Français | हिंदी


Flowser

*Built by world-class engineers, for vibecoders at*

*[flowser.ai](https://flowser.ai): AI Agents with computers for GTM*


# vibecode-pro-max-kit



Flow like water

"Total Concentration, Spec Breathing, Tenth Form: the Vibe Flow never breaks."

Tanjiro Kamado

*Drop this into any project. Your AI agent gets a complete plan-first dev process: 7 gated phases, self-healing check loops, and autopilot that runs start to finish without losing its place.*

- 📦 **One-command install.** One curl line drops it into any project. It detects new vs. returning users and never overwrites your files.
- 🌍 **Works everywhere.** Any tech stack, any language, and any AI coding agent: Claude Code, Codex, Cursor, Windsurf, Copilot, and more.
- 🧭 **RIPER-5 plan-first workflow.** 7 gated phases (Research → Spec → Innovate → Plan → Validate → Execute → Update-Process) stop the agent from jumping straight to code.
- 🚀 **Autopilot mode (quick / fast / full).** Start a hands-free run at any phase with a single phrase. Three lanes match the ceremony to the risk.
- 🎯 **/goal, the run-until-done token.** One copy-pasteable block keeps the agent running phase after phase without stopping, and resumes the run in a fresh session.
- 🔁 **PVL + EVL self-healing loops.** Plan-check-fix and test-check-fix loops find gaps, fix them, and re-check on their own, up to 10 cycles each.
- 🔍 **vc-autoresearch.** A reusable find-gaps → fix → repeat loop you can point at plans, tests, specs, docs, or evals.
- 🧪 **Feasibility probes.** Test-before-you-build verdicts (VIABLE / NOT-VIABLE) before the agent commits to any design approach.
- 🎛️ **Smart strategy picker.** Before each phase it weighs one agent vs. many vs. a coordinated team, with cost estimates, and picks the cheapest that fits.
- 🧮 **Smart model use.** The expensive model only writes code; the cheaper model does everything else. Lower cost, same quality.
- 🤔 **Intent clarification.** When a request is vague, the agent asks a few sharp questions up front instead of guessing and building the wrong thing.
- 🛡️ **36 validators.** Mechanical correctness checks, not opinions, guard the kit's own structure and catch drift before it ships.
- 🏗️ **Phase programs.** Large projects are split into independent phases with quality gates between them, so big work doesn't fall apart.
- 🔀 **Programs that reshape themselves.** As it learns, the agent inserts new phases, reorders work, and skips blocked steps, so the plan adapts on the fly.
- 🧠 **Never loses its place.** Progress notes are written to disk every phase, so a run survives a memory reset and picks up exactly where it left off.
- 📚 **Self-improving project memory.** It learns your codebase on setup and keeps its own shared notes current after every feature ships, so docs never go stale.
- ⚡ **Quick Fix + Fast Mode.** Light lanes for small changes skip the heavy ceremony, so a one-line fix stays a one-line fix.
- 🧱 **Layered, auto-discovered skills.** Skills are organized in clear layers and discovered automatically, so the agent always finds the right tool for the step.
- 🤖 **15 agents · 33 skills · 10 hooks.** A full team of specialized agents, reusable skills, and safety hooks, all wired together out of the box.
- 🔄 **Full kit lifecycle.** Install, setup, update, and publish are all one command each, keeping every project on the latest kit safely.
- 📝 **SPEC, your plain-language sign-off.** Before any design, you state what to build in simple user stories, the cheapest place to catch a misunderstanding.
- 🎯 **Always checks your intent.** Every later phase measures back against your SPEC: is what we're building actually what you asked for?




The simplest, most flexible, team-friendly coding kit for Claude Code, Codex CLI, Cursor, Windsurf, Antigravity, OpenCode, and GitHub Copilot.

Works across any tech stack, any language, any project.

[Figure: three rows of tech stack logos]

Not just for show. When you run `vc-setup`, agents scan your codebase, detect your stack, and build project-specific knowledge groups that every skill reads before it works. Other harnesses lock agents to one language (`rust-review-agent`, `python-linter`) and are useless elsewhere. This one adapts to any combination above and builds up knowledge as you ship.

---

## ⚡ Get Started – One Command, 30 Seconds

> **Prerequisites:** Node.js ≥ 22, git, bash (macOS / Linux / WSL / **Git Bash**; on Alpine: `apk add bash`).
>
> **Windows:** the installer is a bash script, so run it inside **Git Bash** (ships with [Git for Windows](https://git-scm.com/download/win)) or **WSL**, *not* PowerShell or `cmd.exe`. Both work out of the box: the installer detects Windows shells and, when symlinks aren't permitted, automatically falls back to copying (install still completes). For true symlinks (so Codex auto-reflects `vc-update` changes), [enable Developer Mode](https://learn.microsoft.com/en-us/windows/apps/get-started/enable-your-device-for-development).

**There is only one command, and it works for everyone.** Run it inside your project folder. It detects whether you are a new or returning user, installs safely without overwriting your files, and then *tells you the exact next thing to say.*

```bash
curl -fsSL https://raw.githubusercontent.com/withkynam/vibecode-pro-max-kit/main/install.sh | bash
```

When it finishes, it prints one of two messages. **Read the bottom of the output and do exactly what it says:**

**🆕 Fresh project**

Installer detects no harness and prints:

`Next: Run: claude → Say: "Run vc-setup"`

→ Open your agent and say `Run vc-setup`

`vc-setup` detects your tech stack, creates the `process/` folder, scans your codebase, and fills in your real architecture, conventions, and test commands. It is a conversation, not a checklist.

**🔄 Existing harness (upgrade)**

Installer detects a prior install and prints:

`Next (upgrade detected): Run: claude → Say: "Run vc-update"`

→ Open your agent and say `Run vc-update`

`vc-update` pulls the latest version and, if it finds old-format plans or folders, gives you a ready-to-paste prompt to finish the move with zero data loss. Your `process/` is never touched.

> 💡 **You never have to guess the command.** `install.sh` routes you: fresh → `vc-setup`, upgrade → `vc-update`. Re-running install is always safe; it never breaks things. **Codex users:** run `/vc-setup` (or `/vc-update`) instead of saying it in chat.

**📦 What install puts on disk (non-destructive)**


```
your-project/
├── .claude/
│   ├── agents/                  # 🤖 15 agent definitions (.md)
│   ├── skills/                  # ⚡ 33 skills (each a dir with SKILL.md)
│   └── hooks/                   # 🪝 10 lifecycle hooks (.cjs / .mjs)
├── .codex/agents/               # 🔄 Mirrored agents for Codex
├── .agents/skills →             # 🔗 Symlink to .claude/skills (Codex discovery)
├── CLAUDE.md                    # 📋 Orchestrator + routing rules
├── AGENTS.md                    # 📖 Agent + skill registry (cross-tool)
└── process/
    └── development-protocols/   # 📜 22 shared workflow docs (seeded by install)
                                 # context/, plans, features → built by vc-setup
```

- **Non-destructive.** Your existing `.claude/skills/`, `.claude/agents/`, `process/`, and `settings.json` are never wiped. Only kit-owned files are written or updated.
- **Existing config?** Backed up to `.vibecode-backup/`; your `settings.json` is restored afterward.
- **Existing `CLAUDE.md`?** Backed up as `CLAUDE.md.pre-vibecode`.
- **Existing `process/`?** Never touched by install. `vc-setup` / `vc-update` migrate it interactively, showing you the diff first.

> **One-time first-install caveat:** if you have custom skills/agents whose names start with `vc-` (the reserved kit namespace) and have *never* run install before, the stale-removal step may flag them. After install, run `ls .claude/skills/ .claude/agents/` to confirm. Use a non-`vc-` prefix (`my-`, `team-`, `proj-`) for your own additions to avoid this entirely.

**🤖 Prefer to drive setup from your agent? (full prompt)**


> Open Claude Code or Codex **with your project folder as the working directory**, then paste:

```
First, install the vibecode-pro-max-kit agent harness by running this command:

curl -fsSL https://raw.githubusercontent.com/withkynam/vibecode-pro-max-kit/main/install.sh | bash

After install completes, run vc-setup and follow the full interactive flow:

1. DETECT - Read package.json (or go.mod, Cargo.toml, pyproject.toml, etc.), detect my
   stack: framework, package manager, monorepo structure, test framework, database, auth.
   Also check for any existing .claude/, process/, or context files.
2. SHOW ME WHAT YOU FOUND - Summarize detection and wait for me to confirm. If this is an
   existing project, tell me what looks good vs what could be improved.
3. ASK ME ABOUT THE PROJECT - Have a real conversation. Ask follow-ups, probe anything
   vague, keep going until you genuinely understand it. Summarize back and confirm.
4. SCAFFOLD - Create the process/ directory. If process/ already exists, show me the plan
   and wait for approval. Never silently move or delete my files.
5. STUDY - Deep-scan and populate process/context/all-context.md with REAL content: repo
   structure, stack + versions, patterns, import aliases, env vars, routes, schema, tests.
   No placeholder text.
6. VALIDATE - Run all validation checks to confirm everything is wired correctly.

Rules: read and preserve good existing context; show me a summary before each major change
and wait for my OK; never create empty placeholder files; ask before reorganizing.
```

Table of Contents

- [At a Glance](#-at-a-glance) · [The Problem](#-the-problem) · [The Fix](#-the-fix)
- [Vibe Coding Revolution](#the-vibe-coding-revolution) · [Who Is This For](#who-is-this-for) · [How This Compares](#how-this-compares) · [What Makes This Different](#-what-makes-this-different)
- [How It Works: The Coordinator](#-how-it-works--the-coordinator) · [The RIPER-5 Lifecycle](#-the-riper-5-lifecycle) · [Intent Clarification](#-intent-clarification)
- [The Two Quality Loops (PVL + EVL)](#-the-two-quality-loops--pvl--evl) · [Strategy Compare + Model Policy](#-strategy-compare--model-policy) · [Autopilot Mode](#-autopilot-mode--hands-free-riper-5) · [Feasibility Probes + Validators](#-feasibility-probes--the-validator-safety-net)
- [Built-in Safety Systems](#-built-in-safety-systems) · [Pre-Implementation Intelligence](#-pre-implementation-intelligence) · [Quality Pipeline](#-quality-pipeline--built-into-execution)
- [Plan Lifecycle](#-the-plan-lifecycle) · [Phase Programs](#-phase-programs--large-projects-that-dont-fall-apart) · [Context Groups](#-context-groups) · [Feature Folders](#-feature-folders) · [Skill Layers](#-skill-layers) · [Self-Improving Memory](#-self-improving-project-memory)
- [What's Inside](#-whats-inside) · [Quick Fix + Fast Mode](#-quick-fix--fast-mode) · [Kit Lifecycle](#-kit-lifecycle-install--setup--update--publish) · [Contributing](#contributing)

---

## 🎁 At a Glance

| | Count | Item | What it is |
|---|---|---|---|
| 🤖 | 15 | Agents | One per phase + 6 specialist agents |
| ⚡ | 33 | Skills | 20 workflow + 13 helper, matched by keyword |
| 🪝 | 10 | Hooks | Safety rails + automatic context loading |
| 📜 | 22 | Protocols | Shared rules every agent follows |
| 🛡️ | 36 | Validators | Automated checks that catch errors before they ship |
| 🔧 | 7 | Tools | Claude Code · Codex · Cursor · Windsurf · Antigravity · OpenCode · Copilot |
| 🌐 | 10 | Languages | EN · 中文 · 日本語 · 한국어 · VI · PT · DE · FR · ES · हिन्दी |
| ⚡ | 30s | Install | One command, then your agent guides the rest |

- 🛩️ **Autopilot.** 3 lanes (quick / fast / full). Start at any phase; it runs start to finish without stopping.
- 📌 **/goal blocks.** Short copy-pasteable texts that resume hands-free runs across sessions after a reset.
- 🔍 **vc-autoresearch.** Find-gaps → fix → repeat loop (shared tool for plans, tests, and evals).
- 🔬 **Feasibility probes.** Test-before-you-build verdicts (VIABLE / NOT-VIABLE) before locking in a design.

---

## 🔥 The Problem

You ask Claude to "add webhook support." It immediately starts writing code. No questions about your architecture. No check on existing patterns. No plan. You get 400 lines that don't fit your codebase, and you spend an hour fixing it.

**But that's just the surface.** The deeper problems hit harder:

- 🧠 **Context dies every session.** Your agent forgets everything it learned. Same mistakes, same questions, every time. No memory, no compounding knowledge.
- 📄 **Docs go stale instantly.** You wrote great context docs last week. They're already outdated. Nothing auto-updates them as the codebase evolves.
- 💥 **Big tasks collapse mid-way.** The context window fills, state is lost, the agent starts hallucinating. You restart from scratch in hour 3.
- 🤝 **No specs, no review, no collaboration.** Your PM can't review what the agent is about to build. There is no written plan to share, discuss, or approve before code is written.
- 🎭 **Architecture decisions are hallucinated.** The agent invents patterns instead of researching how other codebases solved the same problem.
- 🚀 **Nothing verifies "done".** The agent says "all tests pass", but it never independently re-ran them. You find out in production.

**Your agent has intelligence but no process, no memory, and no way to collaborate with your team.** Whether you're a developer, a PM, or a CEO who just started vibe coding, this hits everyone the same way, and the fix is the same: **give your agent a real development process.**

---

## 🛠️ The Fix

This kit installs a complete development system into your project. It is not just a `CLAUDE.md`, but **15 specialized agents, 33 skills, 10 hooks, and 22 protocols**, with a step-locked workflow that makes your agent **understand before it builds, and prove before it ships.**

- 📋 **Plan-first approach.** PMs and devs review the same written plan before any code is written.
- 🔄 **Self-improving knowledge.** Updates itself every time a feature ships, so docs never go stale.
- ⚡ **Hands-free execution.** Survives session resets and runs for hours, not minutes.
- 🧬 **Architecture research.** Studies real codebases before making design decisions.
- ✅ **Two quality checks.** Plans are checked before coding; tests are re-run independently after.
- 🧭 **Smart knowledge routing.** Loads only what is relevant, not your whole knowledge base every time.


### The full RIPER-5 flow – 7 phases, every step gated

```mermaid
%%{init: {'theme': 'base', 'themeVariables': {'fontSize': '14px', 'lineColor': '#8888AA'}} }%%
flowchart TD
R["RESEARCH\nread-only facts"]
S["SPEC\nrequirements doc"]
I["INNOVATE\n2-3 approaches"]
P["PLAN\ndetailed checklist"]
V["VALIDATE\nplan → contract\n(PVL loop)"]
E["EXECUTE\nimplement\n(EVL loop)"]
U["UPDATE PROCESS\ncapture + archive"]

R -->|"go"| S
S -->|"go"| I
I -->|"go"| P
P -->|"ENTER VALIDATE"| V
V -->|"Gate: PASS"| E
E -->|"gates green"| U

style R fill:#1565C0,stroke:#0D47A1,color:#FFFFFF
style S fill:#0277BD,stroke:#01579B,color:#FFFFFF
style I fill:#E65100,stroke:#BF360C,color:#FFFFFF
style P fill:#2E7D32,stroke:#1B5E20,color:#FFFFFF
style V fill:#558B2F,stroke:#33691E,color:#FFFFFF
style E fill:#C62828,stroke:#B71C1C,color:#FFFFFF
style U fill:#00695C,stroke:#004D40,color:#FFFFFF
```

**In interactive mode**, each phase waits for your "go" before it moves on, so you stay in the loop at every step. **In autopilot or /goal mode**, you give approval once up front, then the system drives itself all the way to done. It stops only for three specific hard stops listed below. **VALIDATE** and the post-EXECUTE re-test are not optional: they are hard gates that block bad work from shipping, and they run automatically in both modes.
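The gating rule can be pictured as a tiny state machine. This is a minimal Python sketch, not the kit's implementation: the `Phase` enum, the `advance` function, and its parameters are hypothetical names, and the many trigger phrases (`go`, `ENTER VALIDATE`, and so on) are collapsed into a single `user_said_go` flag for brevity.

```python
from enum import Enum


class Phase(Enum):
    """The seven phases of the flow, in order (names taken from the diagram)."""
    RESEARCH = 1
    SPEC = 2
    INNOVATE = 3
    PLAN = 4
    VALIDATE = 5
    EXECUTE = 6
    UPDATE_PROCESS = 7


def advance(current: Phase, *, autopilot: bool,
            user_said_go: bool = False, gate_passed: bool = True) -> Phase:
    """Return the phase to run next, or the same phase if a gate is closed."""
    if current is Phase.UPDATE_PROCESS:
        return current  # end of the flow
    # Hard gates apply in both modes: VALIDATE must pass, and the
    # post-EXECUTE re-test must be green, before anything moves on.
    if current in (Phase.VALIDATE, Phase.EXECUTE) and not gate_passed:
        return current
    # Interactive mode waits for an explicit approval at every step;
    # autopilot was approved once up front (assumption: modelled as a flag).
    if not autopilot and not user_said_go:
        return current
    return Phase(current.value + 1)
```

For example, `advance(Phase.VALIDATE, autopilot=True, gate_passed=False)` stays on `VALIDATE`: standing approval does not open a hard gate.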

---

## The Vibe Coding Revolution

"The hottest new programming language is English."

Andrej Karpathy

**Vibe coding changed who can build software. Plan-first development changes what they can ship.**

- **63%** of vibe coding users are NOT developers
- **16.2M** citizen developers worldwide (38% YoY growth)
- **$4.7B** vibe coding market, growing 38% annually
- **25%** of YC W25 startups had 95%+ AI-generated codebases

Most tools help you start a project. This kit helps you **finish it**, with plans your team can review, knowledge that never goes stale, and safety checks that catch mistakes before they ship.

---

## Who Is This For?

"The point isn't who typed it. It's what shipped."

Garry Tan, YC

- 🧑‍💼 **CEO / Founder.** *"Build me a SaaS with auth, billing, and a landing page"* The agent researches your stack, writes an architecture plan you can review, implements with tests, and captures every decision for your technical co-founder to audit later.
- 📊 **Product Manager.** *"Create a dashboard showing MRR, churn, and growth metrics"* It generates a PRD-style SPEC, gets your approval before writing code, implements to spec, and archives the plan as searchable project history.
- 🎨 **Designer.** *"Match this Figma screenshot pixel-perfect"* The design-aware agent analyzes your mockup, implements component-by-component with your design tokens, and spawns visual comparison checks.
- ⚙️ **Engineer.** *"Refactor the auth module to support RBAC with zero downtime"* It researches your current auth code and how other codebases solved RBAC, writes a migration plan that maps which files could be affected, then builds it safely with rollback notes.

---

## How This Compares

| Feature | vibecode-pro-max-kit | Superpowers | GSD | gstack |
|---------|---------------------|-------------|-----|--------|
| Plan-first lifecycle | Full RIPER-5 (research → spec → innovate → plan → validate → execute → update) | Mandatory workflows | Context-rot fix | Partial |
| Step-locked safety | Agent tools are restricted per phase (read-only research, no writing in innovate) | Skill-based constraints | Phase separation | None |
| Quality check loops | **Two**: PVL (check the plan) + EVL (independently re-run tests) | Per-skill | None automatic | None |
| Multi-tool support | 7 tools via `AGENTS.md` + `SKILL.md` open standards | Claude Code plugin | 14 runtimes | 1 tool |
| Auto-improving knowledge | Topic-grouped knowledge, updated after every feature | Plugin memory | Disk-persisted state | Manual |
| Team collaboration | Shared plans, specs, and review files | Solo | Solo | Solo |
| Skills system | 33 auto-discovered, keyword-matched at every prompt | 86 composable skills | Meta-prompting | 23 role tools |
| Large multi-phase projects | Umbrella plans + per-phase inner loop with regression checks | Single task | Single task | Single task |
| Hands-free mode | Autopilot (3 lanes) + standing `/goal` consent | Manual | Manual | Manual |
| Installation | 30s `curl` + auto-routed setup | Plugin marketplace | npx one-liner | git clone |

> **On runtime breadth:** GSD supports 14 runtimes. We support 7 deeply, with full agent harnesses, skill discovery, and lifecycle hooks on every platform. Breadth vs. depth: your choice.

---

## ⚡ What Makes This Different

- 🔒 **Step-Locked Tool Restrictions.** Your agent literally cannot write code during research. RESEARCH is read-only, INNOVATE has no Write, PLAN/VALIDATE write only to `process/`. Real capability limits, not just suggestions.
- 🎯 **The Lead Agent Never Touches Code.** The coordinator routes, monitors, and drives loops; it never edits source files or runs tests itself. Every edit and every test run happens inside a dedicated sub-agent. No hidden work.
- 🔍 **Automatic Skill Discovery.** Before handling any request, it scans 33 skills and matches keywords. Say "add webhook support" and `vc-security` + `vc-scenario` are pulled in automatically.
- 💾 **Survives Session Resets.** Plans, reports, knowledge docs, and learnings all live on disk. The startup hook restores approval gates after a session reset. Nothing is lost.
- 🛡️ **Self-Policing Step Guard.** When the agent is about to skip a required step, it stops itself: "PHASE JUMPING PREVENTED." A built-in guard against shortcuts.
- 🔄 **Works Across 7 AI Coding Tools.** Two open standards, `AGENTS.md` and `SKILL.md`, mean zero adapters, zero plugins. Start in Claude Code, switch to Cursor, continue in Codex.
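The step-locked restrictions amount to a per-phase write policy. The following Python sketch illustrates that idea only; `may_write` is a hypothetical helper, not part of the kit, and treating every phase not named in the restriction (such as EXECUTE) as unrestricted is an assumption made here.

```python
from pathlib import PurePosixPath

# Phases that may not write anywhere (RESEARCH is read-only, INNOVATE has no Write).
NO_WRITE = {"RESEARCH", "INNOVATE"}
# Phases that may write only under process/.
PROCESS_ONLY = {"PLAN", "VALIDATE"}


def may_write(phase: str, path: str) -> bool:
    """Return True if `phase` is allowed to write to the repo-relative `path`."""
    phase = phase.upper()
    if phase in NO_WRITE:
        return False
    if phase in PROCESS_ONLY:
        parts = PurePosixPath(path).parts
        # Must sit under process/ and must not escape it with "..".
        return bool(parts) and parts[0] == "process" and ".." not in parts
    # Assumption: phases outside the stated restrictions are unrestricted here.
    return True
```

A check like `may_write("PLAN", "src/app.ts")` returns `False`, while `may_write("PLAN", "process/plans/webhooks.md")` returns `True`; both paths are made-up examples.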

---

## 🧭 How It Works – The Coordinator

Your main session is a **coordinator** (called the orchestrator), not a worker. It does four things and nothing else:

```
Your request
  → Step 0: Skill Discovery (scan 33 skills, match keywords, attach candidates)
  → Detect intent (feature / bug / question / refactor / UI) + score ambiguity
  → Route to the right agent in a fresh context window
  → Monitor: step compliance, status codes, loop driving
```

- 🧑‍✈️ **It delegates, never implements.** Research → `vc-research-agent`. Plan → `vc-plan-agent`. Code → `vc-execute-agent`. The coordinator hands off the right context and waits; it never does the actual work itself.
- 🚫 **No hidden execution, ever.** The moment a plan with an agreed checklist exists, "ENTER EXECUTE MODE" always launches `vc-execute-agent`. Even a one-line fix goes through it. Tests run only inside a dedicated `vc-tester`. This holds regardless of change size.
- 📨 **Clear status codes, not vague signals.** Every sub-agent ends with one of: DONE · DONE_WITH_CONCERNS · BLOCKED · NEEDS_CONTEXT. The coordinator never ignores a blocker and never retries the same blocked approach three times.
- 🔁 **It drives the fix loops.** Sub-agents run once, report a result, and stop. Only the coordinator re-launches them. It drives both the PVL (plan-check-fix) and EVL (test-check-fix) loops and owns all tracking.
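To make the status-code contract concrete, here is a small Python sketch of how a coordinator might map each code to a follow-up. The four codes come from the text above; the `next_action` function, its action strings, and the reading of "never retries three times" as "at most two retries of the same approach" are assumptions for illustration.

```python
from enum import Enum


class Status(Enum):
    """The four terminal status codes a sub-agent can report."""
    DONE = "DONE"
    DONE_WITH_CONCERNS = "DONE_WITH_CONCERNS"
    BLOCKED = "BLOCKED"
    NEEDS_CONTEXT = "NEEDS_CONTEXT"


def next_action(status: Status, retries_so_far: int = 0) -> str:
    """Pick the coordinator's follow-up for one sub-agent result.

    `retries_so_far` counts earlier re-launches of the same blocked approach.
    """
    if status is Status.DONE:
        return "advance"
    if status is Status.DONE_WITH_CONCERNS:
        return "advance_and_record_concerns"
    if status is Status.NEEDS_CONTEXT:
        return "supply_context_and_relaunch"
    # BLOCKED is never ignored; a third retry of the same approach is refused.
    if retries_so_far >= 2:
        return "change_approach_or_escalate"
    return "retry_same_approach"
```

Keeping the decision in the coordinator matches the rule that sub-agents run once and stop: only the caller decides whether anything is re-launched.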

```mermaid
%%{init: {'theme': 'base', 'themeVariables': {'fontSize': '14px', 'lineColor': '#8888AA'}} }%%
flowchart TD
REQ["User request"]
SD["Step 0: Skill Discovery\nscan 33 skills, match keywords"]
INT{"Detect intent\n+ score ambiguity"}
RES["vc-research-agent\n(feature / question)"]
PLN["vc-plan-agent\n(after INNOVATE)"]
VAL["vc-validate-agent\n(PVL loop)"]
EXE["vc-execute-agent\n(EXECUTE)"]
TST["vc-tester\n(EVL loop)"]
UPD["vc-update-process-agent\n(closeout)"]
MON["Monitor\nstatus codes · loop driving\nno inline execution ever"]

REQ --> SD --> INT
INT -->|"feature / research"| RES
INT -->|"plan phase"| PLN
INT -->|"validate phase"| VAL
INT -->|"execute phase"| EXE
EXE -.->|"EVL"| TST
INT -->|"update phase"| UPD
RES --> MON
PLN --> MON
VAL --> MON
TST --> MON
UPD --> MON

style REQ fill:#1565C0,stroke:#0D47A1,color:#FFFFFF
style SD fill:#0277BD,stroke:#01579B,color:#FFFFFF
style INT fill:#E65100,stroke:#BF360C,color:#FFFFFF
style RES fill:#2E7D32,stroke:#1B5E20,color:#FFFFFF
style PLN fill:#558B2F,stroke:#33691E,color:#FFFFFF
style VAL fill:#558B2F,stroke:#33691E,color:#FFFFFF
style EXE fill:#C62828,stroke:#B71C1C,color:#FFFFFF
style TST fill:#6A1B9A,stroke:#4A148C,color:#FFFFFF
style UPD fill:#00695C,stroke:#004D40,color:#FFFFFF
style MON fill:#37474F,stroke:#263238,color:#FFFFFF
```

> **Why this matters:** an agent that can both decide *and* secretly edit will find ways to skip the plan. By separating the coordinator from the workers (sub-agents), the process becomes structurally honest: the only way to write code is to go through the required steps.

---

## 📊 The RIPER-5 Lifecycle

| Phase | What happens | Agent | You say |
|-------|-------------|-------|---------|
| 🔍 **RESEARCH** | Read-only fact gathering across the codebase and the web. Never modifies files. | `vc-research-agent` | *(auto on feature requests)* |
| 📝 **SPEC** | Product-discovery requirements doc (user stories, acceptance criteria, out-of-scope) for **your review before any design**. | `vc-spec-agent` | `go` / `ENTER SPEC MODE` |
| 💡 **INNOVATE** | Explore 2-3 approaches with trade-offs. Decision summary (chosen + rejected + why). | `vc-innovate-agent` | `go` |
| 📋 **PLAN** | Write the detailed spec: touchpoints, public contracts, which files it can touch, verification evidence, resume handoff. | `vc-plan-agent` | `go` |
| ✅ **VALIDATE** | Turn the plan into an agreed checklist (V1–V7 checkpoints). Verdict: **PASS / CONDITIONAL / BLOCKED**. Runs the PVL loop. | `vc-validate-agent` | `ENTER VALIDATE MODE` |
| ⚡ **EXECUTE** | Implement *exactly* the plan. Progress notes to the phase report, deviation protocol, self-review. Then the EVL loop re-runs the checkpoints. | `vc-execute-agent` | `ENTER EXECUTE MODE` |
| 🧠 **UPDATE PROCESS** | Capture learnings, update context, archive plan, write closeout packet. | `vc-update-process-agent` | *(recommended after non-trivial work)* |

> 📝 **Why SPEC is its own phase:** most harnesses jump from "understand" to "design." Inserting a product-discovery SPEC step means *you* (or your PM) sign off on **what** is being built, in plain user stories and acceptance criteria, *before* the agent debates **how**. It is the cheapest possible place to catch a misunderstanding. (Inside a phase program's inner loop, SPEC is skipped; the umbrella SPEC governs all phases.)
>
> **The SPEC is the measuring stick.** It states the expected behavior in simple terms you can scan in a minute. Every phase after it (Innovate, Plan, Validate, Execute) checks back against it and asks the same question: *is what we are building actually what you asked for?* When the work starts to drift, the SPEC is what catches it.

```mermaid
flowchart TD
U["You: what I really want
(plain words)"] --> S["📝 SPEC
expected behavior + acceptance
criteria, you approve"]
S --> I["💡 INNOVATE"]
S --> P["📋 PLAN"]
S --> V["✅ VALIDATE"]
S --> E["⚡ EXECUTE"]
I -.->|"check back"| Q{"Is this what
you asked for?"}
P -.->|"check back"| Q
V -.->|"check back"| Q
E -.->|"check back"| Q
Q -->|yes| GO["keep building"]
Q -->|no| S
```


### 💻 Example sessions

```
# 🆕 Feature request
You: "add webhook support to the API"
  → Skill discovery surfaces: vc-scenario, vc-security
  → research-agent gathers context (read-only, can't touch code)
  → "go" → spec-agent writes requirements doc → you approve
  → "go" → innovate-agent compares approaches → decision summary
  → "go" → plan-agent writes the plan, listing which files it will touch
  → "ENTER VALIDATE MODE" → validate-agent gates the plan (PVL loop) → Gate: PASS
  → "ENTER EXECUTE MODE" → execute-agent implements → tester re-runs gates (EVL) → reviewer → git-manager
  → Closeout packet: what changed, what's verified, recommended next step
```

```
# 🐛 Bug fix
You: "login redirect is broken"
  → Routes to vc-debugger → gathers evidence FIRST → 2-3 competing hypotheses
  → Systematically eliminates each → root cause with proof chain
  → execute-agent implements the fix → EVL re-test → quality pipeline
```

```
# ⏩ Fast mode
You: "ENTER FAST MODE - add rate limiting middleware"
  → Compressed RESEARCH + SPEC + INNOVATE + PLAN + VALIDATE in one pass
  → Mandatory safety pause after VALIDATE → you review → "ENTER EXECUTE MODE"
```

```
# 🤖 Autopilot (hands-free)
You: "autopilot full: build a notifications system"
  → ONE consolidated clarification round → provisional /goal block (standing consent)
  → Drives the full RIPER-5 sequence autonomously, pausing only on hard stops
```

```
# 🏗️ Large program
You: "build a full testing platform"
  → Umbrella plan + phase plans in a feature folder
  → Each phase inner loop: research → innovate → plan → PVL → execute → EVL → update
  → Progress survives context compaction: durable reports on disk
```

---

## 🎯 Intent Clarification

Before routing, the lead agent scores your request's ambiguity on **4 binary signals (0–4)** and picks a tier. It asks questions *only when they would actually change what it does.*

| Tier | When | Behavior |
|---|---|---|
| **Tier 0**: silent auto-route | Score 0–1, or you said "go" / "just do it", or resuming a plan | Routes immediately, no questions |
| **Tier 1**: inline summary | Score 2 | States its understanding + chosen route in one line, then proceeds |
| **Tier 2**: questions | Score 3+ | Asks focused multiple-choice questions before routing |

> 🧠 **Two rounds max.** If still unclear after Tier 2, it asks one final plain question, then defaults to research with the narrowest reasonable scope. It never loops clarification forever. After RESEARCH, it re-checks intent: if research shows the request was different from what was assumed, it re-presents; if confirmed, it proceeds without re-asking.
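The tier table reduces to a few comparisons on the score. This Python sketch shows one way to express it; `pick_tier` is a hypothetical name, and because the README does not list the four signals, they are passed in as anonymous booleans. The `auto_skip` flag stands in for cases such as "go" or resuming a plan.

```python
from typing import Sequence


def pick_tier(signals: Sequence[bool], auto_skip: bool = False) -> int:
    """Map four binary ambiguity signals to a clarification tier (0, 1, or 2)."""
    if len(signals) != 4:
        raise ValueError("expected exactly 4 binary signals")
    score = sum(bool(s) for s in signals)  # ambiguity score in the range 0..4
    if auto_skip or score <= 1:
        return 0  # silent auto-route
    if score == 2:
        return 1  # one-line summary, then proceed
    return 2      # ask focused questions before routing
```

Note that `auto_skip` wins even at the maximum score, which mirrors the "or you said go" clause in the Tier 0 row.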

```mermaid
%%{init: {'theme': 'base', 'themeVariables': {'fontSize': '14px', 'lineColor': '#8888AA'}} }%%
flowchart TD
REQ["User request"]
SCORE{"ambiguity score\n0–4 binary signals"}
AUTO["Auto-skip conditions\n(go / continue / mid-phase\n/ trivial / explicit mode\n/ resuming plan / pure info)"]
T0["Tier 0\nsilent auto-route\n(score 0–1 OR auto-skip)"]
T1["Tier 1\ninline summary\n(score 2)"]
T2["Tier 2\nask focused questions\n(score 3+)"]
ROUTE["Route to matching agent\n(research / plan / execute / …)"]
STILL{"still unclear\nafter Tier 2?"}
FINAL["One final plain question\n(max 2 clarification rounds)"]
NARROW["Default: vc-research-agent\nnarrowest reasonable scope"]

REQ --> AUTO
AUTO -->|"auto-skip matches"| T0
AUTO -->|"no auto-skip"| SCORE
SCORE -->|"0–1"| T0
SCORE -->|"2"| T1
SCORE -->|"3+"| T2
T0 --> ROUTE
T1 --> ROUTE
T2 --> STILL
STILL -->|"resolved"| ROUTE
STILL -->|"still unclear"| FINAL --> NARROW

style REQ fill:#1565C0,stroke:#0D47A1,color:#FFFFFF
style AUTO fill:#0277BD,stroke:#01579B,color:#FFFFFF
style SCORE fill:#E65100,stroke:#BF360C,color:#FFFFFF
style T0 fill:#2E7D32,stroke:#1B5E20,color:#FFFFFF
style T1 fill:#558B2F,stroke:#33691E,color:#FFFFFF
style T2 fill:#6A1B9A,stroke:#4A148C,color:#FFFFFF
style ROUTE fill:#C62828,stroke:#B71C1C,color:#FFFFFF
style STILL fill:#F57F17,stroke:#F9A825,color:#000000
style FINAL fill:#AD1457,stroke:#880E4F,color:#FFFFFF
style NARROW fill:#00695C,stroke:#004D40,color:#FFFFFF
```

---

## ✅ The Two Quality Loops – PVL + EVL

Most harnesses check *once*, if at all. This one wraps EXECUTE in **two independent loops**: one before code is written, one after.

```mermaid
%%{init: {'theme': 'base', 'themeVariables': {'fontSize': '15px', 'lineColor': '#8888AA'}} }%%
flowchart TD
PLAN["📋 PLAN written"]
PVL{"✅ PVL\nValidate the plan\nV1–V7 gates"}
PASS["Gate: PASS"]
SUPP["📝 plan-agent supplements\n(addresses gaps)"]
EXEC["⚡ EXECUTE\nimplement the plan"]
EVL{"🧪 EVL\ntester re-runs the\nvalidate-contract gates"}
FIX["⚡ execute-agent\nsupplement (fix gate)"]
DONE["🧠 UPDATE PROCESS"]

PLAN --> PVL
PVL -->|"CONDITIONAL / BLOCKED"| SUPP
SUPP -->|"re-run from V1"| PVL
PVL -->|"PASS"| PASS
PASS -->|"ENTER EXECUTE"| EXEC
EXEC --> EVL
EVL -->|"gate fails"| FIX
FIX -->|"re-run"| EVL
EVL -->|"all gates green"| DONE

style PLAN fill:#2E7D32,stroke:#1B5E20,color:#FFFFFF
style PVL fill:#558B2F,stroke:#33691E,color:#FFFFFF
style PASS fill:#00695C,stroke:#004D40,color:#FFFFFF
style SUPP fill:#E65100,stroke:#BF360C,color:#FFFFFF
style EXEC fill:#C62828,stroke:#B71C1C,color:#FFFFFF
style EVL fill:#6A1B9A,stroke:#4A148C,color:#FFFFFF
style FIX fill:#AD1457,stroke:#880E4F,color:#FFFFFF
style DONE fill:#00695C,stroke:#004D40,color:#FFFFFF
```

**📋 PVL (Plan-Validate-Fix)**

Before EXECUTE, `vc-validate-agent` runs the plan through V1–V7 checkpoints, splitting the work across several agents to cover infra, test coverage, breaking changes, security, and per-section feasibility. A first-pass CONDITIONAL or BLOCKED is never the end: it routes back to `vc-plan-agent` to update the plan, then re-checks from V1.

Tracked by `vc-autoresearch` (domain: plan), a find-gaps-and-fix loop with a 10-cycle cap and plateau detection. Only Gate: PASS (or a CONDITIONAL you explicitly accept) unlocks EXECUTE.

**🧪 EVL (Execute-Validate-Fix)**

After EXECUTE reports done, even when it claims all checkpoints are green, the lead agent always spawns `vc-tester` to independently re-run the exact agreed-checklist test commands. A failing checkpoint routes to a scoped `vc-execute-agent` fix, then re-tests.

Tracked by `vc-autoresearch` (domain: tests) with a 10-cycle cap. The execute-agent's own internal "iterate until green" loop never substitutes for this independent confirmation.

> 💎 **The verdict ladder:** **PASS** → proceed · **CONDITIONAL** → fixable gaps; the loop fires (or you accept them on record) · **BLOCKED** → a deeper problem; returns to PLAN (under autopilot: the gap goes to a backlog and the run continues).
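
The verdict ladder can be read as a small decision function. The sketch below is illustrative only; `next_step` and the action strings it returns are hypothetical, not identifiers from the kit.

```python
def next_step(verdict: str, autopilot: bool = False, accepted: bool = False) -> str:
    """Map a gate verdict to the next action on the verdict ladder."""
    if verdict == "PASS":
        return "proceed"
    if verdict == "CONDITIONAL":
        # Fixable gaps: the fix loop fires unless the user accepts them on record.
        return "proceed" if accepted else "run-fix-loop"
    if verdict == "BLOCKED":
        # Under autopilot the gap is parked in a backlog and the run continues.
        return "backlog-and-continue" if autopilot else "return-to-plan"
    raise ValueError(f"unknown verdict: {verdict!r}")
```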

### 🔁 vc-autoresearch: Shared Loop Engine

Both PVL and EVL use the same tracking layer: **`vc-autoresearch`**, a find-gaps → fix → repeat loop. The lead agent drives the loop: it owns the round counter, per-round reports, TSV log, and plateau/cap/regression checks. Worker agents are fire-and-forget: they return a result and stop. No agent re-spawns itself or spawns another phase agent.

The same engine can run on its own ("harden this spec", "fix all lint", "improve test coverage", "improve these docs") for any repeated find-gaps-and-fix task across 6 domains (spec · tests · ux · docs · plan · errors).

```mermaid
%%{init: {'theme': 'base', 'themeVariables': {'fontSize': '14px', 'lineColor': '#8888AA'}} }%%
flowchart TD
START["Start loop\n(orchestrator init results.tsv\nheader + baseline row)"]
FIND["Find gaps\n(validate-agent / tester\nreturns gap list)"]
RPT["Write iteration report\n{slug}-iteration-{NNN}_REPORT_{dd-mm-yy}.md"]
TSV["Append results.tsv row\n(cycle N, gap count)"]
FIX["Fix gaps\n(plan-agent supplement\nOR execute-agent supplement)"]
CHK["Guard checks\nplateau? cap hit? regression?"]
RECHECK["Re-check\n(re-run validate / tester)"]
SUCC["SUCCESS\nall-clear 2 consecutive rounds"]
HALT["HALT\nplateau / 10-cycle cap / regression"]

START --> FIND --> RPT --> TSV --> FIX --> CHK
CHK -->|"safe to continue"| RECHECK
RECHECK --> FIND
CHK -->|"plateau / cap / regression"| HALT
FIND -->|"no gaps found (×2)"| SUCC

style START fill:#1565C0,stroke:#0D47A1,color:#FFFFFF
style FIND fill:#E65100,stroke:#BF360C,color:#FFFFFF
style RPT fill:#558B2F,stroke:#33691E,color:#FFFFFF
style TSV fill:#2E7D32,stroke:#1B5E20,color:#FFFFFF
style FIX fill:#C62828,stroke:#B71C1C,color:#FFFFFF
style CHK fill:#6A1B9A,stroke:#4A148C,color:#FFFFFF
style RECHECK fill:#0277BD,stroke:#01579B,color:#FFFFFF
style SUCC fill:#00695C,stroke:#004D40,color:#FFFFFF
style HALT fill:#B71C1C,stroke:#7F0000,color:#FFFFFF
```

| Mode | Does | Stops when |
|---|---|---|
| `vc-autoresearch` (core) | find gaps → fix → repeat | no gaps found OR metric goal hit |
| `vc-autoresearch:probe` | 8 personas interrogate the corpus until saturation | no new constraints for 3 rounds |
| `vc-autoresearch:reason` | adversarial debate with blind judges | judges converge or iteration cap |
| `vc-autoresearch:evals` | analyze TSV results: trends, plateaus, recommendations | analysis only |

**Stop conditions:** SUCCESS (all-clear two rounds in a row) · HALT_PLATEAU (no progress for 3 rounds) · HALT_CAP (10-round hard limit) · HALT_REGRESSION (a check that was passing now fails).
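
To make the four stop conditions concrete, here is a minimal sketch of a guard the lead agent could evaluate after each round. It is not the kit's code. `stop_reason` is a hypothetical name, and the plateau rule (the gap count never drops below its level from three rounds earlier) is one possible reading of "no progress for 3 rounds".

```python
from typing import Optional

CAP = 10            # HALT_CAP: hard limit on rounds
PLATEAU_ROUNDS = 3  # HALT_PLATEAU: rounds without progress
CLEAR_ROUNDS = 2    # SUCCESS: consecutive all-clear rounds


def stop_reason(gap_counts: list[int], regression: bool = False) -> Optional[str]:
    """Return a stop code, or None when the loop may continue.

    gap_counts holds the gaps remaining after each completed round,
    oldest first (baseline row excluded).
    """
    if regression:
        return "HALT_REGRESSION"
    recent = gap_counts[-CLEAR_ROUNDS:]
    if len(recent) == CLEAR_ROUNDS and all(count == 0 for count in recent):
        return "SUCCESS"
    if len(gap_counts) > PLATEAU_ROUNDS:
        window = gap_counts[-(PLATEAU_ROUNDS + 1):]
        # Assumed plateau rule: none of the last three rounds beat the round before them.
        if min(window[1:]) >= window[0]:
            return "HALT_PLATEAU"
    if len(gap_counts) >= CAP:
        return "HALT_CAP"
    return None
```

For example, `stop_reason([5, 3, 3, 3, 3])` reports a plateau, while `stop_reason([4, 0, 0])` reports success.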

---

## 👥 Strategy Compare + Model Policy

At **every phase transition**, the lead agent invokes `vc-agent-strategy-compare` to recommend *how* to run the next phase, with cost estimates.

| Strategy | When | Coordination |
|---|---|---|
| **Sequential** | Work depends on prior output | One agent at a time |
| **Parallel subagents** | Independent dimensions, fire-and-forget | None: lead agent collects + combines results |
| **Workflow** | Predictable splitting of work across a list | Scripted steps |
| **Agent team** | Agents must talk to each other mid-run (e.g. each touches separate files across 3+ phase plans) | TeamCreate + shared task list + SendMessage |

```mermaid
%%{init: {'theme': 'base', 'themeVariables': {'fontSize': '14px', 'lineColor': '#8888AA'}} }%%
flowchart TD
SS["vc-agent-strategy-compare\n(every phase transition)"]
SC{"signal score\n0–7"}
SEQ["Sequential\none agent at a time\n(output feeds next)"]
PAR["Parallel subagents\nfire-and-forget\n(independent dimensions)"]
WF["Workflow\nscripted steps\nacross a list"]
TEAM["Agent team\nTeamCreate + TaskCreate\n+ SendMessage\n(must coordinate mid-run)"]
MC{"which phase?"}
OPUS["🔴 opus\n(EXECUTE only)"]
SONNET["🔵 sonnet\n(every other phase)"]

SS --> SC
SC -->|"low / dependent"| SEQ
SC -->|"mid / independent"| PAR
SC -->|"predictable split"| WF
SC -->|"high / must coordinate\nor 3+ phase plans"| TEAM
SS --> MC
MC -->|"EXECUTE / fast-mode\nquick-fix (real code)"| OPUS
MC -->|"Research / Spec\nInnovate / Plan\nValidate / Update\nall reviewers"| SONNET

style SS fill:#1565C0,stroke:#0D47A1,color:#FFFFFF
style SC fill:#E65100,stroke:#BF360C,color:#FFFFFF
style SEQ fill:#2E7D32,stroke:#1B5E20,color:#FFFFFF
style PAR fill:#558B2F,stroke:#33691E,color:#FFFFFF
style WF fill:#00695C,stroke:#004D40,color:#FFFFFF
style TEAM fill:#6A1B9A,stroke:#4A148C,color:#FFFFFF
style MC fill:#0277BD,stroke:#01579B,color:#FFFFFF
style OPUS fill:#C62828,stroke:#B71C1C,color:#FFFFFF
style SONNET fill:#37474F,stroke:#263238,color:#FFFFFF
```

> ⚠️ **"Agent team" means the real machinery**: named teammates, a shared task list, and inter-agent messaging, *not* bare parallel agents called a "team." It is **required** (not optional) for creating 3+ phase plans and for multi-file edits where agents must each stay in their own files. Only a true team can communicate while running.

### 🧮 Model selection policy

| Phase | Model | Why |
|---|---|---|
| **EXECUTE** (+ fast-mode, quick-fix doing real code) | 🔴 **opus** | Real source edits, builds, migrations |
| Research · Spec · Innovate · Plan · Validate · Update · all reviewers/researchers | 🔵 **sonnet** | Planning and analysis: cheaper, plenty capable |

> When work is split across several agents, only the *coding* agent uses opus. Every reviewer, researcher, validator, and planner uses sonnet. The lead agent names the model each time it spawns a worker.

---

## 🤖 Autopilot Mode: Hands-Free RIPER-5

Say **`autopilot [task]`** (or `run autopilot`, `autonomous mode`, `ENTER AUTOPILOT MODE`) and the agent runs the *entire* remaining RIPER-5 sequence with **one** clarification round up front, then no more pauses until it is done.

**Trigger anywhere:** autopilot can start at the beginning of a session *or* at any point mid-session. On trigger, the lead agent reads the saved files on disk to figure out which RIPER-5 phase you are already in, then picks up from there and drives the rest on its own.

| On-disk state | Entry phase |
|---|---|
| No SPEC file | Start at RESEARCH |
| SPEC file present | Skip to post-SPEC (INNOVATE) |
| Plan file present | Skip to post-PLAN (VALIDATE) |
| Validate-contract with PASS/CONDITIONAL | Skip to EXECUTE |
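
The table above amounts to a lookup from on-disk state to entry phase. A minimal sketch follows; `entry_phase` and its arguments are hypothetical, and the precedence (the most advanced artifact wins) is an assumption the README does not spell out.

```python
from typing import Optional


def entry_phase(has_spec: bool, has_plan: bool,
                contract_verdict: Optional[str] = None) -> str:
    """Pick the RIPER-5 phase to resume from, based on saved files."""
    # Assumption: check the most advanced artifact first.
    if contract_verdict in ("PASS", "CONDITIONAL"):
        return "EXECUTE"
    if has_plan:
        return "VALIDATE"
    if has_spec:
        return "INNOVATE"
    return "RESEARCH"
```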

```mermaid
%%{init: {'theme': 'base', 'themeVariables': {'fontSize': '14px', 'lineColor': '#8888AA'}} }%%
flowchart TD
TRG["autopilot trigger phrase\n(anywhere in session)"]
DISK["Read saved files on disk\ndetect current phase"]
CLR["ONE consolidated\nclarification round"]
GOAL["Emit provisional /goal block\n(≤4000 chars, copy-pasteable\nstanding EXECUTE consent)"]
ACT["AUTOPILOT_ACTIVATED\nautonomous run begins"]
PHASES["Drive remaining phases\nRESEARCH → … → UPDATE PROCESS\n(no user pauses)"]
HS1["🛑 Irreversible / outward action\nnot pre-approved"]
HS2["⛔ Cascade BLOCKED\n(several phases stuck)"]
HS3["💸 Live-provider billed probe\n(double opt-in required)"]
DONE["Run complete\nautopilot deactivates"]

TRG --> DISK --> CLR --> GOAL --> ACT --> PHASES
PHASES -->|"hard stop 1"| HS1
PHASES -->|"hard stop 2"| HS2
PHASES -->|"hard stop 3"| HS3
PHASES -->|"all phases done"| DONE

style TRG fill:#1565C0,stroke:#0D47A1,color:#FFFFFF
style DISK fill:#0277BD,stroke:#01579B,color:#FFFFFF
style CLR fill:#E65100,stroke:#BF360C,color:#FFFFFF
style GOAL fill:#2E7D32,stroke:#1B5E20,color:#FFFFFF
style ACT fill:#558B2F,stroke:#33691E,color:#FFFFFF
style PHASES fill:#C62828,stroke:#B71C1C,color:#FFFFFF
style HS1 fill:#B71C1C,stroke:#7F0000,color:#FFFFFF
style HS2 fill:#B71C1C,stroke:#7F0000,color:#FFFFFF
style HS3 fill:#B71C1C,stroke:#7F0000,color:#FFFFFF
style DONE fill:#00695C,stroke:#004D40,color:#FFFFFF
```

```
You: "autopilot full: add team invitations with email + role management"
→ Reads saved files → detects current phase → enters there
→ ONE consolidated clarification round (scope, hard stops, autonomy boundaries, first-phase strategy)
→ Provisional /goal block emitted (≤4000 chars, copy-pasteable, standing EXECUTE consent)
→ AUTOPILOT_ACTIVATED → drives remaining phases on its own
→ Stops ONLY for hard stops
```

### Three lanes: match ceremony to risk

| Lane | Trigger | Flow |
|---|---|---|
| 🟢 **quick** | `autopilot quick: [task]` | Scout → edit → scoped check. No plan, no contract, no EVL. |
| 🟡 **fast** | `autopilot fast: [task]` | Compressed R→S→I→P→V → EXECUTE + EVL. |
| 🔴 **full** | `autopilot [task]` / `autopilot full:` | Complete RIPER-5 (default). |
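
For illustration, the three lane triggers could be recognized with one regular expression. This sketch covers only the `autopilot ...` forms in the table, not the other trigger phrases. `parse_trigger` is a hypothetical helper, and requiring the colon after a lane name is an assumption taken from the table's examples.

```python
import re
from typing import Optional

# A lane name counts only when followed by a colon, as in the table above.
_TRIGGER = re.compile(
    r"^autopilot\b(?:\s+(quick|fast|full)\s*:)?\s*(.*)$",
    re.IGNORECASE | re.DOTALL,
)


def parse_trigger(message: str) -> Optional[tuple[str, str]]:
    """Return (lane, task), or None if the message is not an autopilot trigger."""
    match = _TRIGGER.match(message.strip())
    if match is None:
        return None
    lane = (match.group(1) or "full").lower()  # no lane named: default to full
    return lane, match.group(2).strip()
```

With this reading, `autopilot quick fix the header` (no colon) is treated as a full-lane task named "quick fix the header" rather than a quick-lane run.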

### 🌙 Hands-Free: One Phrase, Built While You Sleep

Say `autopilot full: [task]` (or paste a `/goal` block) and the following all happen with **zero human input**:

- **Plan-check-and-fix loop**: finds gaps in the plan, fixes them, and re-checks. Up to 10 rounds on its own.
- **Build-test-and-fix loop**: writes code, runs tests, fixes failures, re-runs. Up to 10 rounds on its own. It never trusts its own "all green"; a separate checker (`vc-tester`) independently re-runs every test to confirm.
- **Phase-to-phase advancement**: moves from research to plan to code to done without waiting for you.
- **Picks up after a memory reset**: plans, reports, and progress all live as files on disk. After compaction (when the AI's short-term memory clears), the next session reads those files and continues exactly where it left off.
- **Stuck feature? Set it aside, keep going**: if one phase can't be resolved, the agent writes a backlog note and moves on to the next feature. You can run many features in parallel without one blocker stopping everything.
- **Teams of agents for parallel features**: multiple agents can build separate features at the same time, each locked to its own files so they never collide. A stuck feature is parked, not a blocker for the rest.

### Hard stops always surface (even on autopilot)

These are the **only three times it stops and asks you**:

- 🛑 Anything it cannot undo, or that reaches the outside world and was not pre-approved (going live, sending real messages, charging money)
- ⛔ Several phases in a row get stuck with no progress: a real dead-end worth your eyes
- 💸 A test that would spend real money on a paid outside service; it asks before running

---

### 🎯 /goal: the autonomous run token

**Required, not decoration:** after every VALIDATE phase completes, the lead agent *must* emit a copy-pasteable `/goal` block before EXECUTE starts. This is a required handoff file, not optional commentary.

**Format constraints:**

| Block type | Required fields | Hard limit |
|---|---|---|
| Post-VALIDATE block | SESSION GOAL · Charter+umbrella plan · Autonomy · Hard stop conditions · Next phase · Validate contract · Execute start | ≤ 4000 chars |
| Provisional (autopilot) block | SESSION GOAL · ENTRY PHASE · REMAINING PHASES · CLARIFICATIONS LOCKED · EXECUTE CONSENT · DECISION POLICY · HARD STOPS · TEST GATES · START (+ optional LANE) | ≤ 4000 chars |

The `/goal` command rejects blocks longer than 4000 characters. Keep it short: use the required fields as the structure, not a prose essay.

**Standalone /goal mode:** paste a `/goal` block into a new session and the run picks up from the phase named in `START`. Clarifications and decision rules are already set, so there is no new clarification round. Under a standing `/goal`, the agent decides on its own at every reversible step, sends BLOCKED items to a backlog, and writes its own reports, but **worker agent delegation stays mandatory.** Autopilot removes *approval pauses* only, never the no-inline-execution rule.

Validated by `validate-autopilot-goal-block.mjs`.
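
The two format constraints (required fields and the 4000-character limit) are simple to check mechanically. The sketch below is not `validate-autopilot-goal-block.mjs`, whose internals are not shown here. It assumes, hypothetically, that each field appears on its own line as `FIELD: value`; `check_goal_block` is an illustrative name.

```python
MAX_CHARS = 4000

# Required fields of the provisional (autopilot) block, from the table above.
PROVISIONAL_FIELDS = (
    "SESSION GOAL", "ENTRY PHASE", "REMAINING PHASES", "CLARIFICATIONS LOCKED",
    "EXECUTE CONSENT", "DECISION POLICY", "HARD STOPS", "TEST GATES", "START",
)


def check_goal_block(block: str, required=PROVISIONAL_FIELDS) -> list[str]:
    """Return a list of problems; an empty list means the block passes."""
    problems = []
    if len(block) > MAX_CHARS:
        problems.append(f"block is {len(block)} chars; limit is {MAX_CHARS}")
    # Assumed layout: one "FIELD: value" pair per line.
    labels = {
        line.split(":", 1)[0].strip().upper()
        for line in block.splitlines()
        if ":" in line
    }
    for field in required:
        if field not in labels:
            problems.append(f"missing field: {field}")
    return problems
```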

---

## 🔬 Feasibility Probes + The Validator Safety Net

### 🔬 Feasibility probes: test the assumption before building on it

When SPEC, INNOVATE, or VALIDATE hits a key assumption it cannot confirm by reading alone, it emits `VC-FEASIBILITY-PROBE-NEEDED` and stops. The lead agent spawns `vc-debugger` to run a real test and write a **VERDICT**:

| Verdict | Meaning |
|---|---|
| ✅ **VIABLE** | Assumption holds; the design may rely on it |
| ❌ **NOT-VIABLE** | Assumption is false; that approach is forbidden |
| ❓ **INCONCLUSIVE** | Couldn't prove it; carried forward as a known-gap |

Each verdict comes with a 3-part design note (**what the result allows · what it rules out · what is still uncertain**) that is fed word-for-word back into the paused phase. Probes are **cost-classed** (`cheap-local` / `needs-container` / `needs-live-provider` → double opt-in / `needs-browser` / `needs-cf`) so a billed or shared-resource probe never runs silently.

### 🛡️ 36 validators: mechanical correctness, not opinion

The kit ships **36 validator scripts** that turn "did the agent follow the rules?" into a clear pass/fail result. They run after any phase that touches harness files, and as required checkpoints in UPDATE PROCESS:

| Validator family | Checks |
|---|---|
| `vc-audit-vc` | Agent parity (Claude/Codex), skill registry, kit portability, agent frontmatter |
| `vc-audit-context` | Context routing, discovery frontmatter, skill keywords |
| `vc-audit-plans` | Plan inventory, umbrella state, phase completeness, phase reports, backlog notes |
| 14 VC-system behavior validators | Each owns a pass/fail fixture pair: strategy-compare output, closeout, intent-clarify, feasibility verdict, autoresearch log, and more |

---

## 🛡️ Built-in Safety Systems

These are not guidelines; they are **hard rules** built into every agent.

| | Rule | What it enforces |
|---|---|---|
| 📝 | **Progress Notes, Not Mid-Run Pauses** | During coding the agent writes progress notes to the phase report file as it works. No mid-run pause, no "continue or return?" prompt. If it hits a problem that needs a plan change, it stops and returns to PLAN. Otherwise it keeps going. |
| 🚫 | **Never Quietly Deviate** | If coding hits a problem that needs a plan change, the agent stops immediately, explains, and returns to PLAN. No silent improvising. |
| 🔐 | **Privacy Guardrails Hook** | The agent is blocked from reading `.env`, credentials, SSH keys, and `.pem` files without explicit approval. |
| ⚠️ | **High-Risk Evidence Packs** | For auth, billing, schema migrations, or public-API changes, the system requires a formal 5-file evidence pack before calling work "done"; always manual, never auto-bypassed. |
| 📨 | **Status-Code Discipline** | Worker agents must close with DONE / DONE_WITH_CONCERNS / BLOCKED / NEEDS_CONTEXT. Blockers are never ignored; correctness concerns become action items. |
| 📊 | **Closeout + Drift Scoring** | After coding, a closeout packet scores urgency: LOW (light touch) → MEDIUM (significant) → HIGH (harness/protocol files touched), and recommends the next safe step. |
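
As an illustration of the privacy guardrail above, a read hook could compare each requested file name against a deny list. This is a sketch under stated assumptions, not the kit's hook: the glob patterns and the `is_blocked` helper are hypothetical, and a real hook may cover more file types.

```python
from fnmatch import fnmatchcase
from pathlib import PurePosixPath

# Hypothetical deny list covering the file kinds the rule names.
SENSITIVE_PATTERNS = (
    ".env", ".env.*", "*.pem", "id_rsa*", "id_ed25519*", "credentials*",
)


def is_blocked(path: str, approved: bool = False) -> bool:
    """True when reading `path` should be refused without explicit approval."""
    name = PurePosixPath(path).name  # match on the file name only
    sensitive = any(fnmatchcase(name, pattern) for pattern in SENSITIVE_PATTERNS)
    return sensitive and not approved
```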

---

## 🔍 Pre-Implementation Intelligence

Before a single line of code is written, four specialists can catch issues:

| | Specialist | What it does |
|---|---|---|
| 🎭 | **5-Persona Debate** (`vc-predict`) | Architect, Security, Performance, UX, and Devil's Advocate debate your plan. Produces a GO / CAUTION / STOP verdict before you write a line. |
| 🎲 | **12-Dimension Edge Cases** (`vc-scenario`) | Decomposes a feature across 12 dimensions (user types, input extremes, timing, scale, state, env, errors, auth, data, integrations, compliance, business logic). Output doubles as test specs. |
| 🔐 | **STRIDE + OWASP Audit** (`vc-security`) | Dual-methodology security audit with dependency auditing, secret detection, and an auto-fix mode that sorts by severity and fixes Critical first with regression guards. |
| 🔬 | **Evidence-First Debugging** (`vc-debugger`) | Gathers evidence → forms 2-3 competing hypotheses → tests each → documents the elimination path. It proves rather than guesses. |

---

## ✅ Quality Pipeline: Built Into Execution

**Tests first, then code.** The agreed checklist (written before any code is touched) defines the exact tests that must pass. The execute-agent writes code until those tests go green. Then a separate checker, `vc-tester`, re-runs every test on its own to confirm. The execute-agent's own "all green" is never taken at face value. At the very end, the reviewer checks that the whole project still works together, not just the new piece.

The execute-agent does not just write code and call it done. It moves through a **quality pipeline** automatically:


```mermaid
%%{init: {'theme': 'base', 'themeVariables': {'fontSize': '16px', 'lineColor': '#8888AA'}} }%%
flowchart TD
E["⚡ Execute-Agent\nImplements the plan"]
SR["🔎 Self-Review\nLine-by-line check\nagainst plan"]
T["🧪 Tester (EVL)\nDiff-aware, re-runs\nthe contract gates"]
CR["🔍 Code Reviewer\nEdge case scout\n+ adversarial review"]
CS["✨ Code Simplifier\nClarity refactoring"]
GM["📦 Git Manager\nLogical commit splitting\nfrom touched_files"]

E --> SR --> T --> CR --> CS --> GM

style E fill:#C62828,stroke:#B71C1C,color:#FFFFFF
style SR fill:#AD1457,stroke:#880E4F,color:#FFFFFF
style T fill:#6A1B9A,stroke:#4A148C,color:#FFFFFF
style CR fill:#283593,stroke:#1A237E,color:#FFFFFF
style CS fill:#00695C,stroke:#004D40,color:#FFFFFF
style GM fill:#37474F,stroke:#263238,color:#FFFFFF
```

| Step | What it does |
|---|---|
| 🔎 **Self-review** | Checks every checklist item against the plan, records any deviation |
| 🧪 **Tester (EVL)** | Re-runs the agreed-checklist tests independently; maps changed files → test files, escalates to the full suite when >70% mapped |
| 🔍 **Code reviewer** | Sends an edge-case scout *before* review; checks N+1 queries, auth paths, data leaks |
| ✨ **Simplifier** | Tidies the code for clarity after review, with no behavior changes |
| 📦 **Git manager** | Receives `touched_files`, splits into logical conventional commits, refuses unknown files |
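
The tester's diff-aware selection can be sketched as follows. The table does not say what ">70% mapped" is measured against; this example assumes it means the mapped test files exceed 70% of the whole suite. `select_tests`, `test_map`, and `threshold` are hypothetical names, and the kit's real rule may differ.

```python
from typing import Iterable, Mapping, Sequence


def select_tests(changed_files: Iterable[str],
                 test_map: Mapping[str, Sequence[str]],
                 all_tests: Sequence[str],
                 threshold: float = 0.7) -> list[str]:
    """Pick the test files to run for a set of changed source files."""
    # Map each changed file to its tests; unmapped files contribute nothing.
    mapped = sorted({test for path in changed_files for test in test_map.get(path, ())})
    # Assumed escalation rule: once most of the suite is touched, run all of it.
    if all_tests and len(mapped) / len(all_tests) > threshold:
        return sorted(all_tests)
    return mapped
```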

---

## 📋 The Plan Lifecycle

Every non-trivial feature follows a **plan lifecycle**: a written spec that is created, reviewed, built against, and then archived as permanent project history.


```mermaid
%%{init: {'theme': 'base', 'themeVariables': {'fontSize': '16px', 'lineColor': '#8888AA'}} }%%
flowchart TD
A["🆕 Feature Request"]
B["📝 Plan Created\nin active/{slug}_{date}/"]
C{"👀 User Reviews\nthe Plan"}
D["✅ VALIDATE\n(PVL gates)"]
E["⚡ Execute + EVL"]
F["📦 Plan Archived\nto completed/"]
G["🧠 Learnings →\nall-context.md"]
H["🔄 Next Feature\nStarts Smarter"]

A --> B --> C
C -->|"✏️ Needs Changes"| B
C -->|"✅ Approved"| D --> E --> F --> G --> H
H -.->|"context compounds"| A

style A fill:#1565C0,stroke:#0D47A1,color:#FFFFFF
style B fill:#E65100,stroke:#BF360C,color:#FFFFFF
style C fill:#F57F17,stroke:#F9A825,color:#000000
style D fill:#558B2F,stroke:#33691E,color:#FFFFFF
style E fill:#C62828,stroke:#B71C1C,color:#FFFFFF
style F fill:#6A1B9A,stroke:#4A148C,color:#FFFFFF
style G fill:#00695C,stroke:#004D40,color:#FFFFFF
style H fill:#1565C0,stroke:#0D47A1,color:#FFFFFF
```

> 💡 Six months from now, when someone asks *"why did we build auth this way?"*, the answer is in `completed/`. Not lost in a Slack thread.

**Where plans live (task-folder convention):**

```
process/
├── general-plans/
│   ├── active/
│   │   └── webhooks_28-05-26/           # 📋 Task folder: plan + colocated reports/refs
│   │       └── webhooks_PLAN_28-05-26.md
│   ├── completed/                       # ✅ Archived (searchable history)
│   └── backlog/                         # 📌 Deferred work
└── features/
    └── billing/                         # 🏷️ Feature-scoped (5+ artifacts)
        ├── active/{slug}_{date}/
        ├── completed/
        └── backlog/
```

```mermaid
%%{init: {'theme': 'base', 'themeVariables': {'fontSize': '14px', 'lineColor': '#8888AA'}} }%%
flowchart TD
PC["vc-plan-agent creates task folder\nprocess/general-plans/active/{slug}_{date}/\nOR features/{feature}/active/{slug}_{date}/"]
PLAN["{slug}_PLAN_{dd-mm-yy}.md\n- plan file\n- Validate Contract appended here"]
REF["{slug}_REF_{dd-mm-yy}.md\n- optional references"]
RPT["{slug}-iteration-{NNN}_REPORT_{dd-mm-yy}.md\n- per PVL/EVL cycle report"]
TSV["results.tsv\n- rolling loop log\n(header + baseline + cycle rows)"]
PP["PHASE PROGRAM extras"]
UMB["umbrella_PLAN_{dd-mm-yy}.md\n- Program Goal Charter\n- /goal block"]
PHN["phase-N_PLAN_{dd-mm-yy}.md\n- one file per phase"]
REG["phase-blast-radius-registry.md\n- per-phase file ownership"]

PC --> PLAN
PC --> REF
PC --> RPT
PC --> TSV
PC --> PP
PP --> UMB
PP --> PHN
PP --> REG

style PC fill:#1565C0,stroke:#0D47A1,color:#FFFFFF
style PLAN fill:#2E7D32,stroke:#1B5E20,color:#FFFFFF
style REF fill:#558B2F,stroke:#33691E,color:#FFFFFF
style RPT fill:#E65100,stroke:#BF360C,color:#FFFFFF
style TSV fill:#C62828,stroke:#B71C1C,color:#FFFFFF
style PP fill:#6A1B9A,stroke:#4A148C,color:#FFFFFF
style UMB fill:#00695C,stroke:#004D40,color:#FFFFFF
style PHN fill:#00695C,stroke:#004D40,color:#FFFFFF
style REG fill:#37474F,stroke:#263238,color:#FFFFFF
```

> Every plan carries: 📍 **touchpoints** (files created/modified) · 📜 **public contracts** · 💥 **which files it can touch** (what could break, what to test) · ✅ **verification evidence** · 🔄 **resume handoff**. `vc-plan-discovery` finds the right plan to resume; the `post-write-plan-check` hook checks plan structure on every plan write.
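
The naming convention above is regular enough to generate. The sketch below builds the plan path and an iteration-report name from a slug and a date. `plan_path` and `report_name` are hypothetical helpers, and zero-padding `{NNN}` to three digits is an assumption.

```python
from datetime import date
from pathlib import PurePosixPath
from typing import Optional


def _stamp(day: date) -> str:
    return day.strftime("%d-%m-%y")  # dd-mm-yy, as in webhooks_28-05-26


def plan_path(slug: str, day: date, feature: Optional[str] = None) -> PurePosixPath:
    """Path of the plan file inside its task folder."""
    root = PurePosixPath("process")
    base = root / "features" / feature if feature else root / "general-plans"
    return base / "active" / f"{slug}_{_stamp(day)}" / f"{slug}_PLAN_{_stamp(day)}.md"


def report_name(slug: str, iteration: int, day: date) -> str:
    """File name of a per-cycle PVL/EVL report."""
    return f"{slug}-iteration-{iteration:03d}_REPORT_{_stamp(day)}.md"
```

Calling `plan_path("webhooks", date(2026, 5, 28))` reproduces the example path in the folder tree above.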

---

## 🏗️ Phase Programs: Large Projects That Don't Fall Apart

Normal features use one plan. **Large multi-phase projects** use a phase program: an umbrella plan plus per-phase plans, each running a full **7-step inner loop** with its own checkpoints and a saved report.


```mermaid
%%{init: {'theme': 'base', 'themeVariables': {'fontSize': '14px', 'lineColor': '#8888AA'}} }%%
flowchart TD
UP["🎯 Umbrella Plan\nProgram Goal Charter\n(north star · done · scope tiers)"]
P1["📋 Phase 1"]
P2["📋 Phase 2 ..."]

subgraph LOOP["🔁 Per-phase inner loop (skips SPEC)"]
direction TB
R["🔍 Research"]
I["💡 Innovate"]
PL["📋 Plan-supplement"]
PVL["✅ PVL"]
EX["⚡ Execute"]
EVL["🧪 EVL"]
UPD["🧠 Update"]
R --> I --> PL --> PVL --> EX --> EVL --> UPD
end

UP --> P1 --> LOOP
LOOP -.->|"learnings feed\nnext phase"| P2

style UP fill:#1565C0,stroke:#0D47A1,color:#FFFFFF
style P1 fill:#E65100,stroke:#BF360C,color:#FFFFFF
style P2 fill:#E65100,stroke:#BF360C,color:#FFFFFF
style R fill:#1565C0,stroke:#0D47A1,color:#FFFFFF
style I fill:#E65100,stroke:#BF360C,color:#FFFFFF
style PL fill:#2E7D32,stroke:#1B5E20,color:#FFFFFF
style PVL fill:#558B2F,stroke:#33691E,color:#FFFFFF
style EX fill:#C62828,stroke:#B71C1C,color:#FFFFFF
style EVL fill:#6A1B9A,stroke:#4A148C,color:#FFFFFF
style UPD fill:#00695C,stroke:#004D40,color:#FFFFFF
```

| | Feature | Why it matters |
|---|---|---|
| 🔄 | **Re-research at every phase** | Checks for code drift, reads latest reports, refreshes assumptions |
| ✅ | **Checkpoints per phase** | A phase is not done until evidence proves it. Honest status: `PLANNED → CODE DONE → TESTING → VERIFIED` or `BLOCKED` |
| 📄 | **Saved reports** | Every phase writes results to disk, so progress survives a memory reset |
| 🧠 | **Learnings feed forward** | Phase 1 discoveries update Phase 2's plan before coding starts |
| 🏗️ | **Foundation vs expansion** | Explicitly separates "prove the architecture" from "implement everything" |
| 🚧 | **Honest blocker handling** | Stuck phases stay `BLOCKED` with evidence. No faking a green status |


### 🔀 The program reshapes itself as it learns

The plan you write at the start is a rough map, not a fixed contract. As the program runs, it adjusts, so you do not have to predict every step in advance.

**It can add a new phase in the middle of a run.**
While working, the agent may discover a missing step: something that must happen before the next phase can proceed. When that happens, it inserts a new phase right there, renumbers the rest, and carries on. No human needed. (Internal signal: `MID_PROGRAM_PLAN_CREATED`; the new plan is written to disk and added to the registry automatically.)

**It can reorder phases.**
Research sometimes shows the planned order is wrong. For example, Phase 3 may depend on something only Phase 4 produces. The agent rearranges the remaining phases and records why. (Internal signal: `PHASE_RESTRUCTURE_NOTICE`, saved in the phase report as an audit trail, not a blocker.)

**It updates each phase's own plan right before coding it.**
Before any phase starts coding, a quick research pass reviews what the program has learned so far. It then updates that phase's checklist with new findings. This is called a **plan-supplement** step. Plans are never frozen; they absorb fresh facts from earlier phases.

**It skips work that cannot start yet.**
If a phase depends on something not yet ready (a service not yet built, a decision not yet made), the agent marks that phase as dependency-blocked, sets it aside, and moves on to the next one. The whole program does not stall because one phase is waiting.

**It knows when to stop and ask.**
A single stuck phase just gets parked in a backlog and the program continues. But if several phases in a row hit a wall with no progress, the agent treats that as a real dead-end, a **cascade stop**, and pauses to show you what happened. One stuck phase is normal. Several in a row signals something structural is wrong.

**It keeps a live scoreboard.**
Every program has a one-page status section in the umbrella plan showing which phase is current, whether it is done, and where the report lives. Anyone, including the agent itself after a memory reset, can read it and know exactly where things stand. It also keeps a simple file registry so two phases working at the same time never edit the same files.

**One big final check.**
At the end of the whole program, the agent runs an end-to-end test that the entire project still works together, not just each piece on its own. Individual phase checkpoints prove each part works; this final check proves the parts work as a whole.
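
The difference between parking one stuck phase and declaring a cascade stop can be sketched as a check on the most recent phase statuses. The README says only "several phases in a row", so the threshold of 3 below is an assumed value, and `should_cascade_stop` is a hypothetical name.

```python
def should_cascade_stop(statuses: list[str], threshold: int = 3) -> bool:
    """True when the latest `threshold` phases in a row all ended BLOCKED.

    statuses lists phase outcomes in run order, e.g. ["VERIFIED", "BLOCKED"].
    """
    trailing = 0
    for status in reversed(statuses):
        if status != "BLOCKED":
            break  # a phase that made progress ends the streak
        trailing += 1
    return trailing >= threshold
```

A single `BLOCKED` entry returns `False`, so the run parks that phase and continues; only an unbroken streak at the end of the list pauses the program.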

---

### 🧠 It Never Loses Its Place (Survives a Memory Reset)

Long jobs finish correctly, even when the AI's memory resets mid-way. The plan, the progress, and the proof all live in files on disk, not only in the agent's head.

AI agents have a limited working memory. On a long job that memory fills up and gets squeezed down, so details can blur. When a new session starts (or memory is cleared), the agent does not guess where it left off. It reads the files.

Here is exactly how that works:

**1. It writes a short report after every phase.**
When a phase finishes, a report file is written to disk. Progress lives in your project folder, not just in the agent's head. A memory squeeze cannot erase a file.

**2. It keeps a checklist of which steps are done.**
Each phase plan holds a **Phase Loop Progress** list: tick-boxes for every step (research, plan-check, build, test, capture learnings). After a reset, the agent reads those boxes and knows the exact next step. No need to catch it up.

**3. A brief "envelope" at the start of every phase.**
Every worker agent (a focused helper that does one phase of work) opens by emitting a **Context Envelope**, a 10-field note: which feature, which phase, which branch, which plan file, which tests to run. It takes seconds to read. The agent is ready before it does anything.

**4. It trusts the files over its own memory.**
On resume, the agent checks what is actually in the code and git history versus what the plan says. The real state wins. A plan that went stale cannot mislead the agent into repeating work or skipping steps.

**5. A running scoreboard and per-round reports.**
Every fix loop (the plan-check loop and the build-test loop) keeps a `results.tsv` scoreboard file: one row per round, tracking how many issues remain. When a session ends mid-loop, the next session reads the count, picks up at the right round, and continues. No rounds are lost.

**6. It re-injects a reminder on resume.**
When memory is squeezed, the system automatically reloads the latest status note into the new session. If any approval was pending (say, a checkpoint that needed a "yes" before moving on), the reminder flags it. Nothing is silently skipped.

> 💡 In short: you can start an autopilot run, close your laptop, and come back hours later. The agent will be exactly where it should be, or will pick up from the last saved checkpoint, with evidence on disk to prove it.
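
Resuming a fix loop from the `results.tsv` scoreboard comes down to reading the last row. The sketch below assumes a hypothetical two-column layout (`cycle`, `gaps`) with the baseline recorded as cycle 0; the kit's actual column names are not shown in this README, and `resume_point` is an illustrative helper.

```python
import csv
import io


def resume_point(tsv_text: str) -> tuple[int, int]:
    """Return (next cycle number, gaps still open) from a results.tsv log."""
    rows = list(csv.DictReader(io.StringIO(tsv_text), delimiter="\t"))
    if not rows:
        raise ValueError("results.tsv has a header but no baseline row")
    last = rows[-1]
    return int(last["cycle"]) + 1, int(last["gaps"])


# Hypothetical log: baseline of 7 gaps, then two completed cycles.
log = "cycle\tgaps\n0\t7\n1\t4\n2\t2\n"
print(resume_point(log))
```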

---

## 🧠 Context Groups

Project knowledge is organized into **context groups**: stable knowledge areas, each with an `all-{group}.md` router file that tells agents what to read and when. Agents follow the router, loading only what is relevant, not the whole knowledge base every time.


```
process/context/
├── all-context.md                    # 🧭 Root router: architecture, stack, patterns, conventions
├── tests/all-tests.md                # 🧪 Test runners, commands, debugging procedures
├── container/all-container.md        # 🐳 Docker, deployment, infra procedures
├── uxui/all-uxui.md                  # 🎨 Components, design tokens, patterns
├── infra/all-infra.md                # 🖥️ Server infrastructure, deployment
└── {your-domain}/all-{domain}.md     # 📚 Any domain with 3+ durable docs (auto-promoted)
```

| | How it works |
|---|---|
| 🧭 **Router pattern** | Agents read only what is relevant to their task |
| 📁 **Auto-promotion** | Topics with 3+ docs (or a single file that gets too large) get their own group |
| 🔄 **Always current** | Updated by `vc-update-process-agent` after every non-trivial feature |
| 🧪 **Auditable** | `vc-audit-context` checks routing, discovery frontmatter, and consistency |
| 📨 **Context Envelope** | Every inner-loop agent emits a 10-field note at start (feature → phase → session-goal → branch → worktree → context-group → blast-radius-packages → active-plan → test-runner → validate-contract) so a fresh worker agent knows exactly where it stands |
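
The ten envelope fields map naturally onto a small record type. The sketch below is illustrative: the field names come from the table row above, while the value types, the `key: value` rendering, and every example value are assumptions.

```python
from dataclasses import dataclass, fields


@dataclass(frozen=True)
class ContextEnvelope:
    """The 10-field note a worker agent emits before starting work."""
    feature: str
    phase: str
    session_goal: str
    branch: str
    worktree: str
    context_group: str
    blast_radius_packages: tuple[str, ...]
    active_plan: str
    test_runner: str
    validate_contract: str

    def render(self) -> str:
        """Render one `name: value` line per field, in declaration order."""
        lines = []
        for field in fields(self):
            value = getattr(self, field.name)
            if isinstance(value, tuple):
                value = ", ".join(value)
            lines.append(f"{field.name.replace('_', '-')}: {value}")
        return "\n".join(lines)


# Hypothetical values, for illustration only.
example = ContextEnvelope(
    feature="billing",
    phase="phase-2",
    session_goal="add team invitations",
    branch="feat/invitations",
    worktree=".",
    context_group="tests",
    blast_radius_packages=("api", "web"),
    active_plan="process/features/billing/active/invitations_28-05-26/invitations_PLAN_28-05-26.md",
    test_runner="pytest",
    validate_contract="PASS",
)
print(example.render())
```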

> The kit ships only the protocol seed; your context groups are **built for your project** by `vc-setup`, scanning your real code. They are a pattern, not a fixed list.

---

## 📁 Feature Folders

When a topic builds up 5 or more files, it gets its own **feature folder**: a complete lifecycle container.

```
process/features/{feature}/
├── active/{slug}_{date}/    # 📋 Plans being worked on (reports/refs colocated)
├── completed/               # ✅ Archived plans (searchable decision history)
└── backlog/                 # 📌 Deferred work (agents check before duplicating)
```

| | What happens |
|---|---|
| 🆕 | New work starts in `active/` → reports accumulate → plan archives to `completed/` |
| 📌 | Deferred work goes to `backlog/`, and agents check it before creating duplicate plans |
| 📦 | Feature promotion happens automatically when general artifacts hit 5+ |
| 🔍 | Every feature has complete, self-contained history: plans, decisions, reports, research |
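
The `active/` → `completed/` move can be sketched as a pure path mapping. This is a minimal Python illustration under assumptions: `archived_path` and `split_plan_folder` are hypothetical helpers, and the example folder name is made up. Only the `active/{slug}_{date}` and `completed/` layout comes from the tree above.

```python
from pathlib import PurePosixPath


def archived_path(active_plan: str) -> str:
    """Map .../active/<slug>_<date> to .../completed/<slug>_<date>."""
    plan = PurePosixPath(active_plan)
    if plan.parent.name != "active":
        raise ValueError(f"not an active plan folder: {active_plan}")
    return str(plan.parent.parent / "completed" / plan.name)


def split_plan_folder(name: str) -> tuple[str, str]:
    """Split '<slug>_<date>' on the last underscore; no date format is assumed."""
    slug, sep, date = name.rpartition("_")
    if not sep or not slug or not date:
        raise ValueError(f"expected '<slug>_<date>', got: {name}")
    return slug, date
```

Splitting on the last underscore keeps slugs that themselves contain underscores intact.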

---

## 🧱 Skill Layers

The kit has three layers: actor agents, plus the 33 skills split into contract and helper layers. Every `SKILL.md` declares its `layer` + `trigger_keywords` in frontmatter, and a generated catalog keeps discovery fast.

| Layer | What it is |
|---|---|
| 🎭 **Actor agents** | Own a phase or role. Live in `.claude/agents/`. These are the 15 agents, not skills. |
| 📜 **Contract skills (20)** | Each one produces a specific file or agreed output: `vc-generate-plan`, `vc-validate-findings`, `vc-autopilot`, the audits. Results can be checked. |
| 🛠️ **Helper skills (13)** | Improve how agents work and produce no file of their own: `vc-scout`, `vc-sequential-thinking`, `vc-problem-solving`, `vc-docs-seeker`. |
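
To make the frontmatter requirement concrete, here is a minimal Python sketch of reading `layer` and `trigger_keywords` from a `SKILL.md`. The exact frontmatter syntax is an assumption (a `---` delimited block with an inline `[a, b]` list), and `parse_frontmatter` and `missing_fields` are hypothetical names, not the kit's catalog generator.

```python
REQUIRED_FIELDS = ("layer", "trigger_keywords")


def parse_frontmatter(text: str) -> dict[str, object]:
    """Parse a '---' delimited block of 'key: value' lines (assumed format)."""
    lines = text.splitlines()
    if not lines or lines[0].strip() != "---":
        return {}
    meta: dict[str, object] = {}
    for line in lines[1:]:
        if line.strip() == "---":
            return meta
        key, sep, value = line.partition(":")
        if not sep:
            continue
        value = value.strip()
        if value.startswith("[") and value.endswith("]"):
            # Inline list such as [scout, explore]
            meta[key.strip()] = [v.strip() for v in value[1:-1].split(",") if v.strip()]
        else:
            meta[key.strip()] = value
    return {}  # frontmatter block was never closed


def missing_fields(meta: dict[str, object]) -> list[str]:
    """Return the required fields that are absent or empty."""
    return [name for name in REQUIRED_FIELDS if not meta.get(name)]
```

A catalog builder could run `missing_fields` over every skill and refuse to index one that reports anything missing.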

---

## 🧠 Self-Improving Project Memory

Every completed feature feeds learnings back into the context system: **the knowledge builds up, it does not reset.**

Most AI-assisted codebases have the opposite property: every new session starts cold. The agent re-reads the same files, re-discovers the same patterns, and re-makes the same decisions, because the last session's insight lived only in a chat window. The kit's answer is not a prompt trick. It is a **durable context-file system** (`process/context/`) that every agent reads at session start, every validator protects, and every completed feature enriches.

Six months and many memory resets later, the agent still knows *why* your auth works the way it does, because that knowledge is on disk, routed, and auditable, not trapped in a dead session.


```mermaid
%%{init: {'theme': 'base', 'themeVariables': {'fontSize': '14px', 'lineColor': '#8888AA'}} }%%
flowchart TD
FEAT["Feature ships\n(EXECUTE + EVL complete)"]
UP["UPDATE PROCESS phase\nvc-update-process-agent"]
CTX["process/context/ updated\nsmallest relevant file\n+ all-context.md router"]
AGENT["Next agent spawned\nreads context router\n→ routes to correct group file"]
BETTER["Better next feature\nno re-discovery, no stale patterns"]
FEAT --> UP --> CTX --> AGENT --> BETTER
BETTER -.->|"compounds each feature"| FEAT

style FEAT fill:#C62828,stroke:#B71C1C,color:#FFFFFF
style UP fill:#00695C,stroke:#004D40,color:#FFFFFF
style CTX fill:#2E7D32,stroke:#1B5E20,color:#FFFFFF
style AGENT fill:#1565C0,stroke:#0D47A1,color:#FFFFFF
style BETTER fill:#E65100,stroke:#BF360C,color:#FFFFFF
```

### The core mechanism: `process/context/` as portable, shared memory

`process/context/` holds structured knowledge organized into topic groups: architecture decisions, coding conventions, deployment steps, test patterns, infrastructure facts. Unlike a chat history, this knowledge:

- **travels into every worker agent**: `vc-context-discovery` routes each spawned agent to the right `all-{group}.md` router for its task, then to the smallest relevant deep file. A research agent, a plan agent, and a coding agent all start with the same shared understanding
- **survives a memory reset**: it is on disk, not in a context window; a squeezed session loses none of it
- **is readable by both Claude and Codex**: `.agents/skills` is a symlink to `.claude/skills/`, so the same context system serves both agents without duplication

The root router (`all-context.md`) points to group routers (`all-{group}.md`), which route to the smallest relevant deep file. Agents follow the router; they never hard-code file paths. This means renames and group splits require only router edits, not a codebase-wide search.

```
process/context/
├── all-context.md                ← root router (architecture, stack, patterns)
├── tests/all-tests.md            ← test runners, debugging, commands
├── container/all-container.md    ← Docker, deployment, infra procedures
├── uxui/all-uxui.md              ← components, design tokens, visual conventions
└── {domain}/all-{domain}.md      ← any domain with 3+ durable docs (auto-promoted)
```
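
The two-hop routing (root router → group router → deep file) can be sketched as plain lookups. This Python example is illustrative only: the dictionaries stand in for router file contents, and the deep-file names (`tests/debugging.md`, `container/deploy.md`) and the `reading_list` helper are hypothetical.

```python
ROOT_ROUTER = "all-context.md"

# Hypothetical router contents: group name -> group router file.
GROUP_ROUTERS = {
    "tests": "tests/all-tests.md",
    "container": "container/all-container.md",
}

# Hypothetical group router contents: topic -> smallest relevant deep file.
DEEP_FILES = {
    "tests/all-tests.md": {"debugging": "tests/debugging.md"},
    "container/all-container.md": {"deploy": "container/deploy.md"},
}


def reading_list(group: str, topic: str) -> list[str]:
    """Return the files an agent would read, in order, for one task."""
    path = [ROOT_ROUTER]
    group_router = GROUP_ROUTERS.get(group)
    if group_router is None:
        return path  # unknown group: stay at the root router
    path.append(group_router)
    deep_file = DEEP_FILES.get(group_router, {}).get(topic)
    if deep_file is not None:
        path.append(deep_file)
    return path
```

Because callers only name a group and a topic, renaming `tests/debugging.md` means editing one router entry; no caller changes.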


### What makes it self-improving (not just "living docs")

The phrase "living docs" usually means "docs we intend to keep up to date but mostly forget." This system enforces the intention mechanically.

**The UPDATE PROCESS phase requires a per-file context review before it can close.** `vc-update-process-agent` cannot finish a phase until every potentially-affected context file has been reviewed with a concrete reason per file. "No updates needed" is allowed, but it must name each reviewed file and explain why. Vague reasons are rejected. The checkpoint is binary: record the review, or the phase does not close.

The full feedback loop per completed feature:

| Step | Owner | What happens |
|------|-------|-------------|
| 1. Git diff analysis | `vc-scout` | Maps changed files → affected context areas |
| 2. Per-file review | `vc-update-process-agent` | Names each context file, states the update or an explicit "no change + reason" |
| 3. Updates applied | parallel worker agents | Each area's context file is updated with new patterns, decisions, learnings |
| 4. Routing verified | `validate-context-discovery.mjs` | Confirms every doc is indexed and routers are consistent |
| 5. Discovery confirmed | `validate-all-context.mjs` | Confirms `all-context.md` and group routers match the current files on disk |

Your 100th feature benefits from everything learned in the first 99, not as an aspiration but as a mechanical guarantee.
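
The binary checkpoint in step 2 can be sketched as a gate that refuses to close while any affected file lacks a concrete reason. This is a Python illustration under assumptions: `FileReview`, `can_close_phase`, and the `VAGUE_REASONS` set are hypothetical, and how the kit actually judges a reason as vague is not described here.

```python
from dataclasses import dataclass

# Placeholder list of reasons treated as too vague (an assumption).
VAGUE_REASONS = {"", "n/a", "none", "no updates needed", "not needed"}


@dataclass(frozen=True)
class FileReview:
    path: str
    updated: bool
    reason: str


def can_close_phase(affected: list[str], reviews: list[FileReview]) -> tuple[bool, list[str]]:
    """Return (ok, problems); ok is True only if every affected file is reviewed."""
    reviewed = {review.path: review for review in reviews}
    problems = []
    for path in affected:
        review = reviewed.get(path)
        if review is None:
            problems.append(f"{path}: not reviewed")
        elif review.reason.strip().lower() in VAGUE_REASONS:
            problems.append(f"{path}: reason too vague")
    return (not problems, problems)
```

A "no change" verdict passes as long as it names the file and gives a specific reason, which mirrors the rule stated above.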


### Forward Preview: learnings feed forward, not just backward

Every phase report carries a `## Forward Preview` section written for the *next* phase's agent. It gives the exact commands to keep green, dependency changes, and file-scope changes found mid-phase. The agent picking up Phase 3 does not have to re-read Phase 2's output and guess what matters. It is handed a focused brief.

This is different from context docs: context docs carry *lasting* knowledge (decisions that stay true across features); Forward Preview carries *temporary* handoff state (what the next work session needs to know right now).
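
Handing the next agent only the brief can be sketched as extracting one section from a phase report. This is a minimal Python illustration: the `## Forward Preview` heading comes from the text above, while `extract_section` is a hypothetical helper that assumes the section ends at the next level-2 heading.

```python
def extract_section(report: str, heading: str = "## Forward Preview") -> str:
    """Return the body under `heading`, up to the next level-2 heading."""
    body = []
    inside = False
    for line in report.splitlines():
        if line.strip() == heading:
            inside = True
            continue
        if inside and line.startswith("## "):
            break  # next level-2 section starts here
        if inside:
            body.append(line)
    return "\n".join(body).strip()
```

Sub-headings (`###`) stay inside the extracted brief; a report without the section yields an empty string, which a caller could treat as a failed handoff.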


### Validator suite prevents rot

Lasting knowledge goes stale when nobody checks it. The kit ships validators that run as part of every phase closeout:

| Validator | What it catches |
|-----------|----------------|
| `validate-context-discovery.mjs` | Docs not indexed by any router; broken links; missing frontmatter |
| `validate-all-context.mjs` | `all-context.md` out of sync with actual files on disk |
| `validate-skill-keywords.mjs` | Skills missing `trigger_keywords` or `layer` fields (breaks routing Step 0) |
| `validate-protocol-discovery.mjs` | Protocol files in `process/development-protocols/` missing discovery frontmatter |

These run like automated checks: a stale or orphaned doc fails. The system polices its own health.
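
The "docs not indexed by any router" check can be sketched as follows. The kit's validators are `.mjs` scripts; this Python version is only an illustration of the idea. The assumptions are labeled: routers reference docs through markdown links, paths are relative to `process/context/`, and `find_orphans` is a hypothetical helper.

```python
import re

# Matches markdown links to .md files, ignoring any #anchor suffix.
MD_LINK = re.compile(r"\]\(([^)#\s]+\.md)(?:#[^)]*)?\)")


def find_orphans(docs: dict[str, str], routers: set[str]) -> list[str]:
    """docs maps path -> markdown text; return non-router docs no router links to."""
    indexed: set[str] = set()
    for router in routers:
        indexed.update(MD_LINK.findall(docs[router]))
    return sorted(path for path in docs if path not in routers and path not in indexed)
```

A closeout step could fail whenever `find_orphans` returns a non-empty list.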


### Context groups self-organize

Groups are created automatically when a topic reaches 3+ docs or a single file goes past ~800 lines. Agents follow routers and never hard-code paths, so adding a new group (e.g. `process/context/billing/all-billing.md`) requires only a router update, not changes to every agent that mentions billing. The router is the stable reference; the files behind it can reorganize freely.

> The kit seeds context groups from your real codebase (via `vc-setup`). The groups are not a fixed list; they are a pattern. Your auth area, your infra area, your payments area each become first-class routable knowledge as the project grows.
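
The promotion rule (3+ docs, or one file past roughly 800 lines) can be sketched as a small function. This is a Python illustration under assumptions: how a doc is assigned to a topic is not specified here, so the input simply carries a topic label per doc, and `groups_to_promote` is a hypothetical name.

```python
from collections import defaultdict

MIN_DOCS = 3     # "3+ docs"
MAX_LINES = 800  # "~800 lines", treated as a hard cutoff for this sketch


def groups_to_promote(docs: dict[str, tuple[str, int]]) -> list[str]:
    """docs maps path -> (topic, line_count); return topics that earn a group."""
    counts: dict[str, int] = defaultdict(int)
    oversized: set[str] = set()
    for topic, line_count in docs.values():
        counts[topic] += 1
        if line_count > MAX_LINES:
            oversized.add(topic)
    return sorted(t for t in counts if counts[t] >= MIN_DOCS or t in oversized)
```

Either condition alone is enough, which matches the "or" in the rule above.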

---

## 🤖 What's Inside


### 15 Agents



**Core workflow agents** (one per RIPER-5 phase: R → SPEC → I → P → V → E → UP, plus the Fast Mode and Quick Fix lanes):

| Agent | Model | Role |
|-------|:---:|------|
| 🔍 `vc-research-agent` | sonnet | Codebase + web research, read-only. Contradiction tracking built in |
| 📝 `vc-spec-agent` | sonnet | Product-discovery requirements doc before INNOVATE. Produces `*_SPEC_*.md` |
| 💡 `vc-innovate-agent` | sonnet | Compare 2-3 approaches. Decision summary (chosen + rejected) before PLAN |
| 📋 `vc-plan-agent` | sonnet | Write the plan with anti-shortcut guards. "I already know how" is not a plan |
| ✅ `vc-validate-agent` | sonnet | Turn plan → agreed checklist (V1-V7). Checkpoint: PASS/CONDITIONAL/BLOCKED |
| ⚡ `vc-execute-agent` | **opus** | Implement per plan. Progress notes to phase report, deviation protocol, self-review |
| ⏩ `vc-fast-mode-agent` | **opus** | Compressed R→S→I→P→V with a required safety pause before EXECUTE |
| 🔧 `vc-quick-fix-agent` | **opus** | QUICK FIX lane: one small low-risk edit + scoped check, no plan/validate |
| 🧠 `vc-update-process-agent` | sonnet | 7-phase closeout: archive, update context, stale-artifact scan, learnings |


**Specialist agents**, called during EXECUTE or standalone:

| Agent | Role |
|-------|------|
| 🐛 `vc-debugger` | Gathers evidence before forming a hypothesis. Competing hypotheses, elimination chains, feasibility probes |
| 🧪 `vc-tester` | Change-aware. Re-runs agreed-checklist tests (EVL). Auto-escalates on config changes |
| 🔎 `vc-code-reviewer` | Sends an edge-case scout BEFORE review. N+1 detection, auth-path checking |
| ✨ `vc-code-simplifier` | Tidies code for clarity without changing behavior |
| 🎨 `vc-ui-ux-designer` | Design-aware frontend. Can spawn a research worker mid-build |
| 📦 `vc-git-manager` | Splits into logical commits from `touched_files`. Refuses unknown files |


### 33 Skills (auto-discovered)



**📜 Contract skills (20)**, each owning an artifact: `vc-generate-plan` · `vc-generate-context` · `vc-generate-spec` · `vc-generate-closeout` · `vc-generate-phase-program` · `vc-audit-context` · `vc-audit-plans` · `vc-audit-vc` · `vc-update` · `vc-publish` · `vc-feasibility-test` · `vc-risk-evidence-pack` · `vc-test-coverage-plan` · `vc-validate-findings` · `vc-autoresearch` · `vc-intent-clarify` · `vc-autopilot` · `vc-agent-strategy-compare` · `vc-plan-discovery` · `vc-context-discovery`

**🛠️ Helper skills (13)**, which improve how agents work: `vc-review-situation` · `vc-sequential-thinking` · `vc-problem-solving` · `vc-scout` · `vc-debug` · `vc-docs-seeker` · `vc-frontend-design` · `vc-agent-browser` · `vc-web-testing` · `vc-setup` · `vc-predict` · `vc-scenario` · `vc-security`

> **⚠️ Naming rule:** Do NOT use the `vc-` prefix for your own skills or agents. That namespace is reserved for kit-shipped files, and the stale-removal guard treats any `vc-*` path under `.claude/skills/` and `.claude/agents/` as kit-owned. Use `my-`, `team-`, or `proj-` instead.
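
The naming rule can be sketched as a path check: anything named `vc-*` directly under the two kit directories counts as kit-owned. This Python example is an illustration only; `is_kit_owned` is a hypothetical helper and not the kit's stale-removal guard.

```python
from pathlib import PurePosixPath

# Directories named in the naming rule above.
KIT_ROOTS = (".claude/skills", ".claude/agents")


def is_kit_owned(path: str) -> bool:
    """True if `path` sits under a kit root and its first segment starts with 'vc-'."""
    candidate = PurePosixPath(path)
    for root in KIT_ROOTS:
        try:
            relative = candidate.relative_to(root)
        except ValueError:
            continue  # not under this root
        return bool(relative.parts) and relative.parts[0].startswith("vc-")
    return False
```

Under this rule `.claude/skills/my-lint/SKILL.md` is left alone, while a custom skill saved as `.claude/skills/vc-lint/SKILL.md` would be treated as kit-owned and could be removed as stale.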


### 🪝 10 Hooks

| Hook | What it does |
|------|-------------|
| 🔐 `privacy-block.cjs` | Blocks reading `.env`, credentials, SSH keys. Requires explicit approval |
| 🚫 `scout-block.cjs` | Prevents wandering into `node_modules/`, `dist/`. Gitignore-syntax `.ckignore` |
| 🧠 `session-init.cjs` | Detects stack, injects env, recovers approval gates after compaction |
| 💉 `subagent-init.cjs` | Injects a compact context block into every subagent |
| ✨ `post-edit-simplify-reminder.cjs` | After 5+ edits, nudges to run the simplifier (non-blocking, throttled) |
| 📛 `descriptive-name.cjs` | Language-aware file-naming conventions on every Write |
| 📊 `session-state.cjs` | Session metrics + token awareness |
| 📋 `post-write-plan-check.mjs` | Validates plan-artifact structure on every Write to a `*_PLAN_*.md` |
| 🧹 `post-commit-lint.mjs` | Checks conventional-commits prefix on every `git commit` |
| 🔍 `stop-validator-sweep.cjs` | Runs core harness validators when the session stops |
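
As one example of what a hook checks, a conventional-commits prefix test can be sketched in a few lines. The hook itself is a `.mjs` script; this Python version only illustrates the rule. The accepted type list is an assumption (the commonly used set), and `has_conventional_prefix` is a hypothetical name.

```python
import re

# Commonly used Conventional Commits types; the kit's accepted list may differ.
TYPES = ("feat", "fix", "docs", "style", "refactor", "perf",
         "test", "build", "ci", "chore", "revert")

# type, optional (scope), optional "!", then ": " and a description.
_PREFIX = re.compile(r"^(?:" + "|".join(TYPES) + r")(?:\([^()\s]+\))?!?: \S")


def has_conventional_prefix(message: str) -> bool:
    """Check only the first line of the commit message."""
    lines = message.splitlines()
    return bool(lines) and _PREFIX.match(lines[0]) is not None
```

For instance, `feat(auth): add login` passes, while `update stuff` does not.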


**Where everything lives:**

```text
your-project/
├── .claude/{agents,skills,hooks}/       # 🤖 15 agents · ⚡ 33 skills · 🪝 10 hooks
├── .codex/agents/                       # 🔄 Mirrored for Codex
├── .agents/skills -> .claude/skills     # 🔗 Symlink for Codex discovery
├── CLAUDE.md · AGENTS.md                # 📋 Orchestrator config + cross-tool registry
└── process/
    ├── context/                         # 🧠 Auto-routed knowledge domains
    ├── general-plans/                   # 📋 Cross-cutting plans + task folders
    ├── features/                        # 🏷️ Feature-scoped lifecycle folders
    └── development-protocols/           # 📜 22 shared workflow docs
```

---

## ⚡ Quick Fix + Fast Mode

Two lighter options for when the full RIPER-5 process is more than the job needs:

🔧 **Quick Fix** (`quick fix: …`)

Bigger than a trivial one-liner, smaller than "needs a plan." The lead agent scouts read-only → one-line confirm → spawns `vc-quick-fix-agent` for the edit + a scoped check on touched files only. No plan, no agreed checklist, no EVL.

Cancelled immediately if the change touches schema, auth, API, billing, or migration surfaces; it then routes to full RESEARCH.

⏩ **Fast Mode** (`ENTER FAST MODE - …`)

Squeezes RESEARCH + SPEC + INNOVATE + PLAN + VALIDATE into one pass, but still writes a plan, writes an agreed checklist, and pauses before EXECUTE.

In plain Fast Mode, there is a post-VALIDATE pause: you review, then say "ENTER EXECUTE MODE." Use `autopilot fast: [task]` to remove that pause and run all the way through without stopping.
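
The Quick Fix cancellation rule can be sketched as a lane chooser over the touched files. This is a Python illustration under loud assumptions: the kit does not say how it detects a sensitive surface, so matching on path tokens is a stand-in, and `touches_sensitive_surface` and `choose_lane` are hypothetical names.

```python
import re

# Surfaces named in the Quick Fix rule above.
SENSITIVE_SURFACES = ("schema", "auth", "api", "billing", "migration")


def touches_sensitive_surface(path: str) -> bool:
    """Assumed heuristic: a path token equals a surface name (singular or plural)."""
    tokens = re.split(r"[^a-z0-9]+", path.lower())
    return any(token in (surface, surface + "s")
               for token in tokens for surface in SENSITIVE_SURFACES)


def choose_lane(touched_files: list[str]) -> str:
    """Route to full RESEARCH as soon as any touched file is sensitive."""
    if any(touches_sensitive_surface(path) for path in touched_files):
        return "RESEARCH"
    return "QUICK FIX"
```

Whole-token matching avoids false positives such as `authors/` being read as `auth`, at the cost of missing less obvious names; a real gate would likely need project-specific rules.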

---

## 🔄 Kit Lifecycle: Install · Setup · Update · Publish

| Command | What it does | When |
|---|---|---|
| `curl … install.sh \| bash` | Syncs kit files without overwriting yours; auto-detects fresh vs upgrade and routes you | First install + every upgrade |
| **Run vc-setup** | Detects stack, scaffolds `process/`, deep-scans codebase, populates real context | After a fresh install |
| **Run vc-update** | Computes a precise diff, shows what will change, waits for your OK; migrates old-format plans/folders with zero data loss | On every upgrade |
| **Run vc-publish** *(maintainers)* | Publishes harness changes back out to the kit repo | Contributing to the kit itself |

```mermaid
%%{init: {'theme': 'base', 'themeVariables': {'fontSize': '14px', 'lineColor': '#8888AA'}} }%%
flowchart TD
TRG["EXECUTE complete\nENTER UPDATE PROCESS MODE"]
AGT["spawn vc-update-process-agent"]
A["(a) Archive plan\nactive/ → completed/"]
B["(b) Update process/context/\nsmallest relevant file\n+ all-context.md router"]
C["(c) Tier-1 audits\n(change-type gated)"]
AVC["vc-audit-vc\nharness/agent edits"]
ACX["vc-audit-context\ncontext-doc edits"]
APL["vc-audit-plans\nplan/program edits"]
D["(d) Capture learnings\nto memory"]
E["(e) Write closeout packet\nvc-generate-closeout"]
F["(f) Conventional commit\nvc-git-manager"]

TRG --> AGT --> A --> B --> C
C --> AVC
C --> ACX
C --> APL
AVC --> D
ACX --> D
APL --> D
D --> E --> F

style TRG fill:#1565C0,stroke:#0D47A1,color:#FFFFFF
style AGT fill:#0277BD,stroke:#01579B,color:#FFFFFF
style A fill:#6A1B9A,stroke:#4A148C,color:#FFFFFF
style B fill:#00695C,stroke:#004D40,color:#FFFFFF
style C fill:#E65100,stroke:#BF360C,color:#FFFFFF
style AVC fill:#2E7D32,stroke:#1B5E20,color:#FFFFFF
style ACX fill:#2E7D32,stroke:#1B5E20,color:#FFFFFF
style APL fill:#2E7D32,stroke:#1B5E20,color:#FFFFFF
style D fill:#558B2F,stroke:#33691E,color:#FFFFFF
style E fill:#C62828,stroke:#B71C1C,color:#FFFFFF
style F fill:#37474F,stroke:#263238,color:#FFFFFF
```

> 💡 `vc-update` shows a preview diff and waits for your OK. Your `process/` directory and project-specific content are **never** silently changed. Re-running the installer is safe.

---

## 💡 More Reasons It Just Works

Many small, smart defaults add up to less babysitting and lower cost.

- **Each role only gets the tools it needs.** During planning, the agent literally cannot edit code because those tools are turned off. This stops the agent from jumping ahead and changing things before the plan is approved. The system simply does not allow it.

- **It uses the premium AI model only where it matters.** Code-writing uses the top model. Planning, research, review, and checking all use a cheaper, faster model. The result: roughly 60-70% lower cost compared to running the top model for everything, with no quality loss on the work that counts.

- **It tests risky guesses before building on them.** When the agent is not sure something will work (a specific API behavior, a library feature, an infrastructure assumption), it runs a tiny real experiment first. The result is clear: works, does not work, or unclear. That verdict and a plain-English note get fed straight into the plan. The agent does not spend hours building on a wrong assumption.

- **Tidy, meaningful save points.** Changes are committed automatically in clean, logical chunks with clear messages. The history is easy to read and easy to undo one piece at a time.

- **Helpful automatic reminders.** Small built-in helpers nudge for things like running the right checks on changed files, keeping code simple, and writing a proper commit message. Quality stays high without you having to police it.

- **You can run the self-improving loop on its own.** The same "find problems, fix them, repeat" engine that drives plan-checking and test-fixing also works as a standalone tool on any messy area: a spec, the docs, the tests, an error list. You do not need a full feature build to use it.

- **Built-in proof the workflow rules actually work.** The kit ships with its own test suite: a set of checks with known-good and known-bad examples that prove the workflow rules behave correctly. The system checks itself. You do not have to trust that the guardrails are on; you can run the checks and see.
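
The first point above (each role only gets the tools it needs) can be sketched as a per-phase allowlist. This Python example is purely illustrative: the tool names, the `ALLOWED_TOOLS` table, and the `authorize` helper are hypothetical, and the kit's real mechanism is its agent configuration, not this code.

```python
# Hypothetical per-phase allowlists; tool names are made up for illustration.
ALLOWED_TOOLS = {
    "PLAN": frozenset({"read_file", "search", "write_plan"}),
    "EXECUTE": frozenset({"read_file", "search", "edit_code", "run_tests"}),
}


def is_allowed(phase: str, tool: str) -> bool:
    """Unknown phases get no tools at all (deny by default)."""
    return tool in ALLOWED_TOOLS.get(phase, frozenset())


def authorize(phase: str, tool: str) -> None:
    """Raise instead of silently skipping, so a violation is visible."""
    if not is_allowed(phase, tool):
        raise PermissionError(f"{tool} is not available during {phase}")
```

Deny-by-default is the design choice worth noting: a phase that is missing from the table can do nothing, rather than everything.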

---

## Contributing

We welcome contributions! See [CONTRIBUTING.md](CONTRIBUTING.md) for guidelines.


**Quick links:**

- 🐛 [Report a bug](https://github.com/withkynam/vibecode-pro-max-kit/issues/new?template=1.bug_report.yml)
- 💡 [Request a feature](https://github.com/withkynam/vibecode-pro-max-kit/issues/new?template=2.feature_request.yml)
- ⚡ [Submit a skill](https://github.com/withkynam/vibecode-pro-max-kit/issues/new?template=3.skill_submission.yml)
- 🌐 [Add a translation](https://github.com/withkynam/vibecode-pro-max-kit/issues/new?template=5.translation.yml)





### 🙏 Credits

vibecode-pro-max-kit focuses on the spec-driven development framework and self-improving context organization, without bloating you with 80+ skills. Fewer tools, more structure.

---


## 📄 License

MIT