{"id":50551727,"url":"https://github.com/mikeleppane/forge","last_synced_at":"2026-06-04T04:01:30.667Z","repository":{"id":355604060,"uuid":"1227361993","full_name":"mikeleppane/forge","owner":"mikeleppane","description":"Intent-Driven Development plugin for Claude Code - spec-first AI coding with three-layer verification and context-bounded subagents.  Intent -\u003e Spec -\u003e Execute -\u003e Verify, without losing context. ","archived":false,"fork":false,"pushed_at":"2026-05-17T17:35:46.000Z","size":5364,"stargazers_count":0,"open_issues_count":0,"forks_count":0,"subscribers_count":0,"default_branch":"main","last_synced_at":"2026-05-17T19:44:51.177Z","etag":null,"topics":["agentic-coding","ai","ai-agents","ai-coding","claude","claude-code","claude-code-plugin","developer-tools","intent-driven-development","python","spec-driven-development"],"latest_commit_sha":null,"homepage":"","language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"mit","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/mikeleppane.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null,"zenodo":null,"notice":null,"maintainers":null,"copyright":null,"agents":"AGENTS.md","dco":null,"cla":null}},"created_at":"2026-05-02T15:22:21.000Z","updated_at":"2026-05-17T17:35:50.000Z","dependencies_parsed_at":null,"dependency_job_id":null,"html_url":"https://github.com/mikeleppane/forge","commit_stats":null,"previous_names":["mikeleppane/idd","mikeleppane/forge"],"tags_count":0,"template":false,"template_full_name":null,"purl":"pkg:github/mikeleppane/forge","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/mikeleppane%2Fforge","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/mikeleppane%2Fforge/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/mikeleppane%2Fforge/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/mikeleppane%2Fforge/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/mikeleppane","download_url":"https://codeload.github.com/mikeleppane/forge/tar.gz/refs/heads/main","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/mikeleppane%2Fforge/sbom","scorecard":null,"host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":286080680,"owners_count":33888302,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2026-05-26T15:22:16.424Z","status":"online","status_checked_at":"2026-06-04T02:00:06.755Z","response_time":64,"last_error":null,"robots_txt_status":"success","robots_txt_updated_at":"2025-07-24T06:49:26.215Z","robots_txt_url":"https://github.com/robots.txt","online":true,"can_crawl_api":true,"host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["agentic-coding","ai","ai-agents","ai-coding","claude","claude-code","claude-code-plugin","developer-tools","intent-driven-development","python","spec-driven-development"],"created_at":"2026-06-04T04:01:29.670Z","updated_at":"2026-06-04T04:01:30.658Z","avatar_url":"https://github.com/mikeleppane.png","language":"Python","funding_links":[],"categories":[],"sub_categories":[],"readme":"# FORGE — Intent-Driven Development\n\n![FORGE logo](images/forge-logo.png)\n\n[![License: MIT](https://img.shields.io/badge/License-MIT-yellow.svg)](LICENSE)\n[![Python 3.12+](https://img.shields.io/badge/python-3.12+-blue.svg)](https://www.python.org/downloads/)\n[![Code style: ruff](https://img.shields.io/badge/code%20style-ruff-261230.svg)](https://github.com/astral-sh/ruff)\n[![Type-checked: mypy strict](https://img.shields.io/badge/type--checked-mypy%20strict-2A6DB2.svg)](https://mypy-lang.org/)\n[![Tests: pytest](https://img.shields.io/badge/tests-pytest-0A9EDC.svg)](https://docs.pytest.org/)\n![Status: alpha](https://img.shields.io/badge/status-alpha-orange.svg)\n[![CI](https://github.com/mikeleppane/idd/actions/workflows/ci.yml/badge.svg?branch=main)](https://github.com/mikeleppane/idd/actions/workflows/ci.yml)\n[![Built for: Claude Code](https://img.shields.io/badge/built%20for-Claude%20Code-D97757.svg)](https://code.claude.com)\n\n\u003e **Intent is the source. Spec is the contract. Verification reconciles reality. QA red-teams the result.**\n\u003e\n\u003e *\"The hardest single part of building a software system is deciding precisely what to build. No other part of the work so cripples the resulting system if done wrong. No other part is more difficult to rectify later.\"*\n\u003e — **Fred Brooks**, *No Silver Bullet* (1986)\n\n**⚠️ Alpha.** Expect breaking changes. APIs, schemas, and command surfaces are not yet stable. Pin your `.forge/` artifacts via git.\n\n\"Production-ready\" for FORGE means:\n\n- **Schema + CLI stability** — `schemas/*.schema.json`, `state.json` shape, and `/forge:*` flags are versioned and deprecation-gated.\n- **Host compatibility matrix** — automatic enforcement and source-portable behavior documented per host.\n- **Hook + validator contracts frozen** — `hooks/check_*.py` and `python -m tools.validate` outputs stable across minor versions.\n\nUntil then, treat FORGE as a methodology you can read, fork, and run — not a turnkey dependency.\n\nFORGE is a Claude Code plugin that encodes a disciplined Spec-Driven Development lifecycle for working with AI coding agents on real repositories. It sits alongside TDD / BDD / DDD / SDD — a methodology, not a tool.\n\nFORGE optimizes for **disciplined, resumable** software work over speed-first coding. Every artifact it produces earns its place by clarifying intent, preserving context, reducing drift, or verifying reality.\n\n---\n\n## Table of contents\n\n- [Demo](#demo)\n- [Quickstart](#quickstart)\n- [Install (Claude Code)](#install-claude-code)\n- [What it is](#what-it-is)\n- [Core idea](#core-idea)\n- [System picture](#system-picture)\n- [Why use it](#why-use-it)\n- [How to use it](#how-to-use-it)\n- [Lifecycle](#lifecycle)\n- [Tiers](#tiers)\n- [TDD enforcement](#tdd-enforcement)\n- [The crucible](#the-crucible)\n- [Verification](#verification)\n- [QA: black-box outsider pass](#qa-black-box-outsider-pass)\n- [Research phase](#research-phase)\n- [Cross-AI peer review](#cross-ai-peer-review)\n- [Constitution bootstrap](#constitution-bootstrap)\n- [Convention routing](#convention-routing)\n- [Trap memory](#trap-memory)\n- [Per-feature artifacts](#per-feature-artifacts)\n- [Use outside Claude Code](#use-outside-claude-code)\n- [Host Compatibility](#host-compatibility)\n- [Configuration](#configuration)\n- [Project layout](#project-layout)\n- [Comparison vs alternatives](#comparison-vs-alternatives)\n- [Security](#security)\n- [Contributing](#contributing)\n- [License](#license)\n\n---\n\n## Demo\n\n---\n\n## Quickstart\n\n```bash\n# 1. Clone + install tooling\ngit clone https://github.com/mikeleppane/idd.git forge\ncd forge\nmake install               # creates .venv, installs forge-tools[dev]\n\n# 2. Validate the plugin shape\nmake check                 # ruff + mypy --strict + pytest + validate-health\nclaude plugin validate .   # Claude Code plugin validator (see Install section)\n\n# 3. Drive your first feature\n# In Claude Code, with the plugin loaded:\n/forge:do \"fix CSV import error handling\" --focused\n```\n\nExpected on success:\n\n```text\n.forge/features/2026-05-10-fix-csv-import-error-handling/\n├── SPEC.md          # behavior contract (focused tier)\n├── decisions.md     # running ADR log\n└── state.json       # tier=focused, current_phase=spec, status=in_progress\n\nNext: /forge:spec --feature 2026-05-10-fix-csv-import-error-handling\n```\n\nThen drive the rest of the lifecycle:\n\n```bash\n/forge:status                            # see where the active feature is\n/forge:next                              # print the next dispatch literal\n/forge:spec --feature \u003cid\u003e               # run the current phase\n# … each phase command refuses to run unless the previous phase is done.\n```\n\n---\n\n## Install (Claude Code)\n\n**Prerequisites:** Python 3.12+, `git`, Claude Code CLI.\n\n```bash\ngit clone https://github.com/mikeleppane/idd.git forge\ncd forge\nmake install\n```\n\n**Reference the plugin from Claude Code.** Until `claude plugins install …` is wired, point your Claude Code config at the cloned path. See the [Claude Code plugin docs](https://code.claude.com/docs/en/plugins-reference) for current syntax. Validate the manifest with:\n\n```bash\nclaude plugin validate .\n```\n\nThe `/forge:*` slash commands light up once the plugin is loaded. Verify with `/forge:status` (it will report no active feature).\n\nA formal `claude plugins install …` path is on the roadmap.\n\n**Plugin iteration.** When you edit FORGE source in a checkout that backs a local-directory marketplace, Claude Code caches the plugin under `~/.claude/plugins/cache/\u003cmarketplace\u003e/\u003cplugin\u003e/\u003cversion\u003e/` and does not auto-refresh on source edits without a version bump. Run `make plugin-refresh` from the FORGE repo to rebuild the cache (it tries `claude plugin update`, then falls back to a clean uninstall + reinstall). Restart your Claude Code session afterwards so the new code is loaded.\n\n---\n\n## What it is\n\nA small set of slash commands, skills, hooks, and JSON-schema-validated artifacts (see [`schemas/`](schemas/)) that walk an AI coding agent through a phased lifecycle — focused tier runs three phases (`spec → execute → verify`); standard runs eight (adds scenarios, plan, crucible, review, ship); full runs eleven (adds refine, research, domain). On standard and full, `review` fires twice (target=plan then target=code) and a post-merge `qa` phase pumps terminally after `ship`; migrating a feature to `flow_version: 3` adds `qa` to full's `routing.phase_list` directly. State is persisted on disk per feature, so any session can be paused, resumed, or handed off without losing context.\n\n## Core idea\n\nFORGE is built around one red line: **intent must survive the whole journey from idea to shipped behavior**.\n\nInstead of treating the user's request as a prompt to immediately code, FORGE first turns intent into shared understanding. The human and agent research, ask questions, explore scenarios, clarify domain language, and shape the work into a spec that acts as a contract. BDD helps describe the behavior from the user's perspective; DDD helps name the domain clearly enough that the agent and human are reasoning about the same thing.\n\nThat shared understanding becomes a reviewed plan. Before execution, the plan is challenged through adversarial review and crucible-style questioning, so weak assumptions are exposed while they are still cheap to fix.\n\nExecution then happens in disciplined loops: small TDD slices, bounded subagents, and explicit context management. The agent does not operate from an empty prompt; it is guided by memory — the project's constitution, lessons learned, conventions, rules, and trap history.\n\nFinally, review, verification, and outsider QA reconcile the promise with reality. What ships is not only code; it is new knowledge. Each completed feature updates the memory of the system, making the next intent easier to understand, safer to plan, and harder to derail.\n\n![FORGE core idea](images/core-idea.png)\n\n\n## Why use it\n\nYou already have an AI coding agent. FORGE adds the missing scaffolding around it:\n\n- **Intent** is treated as the project north-star — the *why* behind any change.\n- **Spec** is the contract for intended behavior — adversarially refined before code is written.\n- **Verification** reconciles spec, code, tests, runtime behavior, and user confirmation — three layers, not just \"tests pass.\"\n- **Crucible** is an adversarial post-plan ritual (assumptions inversion → adversarial Q\u0026A → pre-mortem) that produces a shared `UNDERSTANDING.md` between you and the agent.\n- **QA** is a fresh-outsider black-box pass after ship — an agent with no implementation context exercises the artifact from a user's perspective, attempts edge cases, and red-teams it. Verdict gates the pre-PR prompt; advisory post-merge.\n- **TDD** is mechanical, not aspirational — every acceptance criterion requires a paired test commit before its impl commit, enforced by `tools.validate.tdd_evidence`.\n- **Context discipline** keeps the main thread under a hard token budget by isolating slices in subagents and preventing context bleed.\n- **Cross-AI review** uses a different model family (Claude ↔ GPT) as a second-opinion reviewer, reinforcing the adversarial shared-understanding goal. Manual mode (default) writes a self-contained prompt to disk for paste-into-other-CLI workflows; auto mode (opt-in) dispatches an external CLI directly. Both modes apply a configurable redaction filter before any prompt materialises. See [Cross-AI peer review](#cross-ai-peer-review) for the full surface.\n\nFORGE pays off when the cost of building the wrong thing, losing context, drifting from intent, or losing your own mental model of the code is higher than the cost of a disciplined workflow. It is **not** the fastest path to code. It *is* a clear path from intent to verified behavior — without surrendering your understanding of the system along the way.\n\n---\n\n## How to use it\n\nInstall the plugin (see below), then drive a feature through the lifecycle with one command:\n\n```text\n/forge:do \"\u003cyour idea\u003e\" [--focused | --standard | --full]\n```\n\nFor a new repo, run `/forge:setup --dry-run` first to inspect detected test\ncommands, BDD bindings, issue tracker hints, docs paths, and setup config\nwithout writing anything. Run `/forge:setup` when you want to persist accepted\ndecisions to `.forge/config.json` and `.forge/operator-context.md`.\n\n`/forge:do` runs preflights, scans for capability conflicts, picks a tier, and seeds the per-feature folder under `.forge/features/\u003cid\u003e/`. It then dispatches the next slash command in the chosen tier.\n\nPrefer to drive each phase manually? Each phase has its own slash command (`/forge:refine`, `/forge:research`, `/forge:spec`, `/forge:domain`, `/forge:scenarios`, `/forge:plan`, `/forge:crucible`, `/forge:review`, `/forge:execute`, `/forge:verify`, `/forge:ship`, `/forge:qa`). Research auto-runs on full tier; opt into it on standard with `/forge:do --standard --research \"\u003cidea\u003e\"`; focused tier refuses. Each command refuses to run unless the previous phase is complete — the on-disk state machine is the source of truth.\n\n---\n\n## Lifecycle\n\n```text\nrefine → research → spec → domain → scenarios → plan → crucible → review → execute → verify → ship → qa\n```\n\n| Phase | Output | Purpose |\n| --- | --- | --- |\n| **refine** | refined idea statement | Sharpen a vague idea into a single-feature scope |\n| **research** | `RESEARCH.md` | Codebase + external library discovery before the spec is locked. Auto-runs on `--full`; opt-in via `--research` on `--standard`; refused on `--focused`. Emits codebase findings, external docs (with grounding-mode citations), domain notes, and risks. |\n| **spec** | `SPEC.md` | Behavior contract: Intent, Domain, Scope, Scenarios, Acceptance, Open Questions |\n| **domain** | `DOMAIN.md` (full tier) — glossary, bounded contexts, aggregates, invariants | Ubiquitous language with auto-rendered bounded-context Mermaid; SPEC.md `# Domain` becomes a pointer |\n| **scenarios** | Gherkin scenarios in `SPEC.md` | BDD acceptance criteria; auto-escalates to `.feature` files when the project supports it |\n| **plan** | `PLAN.md` | Vertical slices and waves of parallelizable tasks; file-bound |\n| **crucible** | `UNDERSTANDING.md` | Adversarial ritual: assumptions inversion → adversarial Q\u0026A → pre-mortem |\n| **review** | `REVIEW.plan.md` / `REVIEW.code.md` | Layered self + heavy + cross-AI reviews with convergence loops |\n| **execute** | code + tests | Slice-isolated, subagent-bounded, wave-parallel implementation. TDD pairing enforced — every AC requires a `test(...)` commit before its `feat(...)` commit. |\n| **verify** | `VERIFICATION.md` | Three layers: code audit + scenario execution + conversational UAT |\n| **ship** | merged change + updated canonical spec or delta proposal | Reconcile feature spec with shipped capability spec. Pre-PR QA gate prompt available. |\n| **qa** | `QA.md` with verdict + confidence | Fresh-outsider black-box pass: acceptance + edge probing + adversarial + NR regrep. Terminal phase; archives on completion. |\n\nPhases can be skipped via flags or selected automatically by `/forge:do`, which estimates the right tier and routes accordingly.\n\n---\n\n## Tiers\n\n`--focused` means **narrow**, not necessarily fast. Pick the tier that matches the change's risk and surface area, not your patience.\n\n| Tier | Phases | Use when |\n| --- | --- | --- |\n| `--focused` | `spec → execute → verify` | One-file fixes, surgical changes, well-understood bugs |\n| `--standard` | `spec → scenarios → plan → crucible → review → execute → review → verify → ship → qa` | Most features; cross-file changes; non-trivial behavior |\n| `--full` | entire pipeline through `qa` | New subsystems, cross-cutting refactors, anything requiring deep research and DDD |\n\nThe standard tier runs review twice (against `PLAN.md`, then against the code diff). Focused finishes at `verify` (no ship, no qa — `/forge:ship` aborts on focused); standard and full both end in `qa`. The `forge-context-budget` and `forge-subagent-dispatch` skills enforce per-subagent token budgets at every dispatch; `tools.validate.tdd_evidence` enforces paired test/impl commits in execute; `tools.validate.qa_shape` enforces the QA artifact contract on ship/qa exit.\n\n---\n\n## TDD enforcement\n\n\u003e *\"TDD doesn't drive good design. TDD gives you the opportunity to think about good design every few minutes.\"*\n\u003e — **Kent Beck**, *Test-Driven Development: By Example*\n\nIn FORGE, TDD is **mechanical, not aspirational**. Every acceptance criterion in `SPEC.md` produces a paired commit sequence in execute:\n\n```text\ntest(\u003cscope\u003e): AC-3 failing — empty CSV returns InvalidInput\nfeat(\u003cscope\u003e): AC-3 — handle empty CSV in importer\n```\n\n`tools.validate.tdd_evidence` walks the execute-phase commit range and refuses to advance unless every AC has a `test(...)` commit chronologically *before* its matching `feat(...)` commit. ACs without paired tests block phase transition.\n\nExceptions exist (e.g., pure-config changes, generated artifacts) but require an explicit **TDD Exception ADR** in `decisions.md` with rationale. Override is auditable, not silent.\n\n---\n\n## The crucible\n\n\u003e *\"The first principle is that you must not fool yourself — and you are the easiest person to fool.\"*\n\u003e — **Richard Feynman**, Caltech commencement (1974)\n\nThe crucible is FORGE's most opinionated piece — an adversarial ritual run *after* planning and *before* execution:\n\n1. **Assumptions inversion.** Every load-bearing assumption is inverted: \"what if this is wrong?\"\n2. **Adversarial Q\u0026A.** The agent argues against the plan, surfacing the strongest objections.\n3. **Pre-mortem.** Imagine the change has shipped and failed — what failed and why?\n\nThe output is `UNDERSTANDING.md` — a record of shared understanding between you and the agent. Code that doesn't survive the crucible doesn't get written.\n\n---\n\n## Verification\n\n\u003e *\"Have the conversation. Then automate the conversation.\"*\n\u003e — **Dan North**, paraphrased from the BDD origin essays\n\nThree layers, all rolled into `VERIFICATION.md`:\n\n1. **Code audit.** Static review of the implementation against the spec.\n2. **Scenario execution.** Acceptance scenarios run against the actual code (BDD when supported, manual checklist otherwise).\n3. **Conversational UAT.** Structured back-and-forth with the user to confirm behavior matches intent.\n\nA feature ships only after all three layers pass.\n\n---\n\n## QA: black-box outsider pass\n\nAfter verify and before archive, FORGE runs a fresh outsider QA pass. The QA agent has only `SPEC.md` and an opaque `ArtifactDescriptor` (`{kind: cli|library|service|ui|other, identifier: \u003copaque string\u003e}`) — no implementation context, no test files, no plan. Four sections, each producing a status and findings phrased in user-facing terms:\n\n1. **Acceptance.** Does the shipped artifact deliver what `SPEC.md` promised, from a user's perspective? Verdict: `delivers | partial | does-not-deliver`.\n2. **Edge probing.** What would a normal user mistype, misuse, or stumble into?\n3. **Adversarial.** Capped red-team subagent (5 min walltime, 50 attempts) trying to break the feature.\n4. **NR regrep.** Re-greps the merged tree against every Negative Requirement in `SPEC.md` to catch re-introductions.\n\nThe QA artifact is `QA.md` with frontmatter `verdict` + `confidence` (high|partial|low), aggregated mechanically from the four section statuses. `tools.validate.qa_shape` enforces the contract.\n\n### Two timing modes\n\n- **Pre-PR gate (opt-in).** `/forge:ship` prompts: `\"Run QA before creating PR? [Y/n]\"` (default Y for `--standard`/`--full`, N for `--focused`). On accept, QA runs against the working tree at HEAD of the feature branch. Verdicts: `delivers` continues ship; `partial` re-prompts the user; `does-not-deliver` blocks PR creation. `--qa-override-with-rationale \"\u003creason\u003e\"` records an ADR'd override in `decisions.md`.\n- **Post-merge phase (terminal).** `/forge:qa --against merged` runs against the merged artifact after ship. Terminal phase for both `--standard` and `--full`; `--focused` never reaches ship or qa. Phase flips `state.json.phases.qa.status` to `done` and triggers archive.\n\nThe skill is the same in both timings; only the `--against` flag differs.\n\n---\n\n## Research phase\n\nBefore locking a spec for non-trivial work, the research phase emits a `RESEARCH.md` with four sections: codebase findings (top-level layout, modules touched, extension points), external docs (library-by-library, with citation), domain notes (glossary candidates), and risks surfaced. The phase auto-runs on `--full`; opt into it on `--standard` with `/forge:do --standard --research \"\u003cidea\u003e\"`; `--focused` refuses.\n\nExternal library docs are gathered through one of five grounding modes:\n\n| Mode | When | Citation format |\n| --- | --- | --- |\n| `full` | Context7 MCP server installed and reachable | `[context7:\u003clibrary_id\u003e:\u003csnippet_id\u003e]` |\n| `degraded` | No Context7, no BYOD coverage, no WebSearch fallback | none — explicit unavailable marker required in body |\n| `websearch` | Opt-in via `.forge/config.json` `research.websearch_fallback: true` (privacy implication; sends queries externally) | `[websearch:\u003curl\u003e]` |\n| `byod` | All extracted libraries have files at `.forge/external-docs/\u003clibrary\u003e.md` | `[byod:\u003clibrary\u003e:\u003csection-anchor\u003e]` |\n| `byod-partial` | Mixed: some libraries covered locally, some missing | mixed; uncovered libraries fall through to the degraded rule |\n\nThe bring-your-own-docs (BYOD) pattern lets air-gapped repos pre-stage authoritative docs locally — drop a markdown file at `.forge/external-docs/\u003clibrary\u003e.md` and the research subagent reads it as the citation source. Files older than `research.byod_stale_days` (default 90) emit a staleness warning.\n\nEcosystem detection is **pluggable**: out of the box, FORGE recognizes Python, Node, Rust, Go, Ruby, Java, .NET, Elixir, PHP, Swift, and Dart manifests. Polyglot repos (e.g., Node frontend + Python backend) return multiple ecosystem records; library extraction unions across them. Repos using an ecosystem FORGE doesn't recognize fall back to a generic dir-walk and a one-time prompt to pin the ecosystem via `.forge/config.json`. Adding support for a new ecosystem is a single-file plugin — no skill prose changes.\n\nThe grounding mode and BYOD coverage are recorded in the RESEARCH.md frontmatter and surfaced in the ship-time risk summary.\n\n---\n\n## Cross-AI peer review\n\n`/forge:review --cross-ai` delegates the review pass to an external CLI (codex, claude, or gemini) for an independent second opinion. Two modes:\n\n- **Manual (default).** The skill builds a self-contained prompt (spec excerpt, diff, finding format), applies the redaction filter, prints a disclosure summary (file count, diff LOC, estimated tokens, estimated USD with ±50% precision, redaction summary), then writes the prompt to `.forge/features/\u003cid\u003e/cross-ai/\u003ctarget\u003e-\u003cts\u003e-prompt.md`. You dispatch the external CLI yourself; paste the response back via `/forge:review --cross-ai-paste \u003cpath\u003e`. Manual mode performs no external dispatch and does not consume the auto-mode dispatch-approval cache.\n- **Auto (opt-in).** With `cross_ai.mode: auto` in `.forge/config.json` or the `--auto` flag per invocation, the skill spawns the external CLI directly via `subprocess.run` and captures the response. First-run-per-repo requires you to type `APPROVE` (cached as `cross_ai.dispatch_approved_at`); a cost-warn gate fires every invocation when the estimated USD exceeds `cross_ai.cost_warn_threshold_usd` (default `$0.50`) and requires `APPROVE-COST` (or `--skip-cost-warn` with a `decisions.md` deviation row).\n\nRedaction runs deterministically before any prompt materialization. The default deny-list strips `.env`, credentials, secrets, `.aws/`, and `.ssh/` files entirely; user-extendable via `cross_ai.redaction.deny_globs`, `deny_regex`, and `fatal_regex` in `.forge/config.json`. `fatal_regex` matches refuse dispatch unconditionally — even in auto mode.\n\nFindings parsed from the external response are merged into `REVIEW.\u003ctarget\u003e.md` with `Source: external-\u003cmodel\u003e` and feed the existing convergence loop on the same 3-cycle cap.\n\n---\n\n## Constitution bootstrap\n\n`/forge:amend-constitution --bootstrap` seeds `.forge/CONSTITUTION.md` for a fresh repo via the `forge-bootstrap-constitution` skill. Python owns the bounded I/O surface; the skill owns the drafting turn.\n\n- `tools.constitution_amend.collect_bootstrap_signals(repo_root)` reads up to 8 priority-ordered manifest + doc files (`pyproject.toml`, `package.json`, `Cargo.toml`, `go.mod`, `Gemfile`, `pom.xml`, `build.gradle`, `mix.exs`, `composer.json`, `*.csproj`, `AGENTS.md`, `CLAUDE.md`, `README.md`). Each file caps at 16 KiB; total payload caps at 80 KiB. Path-level deny globs (`.env*`, `*.pem`, `*.key`, `id_rsa*`) and a content-level secret scan keep credentials out of the signal payload. No LLM call.\n- The skill drafts Constitution articles in-session from those signals — evidence-based, project-specific, no universal seed (typical 3–12 articles).\n- `tools.constitution_amend.validate_drafted_markdown` gates the draft (frontmatter shape, required fields, per-article body cap of 1153 words, zero-article refusal). `tools.constitution_amend.persist_drafted_constitution` atomically writes the Constitution + decisions.md ADR; append failures roll both files back to the pre-call state.\n\nThe user steers via a sequential `AskUserQuestion` selector: `[a]ccept | [r]efine | [e]dit-in-editor | [s]kip | [c]ancel`. Refine loops up to 5 rounds (one clarifying question per turn), then degrades to accept-or-edit-or-cancel. Each question is its own turn — no batched lists.\n\n---\n\n## Convention routing\n\n`AGENTS.md` and `CLAUDE.md` prose conventions are honor-system until they get a mechanical enforcement surface. `/forge:amend-constitution --resync-agents` routes each MUST / SHOULD / SHALL / forbidden pattern to the mechanism that can actually catch it:\n\n| Mechanism | What it sees | Example fit |\n| --- | --- | --- |\n| **hook-enforced** | dispatch payload / tool input pre-call | dispatch-brief citation rule, files_in_scope shape |\n| **validator-enforced** | repo artifacts (commits, plans, state, review files) post-action | git commit shape, frontmatter, spec semantic |\n| **reviewer-tagged** | diff + Constitution articles, free-text findings | API-shape rules, naming conventions |\n| **advisory** | dispatch context only | tone, style preferences |\n\nThree concrete surfaces:\n\n1. **`.forge/conventions.json`** carries pattern-based rules: `{id, source_file, source_line, pattern_kind, pattern, scope, severity}` where `pattern_kind ∈ {forbidden_text, required_text, filename_glob_forbidden}`, `scope ∈ {commit_body, diff, dispatch_brief}`, severity from `{BLOCK, HIGH, MEDIUM, LOW, WARN}`. `python -m tools.validate --target conventions` checks `commit_body` and `diff` scopes during validation; `hooks/check_conventions.py` also enforces `diff`-scoped `BLOCK` rules at `Write` / `Edit` / `MultiEdit` PreToolUse time. `hooks/check_budget.py` enforces `dispatch_brief` scope at `Agent` PreToolUse time — `BLOCK` or `HIGH` denies the dispatch with the rule id in the deny reason.\n\n2. **`python -m tools.validate --target git-conventions \u003cfeature-folder\u003e`** validates every commit in `state.commits[]` against `.forge/config.json:git_conventions` — subject length, Conventional Commits grammar, scope allowlist, trailer ban patterns. Subject violations are `HIGH`; trailer ban hits are `BLOCK`; missing SHAs (shallow clone, force-push, garbage-collected) downgrade to `WARN`. `tools.ship_gate.partition_git_conventions` buckets the findings (BLOCK/HIGH → gate, MEDIUM → warn, LOW/WARN → info) so `/forge:ship` blocks on commit-shape regressions.\n\n3. **`forge-resync-agents` skill** drafts a convention inventory from `AGENTS.md` + `CLAUDE.md` + `README.md`, classifies each pattern by mechanism (hook / validator / reviewer-tag / advisory), then writes accepted rules to `.forge/conventions.json` via `tools.constitution_amend.append_conventions_entries` (atomic JSON merge + decisions.md ADR + rollback on append failure). Reviewer-tag entries route the user to `/forge:amend-constitution` to add a Constitution article; advisory entries log to decisions.md so honor-system status is explicit.\n\nConstitution articles also flow through the same enforcement chain: `forge-review` tags violations as `[constitution:A\u003cn\u003e]` in REVIEW findings; `tools.ship_gate` blocks ship on unresolved tagged rows above the article's level.\n\n---\n\n## Trap memory\n\nCross-feature trap memory keeps subagent regressions from recurring across features. `.forge/intel/lessons.md` is a parser-validated artifact carrying entries:\n\n```markdown\n## L007 — async fixture teardown leaks DB sessions\n**Captured:** 2026-05-11 from feature m8-p0-substrate\n**Resolved by:** abc1234...def890\n**Trap:** Async DB fixture used module scope; sessions leaked across tests.\n**Avoidance:** Use function scope; explicit teardown.\n**Tags:** fixtures, async\n**Severity:** HIGH\n**Status:** active\n```\n\nTags come from a controlled vocabulary (`imports`, `fixtures`, `state-mutation`, `async`, `secrets`, `validation`, `dispatch`, `review-tagging`, `ship-gate`, `cross-ai`, `bdd`, `frontmatter`) — free-form tags are rejected by the parser. Status transitions (`active`, `retired`, `superseded-by:L\u003cNNN\u003e`) go through `tools.intel.lessons.amend_status` with chain detection.\n\nThree integration points:\n\n1. **Auto-harvest.** `forge-review` Step 7 surfaces a capture prompt when a finding flips to `Status: resolved` AND the new `Resolved by` cell carries a 40-hex SHA AND severity is `HIGH` or `BLOCK`. The reviewer drafts a Lesson with tags matched from the controlled vocabulary; the user accepts, edits, or skips. `accepted-risk`, `spec-edit`, and `plan-edit` resolutions never harvest — no fix-with-SHA means no trap-with-fix pair to learn from.\n\n2. **Manual capture.** `/forge:lesson` opens the `forge-lesson` skill for repo-wide trap authoring — useful when a lesson surfaced outside a feature or when the resolution wasn't a SHA. The skill walks sequential `AskUserQuestion` turns (trap, avoidance, severity, captured-from, tags, accept) and writes via `tools.intel.lessons.append`. `Resolved by: manual`.\n\n3. **Dispatch injection.** `forge-spec`, `forge-plan`, and `forge-execute` call `tools.intel.lessons.load_and_filter(repo_root, idea_text, files_in_scope)` and pass filtered `traps[]` into every subagent's `context_budget`. The filter scores tag-intersection plus idea-text overlap, drops `retired` and `superseded-by:*` lessons, and caps total payload at `MAX_LESSON_WORDS = 600` so a heavy trap load cannot squeeze CRITICAL Constitution articles out of the budget. Articles and lessons share a percentile + cap helper (`tools.intel._relevance.score_and_trim`) but carry separate word budgets.\n\nReviewers tag lesson violations as `[lesson:L\u003cNNN\u003e]` in REVIEW findings; `tools.ship_gate.partition_by_lesson_severity` routes lesson severities to the same gate / warn / info buckets as articles (CRITICAL→BLOCK, HIGH→HIGH, MEDIUM→MEDIUM, LOW→LOW). Validator: `python -m tools.validate --target lessons`.\n\nThe REVIEW.md template gained a `Resolved by` column to make the harvest trigger deterministic — empty / 40-hex SHA / `spec-edit` / `plan-edit` / `accepted-risk:\u003creason\u003e`. Missing column on legacy reviews is tolerated; harvest only fires when the column carries a SHA.\n\n---\n\n## Per-feature artifacts\n\n\u003e *\"The heart of software is its ability to solve domain-related problems for its user. All other features, vital though they may be, support this central purpose.\"*\n\u003e — **Eric Evans**, *Domain-Driven Design* (2003)\n\nEvery FORGE feature lives in `.forge/features/\u003cid\u003e/` with a small set of contracts:\n\n- `SPEC.md` — the behavior contract\n- `RESEARCH.md` — codebase + external library discovery; full tier (auto) and standard with `--research` (opt-in). Carries grounding mode (`full` / `degraded` / `websearch` / `byod` / `byod-partial`) in frontmatter.\n- `DOMAIN.md` — full-tier source of truth for glossary, bounded contexts, aggregates, invariants. SPEC.md `# Domain` becomes a pointer to it.\n- `UNDERSTANDING.md` — output of the crucible\n- `PLAN.md` — file-bound vertical slices and waves\n- `REVIEW.plan.md` / `REVIEW.code.md` — per-target review findings and convergence cycles\n- `VERIFICATION.md` — three-layer verification record\n- `QA.md` — fresh-outsider black-box acceptance record (verdict + confidence + four sections)\n- `decisions.md` — running log of decisions and rationale (includes TDD Exception ADRs and QA Override ADRs)\n- `state.json` — phase / slice / wave state for resumption (carries `flow_version: 3` for v3 features)\n- `.forge/logs/\u003cfeature_id\u003e.jsonl` — optional local-only event log written via `tools.feature_log` when callers append events; gitignored, never sent over the network\n\nCanonical capability specs live in `.forge/specs/\u003ccapability\u003e/SPEC.md`. Feature specs are working artifacts and are merged or archived against canonical specs at ship time. Changes to shipped capabilities flow through OpenSpec-style delta proposals under `.forge/changes/\u003cid\u003e/proposal.md` via `/forge:change`.\n\nA project-wide `.forge/CONSTITUTION.md` carries CRITICAL / SHOULD / MAY articles. Each phase skill calls `tools.constitution.load_and_filter` to inject relevance-filtered articles into the dispatch context budget. The reviewer subagent tags violations as `[constitution:A\u003cn\u003e]` in REVIEW findings; `tools.ship_gate.parse_review_findings` partitions tagged rows by article level so unresolved CRITICAL or HIGH-mapped findings block `/forge:ship` until the user explicitly acknowledges them in `decisions.md`. Two cross-cutting surfaces extend this enforcement chain: `.forge/conventions.json` adds pattern-based hook / validator rules (see [Convention routing](#convention-routing)), and `.forge/intel/lessons.md` adds cross-feature trap memory (see [Trap memory](#trap-memory)). First-time drafting is skill-driven (see [Constitution bootstrap](#constitution-bootstrap)).\n\n### What the artifacts look like\n\n`SPEC.md` (excerpt):\n\n```markdown\n---\nfeature_id: 2026-05-10-fix-csv-import-error-handling\ntier: focused\nflow_version: 3\n---\n\n# Intent\nUsers importing malformed CSVs currently see a stack trace. We want a\nclear, actionable error message that names the offending row.\n\n# Scope\n- IN:  CSV importer error path\n- OUT: file-format detection, encoding negotiation\n\n# Acceptance\n- AC-1: empty CSV returns `InvalidInput(\"file is empty\")`.\n- AC-2: malformed row N returns `InvalidInput(\"row N: \u003creason\u003e\")`.\n- AC-3: well-formed CSV continues to import unchanged.\n\n# Negative requirements\n- NR-1: importer must not raise on user input — only return Result.\n```\n\n`state.json` (excerpt, focused tier mid-execute):\n\n```json\n{\n  \"feature_id\": \"2026-05-10-fix-csv-import-error-handling\",\n  \"tier\": \"focused\",\n  \"current_phase\": \"execute\",\n  \"phases\": {\n    \"spec\":    { \"status\": \"done\",        \"completed_at\": \"2026-05-10T10:14:02Z\" },\n    \"execute\": { \"status\": \"in_progress\", \"started_at\":   \"2026-05-10T10:18:55Z\" },\n    \"verify\":  { \"status\": \"pending\" }\n  },\n  \"skipped\":    [{ \"phase\": \"research\", \"reason\": \"research deferred; manual research acceptable\" }],\n  \"deviations\": [],\n  \"commits\": [\n    { \"sha\": \"abc1234\", \"phase\": \"spec\",    \"subject\": \"spec(csv-import): draft acceptance criteria\" },\n    { \"sha\": \"def5678\", \"phase\": \"execute\", \"subject\": \"test(csv-import): cover empty CSV path\" }\n  ]\n}\n```\n\n`flow_version` is omitted here on purpose — it is a post-ship migration sentinel (added by `tools.state.migrate_to_v3` only after a feature ships) and is never present on a focused-tier mid-execute state.\n\n---\n\n## Use outside Claude Code\n\n[`AGENTS.md`](AGENTS.md) at the repo root is a portable discovery manifest with the canonical command + skill list. Cursor, Aider, and Codex consume the same plain-markdown skills and commands. The markdown source is portable today; see [Host Compatibility](#host-compatibility) for which mechanisms are automatic in each host.\n\n---\n\n## Host Compatibility\n\nFORGE's markdown artifacts are portable across hosts, but automatic enforcement depends on the host integration model. Read each row as the behavior you get for that capability in that host.\n\n| Capability | Claude Code | Cursor | Aider | Codex | plain shell |\n| --- | --- | --- | --- | --- | --- |\n| State guard | Enforced | Advisory | Advisory | Advisory | N/A |\n| Context budget | Enforced | Advisory | Advisory | Advisory | N/A |\n| Diff conventions guard | Enforced | Advisory | Advisory | Advisory | N/A |\n| PreToolUse redaction | Enforced | Advisory | Advisory | Advisory | N/A |\n| PostToolUse audit | Enforced | Advisory | Advisory | Advisory | N/A |\n| Lesson harvest | Enforced | Advisory | Advisory | Advisory | N/A |\n| Session recovery hint | Enforced | Advisory | Advisory | Advisory | N/A |\n| Execute progress monitor | Enforced | Missing | Missing | Missing | N/A |\n| Bin wrappers (forge-do, forge-state, forge-validate) | Enforced | Advisory | Advisory | Advisory | Enforced |\n| Plugin agents (`forge-research`, `forge-review`, `forge-execute`, `forge-verify`, `forge-qa`) | Enforced | Missing | Missing | Missing | N/A |\n\nLegend:\n\n- `Enforced`: host runs the mechanism automatically; violations are blocked or surfaced.\n- `Advisory`: host does not run the mechanism, but the contract is documented in skill prose/artifacts and can be applied by a reviewer or downstream tool.\n- `Missing`: mechanism is not available in that host.\n- `N/A`: mechanism does not apply to that host's interaction model.\n\nRuntime support:\n\n| Surface | Status | Notes |\n| --- | --- | --- |\n| Python | 3.12+ required | `tools/` uses 3.12 syntax (`type` aliases, PEP 695) |\n| OS | Linux / macOS; Windows via WSL | tested on Linux + macOS; native Windows untested |\n\n---\n\n## Configuration\n\nPer-feature state lives in `.forge/features/\u003cid\u003e/state.json` (created by `/forge:do` or `/forge:spec`). Project-wide configuration lives in `.forge/config.json`:\n\n- `cross_ai.*` — cross-AI review provider, mode (`manual` / `auto` / `disabled`), `allowed_clis`, `timeout_seconds`, `max_prompt_tokens`, `cost_warn_threshold_usd`, redaction lists. Schema: `schemas/cross-ai-config.schema.json`.\n- `research.*` — research grounding fallbacks: `websearch_fallback`, `websearch_max_queries_per_run`, `byod_stale_days`, `ecosystems` pin, `ecosystem_overrides`. Schema: `schemas/research-config.schema.json`.\n- `git_conventions.*` — commit subject + scope + trailer rules consumed by `python -m tools.validate --target git-conventions`.\n\nRun `/forge:validate --target \u003cname\u003e` (or the equivalent `python -m tools.validate --target \u003cname\u003e`) to invoke the validator from inside Claude Code.\n\nDefault-tier and context-budget overrides remain on the roadmap.\n\nThe tooling itself (state machine, frontmatter linter, schema validator, archive helpers) is a small Python package shipped in `tools/`.\n\n---\n\n## Project layout\n\n```text\n.\n├── .claude-plugin/plugin.json   Claude Code manifest\n├── AGENTS.md                    portable discovery manifest\n├── README.md                    you are here\n├── commands/                    /forge:* slash commands\n├── skills/                      ambient + invokable skills\n├── hooks/                       PreToolUse hooks (budget, state, conventions)\n├── templates/feature/           per-feature artifact templates\n├── schemas/                     JSON Schemas for state and frontmatter\n├── tools/                       Python: state machine, linters, schema validator\n└── tests/                       unit + smoke + reference fixtures\n```\n\n---\n\n## Comparison vs alternatives\n\n| Tool | Niche | How FORGE differs |\n| --- | --- | --- |\n| **Aider** | terminal AI pair-programmer | Aider is a smart edit loop; FORGE is a phased lifecycle around any agent. Use Aider *inside* an execute slice if you like; FORGE governs the slice. |\n| **OpenSpec** | spec-as-code with delta proposals | FORGE adopts OpenSpec-style deltas (`/forge:change`) but adds a full pre-spec lifecycle (refine, domain, scenarios, crucible) and post-spec verification + QA. |\n| **GitHub Spec Kit** | spec → plan → tasks scaffolding | Spec Kit covers spec/plan/tasks; FORGE adds adversarial crucible, mechanical TDD pairing, three-layer verify, fresh-outsider QA, and on-disk state machine. |\n| **BMAD-Method** | role-played agent orchestration | BMAD scripts agent personas; FORGE encodes a state machine over artifacts. Personas optional. |\n| **Plain TDD/BDD/DDD** | discipline as principle | FORGE is the union of these as a single mechanical workflow, with the validators wired in. |\n\nIf you want **fast unstructured AI editing**, use Aider/Cursor directly. If you want **disciplined, resumable, auditable** AI work where the cost of building the wrong thing is high — FORGE.\n\n---\n\n## Security\n\nFORGE persists artifacts on disk and via git. Treat them like source code:\n\n- **`state.json.routing.idea` stores your prompt verbatim.** Do not paste secrets, API keys, customer PII, or internal hostnames into `/forge:do`. The text is committed alongside other `.forge/` artifacts.\n- **`SPEC.md`, `PLAN.md`, `decisions.md`, `QA.md` are committed.** Anything you tell the agent about the system ends up in git history. Use `.gitignore` patterns under `.forge/features/` for sensitive features.\n- **`.forge/logs/\u003cfeature_id\u003e.jsonl`** is gitignored and never sent over the network — local-only event log.\n- **Cross-AI review** sends review artifacts to a third-party model provider (codex, claude, or gemini). Manual mode (default) writes the prompt to disk for user-driven dispatch; auto mode (opt-in via `cross_ai.mode: auto` or `--auto`) spawns the external CLI directly. `tools.redaction` ships two default layers:\n  - **File globs** (`DEFAULT_DENY_GLOBS`): `.env`, credentials, secrets, `.aws/`, `.ssh/` files are excluded from prompts.\n  - **Inline regexes** (`DEFAULT_DENY_REGEX`): every prompt body is scanned for AWS access keys, GitHub PATs (classic + fine-grained), Anthropic / OpenAI / Gemini API keys, Slack tokens, PEM private-key markers, `Authorization: Bearer ...` headers, and `UPPER_CASE = ...` env-line declarations carrying `SECRET` / `TOKEN` / `API_KEY` / `PASSWORD`. Matches are replaced with `[REDACTED:\u003cidx\u003e]` markers and an unredacted-but-quoted stderr WARN is emitted (operator-visible). False positives are accepted as a deliberate trade-off — a redaction marker costs nothing, a leaked key costs everything.\n\n  Redaction is best-effort — review the prompt file in manual mode before sending. `cross_ai.redaction.fatal_regex` matches refuse dispatch unconditionally. Tighten `deny_globs` / `deny_regex` for repo-specific secrets; user-supplied patterns pass through a 256-char cap + nested-unbounded-quantifier ReDoS sanity filter at load time.\n- **Research grounding fallbacks** can leak refined-idea text externally: `research.websearch_fallback: true` sends queries (and refined-idea tokens, filtered through `tools.redaction`) to a web search provider. Disabled by default. BYOD (`.forge/external-docs/\u003clib\u003e.md`) is air-gapped — no network calls.\n- **Constitution and most conventions are enforced at review and ship time.** The reviewer subagent tags violations; `tools.ship_gate` blocks ship on unresolved tagged findings; `hooks/check_budget.py` denies dispatches that fail `dispatch_brief` convention checks. Diff-scoped `BLOCK` conventions also run at code-write time through `hooks/check_conventions.py`. Treat the gates as defense in depth — they catch what slipped past, not every possible unsafe change.\n- **State-writer hook** (`hooks/check_state_writer.py`) refuses direct `Write` / `Edit` / `MultiEdit` against `.forge/features/\u003cid\u003e/state.json`. The hook is narrow by design: SPEC.md, PLAN.md, decisions.md, and other per-feature artifacts are NOT protected at the tool boundary because operators routinely hand-edit them; their shape is owned by `python -m tools.validate`. Symlink traversal is documented design — the literal path is what the hook judges. The threat model assumes a cooperating agent, not a hostile actor with write access to the repo.\n\nReport security issues privately via GitHub Security Advisories on the repo.\n\n---\n\n## Contributing\n\nFORGE is in early active development. Issues and feedback are welcome.\n\n**Dev loop:**\n\n```bash\nmake install      # creates .venv, installs forge-tools[dev]\nmake check        # ruff + mypy --strict + pytest + validate-health (run before every commit)\nmake format       # apply ruff formatter\nmake test         # pytest only\nmake typecheck    # mypy strict only\npython -m tools.validate --target health   # planning-directory health\n```\n\n**Conventions:**\n\n- Python 3.12+, ruff (lint + format), mypy `--strict`, pytest.\n- [Conventional Commits](https://www.conventionalcommits.org/) with required scopes (e.g., `feat(routing):`, `fix(archive):`, `test(state):`). Atomic commits — one logical change per commit.\n- No `Co-Authored-By: Claude` trailers in commit messages.\n- No internal planning labels in code, docstrings, comments, or commit messages — those belong in the PR description.\n- All planning artifacts (`SPEC.md`, `PLAN.md`, etc.) must pass `python -m tools.validate`.\n- For larger features, follow FORGE's own lifecycle (`/forge:do`).\n\nSee [`AGENTS.md`](AGENTS.md) for the canonical command/skill manifest and contributor guidance.\n\n---\n\n## License\n\nMIT — see [LICENSE](LICENSE).\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fmikeleppane%2Fforge","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fmikeleppane%2Fforge","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fmikeleppane%2Fforge/lists"}