https://github.com/layer1labs/specsmith

AEE governance CLI for AI-assisted dev — WI lifecycle (specsmith wi), preflight gates, multi-agent dispatch, requirements<->test traceability, ESDB, MCP server, compliance, 64 project types, 136 skills, and 1607 tests.
https://github.com/layer1labs/specsmith

aee agentic-ai ai-governance applied-epistemic-engineering cli compliance epistemic-engineering esdb fpga llm mcp mcp-configs ollama python rag requirements-management skills traceability wi-lifecycle work-items

Last synced: 27 days ago
JSON representation

Host: GitHub
URL: https://github.com/layer1labs/specsmith
Owner: layer1labs
License: mit
Created: 2026-03-31T21:19:09.000Z (4 months ago)
Default Branch: main
Last Pushed: 2026-06-26T13:13:15.000Z (about 1 month ago)
Last Synced: 2026-06-26T13:23:01.034Z (about 1 month ago)
Topics: aee, agentic-ai, ai-governance, applied-epistemic-engineering, cli, compliance, epistemic-engineering, esdb, fpga, llm, mcp, mcp-configs, ollama, python, rag, requirements-management, skills, traceability, wi-lifecycle, work-items
Language: Python
Homepage: https://specsmith.readthedocs.io
Size: 62.1 MB
Stars: 8
Watchers: 1
Forks: 1
Open Issues: 1
Metadata Files:
- Readme: README.md
- Changelog: CHANGELOG.md
- Contributing: CONTRIBUTING.md
- Funding: .github/FUNDING.yml
- License: LICENSE
- Code of conduct: CODE_OF_CONDUCT.md
- Codeowners: .github/CODEOWNERS
- Security: SECURITY.md
- Governance: docs/governance/CONTEXT-BUDGET.md
- Roadmap: ROADMAP.md
- Maintainers: MAINTAINERS.md
- Agents: AGENTS.md

Awesome Lists containing this project

README

# specsmith

[![CI](https://github.com/layer1labs/specsmith/actions/workflows/ci.yml/badge.svg)](https://github.com/layer1labs/specsmith/actions/workflows/ci.yml)
[![Sponsor](https://img.shields.io/badge/sponsor-%E2%9D%A4-ea4aaa?logo=github)](https://github.com/sponsors/layer1labs)
[![Docs](https://readthedocs.org/projects/specsmith/badge/?version=stable)](https://specsmith.readthedocs.io/en/stable/)
[![PyPI](https://img.shields.io/pypi/v/specsmith?label=stable&style=flat&color=blue&cacheSeconds=60)](https://pypi.org/project/specsmith/)
[![Python 3.10+](https://img.shields.io/badge/python-3.10%2B-blue.svg)](https://www.python.org/downloads/)
[![License: MIT](https://img.shields.io/badge/license-MIT-green.svg)](https://github.com/layer1labs/specsmith/blob/main/LICENSE)

SpecSmith is the governance layer for AI-assisted development: it sits between agents and your repo, enforces preflight decisions, and records requirement/test traceability with auditable evidence. It is **not** an IDE, autonomous coding agent, CI runner, or legal-compliance certifier. Use SpecSmith when changes need repeatable controls, work-item lineage, and review-ready artifacts; do not use it for throwaway prototyping where governance overhead is unnecessary. Compared with GitHub Spec Kit, OpenSpec, and BMAD, SpecSmith adds execution-time policy gates and trace chains. Compared with Aider, Claude Code, and Cursor, SpecSmith governs those clients instead of replacing them. Compared with LangGraph and AutoGen, SpecSmith prioritizes software-governance outcomes and evidence quality over general-purpose multi-agent orchestration.

## Architecture at a glance

```
AI Agents / IDE Clients
|
v
SpecSmith Governance Layer
├── Repository Files
├── Requirements and Tests ──> CI and MCP Integrations
└── ESDB / Audit Ledger ──> CI and MCP Integrations
```

## When to use / when not to use

- Use when you need governed AI development, auditable decision trails, and requirement-to-test linkage.
- Avoid when rapid local prototyping is the only goal and formal governance is unnecessary.

## Comparison summary

- **GitHub Spec Kit / OpenSpec / BMAD:** strong specification practices; SpecSmith adds execution-time governance, work-item lifecycle control, and trace-chain evidence.
- **Aider / Claude Code / Cursor:** agentic coding interfaces; SpecSmith is the policy and evidence layer around these clients.
- **LangGraph / AutoGen:** orchestration frameworks; SpecSmith is a governance-first development layer with compliance-oriented traceability.

### Governance efficiency benchmark

We ran a [multi-condition benchmark](https://specsmith.readthedocs.io/en/stable/efficiency-benchmark/) comparing specsmith governance against 11 alternatives (ungoverned, BMAD, Cursor rules, Copilot, Aider, Cline, Codex CLI, OpenSpec, Agile BDD/TDD, and context injection) across real coding tasks with gpt-4o-mini and gpt-5.5.

| Condition | Pass Rate | Mean Tokens | Cost/run | Cost-of-Pass |
|---|---|---|---|---|
| Ungoverned (raw agent) | 0% on T1 | 44.6k | $0.0079 | ∞ |
| Context injection (CLAUDE.md) | 100% | 43.7k | $0.0084 | $0.0084 |
| BMAD-style structured prompting | 50% | 139.1k | $0.0262 | $0.0523 |
| **specsmith LIGHT (preflight)** | **100%** | **21.1k** | **$0.0032** | **$0.0032** |
| **specsmith FULL (governed)** | **100%** | **17.1k** | **$0.0026** | **$0.0026** |

**Key findings:** specsmith FULL is the only condition to achieve 100% pass rate on the feature-addition task (T1). It uses 2.6× fewer tokens than ungoverned and produces a cost-of-pass 3.2× lower than the next-best alternative. With gpt-5.5, governance reduces cost-of-pass by **6.3×** ($0.028 vs $0.179).

See the [full benchmark report](https://specsmith.readthedocs.io/en/stable/efficiency-benchmark/) and [model comparison (gpt-4o-mini vs gpt-5.5)](https://specsmith.readthedocs.io/en/stable/model-comparison/).

**v0.20.0** — Native Warp integration: `specsmith integrate warp` scaffolds `.warp/` MCP + launch configs and a Warp-aware `specsmith run` banner (REQ-444). Plus VRAM-aware local model recommendations: `specsmith local-model recommend` prints a per-role lineup (default / fast / harder pass / general) with a `fits`/`tight`/`spills` fit assessment (REQ-445).

**v0.19.x** — `specsmith wi link-test`, the governance-YAML content auditor and sync markdown-reconcile warnings, and a HuggingFace provider + 15-model multi-provider benchmark matrix for GovernanceBench.

**v0.18.0** — ESDB-first dual-write architecture (every governance event is written to ESDB alongside the append-only `LEDGER.md`), the `specsmith inspect` session-start governance block, and a token-pricing / cost-of-pass module.

**v0.17.x** — Canonical `docs/SPECSMITH.yml` scaffold path adopted across every CLI command, with CodeQL alerts driven to zero.

specsmith ships a full compliance and auditability layer aligned to the EU AI Act (2024/1689) and the NIST AI Risk Management Framework 1.0. Every agent action is cryptographically sealed, every AI-generated output is disclosed, context windows are GPU-aware, and compliance settings are configurable per-session and per-project.

### Selected CLI highlights

```bash
specsmith governance-serve --port 7700 # governance REST API
specsmith sync # YAML → JSON → MD (YAML-first mode)
specsmith generate docs # regenerate REQUIREMENTS.md + TESTS.md
specsmith validate --strict # dup IDs, orphans, coverage gaps
specsmith agent permissions-check git_push # tool permission gate (REQ-012)
specsmith ollama gpu # detect GPU VRAM, recommend context size
specsmith local-model recommend # VRAM-aware model lineup (fits/tight/spills)
specsmith integrate warp # scaffold Warp-native governance (MCP + launch config)
specsmith export # generate full compliance report

# Update channels
specsmith channel set stable # pin to stable releases
specsmith channel set dev # opt in to pre-release builds

# ESDB lifecycle
specsmith esdb export --json # dump records to JSON snapshot
specsmith esdb backup # create timestamped snapshot
specsmith esdb compact # WAL compaction

# Skills
specsmith skills deactivate # set active=false
specsmith skills delete --yes # permanently remove

# MCP + agent dispatch
specsmith mcp generate "Search USPTO patents" --json
specsmith agent ask "show esdb status" --json-output
```

It also co-installs the standalone `epistemic` Python library for direct use in any project:

```python
from epistemic import AEESession # works in any Python 3.10+ project
from epistemic import BeliefArtifact, StressTester, CertaintyEngine
```

> **Library vs CLI:** The `specsmith` CLI requires pipx for isolation. The `epistemic` library
> (and `specsmith.esdb`) work in any venv — `pip install specsmith` is all you need for
> library-only use. The pipx guard only fires on CLI invocations.

---

## What is Applied Epistemic Engineering?

AEE treats requirements, decisions, and assumptions — the beliefs your project depends on — as
engineering artifacts subject to the same discipline as code: version control, testing, and refactoring.

**The 4-step core method: Frame → Disassemble → Stress-Test → Reconstruct**

**The 5 foundational axioms:**
1. **Observability** — every belief must be inspectable
2. **Falsifiability** — every belief must be challengeable
3. **Irreducibility** — beliefs decompose to atomic primitives
4. **Reconstructability** — every failed belief can be rebuilt
5. **Convergence** — stress-test + recovery always reaches Equilibrium

---

## The AEE Workflow — 7 Phases

specsmith tracks your project through the full AEE development cycle:

```
🌱 Inception → 🏗 Architecture → 📋 Requirements → ✅ Test Spec
→ ⚙ Implementation → 🔬 Verification → 🚀 Release
```

```bash
specsmith phase # show current phase + readiness checklist
specsmith phase next # advance to the next phase (runs checks first)
specsmith phase set requirements # jump to a specific phase
specsmith phase list # list all phases
```

The current phase is persisted in `scaffold.yml` as `aee_phase`. Each phase has a checklist
of file/command criteria, recommended commands, and a readiness percentage.

## 1.0 release criteria status

| Criterion | Status | Source |
|---|---|---|
| Stable CLI core contract documented | In progress | `docs/stability.md` |
| Stable generated file schemas documented | In progress | `docs/stability.md` |
| Stable MCP tool schemas documented | In progress | `docs/stability.md` |
| Migration tests linked (#218) | In progress | `docs/roadmap/1.0-criteria.md` |
| Security threat model documented | In progress | `docs/security-threat-model.md` |
| Docs/tutorial/glossary baseline complete | In progress | `docs/roadmap/1.0-criteria.md` |
| Upgrade path and changelog criteria defined | In progress | `docs/roadmap/1.0-criteria.md` |

---

## Install

**Recommended — via pipx (CLI + CI):**

```bash
pipx install specsmith
```

That's it. `specsmith audit`, `preflight`, `sync`, `checkpoint`, `esdb`, `mcp serve`,
and all governance commands work immediately with no additional packages.

> **Want `specsmith run` with a cloud LLM?** Inject the provider SDK only if you use
> the built-in agentic REPL with a cloud API key:
>
> ```bash
> pipx inject specsmith anthropic # if you set ANTHROPIC_API_KEY
> pipx inject specsmith openai # if you set OPENAI_API_KEY
> pipx inject specsmith google-genai # if you set GOOGLE_API_KEY
> ```
>
> Ollama works out of the box with no injection — specsmith uses stdlib HTTP.
> For Warp, Claude Code, Cursor, and Copilot, the AI client provides the LLM;
> no injection needed.

**Library-only use (venv / conda / any Python environment):**

```bash
pip install specsmith # epistemic library + SQLite ESDB — no pipx needed
```

This makes `from epistemic import AEESession` and `from specsmith.esdb import SqliteStore`
immediately importable. The pipx isolation guard only applies to the `specsmith` CLI
command — not to library imports. Use this when you want the AEE belief-state machinery
in your own application without managing a pipx environment.

**ESDB — Epistemic State Database**
Terminology used in this repo is strict:
- **ESDB** = the specification/data model category.
- **SQLite backend** = the free/default ESDB implementation bundled in `specsmith`.
- **ChronoMemory** = the commercial package.
- **ChronoStore** = the backend engine/class provided by the ChronoMemory package.

| Tier | Package | License | What you get |
|------|---------|---------|-------------|
| **Default** | `specsmith` (built-in) | MIT, free | SQLite backend — requirements, test cases, confidence filtering |
| **Commercial** | `chronomemory` via `specsmith[esdb]` | Proprietary — license required (see [COMMERCIAL-LICENSE.md](https://github.com/layer1labs/specsmith/blob/develop/COMMERCIAL-LICENSE.md)) | ChronoMemory package (ChronoStore backend): tamper-evident SHA-256 WAL, OEA anti-hallucination fields, Rust acceleration, epistemic rollback |

See `docs/editions.md` for the full OSS vs commercial feature matrix.

`pip install specsmith` always installs the **free SQLite backend** automatically.
No additional packages, no license key, no configuration — it works out of the box.

```bash
specsmith esdb status # shows: SQLite (free, MIT) — active by default
```

**ESDB version-control policy (this repository):**
- Commit canonical SQLite state file: `.specsmith/esdb.sqlite3`.
- Commit canonical ChronoMemory state files: `.chronomemory/events.wal` and `.chronomemory/snapshot.json`.
- Do not commit ChronoMemory timestamped backup copies under `.chronomemory/backup/` (regenerated by `specsmith save` / `specsmith esdb backup`).

**Upgrading to ChronoMemory (ChronoStore backend, commercial):**

If you hold a chronomemory ESDB license, activate the commercial backend:

```bash
# Step 1 — install the chronomemory package
pip install "specsmith[esdb]" # installs chronomemory from PyPI
# or if using pipx:
pipx inject specsmith "chronomemory>=0.2.0" # inject into the specsmith pipx venv

# Step 2 — activate your license key
specsmith esdb enable --key-file /path/to/your.esdb.key
# The key is copied to ~/.specsmith/esdb.key and used automatically from now on.

# Step 3 — verify ChronoStore is active
specsmith esdb status
# ● ESDB — ChronoStore WAL (chronomemory commercial)
# ✔ License: your-org (expires YYYY-MM-DD)
```

To obtain a chronomemory ESDB license:
[licensing@layer1labs.ai](mailto:licensing@layer1labs.ai) · [ESDB licensing docs](https://specsmith.readthedocs.io/en/stable/esdb/#licensing)
· [ChronoMemory commercial terms](https://github.com/layer1labs/specsmith/blob/develop/COMMERCIAL-LICENSE.md)
See the [full ESDB docs](https://specsmith.readthedocs.io/en/stable/esdb/) for a feature comparison and Python API reference.

**Upgrading specsmith:**

```bash
pipx upgrade specsmith # preferred — upgrades the pipx-isolated CLI
specsmith self-update # alternative: self-update from within specsmith
```

---

## Quick Start

```bash
# 1. Install
pipx install specsmith

# 2. Start a new project (or import an existing one)
specsmith init # interactive scaffold wizard
specsmith import --project-dir ./my-project # adopt an existing repo

# 3. Bootstrap every session (run once at the start of each work session)
specsmith migrate run # apply any pending schema migrations
specsmith audit # verify governance health
specsmith sync # YAML → JSON → MD sync
specsmith checkpoint # emit GOVERNANCE ANCHOR

# 4. Before every code change: preflight
specsmith preflight "add paginated GET /todos endpoint" # gate the intent
# → decision: accepted | needs_clarification + work_item_id: WI-XXXXXXXX

# 5. After making changes: verify + save
specsmith verify # check equilibrium
specsmith save # commit governance state + push

# 6. Check AEE workflow phase and health
specsmith phase # current phase + readiness %
specsmith audit # full governance health check
```

> **Agentic REPL:** run `specsmith run` to start the Nexus governance-gated LLM REPL.
> Every utterance is preflighted automatically. Use `/why` to see the governance trace.
> For the multi-agent DAG dispatcher, see `specsmith dispatch run ""`.

### Standalone CLI (no AI agent)

All governance commands work without any AI agent or IDE integration:

```bash
specsmith audit # governance health check
specsmith preflight "" --json # gate a change
specsmith verify # check equilibrium after changes
specsmith save # ESDB backup + commit + push
specsmith kill-session # clean shutdown
```

For the full standalone session workflow, Nexus REPL usage, multi-agent dispatcher, and CI integration: **[Standalone CLI docs →](https://specsmith.readthedocs.io/en/stable/standalone-cli/)**

---

## Machine State Sync + YAML Governance

As of v0.11, specsmith uses **YAML-first governance**: `docs/requirements/*.yml`
and `docs/tests/*.yml` are the canonical sources. `REQUIREMENTS.md` and `TESTS.md`
are **generated artifacts** — do not hand-edit them.

```bash
# YAML-first pipeline (v0.11+)
specsmith sync # YAML → .specsmith/*.json → docs/*.md (all in one)
specsmith generate docs # regenerate only the Markdown artifacts from YAML
specsmith generate docs --check # dry-run: report what would change
specsmith validate --strict # enforce schema: dup IDs, orphans, missing fields
specsmith validate --strict --json # machine-readable validation result

# CI guard (already in .github/workflows/ci.yml)
specsmith sync --check # exits 1 if JSON cache is out of sync with YAML
```

**To add a new requirement**, edit the appropriate `docs/requirements/.yml`
file and run `specsmith sync`. **Never** hand-edit `docs/REQUIREMENTS.md` — it will
be overwritten by the next sync.

**Domain files:**

| File | REQ range | Domain |
|---|---|---|
| `docs/requirements/governance.yml` | REQ-001..064 | Core AEE governance |
| `docs/requirements/agent.yml` | REQ-065..129 | Nexus + CI |
| `docs/requirements/harness.yml` | REQ-130..160 | Slash commands + subagents |
| `docs/requirements/intelligence.yml` | REQ-161..220 | Instinct, eval, memory |
| `docs/requirements/context.yml` | REQ-244..247 | Context window |
| `docs/requirements/esdb.yml` | REQ-248..262 | ESDB + skills + MCP |
| `docs/requirements/ai_intelligence.yml` | REQ-263..299 | AI model intelligence |
| `docs/requirements/yaml_governance.yml` | REQ-300..312 | YAML governance layer |
| `docs/requirements/multiagent_compliance.yml` | REQ-313..320 | Multi-agent governance traceability |
| `docs/requirements/dispatch.yml` | REQ-321..334 | Multi-agent DAG dispatcher |
| `docs/requirements/esdb_full_coverage.yml` | REQ-395..402 | ESDB coverage gap-fill |
| `docs/requirements/esdb_first.yml` | REQ-403..422 | ESDB-first dual-write architecture |
| `docs/requirements/overflow.yml` | REQ-050, REQ-335..445 | VCS ops, skills catalog, session governance, Codity.ai, native Warp integration (REQ-444), VRAM-aware local-model recommendations (REQ-445) |

**Migration from Markdown-primary:**
`scripts/migrate_governance_to_yaml.py` once to convert an existing project.
Idempotent — safe to re-run.

## Least-Privilege Agent Permissions (REQ-012)

```bash
specsmith agent permissions # show active permission profile
specsmith agent permissions-check git_push # check if git_push is allowed
specsmith agent permissions-check git_push --no-log # dry-run (no ledger write)
```

Configure in `docs/SPECSMITH.yml`:
```yaml
agent:
permissions:
preset: standard # read_only | standard | extended | admin
# Or custom:
allow: [read_file, write_file, run_shell, git_status]
deny: [git_push, git_create_pr]
```

---

## AI Compliance & Governance

> **DISCLAIMER — Best-effort only.** specsmith is designed to *help* teams build
> auditable, explainable AI systems, but **it does not guarantee compliance** with
> any law or regulation. Regulations change frequently; final compliance determination
> is solely the responsibility of the end user. Layer1Labs makes no legal warranty.
> Found outdated coverage or a missing regulation?
> [Open a ticket](https://github.com/layer1labs/specsmith/issues) — we actively
> maintain the regulation database and welcome compliance PRs.

specsmith is designed from the ground up for **auditable, explainable, and human-overseen AI**.
It implements concrete compliance mechanisms mapped to the two major regulatory frameworks
that govern AI systems in production today.

### Standards Coverage

**EU AI Act (Regulation 2024/1689)** — The world's first comprehensive legal framework for AI,
enforced across the European Union. High-risk AI systems must provide transparency, auditability,
human oversight, and robustness. specsmith implements:

| EU AI Act Requirement | specsmith Mechanism |
|---|---|
| Art. 9 — Risk Management System | AEE verification loop with confidence scoring and equilibrium checks |
| Art. 12 — Logging & Record-Keeping | `TraceVault` SHA-256 chained ledger (tamper-evident, append-only) |
| Art. 13 — Transparency & Explainability | `ai_disclosure` block in every preflight response; `/why` in Nexus REPL |
| Art. 14 — Human Oversight | Human escalation threshold (`--escalate-threshold`); kill-switch CLI |
| Art. 15 — Accuracy & Robustness | Bounded retry (max 3×), confidence gates, hard context ceiling (REQ-247) |
| Art. 53 — GPAI Model Transparency | Provider + model name emitted in every `ai_disclosure` block |

**NIST AI Risk Management Framework 1.0 (AI RMF)** — The US standard for managing AI risk
across the AI lifecycle. specsmith addresses all four core functions:

| NIST AI RMF Function | specsmith Mechanism |
|---|---|
|| **GOVERN** — Policies & accountability | Governance rules, permissions profile, `scaffold.yml` policy |
| **MAP** — Risk identification | AEE stress-test, belief graph, contradictions and uncertainty metrics |
| **MEASURE** — Risk analysis | Confidence scoring, epistemic equilibrium, `specsmith epistemic-audit` |
| **MANAGE** — Risk treatment | Kill-switch, escalation, bounded retry, safe-write backup, permissions deny-list |

### How Each Compliance Mechanism Works

#### 1. Tamper-Evident Audit Log — `TraceVault` (REQ-206)

Every agent action, decision, milestone, and audit gate is recorded as a JSONL entry in
`.specsmith/trace.jsonl`. Each entry contains a SHA-256 hash of its own content plus the
hash of the previous entry, forming a cryptographic chain:

```jsonl
{"seq":1, "type":"DECISION", "description":"...", "hash":"a3f9...", "prev":"genesis"}
{"seq":2, "type":"MILESTONE", "description":"...", "hash":"7c2b...", "prev":"a3f9..."}
```

Any modification to a past entry breaks every subsequent hash. `specsmith trace verify`
detects and reports the first corrupted entry. The file is append-only — overwrites are
blocked by `safe_write`. This satisfies **EU AI Act Art. 12** (logging and record-keeping)
and **NIST AI RMF GOVERN** (accountability trail).

#### 2. AI Disclosure — Every Response (REQ-207)

Every preflight response includes a mandatory `ai_disclosure` block:

```json
{
"ai_disclosure": {
"governed_by": "specsmith",
"governance_gated": true,
"provider": "ollama",
"model": "qwen2.5:14b",
"spec_version": "0.20.0"
}
}
```

This ensures every AI-generated output is traceable to its source model and version,
meeting **EU AI Act Art. 13** (transparency) and **Art. 53** (GPAI transparency).
It is impossible to suppress — the field is injected at the governance layer before
any response is returned to the client.

#### 3. Human Escalation — Configurable Threshold (REQ-209)

When an action's confidence is below the escalation threshold, specsmith sets
`escalation_required: true` and includes an `escalation_reason` in the preflight payload.
AI clients that support MCP will surface this via the `governance_preflight` tool response.

```bash
specsmith preflight "deploy to production" --escalate-threshold 0.85 --json
# → escalation_required: true, escalation_reason: "confidence 0.71 < threshold 0.85"
```

This implements **EU AI Act Art. 14** (human oversight) and **NIST AI RMF MANAGE**.

#### 4. Kill-Switch — Immediate Session Termination (REQ-210)

A `kill-session` CLI command immediately terminates all active agent sessions and records
a timestamped kill event in `LEDGER.md`:

```bash
specsmith kill-session # terminate all sessions, log kill event
specsmith kill-session --session abc123 # terminate a specific session
```

This satisfies **EU AI Act Art. 14 §4** (ability to intervene and stop the AI system)
and is required for certification of high-risk AI systems.

#### 5. Append-Only Safe Write — `safe_write` (REQ-213)

All governance file writes go through `safe_write`, which:
- **Appends** to `LEDGER.md` and `.specsmith/ledger.jsonl` — never truncates
- **Backs up** any file before overwriting it (timestamped `.bak` copy)
- **Prevents** accidental destruction of audit history

This satisfies **EU AI Act Art. 12** (records must be kept for the lifetime of the system)
and provides recovery capability per **NIST AI RMF MANAGE**.

#### 6. Least-Privilege Permissions (REQ-217, REQ-012)

Every agent tool call is gated through a permission profile. Tools outside the active
profile are denied with exit code 3 and a ledger entry:

```bash
specsmith agent permissions-check git_push # exit 0 = allowed, exit 3 = denied
specsmith agent permissions # show active profile
```

Four built-in presets (`read_only`, `standard`, `extended`, `admin`) plus full
custom allow/deny lists in `.specsmith/config.yml`. This implements **NIST AI RMF GOVERN**
(policy enforcement) and principle of least privilege per standard security practice.

#### 7. Policy Guardrails — `is_safe_command` (REQ-220)

Before any shell command is executed, `agent.safety.is_safe_command()` classifies it
against a deny list of destructive patterns (`rm -rf`, `git push origin main`,
`kubectl apply`, `cat .env`, etc.). Denied commands are blocked and logged.
This implements **NIST AI RMF MANAGE** (risk treatment at the action level).

#### 8. Compliance Export Report (REQ-208, REQ-215)

`specsmith export` generates a full compliance report containing:
- **AI System Inventory** — all providers, models, and versions used
- **Risk Classification** — AEE phase, confidence scores, open work items
- **Human Oversight Controls** — active permission profile, escalation settings, kill-switch state
- **Audit Trail Summary** — TraceVault chain length, last verification, any tampering

```bash
specsmith export --format markdown > compliance-report.md
specsmith export --format json > compliance-report.json
```

This report is suitable as a starting point for audit evidence, but is **not a legal
certification**. Always verify with qualified counsel before regulatory submission.

### Compliance per Session and per Project

Compliance settings are layered:

1. **Global defaults** — `~/.specsmith/config.yml` (user-level defaults)
2. **Per-project policy** — `.specsmith/config.yml` (committed to the repo)
3. **Per-session overrides** — CLI flags

Compliance controls include: escalation threshold, permission profile, kill-switch, and
context window settings. Changes take effect immediately and can optionally be written back
to the per-project `.specsmith/config.yml`.

---

## Context Window Management

specsmith enforces safe, efficient use of LLM context windows — especially critical
when running local models via Ollama where the context limit directly affects GPU VRAM.

### GPU-Aware Context Sizing (REQ-244)

```bash
specsmith ollama gpu # detect GPU VRAM (NVIDIA + AMD supported)
specsmith ollama available # show models within your VRAM budget
```

VRAM tiers and recommended context sizes:

| VRAM | Recommended Context |
|---|---|
| < 6 GB (CPU or low-end GPU) | 4,096 tokens |
| 6–11 GB | 8,192 tokens |
| 12–19 GB | 16,384 tokens |
| 20 GB+ | 32,768 tokens |

Override via `SPECSMITH_OLLAMA_CONTEXT_LENGTH` or `ollama.context_length` in `.specsmith/config.yml`.

### Live Context Fill Indicator (REQ-245)

The context fill tracker emits real-time JSONL events:

```jsonl
{"type": "context_fill", "used": 27500, "limit": 32768, "pct": 83.9}
```

When fill reaches the compression threshold (default 80%), specsmith signals that context
summarization should run before the next turn.

### Auto Context Compression (REQ-246)

When fill reaches the compression threshold, specsmith automatically triggers
conversation summarization — the current context is condensed to a compact summary
that preserves key decisions and facts while freeing window space. This happens
transparently before the next agent turn.

Configure in `.specsmith/config.yml`:

```yaml
context:
compression_threshold_pct: 80 # trigger summarization at 80% fill
auto_compress: true # enable automatic compression
```

### Hard Context Ceiling — Never 100% Full (REQ-247)

A hard reservation of **15% of the context window** (minimum 2,048 tokens) is always
held back for the governance layer. Attempts to fill beyond the effective ceiling raise
`ContextFullError` — making it impossible to reach a state where even a compression
request cannot be processed. This is a safety invariant, not a configuration option.

---

## Governance REST API

```bash
# Start the governance REST API (for MCP clients and IDE integrations)
specsmith governance-serve --port 7700 --project-dir .

# Classify a natural-language utterance under Specsmith governance
specsmith preflight "fix the cleanup dry-run regression" --json

# Start the agentic REPL
specsmith run
> what does the cleanup module do? # read-only ask -> answered
> fix the cleanup dry-run regression # change -> Specsmith approves, runs
> delete the entire dist directory # destructive -> needs clarification
```

---

## Work Item (WI) Lifecycle

Every accepted `specsmith preflight` mints a **Work Item** — a unique ID such as
`WI-3A9F1C02` that tracks user intent through the full governance lifecycle.
WIs are persisted to `.specsmith/workitems.json` and evolve through defined states:

| State | Meaning |
|---|---|
| `open` | Minted by preflight; work in progress |
| `implemented` | `specsmith verify` reached equilibrium (auto-set) |
| `promoted` | Elevated to a formal REQ-NNN via `specsmith wi promote` |
| `closed` | Done; maps to an existing requirement |
| `archived` | Deferred; may be re-opened |
| `rejected` | Explicitly rejected |

```bash
# See all open work items
specsmith wi list --status open

# View full details of a WI
specsmith wi show WI-3A9F1C02

# Close a WI (change covered by an existing REQ)
specsmith wi close WI-3A9F1C02 --reason "covered by REQ-042"

# Promote a WI to a new requirement (new behaviour, no existing REQ)
specsmith wi promote WI-3A9F1C02 \
--title "System must retry on transient HTTP 5xx failures" \
--domain governance
specsmith sync # regenerate REQUIREMENTS.md

# Set kind label
specsmith wi tag WI-3A9F1C02 --kind bug

# Import historical WIs from LEDGER.md
specsmith wi import --from-ledger
```

**When to promote vs close:** Promote (`wi promote`) when the change introduces new
behaviour not covered by any existing REQ and the pattern is expected to recur.
Close (`wi close`) for bug fixes, refactors, and chores that already have a matching REQ.

Full documentation: [`docs/site/wi-lifecycle.md`](https://specsmith.readthedocs.io/en/stable/wi-lifecycle/)

---

## Nexus

The Nexus runtime is specsmith's local-first agentic REPL — a
governance-gated broker that sits between you and the LLM.

Every utterance passes through `specsmith preflight` before execution.
The broker classifies intent, matches requirements, and gates the action.
After execution, `specsmith verify` checks equilibrium. The `/why` command
shows the full governance trace.

```bash
# Interactive REPL with governance
specsmith run
nexus> fix the cleanup bug # broker classifies → accepts → executes → verifies
nexus> /why # show governance trace for last action
nexus> /exit
```

The Nexus broker:
- **Preflight gate**: every change goes through `specsmith preflight`
- **Bounded retry**: failed actions retry up to 3× with strategy classification
- **Execution trace**: every action is sealed in the cryptographic trace vault
- **`/why` toggle**: shows governance rationale in human-readable form

**How it works.** A natural-language **broker** classifies intent, infers scope from
your requirements, and asks Specsmith to **preflight** the request. Only when the
preflight decision is `accepted` does Nexus drive the AG2 orchestrator — and it does so
through a **bounded-retry harness** so you can never accidentally run away. By default,
Nexus speaks plain English; toggle `/why` in the REPL to surface the underlying
requirement, test, and work-item identifiers Specsmith assigned.

**Pieces in this repo.**
- `specsmith preflight` — CLI subcommand emitting a deterministic governance JSON payload
(`decision`, `requirement_ids`, `test_case_ids`, `confidence_target`, `instruction`).
- `src/specsmith/agent/broker.py` — natural-language broker (intent + scope + narration).
- `src/specsmith/agent/repl.py` — Nexus REPL with the `/why` toggle and execution gate.
- `docker-compose.yml` — pinned vLLM `l1-nexus` model server with the Hermes tool-call parser.
- `scripts/nexus_smoke.py` — opt-in live smoke test (`NEXUS_LIVE=1` to run against
a running container).

---

## AI Model Intelligence

specsmith ships a complete AI model intelligence layer for tracking, scoring, and routing
to the best available LLM for each task type.

### HF Open LLM Leaderboard Sync (REQ-263..REQ-269)

Syncs benchmark data from the HuggingFace Open LLM Leaderboard and computes three
task-specific bucket scores — **reasoning**, **conversational**, and **longform** — for
every model. A 40+ model static fallback ensures scores are always available even without
network access.

```bash
specsmith model-intel sync # sync from HF leaderboard (static fallback if offline)
specsmith model-intel scores # list all cached bucket scores
specsmith model-intel scores --model gpt-4o # show scores for a specific model
specsmith model-intel recommendations # top-10 models for reasoning bucket
specsmith model-intel recommendations --bucket conversational # or longform
specsmith model-intel connection # test HF API connectivity + token status
```

Set `SPECSMITH_HF_TOKEN` for authenticated access (1000 req/5min instead of 500).
Scores persist to `~/.specsmith/model_scores.json`. Background sync runs 15s after startup
then daily.

**Bucket formulas (normalised 0-100):**
- Reasoning = 0.35×MATH + 0.30×GPQA + 0.25×BBH + 0.10×IFEval
- Conversational = 0.40×IFEval + 0.35×MMLU-PRO + 0.25×BBH
- Longform = 0.35×MUSR + 0.35×IFEval + 0.30×MMLU-PRO

### Model Capability Profiles (REQ-270..REQ-271)

40+ pre-built model profiles cover all major providers (OpenAI, Anthropic, Google, Mistral,
Meta Llama, Qwen, DeepSeek, and local Ollama variants). Each profile specifies:
`max_tokens`, `prompt_style` (sections/xml/markdown), `supports_vision`,
`supports_tool_calls`, `reasoning_mode`, and `context_window`.

Context-aware history trimming preserves system messages while summarising older turns when
the token budget is exceeded:

```python
from specsmith.agent.model_profiles import get_profile, trim_history

profile = get_profile("qwen2.5:14b") # exact or prefix match; returns default if unknown
messages = trim_history(messages, budget_chars=12000)
```

### LLM Client with Provider Fallback (REQ-275..REQ-277)

`LLMClient` wraps multiple providers with automatic fallback on 429 / 401 errors,
O-series parameter translation (`max_completion_tokens`, temperature=1, developer role),
and vLLM guided-JSON payload injection:

```python
from specsmith.agent.llm_client import LLMClient

client = LLMClient([
{"provider_type": "cloud", "model": "gpt-4o", ...},
{"provider_type": "ollama", "model": "qwen2.5:14b", ...}, # local fallback
])
result = client.chat([{"role": "user", "content": "hello"}])
```

### Endpoint Presets + Suggest Profiles (REQ-278..REQ-280)

A registry of 10+ pre-configured endpoint presets for common cloud and local LLM providers:

```bash
specsmith agent endpoint-presets # list all presets (vllm, lm_studio, openrouter, etc.)
specsmith agent endpoint-presets --json # machine-readable output
specsmith agent suggest-profiles # suggest optimal profiles based on env (API keys, hardware)
specsmith agent suggest-profiles --json # structured suggestions with bucket/role annotations
```

Suggestions are read-only (never persisted) and inspect `OPENAI_API_KEY`, `ANTHROPIC_API_KEY`,
`GOOGLE_API_KEY`, and local Ollama availability.

---

## Multi-Agent DAG Dispatcher (REQ-321..334)

The `specsmith dispatch` command group decomposes a task into a **Directed Acyclic Graph** of
agent work items and executes them concurrently, with fail-forward BLOCKED propagation and
ESDB context injection between nodes.

```bash
# Run a task through the DAG dispatcher (default: up to 4 concurrent workers)
specsmith dispatch run "add API endpoint with tests" --max-workers 4

# Stream JSONL events while the run is in progress
specsmith dispatch run "refactor auth module" --json

# Check status of a saved run
specsmith dispatch status --dag-id abc123def456

# List all saved runs
specsmith dispatch list

# Retry a single failed node from a checkpoint
specsmith dispatch retry --node impl --dag-id abc123def456
```

The dispatcher is also available programmatically:

```python
from specsmith.agent.orchestrator import Orchestrator

orchestrator = Orchestrator()

# Use the DAG path (falls back to GroupChat on cycle detection)
result = orchestrator.run_task("add feature X", use_dag=True)

# Always use DAG — returns DispatchSummary with per-node outcomes
summary = orchestrator.run_dispatch(
"add feature X",
planner_output=[
{"id": "arch", "title": "Design", "role": "architect", "depends_on": []},
{"id": "impl", "title": "Implement", "role": "coder", "depends_on": ["arch"]},
{"id": "test", "title": "Write tests", "role": "tester", "depends_on": ["arch"]},
],
max_workers=3,
)
print(f"{len(summary.completed)} completed, {len(summary.failed)} failed")
```

Events are persisted to `.specsmith/dispatch//events.jsonl` for resume and replay.

---

## Compiler and Tool Support

All agent roles can invoke compiler, linter, and formatter tools. These are registered in
`AVAILABLE_TOOLS` and wired into `ROLE_TOOLS` for the `coder`, `reviewer`, `tester`, `architect`,
and `embedded-coder` roles.

| Tool | Function | Default binary |
|------|----------|-|
| GCC / G++ | `run_gcc(args, compiler='gcc')` | `gcc` / `g++` |
| ARM bare-metal | `run_arm_gcc(args, compiler='arm-none-eabi-gcc')` | `arm-none-eabi-gcc` |
| AArch64 Linux | `run_aarch64_gcc(args, compiler='aarch64-linux-gnu-gcc')` | `aarch64-linux-gnu-gcc` |
| IAR Embedded | `run_iar_compiler(project_file, executable='IarBuild')` | `IarBuild` |
| Intel oneAPI | `run_intel_compiler(args, compiler='icx')` | `icx` / `icpx` / `icc` |
| clang-format | `run_clang_format(files, style='file', in_place=False)` | `clang-format` |
| clang-tidy | `run_clang_tidy(files, checks='', fix=False)` | `clang-tidy` |
| VSG (VHDL) | `run_vsg(files, rules=None, fix=False)` | `vsg` |

All tools are usable directly in the agentic REPL and in `specsmith dispatch` worker nodes:

```python
from specsmith.agent.tools import run_arm_gcc, run_clang_tidy, run_vsg

# Cross-compile for ARM bare-metal
result = run_arm_gcc("-Wall -O2 main.c -o firmware.elf", compiler="arm-none-eabi-gcc")

# Lint C/C++ with clang-tidy
result = run_clang_tidy("src/", checks="modernize-*,readability-*")

# Style-check VHDL files
result = run_vsg("rtl/top.vhd", rules="vsg_rules.yaml")
```

---

## Supporting specsmith

specsmith is open source and built by a small team. Every bit of support helps:

- ⭐ **Star** [specsmith](https://github.com/layer1labs/specsmith) on GitHub
- 📣 **Tell your friends and colleagues** — word of mouth is our best marketing
- 🐛 **Report bugs** via [GitHub Issues](https://github.com/layer1labs/specsmith/issues) — even small ones help
- 💡 **Suggest features** via [GitHub Discussions](https://github.com/layer1labs/specsmith/discussions) — we read every suggestion
- 🔧 **Fix bugs and contribute** — see [CONTRIBUTING.md](https://github.com/layer1labs/specsmith/blob/main/CONTRIBUTING.md); PRs welcome
- 📝 **Write about specsmith** — blog posts, tutorials, and talks help the community grow
- ❤️ **[Sponsor layer1labs](https://github.com/sponsors/layer1labs)** — directly funds development

---

## Ollama — Local LLMs (Zero API Cost)

specsmith has first-class Ollama support, including:

```bash
specsmith ollama gpu # detect GPU and VRAM tier
specsmith ollama available # show catalog filtered by VRAM budget
specsmith ollama available --task code # filter by task type
specsmith ollama pull qwen2.5:14b # download a model
specsmith ollama suggest requirements # task-based recommendations
specsmith ollama list # show installed models
```

GPU-aware context sizing: 4K/8K/16K/32K tokens based on detected VRAM.
Override via `SPECSMITH_OLLAMA_CONTEXT_LENGTH` env var or `ollama.context_length` in `.specsmith/config.yml`.

**VRAM-aware model lineup (REQ-445):** `specsmith local-model recommend` prints a per-role
lineup — default / fast / harder pass / general — keyed to your detected VRAM, with a fit
assessment (`fits` / `tight` / `spills`) so you can see which models run fully on the GPU
before pulling. `specsmith local-model detect` and `specsmith local-model setup` pick and
install a single hardware-appropriate Qwen2.5-Coder model.

---

## FPGA / HDL Projects

specsmith supports FPGA-specific project types with full governance:

```yaml
# scaffold.yml
type: fpga-rtl-amd # or fpga-rtl-intel / fpga-rtl-lattice / fpga-rtl
fpga_tools:
- vivado
- gtkwave
- vsg
- ghdl
- verilator
```

Supported tools: **Synthesis:** vivado, quartus, radiant, diamond, gowin.
**Simulation:** ghdl, iverilog, verilator, modelsim, questasim, xsim.
**Waveform:** gtkwave, surfer. **Linting:** vsg, verible, svlint.
**Formal:** symbiyosys. **OSS flow:** yosys, nextpnr, openFPGALoader.

---

## 50+ CLI Commands

**Governance:** `init` `import` `audit` `validate` `diff` `upgrade` `compress` `doctor` `export` `architect`

**AEE Epistemic:** `stress-test` `epistemic-audit` `belief-graph` `trace seal/verify/log`

**Workflow:** `phase show/set/next/list` `ledger add/list` `req list/add/gaps/trace`

**Work Items:** `wi list` `wi show` `wi close` `wi archive` `wi promote` `wi tag` `wi import`

**Agent:** `run` `agent run/plan/status/verify/improve/reports` `agent providers/tools/skills` `agent suggest-profiles` `agent endpoint-presets`

**Dispatch:** `dispatch run` `dispatch status` `dispatch list` `dispatch retry`

**Model Intel:** `model-intel sync` `model-intel scores` `model-intel recommendations` `model-intel connection`

**Ollama:** `ollama list/available/gpu/pull/suggest`

**Local Model:** `local-model detect/recommend/setup`

**Workspace:** `workspace init/audit/export`

**Integrations:** `integrate warp` `integrate codity` `integrate claude-code` `integrate cursor` `integrate copilot` `integrate aider` `integrate gemini` `integrate windsurf` `integrate agent-skill`

**VCS:** `commit` `push` `pull` `save` `load` `sync` `branch` `pr` `status` `checkpoint`

**Tools:** `tools scan [--fpga]` `tools install ` `tools rules [--tool] [--list]`

**Session & Process:** `exec` `ps` `abort` `watch` `credits` `self-update` `kill-session` `session-end`

**Auth:** `auth set/list/remove/check`

**Patent:** `patent search/prior-art`

---

## 64 Project Types

**Python:** `cli-python`, `library-python`, `backend-frontend`, `backend-frontend-tray`, `embedded-python-hmi`, `research-python`.

**Systems languages:** `cli-rust`, `library-rust`, `cli-go`, `cli-c`, `library-c`, `dotnet-app`.

**Modern web:** `web-frontend`, `fullstack-js`, `nextjs-app`, `nuxt-app`, `sveltekit-app`, `remix-app`, `astro-site`.

**AI / Agents:** `llm-app`, `agent-orchestration`, `mcp-server`, `rag-pipeline`, `mlops-platform`.

**JVM:** `java-spring`, `java-library`.

**Mobile:** `mobile-app`.

**Infrastructure:** `serverless`, `kubernetes-operator`, `microservices`, `devops-iac`, `streaming-pipeline`, `data-warehouse`, `data-ml`.

**Game development:** `game-unity`, `game-godot`.

**Web3:** `smart-contract`.

**Desktop:** `desktop-electron`, `desktop-tauri`.

**Hardware / Embedded:** `fpga-rtl`, `fpga-rtl-amd`, `fpga-rtl-intel`, `fpga-rtl-lattice`, `mixed-fpga-embedded`, `mixed-fpga-firmware`, `yocto-bsp`, `embedded-hardware`, `pcb-hardware`, `safety-critical`.

**Documents & IP:** `spec-document`, `user-manual`, `research-paper`, `research-python`, `api-specification`, `brief-lang`, `requirements-mgmt`, `patent-application`, `patent-prosecution`.

**Business / Legal / AEE:** `business-plan`, `legal-compliance`, `monorepo`, `browser-extension`, `epistemic-pipeline`, `knowledge-engineering`, `aee-research`.

---

## epistemic Library

The standalone `epistemic` Python library works in any Python 3.10+ project — no specsmith coupling:

```python
from epistemic import AEESession, BeliefArtifact, StressTester

session = AEESession("my-project", threshold=0.70)
session.add_belief(
artifact_id="HYP-001",
propositions=["The API always returns valid JSON"],
epistemic_boundary=["Valid auth token required"],
)
session.accept("HYP-001")
result = session.run()
print(result.summary())
# certainty=0.55, failures=2, equilibrium=False
```

Use cases: linguistics research, compliance pipelines, AI alignment, patent prosecution.

---

## Governance Rules

22 hard rules enforced by `specsmith validate` and `specsmith audit`. Rules 1–14 cover
core engineering and traceability (ledger required, proposals required, platform-awareness,
documentation currency, etc.). Rules 15–22 address anti-hallucination and epistemic
stability, derived from the OEA (Ontological Epistemic Anchoring) research framework
(Layer1Labs, 2026).

Full rule reference: [`docs/governance/RULES.md`](https://github.com/layer1labs/specsmith/blob/develop/docs/governance/RULES.md)

---

## Codity.ai AI Code Review Integration

specsmith can scaffold [Codity.ai](https://codity.ai) AI code review into any project:

```bash
specsmith integrate codity --project-dir ./my-project
```

This generates:
- `.github/workflows/codity-review.yml` (GitHub Actions) or `.gitlab-ci-codity.yml` / `.azure-pipelines/codity-review.yml` depending on your VCS
- `docs/codity-setup.md` — one-time setup checklist
- Appends a TODO checklist to `LEDGER.md`

**Windows note:** Codity's `install.sh` is Linux/macOS only. For native PowerShell use, install `codity.exe` from the official `codity-ai/codity-cli` GitHub release zip and place it on PATH (for example `~/.local/bin`).

**AGENTS.md rule (REQ-355):** Projects with Codity configured SHOULD run `codity review --staged` before any commit touching production code. HIGH-severity findings are blocking; MEDIUM findings require inline acknowledgement.

See the `codity-ai-review` governance skill (`specsmith skill install codity-ai-review`) for the full CLI workflow reference.

---

## Skills

specsmith ships **138 built-in skills** across 16 domains that AI agents (Warp, Claude Code, Codex, Cursor) can install and use.

```bash
# List all available skills
specsmith skill list

# Search by keyword
specsmith skill search zephyr

# Install a skill into .agents/skills/
specsmith skill install specsmith
specsmith skill install specsmith-save
specsmith skill install specsmith-audit
```

Skills are installed as `.agents/skills//SKILL.md` and are auto-discovered by any AI tool that scans `.agents/skills/`.

### Skill domains

| Domain | Count | Coverage |
|--------|-------|----------|
| `governance` | 21 | AEE workflows, verification, release, CI polling, patent prosecution, client integrations |
| `ai-agents` | 14 | LLM apps, MCP servers, agent orchestration, RAG, prompt engineering, fine-tuning, MLOps |
| `software-engineering` | 14 | Code review, TDD, debugging, security hardening, API design, ADRs, Brief lang |
| `web-backend` | 11 | Frontend UI, Next.js, REST/GraphQL, PostgreSQL, Redis, WebSockets |
| `platform-engineering` | 10 | Helm, observability, GitOps, secrets, OAuth2, chaos engineering |
| `embedded` | 11 | Zephyr, Yocto, FreeRTOS, bare-metal C, NuttX, Buildroot, Azure RTOS |
| `docs` | 10 | MkDocs, Sphinx, Doxygen, JSDoc, OpenAPI, mdBook |
| `data-engineering` | 8 | ETL/ELT, dbt, Spark, data quality, feature stores, Delta Lake |
| `hardware` | 9 | KiCad, Altium, Vivado, Quartus, GTKWave, JTAG |
| `corporate` | 7 | Budgets, fundraising, marketing, HR, legal |
| `devops` | 6 | Docker, Kubernetes, Terraform, GitHub Actions |
| `cloud` | 4 | AWS, Azure, GCP, GitHub CLI |
| `mobile` | 4 | iOS, Android, Flutter, React Native |
| `ssh` | 3 | SSH, WSL2, remote dev |
| `cross-platform` | 3 | CMake, package managers, terminal awareness |
| `productivity` | 3 | Email, presentations, MS Office |

### Self-referential governance skills

Three skills document specsmith itself:

| Slug | Purpose |
|------|--------|
| `specsmith` | Master CLI reference — session workflow, commands, audit codes |
| `specsmith-save` | When and how to run `specsmith save` |
| `specsmith-audit` | Running audits and interpreting results |

### Remote reference (Warp Oz cloud agents)

```bash
oz agent run-cloud --skill "layer1labs/specsmith:specsmith-save" --prompt "save my work"
```

---

## Agent Integrations

Run `specsmith integrate ` once in your project root — specsmith writes the governance file your AI client reads automatically.

| Tool | Command | Generated file | MCP support |
|---|---|---|---|
| **Warp** | `specsmith integrate warp` | `.warp/specsmith-mcp.json` + `.warp/launch_configs/` | ✓ Native |
| **Claude Code** | `specsmith integrate claude-code` | `CLAUDE.md` | ✓ via `.mcp.json` |
| **Cursor** | `specsmith integrate cursor` | `.cursor/rules/governance.mdc` | ✓ via `.cursor/mcp.json` |
| **Windsurf** | `specsmith integrate windsurf` | `.windsurfrules` | ✓ via Settings → MCP |
| **Gemini CLI** | `specsmith integrate gemini` | `GEMINI.md` | — |
| **Aider** | `specsmith integrate aider` | `.aider.conf.yml` | — |
| **Copilot** | `specsmith integrate copilot` | `.github/copilot-instructions.md` | — |

All generated files embed the same core content: governance rules, the mandatory session protocol, preflight requirements, and AGENTS.md as the primary governance source.

### Mandatory session protocol (all tools)

```bash
# Session start
specsmith kill-session 2>/dev/null || true
specsmith audit --project-dir .
specsmith sync --project-dir .
specsmith checkpoint --project-dir . # output GOVERNANCE ANCHOR verbatim

# Before every code change
specsmith preflight "" --json
# decision == "accepted" → proceed with work_item_id in scope
# decision == "needs_clarification" → surface instruction to user first

# Every 8–10 turns
specsmith checkpoint --project-dir .

# Session end
specsmith save && specsmith kill-session
```

### Claude Code

```bash
specsmith integrate claude-code # writes CLAUDE.md
```

To enable native MCP tool calls, add to `.mcp.json` (project root) or `~/.claude/mcp.json`:

```json
{
"mcpServers": {
"specsmith-governance": {
"command": "specsmith",
"args": ["mcp", "serve", "--project-dir", "."]
}
}
}
```

Or run `specsmith mcp install-claude-code` to print the snippet. With MCP configured, Claude calls governance tools natively — no shell roundtrip.

### Cursor

```bash
specsmith integrate cursor # writes .cursor/rules/governance.mdc
```

Add to `.cursor/mcp.json` to enable MCP:

```json
{
"specsmith-governance": {
"command": "specsmith",
"args": ["mcp", "serve", "--project-dir", "${workspaceFolder}"]
}
}
```

### Windsurf

```bash
specsmith integrate windsurf # writes .windsurfrules
```

MCP: **Settings → MCP Servers** → `{ "specsmith-governance": { "command": "specsmith", "args": ["mcp", "serve"] } }`

### Gemini CLI

```bash
specsmith integrate gemini # writes GEMINI.md (read automatically by the gemini CLI)
```

### Aider

```bash
specsmith integrate aider # writes .aider.conf.yml
```

For full governance, use `--no-auto-commits` and commit via `specsmith save`:

```bash
aider --no-auto-commits --read AGENTS.md --read .agents/skills/specsmith-session-governance/SKILL.md
```

Run `specsmith preflight` in a separate terminal before each change; run `specsmith save` after aider finishes.

### GitHub Copilot

```bash
specsmith integrate copilot # writes .github/copilot-instructions.md
```

Copilot reads `.github/copilot-instructions.md` for all workspace interactions. MCP not yet natively supported; governance is enforced through the instructions file and `AGENTS.md`.

Full per-tool setup: **[Agent Integrations docs →](https://specsmith.readthedocs.io/en/stable/agent-integrations/)**

---

## Warp Terminal Integration

specsmith ships first-class integration with [Warp](https://www.warp.dev) terminal: a
one-command `specsmith integrate warp` setup, a governance MCP server, a Warp-aware
agentic REPL, and repository workflows.

### One-command setup — `specsmith integrate warp`

```bash
specsmith integrate warp # scaffold Warp-native governance into the repo
```

The `warp` adapter (first-class as of v0.20.0 — no longer a legacy alias to `agent-skill`)
writes:

- `.warp/specsmith-mcp.json` — the governance MCP server config (identical to `specsmith mcp install-warp`)
- `.warp/launch_configs/specsmith-governed.yaml` — a Warp launch configuration that opens a governed `specsmith run` session
- `.agents/skills/SKILL.md` — the auto-discovered governance skill

### Warp-aware REPL

When you start `specsmith run` inside Warp, specsmith detects the host terminal (via
`TERM_PROGRAM` / `WARP_*` env vars) and shows a `terminal: Warp … — native integration
active` banner; the JSON `ready` frame carries a matching `terminal` field. Behavior is
unchanged outside Warp.

### Native MCP Governance Server

`specsmith mcp serve` starts a zero-dependency stdio MCP server (JSON-RPC 2.0, MCP 2024-11-05).
Warp/Oz, Cursor, Claude Code, or any other MCP client can call governance commands as structured
tool calls — no shell roundtrip, fully typed inputs and outputs.

**Setup (one time):**

```bash
# Get the Warp config snippet
specsmith mcp install-warp
```

Copy the output JSON into **Warp Settings → Agents → MCP servers**. Or pass it inline:

```bash
oz agent run --mcp '{"specsmith-governance": {"command": "specsmith", "args": ["mcp", "serve"]}}' \
--prompt "check governance health and preflight my next change"
```

**Six MCP tools exposed:**

| Tool | What it returns |
|---|---|
| `governance_audit` | Full audit health JSON — passed/failed checks, fixable count |
| `governance_checkpoint` | GOVERNANCE ANCHOR snapshot — phase, health, REQ/TEST counts, ESDB chain |
| `governance_preflight` | Preflight decision — `accepted`/`needs_clarification` + `work_item_id` |
| `governance_phase` | Current AEE phase, readiness %, failing checks |
| `governance_req_list` | All requirements with status + test coverage, filterable |
| `governance_trace_seal` | Create a cryptographic trace vault seal |

### Third-party CLI agent toolbar

For **specsmith** and **aider** (not yet natively supported by Warp), add this regex once in **Warp → Settings → Agents → Third party CLI agents → Commands that enable the toolbar**:

```
specsmith\s+run|aider
```

`claude`, `codex`, `gemini`, and `cursor` are already natively supported — no regex needed.

Once set, the toolbelt (Rich Input, attach code, File Explorer, Tab Configs, Remote Control) appears whenever `specsmith run` or `aider` is running. `specsmith run` also emits an OSC 9 desktop notification on session start — intercepted automatically by Warp, iTerm2, and Windows Terminal.

Run `specsmith integrate warp` to regenerate `.warp/SETUP.md` with the full per-REPL setup guide.

### Repository Workflows (Ctrl+Shift+R)

Clone this repo and open it in Warp — seven governance workflows appear automatically in
`Ctrl+Shift+R` search:

| Workflow | Command |
|---|---|
| specsmith — Session Start | Full bootstrap: kill → migrate → audit → sync → checkpoint |
| specsmith — Audit | `specsmith audit` |
| specsmith — Checkpoint | `specsmith checkpoint` (emits GOVERNANCE ANCHOR) |
| specsmith — Preflight | `specsmith preflight "{{intent}}" --json` |
| specsmith — Save | `specsmith save` |
| specsmith — Phase Status | `specsmith phase show` |
| specsmith — Session End | `specsmith save && specsmith kill-session` |

---

## The specsmith Bootstrap

specsmith governs itself — the specsmith repo is a specsmith-managed project. Run `specsmith audit`
in this repo to check its governance health. This means every feature we add to specsmith is
immediately dogfooded on specsmith itself.

## Documentation

**[specsmith.readthedocs.io](https://specsmith.readthedocs.io)** — Full manual: AEE primer,
command reference, project types, tool registry, governance model, ESDB, skills integrations, Ollama guide.

## Links

- [PyPI](https://pypi.org/project/specsmith/)
- [Documentation](https://specsmith.readthedocs.io)
- [Stability Contract](https://github.com/layer1labs/specsmith/blob/develop/docs/stability.md)
- [1.0 Release Criteria](https://github.com/layer1labs/specsmith/blob/develop/docs/roadmap/1.0-criteria.md)
- [Editions Matrix](https://github.com/layer1labs/specsmith/blob/develop/docs/editions.md)
- [Changelog](https://github.com/layer1labs/specsmith/blob/develop/CHANGELOG.md)
- [Contributing](https://github.com/layer1labs/specsmith/blob/develop/CONTRIBUTING.md)
- [Roadmap](https://github.com/layer1labs/specsmith/blob/develop/ROADMAP.md)
- [Compatibility Matrix](https://github.com/layer1labs/specsmith/blob/develop/docs/site/compatibility.md)
- [Product Principles](https://github.com/layer1labs/specsmith/blob/develop/docs/product-principles.md)
- [Security](https://github.com/layer1labs/specsmith/blob/develop/SECURITY.md)

## License

ecosyste.ms

Data

Tools

Indexes

Applications

Experiments

Awesome

https://github.com/layer1labs/specsmith

Awesome Lists containing this project

README