{"id":50745174,"url":"https://github.com/henrio123/agent-work","last_synced_at":"2026-06-10T20:02:43.155Z","repository":{"id":355862346,"uuid":"1162203753","full_name":"henrio123/agent-work","owner":"henrio123","description":"Deterministic multi-agent orchestrator for software-development workflows. Schema-validated artifacts, retry-with-feedback, and local run analytics.","archived":false,"fork":false,"pushed_at":"2026-05-05T15:02:21.000Z","size":1156,"stargazers_count":0,"open_issues_count":0,"forks_count":0,"subscribers_count":0,"default_branch":"main","last_synced_at":"2026-05-05T16:40:05.865Z","etag":null,"topics":["agent-orchestration","agentic-workflows","ai-engineering","claude","llm","multi-agent","orchestration","schema-validation"],"latest_commit_sha":null,"homepage":null,"language":"JavaScript","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"mit","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/henrio123.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":"SECURITY.md","support":null,"governance":"docs/GOVERNANCE.md","roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null,"zenodo":null,"notice":null,"maintainers":null,"copyright":null,"agents":"AGENTS.md","dco":null,"cla":null}},"created_at":"2026-02-20T01:24:34.000Z","updated_at":"2026-05-05T15:02:44.000Z","dependencies_parsed_at":null,"dependency_job_id":null,"html_url":"https://github.com/henrio123/agent-work","commit_stats":null,"previous_names":["henrio123/agent-work"],"tags_count":null,"template":false,"template_full_name":null,"purl":"pkg:github/henrio123/agent-work","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/henrio123%2Fagent-work","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/henrio123%2Fagent-work/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/henrio123%2Fagent-work/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/henrio123%2Fagent-work/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/henrio123","download_url":"https://codeload.github.com/henrio123/agent-work/tar.gz/refs/heads/main","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/henrio123%2Fagent-work/sbom","scorecard":null,"host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":286080680,"owners_count":34168086,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2026-05-26T15:22:16.424Z","status":"online","status_checked_at":"2026-06-10T02:00:07.152Z","response_time":89,"last_error":null,"robots_txt_status":"success","robots_txt_updated_at":"2025-07-24T06:49:26.215Z","robots_txt_url":"https://github.com/robots.txt","online":true,"can_crawl_api":true,"host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["agent-orchestration","agentic-workflows","ai-engineering","claude","llm","multi-agent","orchestration","schema-validation"],"created_at":"2026-06-10T20:02:42.011Z","updated_at":"2026-06-10T20:02:43.138Z","avatar_url":"https://github.com/henrio123.png","language":"JavaScript","funding_links":[],"categories":[],"sub_categories":[],"readme":"# AI Organisation OS\n\nA deterministic, role-based orchestration system that turns unstructured AI work into reproducible, auditable, multi-agent pipelines. Every piece of state lives on the filesystem. Every output is schema-validated. Every decision is traceable.\n\n---\n\n## What It Does\n\n**Problem.** AI agents lose work. Conversations get compacted. Context windows overflow. Terminal scrollback disappears. There is no persistent record of what was decided, who did it, or why.\n\n**Solution.** This system replaces ad-hoc AI usage with a structured execution pipeline. Tickets enter the system. A deterministic scheduler picks the next task. A fixed sequence of role stages (Analyst, Architect, Dev, QA, Review) executes the work. Each stage produces schema-validated artifacts. Gates prevent advancement until quality checks pass. Everything is written to disk.\n\n**Typical outcomes:**\n- Every ticket has a complete audit trail from intake through review.\n- Role boundaries are enforced: a QA agent cannot write code, an Analyst cannot modify architecture.\n- Schema validation catches structural errors before they propagate.\n- Work survives agent restarts, context compaction, and session loss.\n- The entire system runs on Node.js built-ins with zero npm dependencies.\n\n**Who it is for.** Teams and individuals building multi-agent AI systems who need determinism, auditability, and reproducibility. Useful for anyone who has lost work to a crashed agent session or an overflowed context window.\n\n---\n\n## Scope and Limits\n\nWhat this is, stated honestly:\n\n- **Local-only execution.** Every tool runs against a workspace on the local\n  filesystem. There is no hosted service, no managed runner, no scheduled\n  execution beyond the optional cron-friendly drive loop.\n- **Filesystem-as-API by design.** No database, no queue, no HTTP service surface\n  except the optional read-only dashboard bound to `127.0.0.1`. State is files.\n- **Not a vector store or semantic RAG.** Context retrieval for task packs is\n  filename-, schema-, and graph-based. There is no embedding store.\n- **Not LangChain, LangGraph, or MCP.** The orchestrator drives an agent adapter\n  (the Claude CLI) directly via subprocess. No agent-framework dependency.\n- **The orchestrator dispatches; the LLM does not pick tools.** Stage transitions\n  and tool dispatch are deterministic. The LLM authors artifacts; it does not\n  decide what runs next.\n- **No token or cost telemetry yet.** Stage durations and an append-only audit log\n  are written; per-call token accounting is not.\n- **Single-machine.** Distributed execution is out of scope at the current\n  maturity level.\n\nRun validation at any point with `bash tools/test-all.sh`. Zero failures expected.\n\nFor the full design — state machine, determinism model, failure modes — see\n[`ARCHITECTURE.md`](./ARCHITECTURE.md) (overview) and\n[`docs/ARCHITECTURE_DETAILED.md`](./docs/ARCHITECTURE_DETAILED.md) (full).\n\n---\n\n## How It Works\n\n### Key Concepts\n\n| Concept | Description |\n|---------|-------------|\n| **Project** | A workspace initialized with `.claw/` containing metadata, agent roles, and a backlog of tasks. |\n| **Backlog Item** | A unit of work (`.claw/backlog/\u003cid\u003e.json`) with status, priority, owner role, dependencies, and an optional parent epic. |\n| **Run** | A single pipeline execution (`.claw/runs/\u003ctimestamp\u003e_\u003cticket\u003e/`) with a state machine from `intake` to `done`. |\n| **Stage** | A step in the pipeline owned by one role. Each stage requires specific artifacts validated against JSON schemas. |\n| **Picker** | A deterministic scheduler that selects the next eligible task using stable sort rules (priority bucket \u003e status \u003e priority \u003e ID). |\n| **Driver** | A one-shot executor that creates a run for the picked task, invokes the autonomous runner, and writes results. |\n\n### Pipeline Flow\n\n```\nTicket  --\u003e  Backlog Item  --\u003e  Task Pack  --\u003e  Run Creation  --\u003e  Pipeline Stages  --\u003e  Done\n                                                                     |\n                                                    intake -\u003e task-pack-generated\n                                                    analyze -\u003e plan\n                                                    implement -\u003e validate\n                                                    review -\u003e done\n```\n\n### Role Stages\n\n| Stage | Role | Required Artifact | Schema |\n|-------|------|-------------------|--------|\n| analyze | Analyst | `10-pm-brief.json` | `pm-brief.schema.json` |\n| plan | Architect | `20-arch-design.json` | `arch-design.schema.json` |\n| implement | Dev | `40-dev-patch.diff`, `41-dev-notes.json` | `dev-notes.schema.json` |\n| validate | QA | `50-qa-report.json` | `qa-report.schema.json` |\n| review | Review | `60-review-report.json` | `review-report.schema.json` |\n\nA stage advances only when all required artifacts exist and pass schema validation. Roles are enforced at runtime: a Dev agent cannot produce an Analyst brief.\n\n---\n\n## Maturity Model and Roadmap\n\n### Where We Are\n\n| Level | Name | Status |\n|-------|------|--------|\n| **Level 1** | Manual AI usage | Past |\n| **Level 2** | Structured multi-agent execution | Done |\n| **Level 3** | Autonomous org with memory | Done |\n| **Level 4** | Self-improving AI organization | Done |\n| **Level 5** | Closed-loop adaptive execution | Done |\n| **Level 6** | Template-enriched agent execution | Done |\n| **Level 7** | Last-mile delivery | **Current** |\n\n### Levels 1–2 — What's Done\n\nThe system provides deterministic multi-agent pipeline execution with full schema enforcement, role boundaries, and a structured work graph.\n\n**Phase 1 (Agent Identity \u0026 Control) — Complete.**\n- Persistent agent state (`.claw/agents/\u003cid\u003e/state.json`) with role, workload counters, and last-active tracking.\n- Runtime role-to-stage enforcement. A role must match the stage before it can produce artifacts.\n- `responsible_agent` tracked per run and per stage transition.\n- Agent workload visible in the project dashboard.\n- Picker rejects backlog items without an assigned `owner_role`.\n- Cross-role leakage prevention tested end-to-end.\n\n**Phase 2 (Structured Work Graph) — Complete.**\n- Epic-to-child hierarchy via `parent_id` on backlog items.\n- DAG validation with cycle detection (Kahn's algorithm).\n- Graph-aware picker: skips tasks with unsatisfied dependencies and children of blocked epics.\n- Epic completion rule: epics cannot be marked done while children are incomplete.\n- Write-time epic completion guard (`backlog-update-status.js`).\n- Preflight graph validation in the project driver.\n- Dashboard dependency chain visualization.\n\n**Hardening \u0026 Ops — Complete.**\n- `additionalProperties: false` enforced on all schemas (output, input, reference).\n- GitHub Actions CI gate (`bash tools/test-all.sh`) on every push and PR.\n- Preflight graph validation before creating or driving runs.\n- Cron-friendly drive loop wrapper with safe stop.\n\n### Level 3 — What's Done\n\n**Phase 3 (Knowledge \u0026 Artifact Layer) — Complete.**\n\n| Milestone | Description | Status |\n|-----------|-------------|--------|\n| Artifact classification | Tag each artifact with a semantic type (decision, design, implementation, test-result, research-finding) | DONE |\n| Global artifact index | Searchable index of all artifacts across projects and runs | DONE |\n| Research workflow | Dedicated workflow for research tasks with structured findings schema | DONE |\n| Agent memory | Append-only memory layer at `.claw/agents/\u003cid\u003e/memory/`, schema-validated | DONE |\n| Cross-run knowledge | Task pack generator references artifacts from prior runs when building context | DONE |\n\n### Level 4 — What's Done\n\n**Phase 4 (Self-Improving AI Organization) — Complete.**\n\n| Milestone | Description | Status |\n|-----------|-------------|--------|\n| Run analytics engine | Per-run metrics and project-level aggregates from completed runs | DONE |\n| Self-evaluation loops | Agents assess the quality of their own outputs against historical baselines | DONE |\n| Workflow optimization | System proposes pipeline improvements based on execution patterns | DONE |\n| Autonomous ticket creation | System identifies gaps and creates tickets without human intervention | DONE |\n| Adaptive role allocation | Agent assignment optimized based on workload and historical performance | DONE |\n\n### Level 5 — What's Done\n\n**Phase 5 (Closed-Loop Adaptive Execution) — Complete.**\n\n| Milestone | Description | Status |\n|-----------|-------------|--------|\n| Post-run lifecycle hooks | Self-eval + gap scan auto-triggered after every completed run | DONE |\n| Adaptive agent prompt | Agent memory and workflow suggestions injected into Claude Code prompt | DONE |\n| Agent assignment actuation | `recommended_agent` flows from picker through driver to runner and prompt | DONE |\n| Adaptive JS drive loop | Adaptive sleep, post-run hooks, project filtering, graceful stop conditions | DONE |\n| Dashboard Phase 5 fields | Last evaluation, hooks status, loop status in dashboard summary | DONE |\n\n### Level 6 — What's Done\n\n**Phase 6 (Template-Enriched Agent Execution) — Complete.**\n\n| Milestone | Description | Status |\n|-----------|-------------|--------|\n| Template-enriched prompt | `buildAdapterPrompt()` reads stage task files, includes GOAL/STEPS/OUTPUT instructions | DONE |\n| Prior artifact context | `buildArtifactContext()` reads and injects prior artifact content into prompt | DONE |\n| Validation retry loop | `buildRetryPrompt()` provides error feedback, adapter retries up to `maxRetries` times | DONE |\n| Auto-patch application | Post-implement hook applies `40-dev-patch.diff` via `applyDevPatch()` (dry-run first) | DONE |\n| Dashboard Phase 6 fields | Feature flags in dashboard summary | DONE |\n\n### Level 7 — What's Done\n\n**Phase 7 (Last-Mile Delivery) — Complete.**\n\n| Milestone | Description | Status |\n|-----------|-------------|--------|\n| `artifactList` scope fix | Hoisted `artifactList` to function scope in `claudeCodeAdapter` | DONE |\n| Post-patch test execution | `discoverTestCommand()` + `runPostPatchTests()` in `post-patch-verify.js` | DONE |\n| Auto-commit | `autoCommit()` creates structured git commit (opt-in, never pushes) | DONE |\n| Backlog auto-completion | Run reaching `done` auto-transitions backlog item via `updateBacklogStatus` | DONE |\n| `recommendAgent` fallback | Picker fallback to `recommendAgent()` from `agent-performance.js` | DONE |\n| Claude adapter test coverage | Mock-based tests for all `claudeCodeAdapter` paths | DONE |\n| Dashboard Phase 7 fields | Feature flags in dashboard summary | DONE |\n\n---\n\n## Features\n\n### Implemented\n\n- Deterministic pipeline with 8-stage state machine (intake through done)\n- 5 role stages (Analyst, Architect, Dev, QA, Review) with enforced boundaries\n- Schema-validated artifacts with `additionalProperties: false` on all schemas\n- Deterministic project scheduler with priority buckets and stable sort\n- One-shot and loop project drivers with preflight graph validation\n- Epic-child hierarchy with DAG validation and cycle detection\n- Write-time epic completion guard\n- Task pack generation (deterministic, no LLM calls)\n- Autonomous multi-agent runner with scaffold, Claude CLI, and draft-file adapters\n- Append-only audit logging per run\n- Agent identity with persistent state and workload tracking\n- Runtime role enforcement (cross-role leakage prevention)\n- Project dashboard with dependency chains and agent workload\n- Ticket persistence with anti-truncation guards\n- Stop/resume mechanism for autonomous runs\n- Stall detection (30-minute threshold on audit log)\n- HTTP dashboard on localhost:18790\n- External workspace model (pure engine, `.claw/` in target repos)\n- Workspace bootstrap and patch application tools\n- Pluggable capability system (capability registry, manifest-driven stage injection)\n- Goal-driven mission layer (deterministic intent + stack detection, capability activation)\n- Pluggable capabilities: UX audit, security audit, performance audit, research\n- Artifact classification with semantic types and global artifact index\n- Research workflow with structured findings schema\n- Agent memory (append-only, schema-validated, cross-run)\n- Cross-run knowledge retention in task pack generation\n- Run analytics engine (per-run metrics + project-level aggregates)\n- Self-evaluation with quality score, deviation analysis, and memory persistence\n- Workflow suggestion engine (4 detection rules with evidence and confidence)\n- Gap scanner with auto-create backlog items (idempotent)\n- Agent performance profiling with recommended_agent in picker\n- Dashboard Phase 4 summary (performance, suggestions, gaps)\n- Post-run lifecycle hooks (auto self-eval + gap scan after every run)\n- Adaptive agent prompt (memory + workflow suggestions injected into Claude Code prompt)\n- Agent assignment actuation (recommended_agent flows pick → drive → runner → prompt)\n- Adaptive JS drive loop (adaptive sleep, hooks, project filter, graceful stop)\n- Dashboard Phase 5 summary (last evaluation, hooks status, loop status)\n- Template-enriched agent prompt (stage task files, prior artifact context, retry feedback)\n- Validation retry loop (maxRetries with error feedback to agent)\n- Auto-patch application (post-implement dry-run + apply, non-fatal)\n- Dashboard Phase 6 feature flags\n- Post-patch test execution (discover + run project tests after patch application)\n- Auto-commit (opt-in structured git commit after successful tests, never pushes)\n- Backlog auto-completion (run → done transitions backlog item to done)\n- `recommendAgent` fallback (picker uses agent-performance when no recommendation)\n- Dashboard Phase 7 feature flags\n- GitHub Actions CI (run `bash tools/test-all.sh` for current counts)\n- Zero external npm dependencies\n\n---\n\n## Repo Map\n\n```\n.\n├── ARCHITECTURE.md              # Concise architecture overview (links to detailed)\n├── LICENSE                      # MIT\n├── docs/\n│   ├── ARCHITECTURE_DETAILED.md # System design, state machine, determinism model, evolution roadmap\n│   ├── GOVERNANCE.md            # Golden rules: schema enforcement, testing, no external deps\n│   ├── ux-spec-autonomous-runner.md  # CLI contract for autonomous runner\n│   ├── product-diagram.md       # System overview diagram\n│   ├── drift-report.md          # Domain leakage verification report\n│   └── phase{3,4,5,6}-plan.md   # Phase plan documents\n├── skills/\n│   ├── dev-pipeline/\n│   │   ├── SKILL.md                 # Comprehensive operational reference\n│   │   ├── scripts/                 # Core engine (run `bash tools/doc-stats.sh`)\n│   │   │   ├── dev-pipeline.js      # Core pipeline engine (state machine, schema validation, stage gates)\n│   │   │   ├── capability-registry.js # Capability loader, stage injection, template/schema resolution\n│   │   │   ├── goal-selector.js     # Deterministic intent + stack → capability mapping\n│   │   │   ├── create-mission.js    # CLI: goal → mission + capabilities.json\n│   │   │   ├── autonomous-runner.js # Multi-agent autonomous execution loop (Phase 6: template-enriched)\n│   │   │   ├── adapter-prompt-builder.js # Template-enriched prompt assembly (Phase 6)\n│   │   │   ├── project-next-pick.js # Deterministic task picker\n│   │   │   ├── project-next-drive.js # One-shot project driver with preflight validation\n│   │   │   ├── validate-backlog-graph.js  # DAG validator (cycles, parents, epic completion)\n│   │   │   └── ...                  # Index, dashboard, task-pack, ticket-store, agent-state\n│   │   ├── schemas/                 # 27 JSON Schema files (input + output schemas)\n│   │   ├── references/              # 7 artifact schemas (pm-brief, arch-design, dev-notes, etc.)\n│   │   └── tests/                   # Test suites (run `bash tools/test-all.sh`)\n│   └── capabilities/               # Pluggable capability extensions\n│       ├── ux_audit/                # UX audit stage (after analyze)\n│       ├── security_audit/          # Security audit stage (after analyze)\n│       ├── performance_audit/       # Performance audit stage (after analyze)\n│       └── research/               # Research workflow (after analyze)\n├── tools/                       # shell wrappers (the public CLI surface)\n│   ├── dp.sh                    # Main CLI entry point\n│   ├── create-mission.sh        # Goal → capability activation\n│   ├── test-all.sh              # Master test gate\n│   ├── project-next-drive.sh    # One-shot project driver\n│   ├── project-drive-loop.sh    # Cron-friendly loop wrapper\n│   ├── backlog-update-status.sh # Status transition with guards\n│   ├── init-workspace.sh          # Bootstrap .claw/ in a target repo\n│   ├── apply-dev-patch.sh         # Apply dev patch to workspace\n│   ├── _workspace.sh              # Shared --workspace flag parser\n│   └── ...                      # run-next-*, project-*, dashboard-*, ticket-*\n├── openclaw/                    # OpenClaw integration docs and example config\n├── templates/                   # Core role-specific task pack templates\n├── .github/workflows/test.yml   # CI: runs test-all.sh on push and PR\n├── SOUL.md                      # Workspace agent contract: principles\n├── BOOTSTRAP.md                 # Workspace agent contract: first-run init\n├── HEARTBEAT.md                 # Workspace agent contract: periodic-check marker\n├── IDENTITY.md                  # Workspace agent contract: identity template\n├── USER.md                      # Workspace agent contract: facts about the human\n├── TOOLS.md                     # Workspace agent contract: local environment notes\n├── SECURITY.md                  # Security boundaries and access controls\n└── AGENTS.md                    # Workspace orientation for agents (industry standard)\n```\n\n### Workspace Agent Contracts\n\nThe seven root markdown files (`AGENTS.md`, `BOOTSTRAP.md`, `HEARTBEAT.md`,\n`IDENTITY.md`, `SOUL.md`, `USER.md`, `TOOLS.md`) are operating instructions for an\nagent — for example Claude Code — that **visits this workspace as a personal\nassistant**. They are intentionally kept at the workspace root because the\n`AGENTS.md` contract instructs the visiting agent to read them by bare name.\n\n**This is a separate concern from the dev-pipeline orchestrator** described in\nthis README. The orchestrator's roles, schemas, and execution surface live in\n`agents.json`, `skills/dev-pipeline/`, and `tools/`. The orchestrator does not\nread `SOUL.md`, `IDENTITY.md`, or `USER.md`. The two layers coexist in the same\nrepository but solve different problems.\n\n---\n\n## Quickstart\n\n### Requirements\n\n- **Node.js 20+** (LTS). No other runtime dependencies.\n- **No npm install.** The system uses only Node.js built-in modules (`node:fs`, `node:path`, `node:os`, `node:crypto`).\n- **Bash** for shell wrappers.\n\n### Run Tests\n\n```bash\nbash tools/test-all.sh\n```\n\nExpected output: `TOTAL: N passed, 0 failed (M suites)` with zero failures.\n\n### Initialize a Target Repo\n\n```bash\n./tools/init-workspace.sh --workspace /path/to/my-project \\\n  --project_id my-project --title \"My Project\"\n```\n\nCreates `.claw/` directory structure with project.json, agents.json, and all required subdirectories.\n\n### Run the Project Dashboard\n\n```bash\n./tools/project-dashboard.sh --workspace /path/to/my-project | jq\n```\n\n### Pick the Next Eligible Task\n\n```bash\n./tools/project-next-pick.sh --workspace /path/to/my-project | jq\n```\n\n### Drive One Task (One-Shot)\n\n```bash\n./tools/project-next-drive.sh --workspace /path/to/my-project --dry_run | jq\n```\n\nRemove `--dry_run` to execute for real.\n\n### Apply a Dev Patch\n\n```bash\n./tools/apply-dev-patch.sh --workspace /path/to/my-project \\\n  --run_folder .claw/runs/20260220_T-01 --dry_run\n```\n\n### Drive in a Loop (Cron-Friendly)\n\n```bash\n./tools/project-drive-loop.sh --workspace /path/to/my-project --sleep 5 --max 10\n```\n\nStops on `.stop` file, max iterations, or no eligible work. Prints JSON summary on exit.\n\n### Create a Mission (Goal-Driven Capability Activation)\n\n```bash\n./tools/create-mission.sh --workspace /path/to/my-project --goal \"improve UX of checkout\"\n```\n\nDetects intents (`ux`) and stack (`nextjs`), activates the `ux_audit` capability, and writes `.claw/capabilities.json` + `.claw/missions/\u003cid\u003e.json`.\n\n### Validate a Project's Backlog Graph\n\n```bash\n./tools/validate-backlog-graph.sh --workspace /path/to/my-project my-project | jq\n```\n\n### Update Backlog Item Status (With Guards)\n\n```bash\n./tools/backlog-update-status.sh --workspace /path/to/my-project my-project TASK-01 done\n```\n\nRejects epic-to-done transitions when children are incomplete.\n\n---\n\n## Configuration\n\n| Variable | Default | Description |\n|----------|---------|-------------|\n| `WORKSPACE_ROOT` | `~/dev/agent-work` | Root of the target project repo. All `.claw/` paths resolved relative to this. Can also be set via `--workspace` flag. |\n| `DP_AUDIT_LOG` | `0` | Set to `1` to enable append-only audit logging. |\n| `LOOP_SLEEP_SECONDS` | `5` | Seconds between drive loop iterations. |\n| `LOOP_MAX_ITERATIONS` | `100` | Max iterations for drive loop (0 = unlimited). |\n\nAll configuration is via environment variables. No config files to manage.\n\n---\n\n## Safety, Determinism, and Quality Controls\n\n### Guardrails\n\n- **Path confinement.** Every file operation goes through `safePath()` which rejects any path resolving outside `WORKSPACE_ROOT`.\n- **Schema enforcement.** Every JSON output is validated against a schema with `additionalProperties: false`. Undeclared fields are rejected.\n- **Role enforcement.** Agents can only produce artifacts for stages matching their assigned role.\n- **Read-only tools.** Index, pick, dashboard, watch, and list tools never create, modify, or delete files.\n- **Write guards.** The autonomous runner never overwrites existing artifacts. The backlog updater rejects invalid epic transitions.\n- **Graph validation.** Dependency cycles and invalid parent references are detected before runs start.\n\n### Determinism\n\nGiven the same filesystem state:\n- The picker always returns the same task.\n- The index always returns the same project/run arrays in the same order.\n- Task packs are generated identically (excluding timestamps).\n- Schema validation produces the same result.\n\nTimestamps and git HEAD are the only sources of non-determinism.\n\n### Testing Strategy\n\n- Comprehensive test coverage (`bash tools/test-all.sh`) covering:\n  - State machine transitions and gate enforcement\n  - Schema validation round-trips for all artifact types\n  - Picker determinism and graph-aware constraint enforcement\n  - Autonomous runner safety (no directory creation, no artifact overwrite)\n  - Role enforcement and cross-role leakage prevention\n  - Dashboard computed fields and dependency chain enrichment\n  - Epic completion guards (validation-time and write-time)\n  - Real data validation against live schemas\n  - Capability registry, goal-selector, and mission layer\n  - End-to-end capability injection (UX, security, performance audits)\n- **`tools/test-all.sh`** is the single gate. Zero failures required.\n- **GitHub Actions CI** runs on every push and PR.\n\n### Failure Modes\n\n| Failure | Behavior |\n|---------|----------|\n| Corrupted JSON | Skipped by index/pick tools. `run_next_safe` returns `action: \"error\"`. |\n| Missing status.json | Run listed with `has_status: false`, not picked. |\n| Stalled run | Detected when audit log untouched for 30+ minutes. Flagged in dashboard. |\n| Schema violation | Artifact rejected. Stage cannot advance. |\n| Dependency cycle | Detected by validator. Driver skips project with JSON warning. |\n| Agent role mismatch | Artifact submission rejected with clear error. |\n\n---\n\n## Governance\n\n### Where Decisions Live\n\n| Document | Purpose |\n|----------|---------|\n| `ARCHITECTURE.md` / `docs/ARCHITECTURE_DETAILED.md` | Concise overview at the root; full design, state machine, determinism model, and evolution roadmap in the detailed doc. Single source of truth for phase scope and stop conditions. |\n| `docs/GOVERNANCE.md` | Golden rules: every change tied to a ticket, every JSON has a schema, every schema has tests, no external dependencies. |\n| `skills/dev-pipeline/SKILL.md` | Operational reference for all tools, commands, schemas, and behaviors. |\n| `.claw/tickets/\u003cticket_id\u003e.md` | Individual ticket definitions with goals, steps, and acceptance criteria. |\n\n### How Changes Are Proposed\n\n1. Create a ticket file in `.claw/tickets/` with frontmatter and required sections.\n2. Create a backlog item in `.claw/backlog/`.\n3. Reference the active phase. Changes outside the current phase are rejected.\n4. Implement. Run `bash tools/test-all.sh`. Zero failures required.\n5. Update `SKILL.md` if new tools or behaviors were added.\n6. Commit with a descriptive message.\n\n### Evolution Governance\n\n- Only one phase may be active at a time.\n- A phase is complete when every stop condition evaluates to true.\n- Every ticket must reference its phase.\n- `docs/ARCHITECTURE_DETAILED.md` is the single source of truth for phase scope. Conflicts between tickets and the architecture doc are resolved in favor of the architecture doc.\n\n---\n\n## Contributing\n\n### Branch and PR Rules\n\n1. Create a ticket file before starting work.\n2. Run `bash tools/test-all.sh` and confirm zero failures before committing.\n3. Keep commits focused: one logical change per commit.\n4. Use conventional commit prefixes: `feat:`, `fix:`, `chore:`, `refactor:`, `docs:`, `ci:`.\n5. Do not introduce external npm dependencies.\n6. Do not add `additionalProperties` to schemas without the `: false` constraint.\n7. Do not modify the state machine or role boundaries without a ticket referencing a specific phase.\n\n### Verification Checklist\n\n```bash\n# 1. Run the full test suite\nbash tools/test-all.sh\n\n# 2. Initialize a workspace and validate its graph\n./tools/init-workspace.sh --workspace /tmp/test --project_id test --title \"Test\"\n./tools/validate-backlog-graph.sh --workspace /tmp/test test | jq '.valid'\n\n# 3. Confirm dashboard produces valid output\n./tools/project-dashboard.sh --workspace /tmp/test | jq '.ok'\n\n# 4. Confirm git status is clean\ngit status\n```\n\n---\n\n## Risks and Constraints\n\n| Risk | Mitigation |\n|------|------------|\n| **Determinism boundary.** Timestamps and git HEAD introduce non-determinism. | Timestamps are informational only, never used for ordering decisions. |\n| **Agent hallucination.** LLM-generated artifacts may contain incorrect content. | Schema validation catches structural errors. QA and Review stages provide content checks. |\n| **Cost and latency.** Autonomous runner invokes Claude CLI per stage. | `--dry_run` mode for testing. Scaffold adapter for development without API calls. Max step/agent call limits. |\n| **Data privacy.** All data stays on the local filesystem. | No outbound network calls from pipeline scripts. Dashboard binds to `127.0.0.1` only. Workspace permissions set to `700`. |\n| **Single-machine limitation.** No distributed execution. | By design for the current maturity level. Filesystem-as-API is the intentional constraint. |\n| **No rollback mechanism.** Status transitions are one-way writes. | Append-only audit log provides full history. Artifacts are never overwritten. Safe reruns from current state. |\n\n---\n\n## Done\n\n- [x] Deterministic 8-stage pipeline with role boundaries\n- [x] Schema validation on all artifacts, tool outputs, input data, and reference schemas\n- [x] Persistent agent identity with workload tracking and role enforcement\n- [x] Structured work graph: epic hierarchy, DAG validation, cycle detection\n- [x] Graph-aware scheduler: dependency satisfaction, parent blocking, epic completion\n- [x] Write-time epic completion guard\n- [x] Preflight graph validation in project driver\n- [x] Task pack generation (deterministic, no LLM)\n- [x] Autonomous multi-agent runner with stop/resume and audit logging\n- [x] Project dashboard with dependency chains and agent workload\n- [x] Ticket persistence with anti-truncation guards\n- [x] External workspace model (pure engine, `.claw/` state in target repos)\n- [x] Workspace bootstrap (`init-workspace`) and patch application (`apply-dev-patch`)\n- [x] Pluggable capability system with manifest-driven stage injection\n- [x] Goal-driven mission layer (deterministic intent + stack detection)\n- [x] Built-in capabilities: UX audit, security audit, performance audit\n- [x] GitHub Actions CI (all tests green, zero failures)\n- [x] `additionalProperties: false` on all schemas (governance rule enforced)\n- [x] Cron-friendly drive loop with safe stop\n- [x] Zero external dependencies\n- [x] Phase 3: Artifact classification with semantic types\n- [x] Phase 3: Global artifact index across projects and runs\n- [x] Phase 3: Research workflow with structured findings\n- [x] Phase 3: Agent memory persistence across runs\n- [x] Phase 3: Cross-run knowledge retention in task packs\n- [x] Phase 4: Run analytics engine (per-run metrics + project aggregates)\n- [x] Phase 4: Self-evaluation (quality score, deviations, suggestions, memory write)\n- [x] Phase 4: Workflow suggestions (recurring QA failures, bottlenecks, rejection rates, quality trends)\n- [x] Phase 4: Gap scanner with auto-create backlog items\n- [x] Phase 4: Agent performance profiles with recommended_agent in picker\n- [x] Phase 5: Post-run lifecycle hooks (auto self-eval + gap scan)\n- [x] Phase 5: Adaptive agent prompt (memory + suggestions in Claude Code prompt)\n- [x] Phase 5: Agent assignment actuation (recommended_agent pick → drive → runner)\n- [x] Phase 5: Adaptive JS drive loop (sleep, hooks, project filter, stop conditions)\n- [x] Phase 5: Dashboard closed-loop status fields\n- [x] Phase 6: Template-enriched adapter prompt (stage task files as primary instructions)\n- [x] Phase 6: Prior artifact context injection (STAGE_ARTIFACT_DEPS, per-artifact truncation)\n- [x] Phase 6: Validation retry loop (maxRetries with error feedback)\n- [x] Phase 6: Post-implement auto-patch application (dry-run first, non-fatal)\n- [x] Phase 6: Dashboard feature flags\n- [x] Phase 7: `artifactList` scope fix in `claudeCodeAdapter`\n- [x] Phase 7: Post-patch test execution (`discoverTestCommand` + `runPostPatchTests`)\n- [x] Phase 7: Auto-commit after successful tests (opt-in, never pushes)\n- [x] Phase 7: Backlog auto-completion (run done → backlog item done)\n- [x] Phase 7: `recommendAgent` fallback in project driver\n- [x] Phase 7: Claude adapter mock-based test coverage\n- [x] Phase 7: Dashboard feature flags\n\n---\n\n## License\n\nMIT — see [`LICENSE`](./LICENSE).\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fhenrio123%2Fagent-work","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fhenrio123%2Fagent-work","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fhenrio123%2Fagent-work/lists"}