https://github.com/peter-stratton/dark-factory
Human constraints, interactive planning, autonomous execution.
- Host: GitHub
- URL: https://github.com/peter-stratton/dark-factory
- Owner: peter-stratton
- License: other
- Created: 2026-03-02T02:37:00.000Z (4 months ago)
- Default Branch: main
- Last Pushed: 2026-03-31T23:54:22.000Z (3 months ago)
- Last Synced: 2026-04-01T19:20:42.817Z (3 months ago)
- Topics: ai-agents, autonomous-development, cli, developer-tools, golang
- Language: Go
- Homepage:
- Size: 2.89 MB
- Stars: 4
- Watchers: 0
- Forks: 1
- Open Issues: 10
Metadata Files:
- Readme: README.md
- Contributing: CONTRIBUTING.md
- License: LICENSE
- Roadmap: docs/roadmap/README.md
README
```
_ _ __ _
__| | __ _ _ __| | __ / _| __ _ ___| |_ ___ _ __ _ _
/ _` |/ _` | '__| |/ /_____| |_ / _` |/ __| __/ _ \| '__| | | |
| (_| | (_| | | | <______| _| (_| | (__| || (_) | | | |_| |
\__,_|\__,_|_| |_|\_\ |_| \__,_|\___|\__\___/|_| \__, |
|___/
```
A Go CLI built for [Claude Code](https://docs.anthropic.com/en/docs/claude-code)
that orchestrates autonomous AI agents to implement GitHub issues, review their
own work, and merge — without human intervention.
**[Documentation](https://godarkfactory.com)** ·
**[Getting Started](https://godarkfactory.com/docs/getting-started)** ·
**[Releases](https://github.com/peter-stratton/dark-factory/releases)**
### Philosophy
The hard part of software engineering isn't typing code — it's deciding what to
build and how it fits. Dark Factory keeps those decisions with humans. Engineers
write the roadmap, define architecture layers, design conventions, and author
issue specs. Agents operate within those constraints. The harness *is* the
design.
This is a collaborative architecture tool, not a "throw a ticket at an AI and
hope for the best" system. The adversarial review model reinforces this: a
separate reviewer agent checks whether the code respects the architecture a
human defined, follows conventions a human wrote, and meets acceptance criteria
a human specified. Every judgment call that shapes a codebase stays with the
humans who understand it.
Dark Factory has been built entirely by its own agent pipeline — every feature
was implemented, reviewed, and merged by `godark run`. The humans write specs
and design harnesses; the agents write code.
## Install
**Homebrew** (macOS):
```bash
brew install peter-stratton/dark-factory/godark
```
**Go install**:
```bash
go install github.com/peter-stratton/dark-factory/cmd/godark@latest
```
**Binary download**: grab a pre-built binary from
[GitHub Releases](https://github.com/peter-stratton/dark-factory/releases).
### Platform support
Dark Factory is built for Claude Code and GitHub. The architecture is designed
around Claude Code's specific capabilities — session resumption, CLAUDE.md as a
control surface, slash command skills, and sandboxed execution.
| Layer | Supported |
|-------|-----------|
| AI agent | Claude Code (Anthropic) |
| Version control | GitHub |
### Features
- **Three-agent pipeline** — implementer, quality reviewer, and functional reviewer are independent agents with isolated permissions; reviewers literally cannot edit files
- **Specification-driven quality gates** — human-authored scenario specs define "done"; the functional reviewer generates ephemeral integration tests from those specs rather than rubber-stamping the diff
- **Architecture-as-code enforcement** — machine-readable layer definitions validated by `godark vet`; reviewers check architectural compliance, not just correctness
- **Structured agent dialogue** — implementer posts reasoning as PR comments, reviewers challenge it; the PR thread is an auditable record of adversarial design review
- **Full run observability** — local web dashboard with review chain timelines, quality flags, tool traces, and agent dialogue history for every issue
- **Harness engineering lifecycle** — scaffold, validate, and enforce project constraints with `godark new`, `godark init`, `godark vet`, and six harness types
- **Auto-detected multi-language support** — detects project type from marker files and configures the sandbox, build, and test commands automatically
- **Fully sandboxed agent runs by default** — agents execute inside ephemeral Docker containers with no access to the host filesystem or network beyond what's explicitly configured
- **Single binary, runs on a laptop** — no infrastructure fleet, no MCP server farm; just a Go binary and Docker
## How it works
Given a GitHub repo and a milestone, `godark` runs a three-agent development loop:
1. **Fetch** open issues from the milestone, sorted by priority (`p1` → `p2` → `p3` → unlabeled)
2. **Resolve dependencies** — issues declare `Blocked by: #N` or `Depends on: #N` in their body; skip any whose dependencies are still open
3. **Implementer** — Claude Code implements the issue, writes unit tests, and opens a PR
4. **Guard rails** — verify the PR exists, contains `Closes #N`, and didn't touch protected files
5. **Quality reviewer** — a separate Claude Code instance audits the PR for security, performance, and code quality issues; if it requests changes, the implementer retries before functional review begins
6. **Functional reviewer** — another Claude Code instance reviews the PR against human-authored scenario specs, generates ephemeral integration tests, and approves or requests changes
7. **Retry loop** — if either reviewer rejects, the implementer reads the review comments and pushes fixes (max N retries per gate)
8. **Merge or escalate** — approved PRs are squash-merged; failed PRs are labeled `needs-human-review`
9. **Punchlist** — for each merged PR, a tool-less punchlist agent generates 3-5 concrete manual acceptance tests (specific config values, commands, expected outcomes) rendered as checkboxes alongside the existing punchlist output
10. **Repeat** — move to the next unblocked issue
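
Steps 1 and 2 above (priority ordering and dependency resolution) can be sketched in a few lines of Go. This is an illustration under stated assumptions, not the actual `godark` implementation: the `Issue` struct, the function names, the case-insensitive matching, and the choice to treat a dependency missing from the fetched set as closed are all hypothetical.

```go
package main

import (
	"fmt"
	"regexp"
	"sort"
	"strconv"
)

// Issue is a hypothetical, simplified view of a GitHub issue.
type Issue struct {
	Number int
	Labels []string
	Body   string
	Open   bool
}

// depPattern matches "Blocked by: #N" or "Depends on: #N" in an issue body.
var depPattern = regexp.MustCompile(`(?i)(?:blocked by|depends on):\s*#(\d+)`)

// ranks orders the priority labels; unlabeled issues get rank 4.
var ranks = map[string]int{"p1": 1, "p2": 2, "p3": 3}

// priorityRank returns the best (lowest) rank among an issue's labels.
func priorityRank(labels []string) int {
	best := 4
	for _, l := range labels {
		if r, ok := ranks[l]; ok && r < best {
			best = r
		}
	}
	return best
}

// dependencies extracts the issue numbers an issue body declares it waits on.
func dependencies(body string) []int {
	var deps []int
	for _, m := range depPattern.FindAllStringSubmatch(body, -1) {
		if n, err := strconv.Atoi(m[1]); err == nil {
			deps = append(deps, n)
		}
	}
	return deps
}

// runnable returns the open issues whose dependencies are all closed,
// sorted p1, p2, p3, unlabeled. Dependencies that are not in the input
// are treated as closed (an assumption of this sketch).
func runnable(issues []Issue) []Issue {
	open := make(map[int]bool, len(issues))
	for _, is := range issues {
		open[is.Number] = is.Open
	}
	var out []Issue
	for _, is := range issues {
		if !is.Open {
			continue
		}
		blocked := false
		for _, d := range dependencies(is.Body) {
			if open[d] {
				blocked = true
				break
			}
		}
		if !blocked {
			out = append(out, is)
		}
	}
	sort.SliceStable(out, func(i, j int) bool {
		return priorityRank(out[i].Labels) < priorityRank(out[j].Labels)
	})
	return out
}

func main() {
	issues := []Issue{
		{Number: 1, Labels: []string{"p2"}, Body: "Add config loader", Open: true},
		{Number: 2, Labels: []string{"p1"}, Body: "Blocked by: #1", Open: true},
		{Number: 3, Body: "Depends on: #4", Open: true},
		{Number: 4, Labels: []string{"p3"}, Body: "Docs", Open: false},
		{Number: 5, Labels: []string{"p1"}, Body: "Fix flag parsing", Open: true},
	}
	for _, is := range runnable(issues) {
		fmt.Printf("#%d\n", is.Number) // prints #5, #1, #3 on separate lines
	}
}
```

Issue #2 is skipped because #1 is still open, while #3 is eligible because its dependency #4 is already closed. The stable sort keeps the fetched order within each priority level.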
## Quick start
```bash
# New project
godark new my-project --repo owner/my-project
# Existing project
godark init --repo owner/my-project
```
Then open the project in Claude Code and use the built-in skills to define your
architecture, conventions, and roadmap. See the
[Getting Started guide](https://godarkfactory.com/docs/getting-started) for a
full walkthrough.
## Documentation
Full documentation is available at **[godarkfactory.com](https://godarkfactory.com)**:
- [Getting Started](https://godarkfactory.com/docs/getting-started) — installation, setup, and tutorial
- [CLI Reference](https://godarkfactory.com/docs/cli) — all commands, flags, and usage examples
- [Configuration](https://godarkfactory.com/docs/configuration) — `godark.yaml` deep dive
- [Skills](https://godarkfactory.com/docs/skills) — slash commands for roadmaps, planning, issues, and more
- [Licensing & Adoption](https://godarkfactory.com/docs/licensing) — commercial use, data privacy, and FAQ
## Phase overviews
Each completed phase has a practical overview with real-world examples showing
what was built and how users experience it. These live in
[`docs/phase-overviews/`](docs/phase-overviews/):
| Phase | Overview |
|-------|----------|
| 1 | [Skeleton & Orchestration](docs/phase-overviews/phase-01-skeleton-and-orchestration.md) — CLI scaffold, config, deps, dry-run |
| 2 | [Quality & Vetting](docs/phase-overviews/phase-02-quality-and-vetting.md) — `godark vet` validation framework |
| 3 | [Docker Sandbox](docs/phase-overviews/phase-03-docker-sandbox.md) — container isolation, auth, cloning |
| 4 | [Agent Execution](docs/phase-overviews/phase-04-agent-execution.md) — implementer, reviewer, guard rails, retry loop |
| 5 | [Agent SDK Migration](docs/phase-overviews/phase-05-agent-sdk-migration.md) — SDK wrapper, role permissions, session resumption |
| 6 | [Multi-Language Support](docs/phase-overviews/phase-06-multi-language-support.md) — auto-detect, runtime config, pluggable Dockerfiles |
| 7 | [Review Quality & Dashboard](docs/phase-overviews/phase-07-review-quality-and-dashboard.md) — run data, quality flags, web dashboard |
| 8 | [Harness Engineering](docs/phase-overviews/phase-08-harness-engineering.md) — harness templates, `godark new`, vet architecture |
| 9 | [Harness-Aware Agent Execution](docs/phase-overviews/phase-09-harness-aware-agent-execution.md) — harness injection, dialogue, enforcement |
| 10 | [Deterministic Verification Pipeline](docs/phase-overviews/phase-10-deterministic-verification-pipeline.md) — verify step, auto-fix, bash deny-list |
| 11 | [Run Analysis & Prompt Feedback](docs/phase-overviews/phase-11-run-analysis-and-prompt-feedback.md) — `godark analyze`, trends, prompt gaps |
| 12 | [Complex Project Support](docs/phase-overviews/phase-12-complex-project-support.md) — multi-module, codegen, secrets, CI checks |
| 13 | [Human-in-the-Loop Review](docs/phase-overviews/phase-13-human-in-the-loop-review.md) — graduated auto-merge, watch command, risk classifier, notifications |
| 14 | [Bounded Concurrency](docs/phase-overviews/phase-14-bounded-concurrency.md) — wave-barrier dispatcher, RunMode, serial post-wave merge, rate-limit batching, per-issue logs |
| 15 | *Deferred* — Server Mode & Centralized Operation |
| 16 | [Public Release](docs/phase-overviews/phase-16-public-release.md) — ELv2 license, GoReleaser, Homebrew tap, release workflow, CONTRIBUTING.md |
| 17 | [Configurable Base Branch](docs/phase-overviews/phase-17-configurable-base-branch.md) — base branch config, PR targeting, prompt safety, run data tracking |
| 18 | [Adaptive Agent Loop](docs/phase-overviews/phase-18-adaptive-agent-loop.md) — recon agent, hybrid retry strategy, handoff context |
| 19 | [Spring Cleaning](docs/phase-overviews/phase-19-spring-cleaning.md) — unified verdict parsing, typed constants, shared helpers, CLI consolidation |
| 20 | [Terminal UI](docs/phase-overviews/phase-20-terminal-ui.md) — Bubble Tea TUI, progress reporter, adaptive colors, hybrid output mode |
| 21 | [Analytics Persistence](docs/phase-overviews/phase-21-analytics-persistence.md) — SQLite stats store, retry recovery rate, cost/duration breakdown, repo stats, flag-based prompt gaps |
| 22 | [Analytics Overhaul](docs/phase-overviews/phase-22-analytics-overhaul.md) — first-pass rate, wasted cost, failure reasons, per-repo breakdown, sprint report command |
| 23 | [Watch & Daemon Mode](docs/phase-overviews/phase-23-watch-and-daemon-mode.md) — shared watch package, daemon mode, external merge detection, watch TUI and dashboard |
| 24 | [Container Resource Tracking](docs/phase-overviews/phase-24-container-resource-tracking.md) — Docker stats capture, per-step memory/CPU, analyze output, dashboard columns, host mode |
| 25 | [Docker Socket Mount & Compose Lifecycle](docs/phase-overviews/phase-25-docker-socket-mount-and-compose-lifecycle.md) — compose config, socket mount, up/down lifecycle, env forwarding, doctor checks |
| 26 | [Merge Coordinator Agent](docs/phase-overviews/phase-26-merge-coordinator-agent.md) - dedicated conflict resolver, per-issue and rollup integration, telemetry, dashboard step |
| 27 | [Agent Efficiency & Resilience](docs/phase-overviews/phase-27-agent-efficiency-and-resilience.md) - per-role judge thresholds, benign kill handling, model overrides, handoff context, generalized recon |
| 28 | [Container Health Judge](docs/phase-overviews/phase-28-container-health-judge.md) — real-time log streaming, idle/thrash/transport rules, container retry, intervention flow |
| 29 | [Complete CLI Migration](docs/phase-overviews/phase-29-complete-cli-migration.md) - delete Python runner, simplify Run(), remove --no-sandbox, unconditional Docker, test migration |
| 30 | [Spec Tightening](docs/phase-overviews/phase-30-spec-tightening.md) - GIVEN/WHEN/THEN validation, phase-scoped vet, spec delta generation, pipeline integration |
| 31 | [Planner Agent](docs/phase-overviews/phase-31-planner-agent.md) - structured implementation plans, non-blocking pipeline step, implementer prompt injection, model override |
| 32 | [Decision Flow Tracing](docs/phase-overviews/phase-32-decision-flow-tracing.md) - trace ID generation, SQLite persistence, `godark trace` CLI, dashboard copy button, TUI column |
| 33 | [Semi-Structured Review](docs/phase-overviews/phase-33-semi-structured-review.md) - semi-formal reviewer prompt, config toggle, consistency quality gate, automatic re-run on contradiction |
To generate an overview for a newly completed phase, use `/godark-create-phase-overview`.
## Building
```bash
go build -o bin/godark ./cmd/godark
go test ./...
```
## Status
See [docs/roadmap/](docs/roadmap/) for the full development roadmap.
## License
Dark Factory is licensed under the [Elastic License 2.0](LICENSE). It is free for
commercial use; the main restriction is that you can't offer it to others as a
hosted or managed service. See the [Licensing & Adoption](https://godarkfactory.com/docs/licensing)
page for details.