https://github.com/shinpr/codex-workflows
Structured agentic coding workflows for OpenAI Codex CLI with specialized AI subagents, planning, and quality gates.
https://github.com/shinpr/codex-workflows
agent-skills agentic-coding ai-coding ai-coding-agent code-generation code-quality codex codex-cli context-engineering development-workflow multi-agent openai subagents tdd
Last synced: 12 days ago
JSON representation
Structured agentic coding workflows for OpenAI Codex CLI with specialized AI subagents, planning, and quality gates.
- Host: GitHub
- URL: https://github.com/shinpr/codex-workflows
- Owner: shinpr
- License: mit
- Created: 2026-03-17T11:13:57.000Z (about 1 month ago)
- Default Branch: main
- Last Pushed: 2026-04-06T11:57:48.000Z (21 days ago)
- Last Synced: 2026-04-06T13:32:55.149Z (21 days ago)
- Topics: agent-skills, agentic-coding, ai-coding, ai-coding-agent, code-generation, code-quality, codex, codex-cli, context-engineering, development-workflow, multi-agent, openai, subagents, tdd
- Language: JavaScript
- Homepage:
- Size: 326 KB
- Stars: 6
- Watchers: 0
- Forks: 1
- Open Issues: 1
-
Metadata Files:
- Readme: README.md
- License: LICENSE
Awesome Lists containing this project
README
# codex-workflows
[](https://developers.openai.com/codex/cli)
[](https://developers.openai.com/codex/skills/)
[](https://opensource.org/licenses/MIT)
**Structured agentic coding workflows for [OpenAI Codex CLI](https://developers.openai.com/codex/cli)** — specialized AI coding agents plan, implement, test, and review changes with traceable docs, task-level commits, and quality gates.
Built on the [Agent Skills specification](https://developers.openai.com/codex/skills/) and [Codex subagents](https://developers.openai.com/codex/subagents). Designed for long-running tasks, large refactors, and reviewable changes.
---
## Quick Start
```bash
cd your-project
npx codex-workflows install
```
Then in Codex CLI:
```
$recipe-implement Add user authentication with JWT
```
`$` is Codex CLI's syntax for invoking a skill explicitly. Type `$recipe-` to see all available recipes via tab completion.
Small changes stay lightweight. Larger tasks get structure: requirements → design → task decomposition → TDD implementation → quality gates.
codex-workflows is the Codex-native counterpart of [Claude Code Workflows](https://github.com/shinpr/claude-code-workflows): same document-driven development style, adapted for Codex CLI, subagents, and GPT models.
---
## Why codex-workflows?
Codex is already strong at one-shot implementation. The problem starts when a change spans multiple files, needs design decisions to stay visible, or has to survive review, testing, and follow-up edits.
For larger tasks, explicit planning changes the job from raw generation into verification against a design, a task breakdown, and acceptance criteria. That matters because review loops are more reliable than first-shot generation once scope and ambiguity grow.
codex-workflows adds the missing structure around those jobs:
- Traceable artifacts: PRD → Design Doc → Task → Commit
- Built-in TDD and quality gates before code is ready to commit
- Agent context separation for large refactors, migrations, and PR-sized changes
- Diagnosis and reverse-engineering flows for bugs and legacy code
## Not Designed For
- One-shot toy scripts or vibe-coding sessions where speed matters more than traceability
- Repositories that do not use tests, lint, builds, or reviewable commits
- Teams that do not want design docs, task breakdowns, or explicit quality gates
---
## What It Does
A single request becomes a structured development process. The framework chooses the level of ceremony based on scope:
| Scale | File Count | What Happens |
|-------|------------|-------------|
| Small | 1-2 | Simplified plan → direct implementation |
| Medium | 3-5 | Design Doc → work plan → task execution |
| Large | 6+ | PRD → ADR → Design Doc → test skeletons → work plan → autonomous execution |
For larger work, the path usually looks like this: understand the problem, analyze the codebase, design the change, break it into atomic tasks, implement with tests, and run quality checks before commit.
Each step is handled by a specialized subagent in its own context, using context engineering to prevent context pollution and reduce error accumulation in long-running tasks:
```
User Request
↓
requirement-analyzer → Scale determination (Small / Medium / Large)
↓
prd-creator → Product requirements (Large scale)
↓
codebase-analyzer → Existing codebase facts + focus areas
↓
technical-designer → ADR + Design Doc with acceptance criteria
↓
code-verifier → Design Doc vs existing code verification
↓
document-reviewer → Quality gate with verification evidence
↓
acceptance-test-gen → Test skeletons from ACs
↓
work-planner → Phased execution plan
↓
task-decomposer → Atomic tasks (1 task = 1 commit)
↓
task-executor → TDD implementation per task
↓
quality-fixer → Lint, test, build — no failing checks
↓
Ready to commit
```
### The Diagnosis Pipeline
```
Problem → investigator → verifier (ACH + Devil's Advocate) → solver → Actionable solutions
```
### Reverse Engineering
```
Existing code → scope-discoverer (discoveredUnits + prdUnits) → prd-creator → code-verifier → document-reviewer → Design Docs
```
This works best when repository knowledge is explicit and local. Short `AGENTS.md` files can act as entry points, while design docs, plans, and task files hold the deeper instructions that agents need to execute reliably.
---
## Installation
### Requirements
- [Codex CLI](https://developers.openai.com/codex/cli) (latest)
- Node.js >= 22
### Install
```bash
cd your-project
npx codex-workflows install
```
This copies into your project:
- `.agents/skills/` — Codex skills (foundational + recipes)
- `.codex/agents/` — Subagent TOML definitions
- Manifest file for tracking managed files
### Update
```bash
# Preview what will change
npx codex-workflows update --dry-run
# Apply updates
npx codex-workflows update
```
Files you've modified locally are preserved — the updater compares each file against its hash at install time and skips any file you've changed. New files from the update are added automatically.
```bash
# Check installed version
npx codex-workflows status
```
---
## Recipe Workflows
Invoke recipes with `$recipe-name` in Codex. Type `$recipe-` and use tab completion to see all available recipes.
### Backend & General
| Recipe | What it does | When to use |
|--------|-------------|-------------|
| `$recipe-implement` | Full lifecycle with layer routing (backend/frontend/fullstack) | New features — universal entry point |
| `$recipe-task` | Single task with rule selection | Bug fixes, small changes |
| `$recipe-design` | Requirements → ADR/Design Doc | Architecture planning |
| `$recipe-plan` | Design Doc → test skeletons → work plan | Planning phase |
| `$recipe-build` | Execute backend tasks autonomously | Resume backend implementation |
| `$recipe-review` | Design Doc compliance and security validation with auto-fixes | Post-implementation check |
| `$recipe-diagnose` | Problem investigation → verification → solution | Bug investigation |
| `$recipe-reverse-engineer` | Generate PRD + Design Docs from existing code | Legacy system documentation |
| `$recipe-add-integration-tests` | Add integration/E2E tests from Design Doc | Test coverage for existing code |
| `$recipe-update-doc` | Update existing Design Doc / PRD / ADR with review | Spec changes, document maintenance |
### Frontend (React/TypeScript)
| Recipe | What it does | When to use |
|--------|-------------|-------------|
| `$recipe-front-design` | Requirements → UI Spec → frontend Design Doc | Frontend architecture planning |
| `$recipe-front-plan` | Frontend Design Doc → test skeletons → work plan | Frontend planning phase |
| `$recipe-front-build` | Execute frontend tasks with RTL + quality checks | Resume frontend implementation |
| `$recipe-front-review` | Frontend compliance and security validation with React-specific fixes | Frontend post-implementation check |
### Fullstack (Cross-Layer)
| Recipe | What it does | When to use |
|--------|-------------|-------------|
| `$recipe-fullstack-implement` | Full lifecycle with separate Design Docs per layer | Cross-layer features |
| `$recipe-fullstack-build` | Execute tasks with layer-aware agent routing | Resume cross-layer implementation |
### Examples
**Full feature development:**
```
$recipe-implement Add user authentication with JWT and role-based access control
```
**Quick fix with proper rule selection:**
```
$recipe-task Fix validation error message in checkout form
```
**Investigate a bug:**
```
$recipe-diagnose API returns 500 error on user login after deployment
```
**Document undocumented legacy code:**
```
$recipe-reverse-engineer src/auth module
```
---
## Foundational Skills
These load automatically when the conversation context matches — no explicit invocation needed:
| Skill | What it provides |
|-------|-----------------|
| `coding-rules` | Code quality, function design, error handling, refactoring |
| `testing` | TDD Red-Green-Refactor, test types, AAA pattern, mocking |
| `ai-development-guide` | Anti-patterns, debugging (5 Whys), quality check workflow |
| `documentation-criteria` | Document creation rules and templates (PRD, ADR, Design Doc, Work Plan) |
| `implementation-approach` | Strategy selection: vertical / horizontal / hybrid slicing |
| `integration-e2e-testing` | Integration/E2E test design, ROI calculation, review criteria |
| `task-analyzer` | Task analysis, scale estimation, skill selection |
| `subagents-orchestration-guide` | Multi-agent coordination, workflow flows, autonomous execution |
Language-specific references are included for TypeScript/React projects (`coding-rules/references/typescript.md`, `testing/references/typescript.md`).
---
## Subagents
Codex spawns these as needed during recipe execution. Each agent runs in its own context with specialized instructions and skill configurations.
### Document Creation Agents
| Agent | Role |
|-------|------|
| `requirement-analyzer` | Requirements analysis and work scale determination |
| `prd-creator` | PRD creation and structuring |
| `technical-designer` | ADR and Design Doc creation (backend) |
| `technical-designer-frontend` | Frontend ADR and Design Doc creation (React) |
| `ui-spec-designer` | UI Specification from PRD and optional prototype code |
| `codebase-analyzer` | Existing codebase analysis before Design Doc creation |
| `work-planner` | Work plan creation from Design Docs |
| `document-reviewer` | Document consistency and approval |
| `design-sync` | Cross-document consistency verification |
### Implementation Agents
| Agent | Role |
|-------|------|
| `task-decomposer` | Work plan → atomic task files |
| `task-executor` | TDD implementation following task files (backend) |
| `task-executor-frontend` | React implementation with Testing Library |
| `quality-fixer` | Quality checks and fixes until all pass (backend) |
| `quality-fixer-frontend` | React-specific quality checks (TypeScript, RTL, bundle) |
| `acceptance-test-generator` | Test skeleton generation from acceptance criteria |
| `integration-test-reviewer` | Test quality review |
### Analysis Agents
| Agent | Role |
|-------|------|
| `code-reviewer` | Design Doc compliance validation |
| `code-verifier` | Document-code consistency verification |
| `security-reviewer` | Security compliance review after implementation |
| `rule-advisor` | Skill selection via metacognitive analysis |
| `scope-discoverer` | Codebase scope discovery for reverse docs, including PRD unit grouping |
### Diagnosis Agents
| Agent | Role |
|-------|------|
| `investigator` | Evidence collection and hypothesis enumeration |
| `verifier` | Hypothesis validation (ACH + Devil's Advocate) |
| `solver` | Solution derivation with tradeoff analysis |
---
## How It Works
### Autonomous Execution Mode
After work plan approval, the framework enters guided autonomous execution with escalation points:
1. **task-executor** implements each task with TDD
2. **quality-fixer** runs all checks (lint, tests, build) before every commit
3. Escalation pauses execution when design deviation or ambiguity is detected
4. Each task produces one commit — rollback-friendly granularity
### Context Separation
Each subagent runs in a fresh context. This context-engineering pattern keeps long-running agentic coding tasks legible and reviewable:
- generation and verification happen in separate contexts, reducing author bias and carry-over assumptions
- **document-reviewer** reviews without the author's bias
- **investigator** collects evidence without confirmation bias
- **code-reviewer** validates compliance without implementation context
---
## Project Structure
After installation, your project gets:
```
your-project/
├── .agents/skills/ # Codex skills
│ ├── coding-rules/ # Foundational (auto-loaded)
│ ├── testing/
│ ├── ai-development-guide/
│ ├── documentation-criteria/
│ ├── implementation-approach/
│ ├── integration-e2e-testing/
│ ├── task-analyzer/
│ ├── subagents-orchestration-guide/
│ ├── recipe-implement/ # Recipes ($recipe-*)
│ ├── recipe-design/
│ ├── recipe-build/
│ ├── recipe-plan/
│ ├── recipe-review/
│ ├── recipe-diagnose/
│ ├── recipe-task/
│ ├── recipe-update-doc/
│ ├── recipe-reverse-engineer/
│ └── recipe-add-integration-tests/
├── .codex/agents/ # Subagent TOML definitions
│ ├── requirement-analyzer.toml
│ ├── technical-designer.toml
│ ├── task-executor.toml
│ └── ... (23 agents total)
└── docs/ # Created as you use the recipes
├── prd/
├── design/
├── adr/
├── ui-spec/
└── plans/
└── tasks/
```
---
## Works With
If your requirements already live in Linear or an existing PRD, [linear-prism](https://github.com/shinpr/linear-prism) can decompose them into implementation-ready tasks by reading the codebase, making dependencies explicit, and preserving Design Doc boundaries.
Those tasks can then be passed into `$recipe-design` to enter the design phase with clearer scope and better task visibility.
---
## FAQ
**Q: What models does this work with?**
A: Designed for the latest GPT models. Lightweight subagents (e.g. rule-advisor) can use smaller models for faster analysis. Models are configurable per agent in the TOML files.
**Q: Can I customize the agents?**
A: Yes. Edit the TOML files in `.codex/agents/` — change model, sandbox_mode, developer_instructions, or skills.config. Files you modify locally are preserved during `npx codex-workflows update`.
**Q: What's the difference between `$recipe-implement` and `$recipe-fullstack-implement`?**
A: `$recipe-implement` is the universal entry point. It runs requirement-analyzer first, detects affected layers from the codebase, and automatically routes to backend, frontend, or fullstack flow. `$recipe-fullstack-implement` skips the detection and goes straight into the fullstack flow (separate Design Docs per layer, design-sync, layer-aware task execution). Use `$recipe-implement` when you're not sure; use `$recipe-fullstack-implement` when you know upfront that the feature spans both layers.
**Q: Does this work with MCP servers?**
A: Yes. Codex skills and subagents work alongside [MCP](https://developers.openai.com/codex/mcp) — skills operate at the instruction layer while MCP operates at the tool transport layer. You can add MCP servers to any agent's TOML configuration.
**Q: What if a subagent gets stuck?**
A: Subagents escalate to the user when they encounter design deviations, ambiguous requirements, or specification conflicts. The framework stops autonomous execution and presents the issue with options.
---
## Design Rationale
Background reading behind the workflow design
- [Planning Is the Real Superpower of Agentic Coding](https://www.norsica.jp/blog/planning-superpower-agentic-coding) — why explicit planning turns large-task execution from raw generation into verification against a design and task breakdown
- [Why LLMs Are Bad at 'First Try' and Great at Verification](https://www.norsica.jp/blog/llm-verification-over-generation) — why review loops and session separation are more reliable than first-shot generation on complex work
- [Stop Putting Everything in AGENTS.md](https://www.norsica.jp/blog/stop-putting-everything-in-agents-md) — why `AGENTS.md` should stay lean while rules, docs, and task instructions live near the point of use
---
## License
MIT License — free to use, modify, and distribute.
---
Built and maintained by [@shinpr](https://github.com/shinpr)