{"id":50324225,"url":"https://github.com/pathcosmos/codex-on-claude","last_synced_at":"2026-05-29T05:00:46.515Z","repository":{"id":358952038,"uuid":"1243834319","full_name":"pathcosmos/codex-on-claude","owner":"pathcosmos","description":"Bridge that lets Claude Code call OpenAI Codex CLI as an MCP sub-agent. Installable Skills/Agent/Plugin + continuous improvement loop.","archived":false,"fork":false,"pushed_at":"2026-05-19T18:53:11.000Z","size":80,"stargazers_count":0,"open_issues_count":0,"forks_count":0,"subscribers_count":0,"default_branch":"main","last_synced_at":"2026-05-19T21:57:31.381Z","etag":null,"topics":[],"latest_commit_sha":null,"homepage":null,"language":"JavaScript","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"mit","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/pathcosmos.png","metadata":{"files":{"readme":"README.md","changelog":"CHANGELOG.md","contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null,"zenodo":null,"notice":null,"maintainers":null,"copyright":null,"agents":null,"dco":null,"cla":null}},"created_at":"2026-05-19T17:56:47.000Z","updated_at":"2026-05-19T18:53:14.000Z","dependencies_parsed_at":null,"dependency_job_id":null,"html_url":"https://github.com/pathcosmos/codex-on-claude","commit_stats":null,"previous_names":["pathcosmos/codex-on-claude"],"tags_count":3,"template":false,"template_full_name":null,"purl":"pkg:github/pathcosmos/codex-on-claude","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/pathcosmos%2Fcodex-on-claude","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/pathcosmos%2Fcodex-on-claude/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/pathcosmos%2Fcodex-on-claude/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/pathcosmos%2Fcodex-on-claude/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/pathcosmos","download_url":"https://codeload.github.com/pathcosmos/codex-on-claude/tar.gz/refs/heads/main","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/pathcosmos%2Fcodex-on-claude/sbom","scorecard":null,"host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":286080680,"owners_count":33637485,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2026-05-26T15:22:16.424Z","status":"online","status_checked_at":"2026-05-29T02:00:06.066Z","response_time":107,"last_error":null,"robots_txt_status":"success","robots_txt_updated_at":"2025-07-24T06:49:26.215Z","robots_txt_url":"https://github.com/robots.txt","online":true,"can_crawl_api":true,"host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":[],"created_at":"2026-05-29T05:00:42.674Z","updated_at":"2026-05-29T05:00:46.493Z","avatar_url":"https://github.com/pathcosmos.png","language":"JavaScript","funding_links":[],"categories":[],"sub_categories":[],"readme":"# codex-on-claude\n\n\u003e Call OpenAI **Codex CLI** as a sub-agent from inside Claude Code.\n\u003e MCP transport + curated Skills + an optional isolated review Agent + a continuous-improvement loop with optional PostToolUse hook auto-logging + a persistent thread catalog — installed by one CLI.\n\n[![npm version](https://img.shields.io/npm/v/codex-on-claude.svg)](https://www.npmjs.com/package/codex-on-claude)\n[![License: MIT](https://img.shields.io/badge/License-MIT-yellow.svg)](LICENSE)\n\n\u003e 🇰🇷 Korean mirror: [README.ko.md](README.ko.md)\n\n---\n\n## TL;DR\n\n```sh\n# Prerequisites already installed: Codex CLI + Claude Code, both logged in.\ncodex --version \u0026\u0026 claude --version\n\n# Canonical: install or update + reconfigure in one shot\nnpx --yes codex-on-claude@latest\n```\n\nThe installer:\n\n1. Runs preflight — Node 18.17+, `codex`, `claude`, `codex doctor`, `claude auth status`, and the `codex` MCP server's `Connected` status.\n2. Walks you through **eight questions** with arrow-key navigation (↑/↓, Space to toggle, Enter to confirm). v0.5.0 adds question §7 (`usageMode`) — Codex invocation policy (`none / synergy / auto / max`). v0.5.1 adds question §8 (`cerberus`) — opt-in 3-head planning consensus mode (default `off`). Existing installs migrate silently (`usageMode=synergy`, `cerberus=off`).\n3. Shows a final review screen with `Apply / Edit again / Cancel`.\n4. Copies the selected Skills (and optional Agent / hooks / PreToolUse gate when `usageMode=none`) under `~/.claude/`.\n\nRun the same command any time to update + reconfigure — if prior state is found, the installer prints a `vPREV → vCURR` banner and walks you through previous answers as defaults. Or use `codex-on-claude reconfigure` explicitly.\n\n---\n\n## Why this exists\n\nThe bare wiring (Claude Code ↔ Codex CLI via MCP) is a one-line `claude mcp add`, but day-to-day use needs more:\n\n- A **Skill** so you don't re-type the same options every call.\n- An optional **Agent** so a 30KB Codex response doesn't eat the main context.\n- A **continuous-improvement loop** that watches your usage and proposes smarter defaults.\n- A **persistent thread catalog** so Codex sessions you started yesterday don't vanish into a \"Session not found\" error.\n\n`codex-on-claude` installs the whole layered stack, and lets you turn each layer on/off.\n\n---\n\n## Prerequisites\n\n| Item | Minimum | Check |\n|---|---|---|\n| Node.js | 18.17 | `node --version` |\n| Claude Code (`claude`) | 2.1.x | `claude --version` |\n| Codex CLI (`codex`) | 0.13.x | `codex --version`, `codex doctor` |\n| `codex` MCP server | `Connected` | `claude mcp get codex` |\n| zsh / bash | — | the installer uses one of these |\n\nBoth CLIs must be logged in.\n\n```sh\nclaude auth status --text\ncodex doctor --summary\nclaude mcp get codex          # Status: ✓ Connected\ncodex-on-claude doctor        # runs all of the above in one shot\n```\n\n---\n\n## Install\n\n### A. via npx (recommended)\n\n```sh\nnpx --yes codex-on-claude@latest   # always pulls latest + auto-reconfigures if state exists\n```\n\nThe `--yes` flag skips the \"install package\" confirmation; `@latest` cache-busts npx so you never hit a stale `_npx/\u003chash\u003e/` cache. The bare form `npx codex-on-claude` also works but is more prone to corrupted-cache `ENOENT` errors (see [Troubleshooting](#troubleshooting)).\n\n### B. global install\n\n```sh\nnpm install -g codex-on-claude\ncodex-on-claude\n```\n\n### C. from source\n\n```sh\ngit clone https://github.com/pathcosmos/codex-on-claude.git\ncd codex-on-claude\nnode install/install.mjs\n```\n\n### D. non-interactive (CI / scripted)\n\n```sh\ncodex-on-claude \\\n  --patterns=review,followup,fix,routine \\\n  --context-policy=mixed \\\n  --improvement-loop=manual \\\n  --threads=basic \\\n  --usage-mode=synergy \\\n  --auto-tier2-llm-probe=on \\\n  --subscription-claude=max --subscription-codex=pro \\\n  --codex-model-primary=gpt-5.5 --codex-reasoning-primary=xhigh \\\n  --reviewer-model-primary=opus --reviewer-reasoning-primary=xhigh \\\n  --yes\n```\n\nThe `--subscription-*` flags declare what you actually have access to. The `--*-model-primary` / `--*-reasoning-primary` flags pick the model + effort level you want to run by default. **Fallback is auto-locked** to each tier's base — if you exhaust quota on the primary, Skill prose retries once on the fallback (gpt-5/medium for Codex, sonnet/medium for the Claude reviewer agent at default tiers). Pass `--codex-model-fallback=...` etc. if you really need to override the lock; the installer warns and re-snaps to base.\n\n`--share-scope=...` is accepted but **deprecated since v0.3 and ignored** (a one-line notice prints). The plugin-bundle workflow was removed in 0.3.0; if you need team distribution, share this GitHub repo and have everyone run the installer.\n\n---\n\n## Updating codex-on-claude\n\n- **npx users** — one command does both. `npx --yes codex-on-claude@latest` fetches the newest release **and** auto-runs reconfigure when prior state is found (you'll see a `vPREV → vCURR` banner). No need to chain a separate `reconfigure` call.\n- **Global install** — bump via npm, then re-run:\n  ```sh\n  npm install -g codex-on-claude@latest    # `npm update -g` also works\n  hash -r                                  # zsh/bash: refresh the command hash so the new binary is picked up\n  codex-on-claude                          # auto-reconfigures because prior state is detected\n  ```\n  If `codex-on-claude` still says \"command not found\" after `npm update -g`, that's a stale shell hash (see [Troubleshooting](#troubleshooting)).\n- **Re-run reconfigure after updating.** Whether via npx or global, your previous answers come pre-selected, you can change any of the four options via arrow keys, and the final review screen shows exactly what's about to be saved.\n- **Roll back a release** if needed:\n  ```sh\n  npm install -g codex-on-claude@0.3.1\n  codex-on-claude reconfigure\n  ```\n  State files are forward/backward-compatible across patch releases; alias migrations (e.g. `on-demand → manual`) are one-way, and an older binary will just ignore unknown keys.\n- **What gets touched by `npm update` itself?** Only the CLI binary. Files under `~/.claude/skills/codex-*`, `~/.claude/agents/codex-reviewer.md`, `~/.claude/settings.json` (hook entries), and `~/.claude/codex-on-claude/` are modified **only** when you run `reconfigure`, `uninstall`, or another explicit subcommand.\n- Deprecated versions on npm carry a one-line `Use codex-on-claude@^X.Y.Z` pointer that `npm install` will surface.\n\n---\n\n## The install questions\n\nEach answer is also a CLI flag for non-interactive use. Reconfigure later with `codex-on-claude reconfigure` — your previous answers come back pre-selected.\n\n\u003e v0.4.1 raises the count from 4 to 6 (subscription + primary model / reasoning per side). All new questions live below the original four; the originals still work the same way.\n\u003e\n\u003e **v0.5.0** adds question §7 (`usageMode`) — Codex invocation policy. Existing installs migrate silently to `synergy` (no prompt); explicit `reconfigure` surfaces the new question.\n\n### 1. patterns (multi-select)\n\n| Choice | Skills installed |\n|---|---|\n| One-shot read-only review | `codex-review` |\n| Long multi-turn session | `codex-followup`, `codex-resume` |\n| Workspace-write delegation | `codex-fix` (with strict file allowlist guardrails) |\n| Templated repeating job | `codex-routine` |\n\nFlag: `--patterns=review,followup,fix,routine`\n\n### 2. contextPolicy (single)\n\n| Choice | Effect |\n|---|---|\n| `direct` | Skills only — Codex responses land in the main context |\n| `summarize` | Adds `codex-reviewer` subagent; Skills route via the agent so only the summary reaches main |\n| `mixed` (recommended) | Both — Skills decide per call |\n\nFlag: `--context-policy=direct|summarize|mixed`\n\n### 3. improvementLoop (single)\n\n| Choice | Behavior |\n|---|---|\n| `off` | No `codex-analyze` / `codex-improve` / `codex-log` Skills, no hooks |\n| `manual` (default) | Skill prose triggers `codex-on-claude log`; no OS hook (was: `on-demand`, kept as alias) |\n| `auto-on-skill` | Adds two PostToolUse hooks in `~/.claude/settings.json` so every `mcp__codex__codex` / `codex-reply` call is auto-logged — no Skill-prose dependence |\n| `periodic` | `auto-on-skill` plus cron/loop-friendly periodic analysis guidance |\n\nFlag: `--improvement-loop=off|manual|auto-on-skill|periodic` (alias: `on-demand → manual`)\n\nLogging is **metadata only** — prompt / response bodies are never written to disk. The hooks we install carry a `_coc.marker = \"codex-on-claude:auto-log\"` sentinel so `uninstall` / `reconfigure --improvement-loop=manual` remove only the entries we added (hook-level filter as of 0.3.4 — user hooks that happen to coexist in the same group are preserved).\n\n**Runtime config drift (v0.3.4+)**: if you manually edit `~/.claude/codex-on-claude/config.json` to set `improvementLoop` to `off` or `manual` without re-running `reconfigure`, the installed hook now respects the new value on the next call (no more silent telemetry from a stale hook). Corrupt or missing config fail-opens so existing installs don't break unexpectedly.\n\n**Hook payload extraction (v0.3.4+)**: `threadId` is now correctly extracted from Claude Code 2.1.x's PostToolUse payload (where `tool_response` is a JSON-encoded string), and `elapsedMs` is populated from `duration_ms`. Previously every auto-hook log entry had `threadId: null`, which silently broke analyzer rules that join logs to threads.\n\n### 4. threads (single, v0.2+)\n\nPersistent thread catalog at `~/.claude/codex-on-claude/threads/\u003cthreadId\u003e.json`.\n\n| Choice | Captured per thread |\n|---|---|\n| `off` | nothing |\n| `basic` (default when improvement loop is on) | threadId + title + tags + originatingSkill + lastUsed + turnCount |\n| `full` | above plus `goal / outcome / decision / note` summaries, `incidents` (issue/resolution/outcome), and `fallbackStrategy` (`auto-resume / ask / new`) |\n\nFlag: `--threads=off|basic|full`\n\nExternal transport: **none**. Bodies of prompts / responses are never stored — only short text you (or a Skill) explicitly write via `threads outcome | decision | note | incident`.\n\n### 5. subscription tiers (v0.4.1)\n\nYou declare which Claude + Codex subscription tier you actually have. The installer uses this **only** to constrain the model + reasoning choices in questions 6 below — it never sends anything to Anthropic/OpenAI to verify.\n\n| Side | Tiers offered (highest → lowest) | Default |\n|---|---|---|\n| Claude | `enterprise / team / max / pro / free` | `max` |\n| Codex  | `team / pro / plus / free`             | `pro` |\n\nFlags: `--subscription-claude=...` and `--subscription-codex=...`\n\nA declared tier that doesn't match reality is fine — but expect the **frequent-fallback analyzer rule** to fire as your primary tier exhausts quota repeatedly. Reconfigure to a lower tier (and lower primary reasoning) when that happens.\n\n### 6. model / reasoning — primary + fallback (v0.4.1)\n\nFor each side (Codex calls, Claude reviewer subagent) you pick:\n- **Primary**: the model + reasoning effort you want by default\n- **Fallback**: **locked** to the subscription tier's base — used automatically when Skill prose detects a `rate_limit_exceeded` / `quota` / `429` / `usage_limit_reached` error\n\n| Side | Primary flags |\n|---|---|\n| Codex | `--codex-model-primary=gpt-5.5 --codex-reasoning-primary=xhigh` |\n| Reviewer (Claude subagent) | `--reviewer-model-primary=opus --reviewer-reasoning-primary=xhigh` |\n\nReasoning vocabulary (shared by Claude `--effort` and Codex `model_reasoning_effort`): `low / medium / high / xhigh / max`.\n\nHow fallback fires:\n- **Codex side**: each `mcp__codex__codex` call's Skill prose ends with a \"if you see rate_limit/quota, retry **once** with the locked fallback\" block.\n- **Reviewer subagent route**: the primary `codex-reviewer` agent emits the sentinel line `CODEX_QUOTA_FALLBACK_NEEDED`; `/codex-review` Skill prose re-launches via `codex-reviewer-fallback`.\n- **`claude -p` direct callers** (cron / bench): pass `--fallback-model \u003cid\u003e` — Claude's CLI handles it natively without prose. Note: `--effort` only applies to primary; Claude doesn't have a `--fallback-effort` flag.\n\nYou can pass `--codex-model-fallback=...` etc. but values that don't match the matrix base are ignored with a warning — the lock is intentional. To \"unlock\" the fallback, raise your subscription tier (which raises the base).\n\nFallback events accrue in the usage log (`outcome=fallback`, `errorKind=quota|rate_limit|...`). `codex-on-claude analyze` surfaces a `ruleFrequentFallback` candidate at ≥5 events in the window — your hint to upgrade or de-tune the primary.\n\n### 7. usage-mode — Codex invocation policy (v0.5.0)\n\nHow aggressively Claude consults Codex. Pick one of four:\n\n| Mode | One-line definition | Default? | Behavior summary |\n|---|---|---|---|\n| `none` | Codex calls blocked at PreToolUse gate | — | α-only. Privacy / cost-sensitive use. The PreToolUse hook denies `mcp__codex__codex` with a structured reason. |\n| `synergy` | Follow v9 guidance (Quick-Ref 3-Q tree + R1–R6 recipes) | **default for new installs + silent migration default** | 90% sweet spot. Codex fires only when matched recipes apply (adversarial review, sweet-spot partial-fail, hard reasoning). |\n| `auto` | Heuristic signal detection + optional Tier 2 LLM probe | for power users | Tier 1: heuristic regex over prompt. Tier 2 (`--auto-tier2-llm-probe=on`): $0.01–0.02 Codex meta-classifier on ambiguous tasks. |\n| `max` | Quality-first bounded automation | for critical tasks | R1 default ON; R5 always probe; γ hot-swap auto on P5 catastrophe. **Hard DO-NOT rules still enforced** — max ≠ override. |\n\nFlags: `--usage-mode=none|synergy|auto|max` and (auto-only) `--auto-tier2-llm-probe=on|off`.\n\n**Hard guardrails (every mode)** — surfaced as `{{guardrail*}}` placeholders in every SKILL.md:\n- `chainJsonTrap`: hard-block (avoid Chain-JSON Trap, route to R6 Format-Safe Handoff)\n- `subagentStrict`: hard-block (don't ask subagent for strict-JSON when adversarial output is also needed)\n- `turnBurn`: 3-turn-stop (stop multi-turn followups at turn 3 unless new context arrived)\n- `ceilingNoUpside`: warn-and-skip (when α is ≈100% already, β can only equal it)\n\nDetailed spec + decision tree: [`docs/usage-mode-config.md`](docs/usage-mode-config.md). Quick decision card: [`docs/guidance-quick-ref.md`](docs/guidance-quick-ref.md). Recipe details: [`docs/synergy-playbook.md`](docs/synergy-playbook.md).\n\nExisting installs (pre-0.5.0) migrate silently to `synergy` on the next `npx --yes codex-on-claude@latest` — no prompt, no behavior change. Run `codex-on-claude reconfigure` if you want to surface the new question.\n\n#### v0.5.0 hardening (post-L6 review — 14 follow-up fixes)\n\nBeyond the headline 4-mode policy, v0.5.0 ships a comprehensive set of fixes from a Codex peer review pass + adversarial security review. User-visible changes:\n\n- **PreToolUse gate is race-free.** The hook command embeds `--enforce-mode=none` at install time so the gate decision never depends on a separate `config.json` read. Mode toggles do not race in-flight calls.\n- **Gate covers Bash CLI bypasses.** `mode=none` blocks not only `mcp__codex__codex` MCP calls but also any `Bash` invocation whose command contains `codex exec`, `codex-on-claude threads resume`, `npx ...codex...`, or path-qualified codex binaries. Wildcard regex (`mcp__codex__.*`) handles current + future MCP tool variants. Case-insensitive matching.\n- **Gate fails CLOSED for Codex-shaped tools on corrupt state.** When `config.json` is unreadable or the hook payload is malformed AND the tool looks Codex-shaped, the gate denies (was: fail-OPEN). Non-Codex tools still fail-open to avoid breaking unrelated calls.\n- **Gate state reconciles automatically.** Every reconfigure inspects `~/.claude/settings.json` itself (not just `config.json`'s claim) and removes orphan gate entries from manual edits / stale state.\n- **Hook decision JSON emitted in BOTH legacy and new shapes.** `{decision, reason}` + `hookSpecificOutput.permissionDecision` so the gate works against current Claude Code 2.1.x and any future version that drops legacy support. Hard-denies additionally `exit(2)` with stderr backstop.\n- **State directories are 0700.** Logs (which may capture prompt sizes / thread IDs) + thread catalog + improvements are now restricted to user-only access. Best-effort on filesystems that ignore chmod.\n- **Atomic config writes.** `config.json`, `settings.json`, and thread catalog files are written via `temp+rename`; concurrent readers (analyzer, status, gate) never see a torn JSON document.\n- **CLI parsing hardening.** `--usage-mode max` (space-separated, common mistake) now errors with a hint to use `--usage-mode=max`. Use `=` to assign flag values; positional args are rejected against an explicit known-subcommand allowlist.\n- **Silent-fill info line.** Upgrading users see `usageMode: silent default 'synergy' applied for upgrade` on auto-detected reconfigures (npx upgrade path). Explicit `reconfigure` still surfaces the §7 prompt.\n- **`auto` mode classifier sharper.** Tier 1 heuristics now handle: chain step variants (numbered / lettered / first-then-finally / inline \"Step N:\"); mixed adversarial+style prompts (\"find security bugs and naming issues\" keeps adversarial signal); unquoted field-list syntax (\"Return fields: status, risk, file, line\"); \"table\" only with output-intent context; plural forms (\"race conditions\", \"failing tests\"); excluded \"edge cases\" alone from TDD signal.\n- **Every log row carries `usageMode`.** `usage-*.jsonl` entries include the active mode at log time, powering the `ruleUsageModeDrift` analyzer rule.\n- **`ruleUsageModeDrift` filters by `config.updatedAt`.** Logs from before the last reconfigure no longer trigger false-positive drift right after a mode switch.\n\nFull details: [`docs/release-notes-0.5.0.md`](docs/release-notes-0.5.0.md) + [`docs/security-review-0.5.0.md`](docs/security-review-0.5.0.md).\n\n**Final verification (post 5 Codex peer review cycles)**: **168 automated test cases** (112 unit + 37 integration + 17 installer-flow + 2 regression with 7 internal cases) all PASS. npm tarball **115 kB / 32 files** (99% reduction from initial build via explicit `files` allowlist + `.npmignore`). **28 total fixes** applied pre-ship across 5 review cycles (F1-F8 → G1-G7 → H1-H7 → B1-B6 → H1-H4 Bash precision → M1-M3 → A1-A4 — see [CHANGELOG.md](CHANGELOG.md) for the full breakdown). Run all tests yourself: `bash install/fixtures/v05/installer-flow/run-flow.sh \u0026\u0026 node install/fixtures/v05/unit/run-units.mjs \u0026\u0026 ...`.\n\n---\n\n## Cerberus mode (v0.5.5, opt-in)\n\nWhen the planning step matters more than execution speed, enable Cerberus Head mode:\n\n```sh\ncodex-on-claude reconfigure --cerberus=on --yes\n# Then in a Claude Code session:\n/cerberus head \"\u003ctask description\u003e\"\n```\n\n**How it works.** `/cerberus head` spawns three independent planners in parallel — `cerberus-h1-claude-only` (no Codex), `cerberus-h2-codex-only` (delegates fully to Codex CLI), `cerberus-h3-synergy` (Claude + Codex consultation) — then merges their plans via a deterministic algorithm. Topics are extracted from each plan's markdown sections (Decision / Reasons / Risks / Steps), grouped by Jaccard similarity over Porter-stemmed tokens, and classified into one of four cases: case 1 (3-head merge), case 2 (3-head tournament with h3 tie-break + head-weight ladder), case 3 (2-head agreement + 1 missing), case 4 (single-head, validated / disputed / minority / missing buckets). Output is a single consensus plan plus an `agreement_score` (0–1) and `label` (low / moderate / high). **Cost**: ~2–3× a single planner (typically 20–50k tokens total). **Plan-only** — execute and verify phases ship later.\n\n**Hardening since v0.5.1.**\n\n| Version | Change | Why it matters |\n|---------|--------|----------------|\n| v0.5.1 | Initial 3-head + deterministic merge | PoC |\n| v0.5.2 | Porter Stemmer + nonce challenge + case-4 conservative partial credit | Paraphrase absorption (`deterministic`/`deterministically` → same stem) and orchestration-layer guard against fake plans (each head plan must end with `cerberus-nonce: \u003cvalue\u003e` matching init's nonce; consensus rejects mismatches) |\n| v0.5.3 | Polarity tracking (`detectPolarity` blocks `use cache` (+) merging with `do not use cache` (-)) + empty-token guard (no false-merge on Korean / single-char bullets) + minority dissent bucket + Porter `isV(s, -1)` base case fix | Closes 4 critical bugs surfaced by the n=2 self-review |\n| v0.5.4 | Disputed render fix — `*(disputed — opposing polarity)*` for case-4 polarity-split entries (no more `*(lost to ?)*`) | MEDIUM #2 surfaced in Opus 4.7 re-test: 6 user-facing `?` outputs across 4 runs |\n| v0.5.5 | Decision multiplier soft-curve (case-2 → 1.2×, was binary cliff to 1.0×) + contrast conjunction polarity (`but/however/although/despite/except` with `but also/even/too` false-positive guard) + `cerberusConfig` seed/reader (dead path in server fixed) + HU-33 plan-level lock | MEDIUM #1 + #3 + LOW + HU-33 from v0.5.4 plan Phase 2 |\n\n**Test coverage.** 102 cerberus-specific unit + integration tests (consensus 15 + install 13 + config 12 + stemming 14 + stemming-adversarial 7 + nonce 10 + score-formula 10 + v053-fixes 19 + e2e 4).\n\n**Per-machine tuning.** When `cerberus=on`, the installer seeds `choices.cerberusConfig: {}` in `~/.claude/codex-on-claude/config.json`. Override any of `headWeights / jaccardGroupThreshold / bodyMergeThreshold / decisionMultiplier / decisionPartialMultiplier / costCapTokens` to taste — the server reads these on every consensus call and merges with `DEFAULTS`. Empty seed keeps shipped defaults.\n\n**Known backlog (v0.5.6+).** Cerberus **Full mode** (FU-01~10 + FC-02~05, 14 PENDING-IMPL scenarios: 3-worktree execute + verify + iteration loop) — separate minor release. Embedding-based similarity to bridge Porter Stemmer paraphrase miss (opt-in LLM-judge) — v0.5.6+ candidate.\n\nDisable any time with `codex-on-claude reconfigure --cerberus=off --yes`. Spec: [`docs/cerberus-mode-spec.md`](docs/cerberus-mode-spec.md) (Draft 5). PoC walkthrough: [`docs/cerberus-poc-2026-05-22.md`](docs/cerberus-poc-2026-05-22.md). Re-test results: [`docs/test-execution-results-cerberus-v0.5.3-opus47.md`](docs/test-execution-results-cerberus-v0.5.3-opus47.md).\n\n---\n\n## Operational guidance (v0.5.0)\n\nThe v5–v9 analysis series in [`docs/`](docs/) captures **771 runs across 217 scenarios** (~\\$100–115 in measured external cost). Key references for everyday decisions:\n\n| Read this when | File |\n|---|---|\n| 30-second decision card for one task | [`docs/guidance-quick-ref.md`](docs/guidance-quick-ref.md) |\n| Detailed comparison (Claude α / Codex γ / β orchestration) | [`docs/guidance-comparison.md`](docs/guidance-comparison.md) |\n| Practical recipes (R1 Adversarial / R2 Self-review / R3 reasoning=high / R4 γ hot-swap / R5 cheap β trial / R6 Format-Safe Handoff) | [`docs/synergy-playbook.md`](docs/synergy-playbook.md) |\n| Mode configuration spec (what every mode does, per-skill activation matrix) | [`docs/usage-mode-config.md`](docs/usage-mode-config.md) |\n| Cross-domain validation results (post-v8 sweep) | [`docs/v8-validation-sweep.md`](docs/v8-validation-sweep.md) |\n| Why these recommendations changed in v9 | [`docs/v9-adjustment-spec.md`](docs/v9-adjustment-spec.md) |\n\n---\n\n## What you get\n\n### Skills under `~/.claude/skills/codex-*/SKILL.md`\n- `codex-review` — one-shot read-only review of the current diff / files\n- `codex-followup` — continue the same thread via `mcp__codex__codex-reply`\n- `codex-resume` — CLI fallback when MCP session is lost (auto-detects silent-new-session)\n- `codex-fix` — file edits under a strict allowlist (workspace-write)\n- `codex-routine` — templated recurring job\n- `codex-analyze` — analyze the usage log and surface improvement candidates *(improvementLoop ≠ off)*\n- `codex-improve` — apply / reject specific candidates *(improvementLoop ≠ off)*\n- `codex-log` — append usage log entries *(improvementLoop ≠ off; redundant when hooks handle it)*\n- `codex-threads` — browse / annotate / resume the persistent catalog *(threads ≠ off)*\n\n### Agent under `~/.claude/agents/codex-reviewer.md`\nIsolates large Codex responses and returns only a 4-line summary to the main session. Installed when `contextPolicy ∈ {summarize, mixed}`.\n\n### Plugin bundle\nDeprecated since 0.3.0. Existing 0.2.x bundles are left in place — uninstall does **not** remove them; a one-line `Manual: rm -rf …` hint prints.\n\n---\n\n## Operating flow\n\n```\n┌─────────────────────────────────────────────────────────────────┐\n│ Claude Code session                                             │\n│                                                                 │\n│   User invokes /codex-review (or natural language match)        │\n│            │                                                    │\n│            ▼                                                    │\n│   Skill follows its MUST procedure → mcp__codex__codex          │\n│            │                                                    │\n│            ▼                                                    │\n│   ┌─── MCP transport (codex mcp-server) ──────────────────┐    │\n│   │   → Codex model call                                   │    │\n│   │   ← threadId + content                                 │    │\n│   └────────────────────────────────────────────────────────┘    │\n│            │                                                    │\n│            ▼                                                    │\n│   Response lands in main (or in codex-reviewer subagent)        │\n│            │                                                    │\n│            ▼                                                    │\n│   Skill writes catalog entry + log line                         │\n│   (improvementLoop=auto-on-skill: PostToolUse hook does logging)│\n└─────────────────────────────────────────────────────────────────┘\n\nPeriodically — or on demand:\n\n   codex-on-claude analyze\n       │\n       ▼\n   reads logs/threads, ranks improvement candidates\n       │\n       ▼\n   /codex-improve walks the user through Apply / Reject / Skip\n       │\n       ▼\n   decisions persisted under ~/.claude/codex-on-claude/improvements/\n```\n\n---\n\n## CLI reference\n\n```\ncodex-on-claude               Install interactively (preflight included)\ncodex-on-claude reconfigure   Reconfigure (previous answers as defaults; review/edit/cancel at the end)\ncodex-on-claude doctor        Run preflight only (Node/codex/claude/MCP checks)\ncodex-on-claude status        Show install state\ncodex-on-claude uninstall     Remove installed components (Skills, Agent, hooks)\ncodex-on-claude analyze       Analyze usage log and surface improvement candidates\ncodex-on-claude suggest       Record an apply/reject decision for a candidate\ncodex-on-claude log           Append a usage log entry (use --from-stdin for hook payloads)\ncodex-on-claude threads ...   Persistent thread catalog (see below)\ncodex-on-claude help          Print help\n```\n\nCommon combinations:\n\n```sh\ncodex-on-claude status\n\ncodex-on-claude reconfigure --improvement-loop=auto-on-skill --yes   # turn on hook logging\ncodex-on-claude reconfigure --threads=full --yes                     # enable full thread context\n\ncodex-on-claude analyze --days=7 --format=markdown --save            # write a snapshot report\ncodex-on-claude suggest --apply=2                                    # adopt candidate #2\ncodex-on-claude suggest --reject=2 --reason=\"intentional pattern\"\n\ncodex-on-claude uninstall\n```\n\n---\n\n## `threads` subcommand reference (v0.2+)\n\n```sh\ncodex-on-claude threads list [--status=active|resolved|archived] [--tag=...] [--since=7d] [--skill=...]\ncodex-on-claude threads latest [--format=id|json]                       # deterministic latest thread (great for shell automation)\ncodex-on-claude threads show \u003cthreadId\u003e\ncodex-on-claude threads new  \u003cthreadId\u003e --title=\"...\" --tags=a,b --skill=codex-review --cwd=\"$PWD\" --sandbox=read-only\ncodex-on-claude threads goal     \u003cthreadId\u003e \"Verify naming consistency\"\ncodex-on-claude threads outcome  \u003cthreadId\u003e \"Codex flagged 3 issues\"\ncodex-on-claude threads decision \u003cthreadId\u003e \"Adopt suggestion 2\"\ncodex-on-claude threads note     \u003cthreadId\u003e \"arbitrary memo\"\ncodex-on-claude threads incident \u003cthreadId\u003e --issue=session-not-found --resolution=\"codex exec resume\" --outcome=recovered\ncodex-on-claude threads tag      \u003cthreadId\u003e --add=critical --remove=draft\ncodex-on-claude threads status   \u003cthreadId\u003e resolved\ncodex-on-claude threads fallback \u003cthreadId\u003e auto-resume\ncodex-on-claude threads search   \"react refactor\"\ncodex-on-claude threads resume   \u003cthreadId\u003e \"follow-up prompt\"\ncodex-on-claude threads remove   \u003cthreadId\u003e\n```\n\n`threads resume \u003cid\u003e \"prompt\"` reads the stored `fallbackStrategy`:\n\n- `auto-resume` — calls `codex exec resume` and **automatically detects** silent-new-session (when codex returns a different `thread_id`), filing an incident on the original thread and registering the new one as a bifurcation.\n- `ask` — prints the thread metadata + recent summaries so the user picks the next step.\n- `new` — guides starting a fresh `mcp__codex__codex` call with the prior summaries as a preamble.\n\n---\n\n## Verification\n\n### 1. Preflight + MCP\n```sh\ncodex-on-claude doctor\n```\nExpected: `Node.js …`, `Codex CLI …`, `Claude Code …`, `codex doctor: ok`, `Claude auth: Login method: …`, `codex MCP server: ✓ Connected`.\n\n### 2. End-to-end MCP call (a fresh `claude -p`)\n\n```sh\nclaude -p --model haiku --verbose --output-format stream-json \\\n  --permission-mode dontAsk \\\n  --allowedTools=mcp__codex__codex \\\n  \"Use the Codex MCP tool to ask Codex: Return exactly OK. Then return Codex's exact answer.\"\n```\n\n### 3. Skill in a real session\n\nStart a fresh Claude Code session and type:\n\n```\n/codex-review evaluate this README in two sentences.\n```\n\nCodex should answer; the response ends with a deterministic `Thread: \u003cid\u003e` line (per the Skill's MUST procedure).\n\n### 4. Analyze cycle\n\n```sh\ncodex-on-claude log --skill=codex-review --sandbox=read-only --outcome=ok \\\n  --prompt-chars=120 --response-chars=600 --elapsed-ms=2100\ncodex-on-claude analyze --days=1\n```\n\n---\n\n## Operational guidance\n\n- **Default sandbox is `read-only`.** Switch to `workspace-write` only when files genuinely need edits (`/codex-fix` enforces an allowlist). `danger-full-access` is not used in this workflow.\n- **Pass context explicitly.** Claude and Codex don't share hidden context — when you want Codex to know about a goal / file / constraint, name it in the prompt.\n- **threadId discipline.** The `threadId` itself is a public-safe identifier. Within the same MCP server process, prefer `/codex-followup`; on `Session not found`, switch to `/codex-resume` or `codex-on-claude threads resume`.\n- **401 from Claude.** If `claude auth status` looks fine but the model returns `401 Invalid authentication credentials`, re-run `claude auth login --claudeai --email \u003cyou\u003e`.\n- **Shrinking install.** Running `reconfigure` with fewer `--patterns` removes the unselected Skills automatically. Thread catalog data and improvement decisions are kept even when their Skills are removed.\n\n---\n\n## Privacy policy\n\n- All logs / catalog / improvement decisions stay **on this machine**. No outbound network calls beyond the Codex/Claude APIs themselves.\n- Log entries contain only metadata: timestamp, Skill name, sandbox mode, prompt/response **lengths**, threadId, outcome code.\n- Prompt / response **bodies are never written to disk.**\n- `~/.claude/codex-on-claude/` is created with restrictive permissions.\n- Opt out at any time: `codex-on-claude reconfigure --improvement-loop=off` (also removes hook entries) and `codex-on-claude reconfigure --threads=off`.\n- Full wipe: `codex-on-claude uninstall` or `rm -rf ~/.claude/codex-on-claude/`.\n\n---\n\n## Directory layout\n\n```\ncodex-on-claude/\n├── README.md            (this file)\n├── README.ko.md         Korean mirror\n├── CHANGELOG.md\n├── LICENSE\n├── package.json         npm metadata (bin: codex-on-claude)\n├── install/\n│   ├── install.mjs      main installer / reconfigurer / analyze CLI\n│   ├── analyze.mjs      usage log analyzer + improvement candidate rules\n│   ├── threads.mjs      persistent thread catalog CRUD\n│   ├── hooks.mjs        ~/.claude/settings.json PostToolUse merger\n│   ├── manifest.json    option ↔ component mapping\n│   └── components/\n│       ├── agents/codex-reviewer.md\n│       └── skills/codex-{review,followup,resume,fix,routine,analyze,improve,log,threads}/SKILL.md\n└── docs/\n    ├── implementation-log.md   English summary\n    ├── test-report-2026-05-20.md   English summary\n    └── ko/             Korean originals (history)\n```\n\nAfter install (on a user machine):\n\n```\n~/.claude/\n├── skills/codex-*/SKILL.md                       (per selected patterns + improvement loop + threads)\n├── agents/codex-reviewer.md                      (contextPolicy ∈ {summarize, mixed})\n├── settings.json                                 (PostToolUse hooks merged in when improvementLoop ∈ {auto-on-skill, periodic})\n└── codex-on-claude/\n    ├── config.json                               selections from last install/reconfigure\n    ├── logs/usage-YYYY-MM-DD.jsonl\n    ├── reports/                                  analyze --save outputs\n    ├── improvements/                             apply/reject decisions\n    └── threads/\u003cthreadId\u003e.json + index.json      persistent thread catalog\n```\n\n---\n\n## Troubleshooting\n\n### `codex-on-claude: command not found` right after `npm update -g` / `npm install -g`\n\nStale shell command hash — npm replaced the binary file on disk, but your current zsh/bash session is still caching the old path. Fix:\n\n```sh\nhash -r        # zsh / bash: clear cached command lookups\n# or: rehash   # zsh-specific\n# or open a new terminal\ncodex-on-claude --version\n```\n\n`codex-on-claude doctor` (v0.3.3+) auto-detects this case and prints the same hint.\n\n### `npx codex-on-claude` fails with `ENOENT … _npx/\u003chash\u003e/package.json`\n\nCorrupted npx per-package cache, usually from a previously interrupted run. Force a fresh download:\n\n```sh\nnpx --yes codex-on-claude@latest   # `@latest` cache-busts the per-package directory\n# still broken? nuke npx's cache entirely:\nrm -rf ~/.npm/_npx\n```\n\nThe canonical install command in this README (`npx --yes codex-on-claude@latest`) avoids this class of failure because `@latest` forces npx to revalidate the package version on every run.\n\n### `claude mcp get codex` reports \"Failed to connect\"\n\nOften just a sandboxed shell. Re-run in a normal terminal:\n\n```sh\nclaude mcp get codex\nclaude mcp list\n```\n\n### `Session not found for thread_id`\n\nThe MCP server restarted. Fallback:\n\n```sh\ncodex exec resume --skip-git-repo-check --json \u003cthreadId\u003e \"\u003cfollow-up prompt\u003e\"\n```\n\nOr, inside Claude Code: `/codex-resume`. The `codex-on-claude threads resume \u003cid\u003e \"...\"` subcommand wraps this and auto-detects the silent-new-session case (where Codex returns a different `thread_id` than requested).\n\n### `401 Invalid authentication credentials`\n\nEven when `claude auth status` looks healthy:\n\n```sh\nclaude auth login --claudeai --email \u003cyour-email\u003e\n```\n\n### Skill not detected in a new session\n\nRestart Claude Code. Skills are loaded once at session start; install / reconfigure / uninstall only affect future sessions (with the rare exception of hot-load when the harness happens to refresh).\n\n### Reconfigure removed something I wanted\n\nRe-run with the full set:\n\n```sh\ncodex-on-claude reconfigure\n# step through with arrow keys, the previous answers are pre-selected\n```\n\nThe final review screen lets you `Edit again` before saving.\n\n### `codex-on-claude threads latest --format=id` includes the CLI banner / ANSI escape codes\n\nFixed in v0.3.4 — the top-level `codex-on-claude vX.Y.Z` banner now goes to **stderr**, so stdout for `--format=id|json` data subcommands is clean:\n\n```sh\ncodex-on-claude threads latest --format=id            # stdout: bare UUID (or empty)\ncodex-on-claude threads latest --format=id 2\u003e/dev/null  # silences banner entirely\n```\n\nIf you're stuck on ≤ 0.3.3 and need to scrape stdout, strip ANSI + grep UUIDs:\n\n```sh\ncodex-on-claude threads latest --format=id \\\n  | sed 's/\\x1b\\[[0-9;]*m//g' \\\n  | grep -oE '[0-9a-f]{8}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{12}' \\\n  | head -1\n```\n\n### Hook keeps logging after I set `improvementLoop=off` in `config.json` (without reconfigure)\n\nFixed in v0.3.4 — the auto-on-skill hook (`codex-on-claude log --from-stdin`) now reads `~/.claude/codex-on-claude/config.json` on each call and silently no-ops when `improvementLoop` is `off` or `manual`. No `reconfigure` step required for the gate to take effect.\n\nIf your hook command path is stale (still points at the npm-cached binary from before the fix), reinstall + reconfigure to refresh:\n\n```sh\nnpm install -g codex-on-claude@latest\ncodex-on-claude reconfigure --yes\n```\n\nManual `/codex-log` invocations are unaffected by the guard — explicit logging always works.\n\n### Mixed-ownership PostToolUse group lost my user hook on uninstall (≤ 0.3.3)\n\nFixed in v0.3.4. If you manually nested a `_coc.marker` entry inside a group also containing your own hook, the previous code removed the whole group via `.some()` semantics. The new `stripOursFromGroups` operates at the hook level — your hooks are preserved, only marked entries are removed. The installer never produces mixed groups itself; this protects you only against manual `settings.json` edits.\n\n---\n\n## Contributing / license\n\n- Issues / PRs welcome: \u003chttps://github.com/pathcosmos/codex-on-claude\u003e\n- License: MIT\n\n---\n\n## History\n\n- [`CHANGELOG.md`](CHANGELOG.md) — release notes\n- [`docs/implementation-log.md`](docs/implementation-log.md) — English summary of the original 2026-05-19 build session ([Korean original](docs/ko/implementation-log-2026-05-19.md))\n- [`docs/test-report-2026-05-20.md`](docs/test-report-2026-05-20.md) — English summary of the 2026-05-20 verification run ([Korean original](docs/ko/test-report-2026-05-20.md))\n- [`docs/test-scenarios-codex-calls.md`](docs/test-scenarios-codex-calls.md) — 0.3.3-era comprehensive test scenario document (~43 scenarios × 8 groups covering MCP transport, 9 Skills, codex-reviewer Agent, 14 threads subcommands, hooks, analyzer rules, full E2E). Each scenario has Given/Setup/Command/Expected/Notes plus deterministic fixtures at `install/fixtures/analyze-rules/`.\n- [`docs/test-execution-results-2026-05-20.md`](docs/test-execution-results-2026-05-20.md) — execution log of the 0.3.4 cycle: Phase A (hooks, 10/10 PASS), Phase B (resume + SILENT_NEW_SESSION, 7/7 PASS after malformed-id correction), Codex-call ledger (27 calls across the session), bug discovery via meta-audit, synergy measurement (findings/$, findings/min, PRE/POST ratio).\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fpathcosmos%2Fcodex-on-claude","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fpathcosmos%2Fcodex-on-claude","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fpathcosmos%2Fcodex-on-claude/lists"}