https://github.com/humanbased-ai/crosscheck
AI code review pipeline that turns agent-written PRs into merge-ready patches
https://github.com/humanbased-ai/crosscheck
agentic-coding ai-agents ai-code-review claude-code cli code-quality codex devtools github-webhooks pull-request
Last synced: 1 day ago
JSON representation
AI code review pipeline that turns agent-written PRs into merge-ready patches
- Host: GitHub
- URL: https://github.com/humanbased-ai/crosscheck
- Owner: humanbased-ai
- License: mit
- Created: 2026-05-07T01:26:44.000Z (about 2 months ago)
- Default Branch: main
- Last Pushed: 2026-06-23T08:24:53.000Z (4 days ago)
- Last Synced: 2026-06-23T10:17:57.625Z (4 days ago)
- Topics: agentic-coding, ai-agents, ai-code-review, claude-code, cli, code-quality, codex, devtools, github-webhooks, pull-request
- Language: TypeScript
- Homepage: https://github.com/humanbased-ai/crosscheck
- Size: 3.79 MB
- Stars: 4
- Watchers: 0
- Forks: 3
- Open Issues: 17
-
Metadata Files:
- Readme: README.md
- Changelog: CHANGELOG.md
- License: LICENSE
- Codeowners: .github/CODEOWNERS
- Agents: AGENTS.md
Awesome Lists containing this project
README
A Humanbased project, built with crosscheck.
# crosscheck
**Your agents ship fast. Crosscheck makes sure they ship right.**
AI coding agents create PRs faster than review habits can absorb. The failure mode isn't broken builds β it's *early victory*: patches that pass CI, look complete, and still hide regressions, brittle edge cases, or half-finished fixes.
Crosscheck adds an independent Review β Fix β Recheck loop. One agent writes the patch. Another reviews it. Findings go back to the author to repair. The result gets rechecked before merge. The PR moves toward genuinely merge-ready β not just "looks green."
No new hosted service. No per-review API bill. Crosscheck runs through the `claude` and `codex` CLIs you already have β your existing subscriptions, your machine or server.
Built by [Humanbased](https://github.com/humanbased-ai). Read the field report: [What 295 Agentic PRs Taught Us About Code Review](https://blog.humanbased.ai/posts/agentic-pr-quality-crosscheck/) β 295 agentic PRs analyzed, real Crosscheck logs included.
## Why crosscheck?
**Agent velocity without lowering the merge bar.**
- **Independent eyes, not self-review** β route Claude-authored PRs to Codex and vice versa. Self-review is exactly where early-victory failures hide.
- **Review β Fix β Recheck, not just comments** β findings return to the author agent for repair; a clean recheck follows before merge. PRs move forward, not sideways.
- **No new vendor** β runs through the `claude` and `codex` CLIs you already pay for. No per-review bill, no extra trust surface.
- **Configurable for any team size** β review-only mode, review + fix, or the full loop. Personal laptop via `watch`, always-on server via `serve`, one-shot via `review`.
## Who uses crosscheck
| Persona | Problem | How crosscheck helps |
|---|---|---|
| **Solo agentic builder** | Same agent that wrote the code may self-approve incomplete work | Independent reviewer from a different vendor, on your machine |
| **Technical founder** | AI PRs look done before delivering stable value | Closes the loop: review finding β agent fix β clean recheck |
| **Engineering lead** | Agent use is hard to supervise or standardize | Configurable workflow, review-only mode, and a visible PR audit trail |
| **OSS maintainer** | Review bandwidth is scarce; comments must be actionable | One-shot `crosscheck review` posts concrete findings directly on the PR |
### Common workflows
```bash
# Catch regressions before merging a solo PR
crosscheck run
# Continuous review on every incoming agent PR
crosscheck serve # always-on team server
# Review stale PRs from a queue
crosscheck scan && crosscheck kickass
# One-shot review of a specific PR
crosscheck review
# Loop until the agent produces an approved patch
crosscheck run --crazy
```
---
## Quick start
### First useful review in 10 minutes
Start with one low-risk PR before turning on continuous watch mode. You only need GitHub CLI plus one authenticated reviewer CLI.
```bash
# 1. Install crosscheck
npm install -g @humanbased/crosscheck
# 2. Authenticate GitHub
brew install gh && gh auth login
# 3. Authenticate one reviewer
npm install -g @openai/codex && codex login --device-auth
# or:
npm install -g @anthropic-ai/claude-code && claude
# 4. Check your setup
crosscheck status
# 5. Review the public fixture PR
crosscheck review https://github.com/humanbased-ai/crosscheck-proof-fixture/pull/1 --reviewer codex
```
This fixture PR intentionally contains a realistic agentic-code regression, so you can see whether Crosscheck produces a useful review before pointing it at your own repo. Use `--reviewer claude` if Claude Code is the authenticated reviewer. After the fixture review works, swap in one low-risk PR from your repo, then run `crosscheck onboard` to configure repos, workflow mode, and continuous monitoring.
### Continuous mode
```bash
# 1. Install crosscheck and the agent CLIs
npm install -g @humanbased/crosscheck
npm install -g @anthropic-ai/claude-code && claude # Claude Pro/Max subscription
npm install -g @openai/codex && codex login --device-auth # ChatGPT Plus/Pro subscription
brew install gh && gh auth login # GitHub CLI
# 2. Guided setup β repos, review mode, workflow pipeline
crosscheck onboard
# 3. Start watching
crosscheck watch # personal laptop
crosscheck serve # always-on team server
```
---
## Commands
```bash
crosscheck onboard # guided setup β pick repos, mode, and pipeline
crosscheck watch # personal use β tunnel + webhook + listening on your laptop
crosscheck serve # team use β fixed port, register webhook once
crosscheck review # review one or more PRs (comma lists, ranges, cross-repo)
crosscheck run # run the full workflow: review β (fix β recheck) Γ max_rounds
crosscheck recheck|fix|resolve # force one workflow step on one or more PRs
crosscheck scan # show open PR workflow state across monitored repos
crosscheck detect-step # explain the next workflow step for one PR
crosscheck kickass # advance stale PRs from an interactive operator queue
crosscheck init # check prerequisites, write starter config
crosscheck status # auth state, config summary, CLI versions
```
**Operator queue (scan + kickass)**
`crosscheck scan` tracks two independent dimensions per PR:
| Workflow stage (`reviewState`) | Meaning | Next action |
|---|---|---|
| `NEEDS_REVIEW` | No crosscheck review for current HEAD | review |
| `NEEDS_FIX` | Reviewed β fix requested | fix |
| `NEEDS_RECHECK` | Fix committed, recheck pending | recheck |
| `APPROVED` | Reviewed and approved | merge |
| Verdict (`verdict`) | Meaning |
|---|---|
| `UNREVIEWED` | No review found |
| `APPROVE` | AI approved |
| `NEEDS_WORK` | AI requested changes |
| `BLOCK` | AI hard-blocked merge |
`BLOCK` and `NEEDS_WORK` both map to `NEEDS_FIX` stage β same next action, but the `verdict` field preserves severity so operators can prioritise.
**How workflow steps are counted**
Crosscheck reconstructs PR workflow state from visible artifacts:
| Evidence | Counts as |
|---|---|
| Review or recheck comment with `` | completed `review` / `recheck` step |
| Fix or conflict-resolve comment, such as `` | completed `fix` / `conflict-resolve` step |
| PR commit trailer, such as `Crosscheck-Step: fix` | completed step declared by that trailer |
Commit trailers are accepted as operator-declared workflow state. In practice, a PR author may command Claude, Codex, or another agent to apply a fix outside a standalone Crosscheck post; if the resulting PR commit carries `Crosscheck-Step: fix`, Crosscheck counts it as fix evidence.
That evidence only advances the next step to `recheck` when the fix commit is the current PR HEAD. If another commit lands after the fix evidence, Crosscheck starts a fresh review round so the newer code is reviewed normally. This prevents an old fix trailer from marking later changes as ready for recheck.
```bash
crosscheck scan [--tidy] [--stale-after ] [--force] [--json]
crosscheck kickass [--dry-run] [--stale-after ] [--force]
```
`crosscheck review --reviewer`, `crosscheck run --reviewer`, `crosscheck run --fixer`, and `crosscheck run --vendor` accept vendor aliases:
- Claude: `claude`, `claude-code`, `cc`, `anthropic`
- Codex: `codex`, `openai`
**Continuous improvement** *(experimental)*
```bash
crosscheck diagnose # surface failure patterns from review logs
crosscheck optimize [--apply] # rewrite reviewer instructions based on diagnose output
crosscheck impact [--money] # time saved, issues caught, code quality trends
crosscheck issue # draft and file a bug report from recent error logs
```
---
### `crosscheck onboard`
Interactive setup wizard. Picks repos/orgs to monitor, selects single-vendor or cross-vendor mode, configures the review pipeline, and writes `~/.crosscheck/config.yml` and `workflow.yml`.
```bash
crosscheck onboard # guided setup
crosscheck onboard --personal # skip persona prompt, go straight to personal mode
crosscheck onboard --team # skip persona prompt, go straight to team mode
crosscheck onboard -y # accept all defaults non-interactively
```
---
### `crosscheck watch`
Personal mode. Starts an SSH tunnel (localhost.run), registers GitHub webhooks, and listens for PR events. Everything self-cleans on Ctrl+C.
```bash
crosscheck watch
crosscheck watch --no-backtrace # skip startup scan for unreviewed open PRs
crosscheck watch --reconfigure # re-run deployment setup before starting
```
---
### `crosscheck serve`
Team mode. Binds to a fixed port β register the webhook once, cover the whole team. Designed for a mac-mini or home server.
```bash
crosscheck serve
crosscheck serve --no-backtrace # skip startup scan
crosscheck serve --personal # personal scope this session only
crosscheck serve --reconfigure # re-run deployment setup
```
---
### `crosscheck review `
Reviews one or more PRs. Clones, checks out, reviews, and posts the comment. The PR argument accepts the [multi-PR spec syntax](#multi-pr-syntax) β multiple PRs are reviewed concurrently.
```bash
crosscheck review https://github.com/org/repo/pull/42
crosscheck review --reviewer claude # force Claude regardless of detection
crosscheck review --reviewer codex # force Codex regardless of detection
crosscheck review --reviewer cc # alias for Claude
crosscheck review --reviewer openai # alias for Codex
crosscheck review .../pull/245,255 # review several PRs at once
crosscheck review .../pull/245-256 # review an inclusive range
```
---
### `crosscheck run `
Runs the full configured workflow against one or more PRs: review β (fix β recheck) Γ `max_rounds`. Loops autonomously through fixβrecheck cycles up to the `max_rounds` value configured in `workflow.yml` (default: 1). Use `--crazy` or `--half-crazy` to loop until approved or unblocked, ignoring `max_rounds`. The PR argument accepts the [multi-PR spec syntax](#multi-pr-syntax); multiple PRs run concurrently (one agent per PR by default).
```bash
crosscheck run
crosscheck run --steps review # only the review step
crosscheck run --steps fix,recheck # skip initial review
crosscheck run --reviewer claude # force review/recheck agent
crosscheck run --fixer claude # force fix agent
crosscheck run --vendor claude # force review/recheck/fix agent
crosscheck run --dry-run # review without posting or fixing
crosscheck run --crazy # π₯π₯ loop until APPROVE
crosscheck run --half-crazy # π₯ loop until not BLOCK
crosscheck run --timeout 10m # custom reviewer timeout
crosscheck run .../pull/245,255 # several PRs, concurrently
crosscheck run .../pull/245-256 --concurrent 3 # range, max 3 agents in parallel
```
---
### `crosscheck recheck` / `fix` / `resolve `
Force a single workflow step against one or more PRs, bypassing next-step auto-detection. Each is sugar for `crosscheck run --steps ` and accepts the same [multi-PR spec syntax](#multi-pr-syntax) and `--concurrent` / `--sequential` / `--stagger` flags as `run`.
| Command | Forces the step |
|---|---|
| `crosscheck recheck ` | `recheck` β re-evaluate against the latest review |
| `crosscheck fix ` | `fix` β apply fixes for the latest review |
| `crosscheck resolve ` | `conflict-resolve` β resolve merge conflicts (Claude only) |
```bash
crosscheck recheck https://github.com/org/repo/pull/42
crosscheck fix .../pull/245,255 --fixer claude
crosscheck resolve .../pull/245-256 --vendor claude
```
When the step is absent from the active `workflow.yml`, `recheck` and `conflict-resolve` are synthesized with built-in defaults so the command still runs.
---
### Multi-PR syntax
`run`, `review`, `recheck`, `fix`, and `resolve` accept a single PR URL or a **spec** that expands to many PRs β a comma-separated list of full URLs, bare numbers, and `N-M` ranges:
```bash
.../pull/245,255 # two PRs in the same repo
.../pull/245-256 # an inclusive range
.../repo/pull/245,https://github.com/o/other/pull/3 # across repos
```
The first token must be a full URL (a bare number inherits the most recent repo). Duplicates are de-duplicated and a spec expands to at most 100 PRs. Multiple PRs run concurrently by default β control parallelism with `--concurrent `, `--sequential`, or `--stagger `.
---
### `crosscheck scan`
Scans every open PR in the configured monitor scope and reports where each one is in the crosscheck workflow. Results are cached for 60 seconds.
States: `NEEDS_REVIEW` Β· `NEEDS_FIX` Β· `BLOCK` Β· `NEEDS_RECHECK` Β· `APPROVE`
```bash
crosscheck scan # all open PRs, grouped stale/not-stale
crosscheck scan --tidy # stale actionable rows only
crosscheck scan --stale-after 4h # custom staleness threshold (default 24h)
crosscheck scan --force # bypass cache
crosscheck scan --json # machine-readable output
```
---
### `crosscheck detect-step`
Explains the workflow history for one PR and prints the next step Crosscheck would run. Use this when a PR has mixed evidence from comments, Crosscheck commits, or ad hoc agent commits with `Crosscheck-Step` trailers.
```bash
crosscheck detect-step
crosscheck detect-step --json
```
---
### `crosscheck kickass`
Selects stale PRs from the operator queue and advances them β runs `scan` first, presents a multi-select picker, shows a preflight summary, then executes after confirmation.
```bash
crosscheck kickass # interactive operator queue
crosscheck kickass --dry-run # preflight only β no mutations
crosscheck kickass --stale-after 2h # tighter staleness threshold
crosscheck kickass --force # bypass scan cache before picking
crosscheck kickass --crazy # π₯π₯ auto loop until APPROVE
crosscheck kickass --half-crazy # π₯ auto loop until not BLOCK
```
Actions: `NEEDS_REVIEW β CR` Β· `NEEDS_FIX/BLOCK β Fix` Β· `NEEDS_RECHECK β Recheck` Β· `APPROVE β Merge`
**`kickass` + `watch` combo**
For the best recovery experience when a batch of PRs is stuck (timed out, stopped before `watch` was running), run both commands together. Each plays a distinct role:
- `kickass` kicks each stuck PR **one step at a time** β it uses `detect-step` to read live PR history and dispatches only the next needed step (review, fix, or recheck).
- `watch` owns **all continuation** β it listens for the webhooks each completed step produces and runs the full remaining pipeline from there.
```
crosscheck kickass
ββ ck run --trigger kickass (one step; detect-step finds where to start)
ββ detect-step β "review" run review only β posts comment
ββ detect-step β "fix" run fix only β pushes commit
ββ detect-step β "recheck" run recheck only β posts verdict
crosscheck watch
ββ issue_comment (type=review) β pick up fix step automatically
ββ synchronize (fix commit) β pick up recheck step automatically
```
> **Note:** `crosscheck run ` invoked directly runs the **full remaining pipeline** from the detected starting step. The one-step behaviour above applies only when kickass dispatches it with `--trigger kickass`.
Start `watch` first, then run `kickass` in a second terminal:
```bash
# terminal 1
crosscheck watch
# terminal 2
crosscheck scan --force # refresh PR state
crosscheck kickass
```
> **How the reviewβfix bridge works:** after `kickass` posts a review comment, GitHub fires an `issue_comment` webhook (not a `pull_request` event). `watch` subscribes to `issue_comment` and, when it sees a crosscheck `type=review` annotation on an open PR, fetches the current PR head and runs the fix step automatically β no new commit required to wake it up. (Introduced in [#193](https://github.com/Motivation-Labs/crosscheck/pull/193).)
**Autonomous loop modes**
`--crazy` and `--half-crazy` turn `run` and `kickass` into autonomous fixβrecheck loops that keep going until the verdict improves β no manual re-runs needed.
| Flag | Stops when | Max rounds | Timeout |
|---|---|---|---|
| `--crazy` π₯π₯ | verdict = `APPROVE` | β | none |
| `--half-crazy` π₯ | verdict β `BLOCK` | β | none |
Both flags disable all reviewer subprocess timeout constraints β long fixes on large PRs won't be cut short. Use `--timeout ` (e.g. `--timeout 10m`) without these flags to set a custom cap.
```bash
# Run full workflow and keep looping until approved
crosscheck run --crazy
# Advance every stale PR until it's no longer blocked
crosscheck kickass --half-crazy
# Custom timeout without looping
crosscheck run --timeout 10m
```
---
## Configuration
Crosscheck uses `~/.crosscheck/config.yml` by default. If that file exists, it wins over `./crosscheck.config.yml` unless you pass `--config ./crosscheck.config.yml`.
### Review depth (`quality.tier`)
```yaml
# crosscheck.config.yml
quality:
tier: balanced # fast | balanced | thorough
```
| Tier | Claude model | Codex model | Latency |
|---|---|---|---|
| `fast` | Haiku | default | ~10s |
| `balanced` | Sonnet (default) | default | ~30s |
| `thorough` | Opus | default | ~60s |
### Pipeline (`workflow.yml`)
```yaml
steps:
- name: review
type: review
reviewer: auto # auto | claude | codex | origin
- name: fix
type: fix
reviewer: origin
when: review.verdict != 'APPROVE'
- name: recheck
type: recheck
reviewer: auto
when: fix.applied_count > 0
```
### Config snapshot
```yaml
# ~/.crosscheck/config.yml
orgs:
- your-org
routing:
allowed_authors:
- your-github-login
mode: cross-vendor # cross-vendor | single-vendor
vendors:
claude:
enabled: true
codex:
enabled: true
quality:
tier: balanced
clone_protocol: ssh # ssh (default) | https
```
Full reference: [get-started.md](./get-started.md)
---
## Requirements
| | Minimum |
|---|---|
| Node.js | 18+ |
| Claude Code CLI | latest β `npm install -g @anthropic-ai/claude-code` |
| Codex CLI | latest β `npm install -g @openai/codex` |
| GitHub CLI | 2.65+ β `brew install gh` |
`GITHUB_TOKEN` is derived automatically from `gh auth login`. No manual export needed.
---
## Documentation
| | |
|---|---|
| **[get-started.md](./get-started.md)** | Full setup guide β prerequisites, all flags, complete config reference, FAQ |
| **[What 295 Agentic PRs Taught Us About Code Review](https://blog.humanbased.ai/posts/agentic-pr-quality-crosscheck/)** | Humanbased field report on agentic PR quality, review routing, and why Crosscheck exists |
| **[docs/fixture-pr.md](./docs/fixture-pr.md)** | Safe public fixture PR for the first Crosscheck review |
| **[crosscheck.config.example.yml](./crosscheck.config.example.yml)** | Annotated config with every option |
| **[CHANGELOG.md](./CHANGELOG.md)** | Release notes |
---
## Contributing
Issues and PRs welcome at [github.com/humanbased-ai/crosscheck](https://github.com/humanbased-ai/crosscheck).
---
## License
[MIT](./LICENSE) β Copyright (c) 2025β2026 Humanbased PTE LTD.