https://github.com/hinanohart/harsh-critic-agent
Harsh-critic subagent definition for Claude Code. Opus-tuned prompt that interrogates its own output for gaps and hand-waving.
https://github.com/hinanohart/harsh-critic-agent
ai-tools claude-code code-review subagent
Last synced: about 1 month ago
JSON representation
Harsh-critic subagent definition for Claude Code. Opus-tuned prompt that interrogates its own output for gaps and hand-waving.
- Host: GitHub
- URL: https://github.com/hinanohart/harsh-critic-agent
- Owner: hinanohart
- License: apache-2.0
- Created: 2026-04-17T18:21:41.000Z (2 months ago)
- Default Branch: main
- Last Pushed: 2026-05-09T17:43:35.000Z (about 1 month ago)
- Last Synced: 2026-05-09T19:32:02.401Z (about 1 month ago)
- Topics: ai-tools, claude-code, code-review, subagent
- Language: Python
- Size: 75.2 KB
- Stars: 0
- Watchers: 0
- Forks: 0
- Open Issues: 5
-
Metadata Files:
- Readme: README.md
- Changelog: CHANGELOG.md
- Contributing: CONTRIBUTING.md
- License: LICENSE
- Code of conduct: CODE_OF_CONDUCT.md
- Codeowners: .github/CODEOWNERS
- Security: SECURITY.md
- Notice: NOTICE
Awesome Lists containing this project
README
# harsh-critic-agent
> A battle-hardened final-quality-gate subagent for Claude Code. Blocks false
> approvals with statistical sanity checks, a post-publication adversarial
> pass, and anti-pattern detection.
[](./LICENSE)
[](https://docs.claude.com/en/docs/claude-code/sub-agents)
[](./NOTICE)
---
## What this is
A single-file [Claude Code subagent](https://docs.claude.com/en/docs/claude-code/sub-agents)
that acts as the **final quality gate** for plans, code, and public-facing
releases. Drop it into `~/.claude/agents/` and invoke with
`@harsh-critic` (or via `Agent(subagent_type="harsh-critic", …)`).
It rejects false approvals cold by combining four passes most single-prompt
reviewers skip:
1. **Statistical Sanity** — sample size, confidence intervals, control
validity, multi-seed reproducibility, hardcoded thresholds.
2. **Post-Publication Adversarial Review** — README commands run cleanly,
figures regenerable from data, no dangling references, honest competitor
comparison.
3. **Anti-Pattern Detection** — frame-hopping, n=3 illusion, manufactured
certainty, sunk-cost protection, narrative-fit bias.
4. **Case-Study Anchors** — 6 real-world failures from published research
and OSS releases that the agent cites when current work matches the
pattern.
---
## What makes this different
`harsh-critic-agent` is a **derivative work** (MIT) of the excellent
[`critic.md`](https://github.com/Yeachan-Heo/oh-my-claudecode) in
Yeachan Heo's *oh-my-claudecode*. The upstream agent is already very good at:
- Pre-commitment predictions
- Multi-perspective review (security / new-hire / ops)
- Gap analysis
- Self-audit + realist check
This fork adds the missing half — **what hits you after publication**:
| Pass | Upstream `critic.md` | `harsh-critic-agent` |
|------|----------------------|----------------------|
| Role framing | ✅ | ✅ (retained) |
| Pre-commitment | ✅ | ✅ (retained) |
| Multi-perspective | ✅ | ✅ extended: reviewer / 48-hour-hindsight / downstream user |
| Gap analysis | ✅ | ✅ extended: "what is the author NOT talking about?" |
| Statistical sanity (n, CI, control, seeds) | — | ✅ Phase 6 (7 sub-checks) |
| Post-publication adversarial | — | ✅ Phase 7 (6 sub-checks) |
| Anti-pattern detection | — | ✅ Phase 8 (8 patterns) |
| Case-study anchors | — | ✅ 6 annotated real failures |
| Japanese translation | — | ✅ `agents/harsh-critic.ja.md` |
If your work is research, benchmarks, or a public OSS release, the four added
passes are where the reviewer will find your weak spots in the first 30
minutes. The agent pre-emptively simulates that hostile 30-minute read.
---
## Install
### Option 1 — user-level (global)
```bash
mkdir -p ~/.claude/agents
curl -L -o ~/.claude/agents/harsh-critic.md \
https://raw.githubusercontent.com/hinanohart/harsh-critic-agent/main/agents/harsh-critic.md
```
### Option 2 — project-level
```bash
mkdir -p .claude/agents
curl -L -o .claude/agents/harsh-critic.md \
https://raw.githubusercontent.com/hinanohart/harsh-critic-agent/main/agents/harsh-critic.md
```
### Option 3 — Japanese version
```bash
curl -L -o ~/.claude/agents/harsh-critic.md \
https://raw.githubusercontent.com/hinanohart/harsh-critic-agent/main/agents/harsh-critic.ja.md
```
Claude Code auto-discovers agents in these directories on the next start.
---
## Usage
### From a conversation
```text
@harsh-critic please audit PLAN.md before I execute it
```
### From another agent / code
```python
Agent(
description="Final gate on release plan",
subagent_type="harsh-critic",
prompt="Review RELEASE-v1.0.md. Public-facing — apply Phase 7."
)
```
### Recommended invocation points
- Before merging a PR that changes a migration, security code, or API
contract.
- Before tagging a public release (triggers Phase 7).
- Before submitting a paper, preprint, or benchmark to the public
(triggers Phase 6 + 7).
- When you notice yourself opening "just one more angle" on a decided
plan (triggers Phase 8 / frame-hopping check).
---
## Example output
```text
VERDICT: REVISE
Overall Assessment: Core approach is sound; two claims rest on insufficient
statistical evidence, one figure disagrees with its underlying JSON, and an
auto-tautology control inflates one headline number.
Pre-Commitment Predictions vs Actual Findings:
- Predicted: n-size issue in Table 2, stale README command, missing
ablation for depth parameter.
- Found: n-size issue confirmed (Table 2, n=3). README command works.
New surprise: layer-20 control is patched at layer 20 (auto-tautology,
CS-2 pattern).
Critical Findings:
1. Table 2 "70% accuracy" (paper.md:214) rests on n=3 trials.
- Confidence: HIGH
- Phase: Phase 6/S1 — coin-flip p=0.125 under null.
- Fix: re-run at n≥30 OR re-label claim as "preliminary".
2. Figure 3 delta -0.92 (main.tex:287) contradicts fig3.json:147 (-2.11).
- Confidence: HIGH
- Phase: Phase 7/P2 — figures must regenerate from raw data.
- Fix: replace prose number with symbolic reference to JSON.
…
```
See [`examples/example-review-output.md`](./examples/example-review-output.md)
for a full review.
---
## Design philosophy
- **False approvals cost 10-100× false rejections.** A courteous reviewer is
a liability.
- **Every finding needs evidence.** `file:line`, quoted text, or data path.
Opinions are not findings.
- **Statistical handwaving is a CRITICAL finding.** Not a stylistic preference.
- **Anti-patterns are the author's blind spots.** Name them explicitly; the
author cannot see them alone.
The full philosophy is embedded in the prompt itself — see
[`agents/harsh-critic.md`](./agents/harsh-critic.md).
---
## Documentation
- [`docs/CASE_STUDIES.md`](./docs/CASE_STUDIES.md) — six annotated real-world
failures referenced by the agent.
- [`docs/STATISTICAL_THINKING.md`](./docs/STATISTICAL_THINKING.md) — the
Phase-6 reasoning in plain language, with code snippets for Wilson CI,
bootstrap, permutation test, Fisher's exact.
- [`CHANGELOG.md`](./CHANGELOG.md)
- [`NOTICE`](./NOTICE) — upstream attribution
---
## Credit
This agent would not exist without the upstream
[`oh-my-claudecode`](https://github.com/Yeachan-Heo/oh-my-claudecode) by
**Yeachan Heo**. The role framing, investigation protocol skeleton, and
output-format structure are his. See [`NOTICE`](./NOTICE) for the exact
heritage and the list of substantive additions.
If you like this, star the upstream project too.
---
## Contributing
See [`CONTRIBUTING.md`](./CONTRIBUTING.md). Pull requests that add case-study
anchors from real failures (with public post-mortems) are especially welcome.
## Security
See [`SECURITY.md`](./SECURITY.md). The agent is a Markdown prompt file; it
has no network access and no code execution surface of its own.
Note that the `disallowedTools: Write, Edit` line in the frontmatter is
**prompt-enforced, not harness-enforced** — the LLM chooses to comply.
For destructive-command protection at the harness level, install
[`claude-safety-guard`](https://pypi.org/project/claude-safety-guard/) as
a PreToolUse hook alongside this agent.
## License
MIT — see [`LICENSE`](./LICENSE) and [`NOTICE`](./NOTICE).