https://github.com/f4rkh4d/forge-skill
Skills with teeth. Discipline across every domain an AI agent touches, with a verifier under every skill.
- Host: GitHub
- URL: https://github.com/f4rkh4d/forge-skill
- Owner: f4rkh4d
- License: MIT
- Created: 2026-05-22T11:36:37.000Z (25 days ago)
- Default Branch: main
- Last Pushed: 2026-05-22T16:52:30.000Z (24 days ago)
- Last Synced: 2026-05-22T17:44:57.599Z (24 days ago)
- Topics: agent-design, agent-skills, ai, ai-agents, anthropic, anti-slop, claude, claude-code, codex, coding, cursor, frontend, linter, lowcode, nocode, skill, skill-pack, skills, verifier, vibecoding
- Language: Shell
- Homepage: https://f4rkh4d.github.io/forge-skill/
- Size: 2.35 MB
- Stars: 0
- Watchers: 0
- Forks: 0
- Open Issues: 0
Metadata Files:
- Readme: README.md
- Changelog: CHANGELOG.md
- License: LICENSE
# Forge Skill
Skills with teeth. 51 opinionated rules for AI coding agents, 33 with a verifier that fails when ignored.
−72% verifier-visible slop on Sonnet and −58% on Haiku, measured over 20 adversarial prompts × 3 runs per arm. Per 1000 LOC: −85% on Sonnet, −71% on Haiku. Both arms use the same model; only the skills change.
Live playground · Install · Skills · Benchmarks · Worked example
## What it does
You give Claude / Cursor / Codex a prompt. It writes code. Most "skill packs" stop there: they are a long markdown file the agent is told to read. **Forge ships executable verifiers** that run on the output and fail when the agent ignores a rule. The same verifiers run in four places:
| Entry point | When it fires |
|-|-|
| [`hooks/`](hooks/README.md) - Claude Code post-edit hook | Every Edit/Write/MultiEdit, blocking with feedback the model sees |
| [`vscode-extension/`](vscode-extension/README.md) - VS Code & Cursor | On save, inline `Diagnostic` warnings with line/column |
| [`mcp-server/`](mcp-server/README.md) - MCP server | Any agent that speaks MCP - call `verify_snippet` / `verify_file` on demand |
| [`.github/actions/forge-verify/`](.github/actions/forge-verify/README.md) - GitHub Action | Every PR; sticky comment, optional merge gate |
51 skills across 14 domains. 33 ship a verifier (shell, AST, or both). The remaining 18 are style registers (brutalist / minimalist / soft / redesign), image-direction (brandkit / imagegen-*), and methodology / orchestration (rag / evals / citation / research / agent-*) - guidance by nature, not mechanically checkable, marked as such.
## Quick proof
The full numbers are in [`BENCHMARKS.md`](BENCHMARKS.md). The corpus is 35 prompts (20 adversarial + 15 neutral) × 3 runs per arm, with the same model in both arms:
| Sonnet 4.6 | Baseline violations | Forge violations | Δ | Violations / 1000 LOC (baseline → Forge) |
|-|-:|-:|-:|-|
| **Adversarial** | 115 | **32** | **−72.2%** | 19.87 → 2.98 (**−85%**) |
| **Neutral** | 54 | **9** | **−83.3%** | 9.57 → 1.05 (**−89%**) |
| **Combined** | 169 | **41** | **−75.7%** | 14.79 → 2.12 (**−86%**) |
| Haiku 4.5 | Baseline violations | Forge violations | Δ | Violations / 1000 LOC (baseline → Forge) |
|-|-:|-:|-:|-|
| **Adversarial** | 127 | **54** | **−57.5%** | 28.53 → 8.33 (**−71%**) |
Per skill on Sonnet: `forge-api-design` 20→1 (−95%), `forge-error-handling` 61→20 (−67%), and six skills dropped to zero violations (kubernetes, migrations, logging, frontend, github-actions, prompt-engineering).
Reproduce: `cd benchmarks && npm install && BENCH_N_RUNS=3 BENCH_CORPUS=adv npm run all`. No API key is needed; the run uses your local `claude` CLI.
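The Δ and per-1000-LOC percentages in the tables are plain relative changes, (Forge − baseline) / baseline. As a quick sanity check, the sketch below recomputes three of them from the table values. The helper name `pct_change` is made up for illustration and is not part of the benchmark code.

```bash
#!/usr/bin/env bash
# pct_change BASELINE FORGE
# Prints the relative change from BASELINE to FORGE as a percentage
# with one decimal place. Illustrative helper only; it is not part of
# the repo's benchmarks/ directory.
pct_change() {
  awk -v base="$1" -v forge="$2" \
    'BEGIN { printf "%.1f\n", (forge - base) / base * 100 }'
}

pct_change 115 32       # Sonnet, adversarial, violation counts
pct_change 127 54       # Haiku, adversarial, violation counts
pct_change 19.87 2.98   # Sonnet, adversarial, violations per 1000 LOC
```

With the table's numbers this prints `-72.2`, `-57.5` and `-85.0`, which round to the figures quoted above.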
## Install
Drop any `SKILL.md` files into your project; the Claude Code post-edit hook auto-discovers them. To install the hook:
```bash
git clone https://github.com/f4rkh4d/forge-skill
cd forge-skill
./hooks/install.sh # one-shot install at the user level
./hooks/install.sh --project # or per-project (this repo only)
```
After install, every file Claude Code edits is checked against the applicable forge verifiers automatically. If a verifier flags a violation, the hook exits with code 2 and Claude sees the violation text on its next turn, so it can fix the problem without you in the loop.
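The exit-code behaviour described above can be sketched roughly as follows. This is not the code in `hooks/`: the function name `forge_hook` and its argument order are hypothetical, and the sketch assumes that each verifier takes the edited file path as its first argument and that feedback for the model is written to stderr.

```bash
#!/usr/bin/env bash
# forge_hook EDITED_FILE VERIFIER...
# Hypothetical wrapper: runs every verifier against the edited file,
# relays the output of failing verifiers on stderr, and returns 2 if
# any of them failed (0 otherwise).
forge_hook() {
  local file="$1"
  shift
  local verifier output failed=0

  for verifier in "$@"; do
    # A non-zero exit from a verifier means at least one rule was ignored.
    if ! output="$(bash "$verifier" "$file" 2>&1)"; then
      failed=1
      # Assumption: stderr is the channel the agent gets to read.
      printf '%s\n' "$output" >&2
    fi
  done

  if [ "$failed" -ne 0 ]; then
    return 2
  fi
  return 0
}
```

Running every verifier before returning, rather than stopping at the first failure, means the model would see all violations for the file in a single turn.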
For other agents, see the [MCP server](mcp-server/README.md), [GitHub Action](.github/actions/forge-verify/README.md), or [VS Code extension](vscode-extension/README.md).
## Skills
51 skills across 14 domains. Each is a folder with a `SKILL.md` (the rules for the model to read) and optionally `verify/check_*.sh` (the script that runs on the output and fails when the rules were ignored).
| Domain | Skills | Verified |
|-|-:|-:|
| [Design](skills/design/) | 9 | 4 |
| [Backend](skills/backend/) | 9 | 7 |
| [Data](skills/data/) | 2 | 2 |
| [Infra](skills/infra/) | 5 | 5 |
| [Security](skills/security/) | 1 | 1 |
| [Testing](skills/testing/) | 1 | 1 |
| [Output](skills/output/) | 1 | 1 |
| [Docs](skills/docs/) | 1 | 1 |
| [MCP](skills/mcp/) | 3 | 2 |
| [Multi-agent](skills/agents/) | 3 | 0 |
| [LLM apps](skills/llm/) | 5 | 4 |
| [Dev workflow](skills/dx/) | 5 | 3 |
| [Image direction](skills/imagegen/) | 3 | 0 |
| [Research](skills/research/) | 2 | 1 |
Browse [`skills/`](skills/) for the full list. **AST-grade verifiers** (real TypeScript AST traversal, not regex) cover the top 8: `forge-frontend`, `forge-typescript`, `forge-api-design`, `forge-error-handling`, `forge-validation`, `forge-react-hooks`, `forge-tests`, `forge-naming`.
## How the verifiers work
```
skills///
├── SKILL.md # rules + BAD/GOOD examples (the model reads this)
└── verify/
    └── check_*.sh # shell script, exits non-zero with VIOLATION lines
```
A verifier returns exit code 0 on clean output, and non-zero with a list of violations otherwise. Eight skills delegate to [`verify/lib/ts-ast.mjs`](verify/lib/ts-ast.mjs), which parses the actual TypeScript AST. For example, card-in-card nesting is detected even when the inner card is extracted to a variable; `c.req.json()` is flagged when its result is consumed without `.parse()` on the same flow; and hooks are flagged when called inside an `if`, loop, ternary, or `&&`, or after an early return.
Run `npm install` at the repo root to enable AST mode; without Node, the verifiers fall back to grep heuristics. The verifiers themselves are covered by a [test corpus](tests/README.md) of 22 fixtures that runs in CI, so regressions in verifier logic don't ship silently.
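As a rough illustration of that exit-code contract, here is a minimal grep-based check. The rule (no explicit `any` annotations), the function name, and the exact layout of the `VIOLATION` lines are all assumptions made for this sketch; the real scripts under `verify/` may look different.

```bash
#!/usr/bin/env bash
# check_no_explicit_any FILE
# Hypothetical grep-mode verifier: prints one VIOLATION line per match
# and returns 1, or prints nothing and returns 0 when the file is clean.
check_no_explicit_any() {
  local file="$1" hits hit

  # -n prefixes each match with its line number. The trailing group
  # keeps identifiers such as `anything` from matching. `|| true` stops
  # a no-match result from aborting callers that use `set -e`.
  hits="$(grep -nE ':[[:space:]]*any([^[:alnum:]_]|$)' "$file" || true)"

  if [ -z "$hits" ]; then
    return 0
  fi

  while IFS= read -r hit; do
    printf 'VIOLATION %s:%s\n' "$file" "$hit"
  done <<< "$hits"
  return 1
}
```

A regex check like this is only a heuristic: it cannot tell a type annotation from the same text inside a string or comment, which is the kind of case an AST-based check can distinguish.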
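The idea behind a fixture corpus can be sketched as below: a verifier must pass every known-clean fixture and fail every known-bad one. The `good/` and `bad/` sub-folders follow the `tests/good/` and `tests/bad/` naming used in this README, but the function `run_corpus` and the way it calls a verifier are assumptions, not the repo's actual CI runner.

```bash
#!/usr/bin/env bash
# run_corpus VERIFIER FIXTURES_ROOT
# Hypothetical regression check: expects FIXTURES_ROOT/good/* to pass
# the verifier and FIXTURES_ROOT/bad/* to fail it. Returns non-zero if
# any fixture behaves the wrong way round.
run_corpus() {
  local verifier="$1" root="$2" fixture failures=0

  for fixture in "$root"/good/*; do
    [ -e "$fixture" ] || continue   # skip the literal glob when the folder is empty
    if ! bash "$verifier" "$fixture" >/dev/null 2>&1; then
      echo "FAIL: clean fixture was flagged: $fixture"
      failures=$((failures + 1))
    fi
  done

  for fixture in "$root"/bad/*; do
    [ -e "$fixture" ] || continue
    if bash "$verifier" "$fixture" >/dev/null 2>&1; then
      echo "FAIL: bad fixture was not flagged: $fixture"
      failures=$((failures + 1))
    fi
  done

  [ "$failures" -eq 0 ]
}
```

Checking both directions matters: a verifier that always exits 0 passes every good fixture, and one that always exits 1 fails every bad fixture, so only the combination catches a broken check.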
## Worked example
[`examples/orders-api/`](examples/orders-api/) is a small but real Hono + Postgres service built by dogfooding ten skills together. ~1100 lines of TypeScript across 17 files. Read it to see what the kit looks like applied end-to-end on real code.
## Try it live
[`f4rkh4d.github.io/forge-skill/playground/`](https://f4rkh4d.github.io/forge-skill/playground/) loads the actual TypeScript compiler in your browser and runs the six AST checks. Paste TypeScript or TSX, hit Run, see violations with line/column. Nothing leaves your browser. Five "Try this" presets included.
## Contributing
The skill format is intentionally lightweight. To add `forge-rust` or `forge-python-fastapi`:
1. `mkdir skills//forge-` with a `SKILL.md` (frontmatter + Quick reference + Hard rules + BAD/GOOD examples, see existing skills for shape).
2. If the rules are grep- or AST-checkable, add `verify/check_*.sh`. The CLI auto-discovers it.
3. Add a fixture to `tests/bad/` and `tests/good/` so the verifier is regression-tested.
4. Open a PR. CI gates: shellcheck on the verifier, frontmatter present, the corpus passes.
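Steps 1 and 2 could be scripted along these lines. Everything here is a sketch under stated assumptions: the function `scaffold_skill`, its `domain` and `name` parameters, the frontmatter keys (`name`, `description`), and the `##` heading level are hypothetical, so copy the shape from an existing skill rather than from this snippet.

```bash
#!/usr/bin/env bash
# scaffold_skill REPO_ROOT DOMAIN NAME
# Hypothetical helper: creates skills/DOMAIN/forge-NAME with a skeleton
# SKILL.md and a placeholder verifier, then prints the new directory.
scaffold_skill() {
  local root="$1" domain="$2" name="$3"
  local dir="$root/skills/$domain/forge-$name"
  mkdir -p "$dir/verify"

  # Skeleton SKILL.md. The frontmatter keys are assumed, not confirmed.
  printf '%s\n' \
    '---' \
    "name: forge-$name" \
    'description: TODO one-line summary' \
    '---' \
    '' \
    '## Quick reference' \
    '' \
    '## Hard rules' \
    '' \
    '## BAD/GOOD examples' > "$dir/SKILL.md"

  # Placeholder verifier that always passes until real checks are added.
  printf '%s\n' \
    '#!/usr/bin/env bash' \
    '# TODO: print VIOLATION lines and exit non-zero when a rule is broken.' \
    'exit 0' > "$dir/verify/check_$name.sh"
  chmod +x "$dir/verify/check_$name.sh"

  printf '%s\n' "$dir"
}
```

For example, `scaffold_skill . backend rust` would create `./skills/backend/forge-rust/`; the `backend` domain is only an example choice here.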
## License
MIT. Built by [@f4rkh4d](https://github.com/f4rkh4d).