{"id":48839351,"url":"https://github.com/taniwhaai/arai","last_synced_at":"2026-05-26T05:01:11.989Z","repository":{"id":351400247,"uuid":"1208951206","full_name":"taniwhaai/arai","owner":"taniwhaai","description":"AI coding rules that actually work. Enforce instruction files via hooks — CLAUDE.md, .cursorrules, copilot-instructions, and more.","archived":false,"fork":false,"pushed_at":"2026-04-24T03:51:46.000Z","size":295,"stargazers_count":3,"open_issues_count":2,"forks_count":0,"subscribers_count":1,"default_branch":"main","last_synced_at":"2026-04-24T05:15:32.131Z","etag":null,"topics":["ai-guardrails","claude-code","cli","code-quality","copilot","cursor-rules","developer-tools","llm","rust","windsurf"],"latest_commit_sha":null,"homepage":"https://arai.taniwha.ai","language":"Rust","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"apache-2.0","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/taniwhaai.png","metadata":{"files":{"readme":"README.md","changelog":"CHANGELOG.md","contributing":"CONTRIBUTING.md","funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null,"zenodo":null,"notice":null,"maintainers":null,"copyright":null,"agents":null,"dco":null,"cla":null}},"created_at":"2026-04-13T00:21:45.000Z","updated_at":"2026-04-24T03:50:28.000Z","dependencies_parsed_at":null,"dependency_job_id":null,"html_url":"https://github.com/taniwhaai/arai","commit_stats":null,"previous_names":["taniwhaai/arai"],"tags_count":10,"template":false,"template_full_name":null,"purl":"pkg:github/taniwhaai/arai","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/taniwhaai%2Farai","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/taniwhaai%2Farai/tags","releases_url":"ht
tps://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/taniwhaai%2Farai/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/taniwhaai%2Farai/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/taniwhaai","download_url":"https://codeload.github.com/taniwhaai/arai/tar.gz/refs/heads/main","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/taniwhaai%2Farai/sbom","scorecard":null,"host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":286080680,"owners_count":32319560,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2026-04-26T23:26:28.701Z","status":"online","status_checked_at":"2026-04-27T02:00:06.769Z","response_time":128,"last_error":null,"robots_txt_status":"success","robots_txt_updated_at":"2025-07-24T06:49:26.215Z","robots_txt_url":"https://github.com/robots.txt","online":true,"can_crawl_api":true,"host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["ai-guardrails","claude-code","cli","code-quality","copilot","cursor-rules","developer-tools","llm","rust","windsurf"],"created_at":"2026-04-15T01:02:54.236Z","updated_at":"2026-05-26T05:01:11.971Z","avatar_url":"https://github.com/taniwhaai.png","language":"Rust","funding_links":[],"categories":[],"sub_categories":[],"readme":"# Arai\n\n**Instruction files that actually work.** One command. Runs locally. 
Zero cost.\n\nArai makes your AI coding assistant instruction files structurally enforceable — not just suggestions that get forgotten as context grows.\n\n![Arai blocking a forbidden command at the PreToolUse hook](demos/block.gif)\n\n## Quick Start\n\n```bash\ncurl -sSf https://arai.taniwha.ai/install | sh\n\ncd your-project\narai init\n```\n\nThat's it. Arai discovers your instruction files, extracts the rules, classifies their intent, scans your codebase for context, and sets up native hooks so guardrails fire at the right moment.\n\n## What It Does\n\nWhen your AI coding assistant (Claude Code or Grok TUI) is about to do something your rules cover, Arai injects the relevant guardrail — right when it matters. Rules derived from prohibitive predicates (`never`, `forbids`, `must_not`) actually **block the tool call** instead of just advising.\n\n```\nYou: \"Create a new database migration\"\n\n  PreToolUse: Write migrations/versions/001_add_users.py\n  → Arai: deny\n    reason: \"Alembic never: hand-write migration files\"\n            [from your rules:12, layer-1 imperative]\n\nAssistant: \"I should use alembic revision --autogenerate instead...\"\n```\n\nRules only fire when relevant. No noise on `ls`. No repeating principles already in your instruction files.\n\nEvery firing is written to a local audit log, and every PostToolUse is correlated with the matching PreToolUse to produce a **compliance verdict** — so you can measure whether the model actually honours the rules you wrote.\n\n## How It Works\n\n1. **Discovers** instruction files in your project and home directory\n2. **Extracts** rules by pattern-matching imperative language (\"never\", \"always\", \"don't\", \"must\")\n3. **Classifies** each rule's intent — what action it governs, which tools it applies to, when it should fire\n4. **Scans** your codebase with tree-sitter to understand which tools own which directories\n5. **Tracks** session state — knows if you've already run tests before pushing\n6. 
**Fires** only relevant rules at the right moment via native hooks (where supported)\n\n## Supported Instruction Files\n\n| File | Tool | Enforcement |\n|------|------|-------------|\n| `CLAUDE.md` | Claude Code | Hooks (block + advise) |\n| `AGENTS.md` / `Agents.md` | Grok TUI (native) | Hooks (block + advise) |\n| `~/.claude/CLAUDE.md` | Claude Code (global) | Hooks (block + advise) |\n| `~/.grok/` AGENTS.* files | Grok TUI (global) | Hooks (block + advise) |\n| `.cursorrules` / `.cursor/rules` | Cursor | MCP (advise) |\n| `.windsurfrules` | Windsurf | MCP (advise) |\n| `.github/copilot-instructions.md` | GitHub Copilot | Ingest only |\n\nRules from every file are parsed, classified, and stored the same way — but\nenforcement strength depends on what surface the assistant exposes.\n\n- **Claude Code** and **Grok TUI** both support real PreToolUse hooks, so Arai\n  can issue `deny` decisions and actually block tool calls.\n- Cursor and Windsurf are MCP clients today — they get strong advisory\n  enforcement via the MCP server.\n- GitHub Copilot currently has no live enforcement surface; the file is\n  still ingested for `arai stats`, `arai diff`, and the audit log.\n\nArai hooks several more events alongside the standard tool-call events\n(when the assistant supports them) so the rule set stays accurate to the live\nworking tree:\n\n- **`FileChanged` + `InstructionsLoaded`** — when an instruction file\n  (CLAUDE.md, rules-dir, memory file, ...) is edited on disk or loaded\n  into context, Arai spawns an `arai scan` in the background. 
The next\n  tool-call hook sees the updated guardrails — no manual rescan.\n- **`CwdChanged`** — when Claude `cd`s into a different directory\n  (monorepo navigation), Arai re-scans rooted at the new directory so\n  the next tool call matches against the right project's rules.\n- **`PostToolBatch`** — when Claude does a batch of parallel tool calls,\n  Arai correlates each call individually against any PreToolUse firings\n  in the same session, so per-rule compliance verdicts (Obeyed /\n  Ignored / Unclear) stay accurate on parallel workloads.\n\n## Smart Matching\n\nArai doesn't just do keyword matching. It understands your rules:\n\n- **Intent classification** — \"never hand-write migration files\" only fires on Write, not Edit (editing existing migrations is fine)\n- **Code graph** — writing to `migrations/versions/` triggers alembic rules even if the file doesn't mention alembic, because sibling files import it\n- **Content sniffing** — detects `from alembic import op` in file content being written\n- **Session awareness** — \"never push without running tests\" suppresses after tests have been run\n- **Timing routing** — domain rules fire on tool calls, principles stay silent (already in CLAUDE.md)\n- **Broad imperative coverage** — recognises `never/always/don't/must`, `should/shouldn't`, `cannot/refuse`, `make sure/be sure`, `consider/recommend`, bare `No X` prohibitions, conditional shapes (`When X, do Y` / `Before X: do Y` / `If X → do Y`), and the section-aware `Use X` style-guide pattern. Severity mapping mirrors grammatical weight: `should` is `Inform` (soft), `should not` is `Block` (the writer chose to call out a specific prohibition).\n\n## Compliance \u0026 audit\n\nBeyond firing rules, Arai produces a tamper-evident local record of *every*\nguardrail decision and correlates it with what the model actually did. 
This\nis what tech leads and compliance reviewers want to see — the trail behind\nthe enforcement.\n\n- **Local JSONL audit log** — one line per firing at\n  `~/.taniwha/arai/audit/\u003cproject\u003e/\u003cYYYYMMDD\u003e.jsonl`. Append-only, day-bucketed,\n  queryable with `arai audit` (filters: `--since`, `--tool`, `--event`,\n  `--outcome`, `--rule`). Owner-only on disk (0700 dir / 0600 file on Unix;\n  `icacls`-pinned on Windows).\n- **Hash-chained — actually tamper-evident** — every line carries `prev_hash`\n  and `hash` (SHA-256 over canonical bytes); the chain is anchored per-day in\n  a `.head.YYYYMMDD` sidecar. `arai audit --verify` walks the chain across\n  every day-bucket and exits non-zero on any tamper / reorder / deletion —\n  drop it in a cron or pre-archive job to gate evidence integrity.\n- **Retention controls** — `arai audit --purge --older=90` drops day-buckets\n  older than 90 days; `arai audit --purge --project=\u003cslug\u003e` wipes a specific\n  project (offboarding / decommission). Today's bucket is always preserved\n  and whole files are deleted (never individual lines), so the hash chain on\n  retained days stays valid. 
Pair with `--dry-run` (and `--json`) for a\n  pre-purge review, or wire into a scheduled job for time-based retention\n  policy.\n- **Derivation trace per firing** — each rule entry records source file,\n  line number, and parser layer (`from CLAUDE.md:42, layer-1 imperative`).\n  Auditors can answer \"why did this rule fire?\" without code spelunking.\n- **Compliance verdicts** — every PostToolUse is correlated against recent\n  PreToolUse firings to produce **Obeyed / Ignored / Unclear** per rule.\n  `arai stats --by-rule` rolls these up into per-rule ratios with a ⚠ flag\n  on rules the model is routing around.\n- **Graduated enforcement** — severity tiers (Block / Warn / Inform) derive\n  from rule predicate; `arai severity` pins individual rules so you can\n  ship a rule set in advise mode and escalate one at a time.\n  `ARAI_DENY_MODE=off` is the project-wide rollback path.\n- **Regression-tested policy** — `arai test` replays scenarios through the\n  live `match_hook` pipeline; `arai record` captures real firings as\n  fixtures. Rule changes become CI assertions, not vibes.\n- **No data egress** — no network on the hook hot path. Anonymous opt-out\n  telemetry is architecturally separate from the audit log; they share no\n  code path. The audit data physically cannot leak via the telemetry\n  channel. The telemetry queue is hard-capped at 2 MiB on disk.\n- **Supply-chain hardened** — every install path verifies the binary\n  against published `checksums.txt` (SHA-256). `arai:extends` upstream\n  policy fetches refuse loopback / RFC1918 / link-local / cloud metadata\n  and disable redirects; **cached upstream policies carry a SHA-256\n  sidecar** so a tampered cache file is detected before its rules reach\n  the parser.\n- **MCP authentication** — the agent-facing MCP server supports an optional\n  shared-secret via `ARAI_MCP_AUTH_TOKEN`. 
When set, `initialize` must\n  present a matching token (constant-time compare) before any tool call\n  succeeds.\n\nDesigned to align with the **SOC 2 Trust Service Criteria** (CC6.1 logical\naccess, CC6.6 supply-chain, CC7.2 monitoring, CC7.3 detection, CC8.1 change\nmanagement, CC9.2 vendor management). Arai is not itself a certified\nproduct — it gives you the controls and the evidence trail; the\ncertification is yours to pursue. A complete TSC mapping and enterprise /\nprocurement-team feature inventory is in\n[`docs/arai-compliance-features.pdf`](docs/arai-compliance-features.pdf).\nThe Word source (`.docx`) is committed alongside it for editing.\n\n## Enrichment\n\nThree tiers of rule understanding, each more accurate:\n\n```bash\narai scan                  # Tier 1: Built-in verb taxonomy (free, instant)\narai scan --enrich         # Tier 2: Sentence transformer model (local, ~80MB download)\narai scan --enrich-llm     # Tier 3a: LLM classification via CLI\narai scan --enrich-api     # Tier 3b: LLM classification via API (no CLI needed)\n```\n\nConfigure your LLM:\n```bash\n# Via CLI tool (shell-out)\nARAI_LLM_CMD=\"claude -p\" arai scan --enrich-llm\nARAI_LLM_CMD=\"ollama run llama3\" arai scan --enrich-llm\n\n# Via API (OpenAI-compatible endpoints)\nARAI_API_KEY=sk-... arai scan --enrich-api                    # OpenAI (default)\nARAI_API_URL=http://localhost:11434/v1 arai scan --enrich-api  # Ollama (auto-detected)\nARAI_API_URL=https://api.groq.com/openai/v1 ARAI_API_KEY=gsk-... 
ARAI_API_MODEL=llama-3.3-70b-versatile arai scan --enrich-api\n\n# Or in ~/.taniwha/arai/config.toml\n[enrich]\nllm_command = \"llm -m gpt-4o-mini\"       # for --enrich-llm\napi_url = \"https://api.openai.com/v1\"     # for --enrich-api\napi_key_env = \"OPENAI_API_KEY\"\nmodel = \"gpt-4o-mini\"\n```\n\n## Commands\n\n```bash\narai init                  # Discover, extract, classify, scan, set up hooks\narai status                # Show what's being enforced\narai guardrails            # List all active rules\narai why \"git push --force\" # Explain which rules would fire (dry-run, no audit write)\narai scan                  # Re-scan instruction files\narai scan --code           # Also scan source code (tree-sitter AST)\narai scan --enrich-llm     # Enhance rules via LLM CLI\narai scan --enrich-api     # Enhance rules via API (OpenAI-compatible)\narai add \"Never X\"         # Add a rule manually\narai audit                 # Inspect the local log of rule firings\narai audit --outcome=ignored # Compliance verdicts where the model ignored a rule\narai audit --rule alembic  # Filter audit by rule subject/predicate/object substring\narai audit --verify        # Verify the SHA-256 hash chain across every day-bucket\narai stats                 # Aggregate audit log — top rules, compliance, token economics\narai stats --by-rule       # Just the per-rule compliance + token economics\narai severity alembic block # Pin a rule's severity (incremental deny rollout)\narai severity --reset alembic # Drop the override; severity reverts to predicate-derived\narai diff CLAUDE.md        # Preview rule-set delta before saving an edit\narai test scenarios.json   # Replay synthetic hook scenarios against rules\narai record --since=1h     # Capture recent firings as a scenario skeleton\narai lint CLAUDE.md        # Parse a file and preview extracted rules\narai trust                 # Manage URLs trusted for shared-policy extends\narai mcp                   # Run the MCP server 
(stdio) for agent-authored guards\narai upgrade --full        # Switch to full binary (with ONNX enrichment)\n```\n\n## Deny mode — actually block bad actions\n\nStarting in v0.2.3, Arai no longer just *advises*: rules derived from\nprohibitive predicates (`never`, `forbids`, `must_not`) emit\n`permissionDecision: \"deny\"` (or equivalent) so the assistant refuses the tool call. Advisory\nrules (`always`, `requires`, `prefers`) keep the previous behaviour.\n\nSeverity is inferred from the predicate at extract time:\n\n| Predicate | Severity | Hook behaviour |\n|-----------|----------|----------------|\n| `never`, `forbids`, `must_not` | `block`  | `permissionDecision: \"deny\"` + reason |\n| `always`, `requires`, `enforces` | `warn` | `permissionDecision: \"allow\"` + context |\n| `prefers`, `learned_from` | `inform` | `permissionDecision: \"allow\"` + context |\n\nRolling Arai out incrementally? Flip deny mode off at the env level:\n\n```bash\nARAI_DENY_MODE=off   # advisory-only — rules still fire in additionalContext\n```\n\nUseful pattern: ship Arai in advise mode for a week, watch `arai audit\n--outcome=ignored`, tune the rules the model keeps flouting, then enable\ndeny mode when the rule set is trustworthy.\n\n## Compliance tracking\n\nAfter every PostToolUse, Arai correlates the call against recent\nPreToolUse firings in the same session and emits a `Compliance` event to\nthe audit log per rule:\n\n- **obeyed** — forbidden phrase absent from the executed command (for\n  prohibitive rules), or the required evidence present (for affirmative\n  rules).\n- **ignored** — forbidden phrase still in the executed command.\n  The model ran the thing anyway (either deny was off or the assistant\n  chose to proceed).\n- **unclear** — not enough signal to decide (short object text, or\n  affirmative rule without evidence in this call).\n\n```bash\narai audit --event=Compliance     # all verdicts\narai audit --outcome=ignored      # shortcut for the painful ones\narai 
audit --outcome=obeyed       # show the rules doing their job\n```\n\nThis closes the feedback loop the audit log was missing: not just *which*\nrules fired, but *which ones the model actually honoured*.\n\n## arai why — explain before you commit\n\n`arai why \u003caction\u003e` replays a hypothetical tool call through the live\nmatching pipeline and prints the rules that would fire, with severity,\nderivation (source + line + parser layer), and match percentage. No audit\nwrite; read-only against the rule set.\n\n```bash\narai why \"git push --force origin main\"\narai why --tool Write /src/migrations/001_init.py\narai why --tool Bash --event PostToolUse \"rm -rf /data\"\narai why \"git push --force\" --json   # machine-readable\n```\n\nUse it to: debug \"why did that rule fire?\", preview new rules before\ncommitting them, or include the output in a PR description when you\nchange a CLAUDE.md.\n\n## Rule expiry — self-pruning rules\n\nAnnotate rules with `(expires YYYY-MM-DD)` or `(until YYYY-MM-DD)` at the\nend of the line. The annotation is stripped from the rule body at parse\ntime and stored separately; `load_guardrails` filters out expired rows so\nthe rule stops firing on its own, without you having to remember to\nclean it up.\n\n```markdown\n- Never touch the old auth module (expires 2026-09-01)\n- Always rebase against release-1.8 until 2026-12-31\n- Prefer the new payment SDK over the legacy one (until 2027-06-30)\n```\n\nPerfect for `learned_from` incidents that have a shelf life, migration\nwindows, and \"temporarily forbid X until we finish the refactor\" rules.\n\n## Per-rule enrichment opt-out — `(noenrich)`\n\n`arai scan --enrich-llm` and `--enrich-api` send the full text of every\nguardrail to whatever LLM you've configured (`ARAI_LLM_CMD` /\n`ARAI_API_URL`). For most rules that's fine — they're already in\n`CLAUDE.md`. 
But if a single rule mentions an internal codename you'd\nrather not ship to a third-party endpoint, append `(noenrich)`:\n\n```markdown\n- Never deploy to internal-codename-cluster (noenrich)\n```\n\nThe annotation is stripped from the rule body at parse time and stored\nseparately; the enrichment paths filter the rule out before building the\nprompt. `(noenrich)` and `(expires …)` can appear together in either\norder. To opt out globally, just don't pass `--enrich-llm` /\n`--enrich-api` — neither runs by default.\n\nBefore each enrichment run Arai prints a one-line notice with the\nresolved destination and a locality verdict (`local` / `REMOTE` /\n`unknown locality`), plus the count of rules excluded via `(noenrich)`,\nso you can see at a glance whether rule text is about to leave the\nhost.\n\n## Audit log\n\nEvery time a rule fires, Arai appends one line to a local JSONL log at\n`~/.taniwha/arai/audit/\u003cproject-slug\u003e/\u003cYYYYMMDD\u003e.jsonl`. The log captures the\nhook event, the tool that was called, a truncated prompt preview, the\ndecision (`inject`, `deny`, `review`), and every rule that matched —\nwith source file, line number, parser layer, severity, and confidence.\n\nNothing leaves your machine — this is separate from the anonymous\nusage telemetry below.\n\n```bash\narai audit                    # Today's firings, table view\narai audit --since=7d         # Last week\narai audit --tool=Bash        # Only Bash tool calls\narai audit --event=PreToolUse # Only pre-tool-use firings\narai audit --event=Compliance # Compliance verdicts (Pre/Post correlation)\narai audit --outcome=ignored  # Shortcut: Compliance events marked ignored\narai audit --rule alembic     # Filter to firings/verdicts touching this rule\narai audit --json             # JSONL stream (pipe-friendly)\narai audit --verify           # Verify the SHA-256 hash chain (exits non-zero on any tamper)\narai audit --verify --json    # Machine-readable verify report for CI / 
cron\n```\n\n`--rule` is a case-insensitive substring match against the rule's\nsubject, predicate, or object — the same shape `arai severity` uses.\nPairs naturally with `--outcome=ignored` to answer \"every time the\nalembic rule was ignored this week\".\n\nUseful for answering:\n\n- *\"Why did Claude suddenly change approach halfway through?\"* —\n  look up the firing, see which rule matched.\n- *\"Which rules are actually load-bearing?\"* — sort firings by rule,\n  prune rules that never trigger.\n- *\"Did the guardrail fire before that regrettable git push?\"* —\n  grep by session id.\n\n## Status — health check your rule set\n\n`arai status` shows how many rules are loaded, where they came from,\nand when they were last scanned. As of v0.2.2 it also surfaces two\ncommon rule-set health issues:\n\n- **Duplicate rules** — the same (subject, predicate, object) ingested\n  from more than one source file. Usually safe to consolidate into\n  one source to reduce drift.\n- **Opposing predicates** — the same subject carries both a\n  prohibitive predicate (`never`, `must_not`, `avoid`) and a required\n  predicate (`always`, `must`, `requires`, `ensure`). Not always a\n  real conflict (the objects may differ), but worth a human look.\n\nThese are advisory only — the hook path ignores them. 
Fix them at the\nsource.\n\n## Stats — aggregate the audit log\n\n`arai stats` rolls up the same JSONL `arai audit` tails and answers\nthe questions every maintainer asks after a few weeks of use:\n\n```bash\narai stats                # Top rules, compliance, token economics\narai stats --since=30d    # Window to the last month\narai stats --top=5        # Show only top 5 per section\narai stats --by-rule      # Compliance + token economics only\narai stats --json         # Machine-readable for dashboards\n```\n\nOutput includes: total firings, most-fired rules, tools attracting the\nmost guardrails, day-by-day activity, **and a per-rule compliance\nroll-up** — for every rule that has fired, how many Pre/Post pairs\nended up `obeyed` vs `ignored`, plus a ratio:\n\n```\nPer-rule compliance\n  fires obeyed ignored unclear   ratio  rule\n     12     11       1       0     92%  alembic must_not: hand-write migrations\n      7      4       3       0     57%  git must_not: --no-verify  ⚠\n      9      9       0       0    100%  cargo always: test before commit\n```\n\nThe ⚠ flag highlights rules with low ratios and enough volume to\nmean it — these are the ones to either rewrite (rule subject too\nnarrow / object too vague) or escalate via `arai severity` (see\nbelow) once you trust the wording.\n\nThe ratio is computed **once per Pre firing** using a first-\ndefinitive-wins rule: the first non-`unclear` Compliance verdict\ncorrelated against a Pre is the verdict for that Pre, regardless\nof how many subsequent Posts also fall inside the 5-minute\ncorrelation window. So a rule that fires once and is honored stays\nat 1 obeyed / 1 fire, not 8 obeyed / 1 fire just because eight\nunrelated commands followed.\n\nNothing leaves the machine — stats are a local view over your own\naudit log.\n\n## Token economics — calibrated estimates\n\n`arai stats` also surfaces a *token economics* section with\ncalibrated estimates of how Arai is affecting your model's token\nburn. 
Two streams contribute:\n\n```\nToken economics (estimates)\n     12  repeat-injection suppressions  (~600 tokens, 50 ea.)\n      4  denied-and-honored mistakes    (~8000 tokens, 2000 ea.)\n     17  advised-and-honored events     (~8500 tokens, 500 ea.)\n            total estimated tokens saved:  ~17100\n            (calibrated estimates, not measurements)\n```\n\n- **Repeat-injection suppressions** — when a rule fires a second\n  time in the same session, Arai emits a compact \"still: subject\n  predicate object\" line instead of re-injecting the full source /\n  layer / severity payload. The model already has that context from\n  the first firing. The 50-token estimate is the rough delta\n  between the full and compact forms.\n- **Denied-and-honored mistakes** — a `block`-severity rule fired,\n  the model would otherwise have run a destructive action, and the\n  PostToolUse correlation confirms it didn't. The 2000-token\n  estimate is a conservative bound on what \"fix the mess\" cycles\n  cost (revert files, undo migrations, rollback deploys).\n- **Advised-and-honored events** — a `warn` or `inform` rule fired\n  and the model complied. Lower confidence saving (the model might\n  have done the right thing anyway), so a smaller 500-token\n  estimate.\n\nThese are **estimates, not measurements**. The constants live in\n[`src/stats.rs`](src/stats.rs) and are documented there; treat the\ntotal as an order-of-magnitude reading, not a precise number. 
If\nyou want to see the underlying counts, `arai stats --json` exposes\nthe `token_economics` object with all three streams broken out.\n\n## Severity — per-rule deny-mode rollout\n\n`arai severity` pins a rule's enforcement strength so re-running\n`arai scan` won't reset it to the predicate-derived classification.\nUse it for **incremental deny-mode rollout**: ship the rule set in\nadvise mode (`ARAI_DENY_MODE=off`), watch `arai stats --by-rule`,\nand flip individual rules into `block` once the model is honouring\nthem in the wild — without forcing the whole rule set into a strict\nmode it isn't ready for yet.\n\n```bash\narai severity                          # List active overrides\narai severity alembic block            # Pin every rule whose subject/object\n                                       # contains \"alembic\" to block\narai severity git warn                 # Demote git rules to advise-only\narai severity --reset alembic          # Drop the override; severity reverts\n                                       # to the predicate-derived value\narai severity alembic block --json     # Machine-readable list of changes\n```\n\nPattern matching is case-insensitive substring against the rule's\nsubject *or* object, so `arai severity migrate` covers both\n`alembic must_not: hand-write migrations` and `migrations require:\nbackfill_plan`.\n\nOverrides survive `arai scan` and `arai init` — they live in their\nown column and are never touched by re-classification. Drop one with\n`--reset` when you're ready to re-derive severity from the rule's\npredicate.\n\n## Diff — preview rule-set changes\n\n`arai diff \u003cfile\u003e` shows what changes a candidate edit to an\ninstruction file would make to the live rule set — added, removed,\nmoved — before you save and run `arai scan`. 
Read-only against\nthe store; pairs with `arai lint` (preview a file in isolation)\nand `arai why` (preview a single tool call).\n\n```bash\narai diff CLAUDE.md                            # Plain table view\narai diff memory/feedback_testing.md --json    # For pre-commit hooks\n```\n\nOutput is grouped into three sections — `Added` (rules in the file\nthat aren't in the store yet), `Removed` (rules in the store whose\ntext isn't in the new file), `Moved` (same rule, different line\nnumber — caught when you re-order a file without changing its rules).\nJSON output keeps the same shape for CI.\n\n## Lint — preview what a file produces\n\n`arai lint \u003cfile\u003e` parses an instruction file and prints every rule it\nwould extract along with the intent classification, without touching\nthe DB. Use it to iterate on CLAUDE.md wording and see the effect\nbefore you commit.\n\n```bash\narai lint CLAUDE.md\narai lint memory/feedback_testing.md --json   # machine-readable\n```\n\nOutput for each rule: subject / predicate / object, the classified\naction (Create / Modify / Execute / General), the hook timing it routes\nto (ToolCall / Stop / Start / Principle), and which tools the rule\napplies to.\n\n## Test — regression harness for rules\n\n`arai test` replays synthetic hook payloads through the *same*\n`match_hook` pipeline the live hook handler uses, so rule changes get\ncaught before they affect a real session.\n\nThe canonical [alembic example](scenarios/alembic-migration.json) is\nchecked in — run it after `arai init` on any repo with an alembic rule\nin CLAUDE.md:\n\n```bash\narai test scenarios/alembic-migration.json\n```\n\nScenario files are JSON:\n\n```json\n{\n  \"scenarios\": [\n    {\n      \"name\": \"force-push triggers the git guardrail\",\n      \"hook\": {\n        \"hook_event_name\": \"PreToolUse\",\n        \"tool_name\": \"Bash\",\n        \"tool_input\": { \"command\": \"git push --force origin master\" }\n      },\n      \"expect\": {\n        
\"matches_subject\": [\"git\"],\n        \"does_not_match_subject\": [\"alembic\"],\n        \"min_matches\": 1\n      }\n    }\n  ]\n}\n```\n\n```bash\narai test scenarios/guards.json\narai test scenarios/guards.json --json   # structured pass/fail for CI\n```\n\nExit code is non-zero when any scenario fails. Matches are checked by\nsubject substring because full SPO triples tend to drift across\nre-ingest.\n\n## Record — seed scenarios from real firings\n\n`arai record` turns entries in the audit log into scenario skeletons\nso you don't hand-write regression tests. Flow: run your assistant, hit a\nrule firing you want pinned, `arai record --since=1h \u003e tests.json`,\ntune the expectations, check in.\n\n```bash\narai record --since=1h              # last hour\narai record --since=7d --tool=Bash  # only Bash firings from the last week\narai record --limit=50              # cap audit entries scanned\n```\n\nDeduplicates by (tool, prompt) so repeated identical firings collapse\nto one scenario. Each scenario's `expect` seeds `matches_subject` with\nwhatever actually fired and `min_matches: 1` — tune from there.\n\nRuntime-capturing *new rules* (as opposed to testing existing ones) is\na different loop: that goes through the MCP `arai_add_guard` tool,\ndocumented below.\n\n## Shared policies — `arai:extends`\n\nInstruction files can inherit rules from a trusted upstream URL. This\nis the \"org-wide CLAUDE.md\" pattern without a policy service — just\nanother markdown file hosted wherever you like.\n\nDeclare the upstream in your CLAUDE.md:\n\n```markdown\n\u003c!-- arai:extends https://example.com/standards/rust-backend.md --\u003e\n\n# My project rules\n- Never publish artifacts before tag push\n```\n\nThen trust the URL:\n\n```bash\narai trust --add https://example.com/standards/rust-backend.md\narai trust                  # List trusted URLs\narai trust --remove \u003curl\u003e   # Revoke\n```\n\nĀrai never fetches a URL that isn't explicitly trusted. 
HTTPS only,\n512 KB size cap, 24-hour cache with stale-while-error fallback, and\nextends are not recursive — the fetched file can't pull in further\nURLs. On `arai init`, trusted upstream content is inlined ahead of the\nlocal rules before the parser runs, so the rest of the pipeline sees\none merged file.\n\n## MCP: agent-authored guardrails\n\n`arai mcp` is also the integration path for assistants that don't have a\nnative PreToolUse hook surface. Cursor and Windsurf are both MCP clients — point\nthem at `arai mcp` and the agent can read the same rule set, register new guards\nmid-session, and self-check recent decisions.\nThe strongest blocking enforcement is available in assistants with native hook\nsupport (currently Claude Code and Grok TUI), but everything else — rule lookup,\nagent-authored guards, decision history — is shared via MCP.\n\n`arai mcp` runs a [Model Context Protocol](https://modelcontextprotocol.io/)\nserver on stdio. Three tools, exposed to any MCP-capable agent:\n\n| Tool | What it does |\n|------|--------------|\n| `arai_add_guard(rule, reason?)` | Register a new guardrail mid-session. Takes effect on the next PreToolUse hook — same enforcement path as rules in your CLAUDE.md. |\n| `arai_list_guards(pattern?)` | List active guardrails, optionally substring-filtered, so the agent can check what constraints are live before acting. |\n| `arai_recent_decisions(session_id?, limit?, since?)` | Look up recent Ārai decisions (deny / inject / review) so the agent can self-check after a refusal — closes the model-side feedback loop. |\n\nThis closes two gaps instruction files don't cover. First, when an agent\ndiscovers a rule mid-session (*\"from now on, never write to /etc\"*,\n*\"always run the full test suite before pushing\"*), it now has\nsomewhere to register it for deterministic enforcement rather than\nhoping context retention holds. 
Second, after a deny, the agent can\ncall `arai_recent_decisions` to see what it was just refused for —\nuseful for avoiding \"try the same thing twice\" loops when a single\nrule keeps getting hit.\n\nRegister it with your assistant (for example in Claude Code or Cline) by adding to your MCP settings:\n\n```json\n{\n  \"mcpServers\": {\n    \"arai\": {\n      \"command\": \"arai\",\n      \"args\": [\"mcp\"]\n    }\n  }\n}\n```\n\nFor Cline (in `cline_mcp_settings.json`, or via the MCP UI):\n\n```json\n{\n  \"mcpServers\": {\n    \"arai\": {\n      \"command\": \"arai\",\n      \"args\": [\"mcp\"],\n      \"disabled\": false,\n      \"autoApprove\": []\n    }\n  }\n}\n```\n\nFor Cursor and Windsurf, follow each tool's MCP server registration UI\nand point it at the same `arai mcp` command — the protocol is identical.\n\nPrerequisite: `arai` must be on your `PATH`. The install script, `cargo\ninstall arai`, `npm install -g @taniwhaai/arai`, and the Homebrew tap all\nput it there.\n\n## Installation\n\n```bash\n# Install script (recommended)\ncurl -sSf https://arai.taniwha.ai/install | sh\n\n# Full binary (with local sentence transformer)\nARAI_FULL=1 curl -sSf https://arai.taniwha.ai/install | sh\n\n# npm\nnpm install -g @taniwhaai/arai\n\n# Cargo\ncargo install arai\ncargo install arai --features enrich   # with ONNX model support\n\n# Homebrew\nbrew install taniwhaai/tap/arai\n\n# Docker (sandboxed install or CI-side enforcement)\ndocker build -t arai .\ndocker run --rm -i -v \"$(pwd)/.taniwha/arai:/home/arai/.taniwha/arai\" arai\n# Or via compose with a persistent named volume:\ndocker compose run --rm arai\n```\n\n## Performance\n\n| Operation | Median | p95 |\n|-----------|--------|-----|\n| Hook check (skip-tool — Read/Glob/Agent) | ~22 ms | ~36 ms |\n| Hook check (full match pipeline) | ~32 ms | ~55 ms |\n| Full init | \u003c200 ms | — |\n\nEnd-to-end wall clock per tool call (on supported assistants), measured by\n`bench/hot_path.sh`. 
Cost is dominated by Rust binary fork+exec\n(~20 ms floor on Linux/WSL); rule matching itself stays sub-ms even above 200\nrules thanks to the LEFT-JOIN'd intent and Aho-Corasick content sniffing.\nRule count between 50 and 500 doesn't materially move the median;\nmatching is no longer the bottleneck.\n\n## Telemetry\n\nĀrai collects anonymous usage data to help us understand whether guardrails are actually useful. We track:\n\n- Whether a rule fired and on which tool\n- Hook response latency\n- Rule counts and enrichment tier on init\n\nWe **never** collect file paths, rule text, code content, API keys, or anything that could identify you or your codebase.\n\n**Opt out** at any time:\n\n```bash\nexport ARAI_TELEMETRY=off   # or DO_NOT_TRACK=1\n```\n\n## Built By\n\n[Taniwha.ai](https://taniwha.ai), extracted from the [Kete](https://github.com/taniwhaai/kete) code intelligence platform.\n\n## License\n\nMIT / Apache-2.0\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Ftaniwhaai%2Farai","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Ftaniwhaai%2Farai","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Ftaniwhaai%2Farai/lists"}