{"id":50719891,"url":"https://github.com/suarezpm/apohara-compliance","last_synced_at":"2026-06-12T01:00:39.262Z","repository":{"id":362754059,"uuid":"1260580925","full_name":"SuarezPM/apohara-compliance","owner":"SuarezPM","description":"Maps an AI coding agent's actions — or a repository — to OWASP Agentic Top 10 and compliance framework controls, surfacing candidate findings with citations. Rust scanner + agent skill, SARIF output.","archived":false,"fork":false,"pushed_at":"2026-06-09T22:40:39.000Z","size":456,"stargazers_count":0,"open_issues_count":1,"forks_count":0,"subscribers_count":0,"default_branch":"main","last_synced_at":"2026-06-09T23:14:56.842Z","etag":null,"topics":["agentic-security","ai-governance","ai-security","compliance","owasp","owasp-agentic","rust","sarif","static-analysis"],"latest_commit_sha":null,"homepage":null,"language":"Rust","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"apache-2.0","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/SuarezPM.png","metadata":{"files":{"readme":"README.md","changelog":"CHANGELOG.md","contributing":null,"funding":null,"license":"LICENSE-APACHE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":"SECURITY.md","support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null,"zenodo":null,"notice":"NOTICE","maintainers":null,"copyright":null,"agents":null,"dco":null,"cla":null}},"created_at":"2026-06-05T16:39:59.000Z","updated_at":"2026-06-09T22:40:20.000Z","dependencies_parsed_at":null,"dependency_job_id":null,"html_url":"https://github.com/SuarezPM/apohara-compliance","commit_stats":null,"previous_names":["suarezpm/apohara-compliance"],"tags_count":10,"template":false,"template_full_name":null,"purl":"pkg:github/SuarezPM/apohara-compliance","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/SuarezPM%2Fapohara-compliance","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/SuarezPM%2Fapohara-compliance/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/SuarezPM%2Fapohara-compliance/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/SuarezPM%2Fapohara-compliance/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/SuarezPM","download_url":"https://codeload.github.com/SuarezPM/apohara-compliance/tar.gz/refs/heads/main","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/SuarezPM%2Fapohara-compliance/sbom","scorecard":null,"host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":286080680,"owners_count":34175887,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2026-05-26T15:22:16.424Z","status":"online","status_checked_at":"2026-06-10T02:00:07.152Z","response_time":89,"last_error":null,"robots_txt_status":"success","robots_txt_updated_at":"2025-07-24T06:49:26.215Z","robots_txt_url":"https://github.com/robots.txt","online":true,"can_crawl_api":true,"host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["agentic-security","ai-governance","ai-security","compliance","owasp","owasp-agentic","rust","sarif","static-analysis"],"created_at":"2026-06-09T23:00:20.235Z","updated_at":"2026-06-11T00:00:40.472Z","avatar_url":"https://github.com/SuarezPM.png","language":"Rust","funding_links":[],"categories":[],"sub_categories":[],"readme":"\u003cdiv align=\"center\"\u003e\n\n# apohara-compliance\n\n**Audit what your AI coding agent _did_ — not just what your repo _contains_.**\n\n[![CI](https://img.shields.io/github/actions/workflow/status/SuarezPM/apohara-compliance/codeql.yml?style=for-the-badge\u0026label=CI)](https://github.com/SuarezPM/apohara-compliance/actions)\n[![License](https://img.shields.io/badge/license-MIT%20OR%20Apache--2.0-blue?style=for-the-badge)](#-license)\n[![Rust](https://img.shields.io/badge/rust-1.74%2B-orange?style=for-the-badge\u0026logo=rust)](https://www.rust-lang.org)\n[![Version](https://img.shields.io/badge/version-2.2.0--local-purple?style=for-the-badge)](https://github.com/SuarezPM/apohara-compliance/releases)\n[![SARIF](https://img.shields.io/badge/output-SARIF%202.1.0-success?style=for-the-badge)](https://sarifweb.azurewebsites.net)\n[![OpenSSF Scorecard](https://api.scorecard.dev/projects/github.com/SuarezPM/apohara-compliance/badge?style=for-the-badge)](https://scorecard.dev/viewer/?uri=github.com/SuarezPM/apohara-compliance)\n\n**[Quick Start](#-quick-start)** · **[Features](#-features)** · **[Frameworks](#-framework-coverage)** · **[How it works / honesty](#-how-it-works--honesty)** · **[Benchmark](BENCHMARK.md)** · **[Security](SECURITY.md)**\n\nA deterministic Rust scanner that maps an AI coding agent's **observed actions** — or a repository — to compliance and agentic-security framework controls, surfacing **candidate** risks _with citations_ for a human to confirm.\n\n\u003c/div\u003e\n\n\u003e **Honesty lineage at a glance.** `main` carries the v1.1 release **plus** the additive v2.0 → v2.1 → v2.2 → v2.3 trajectory/taint work (ADR-4 → ADR-5 → ADR-6 → ADR-7). The latest crates.io / GitHub Release tag is still **v1.1.0** — the v2.x line is shipped on `main` but **not yet tagged/published** (Pablo-gated). Everything v2.x is offline + deterministic + byte-identical to v1.1 on the single-action engine; the additive passes do not change the existing rules. v2.3's -P variants are ADDITIVE, opt-in, byte-identical passthrough when the flag is empty. See **[How it works / honesty](#-how-it-works--honesty)** and **[BENCHMARK.md](BENCHMARK.md)** for the bound triple and the explicit co-headline limit.\n\n---\n\n```console\n$ apohara-compliance-scanner scan-session session.jsonl --format md\n\n# apohara-compliance — candidate findings\n\n_Guidance/mapping only — these are CANDIDATES for review, not assertions of\ncompliance, certification, or audit conclusions._\n\n**Rules source:** `embedded-fallback` · **Findings:** 5 · **Suppressed:** 0\n\n## Findings\n\n- CANDIDATE — AGT-MIS-001 Destructive Tool Invocation — status: official, confidence: 0.90\n  - triggering_signal: rm -rf\n  - suggested_controls: SP800-53:SI-7, EU-AI-ACT:Art-9, ISO27001:A.12.1\n  - cross_refs: ASI02, ASI05, OWASP-LLM:LLM06, OWASP-LLM:LLM01, AML.T0053, AML.T0050\n  - citation: https://doi.org/10.6028/NIST.SP.800-53r5 (version Rev 5)\n\n- CANDIDATE — AGT-EXF-002 Unauthorized Outbound Network Call — status: official, confidence: 0.90\n  - triggering_signal: curl http\n  - suggested_controls: SP800-53:SC-7, SOC2:CC6.6, ISO27001:A.8.16, OWASP-LLM:LLM02\n  - cross_refs: ASI02, ASI04, OWASP-LLM:LLM06, AML.T0025, ISO42001:A.6.2.6\n  - citation: https://doi.org/10.6028/NIST.SP.800-53r5 (version Rev 5)\n\n- CANDIDATE — AGT-PI-002 Roleplay Persona Manipulation — status: draft, confidence: 0.70\n  - triggering_signal: act as\n  - suggested_controls: OWASP-LLM:LLM01, NIST-AI-RMF:AGENTIC-MAP-PROMPT-SURFACE\n  - cross_refs: ASI01, OWASP-LLM:LLM01, AML.T0051, AML.T0054\n  - citation: https://genai.owasp.org/llm-top-10/ (version 2025)\n```\n\n\u003e Real output from `scan-session` over the committed test fixture, trimmed to three of five findings. Every line is prefixed `CANDIDATE —`: a finding is a _please-confirm_, never a verdict.\n\n---\n\n## 💡 Concept\n\n\u003e [!NOTE]\n\u003e **The agent's actions are the attack surface.** Most AI-governance tooling inspects data-at-rest or the model itself. But when an AI coding agent runs `rm -rf`, opens an outbound `curl`, dumps a table, or follows an `act as …` instruction, the risk lives in **what it did** — the exact surface the [OWASP Top 10 for Agentic Applications (2026)](https://genai.owasp.org/) is built around.\n\n`apohara-compliance` reads an AI coding-agent **session transcript** (the newline-delimited JSON record of every tool call it made) — or a repository — and maps the observed signals to framework controls. Each match is a candidate finding carrying the triggering signal, a confidence score, suggested controls, cross-framework references, and a citation (ID, name, version, source URL). A human reviewer decides what is real.\n\nIt is, as far as we know, the first developer-tier tool built directly on the OWASP Top 10 for Agentic Applications.\n\n---\n\n## ✨ Features\n\n| | |\n|---|---|\n| 🎯 **Action-level scanning** | Maps an agent's actual tool calls (`scan-session`), not just files at rest. Also scans repositories (`scan-repo`) and OTLP-exported telemetry off disk (`scan-otlp`, offline). |\n| 🧠 **Multi-action sequence** | Beyond single-action signals, an ordered second pass surfaces OWASP **ASI06 (Memory \u0026 Context Poisoning)** candidates (`AGT-MEM-001`): untrusted content followed by a write to a memory/RAG sink — candidate-only, never a runtime guarantee. [ADR-2](docs/adr/ADR-2-multi-action-sequence-matching.md) |\n| 🧬 **Trajectory taint-correlation** | A third, additive pass (`AGT-TRJ-001/002/003`) correlates an **injection marker in untrusted data the agent READ** (a `tool-result:` action) with a **later sensitive real-action sink** (exfil / destructive / financial) in the same stream. Post-hoc; recognisable-in-log ≠ would-have-prevented. [ADR-4](docs/adr/ADR-4-trajectory-taint-correlation.md) |\n| 🏷️ **Representation-aware taint** | The v2.1 sink parser emits a reserved `sink:` action carrying canonical role tokens (`recipient=` / `amount=` / `url=` / `command=`, with a `const SINK_GRAMMAR` authority boundary) and the AGT-TRJ rules ship a taxonomy-derived **generic injection-marker** vocabulary. Closes the v2.0 representation gap. [ADR-5](docs/adr/ADR-5-representation-aware-taint-and-evasion-robust-matching.md) |\n| 📊 **Real-trajectory measurement** | The v2.2 eval harness runs the **frozen** rules over real successful indirect-injection trajectories from last-gen frontier models (AgentDyn) and against live current-frontier models (OpenRouter) — bound triple + overlap-miss, no retro-fit. [ADR-6](docs/adr/ADR-6-real-trajectory-efficacy.md) |\n| 🐚 **Structural shell tokenizer** | A `shlex`-backed pass catches flag-reordered destructive commands a substring scan cannot (e.g. `rm -r -f` / `rm -fr` / quoted-arg variants), folded into `AGT-MIS-004`. |\n| 📑 **Cited candidates** | Every finding carries `{id, title, status, confidence, triggering_signal, citation(url+version), suggested_controls, cross_refs}`. No copyrighted framework prose is reproduced. |\n| 🧭 **10-framework crosswalk** | One signal resolves across OWASP Agentic, OWASP LLM, MITRE ATLAS, ISO 42001, EU AI Act, NIST, SOC 2 and ISO 27001 — see the [coverage table](#-framework-coverage). |\n| 🔌 **SARIF 2.1.0 output** | `--format sarif` is CI-ingestible by code scanning. Findings are `note`/`warning` — **never** `error`. A wrapping GitHub Action is included. |\n| 🔍 **Gap analysis** | `gap` lists carried controls with **no** candidate evidence — \"no signal observed for X\", never \"you fail X\". |\n| 📉 **Baseline diff** | `--baseline \u003cprior.json\u003e --only-new` reports only new findings via SARIF `baselineState`. |\n| 🎚️ **Tunable + suppressible** | `--min-confidence` / `--min-severity` thresholds and a visible-by-default suppression channel via `.apohara-compliance.toml`. |\n| 🦀 **Offline \u0026 deterministic** | Pure Rust, MSRV 1.74. No network, no API keys, no telemetry. Same input ⇒ same bytes out. |\n\n---\n\n## 🚀 Quick Start\n\n```sh\n# 1. Install the scanner (builds from source — lowest-trust path)\ncargo install apohara-compliance-scanner\n\n# 2. Audit an AI coding-agent session transcript\napohara-compliance-scanner scan-session ./session.jsonl --format md\n\n# 3. Audit a repository and emit SARIF for code scanning\napohara-compliance-scanner scan-repo . --format sarif \u003e results.sarif\n```\n\n\u003cdetails\u003e\n\u003csummary\u003e\u003cb\u003eAdvanced usage\u003c/b\u003e — formats, ASI view, diffing, gap analysis, config\u003c/summary\u003e\n\n```sh\n# Surface OWASP Agentic (ASI01–ASI10) risks directly alongside the AGT findings\napohara-compliance-scanner scan-session ./session.jsonl --by-asi --format md\n\n# Diff against a prior run — emit only NEW findings (SARIF baselineState)\napohara-compliance-scanner scan-repo . --format json \u003e baseline.json\napohara-compliance-scanner scan-repo . --baseline baseline.json --only-new --format sarif\n\n# Scan OTLP-exported telemetry (logs/traces) an OTel exporter wrote to disk.\n# Runtime coverage for the OFFLINE scanner — it reads FILES only, no socket.\n# Post-hoc and exporter-bounded; findings stay candidates, never real-time.\napohara-compliance-scanner scan-otlp ./otel-export.json --format sarif\napohara-compliance-scanner scan-otlp ./otel-logs/ --format md   # a directory of exports\n\n# Gap analysis: carried controls with no candidate evidence observed\napohara-compliance-scanner gap ./session.jsonl --format md\n\n# Thresholds, file-extension filter, and project config\napohara-compliance-scanner scan-repo . --ext rs,py --min-confidence 0.8 \\\n  --min-severity 3 --config .apohara-compliance.toml --format json\n```\n\n**Global flags:** `--format {json|sarif|md}` (default `json`) · `--by-asi` · `--baseline \u003cfile\u003e` · `--only-new` · `--min-confidence \u003cf\u003e` · `--min-severity \u003cn\u003e` · `--suppress \u003cfile\u003e` · `--config \u003cfile\u003e` · `--ext \u003clist\u003e` (repo only).\n\n**Other acquisition paths.** Pre-built, signed per-OS binaries are published on [Releases](https://github.com/SuarezPM/apohara-compliance/releases). It also installs as an agent skill/plugin.\n\n\u003e [!WARNING]\n\u003e Downloading a pre-built binary is itself a supply-chain surface — the very risk this tool flags. Verify the build attestation and checksum before running it (see **[SECURITY.md → How to verify a release](SECURITY.md#how-to-verify-a-release)**), or prefer `cargo install` and build from source.\n\n\u003c/details\u003e\n\n---\n\n## 🧭 Framework coverage\n\nCross-references resolve along the chain **ASI → OWASP-LLM → ATLAS → ISO 42001 → EU AI Act**, with NIST and audit-standard controls hanging off each node.\n\n| Framework | Version | Scope |\n|---|---|---|\n| **OWASP Top 10 for Agentic Applications** | **2026** (ASI01–ASI10) | Primary mapping target |\n| OWASP Agentic Skills Top 10 | 2026 (AST01–AST10) | Draft project |\n| OWASP Top 10 for LLM Applications | **2025** | LLM-layer cross-refs |\n| MITRE ATLAS | 5.6.0 | Adversarial ML techniques |\n| ISO/IEC 42001 | 2023 | AI management system |\n| EU AI Act | Regulation (EU) 2024/1689 | High-risk obligations |\n| NIST AI RMF | 1.0 | Govern / Map / Measure / Manage |\n| NIST SP 800-53 | Rev 5 | Security \u0026 privacy controls |\n| SOC 2 | AICPA TSC 2017 | Trust services criteria |\n| ISO/IEC 27001 | 2022 | Information security |\n\n---\n\n## 🔬 How it works / honesty\n\n\u003e [!WARNING]\n\u003e **This is a guidance and mapping tool. It is NOT a certification, an audit, or legal advice.** Running it does not make a project \"compliant\". Every finding is a _candidate_ surfaced for human review — never an assertion that a control is met or violated, and never a substitute for a qualified auditor or counsel.\n\n**Candidates only.** Findings are emitted as SARIF `note`/`warning`, never `error`, and every line is prefixed `CANDIDATE —`. A false positive is a \"please confirm\", not a wrong verdict.\n\n**Traceable provenance.** 49 carried controls each trace to a cited source. Each finding records `status: official` or `status: draft`. In particular, NIST `AGENTIC-*` controls are flagged **`draft`** — they derive from a **March-2026 CSA draft profile, not official NIST**, and the scanner says so on every such finding. IDs, names, and versions are cited; no copyrighted framework text is reproduced.\n\nThe detection engine is built up in additive passes, each documented in its own ADR. The passes are independent and the scanner emits a `CANDIDATE —` line for any of them that fires.\n\n### Pass 1 — single-action matching (v1.0)\n\nA regex + word-boundary + context engine matches each observed action against the carried rule set. On the committed synthetic corpus, the tuning removes the substring matcher's false positives without regressing recall:\n\n| Matcher (same synthetic corpus) | Precision | Recall |\n|---|---|---|\n| Naive substring baseline | 0.6389 | 0.9200 |\n| Tuned engine (regex + word-boundary + context) | **1.0000** | **1.0000** |\n\nThe build **fails below precision 0.85**. Full numbers, per-rule breakdown, the reproduction command, and the honest limitations are in **[BENCHMARK.md](BENCHMARK.md)** (the source of record).\n\n\u003e [!NOTE]\n\u003e Those are metrics on a **100% synthetic, hand-crafted fixture corpus** — fixture metrics, not a claim of real-world accuracy. The headline result is the **false-positive reduction** (baseline→tuned delta), not the absolute 1.00. The corpus and the context rules co-evolved, so a perfect tuned score is partly true by construction.\n\n### Pass 2 — multi-action sequence correlation (v1.1, `AGT-MEM-001`, ASI06)\n\nAn opt-in, additive pass correlates an _ordered_ pair — untrusted/unsanitized content **followed by** a write to a memory/RAG sink — to surface OWASP **ASI06 (Memory \u0026 Context Poisoning)** candidates. Sink coverage is bounded to what the transcript/telemetry surfaces (shell persist commands + exported OTLP records). Like every finding it is **candidate-only**: it flags content that _could_ poison future context, never a detection of activated cross-session poisoning. See [ADR-2](docs/adr/ADR-2-multi-action-sequence-matching.md).\n\n### Pass 3 — trajectory taint-correlation (v2.0, `AGT-TRJ-001/002/003`, ADR-4)\n\nThe taint engine expresses the **injection → consequence dataflow** the single-action and sequence passes cannot: a TAINTED source — an action on the untrusted-data `tool-result:` channel carrying injection markers (and **not** a doc/comment quote) — **followed by** a genuine sensitive real-action sink (exfil / destructive / financial) later in the same action stream. The taint persists across intervening steps (forward-correlated).\n\nA fired AGT-TRJ candidate means **\"untrusted-marked data was observed and a sensitive action followed\"** — a candidate injection→consequence correlation, **not** a verdict that the agent obeyed the injection. The module is self-contained in `crates/scanner/src/taint.rs` and runs after the single-action and sequence passes in `matching::match_actions_with_suppress`. See [ADR-4](docs/adr/ADR-4-trajectory-taint-correlation.md).\n\n\u003e **Honesty invariant.** Mechanism proven on synthetic positives; live MiniMax-M3 run on AgentDojo banking-suite `important_instructions` was **0 / 10** attack-success (the target refused every injection), so post-hoc detection on real successes is **0 / 0 — undefined** at v2.0. Real-world efficacy is **UNPROVEN at v2.0** (stated plainly in the PREREG + PROOF).\n\n### Pass 4 — representation-aware taint (v2.1, ADR-5)\n\nv2.1 closes the v2.0 **representation gap**. The parser now emits a reserved `sink:` action carrying a deterministic canonical role string (`recipient=` / `amount=` / `url=` / `command=`, with `const SINK_GRAMMAR` enforcing an authority boundary), and the AGT-TRJ rules gained a **taxonomy-derived generic injection-marker** vocabulary (OWASP ASI02:2026 / AITG-APP-02 / documented IPI canary families — each marker cited in `detection-rules.yaml`). The `sink:` channel is excluded from the single-action loop by a one-line `starts_with(\"sink:\")` guard, so the new representation **cannot** produce a single-action false positive (proven by the C1 FP-safety + C2 grammar-disjointness tests).\n\nMechanism + representation proven on a synthetic positive: `trj-representation-aware-positive.jsonl` fires AGT-TRJ-001 + AGT-TRJ-003; the FinBot direct-injection fixture (negative control) and the benign-trajectory trap fire **zero**. Pre-registered measurement on the committed AgentDojo corpus (frozen rules SHA `ac88825`, no LLM) confirmed the single-action recall is **unchanged** at **23 / 35 (0.657)** — the v2.1 work added **no** new single-action prose rules. See [ADR-5](docs/adr/ADR-5-representation-aware-taint-and-evasion-robust-matching.md).\n\n\u003e **Honesty invariant.** Mechanism + representation proven on synthetic positives; the AgentDojo committed corpus is **flat bait** (no `tool-result:` → `sink:` dataflow), so it has zero trajectory items to exercise the structured-sink representation on a real trace. Real-world efficacy is **still UNPROVEN at v2.1** (stated plainly).\n\n### Pass 5 — real-trajectory measurement (v2.2, ADR-6) — the bound triple + ceiling\n\nv2.2 closes the v2.0/v2.1 \"real-world efficacy UNPROVEN **by absence of any real trajectory**\" gap. The engine is run, with the **same frozen rules** (blob SHA `dcd1ac6`, frozen BEFORE scanning) and the apohara-agnostic `wrap_agentdojo_trace.py` wrapper, over two externally-labeled corpora. The number is reported as a **bound triple** + its representation overlap-miss, and the correlation-not-causation ceiling is stated as a **co-headline of equal prominence**.\n\n**HEADLINE.** apohara v2.1 post-hoc-recognises the injection → sink correlation in **169 / 236 (71.6 %)** of real successful indirect-injection trajectories from last-generation frontier models (AgentDyn open-ended suites). This closes the v2.0 \"absence\" gap — the mechanism fires on real traces, not only synthetic.\n\n**CO-HEADLINE LIMIT (equal prominence, never buried).** It ALSO fires on **28.7 % (659 / 2295)** of **resisted** injections and **1.4 % (5 / 352)** of benign traces. apohara is a **candidate injection → consequence correlation surfacer**, NOT a success / causation discriminator: a resisted injection still carries the marker in a tool-result AND the agent still performs a legitimate structured sink, so the marker → sink correlation fires in both succeeded and resisted cases. **precision-on-success ≈ 169 / (169+659+5) = 169/833 ≈ 20 %.** The discriminating signal (did the agent OBEY the injection) is not representable in a deterministic post-hoc text-pattern model — this is the **quantified ceiling**.\n\n**The bound triple — DOWNLOAD corpus** (AgentDyn `@ 5353cf7`, agentdojo 0.1.35, benchmark v1.2.2; attack `important_instructions`; **last-gen** models, date-labeled; open-ended suites; frozen rules `dcd1ac6`):\n\n| | result |\n|---|---|\n| (a) model attack-success-rate (AgentDyn labels, apohara-independent) | gpt-5.1-2025-11-13 **33 / 851** (3.9 %) · gpt-5-mini-2025-08-07 **2 / 560** (0.4 %) · gemini-2.5-pro **122 / 560** (21.8 %) · gemini-2.5-flash **79 / 560** (14.1 %) |\n| (b) apohara post-hoc AGT-TRJ detection on the 236 real successes | **169 / 236 (71.6 %)** — gpt-5.1 25/33 · gpt-5-mini 0/2 · gemini-2.5-pro 89/122 · gemini-2.5-flash 55/79 (AGT-TRJ-001 dominant, AGT-TRJ-003 secondary) |\n| (c) false-positive control | failed-injection (RESISTED) FP **659 / 2295 (28.7 %)** · benign FP **5 / 352 (1.4 %)** ⇒ precision-on-success **169/833 ≈ 20 %** |\n\n**Overlap-miss** (model-independent representation coverage of the 236 positives): marker `\u003cinformation\u003e` covered 232/236; role-mapped structured sink covered 180/236; BOTH 178/236; NEITHER 2/236. Covered sink roles: `url=170, recipient=60, amount=59, command=34`. MISSED arg-keys (OUTSIDE the frozen role map — the `iban`-analog): `path (161), subject (114), otp (87), title (79), body (68), recipients (68), repo_name (54), password (33)`. **Reported as-is, NEVER closed** — adding any of these after seeing traces would convert the number from a **MEASUREMENT** into a **FIT** (forbidden by the pre-registration).\n\n**The bound triple — LIVE current-frontier** (via OpenRouter; suite `workspace`; attack `important_instructions_no_model_name`; same frozen rules + frozen wrapper + release binary; current-frontier models, date-labeled: gpt-5.5, gemini-3.5-flash, gemini-3.1-pro-preview, MiniMax-M3, claude-opus-4.8):\n\n| | result |\n|---|---|\n| (a) attack-success TOTAL | **0 / 80 (0.0 %)** — EACH model 0 / 16 |\n| (b) apohara post-hoc detection on successes | **0 / 0 — UNDEFINED** (no live success to detect on) |\n| (c) false-positive control | failed-injection FP **0 / 80** · benign FP **0 / 15** (the download 28.7 % correlation-FP did **NOT** reproduce on this live set) |\n| real LIVE usage | **224 API calls, all HTTP 200; 698,959 tokens** (smoke + live; under the 1 M cap); key never logged |\n\n\u003e **CAVEAT (stated).** The live run used `suite=workspace` (the standard AgentDojo suite), NOT AgentDyn's harder open-ended suites (shopping / github / dailylife) where last-gen models reached 14–22 % ASR — because the current-frontier OpenRouter IDs are not in AgentDyn's model registry. So the live 0/80 is on the **easier standard suite**; current-frontier behaviour on the harder open-ended attack is **UNMEASURED** (a documented follow-up). The download corpus (last-gen, open-ended) remains the only set with real successes.\n\n\u003e **Claim ceiling (verbatim, ADR-6).** *\"deterministic, post-hoc, representation-aware injection → consequence CANDIDATE CORRELATION surfacer; mechanism + representation proven on synthetic positives; post-hoc recognition MEASURED on real successful trajectories (169/236, last-gen open-ended) with an explicit model-independent overlap-miss; ALSO fires on resisted (28.7 %) + benign (1.4 %) — a correlation surfacer, NOT a success / causation discriminator (precision-on-success ≈ 20 %); NOT efficacy / recall / prevention; recognisable-in-log ≠ would-have-prevented.\"*\n\nPre-registration (`tests/corpus/PREREG-v2.2-real-trajectory.md`, rules frozen at `dcd1ac6` **BEFORE** scanning, verified unchanged) and the schema-validated numbers-only report (`tests/corpus/v2.2-real-trajectory-report.json`) are committed; the AgentDyn trace content is gitignored. See [ADR-6](docs/adr/ADR-6-real-trajectory-efficacy.md) + `PROOF-v2.2-real-trajectory.md`.\n\n---\n\n## 🏗️ Repository layout\n\n```text\napohara-compliance/\n├── crates/scanner/                       # the deterministic Rust scanner\n│   ├── src/\n│   │   ├── cli.rs                        # clap CLI surface (scan-session / scan-repo / scan-otlp / gap)\n│   │   ├── matching.rs                   # regex + word-boundary + context engine (orchestrates the passes)\n│   │   ├── rules.rs                      # rule loading + resolution ladder\n│   │   ├── sequence.rs                   # Pass 2 — multi-action AGT-MEM-001 (ADR-2)\n│   │   ├── taint.rs                      # Pass 3-4 — trajectory taint + representation-aware (ADR-4/5)\n│   │   ├── shell.rs                      # structural `shlex` shell pass — flag-reorder evasions (v2.1)\n│   │   ├── model.rs                      # the candidate finding + rule data model\n│   │   ├── parse_session.rs              # tolerant NDJSON session-transcript reader\n│   │   ├── parse_otlp.rs                 # tolerant OTLP/JSON telemetry reader (offline, file-only)\n│   │   ├── parse_repo.rs                 # gitignore-respecting repo walker\n│   │   ├── baseline.rs                   # diff vs. a prior run (SARIF baselineState)\n│   │   ├── config.rs / gap.rs / suppress.rs / triage.rs\n│   │   └── format/                       # json · sarif · md · gap renderers\n│   ├── tests/\n│   │   ├── integration.rs                # unit + integration\n│   │   ├── precision_recall.rs           # CI-gated synthetic precision/recall (v1.0)\n│   │   ├── independent_corpus.rs         # AgentDojo / AgentHarm non-gating cross-check (v1.4)\n│   │   └── trajectory_corpus.rs          # v2.0/v2.1 trajectory + AGT-TRJ positive/negative fixtures\n│   └── references/                       # canonical framework rule + crosswalk YAML data\n├── docs/adr/                             # ADR-2 sequence · ADR-3 corpus · ADR-4 taint · ADR-5 repr · ADR-6 efficacy\n├── tests/corpus/                         # synthetic gate + AgentDojo + AgentHarm + v2.x PREREG/PROOF/report\n├── references/                           # canonical rule + mapping data (mirror, symlinked into the crate)\n├── skills/                               # installable agent skill\n├── action/                               # GitHub Action wrapper (uploads SARIF)\n├── tests/fixtures/                       # synthetic session + repo fixtures\n└── scripts/                              # capture + eval harness (FINBOT, v2.2 buckets, polarity gate, …)\n```\n\n---\n\n## 🗺️ Roadmap\n\n**Shipped** (on `main`, not all on crates.io/Releases — see badge)\n\n- [x] v1.0 — Action-level session scanning (`scan-session`), repo scanning (`scan-repo`), gap analysis (`gap`)\n- [x] v1.0 — SARIF 2.1.0 output + GitHub Action\n- [x] v1.0 — Committed synthetic precision/recall CI gate (precision floor 0.85, no-recall-regression bound)\n- [x] v1.0 — Baseline diffing (`--baseline` / `--only-new`)\n- [x] v1.0 — Signed per-OS release binaries with build attestation ([how to verify](SECURITY.md#how-to-verify-a-release))\n- [x] v1.0 — Per-rule precision reporting ([BENCHMARK.md](BENCHMARK.md))\n- [x] v1.1 — `scan-otlp` (OTLP-exported telemetry, offline) + `AGT-MEM-001` multi-action sequence pass (ADR-2)\n- [x] v1.1 — `SECURITY.md` (disclosure / threat model / supply-chain verify) + `BENCHMARK.md` (reproducible)\n- [x] v1.1 — OpenSSF Scorecard, Dependabot, CodeQL\n- [x] v1.4 — Independent corpora (AgentDojo + AgentHarm, non-gating) for prose-rule coverage (ADR-3)\n- [x] v2.0 — Trajectory taint-correlation engine (ADR-4): injection → consequence dataflow, post-hoc, offline\n- [x] v2.1 — Representation-aware taint (ADR-5): `sink:` channel + `const SINK_GRAMMAR` role tokens + generic injection-marker vocabulary + structural `shlex` shell pass (AGT-MIS-004)\n- [x] v2.2 — Real-trajectory measurement (ADR-6): bound triple on real AgentDyn successes (169/236) + live current-frontier cross-check (0/80 resisted); HONEST co-headline (28.7 % FP on resisted, ~20 % precision-on-success) — the framing IS the deliverable\n\n**Exploring** — demand-driven, not committed\n\n- [ ] v2.3 (proposed) — argument-value provenance discriminator to attack the 28.7 % correlation-FP (causal proxy, deterministic, offline). Pre-proposal at `.omc/plans/v2.3-followups.md` (consensus IN PROGRESS).\n- [ ] v2.3 (proposed) — current-frontier on the harder AgentDyn open-ended suites (shopping / github / dailylife). Blocked by AgentDyn's model registry not carrying current-frontier OpenRouter IDs.\n- [ ] v2.3 (proposed) — S2 shell AST escalation (conch-parser vendor) if the `shlex` pass proves insufficient on adversarial inputs.\n- [ ] Repo-file normalisation (ADR-5 M4 deferred gap) — A3 homoglyph / zero-width / casing runs in the session value picker only; a future PR extends it to `parse_repo` for the dominant indirect-injection surface.\n- [ ] Additional agent-transcript formats\n- [ ] First-mover OWASP Agentic Skills (AST01–AST10) rules once the draft stabilises\n\n---\n\n## 🤝 Contributing\n\nContributions are welcome.\n\n1. **Fork** the repository.\n2. Create a feature **branch** (`git checkout -b feature/my-change`).\n3. Make your change and run the tests: `cargo test` (the precision/recall gate + the trajectory + the independent-corpus gates all run here).\n4. Open a **pull request**.\n\n\u003e Unless you state otherwise, any contribution you intentionally submit for inclusion in this work, as defined in the Apache-2.0 license, shall be dual-licensed as below, without any additional terms or conditions.\n\n---\n\n## 📄 License\n\nLicensed under either of **[MIT](LICENSE-MIT)** or **[Apache-2.0](LICENSE-APACHE)**, at your option.\n\nMaintained by **[SuarezPM](https://github.com/SuarezPM)**.\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fsuarezpm%2Fapohara-compliance","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fsuarezpm%2Fapohara-compliance","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fsuarezpm%2Fapohara-compliance/lists"}