{"id":45742374,"url":"https://github.com/garagon/aguara","last_synced_at":"2026-05-17T06:04:37.182Z","repository":{"id":339182789,"uuid":"1160362952","full_name":"garagon/aguara","owner":"garagon","description":"Security scanner for AI agent skills and MCP servers. Static analysis, incident response, no LLM. One binary.   Detection engine behind oktsec.","archived":false,"fork":false,"pushed_at":"2026-05-12T22:49:32.000Z","size":584,"stargazers_count":76,"open_issues_count":0,"forks_count":15,"subscribers_count":1,"default_branch":"main","last_synced_at":"2026-05-12T23:05:44.365Z","etag":null,"topics":["ai-agents","ai-security","claude","data-exfiltration","devsecops","golang","mcp","mcp-server","model-context-protocol","prompt-injection","sast","security","security-scanner","static-analysis","supply-chain-security"],"latest_commit_sha":null,"homepage":"https://aguarascan.com","language":"Go","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"apache-2.0","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/garagon.png","metadata":{"files":{"readme":"README.md","changelog":"CHANGELOG.md","contributing":"CONTRIBUTING.md","funding":null,"license":"LICENSE","code_of_conduct":"CODE_OF_CONDUCT.md","threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":"SECURITY.md","support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null,"zenodo":null,"notice":null,"maintainers":null,"copyright":null,"agents":"AGENTS.md","dco":null,"cla":null}},"created_at":"2026-02-17T21:13:42.000Z","updated_at":"2026-05-12T22:49:37.000Z","dependencies_parsed_at":"2026-05-12T23:02:27.791Z","dependency_job_id":null,"html_url":"https://github.com/garagon/aguara","commit_stats":null,"previous_names":["garagon/aguara"],"tags_count":25,"template":false,"template_full_name":null,"purl":"pkg:github/garagon/aguara","repository_url":"https://repos.ecosyste.ms/a
pi/v1/hosts/GitHub/repositories/garagon%2Faguara","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/garagon%2Faguara/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/garagon%2Faguara/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/garagon%2Faguara/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/garagon","download_url":"https://codeload.github.com/garagon/aguara/tar.gz/refs/heads/main","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/garagon%2Faguara/sbom","scorecard":null,"host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":286080680,"owners_count":33129101,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2026-05-16T18:38:32.183Z","status":"online","status_checked_at":"2026-05-17T02:00:05.366Z","response_time":107,"last_error":null,"robots_txt_status":"success","robots_txt_updated_at":"2025-07-24T06:49:26.215Z","robots_txt_url":"https://github.com/robots.txt","online":true,"can_crawl_api":true,"host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["ai-agents","ai-security","claude","data-exfiltration","devsecops","golang","mcp","mcp-server","model-context-protocol","prompt-injection","sast","security","security-scanner","static-analysis","supply-chain-security"],"created_at":"2026-02-25T15:17:49.407Z","updated_at":"2026-05-17T06:04:37.159Z","avatar_url":"https://github.com/garagon.png","language":"Go","funding_links":[],"categories":["🔍 Static Analysis \u0026 Linters","Testing Tools","Tools"],"sub_categories":["Common Lisp"],"readme":"\u003cp align=\"center\"\u003e\n  \u003ch1 
align=\"center\"\u003eAguara\u003c/h1\u003e\n  \u003cp align=\"center\"\u003e\n    Security scanner for AI agent skills and MCP servers.\n    \u003cbr /\u003e\n    Detect prompt injection, data exfiltration, and supply-chain attacks before they reach production.\n  \u003c/p\u003e\n\u003c/p\u003e\n\n\u003cp align=\"center\"\u003e\n  \u003ca href=\"https://github.com/garagon/aguara/actions/workflows/ci.yml\"\u003e\u003cimg src=\"https://github.com/garagon/aguara/actions/workflows/ci.yml/badge.svg\" alt=\"CI\"\u003e\u003c/a\u003e\n  \u003ca href=\"https://goreportcard.com/report/github.com/garagon/aguara\"\u003e\u003cimg src=\"https://goreportcard.com/badge/github.com/garagon/aguara\" alt=\"Go Report Card\"\u003e\u003c/a\u003e\n  \u003ca href=\"https://pkg.go.dev/github.com/garagon/aguara\"\u003e\u003cimg src=\"https://pkg.go.dev/badge/github.com/garagon/aguara.svg\" alt=\"Go Reference\"\u003e\u003c/a\u003e\n  \u003ca href=\"https://github.com/garagon/aguara/releases\"\u003e\u003cimg src=\"https://img.shields.io/github/v/release/garagon/aguara\" alt=\"GitHub Release\"\u003e\u003c/a\u003e\n  \u003ca href=\"LICENSE\"\u003e\u003cimg src=\"https://img.shields.io/github/license/garagon/aguara\" alt=\"License\"\u003e\u003c/a\u003e\n  \u003ca href=\"https://github.com/garagon/aguara/stargazers\"\u003e\u003cimg src=\"https://img.shields.io/github/stars/garagon/aguara?style=flat\" alt=\"GitHub Stars\"\u003e\u003c/a\u003e\n  \u003ca href=\"https://github.com/garagon/aguara/blob/main/Dockerfile\"\u003e\u003cimg src=\"https://img.shields.io/badge/docker-ghcr.io%2Fgaragon%2Faguara-blue?logo=docker\" alt=\"Docker\"\u003e\u003c/a\u003e\n  \u003ca href=\"#installation\"\u003e\u003cimg src=\"https://img.shields.io/badge/homebrew-garagon%2Ftap-orange\" alt=\"Homebrew\"\u003e\u003c/a\u003e\n\u003c/p\u003e\n\n\u003cp align=\"center\"\u003e\n  \u003ca href=\"#installation\"\u003eInstallation\u003c/a\u003e \u0026bull;\n  \u003ca href=\"#quick-start\"\u003eQuick Start\u003c/a\u003e 
\u0026bull;\n  \u003ca href=\"#how-it-works\"\u003eHow It Works\u003c/a\u003e \u0026bull;\n  \u003ca href=\"#usage\"\u003eUsage\u003c/a\u003e \u0026bull;\n  \u003ca href=\"#rules\"\u003eRules\u003c/a\u003e \u0026bull;\n  \u003ca href=\"#supply-chain-check\"\u003eSupply-Chain Check\u003c/a\u003e \u0026bull;\n  \u003ca href=\"#aguara-mcp\"\u003eAguara MCP\u003c/a\u003e \u0026bull;\n  \u003ca href=\"#aguara-watch\"\u003eAguara Watch\u003c/a\u003e \u0026bull;\n  \u003ca href=\"#contributing\"\u003eContributing\u003c/a\u003e\n\u003c/p\u003e\n\nhttps://github.com/user-attachments/assets/851333be-048f-48fa-aaf3-f8cc1d4aa594\n\n## Why Aguara?\n\nAI agents and MCP servers run code on your behalf. A single malicious skill file can exfiltrate credentials, inject prompts, or install backdoors. Aguara catches these threats **before deployment** with static analysis that requires no API keys, no cloud, and no LLM.\n\n- **193 detection rules across 13 categories** — prompt injection, data exfiltration, credential leaks, supply-chain attacks, MCP-specific threats, command execution, SSRF, unicode attacks, and more.\n- **7 scan analyzers** — pattern matching, GitHub Actions trust-chain detection (ci-trust), npm package metadata (pkgmeta), JavaScript payload risk (jsrisk), NLP analysis, taint tracking, and rug-pull detection work together to catch threats that any single technique would miss. Separate `aguara check` / `aguara audit` commands flag installed npm and Python environments AND lockfiles for Go, Rust, PHP/Composer, Ruby/Bundler, Java/Maven/Gradle, and .NET/NuGet against the embedded threat-intel snapshot. Built from [OSV.dev](https://osv.dev) malicious-package records plus OpenSSF Malicious Packages and hand-curated emergency advisories. Offline by default.\n- **8 decoders for encoded evasion** — base64, hex, URL encoding, Unicode escapes, HTML entities, hex escapes, base32, and C-style octal escapes. 
Obfuscated payloads are decoded and re-scanned automatically.\n- **NLP on markdown, JSON, and YAML** — goldmark AST analysis for markdown files, plus string extraction and classification for JSON/YAML tool descriptions. Catches MCP tool poisoning in structured configs.\n- **Cross-file toxic flow analysis** — detects dangerous capability combinations split across files in the same MCP server directory (e.g., one tool reads credentials, another sends to a webhook).\n- **Aggregate risk score** — 0-100 score with diminishing returns across findings. Available in JSON, SARIF, and terminal output.\n- **Context-aware scanning** — pass the tool name (`--tool-name Edit`) and the scanner automatically skips rules that are always false positives for that tool. Built-in exemptions for Edit, Write, WebFetch, Bash, and more.\n- **Scan profiles** — `strict` (default), `content-aware`, or `minimal` enforcement. Findings are always preserved for audit; only the verdict (clean/flag/block) changes.\n- **Evasion prevention** — NFKC normalization catches fullwidth character evasion. 8 decoders catch encoded payloads. Crypto address filtering prevents hex decoder false positives.\n- **Dynamic confidence scoring** — every finding carries a confidence level (0.50-0.95) that reflects signal quality: pattern hit ratio, classifier score, and code-block awareness.\n- **Remediation guidance** — all 193 rules include actionable fix suggestions, shown in every output format.\n- **Deterministic** — same input, same output. Every scan is reproducible.\n- **CI-ready** — JSON, SARIF, and Markdown output. GitHub Action. `--fail-on` threshold. `--changed` for incremental scans.\n- **17 MCP clients supported** — auto-discover and scan configs from Claude Desktop, Cursor, VS Code, Windsurf, and 13 more.\n- **Library API for embedding** — `WithDeduplicateMode()` preserves all cross-rule findings for verdict pipelines. 
`WithStateDir()` enables rug-pull detection for persistent consumers.\n- **Extensible** — write custom rules in YAML. No code required.\n\n## Installation\n\n```bash\ncurl -fsSL https://raw.githubusercontent.com/garagon/aguara/main/install.sh | sh\n```\n\nInstalls the latest binary to `~/.local/bin`. Customize with environment variables:\n\n```bash\ncurl -fsSL https://raw.githubusercontent.com/garagon/aguara/main/install.sh | VERSION=v0.17.0 sh\ncurl -fsSL https://raw.githubusercontent.com/garagon/aguara/main/install.sh | INSTALL_DIR=/usr/local/bin sh\n```\n\nTo update an existing install, rerun the installer. It downloads the selected release archive, verifies `checksums.txt`, and replaces the binary:\n\n```bash\n# Update to latest\ncurl -fsSL https://raw.githubusercontent.com/garagon/aguara/main/install.sh | sh\n\n# Update/pin to a specific release\ncurl -fsSL https://raw.githubusercontent.com/garagon/aguara/main/install.sh | VERSION=v0.17.0 sh\n```\n\n### Alternative methods\n\n**Homebrew** (macOS/Linux):\n\n```bash\nbrew install garagon/tap/aguara\n```\n\n**Docker** (no install required):\n\n```bash\n# Scan current directory\ndocker run --rm -v \"$(pwd)\":/scan ghcr.io/garagon/aguara scan /scan\n\n# Scan with options\ndocker run --rm -v \"$(pwd)\":/scan ghcr.io/garagon/aguara scan /scan --severity high --format json\n\n# Use a specific version\ndocker run --rm -v \"$(pwd)\":/scan ghcr.io/garagon/aguara:0.17.0 scan /scan\n```\n\n**From source** (requires Go 1.25+):\n\n```bash\ngo install github.com/garagon/aguara/cmd/aguara@latest\n```\n\nPre-built binaries for Linux, macOS, and Windows are also available on the [Releases page](https://github.com/garagon/aguara/releases).\n\n### Verifying signed releases\n\nStarting with the next release after this section landed, every release is signed with [Cosign](https://github.com/sigstore/cosign) keyless, ships an SPDX SBOM per archive, and is built with `-trimpath` for reproducibility. 
The container image is signed at the digest and carries SBOM + SLSA provenance attestations.\n\n**Verify the release archive**:\n\n```bash\nVERSION=vX.Y.Z\nARCHIVE=aguara_${VERSION#v}_linux_amd64.tar.gz\n\n# Download archive, checksums, and the cosign bundle\ncurl -fsSLO https://github.com/garagon/aguara/releases/download/${VERSION}/${ARCHIVE}\ncurl -fsSLO https://github.com/garagon/aguara/releases/download/${VERSION}/checksums.txt\ncurl -fsSLO https://github.com/garagon/aguara/releases/download/${VERSION}/checksums.txt.bundle\n\n# Verify checksums.txt was signed by the GitHub Actions release workflow\ncosign verify-blob \\\n  --bundle checksums.txt.bundle \\\n  --certificate-identity \"https://github.com/garagon/aguara/.github/workflows/release.yml@refs/tags/${VERSION}\" \\\n  --certificate-oidc-issuer \"https://token.actions.githubusercontent.com\" \\\n  checksums.txt\n\n# Now verify the archive matches the signed checksums\nsha256sum --check --ignore-missing checksums.txt\n```\n\n**Verify the container image**:\n\n```bash\ncosign verify ghcr.io/garagon/aguara:${VERSION#v} \\\n  --certificate-identity \"https://github.com/garagon/aguara/.github/workflows/docker.yml@refs/tags/${VERSION}\" \\\n  --certificate-oidc-issuer \"https://token.actions.githubusercontent.com\"\n```\n\n**Inspect the SBOM and provenance**:\n\n```bash\n# Release archive SBOM (SPDX 2.3)\ncurl -fsSL https://github.com/garagon/aguara/releases/download/${VERSION}/${ARCHIVE}.sbom.json | jq .\n\n# Container image SBOM and SLSA build provenance.\n# docker/build-push-action publishes these as BuildKit attestation manifests\n# (in-toto / SLSA spec) attached to the OCI image index, not as cosign\n# attestations. 
Use `docker buildx imagetools inspect` to read them:\ndocker buildx imagetools inspect ghcr.io/garagon/aguara:${VERSION#v} \\\n  --format '{{ json .SBOM }}' | jq .\ndocker buildx imagetools inspect ghcr.io/garagon/aguara:${VERSION#v} \\\n  --format '{{ json .Provenance }}' | jq .\n```\n\n`install.sh` performs SHA256 verification automatically and aborts if `sha256sum`/`shasum` is unavailable, so the curl-pipe install path is never silently downgraded.\n\n## Quick Start\n\n```bash\n# Am I exposed to a compromised package?\naguara check\n\n# Same, but refresh threat intel from OSV first\naguara check --fresh\n\n# CI gate: fail if a compromised package is installed\naguara check --ci\n\n# Audit: scan code + check packages, single verdict\naguara audit\n\n# Is Aguara's threat intel fresh?\naguara status\n\n# Pull the latest threat intel for future offline checks\naguara update\n```\n\n```bash\n# Discover which MCP clients are configured on this machine\naguara discover\n\n# Scan a skills directory (or any path) for content threats\naguara scan .claude/skills/\n\n# CI mode for content scan: --fail-on high, no color\naguara scan .claude/skills/ --ci\n\n# Auto-discover and scan all MCP configs on this machine\naguara scan --auto\n```\n\nBoth `aguara check` and `aguara audit` run offline by default using the\nthreat intel baked into the binary. The network is used only when you\nopt in with `--fresh` or `aguara update`.\n\n## How It Works\n\nAguara runs 6 scan analyzers sequentially on every file by default; a 7th (Rug-Pull) joins when `--monitor` is enabled and a state store is configured. Each catches a different class of attack:\n\n| Analyzer | Engine | What it catches |\n|----------|--------|-----------------|\n| **Pattern Matcher** | Regex + Aho-Corasick matching | Known attack signatures, credential patterns, dangerous commands. Aho-Corasick automaton for O(n+m) multi-pattern search. 
8 decoders (base64, hex, URL encoding, Unicode escapes, HTML entities, hex escapes, base32, octal escapes) decode obfuscated payloads and re-scan. Code-block severity downgrade. Dynamic confidence based on pattern hit ratio. |\n| **CI Trust** | GitHub Actions YAML parser | `pull_request_target` chains, cache poisoning across fork boundaries, OIDC token surface paired with install/build/test, persisted-credentials checkouts on PR head refs. |\n| **PkgMeta** | `package.json` JSON parser | npm install-time lifecycle scripts plus git-sourced dependencies, optional-git deps with suspicious names, publish surfaces paired with trusted-publishing references. |\n| **JSRisk** | JavaScript single-pass scanner | Obfuscator-shape payloads, install-time daemonization via `child_process`, CI secret harvesting through real `process.env` reads plus network/registry sinks, runner-process memory pivots to extract OIDC tokens, Claude Code / VS Code workspace persistence. |\n| **NLP Analyzer** | Goldmark AST + JSON/YAML extraction | Prompt injection in markdown structure, plus tool poisoning in JSON/YAML description fields. Keyword classification with proximity weighting; clustered keywords score higher, sparse keywords in long text get penalized. |\n| **Taint Tracker** | Source-to-sink flow analysis | Dangerous capability combinations within a single file and across files in the same directory. Detects credential reads paired with webhook sends, env vars flowing to shell execution, destructive plus exec combos across MCP server tools. |\n| **Rug-Pull Detector** | SHA256 hash tracking | Tool descriptions that change between scans. CLI: `--monitor` flag. Library: `WithStateDir()` for persistent consumers. 
|\n\nSeparate `aguara check` and `aguara audit` commands inspect installed\npackage trees (Python `site-packages`, npm `node_modules` including the\npnpm `.pnpm` store) AND walk the repo recursively for Go, Rust, PHP,\nRuby, Java, and .NET lockfiles, matching every declared package against\nthe embedded threat-intel snapshot. See\n[Supply-Chain Check](#supply-chain-check) for the full surface.\n\nAll content is NFKC-normalized before scanning to prevent Unicode evasion attacks. All layers report findings with severity, dynamic confidence score (0.50-0.95), matched text, file location with context lines, and remediation guidance. An aggregate risk score (0-100) summarizes overall threat level.\n\n## Usage\n\n```\naguara scan [path] [flags]\n\nFlags:\n      --auto                  Auto-discover and scan all MCP client configs\n      --severity string       Minimum severity to report: critical, high, medium, low, info (default \"info\")\n      --format string         Output format: terminal, json, sarif, markdown (default \"terminal\")\n  -o, --output string         Output file path (default: stdout)\n      --workers int           Number of worker goroutines (default: NumCPU)\n      --rules string          Additional rules directory\n      --disable-rule strings  Rule IDs to disable (comma-separated, repeatable)\n      --max-file-size string  Maximum file size to scan (e.g. 50MB, 100MB; default 50MB, range 1MB-500MB)\n      --tool-name string      Tool context for false-positive reduction (e.g. 
Bash, Edit, WebFetch)\n      --profile string        Scan profile: strict (default), content-aware, minimal\n      --no-color              Disable colored output\n      --no-update-check       Disable automatic update check (also: AGUARA_NO_UPDATE_CHECK=1)\n      --fail-on string        Exit code 1 if findings at or above this severity\n      --ci                    CI mode: --fail-on high --no-color\n      --changed               Only scan git-changed files\n      --monitor               Enable rug-pull detection: track file hashes across runs\n  -v, --verbose               Show rule descriptions, confidence scores, and remediation\n  -h, --help                  Help\n```\n\n### Output Formats\n\n| Format | Flag | Use case |\n|--------|------|----------|\n| **Terminal** | `--format terminal` (default) | Human-readable with color, severity dashboard, top-files chart |\n| **JSON** | `--format json` | Machine processing, API integration, custom tooling |\n| **SARIF** | `--format sarif` | GitHub Code Scanning, IDE integrations, SAST dashboards |\n| **Markdown** | `--format markdown` | GitHub Actions job summaries, PR comments |\n\n### MCP Client Discovery\n\nAguara auto-detects MCP configurations across **17 clients**: Claude Desktop, Cursor, VS Code, Cline, Windsurf, OpenClaw, OpenCode, Zed, Amp, Gemini CLI, Copilot CLI, Amazon Q, Claude Code, Roo Code, Kilo Code, BoltAI, and JetBrains.\n\n```bash\n# List all detected MCP configs\naguara discover\n\n# JSON output (sensitive env values are automatically redacted)\naguara discover --format json\n\n# Markdown output\naguara discover --format markdown\n\n# Discover + scan in one command\naguara scan --auto\n```\n\n### CI Integration\n\n#### GitHub Action\n\n```yaml\n- uses: garagon/aguara@v0.17.0\n  with:\n    path: .\n    fail-on: high\n    version: v0.17.0\n```\n\nBoth pins (the action ref AND the `version:` input) are required. 
The\naction ref alone pins only the composite action and its install\nscript; `version:` pins the Aguara binary the action installs. Setting\nboth makes the workflow reproducible and dependabot-friendly: when\nv0.17.1 lands, the bot updates both together.\n\nScans your repository, uploads findings to GitHub Code Scanning, and\noptionally fails the build:\n\n```yaml\n- uses: garagon/aguara@v0.17.0\n  with:\n    path: ./mcp-server/\n    severity: medium\n    fail-on: high\n    version: v0.17.0\n```\n\nAll inputs are optional. See [`action.yml`](action.yml) for the full list.\n\n| Input | Default | Description |\n|-------|---------|-------------|\n| `path` | `./` | Path to scan |\n| `severity` | `info` | Minimum severity to report |\n| `fail-on` | _(none)_ | Fail if findings at or above this severity |\n| `format` | `sarif` | Output format: sarif, json, terminal, markdown |\n| `upload-sarif` | `true` | Upload SARIF to GitHub Code Scanning |\n| `version` | _(latest)_ | Pin a specific Aguara version |\n\n\u003e **Note**: SARIF upload requires the `security-events: write` permission and is free for public repositories.\n\n#### Docker in CI\n\n```yaml\n# GitHub Actions with Docker (no install step)\n- name: Scan for security issues\n  run: docker run --rm -v \"${{ github.workspace }}\":/scan ghcr.io/garagon/aguara scan /scan --ci\n```\n\n#### Manual / GitLab CI\n\n```yaml\n# GitHub Actions (without the action)\n- name: Scan skills for security issues\n  run: |\n    curl -fsSL https://raw.githubusercontent.com/garagon/aguara/main/install.sh | sh\n    aguara scan .claude/skills/ --ci\n```\n\n```yaml\n# GitLab CI\nsecurity-scan:\n  script:\n    - curl -fsSL https://raw.githubusercontent.com/garagon/aguara/main/install.sh | sh\n    - aguara scan .claude/skills/ --format sarif -o gl-sast-report.sarif --fail-on high\n  artifacts:\n    reports:\n      sast: gl-sast-report.sarif\n```\n\n### Configuration\n\nCreate `.aguara.yml` in your project root:\n\n```yaml\nseverity: 
medium\nfail_on: high\nmax_file_size: 104857600  # 100 MB (default: 50 MB, range: 1 MB-500 MB)\nignore:\n  - \"vendor/**\"\n  - \"node_modules/**\"\nrule_overrides:\n  CRED_004:\n    severity: low\n  EXTDL_004:\n    disabled: true\n  TC-005:\n    apply_to_tools: [\"Bash\"]       # only enforce on Bash\n  MCPCFG_004:\n    exempt_tools: [\"WebFetch\"]     # enforce on everything except WebFetch\n```\n\n`apply_to_tools` and `exempt_tools` are mutually exclusive per rule. They filter findings at scan time when a tool name is provided via `--tool-name` or the library API.\n\n### Inline Ignore\n\nSuppress specific findings directly in your source files using inline comments:\n\n```yaml\n# aguara-ignore CRED_004\napi_key: \"sk-test-1234567890\"  # this finding is suppressed\n```\n\n```markdown\n\u003c!-- aguara-ignore-next-line PROMPT_INJECTION_001 --\u003e\nIgnore all previous instructions (this is a test)\n```\n\nSupported directives:\n\n| Directive | Effect |\n|-----------|--------|\n| `# aguara-ignore RULE_ID` | Suppress rule on the same line |\n| `# aguara-ignore RULE_ID, RULE_ID2` | Suppress multiple rules on the same line |\n| `# aguara-ignore-next-line RULE_ID` | Suppress rule on the next line |\n| `# aguara-ignore` | Suppress all rules on the same line |\n| `\u003c!-- aguara-ignore RULE_ID --\u003e` | HTML/Markdown comment variant |\n| `// aguara-ignore RULE_ID` | C-style comment variant |\n\n## Rules\n\n193 built-in rules across 13 pattern-rule categories (what `aguara list-rules` enumerates) plus the toxic-flow chain analyzer with its own emit-time category. 
The table groups coverage by emit-time category for readability:\n\n| Category | Rules | What it detects |\n|----------|-------|-----------------|\n| Credential Leak | 22 | API keys (OpenAI, AWS, GCP, Stripe, ...), private keys, DB strings, HMAC secrets |\n| Prompt Injection | 18 + NLP | Instruction overrides, role switching, delimiter injection, jailbreaks, event injection |\n| Supply Chain | 24 | Download-and-execute, reverse shells, sandbox escape, symlink attacks, privilege escalation, OIDC token vars, runner-pivot memory, Claude Code persistence path |\n| External Download | 16 | Binary downloads, curl-pipe-shell, auto-installs, profile persistence |\n| MCP Attack | 16 | Tool injection, name shadowing, canonicalization bypass, capability escalation |\n| Data Exfiltration | 16 + NLP | Webhook exfil, DNS tunneling, sensitive file reads, env var leaks |\n| Command Execution | 16 | shell=True, eval, subprocess, child_process, PowerShell |\n| MCP Config | 13 | Unpinned npx/uvx servers, hardcoded secrets, Docker cap-add, host networking, pip without hashes |\n| Indirect Injection | 10 | Fetch-and-follow, remote config, DB-driven instructions, webhook registration |\n| SSRF \u0026 Cloud | 11 | Cloud metadata, IMDS, Docker socket, internal IPs, redirect following |\n| Third-Party Content | 10 | eval with external data, unsafe deserialization, missing SRI, HTTP downgrade |\n| Unicode Attack | 10 | RTL override, bidi, homoglyphs, zero-width sequences, normalization bypass |\n| Supply Chain Exfil | 11 | Credential file reads, .pth executable code, bulk env collection, K8s secrets access, systemd persistence, archive+POST exfil, Session-Network endpoints |\n| Toxic Flow | 3 + cross-file | Single-file taint tracking plus cross-file correlation across MCP server directories |\n\nSee [RULES.md](RULES.md) for the complete rule catalog with IDs and severity levels.\n\n### Remediation Guidance\n\nAll 193 rules include remediation text. 
It appears in every output format:\n\n- **Terminal**: always shown for CRITICAL findings, shown for all severities with `--verbose`\n- **JSON**: included in every finding object\n- **SARIF**: mapped to the `help` field on each rule\n- **Markdown**: shown for HIGH and CRITICAL findings\n- **Explain**: `aguara explain RULE_ID` shows the full remediation text\n\n```bash\n# See remediation for a specific rule\naguara explain CRED_002\n\n# Terminal output with remediation for all findings\naguara scan . --verbose\n```\n\n```json\n{\n  \"rule_id\": \"PROMPT_INJECTION_001\",\n  \"severity\": 4,\n  \"matched_text\": \"Ignore all previous instructions\",\n  \"remediation\": \"Remove instruction override text. If this is documentation, wrap it in a code block to indicate it is an example.\",\n  \"confidence\": 0.95\n}\n```\n\n### Custom Rules\n\n```yaml\nid: CUSTOM_001\nname: \"Internal API endpoint\"\ndescription: \"Detects references to internal APIs\"\nseverity: HIGH\ncategory: custom\ntargets: [\"*.md\", \"*.txt\"]\nmatch_mode: any\nremediation: \"Replace internal API URLs with the public endpoint or environment variable.\"\npatterns:\n  - type: regex\n    value: \"https?://internal\\\\.mycompany\\\\.com\"\n  - type: contains\n    value: \"api.internal\"\nexclude_patterns:            # optional: suppress match in these contexts\n  - type: contains\n    value: \"## documentation\"\nexamples:\n  true_positive:\n    - \"Fetch data from https://internal.mycompany.com/api/users\"\n  false_positive:\n    - \"Our public API is at https://api.mycompany.com\"\n```\n\n`exclude_patterns` suppress a match when the matched line (or up to 3 lines before it) matches any exclude pattern. 
Useful for reducing false positives in documentation headings, installation guides, etc.\n\nCustom rules are validated at load time: unknown YAML fields are rejected, and all rules require `id`, `name`, `category`, and at least one pattern.\n\n```bash\naguara scan .claude/skills/ --rules ./my-rules/\n```\n\n## Supply-Chain Check\n\nIn v0.17, `aguara check .` walks the dependency surface of a modern repo\ninstead of stopping at npm/Python.\n\nAguara ships with native threat intel built from [OSV.dev](https://osv.dev)'s\npublic vulnerability database and OpenSSF Malicious Packages records, plus a\nhand-curated list of high-priority emergency advisories (event-stream,\nnode-ipc 2022 and 2026, litellm). `aguara check` and `aguara audit` both run\n**offline by default** because the binary carries the snapshot. Runtime updates are opt-in.\n\nThe check commands are organised by user intent.\n\n### `aguara check` — am I exposed?\n\n```bash\n# Run from a repo root. Aguara finds installed npm / Python environments\n# AND lockfiles for Go, Rust, PHP, Ruby, Java, and .NET recursively under\n# the path, then matches every declared package against the snapshot.\naguara check .\n\n# Refresh threat intel from OSV first, then check. The only check mode\n# that uses the network; the rest stay offline. 
--fresh refreshes only\n# the ecosystems the plan actually touches.\naguara check --fresh\n\n# CI gate: --fail-on critical, no color, exit 1 on compromised packages\naguara check --ci\n\n# Constrain to specific ecosystems (repeatable or comma-separated)\naguara check --ecosystem go,ruby\naguara check --ecosystem maven --ecosystem nuget\n\n# Machine-readable\naguara check --format json\n```\n\nWhat it checks:\n- **Known compromised package versions** across all supported ecosystems\n  (manual advisories + OSV malicious-package records + OpenSSF Malicious\n  Packages origins).\n- **`.pth` files with executable code** (import, subprocess, exec, eval).\n  Python-only.\n- **pip/uv/npx caches** so a compromised package in the cache surfaces\n  even without a virtualenv. Python / npm only.\n- **Persistence backdoors** (systemd user services, sysmon artifacts).\n  Python-only.\n- **Credential files at risk** (SSH, AWS, K8s, git, npm, PyPI, databases).\n  Python-only.\n\nCoverage by ecosystem:\n\n| Ecosystem | Evidence read | Coverage today |\n|---|---|---|\n| npm | `node_modules` (incl. pnpm `.pnpm` store) | Strong malicious-package coverage |\n| PyPI | `site-packages`, `.pth`, pip/uv/npx caches | Strong malicious-package + persistence coverage |\n| RubyGems | `Gemfile.lock` | Strong malicious-package coverage |\n| NuGet | `packages.lock.json`, `*.csproj`/`*.fsproj`/`*.vbproj` | Strong exact-version malicious-package coverage |\n| Go | `go.sum`, `go.mod` | Parser ready; limited exact-version embedded matches today |\n| crates.io | `Cargo.lock` (public registry only) | Parser ready; range-aware OSV matching deferred |\n| Packagist | `composer.lock` | Parser ready; range-aware OSV matching deferred |\n| Maven | `pom.xml`, `gradle.lockfile`, `gradle/dependency-locks/*` | Parser ready; range-aware OSV matching deferred |\n\nAguara is not claiming to be a full SCA vulnerability scanner yet. v0.17\nadds offline malicious-package checks across ecosystems. 
General\nCVE/range matching is the next layer.\n\nAuto-detection rules (in order):\n- If the path is or contains `node_modules`, run the npm check.\n- Walk the path recursively for lockfiles (`go.sum`, `Cargo.lock`,\n  `composer.lock`, `Gemfile.lock`, `pom.xml`, `gradle.lockfile`,\n  `gradle/dependency-locks/*.lockfile`, `packages.lock.json`,\n  `*.csproj`/`*.fsproj`/`*.vbproj`). Skip `.git/`, `vendor/`,\n  `node_modules/`, `.aguara/`, `target/`, `bin/`, `obj/`, `.gradle/`.\n- If the explicit `--path` looks like a Python install\n  (`site-packages` / `dist-packages` basename, or contains `*.dist-info`),\n  run the Python check.\n- If `aguara check` runs with no flags and no signals are found, fall\n  back to global Python `site-packages` autodiscovery (legacy\n  behaviour).\n\nAn explicit `--path` with no signals returns a clean result with\n`\"ecosystems\": []` and never silently falls back to the host's global\nPython.\n\n### `aguara audit` — code AND packages, one verdict\n\n```bash\naguara audit          # check + scan on the current directory\naguara audit --ci     # CI gate: --fail-on critical, no color\naguara audit --fresh  # refresh intel, then audit\n```\n\n`aguara audit` composes the supply-chain check and the content scan into a\nsingle verdict. JSON output carries both sub-results (`.check` and `.scan`)\nplus per-section counts so a dashboard can drill into either side.\n\n### `aguara status` — is my threat intel fresh?\n\n```bash\naguara status\n```\n\nPrints the Aguara version, the embedded snapshot's generated-at date and\nrecord count, and whether a local cached snapshot exists from a prior\n`aguara update` run. 
Does no network I/O.\n\n### `aguara update` — refresh intel for future offline checks\n\n```bash\naguara update                       # fetch every supported ecosystem, cache locally\naguara update --ecosystem npm       # just npm\naguara update --ecosystem go,ruby   # scope to Go + RubyGems\n```\n\n`aguara update` and the `--fresh` flag are the only entry points that use the network.\nThe default refreshes every ecosystem the registry supports (npm, PyPI,\nGo, crates.io, Packagist, RubyGems, Maven, NuGet); scope with `--ecosystem`\n(repeatable or comma-separated). The refreshed cache lives at\n`~/.aguara/intel/snapshot.json`; subsequent `aguara check` runs layer it\nover the embedded snapshot automatically and stay offline.\n\n`aguara check --fresh` refreshes only the ecosystems the plan actually\ntouches, so `aguara check --fresh --ecosystem maven` does not pull npm,\nPyPI, or the other five.\n\nIf a refresh returns zero records (upstream outage, schema shift), the\nupdate is refused so cached intel cannot be silently wiped. Pass\n`--allow-empty` to override during initial bootstrap.\n\n### `aguara clean` — quarantine compromised packages\n\n```bash\naguara clean                       # interactive confirmation\naguara clean --yes --purge-caches  # non-interactive, also purge pip/uv caches\naguara clean --dry-run             # preview\n```\n\nFiles are quarantined to `/tmp/aguara-quarantine/`, not deleted. After\ncleaning, Aguara prints a credential rotation checklist for every\ncredential file present on the system.\n\n### Advanced: explicit ecosystem and path\n\nUse these when auto-detection cannot find the environment you want to check:\n\n```bash\naguara check --ecosystem python --path /opt/venv/lib/python3.12/site-packages/\naguara check --ecosystem npm --path ./node_modules\n```\n\n### Threat-intel sources\n\nThe embedded snapshot is built from two sources:\n\n- **Manual** — a short hand-curated list of high-priority emergency\n  advisories. 
Takes display precedence when an advisory ID also appears\n  in OSV.\n- **OSV.dev** — high-confidence records only: OpenSSF Malicious Packages\n  IDs (the `MAL-` namespace), records with\n  `database_specific.malicious-packages-origins`, plus keyword-qualified\n  records that carry exact affected versions. Generic CVE / DoS records\n  are filtered out at import time so Aguara stays focused on malicious\n  packages, not general SCA.\n\nBuilt originally in response to the\n[litellm supply chain attack](https://github.com/garagon/aguara/releases/tag/v0.11.0)\n(March 2026), where malicious `.pth` files exfiltrated credentials and\ninstalled K8s backdoors. The toolset grew from that incident into the\nbroader check + audit + update + status surface above.\n\n## Aguara MCP\n\n[Aguara MCP](https://github.com/garagon/aguara-mcp) is an MCP server that gives AI agents the ability to scan skills and configurations for security threats — before installing or running them. It imports Aguara as a Go library — one `go install`, no external binary needed.\n\n```bash\n# Install and register with Claude Code\ngo install github.com/garagon/aguara-mcp@latest\nclaude mcp add aguara -- aguara-mcp\n```\n\nYour agent gets 4 tools: `scan_content`, `check_mcp_config`, `list_rules`, and `explain_rule`. No network, no LLM, millisecond scans — the agent checks first, then decides.\n\n## Aguara Watch\n\nAguara Watch is being reworked. The previous public observatory is stale, so we are not using it as a product signal for this release. The scanner, CLI, Docker image, and signed release artifacts remain the supported surfaces for v0.17.0.\n\n## Go Library\n\nAguara exposes a public Go API for embedding the scanner in other tools. 
[Aguara MCP](https://github.com/garagon/aguara-mcp) uses this API.\n\n```go\nimport \"github.com/garagon/aguara\"\n\n// Scan a directory\nresult, err := aguara.Scan(ctx, \"./skills/\")\n\n// Scan inline content (no disk I/O, NFKC-normalized)\nresult, err := aguara.ScanContent(ctx, content, \"skill.md\")\n\n// Scan with tool context for false-positive reduction\nresult, err := aguara.ScanContentAs(ctx, content, \"skill.md\", \"Edit\")\n// result.Verdict: aguara.VerdictClean, VerdictFlag, or VerdictBlock\n// result.ToolName: \"Edit\"\n// result.Findings: always preserved (even when verdict is clean)\n\n// Scan with a profile\nresult, err := aguara.ScanContent(ctx, content, \"skill.md\",\n    aguara.WithToolName(\"Edit\"),\n    aguara.WithScanProfile(aguara.ProfileContentAware),\n)\n// result.RiskScore: 0-100 aggregate risk score\n\n// Preserve cross-rule findings (for verdict pipelines)\nresult, err := aguara.ScanContent(ctx, content, \"skill.md\",\n    aguara.WithDeduplicateMode(aguara.DeduplicateSameRuleOnly),\n)\n\n// Enable rug-pull detection with persistent state\nresult, err := aguara.ScanContent(ctx, content, \"tool.md\",\n    aguara.WithStateDir(\"/var/lib/myapp/aguara-state\"),\n)\n\n// Discover all MCP client configs on the machine\ndiscovered, err := aguara.Discover()\nfor _, client := range discovered.Clients {\n    fmt.Printf(\"%s: %d servers\\n\", client.Client, len(client.Servers))\n}\n\n// List rules, optionally filtered\nrules := aguara.ListRules(aguara.WithCategory(\"prompt-injection\"))\n\n// Get rule details with remediation\ndetail, err := aguara.ExplainRule(\"PROMPT_INJECTION_001\")\nfmt.Println(detail.Remediation)\n```\n\nOptions: `WithMinSeverity()`, `WithDisabledRules()`, `WithCustomRules()`, `WithRuleOverrides()`, `WithWorkers()`, `WithIgnorePatterns()`, `WithMaxFileSize()`, `WithCategory()`, `WithToolName()`, `WithScanProfile()`, `WithDeduplicateMode()`, `WithStateDir()`.\n\n## Architecture\n\n```\naguara.go              Public API: Scan, 
ScanContent, ScanContentAs, Discover, ListRules, ExplainRule\noptions.go             Functional options (WithToolName, WithStateDir, WithDeduplicateMode, ...)\ndiscover/              MCP client discovery: 17 clients, config parsers, auto-detection\ncmd/aguara/            CLI entry point (Cobra)\ncmd/wasm/              WASM build for browser-based scanning\ninternal/\n  engine/\n    pattern/           Pattern matcher: Aho-Corasick + regex, 8 decoders (base64, hex, URL, Unicode, HTML, hex-escape, base32, octal-escape)\n    ci/                CI Trust: .github/workflows/ YAML parser, pwn-request / cache / OIDC / persisted-credentials chains\n    pkgmeta/           PkgMeta: package.json parser, npm lifecycle / git source / publish-surface chains\n    jsrisk/            JSRisk: .js / .mjs / .cjs scanner, obfuscation / daemonization / CI-secret-harvest / runner-pivot / agent-persistence\n    nlp/               NLP: markdown AST + JSON/YAML string extraction, proximity-weighted classifier\n    toxicflow/         Taint: single-file taint tracking + cross-file correlation across directories\n    rugpull/           Rug-pull: SHA256 change detection (CLI --monitor, library WithStateDir)\n  rules/               Rule engine: YAML loader, compiler, self-tester\n    builtin/           193 embedded rules across 13 YAML files (go:embed)\n  scanner/             Orchestrator: file discovery, parallel analysis, inline ignore, result aggregation\n    exemptions.go      Tool exemptions, scan profiles, verdict computation\n  meta/                Post-processing: configurable dedup, scoring, risk score, correlation, confidence\n  output/              Formatters: terminal (ANSI), JSON, SARIF, Markdown\n  config/              .aguara.yml loader (supports tool-scoped rules)\n  incident/            Incident response: compromised package detection, cleanup, quarantine\n  state/               Persistence for rug-pull detection (CLI and library mode)\n  types/               Shared types 
(Finding, Severity, ScanResult, Verdict, DeduplicateMode)\n```\n\n## Comparison\n\nAguara is purpose-built for AI agent content. General-purpose SAST tools target application source code, not the skill files, tool descriptions, and MCP configs that agents consume.\n\n| Feature | Aguara | Semgrep | Snyk Code | CodeQL |\n|---------|--------|---------|-----------|--------|\n| AI agent skill scanning | Yes | No | No | No |\n| MCP config analysis | Yes | No | No | No |\n| Prompt injection detection | Yes (18 rules + NLP) | No | No | No |\n| Rug-pull detection | Yes | No | No | No |\n| Supply chain exfil detection | Yes (10 rules) | No | No | No |\n| Incident response (check/clean) | Yes | No | No | No |\n| Taint tracking for skills | Yes | Yes | Yes | Yes |\n| Offline / no account | Yes | Partial | No | Partial |\n| Custom YAML rules | Yes | Yes | No | No |\n| SARIF output | Yes | Yes | Yes | Yes |\n| Free \u0026 open source | Yes (Apache 2.0) | Partial | No | Partial |\n\nAguara complements traditional SAST - use Semgrep for your app code, Aguara for your agent skills and MCP servers.\n\n## Contributing\n\nContributions are welcome! Please see [CONTRIBUTING.md](CONTRIBUTING.md) for development setup, adding rules, and the PR process.\n\nFor security vulnerabilities, see [SECURITY.md](SECURITY.md).\n\n## License\n\n[Apache License 2.0](LICENSE)\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fgaragon%2Faguara","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fgaragon%2Faguara","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fgaragon%2Faguara/lists"}