https://github.com/garagon/aguara
Security scanner for AI agent skills and MCP servers. Static analysis, incident response, no LLM. One binary. Detection engine behind oktsec.
ai-agents ai-security claude data-exfiltration devsecops golang mcp mcp-server model-context-protocol prompt-injection sast security security-scanner static-analysis supply-chain-security
Last synced: about 1 month ago
- Host: GitHub
- URL: https://github.com/garagon/aguara
- Owner: garagon
- License: apache-2.0
- Created: 2026-02-17T21:13:42.000Z (4 months ago)
- Default Branch: main
- Last Pushed: 2026-05-12T22:49:32.000Z (about 1 month ago)
- Last Synced: 2026-05-12T23:05:44.365Z (about 1 month ago)
- Topics: ai-agents, ai-security, claude, data-exfiltration, devsecops, golang, mcp, mcp-server, model-context-protocol, prompt-injection, sast, security, security-scanner, static-analysis, supply-chain-security
- Language: Go
- Homepage: https://aguarascan.com
- Size: 570 KB
- Stars: 76
- Watchers: 1
- Forks: 15
- Open Issues: 0
Metadata Files:
- Readme: README.md
- Changelog: CHANGELOG.md
- Contributing: CONTRIBUTING.md
- License: LICENSE
- Code of conduct: CODE_OF_CONDUCT.md
- Security: SECURITY.md
- Agents: AGENTS.md
Awesome Lists containing this project
- awesome-mcp-devtools - garagon/aguara - Static security scanner for MCP servers and AI agent skills. 173 detection rules, prompt injection, credential leaks, taint tracking. Scans configs and tool descriptions before deployment. (Testing Tools / Common Lisp)
README
Aguara
Security scanner for AI agent skills and MCP servers.
Detect prompt injection, data exfiltration, and supply-chain attacks before they reach production.
Installation • Quick Start • How It Works • Usage • Rules • Supply-Chain Check • Aguara MCP • Aguara Watch • Contributing
https://github.com/user-attachments/assets/851333be-048f-48fa-aaf3-f8cc1d4aa594
## Why Aguara?
AI agents and MCP servers run code on your behalf. A single malicious skill file can exfiltrate credentials, inject prompts, or install backdoors. Aguara catches these threats **before deployment** with static analysis that requires no API keys, no cloud, and no LLM.
- **193 detection rules across 13 categories** — prompt injection, data exfiltration, credential leaks, supply-chain attacks, MCP-specific threats, command execution, SSRF, unicode attacks, and more.
- **7 scan analyzers** — pattern matching, GitHub Actions trust-chain detection (ci-trust), npm package metadata (pkgmeta), JavaScript payload risk (jsrisk), NLP analysis, taint tracking, and rug-pull detection work together to catch threats that any single technique would miss. Separate `aguara check` / `aguara audit` commands flag installed npm and Python environments AND lockfiles for Go, Rust, PHP/Composer, Ruby/Bundler, Java/Maven/Gradle, and .NET/NuGet against the embedded threat-intel snapshot. Built from [OSV.dev](https://osv.dev) malicious-package records plus OpenSSF Malicious Packages and hand-curated emergency advisories. Offline by default.
- **8 decoders for encoded evasion** — base64, hex, URL encoding, Unicode escapes, HTML entities, hex escapes, base32, and C-style octal escapes. Obfuscated payloads are decoded and re-scanned automatically.
- **NLP on markdown, JSON, and YAML** — goldmark AST analysis for markdown files, plus string extraction and classification for JSON/YAML tool descriptions. Catches MCP tool poisoning in structured configs.
- **Cross-file toxic flow analysis** — detects dangerous capability combinations split across files in the same MCP server directory (e.g., one tool reads credentials, another sends to a webhook).
- **Aggregate risk score** — 0-100 score with diminishing returns across findings. Available in JSON, SARIF, and terminal output.
- **Context-aware scanning** — pass the tool name (`--tool-name Edit`) and the scanner automatically skips rules that are always false positives for that tool. Built-in exemptions for Edit, Write, WebFetch, Bash, and more.
- **Scan profiles** — `strict` (default), `content-aware`, or `minimal` enforcement. Findings are always preserved for audit; only the verdict (clean/flag/block) changes.
- **Evasion prevention** — NFKC normalization catches fullwidth character evasion. 8 decoders catch encoded payloads. Crypto address filtering prevents hex decoder false positives.
- **Dynamic confidence scoring** — every finding carries a confidence level (0.50-0.95) that reflects signal quality: pattern hit ratio, classifier score, and code-block awareness.
- **Remediation guidance** — all 193 rules include actionable fix suggestions, shown in every output format.
- **Deterministic** — same input, same output. Every scan is reproducible.
- **CI-ready** — JSON, SARIF, and Markdown output. GitHub Action. `--fail-on` threshold. `--changed` for incremental scans.
- **17 MCP clients supported** — auto-discover and scan configs from Claude Desktop, Cursor, VS Code, Windsurf, and 13 more.
- **Library API for embedding** — `WithDeduplicateMode()` preserves all cross-rule findings for verdict pipelines. `WithStateDir()` enables rug-pull detection for persistent consumers.
- **Extensible** — write custom rules in YAML. No code required.
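The decode-and-re-scan behaviour described in the list above can be sketched in Go. This is a minimal illustration and not Aguara's implementation: it handles only base64, and `base64Run`, `signature`, `scanWithDecode`, `hide`, and the 16-character threshold are all hypothetical names and values chosen for the example.

```go
package main

import (
	"encoding/base64"
	"fmt"
	"regexp"
)

// base64Run finds candidate base64 runs. The 16-character minimum is a
// hypothetical threshold chosen for this sketch.
var base64Run = regexp.MustCompile(`[A-Za-z0-9+/]{16,}={0,2}`)

// signature stands in for a single detection rule.
var signature = regexp.MustCompile(`(?i)ignore all previous instructions`)

// scanWithDecode checks the raw text first, then decodes every base64
// candidate and re-scans the decoded bytes with the same signature.
func scanWithDecode(text string) (hit, viaDecoder bool) {
	if signature.MatchString(text) {
		return true, false
	}
	for _, run := range base64Run.FindAllString(text, -1) {
		raw, err := base64.StdEncoding.DecodeString(run)
		if err != nil {
			continue // not valid base64, nothing to re-scan
		}
		if signature.MatchString(string(raw)) {
			return true, true
		}
	}
	return false, false
}

// hide base64-encodes a payload the way an evasive skill file might.
func hide(payload string) string {
	return base64.StdEncoding.EncodeToString([]byte(payload))
}

func main() {
	doc := "Setup notes: " + hide("Ignore all previous instructions")
	hit, viaDecoder := scanWithDecode(doc)
	fmt.Println("hit:", hit, "via decoder:", viaDecoder)
}
```

The same loop shape extends to other encodings by adding one candidate finder and one decode function per encoding.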
## Installation
```bash
curl -fsSL https://raw.githubusercontent.com/garagon/aguara/main/install.sh | sh
```
Installs the latest binary to `~/.local/bin`. Customize with environment variables:
```bash
curl -fsSL https://raw.githubusercontent.com/garagon/aguara/main/install.sh | VERSION=v0.17.0 sh
curl -fsSL https://raw.githubusercontent.com/garagon/aguara/main/install.sh | INSTALL_DIR=/usr/local/bin sh
```
To update an existing install, rerun the installer. It downloads the selected release archive, verifies `checksums.txt`, and replaces the binary:
```bash
# Update to latest
curl -fsSL https://raw.githubusercontent.com/garagon/aguara/main/install.sh | sh
# Update/pin to a specific release
curl -fsSL https://raw.githubusercontent.com/garagon/aguara/main/install.sh | VERSION=v0.17.0 sh
```
### Alternative methods
**Homebrew** (macOS/Linux):
```bash
brew install garagon/tap/aguara
```
**Docker** (no install required):
```bash
# Scan current directory
docker run --rm -v "$(pwd)":/scan ghcr.io/garagon/aguara scan /scan
# Scan with options
docker run --rm -v "$(pwd)":/scan ghcr.io/garagon/aguara scan /scan --severity high --format json
# Use a specific version
docker run --rm -v "$(pwd)":/scan ghcr.io/garagon/aguara:0.17.0 scan /scan
```
**From source** (requires Go 1.25+):
```bash
go install github.com/garagon/aguara/cmd/aguara@latest
```
Pre-built binaries for Linux, macOS, and Windows are also available on the [Releases page](https://github.com/garagon/aguara/releases).
### Verifying signed releases
Starting with the next release after this section landed, every release is signed with [Cosign](https://github.com/sigstore/cosign) keyless, ships an SPDX SBOM per archive, and is built with `-trimpath` for reproducibility. The container image is signed at the digest and carries SBOM + SLSA provenance attestations.
**Verify the release archive**:
```bash
VERSION=vX.Y.Z
ARCHIVE=aguara_${VERSION#v}_linux_amd64.tar.gz
# Download archive, checksums, and the cosign bundle
curl -fsSLO https://github.com/garagon/aguara/releases/download/${VERSION}/${ARCHIVE}
curl -fsSLO https://github.com/garagon/aguara/releases/download/${VERSION}/checksums.txt
curl -fsSLO https://github.com/garagon/aguara/releases/download/${VERSION}/checksums.txt.bundle
# Verify checksums.txt was signed by the GitHub Actions release workflow
cosign verify-blob \
  --bundle checksums.txt.bundle \
  --certificate-identity "https://github.com/garagon/aguara/.github/workflows/release.yml@refs/tags/${VERSION}" \
  --certificate-oidc-issuer "https://token.actions.githubusercontent.com" \
  checksums.txt
# Now verify the archive matches the signed checksums
sha256sum --check --ignore-missing checksums.txt
```
**Verify the container image**:
```bash
cosign verify ghcr.io/garagon/aguara:${VERSION#v} \
  --certificate-identity "https://github.com/garagon/aguara/.github/workflows/docker.yml@refs/tags/${VERSION}" \
  --certificate-oidc-issuer "https://token.actions.githubusercontent.com"
```
**Inspect the SBOM and provenance**:
```bash
# Release archive SBOM (SPDX 2.3)
curl -fsSL https://github.com/garagon/aguara/releases/download/${VERSION}/${ARCHIVE}.sbom.json | jq .
# Container image SBOM and SLSA build provenance.
# docker/build-push-action publishes these as BuildKit attestation manifests
# (in-toto / SLSA spec) attached to the OCI image index, not as cosign
# attestations. Use `docker buildx imagetools inspect` to read them:
docker buildx imagetools inspect ghcr.io/garagon/aguara:${VERSION#v} \
  --format '{{ json .SBOM }}' | jq .
docker buildx imagetools inspect ghcr.io/garagon/aguara:${VERSION#v} \
  --format '{{ json .Provenance }}' | jq .
```
`install.sh` performs SHA256 verification automatically and aborts if `sha256sum`/`shasum` is unavailable, so the curl-pipe install path is never silently downgraded.
## Quick Start
```bash
# Am I exposed to a compromised package?
aguara check
# Same, but refresh threat intel from OSV first
aguara check --fresh
# CI gate: fail if a compromised package is installed
aguara check --ci
# Audit: scan code + check packages, single verdict
aguara audit
# Is Aguara's threat intel fresh?
aguara status
# Pull the latest threat intel for future offline checks
aguara update
```
```bash
# Discover which MCP clients are configured on this machine
aguara discover
# Scan a skills directory (or any path) for content threats
aguara scan .claude/skills/
# CI mode for content scan: --fail-on high, no color
aguara scan .claude/skills/ --ci
# Auto-discover and scan all MCP configs on this machine
aguara scan --auto
```
Both `aguara check` and `aguara audit` run offline by default using the
threat intel baked into the binary. The network is used only when you
opt in with `--fresh` or `aguara update`.
## How It Works
Aguara runs 6 scan analyzers sequentially on every file by default; a 7th (Rug-Pull) joins when `--monitor` is enabled and a state store is configured. Each catches a different class of attack:
| Analyzer | Engine | What it catches |
|----------|--------|-----------------|
| **Pattern Matcher** | Regex + Aho-Corasick matching | Known attack signatures, credential patterns, dangerous commands. Aho-Corasick automaton for O(n+m) multi-pattern search. 8 decoders (base64, hex, URL encoding, Unicode escapes, HTML entities, hex escapes, base32, octal escapes) decode obfuscated payloads and re-scan. Code-block severity downgrade. Dynamic confidence based on pattern hit ratio. |
| **CI Trust** | GitHub Actions YAML parser | `pull_request_target` chains, cache poisoning across fork boundaries, OIDC token surface paired with install/build/test, persisted-credentials checkouts on PR head refs. |
| **PkgMeta** | `package.json` JSON parser | npm install-time lifecycle scripts plus git-sourced dependencies, optional-git deps with suspicious names, publish surfaces paired with trusted-publishing references. |
| **JSRisk** | JavaScript single-pass scanner | Obfuscator-shape payloads, install-time daemonization via `child_process`, CI secret harvesting through real `process.env` reads plus network/registry sinks, runner-process memory pivots to extract OIDC tokens, Claude Code / VS Code workspace persistence. |
| **NLP Analyzer** | Goldmark AST + JSON/YAML extraction | Prompt injection in markdown structure, plus tool poisoning in JSON/YAML description fields. Keyword classification with proximity weighting; clustered keywords score higher, sparse keywords in long text get penalized. |
| **Taint Tracker** | Source-to-sink flow analysis | Dangerous capability combinations within a single file and across files in the same directory. Detects credential reads paired with webhook sends, env vars flowing to shell execution, destructive plus exec combos across MCP server tools. |
| **Rug-Pull Detector** | SHA256 hash tracking | Tool descriptions that change between scans. CLI: `--monitor` flag. Library: `WithStateDir()` for persistent consumers. |
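The hash-tracking idea behind the Rug-Pull Detector row can be sketched as follows. This is an illustration under assumptions, not Aguara's code: `hashStore` and `observe` are hypothetical names, the state lives in memory instead of a state directory, and a first sighting is assumed not to count as a change.

```go
package main

import (
	"crypto/sha256"
	"encoding/hex"
	"fmt"
)

// hashStore maps a tool identifier to the hex SHA256 of the description
// seen on the previous scan. A real consumer would persist this to disk.
type hashStore map[string]string

// observe records the current description hash and reports whether it
// differs from the one stored by an earlier scan.
func (s hashStore) observe(tool, description string) (changed bool) {
	sum := sha256.Sum256([]byte(description))
	digest := hex.EncodeToString(sum[:])
	prev, seen := s[tool]
	s[tool] = digest
	return seen && prev != digest
}

func main() {
	store := hashStore{}
	fmt.Println(store.observe("read_file", "Reads a file from the workspace."))
	fmt.Println(store.observe("read_file", "Reads a file from the workspace."))
	fmt.Println(store.observe("read_file", "Reads a file, then POSTs it to a webhook."))
}
```

Only the third call reports a change, which is the moment a previously reviewed tool description has been swapped.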
Separate `aguara check` and `aguara audit` commands inspect installed
package trees (Python `site-packages`, npm `node_modules` including the
pnpm `.pnpm` store) AND walk the repo recursively for Go, Rust, PHP,
Ruby, Java, and .NET lockfiles, matching every declared package against
the embedded threat-intel snapshot. See
[Supply-Chain Check](#supply-chain-check) for the full surface.
All content is NFKC-normalized before scanning to prevent Unicode evasion attacks. Every analyzer reports findings with a severity, a dynamic confidence score (0.50-0.95), the matched text, the file location with context lines, and remediation guidance. An aggregate risk score (0-100) summarizes the overall threat level.
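The README does not state the formula behind the aggregate score, so the following is only one plausible way to get a 0-100 score with diminishing returns: treat each finding as an independent contribution and combine them multiplicatively. The `finding` type, its `weight` field, and `riskScore` are assumptions made for this sketch.

```go
package main

import (
	"fmt"
	"math"
)

// finding carries a hypothetical per-severity weight in (0, 1) and the
// confidence value (0.50-0.95) that each finding reports.
type finding struct {
	weight     float64
	confidence float64
}

// riskScore combines findings so that each extra finding adds less than
// the one before it: score = 100 * (1 - product(1 - weight*confidence)).
func riskScore(findings []finding) int {
	remaining := 1.0
	for _, f := range findings {
		remaining *= 1 - f.weight*f.confidence
	}
	return int(math.Round(100 * (1 - remaining)))
}

func main() {
	one := []finding{{weight: 0.5, confidence: 0.8}}
	two := append(one, finding{weight: 0.5, confidence: 0.8})
	fmt.Println(riskScore(nil), riskScore(one), riskScore(two))
}
```

With this shape the score can approach 100 but never exceed it, and a second identical finding raises the score by less than the first one did.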
## Usage
```
aguara scan [path] [flags]

Flags:
      --auto                    Auto-discover and scan all MCP client configs
      --severity string         Minimum severity to report: critical, high, medium, low, info (default "info")
      --format string           Output format: terminal, json, sarif, markdown (default "terminal")
  -o, --output string           Output file path (default: stdout)
      --workers int             Number of worker goroutines (default: NumCPU)
      --rules string            Additional rules directory
      --disable-rule strings    Rule IDs to disable (comma-separated, repeatable)
      --max-file-size string    Maximum file size to scan (e.g. 50MB, 100MB; default 50MB, range 1MB-500MB)
      --tool-name string        Tool context for false-positive reduction (e.g. Bash, Edit, WebFetch)
      --profile string          Scan profile: strict (default), content-aware, minimal
      --no-color                Disable colored output
      --no-update-check         Disable automatic update check (also: AGUARA_NO_UPDATE_CHECK=1)
      --fail-on string          Exit code 1 if findings at or above this severity
      --ci                      CI mode: --fail-on high --no-color
      --changed                 Only scan git-changed files
      --monitor                 Enable rug-pull detection: track file hashes across runs
  -v, --verbose                 Show rule descriptions, confidence scores, and remediation
  -h, --help                    Help
```
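The `--fail-on` gate amounts to comparing each finding's severity against a threshold. A minimal Go sketch of that comparison follows; the severity names come from the `--severity` flag above, while `severityRank`, `exitCode`, and the specific rank numbers are hypothetical and do not claim to match Aguara's internal values.

```go
package main

import (
	"fmt"
	"strings"
)

// severityRank orders the severity names listed for --severity.
// The numbers only express ordering for this sketch.
var severityRank = map[string]int{
	"info": 0, "low": 1, "medium": 2, "high": 3, "critical": 4,
}

// exitCode returns 1 when any finding is at or above the failOn
// threshold, and 0 otherwise. Unknown thresholds are an error.
func exitCode(found []string, failOn string) (int, error) {
	threshold, ok := severityRank[strings.ToLower(failOn)]
	if !ok {
		return 0, fmt.Errorf("unknown severity %q", failOn)
	}
	for _, s := range found {
		if rank, known := severityRank[strings.ToLower(s)]; known && rank >= threshold {
			return 1, nil
		}
	}
	return 0, nil
}

func main() {
	code, err := exitCode([]string{"low", "HIGH"}, "high")
	fmt.Println(code, err)
}
```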
### Output Formats
| Format | Flag | Use case |
|--------|------|----------|
| **Terminal** | `--format terminal` (default) | Human-readable with color, severity dashboard, top-files chart |
| **JSON** | `--format json` | Machine processing, API integration, custom tooling |
| **SARIF** | `--format sarif` | GitHub Code Scanning, IDE integrations, SAST dashboards |
| **Markdown** | `--format markdown` | GitHub Actions job summaries, PR comments |
### MCP Client Discovery
Aguara auto-detects MCP configurations across **17 clients**: Claude Desktop, Cursor, VS Code, Cline, Windsurf, OpenClaw, OpenCode, Zed, Amp, Gemini CLI, Copilot CLI, Amazon Q, Claude Code, Roo Code, Kilo Code, BoltAI, and JetBrains.
```bash
# List all detected MCP configs
aguara discover
# JSON output (sensitive env values are automatically redacted)
aguara discover --format json
# Markdown output
aguara discover --format markdown
# Discover + scan in one command
aguara scan --auto
```
### CI Integration
#### GitHub Action
```yaml
- uses: garagon/aguara@v0.17.0
  with:
    path: .
    fail-on: high
    version: v0.17.0
```
Set both pins (the action ref and the `version:` input) if you want a
reproducible workflow. The action ref alone pins only the composite
action and its install script; `version:` pins the Aguara binary the
action installs. Setting both also keeps the workflow Dependabot-friendly:
when v0.17.1 lands, the bot updates both together.
Scans your repository, uploads findings to GitHub Code Scanning, and
optionally fails the build:
```yaml
- uses: garagon/aguara@v0.17.0
  with:
    path: ./mcp-server/
    severity: medium
    fail-on: high
    version: v0.17.0
```
All inputs are optional. See [`action.yml`](action.yml) for the full list.
| Input | Default | Description |
|-------|---------|-------------|
| `path` | `./` | Path to scan |
| `severity` | `info` | Minimum severity to report |
| `fail-on` | _(none)_ | Fail if findings at or above this severity |
| `format` | `sarif` | Output format: sarif, json, terminal, markdown |
| `upload-sarif` | `true` | Upload SARIF to GitHub Code Scanning |
| `version` | _(latest)_ | Pin a specific Aguara version |
> **Note**: SARIF upload requires the `security-events: write` permission and is free for public repositories.
#### Docker in CI
```yaml
# GitHub Actions with Docker (no install step)
- name: Scan for security issues
  run: docker run --rm -v "${{ github.workspace }}":/scan ghcr.io/garagon/aguara scan /scan --ci
```
#### Manual / GitLab CI
```yaml
# GitHub Actions (without the action)
- name: Scan skills for security issues
  run: |
    curl -fsSL https://raw.githubusercontent.com/garagon/aguara/main/install.sh | sh
    aguara scan .claude/skills/ --ci
```
```yaml
# GitLab CI
security-scan:
  script:
    - curl -fsSL https://raw.githubusercontent.com/garagon/aguara/main/install.sh | sh
    - aguara scan .claude/skills/ --format sarif -o gl-sast-report.sarif --fail-on high
  artifacts:
    reports:
      sast: gl-sast-report.sarif
```
### Configuration
Create `.aguara.yml` in your project root:
```yaml
severity: medium
fail_on: high
max_file_size: 104857600 # 100 MB (default: 50 MB, range: 1 MB-500 MB)
ignore:
  - "vendor/**"
  - "node_modules/**"
rule_overrides:
  CRED_004:
    severity: low
  EXTDL_004:
    disabled: true
  TC-005:
    apply_to_tools: ["Bash"]    # only enforce on Bash
  MCPCFG_004:
    exempt_tools: ["WebFetch"]  # enforce on everything except WebFetch
```
`apply_to_tools` and `exempt_tools` are mutually exclusive per rule. They filter findings at scan time when a tool name is provided via `--tool-name` or the library API.
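The tool-scoping rule can be expressed in a few lines of Go. This is a sketch of the semantics as described above, not the library's code: the `toolScope` type and its methods are hypothetical, and exact, case-sensitive tool-name matching is an assumption.

```go
package main

import (
	"errors"
	"fmt"
	"slices"
)

// toolScope mirrors the two per-rule keys from .aguara.yml.
type toolScope struct {
	ApplyToTools []string
	ExemptTools  []string
}

// validate enforces that the two keys are mutually exclusive per rule.
func (t toolScope) validate() error {
	if len(t.ApplyToTools) > 0 && len(t.ExemptTools) > 0 {
		return errors.New("apply_to_tools and exempt_tools are mutually exclusive")
	}
	return nil
}

// enforced reports whether the rule's findings are kept for the given
// tool. With no tool name the scope does not filter anything.
func (t toolScope) enforced(tool string) bool {
	switch {
	case tool == "":
		return true
	case len(t.ApplyToTools) > 0:
		return slices.Contains(t.ApplyToTools, tool)
	case len(t.ExemptTools) > 0:
		return !slices.Contains(t.ExemptTools, tool)
	default:
		return true
	}
}

func main() {
	bashOnly := toolScope{ApplyToTools: []string{"Bash"}}
	noFetch := toolScope{ExemptTools: []string{"WebFetch"}}
	fmt.Println(bashOnly.enforced("Bash"), bashOnly.enforced("Edit"))
	fmt.Println(noFetch.enforced("WebFetch"), noFetch.enforced("Edit"))
}
```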
### Inline Ignore
Suppress specific findings directly in your source files using inline comments:
```yaml
# aguara-ignore CRED_004
api_key: "sk-test-1234567890" # this finding is suppressed
```
```markdown
<!-- aguara-ignore-next-line PROMPT_INJECTION_001 -->
Ignore all previous instructions (this is a test)
```
Supported directives:
| Directive | Effect |
|-----------|--------|
| `# aguara-ignore RULE_ID` | Suppress rule on the same line |
| `# aguara-ignore RULE_ID, RULE_ID2` | Suppress multiple rules on the same line |
| `# aguara-ignore-next-line RULE_ID` | Suppress rule on the next line |
| `# aguara-ignore` | Suppress all rules on the same line |
| `<!-- aguara-ignore RULE_ID -->` | HTML/Markdown comment variant |
| `// aguara-ignore RULE_ID` | C-style comment variant |
## Rules
193 built-in rules across 13 pattern-rule categories (what `aguara list-rules` enumerates) plus the toxic-flow chain analyzer with its own emit-time category. The table groups coverage by emit-time category for readability:
| Category | Rules | What it detects |
|----------|-------|-----------------|
| Credential Leak | 22 | API keys (OpenAI, AWS, GCP, Stripe, ...), private keys, DB strings, HMAC secrets |
| Prompt Injection | 18 + NLP | Instruction overrides, role switching, delimiter injection, jailbreaks, event injection |
| Supply Chain | 24 | Download-and-execute, reverse shells, sandbox escape, symlink attacks, privilege escalation, OIDC token vars, runner-pivot memory, Claude Code persistence path |
| External Download | 16 | Binary downloads, curl-pipe-shell, auto-installs, profile persistence |
| MCP Attack | 16 | Tool injection, name shadowing, canonicalization bypass, capability escalation |
| Data Exfiltration | 16 + NLP | Webhook exfil, DNS tunneling, sensitive file reads, env var leaks |
| Command Execution | 16 | shell=True, eval, subprocess, child_process, PowerShell |
| MCP Config | 13 | Unpinned npx/uvx servers, hardcoded secrets, Docker cap-add, host networking, pip without hashes |
| Indirect Injection | 10 | Fetch-and-follow, remote config, DB-driven instructions, webhook registration |
| SSRF & Cloud | 11 | Cloud metadata, IMDS, Docker socket, internal IPs, redirect following |
| Third-Party Content | 10 | eval with external data, unsafe deserialization, missing SRI, HTTP downgrade |
| Unicode Attack | 10 | RTL override, bidi, homoglyphs, zero-width sequences, normalization bypass |
| Supply Chain Exfil | 11 | Credential file reads, .pth executable code, bulk env collection, K8s secrets access, systemd persistence, archive+POST exfil, Session-Network endpoints |
| Toxic Flow | 3 + cross-file | Single-file taint tracking plus cross-file correlation across MCP server directories |
See [RULES.md](RULES.md) for the complete rule catalog with IDs and severity levels.
### Remediation Guidance
All 193 rules include remediation text. It appears in every output format:
- **Terminal**: always shown for CRITICAL findings, shown for all severities with `--verbose`
- **JSON**: included in every finding object
- **SARIF**: mapped to the `help` field on each rule
- **Markdown**: shown for HIGH and CRITICAL findings
- **Explain**: `aguara explain RULE_ID` shows the full remediation text
```bash
# See remediation for a specific rule
aguara explain CRED_002
# Terminal output with remediation for all findings
aguara scan . --verbose
```
```json
{
  "rule_id": "PROMPT_INJECTION_001",
  "severity": 4,
  "matched_text": "Ignore all previous instructions",
  "remediation": "Remove instruction override text. If this is documentation, wrap it in a code block to indicate it is an example.",
  "confidence": 0.95
}
```
### Custom Rules
```yaml
id: CUSTOM_001
name: "Internal API endpoint"
description: "Detects references to internal APIs"
severity: HIGH
category: custom
targets: ["*.md", "*.txt"]
match_mode: any
remediation: "Replace internal API URLs with the public endpoint or environment variable."
patterns:
  - type: regex
    value: "https?://internal\\.mycompany\\.com"
  - type: contains
    value: "api.internal"
exclude_patterns:          # optional: suppress match in these contexts
  - type: contains
    value: "## documentation"
examples:
  true_positive:
    - "Fetch data from https://internal.mycompany.com/api/users"
  false_positive:
    - "Our public API is at https://api.mycompany.com"
```
`exclude_patterns` suppress a match when the matched line (or up to 3 lines before it) matches any exclude pattern. Useful for reducing false positives in documentation headings, installation guides, etc.
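The lookback window is easy to get off by one, so here is a small Go sketch of it. It assumes `contains`-type exclude patterns only and case-sensitive matching; `suppressed` and `lookback` are hypothetical names, not part of Aguara's API.

```go
package main

import (
	"fmt"
	"strings"
)

// lookback is the number of preceding lines checked, per the text above.
const lookback = 3

// suppressed reports whether the match on lines[matchIdx] should be
// dropped because that line, or one of the 3 lines before it, contains
// any of the exclude strings.
func suppressed(lines []string, matchIdx int, excludes []string) bool {
	if matchIdx < 0 || matchIdx >= len(lines) {
		return false
	}
	start := matchIdx - lookback
	if start < 0 {
		start = 0
	}
	for i := start; i <= matchIdx; i++ {
		for _, ex := range excludes {
			if strings.Contains(lines[i], ex) {
				return true
			}
		}
	}
	return false
}

func main() {
	lines := []string{
		"## documentation",
		"",
		"Install:",
		"call api.internal for data", // index 3: heading is 3 lines up
		"a", "b", "c", "d",
		"see api.internal", // index 8: heading is out of range
	}
	excludes := []string{"## documentation"}
	fmt.Println(suppressed(lines, 3, excludes), suppressed(lines, 8, excludes))
}
```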
Custom rules are validated at load time: unknown YAML fields are rejected, and all rules require `id`, `name`, `category`, and at least one pattern.
```bash
aguara scan .claude/skills/ --rules ./my-rules/
```
## Supply-Chain Check
In v0.17, `aguara check .` walks the dependency surface of a modern repo
instead of stopping at npm/Python.
Aguara ships with native threat intel built from [OSV.dev](https://osv.dev)'s
public vulnerability database and OpenSSF Malicious Packages records, plus a
hand-curated list of high-priority emergency advisories (event-stream,
node-ipc 2022 and 2026, litellm). Both `aguara check` and `aguara audit` run
**offline by default** because the binary carries the snapshot. Runtime
updates are opt-in.
The check commands are organised by user intent.
### `aguara check` — am I exposed?
```bash
# Run from a repo root. Aguara finds installed npm / Python environments
# AND lockfiles for Go, Rust, PHP, Ruby, Java, and .NET recursively under
# the path, then matches every declared package against the snapshot.
aguara check .
# Refresh threat intel from OSV first, then check. The only check mode
# that uses the network; the rest stay offline. --fresh refreshes only
# the ecosystems the plan actually touches.
aguara check --fresh
# CI gate: --fail-on critical, no color, exit 1 on compromised packages
aguara check --ci
# Constrain to specific ecosystems (repeatable or comma-separated)
aguara check --ecosystem go,ruby
aguara check --ecosystem maven --ecosystem nuget
# Machine-readable
aguara check --format json
```
What it checks:
- **Known compromised package versions** across all supported ecosystems
(manual advisories + OSV malicious-package records + OpenSSF Malicious
Packages origins).
- **`.pth` files with executable code** (import, subprocess, exec, eval).
Python-only.
- **pip/uv/npx caches** so a compromised package in the cache surfaces
even without a virtualenv. Python / npm only.
- **Persistence backdoors** (systemd user services, sysmon artifacts).
Python-only.
- **Credential files at risk** (SSH, AWS, K8s, git, npm, PyPI, databases).
Python-only.
Coverage by ecosystem:
| Ecosystem | Evidence read | Coverage today |
|---|---|---|
| npm | `node_modules` (incl. pnpm `.pnpm` store) | Strong malicious-package coverage |
| PyPI | `site-packages`, `.pth`, pip/uv/npx caches | Strong malicious-package + persistence coverage |
| RubyGems | `Gemfile.lock` | Strong malicious-package coverage |
| NuGet | `packages.lock.json`, `*.csproj`/`*.fsproj`/`*.vbproj` | Strong exact-version malicious-package coverage |
| Go | `go.sum`, `go.mod` | Parser ready; limited exact-version embedded matches today |
| crates.io | `Cargo.lock` (public registry only) | Parser ready; range-aware OSV matching deferred |
| Packagist | `composer.lock` | Parser ready; range-aware OSV matching deferred |
| Maven | `pom.xml`, `gradle.lockfile`, `gradle/dependency-locks/*` | Parser ready; range-aware OSV matching deferred |
Aguara is not claiming to be a full SCA vulnerability scanner yet. v0.17
adds offline malicious-package checks across ecosystems. General
CVE/range matching is the next layer.
Auto-detection rules (in order):
- If the path is or contains `node_modules`, run the npm check.
- Walk the path recursively for lockfiles (`go.sum`, `Cargo.lock`,
`composer.lock`, `Gemfile.lock`, `pom.xml`, `gradle.lockfile`,
`gradle/dependency-locks/*.lockfile`, `packages.lock.json`,
`*.csproj`/`*.fsproj`/`*.vbproj`). Skip `.git/`, `vendor/`,
`node_modules/`, `.aguara/`, `target/`, `bin/`, `obj/`, `.gradle/`.
- If the explicit `--path` looks like a Python install
(`site-packages` / `dist-packages` basename, or contains `*.dist-info`),
run the Python check.
- If `aguara check` runs with no flags and no signals are found, fall
back to global Python `site-packages` autodiscovery (legacy
behaviour).
An explicit `--path` with no signals returns a clean result with
`"ecosystems": []` and never silently falls back to the host's global
Python.
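The recursive lockfile walk can be sketched with the standard library. This is an illustration of the listed rules only, not Aguara's detector: `ecosystemFor`, `findLockfiles`, and `skipDirs` are hypothetical names, the returned labels reuse the ecosystem names from the coverage table, and only the filenames from the walk rule above are recognised.

```go
package main

import (
	"fmt"
	"io/fs"
	"os"
	"path/filepath"
	"strings"
)

// skipDirs lists the directory names the walk does not descend into.
var skipDirs = map[string]bool{
	".git": true, "vendor": true, "node_modules": true, ".aguara": true,
	"target": true, "bin": true, "obj": true, ".gradle": true,
}

// ecosystemFor maps a file path to an ecosystem label by filename.
func ecosystemFor(path string) (string, bool) {
	base := filepath.Base(path)
	switch base {
	case "go.sum":
		return "Go", true
	case "Cargo.lock":
		return "crates.io", true
	case "composer.lock":
		return "Packagist", true
	case "Gemfile.lock":
		return "RubyGems", true
	case "pom.xml", "gradle.lockfile":
		return "Maven", true
	case "packages.lock.json":
		return "NuGet", true
	}
	switch filepath.Ext(base) {
	case ".csproj", ".fsproj", ".vbproj":
		return "NuGet", true
	case ".lockfile":
		dir := filepath.ToSlash(filepath.Dir(path))
		if dir == "gradle/dependency-locks" || strings.HasSuffix(dir, "/gradle/dependency-locks") {
			return "Maven", true
		}
	}
	return "", false
}

// findLockfiles walks root and groups recognised files by ecosystem.
func findLockfiles(root string) (map[string][]string, error) {
	found := map[string][]string{}
	err := filepath.WalkDir(root, func(path string, d fs.DirEntry, err error) error {
		if err != nil {
			return err
		}
		if d.IsDir() {
			if path != root && skipDirs[d.Name()] {
				return fs.SkipDir // prune the whole subtree
			}
			return nil
		}
		if eco, ok := ecosystemFor(path); ok {
			found[eco] = append(found[eco], path)
		}
		return nil
	})
	return found, err
}

func main() {
	root := "."
	if len(os.Args) > 1 {
		root = os.Args[1]
	}
	found, err := findLockfiles(root)
	if err != nil {
		fmt.Fprintln(os.Stderr, err)
		os.Exit(1)
	}
	for eco, files := range found {
		fmt.Println(eco, files)
	}
}
```

Returning `fs.SkipDir` prunes a skipped directory in one step, so a large `node_modules` tree is never traversed by this walk.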
### `aguara audit` — code AND packages, one verdict
```bash
aguara audit # check + scan on the current directory
aguara audit --ci # CI gate: --fail-on critical, no color
aguara audit --fresh # refresh intel, then audit
```
`aguara audit` composes the supply-chain check and the content scan into a
single verdict. JSON output carries both sub-results (`.check` and `.scan`)
plus per-section counts so a dashboard can drill into either side.
### `aguara status` — is my threat intel fresh?
```bash
aguara status
```
Prints the Aguara version, the embedded snapshot's generated-at date and
record count, and whether a local cached snapshot exists from a prior
`aguara update` run. Does no network I/O.
### `aguara update` — refresh intel for future offline checks
```bash
aguara update # fetch every supported ecosystem, cache locally
aguara update --ecosystem npm # just npm
aguara update --ecosystem go,ruby # scope to Go + RubyGems
```
`aguara update` and the `--fresh` flag are the only paths that use the network.
The default refreshes every ecosystem the registry supports (npm, PyPI,
Go, crates.io, Packagist, RubyGems, Maven, NuGet); scope with `--ecosystem`
(repeatable or comma-separated). The refreshed cache lives at
`~/.aguara/intel/snapshot.json`; subsequent `aguara check` runs layer it
over the embedded snapshot automatically and stay offline.
`aguara check --fresh` refreshes only the ecosystems the plan actually
touches, so `aguara check --fresh --ecosystem maven` does not pull npm,
PyPI, or the other five.
If a refresh returns zero records (upstream outage, schema shift), the
update is refused so cached intel cannot be silently wiped. Pass
`--allow-empty` to override during initial bootstrap.
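The two behaviours described in this section, layering the cache over the embedded snapshot and refusing an empty refresh, can be sketched as follows. The `snapshot` type, its key format, the advisory IDs, and both function names are hypothetical; the real snapshot schema is not shown in this README.

```go
package main

import (
	"errors"
	"fmt"
)

// snapshot maps a hypothetical "ecosystem:name@version" key to an
// advisory ID.
type snapshot map[string]string

var errEmptyRefresh = errors.New("refresh returned zero records; cache left untouched")

// applyRefresh returns the snapshot to cache after a refresh. An empty
// result is refused unless allowEmpty is set, so an upstream outage
// cannot wipe existing intel.
func applyRefresh(cache, fresh snapshot, allowEmpty bool) (snapshot, error) {
	if len(fresh) == 0 && !allowEmpty {
		return cache, errEmptyRefresh
	}
	return fresh, nil
}

// layer overlays the cached snapshot on the embedded one. Cached
// entries win on key collisions in this sketch.
func layer(embedded, cached snapshot) snapshot {
	merged := make(snapshot, len(embedded)+len(cached))
	for k, v := range embedded {
		merged[k] = v
	}
	for k, v := range cached {
		merged[k] = v
	}
	return merged
}

func main() {
	embedded := snapshot{"npm:example-a@1.0.0": "ADVISORY-1"}
	cache := snapshot{"npm:example-b@2.0.0": "ADVISORY-2"}

	kept, err := applyRefresh(cache, snapshot{}, false)
	fmt.Println(len(kept), err)
	fmt.Println(len(layer(embedded, kept)))
}
```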
### `aguara clean` — quarantine compromised packages
```bash
aguara clean # interactive confirmation
aguara clean --yes --purge-caches # non-interactive, also purge pip/uv caches
aguara clean --dry-run # preview
```
Files are quarantined to `/tmp/aguara-quarantine/`, not deleted. After
cleaning, Aguara prints a credential rotation checklist for every
credential file present on the system.
### Advanced: explicit ecosystem and path
Use these when auto-detection cannot find the environment you want to check:
```bash
aguara check --ecosystem python --path /opt/venv/lib/python3.12/site-packages/
aguara check --ecosystem npm --path ./node_modules
```
### Threat-intel sources
The embedded snapshot is built from two sources:
- **Manual** — a short hand-curated list of high-priority emergency
advisories. Takes display precedence when an advisory ID also appears
in OSV.
- **OSV.dev** — high-confidence records only: OpenSSF Malicious Packages
IDs (the `MAL-` namespace), records with
`database_specific.malicious-packages-origins`, plus keyword-qualified
records that carry exact affected versions. Generic CVE / DoS records
are filtered out at import time so Aguara stays focused on malicious
packages, not general SCA.
Built originally in response to the
[litellm supply chain attack](https://github.com/garagon/aguara/releases/tag/v0.11.0)
(March 2026), where malicious `.pth` files exfiltrated credentials and
installed K8s backdoors. The toolset grew from that incident into the
broader check + audit + update + status surface above.
## Aguara MCP
[Aguara MCP](https://github.com/garagon/aguara-mcp) is an MCP server that gives AI agents the ability to scan skills and configurations for security threats — before installing or running them. It imports Aguara as a Go library — one `go install`, no external binary needed.
```bash
# Install and register with Claude Code
go install github.com/garagon/aguara-mcp@latest
claude mcp add aguara -- aguara-mcp
```
Your agent gets 4 tools: `scan_content`, `check_mcp_config`, `list_rules`, and `explain_rule`. No network, no LLM, millisecond scans — the agent checks first, then decides.
## Aguara Watch
Aguara Watch is being reworked. The previous public observatory is stale, so we are not using it as a product signal for this release. The scanner, CLI, Docker image, and signed release artifacts remain the supported surfaces for v0.17.0.
## Go Library
Aguara exposes a public Go API for embedding the scanner in other tools. [Aguara MCP](https://github.com/garagon/aguara-mcp) uses this API.
```go
import "github.com/garagon/aguara"

// Scan a directory
result, err := aguara.Scan(ctx, "./skills/")

// Scan inline content (no disk I/O, NFKC-normalized).
// result and err are reused below, so later calls assign with "=".
result, err = aguara.ScanContent(ctx, content, "skill.md")

// Scan with tool context for false-positive reduction
result, err = aguara.ScanContentAs(ctx, content, "skill.md", "Edit")
// result.Verdict: aguara.VerdictClean, VerdictFlag, or VerdictBlock
// result.ToolName: "Edit"
// result.Findings: always preserved (even when verdict is clean)

// Scan with a profile
result, err = aguara.ScanContent(ctx, content, "skill.md",
	aguara.WithToolName("Edit"),
	aguara.WithScanProfile(aguara.ProfileContentAware),
)
// result.RiskScore: 0-100 aggregate risk score

// Preserve cross-rule findings (for verdict pipelines)
result, err = aguara.ScanContent(ctx, content, "skill.md",
	aguara.WithDeduplicateMode(aguara.DeduplicateSameRuleOnly),
)

// Enable rug-pull detection with persistent state
result, err = aguara.ScanContent(ctx, content, "tool.md",
	aguara.WithStateDir("/var/lib/myapp/aguara-state"),
)

// Discover all MCP client configs on the machine
discovered, err := aguara.Discover()
for _, client := range discovered.Clients {
	fmt.Printf("%s: %d servers\n", client.Client, len(client.Servers))
}

// List rules, optionally filtered
rules := aguara.ListRules(aguara.WithCategory("prompt-injection"))

// Get rule details with remediation
detail, err := aguara.ExplainRule("PROMPT_INJECTION_001")
fmt.Println(detail.Remediation)
```
Options: `WithMinSeverity()`, `WithDisabledRules()`, `WithCustomRules()`, `WithRuleOverrides()`, `WithWorkers()`, `WithIgnorePatterns()`, `WithMaxFileSize()`, `WithCategory()`, `WithToolName()`, `WithScanProfile()`, `WithDeduplicateMode()`, `WithStateDir()`.
## Architecture
```
aguara.go              Public API: Scan, ScanContent, ScanContentAs, Discover, ListRules, ExplainRule
options.go             Functional options (WithToolName, WithStateDir, WithDeduplicateMode, ...)
discover/              MCP client discovery: 17 clients, config parsers, auto-detection
cmd/aguara/            CLI entry point (Cobra)
cmd/wasm/              WASM build for browser-based scanning
internal/
  engine/
    pattern/           Pattern matcher: Aho-Corasick + regex, 8 decoders (base64, hex, URL, Unicode, HTML, hex-escape, base32, octal-escape)
    ci/                CI Trust: .github/workflows/ YAML parser, pwn-request / cache / OIDC / persisted-credentials chains
    pkgmeta/           PkgMeta: package.json parser, npm lifecycle / git source / publish-surface chains
    jsrisk/            JSRisk: .js / .mjs / .cjs scanner, obfuscation / daemonization / CI-secret-harvest / runner-pivot / agent-persistence
    nlp/               NLP: markdown AST + JSON/YAML string extraction, proximity-weighted classifier
    toxicflow/         Taint: single-file taint tracking + cross-file correlation across directories
    rugpull/           Rug-pull: SHA256 change detection (CLI --monitor, library WithStateDir)
  rules/               Rule engine: YAML loader, compiler, self-tester
    builtin/           193 embedded rules across 13 YAML files (go:embed)
  scanner/             Orchestrator: file discovery, parallel analysis, inline ignore, result aggregation
    exemptions.go      Tool exemptions, scan profiles, verdict computation
  meta/                Post-processing: configurable dedup, scoring, risk score, correlation, confidence
  output/              Formatters: terminal (ANSI), JSON, SARIF, Markdown
  config/              .aguara.yml loader (supports tool-scoped rules)
  incident/            Incident response: compromised package detection, cleanup, quarantine
  state/               Persistence for rug-pull detection (CLI and library mode)
  types/               Shared types (Finding, Severity, ScanResult, Verdict, DeduplicateMode)
```
## Comparison
Aguara is purpose-built for AI agent content. General-purpose SAST tools target application source code, not the skill files, tool descriptions, and MCP configs that agents consume.
| Feature | Aguara | Semgrep | Snyk Code | CodeQL |
|---------|--------|---------|-----------|--------|
| AI agent skill scanning | Yes | No | No | No |
| MCP config analysis | Yes | No | No | No |
| Prompt injection detection | Yes (18 rules + NLP) | No | No | No |
| Rug-pull detection | Yes | No | No | No |
| Supply chain exfil detection | Yes (11 rules) | No | No | No |
| Incident response (check/clean) | Yes | No | No | No |
| Taint tracking for skills | Yes | Yes | Yes | Yes |
| Offline / no account | Yes | Partial | No | Partial |
| Custom YAML rules | Yes | Yes | No | No |
| SARIF output | Yes | Yes | Yes | Yes |
| Free & open source | Yes (Apache 2.0) | Partial | No | Partial |
Aguara complements traditional SAST - use Semgrep for your app code, Aguara for your agent skills and MCP servers.
## Contributing
Contributions are welcome! Please see [CONTRIBUTING.md](CONTRIBUTING.md) for development setup, adding rules, and the PR process.
For security vulnerabilities, see [SECURITY.md](SECURITY.md).
## License
[Apache License 2.0](LICENSE)