{"id":50216083,"url":"https://github.com/jsleekr/skilldigest","last_synced_at":"2026-05-26T09:03:37.140Z","repository":{"id":351941447,"uuid":"1213138661","full_name":"JSLEEKR/skilldigest","owner":"JSLEEKR","description":"Static analyzer for AI agent skill libraries. Finds dead/bloated/conflicting skills, counts tokens, gates CI. Single Rust binary.","archived":false,"fork":false,"pushed_at":"2026-04-17T07:03:36.000Z","size":133,"stargazers_count":0,"open_issues_count":0,"forks_count":0,"subscribers_count":0,"default_branch":"main","last_synced_at":"2026-04-17T07:28:21.886Z","etag":null,"topics":["ai-agents","claude-code","cli","codex","cursor","rust","sarif","skill-library","static-analysis","tiktoken"],"latest_commit_sha":null,"homepage":null,"language":"Rust","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"mit","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/JSLEEKR.png","metadata":{"files":{"readme":"README.md","changelog":"CHANGELOG.md","contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null,"zenodo":null,"notice":null,"maintainers":null,"copyright":null,"agents":null,"dco":null,"cla":null}},"created_at":"2026-04-17T04:43:22.000Z","updated_at":"2026-04-17T07:03:40.000Z","dependencies_parsed_at":null,"dependency_job_id":null,"html_url":"https://github.com/JSLEEKR/skilldigest","commit_stats":null,"previous_names":["jsleekr/skilldigest"],"tags_count":null,"template":false,"template_full_name":null,"purl":"pkg:github/JSLEEKR/skilldigest","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/JSLEEKR%2Fskilldigest","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/JSLEEKR%2Fskilldigest/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/JSLEEKR%2Fskilldigest/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/JSLEEKR%2Fskilldigest/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/JSLEEKR","download_url":"https://codeload.github.com/JSLEEKR/skilldigest/tar.gz/refs/heads/main","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/JSLEEKR%2Fskilldigest/sbom","scorecard":null,"host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":286080680,"owners_count":33512335,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2026-05-26T03:12:49.672Z","status":"ssl_error","status_checked_at":"2026-05-26T03:12:47.976Z","response_time":63,"last_error":"SSL_read: unexpected eof while reading","robots_txt_status":"success","robots_txt_updated_at":"2025-07-24T06:49:26.215Z","robots_txt_url":"https://github.com/robots.txt","online":false,"can_crawl_api":true,"host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["ai-agents","claude-code","cli","codex","cursor","rust","sarif","skill-library","static-analysis","tiktoken"],"created_at":"2026-05-26T09:03:29.940Z","updated_at":"2026-05-26T09:03:37.133Z","avatar_url":"https://github.com/JSLEEKR.png","language":"Rust","funding_links":[],"categories":[],"sub_categories":[],"readme":"# skilldigest\n\n[![for-the-badge](https://img.shields.io/badge/skilldigest-1.0.0-blue?style=for-the-badge)](https://github.com/JSLEEKR/skilldigest/releases)\n[![language](https://img.shields.io/badge/language-rust-orange?style=for-the-badge)](https://www.rust-lang.org/)\n[![edition](https://img.shields.io/badge/edition-2021-lightgrey?style=for-the-badge)](https://doc.rust-lang.org/edition-guide/rust-2021/index.html)\n[![MSRV](https://img.shields.io/badge/MSRV-1.75-informational?style=for-the-badge)](https://releases.rs/)\n[![license](https://img.shields.io/badge/license-MIT-brightgreen?style=for-the-badge)](./LICENSE)\n[![platform](https://img.shields.io/badge/platform-linux%20%7C%20macOS%20%7C%20windows-purple?style=for-the-badge)](#installation)\n[![status](https://img.shields.io/badge/status-v1-success?style=for-the-badge)](./CHANGELOG.md)\n[![tokenizer](https://img.shields.io/badge/tokenizer-cl100k%20%7C%20o200k%20%7C%20llama3-red?style=for-the-badge)](#tokenizers)\n[![output](https://img.shields.io/badge/output-text%20%7C%20json%20%7C%20sarif%20%7C%20markdown%20%7C%20dot-darkgreen?style=for-the-badge)](#output-formats)\n\n\u003e **skilldigest** is a static analyzer for AI coding-assistant skill libraries\n\u003e (`SKILL.md`, `AGENTS.md`, `.cursorrules`, `CLAUDE.md`, agent plugins, etc.).\n\u003e It walks a directory of skills, measures per-skill token cost with a\n\u003e tiktoken-compatible BPE, builds a reference graph, and reports\n\u003e **dead**, **bloated**, **conflicting**, **stale**, and **cyclic** skills,\n\u003e plus a recommended **loadout** for a given task tag. Single static Rust\n\u003e binary. SARIF output drops straight into GitHub code-scanning.\n\n---\n\n## Table of contents\n\n- [Why this exists](#why-this-exists)\n- [Features](#features)\n- [Installation](#installation)\n- [Quick start](#quick-start)\n- [CLI reference](#cli-reference)\n  - [Global flags](#global-flags)\n  - [`scan`](#scan-subcommand)\n  - [`tokens`](#tokens-subcommand)\n  - [`loadout`](#loadout-subcommand)\n  - [`graph`](#graph-subcommand)\n- [Output formats](#output-formats)\n  - [JSON schema](#json-schema)\n  - [SARIF 2.1.0](#sarif-210)\n  - [Markdown for PR comments](#markdown-for-pr-comments)\n- [Exit codes](#exit-codes)\n- [Configuration file](#configuration-file)\n- [Tokenizers](#tokenizers)\n- [Rule catalogue](#rule-catalogue)\n- [CI integration (GitHub Actions)](#ci-integration-github-actions)\n- [Performance](#performance)\n- [Determinism and reproducibility](#determinism-and-reproducibility)\n- [Security and robustness](#security-and-robustness)\n- [Comparison with other JSLEEKR tools](#comparison-with-other-jsleekr-tools)\n- [Architecture](#architecture)\n- [Development](#development)\n- [Roadmap](#roadmap)\n- [Contributing](#contributing)\n- [License](#license)\n\n---\n\n## Why this exists\n\nAI coding-assistant skill libraries have *exploded* in 2026. A partial list:\n\n| Project | Skills | Stars |\n|---------|-------:|------:|\n| `antigravity-awesome-skills` | 1,400+ | 33,455 |\n| `Vibe-Skills` | 340+ | 1,535 |\n| `claude-skills` | 232+ | 11,401 |\n| `awesome-claude-code` | 190+ | 39,123 |\n| `oh-my-claudecode` | many | 29,372 |\n\nEvery one of them ships as a giant directory of markdown. Nobody knows:\n\n- Which skills are **actually referenced** by an index/manifest and which are dead code?\n- Which skills **exceed the token budget** of the target model?\n- Which skills **contradict each other** (e.g. one says \"MUST use `Bash(jq)`\", another says \"MUST NOT\")?\n- Which skills link to **files that no longer exist**?\n- Given a task tag `refactor-tests`, **which minimal loadout** fits in 10k tokens?\n\n`skilldigest` answers all five. Adjacent tools do not:\n\n- `skillpack` — packages/locks skills, doesn't audit them.\n- `agentlint` — validates agent *config* files (YAML/JSON), not skill *bodies* (markdown).\n- `tokencost` — counts tokens per *prompt*, not per skill-library entry.\n- `rtk` — runtime token *reducer*, not a static analyzer.\n\nskilldigest is the missing piece. One Rust binary, no runtime deps, ships a\nSARIF report your CI already knows how to upload.\n\n## Features\n\n- **Deterministic** — same input → byte-identical output.\n- **Offline-first** — cl100k tokenizer data ships inside the binary.\n- **Fast** — ~1,400 skills in \u003c 2 s on an 8-core laptop (rayon parallel tokenization).\n- **Multi-format** — text, JSON, SARIF 2.1, Markdown (PR comment), GraphViz dot.\n- **Library-format agnostic** — detects `SKILL.md`, `AGENT.md`, `AGENTS.md`, `CLAUDE.md`, `GEMINI.md`, `.cursorrules`, `.cursor/rules/**`, `.claude/skills/**`, `plugin.toml`.\n- **Rule catalogue** — 12 distinct issue classes with SARIF `ruleId`s (`SKILL001`–`SKILL012`).\n- **Robust** — tolerates BOM, CRLF, mixed indent, malformed frontmatter, non-UTF-8 bytes.\n- **Configurable** — `.skilldigest.toml` with per-skill budget overrides and ignore globs.\n- **Zero `unsafe`** — `#![forbid(unsafe_code)]` at the crate root.\n\n## Installation\n\n### From source\n\n```bash\ngit clone https://github.com/JSLEEKR/skilldigest\ncd skilldigest\ncargo build --release\n./target/release/skilldigest --help\n```\n\n### Via `cargo install`\n\n```bash\ncargo install --path .\n# or, once published:\ncargo install skilldigest\n```\n\n### MSRV\n\n`rust-version = \"1.75\"`. Any newer stable toolchain works.\n\n### Platforms\n\nLinux (x86_64, aarch64), macOS (x86_64, aarch64), Windows (x86_64).\nOne static binary per platform. No runtime dependencies.\n\n## Quick start\n\n```bash\n# Audit a skill library\nskilldigest scan ./my-skills\n\n# Token count for a single file\nskilldigest tokens ./my-skills/git/commit/SKILL.md\n\n# Recommend a loadout for the \"refactor\" task tag\nskilldigest loadout ./my-skills --tag refactor --max-tokens 8000\n\n# Emit the skill reference graph as GraphViz dot\nskilldigest graph ./my-skills --format dot | dot -Tsvg \u003e skills.svg\n```\n\n## CLI reference\n\n### Global flags\n\n| Flag | Default | Description |\n|------|---------|-------------|\n| `-f, --format \u003cFORMAT\u003e` | `text` | Output format: `text`, `json`, `sarif`, `markdown`, `dot` |\n| `-o, --output \u003cFILE\u003e` | stdout | Write output to a file |\n| `-t, --tokenizer \u003cNAME\u003e` | `cl100k` | Tokenizer: `cl100k`, `o200k`, `llama3` |\n| `-b, --budget \u003cN\u003e` | `2000` | Per-skill token budget |\n| `--total-budget \u003cN\u003e` | none | Aggregate token budget across the library |\n| `--offline` | off | No-op retained for forward compatibility — skilldigest is always fully offline (tokenizer data is bundled in the binary; no network I/O at scan time) |\n| `--follow-symlinks` | off | Follow symlinks during scan |\n| `--max-file-size \u003cB\u003e` | `1048576` | Skip files larger than this many bytes |\n| `--config \u003cFILE\u003e` | auto | Path to `.skilldigest.toml` |\n| `--no-color` | off | Disable ANSI color in text output |\n| `-v, --verbose` | off | Log to stderr |\n| `-q, --quiet` | off | Suppress non-error output |\n| `--version` | — | Print version and exit |\n| `--help` | — | Print help and exit |\n\n### `scan` subcommand\n\n```\nskilldigest scan \u003cDIR\u003e [OPTIONS]\n```\n\nRuns a full audit. Emits a report in the chosen format. Returns exit-1 when\nany error-severity issue is found.\n\n```bash\nskilldigest scan ./skills\nskilldigest scan ./skills --format json --output report.json\nskilldigest scan ./skills --format sarif --output skills.sarif.json\nskilldigest scan ./skills --budget 3000 --no-color\nskilldigest scan ./skills --fix-hint  # emit rm hints to stderr\n```\n\n### `tokens` subcommand\n\n```\nskilldigest tokens \u003cFILE\u003e [OPTIONS]\n```\n\nCount tokens in a single file.\n\n```bash\nskilldigest tokens ./skills/git/commit/SKILL.md\nskilldigest tokens ./skills/git/commit/SKILL.md --by-section --format json\nskilldigest tokens ./CLAUDE.md --tokenizer o200k\n```\n\n### `loadout` subcommand\n\n```\nskilldigest loadout \u003cDIR\u003e --tag \u003cTAG\u003e [--max-tokens \u003cN\u003e] [OPTIONS]\n```\n\nScore every skill for the tag and greedily select the highest-scoring\nsubset that fits in `--max-tokens`. Ties broken deterministically by skill ID.\n\n```bash\nskilldigest loadout ./skills --tag git --max-tokens 10000\nskilldigest loadout ./skills --tag refactor --max-tokens 5000 --format json\n```\n\n### `graph` subcommand\n\n```\nskilldigest graph \u003cDIR\u003e [OPTIONS]\n```\n\nEmit the skill reference graph.\n\n```bash\nskilldigest graph ./skills --format dot | dot -Tsvg -o graph.svg\nskilldigest graph ./skills --format json\nskilldigest graph ./skills --format markdown   # embedded code-block\n```\n\n## Output formats\n\n### JSON schema\n\nPretty-printed; stable snake_case keys; versioned via `schema_version`.\n\n```json\n{\n  \"schema_version\": \"skilldigest-report/1\",\n  \"tokenizer\": \"cl100k_base\",\n  \"tool_version\": \"1.0.0\",\n  \"scan_root\": \"./skills\",\n  \"total_skills\": 12,\n  \"total_tokens\": 18432,\n  \"budget\": { \"per_skill\": 2000, \"total\": null },\n  \"skills\": [\n    {\n      \"id\": \"git/commit-style\",\n      \"name\": \"commit-style\",\n      \"path\": \"git/commit-style/SKILL.md\",\n      \"tokens\": { \"frontmatter\": 32, \"body\": 814, \"total\": 846 },\n      \"tags\": [\"git\", \"commit\"],\n      \"refs_out\": 2,\n      \"refs_in\": 1,\n      \"issue_kinds\": [\"bloated\"]\n    }\n  ],\n  \"issues\": [\n    {\n      \"kind\": \"dead\",\n      \"severity\": \"warning\",\n      \"skill\": \"legacy/old-thing\",\n      \"message\": \"skill 'legacy/old-thing' is never referenced by any index or other skill\",\n      \"location\": { \"path\": \"legacy/old-thing/SKILL.md\", \"line\": 1, \"column\": 1 },\n      \"related\": []\n    }\n  ],\n  \"loadout\": null\n}\n```\n\n### SARIF 2.1.0\n\nThe SARIF emitter is designed to be accepted by GitHub code-scanning\n(`github/codeql-action/upload-sarif@v3`). Each issue class has its own rule\n(`SKILL001` – `SKILL012`) with stable `id`, `name`, `shortDescription`,\n`fullDescription`, `defaultConfiguration.level`, and `helpUri`.\n\n```bash\nskilldigest scan ./skills --format sarif --output skills.sarif.json\n# …then in your GH Actions workflow:\n#   - uses: github/codeql-action/upload-sarif@v3\n#     with: { sarif_file: skills.sarif.json }\n```\n\n### Markdown for PR comments\n\n```markdown\n### skilldigest report\n**12 skills**, **18,432 tokens** (cl100k_base), **3 issues** (1 error, 2 warning, 0 note)\n\n| Skill | Tokens | Issues |\n|-------|-------:|--------|\n| `git/commit-style` | 846 | bloated |\n| `legacy/old-thing` | 1204 | dead |\n\n#### Issues\n\n- [ERROR] **bloated** `git/commit-style` `git/commit-style/SKILL.md:1` — 846 tokens exceeds budget 500\n- [warn] **dead** `legacy/old-thing` `legacy/old-thing/SKILL.md:1` — skill 'legacy/old-thing' is never referenced\n```\n\n## Exit codes\n\n| Code | Meaning | Typical CI reaction |\n|-----:|---------|---------------------|\n| `0` | Scan completed, no error-severity issues | green build |\n| `1` | Error-severity issues found | fail the build / block merge |\n| `2` | Operational error (bad args, IO, malformed config) | fail the build as infra error |\n\n## Configuration file\n\nDrop a `.skilldigest.toml` at the scan root.\n\n```toml\n# Global token budgets\n[budget]\nper_skill = 2000\ntotal = 40000\n\n# Default tokenizer (CLI flag still wins)\n[tokenizer]\ndefault = \"cl100k\"\n\n# Gitignore-style globs to skip\n[ignore]\nglobs = [\"archive/**\", \"drafts/**\", \"*.bak.md\"]\n\n# Per-skill overrides\n[overrides.\"git/commit-style\"]\nbudget = 3000\n\n[overrides.\"onboarding/company-context\"]\nbudget = 5000\n```\n\nPrecedence (highest wins) — most-specific override beats more-global setting:\n\n1. Frontmatter `budget:` on an individual skill (most specific)\n2. `[overrides]` section in `.skilldigest.toml` (per-skill, by id)\n3. `--budget` CLI flag (sets the global per-skill default for this run)\n4. `[budget] per_skill` config section\n5. Built-in default (2000)\n\nThe same shape applies to the global `[budget] total` cap: `--total-budget`\non the CLI overrides `[budget] total` in the config file. There is no\nper-skill override for the aggregate cap.\n\n## Tokenizers\n\n| Name | Backed by | Offline? | Notes |\n|------|-----------|----------|-------|\n| `cl100k` | `tiktoken-rs::cl100k_base` | Yes (bundled) | GPT-4, Claude-ish. **Default.** |\n| `o200k` | `tiktoken-rs::o200k_base` | Yes (bundled) | GPT-4o. |\n| `llama3` | Deterministic word-piece approximation | Yes (algorithmic) | Within ~10% of real Llama 3 counts on English prose. Useful for *relative* comparisons. |\n\nThe llama3 backend is intentionally an approximation — we do not ship the\nfull HuggingFace `tokenizer.json` (which would require either a network\nfetch or a ~20 MB binary bloat). The approximation is deterministic and\nside-effect free; documented as approximate so downstream tooling knows\nnot to trust it for absolute billing.\n\n## Rule catalogue\n\n| Rule ID | Issue kind | Default severity | Description |\n|---------|-----------|------------------|-------------|\n| `SKILL001` | dead | warning | Skill never referenced by any index or other skill |\n| `SKILL002` | bloated | **error** | Skill exceeds per-skill token budget |\n| `SKILL003` | conflict | **error** | Two skills contain opposing rules about the same subject |\n| `SKILL004` | stale | warning | A link or file reference points to a missing file |\n| `SKILL005` | cycle | **error** | Reference cycle in the skill graph |\n| `SKILL006` | oversize | **error** | File exceeds `--max-file-size` |\n| `SKILL007` | non-utf8 | warning | File contained bytes that could not be decoded as UTF-8 |\n| `SKILL008` | bad-frontmatter | warning | YAML frontmatter failed to parse |\n| `SKILL009` | symlink | note | Symlink skipped (use `--follow-symlinks`) |\n| `SKILL010` | duplicate | **error** | Two files produced the same normalized skill identifier |\n| `SKILL011` | path-escape | warning | Discovered file canonicalised to a path outside the scan root (e.g. via a symlink target) |\n| `SKILL012` | total-bloated | **error** | Aggregate library token cost exceeds `--total-budget` / `[budget] total` |\n\n## CI integration (GitHub Actions)\n\n```yaml\nname: skill-digest\n\non:\n  pull_request:\n    paths:\n      - '.claude/skills/**'\n      - '.cursor/rules/**'\n      - 'AGENTS.md'\n      - 'CLAUDE.md'\n\njobs:\n  skilldigest:\n    runs-on: ubuntu-latest\n    permissions:\n      security-events: write  # required for upload-sarif\n      contents: read\n    steps:\n      - uses: actions/checkout@v4\n\n      - name: Install skilldigest\n        run: |\n          curl -L https://github.com/JSLEEKR/skilldigest/releases/latest/download/skilldigest-linux-amd64 -o /usr/local/bin/skilldigest\n          chmod +x /usr/local/bin/skilldigest\n\n      - name: Run skilldigest (SARIF)\n        run: skilldigest scan . --format sarif --output skills.sarif.json || true\n\n      - name: Upload SARIF to GitHub code-scanning\n        uses: github/codeql-action/upload-sarif@v3\n        with:\n          sarif_file: skills.sarif.json\n          category: skilldigest\n\n      - name: Fail on any error-severity issue\n        run: skilldigest scan . --no-color\n```\n\nOr drop it straight into a PR comment:\n\n```yaml\n      - name: Render Markdown report\n        id: digest\n        run: skilldigest scan . --format markdown \u003e digest.md\n      - name: Comment on PR\n        uses: marocchino/sticky-pull-request-comment@v2\n        with:\n          path: digest.md\n```\n\n## Performance\n\nOn an 8-core x86_64 laptop with warm filesystem cache:\n\n| Library size | Wall time |\n|-------------:|----------:|\n| 20 skills | ~5 ms |\n| 200 skills | ~35 ms |\n| 1,400 skills | \u003c 2 s |\n\nRun the bench yourself:\n\n```bash\ncargo bench --bench bench_scan\ncargo bench --bench bench_tokenize\n```\n\n## Determinism and reproducibility\n\n- All collections sorted before emit.\n- Tokenizer version and schema version are stamped into every JSON/SARIF output.\n- No timestamps anywhere in the output — runs at different times produce byte-identical files.\n- Deterministic tie-breakers in the loadout recommender (integer math, no floats).\n\n```bash\nskilldigest scan ./skills --format json \u003e a.json\nskilldigest scan ./skills --format json \u003e b.json\ndiff -u a.json b.json   # → empty\n```\n\n## Security and robustness\n\n- **`#![forbid(unsafe_code)]`** at the crate root.\n- **File-size cap** (1 MiB default) prevents memory blowup on malicious inputs.\n- **Symlinks skipped by default** — reject path traversal via canonicalization.\n- **UTF-8 strict** on the fast path (`simdutf8`), graceful fallback flags\n  non-UTF-8 files instead of panicking.\n- **No network I/O** at scan time — tokenizer data is bundled inside the binary.\n- **No shell-outs** — no subprocess execution at any point.\n- **Frontmatter YAML** is parsed in a bounded mode with `serde_yaml` and\n  failures produce `bad-frontmatter` issues rather than halting the scan.\n\n## Comparison with other JSLEEKR tools\n\n| Tool | Round | Language | Scope | Unique to skilldigest |\n|------|------:|----------|-------|-----------------------|\n| `skillpack` | R81 | Go | Lockfile + install for skills | Token audit, dead-code detection |\n| `agentlint` | R83 | TypeScript | Validate agent *config* files (JSON/YAML) | Operates on skill *bodies* (markdown) |\n| `tokencost` | R54 | — | Tokens per prompt | Tokens **per skill** + library audit |\n| `mcpbench` | R84 | Go | Benchmark MCP servers | Different category |\n| `ragcheck` | R82 | Python | RAG eval harness | Different category |\n| `agentmem` | — | — | Agent memory persistence | Different category |\n\nTogether, `skillpack` (R81) + `agentlint` (R83) + `skilldigest` (R85) cover\npackaging, config validation, and content analysis of AI-agent skill\nlibraries — three non-overlapping quality gates.\n\n## Architecture\n\n```\n+------------------+\n|  CLI (clap v4)   |\n+---------+--------+\n          |\n          v\n+---------+---------+      +----------------+\n|  Scanner (walkdir)|----\u003e| Parser (md+yaml)|\n+---------+---------+      +-------+--------+\n          |                        |\n          |                        v\n          |                 +------+------+\n          |                 |  Skill AST  |\n          |                 +------+------+\n          |                        |\n          v                        v\n+---------+----------+      +------+---------+\n| Tokenizer pool     |\u003c----\u003e| Graph (petgraph)|\n| (tiktoken-rs)      |      +------+---------+\n+---------+----------+             |\n          |                        v\n          |                 +------+---------+\n          |                 |  Audit rules   |\n          |                 +------+---------+\n          |                        |\n          v                        v\n+-------------------+      +---------------+\n|  Output emitter   |\u003c-----+  Issue list   |\n|  (text/json/sarif/md)  | +---------------+\n+-------------------+\n```\n\nModule layout (`src/`):\n\n| Module | Purpose |\n|--------|---------|\n| `cli.rs` | clap v4 derive, subcommand dispatch |\n| `scan.rs` | directory walk, file classification |\n| `parse.rs` | markdown + frontmatter parser |\n| `model.rs` | core data types |\n| `tokenize.rs` | cl100k / o200k / llama3-approx tokenizers |\n| `graph.rs` | petgraph-backed reference graph |\n| `rules.rs` | bloat / conflict / stale / duplicate / dead detectors |\n| `audit.rs` | orchestration |\n| `loadout.rs` | task-tag loadout recommender |\n| `config.rs` | `.skilldigest.toml` loader |\n| `output/*` | text / json / sarif / markdown / dot renderers |\n| `error.rs` | canonical error type + exit codes |\n\n## Development\n\n```bash\n# Full test suite\ncargo test --all-features\n\n# Clippy — strict, warnings = errors\ncargo clippy --all-targets --all-features -- -D warnings\n\n# Format check\ncargo fmt --check\n\n# Benchmarks\ncargo bench\n```\n\nTest count at v1.0.0: **200+ tests** (unit + integration + doc).\n\n## Roadmap\n\nOut of scope for v1 (tracked for future rounds):\n\n- **LLM-assisted conflict detection** — v1 is structural only.\n- **`--fix` auto-repair** — v1 only emits shell-hints via `--fix-hint`.\n- **VS Code / Cursor extension** — may ship as a separate project.\n- **Integration with `skillpack` lockfile** — cross-reference pinned skill versions.\n- **Language-specific rule packs** — currently tool-detection is hard-coded\n  to Claude-style tool names; a plugin system would allow Cursor/Copilot\n  tool-name dictionaries.\n\n## Contributing\n\n1. Fork the repo.\n2. Create a topic branch (`git checkout -b feat/your-feature`).\n3. Make sure `cargo fmt --check`, `cargo clippy -- -D warnings`,\n   `cargo test --all-features` all pass.\n4. Add tests for any new behavior.\n5. Open a PR with a clear description of the change.\n\nCommit messages loosely follow conventional-commits (`feat:`, `fix:`,\n`docs:`, `refactor:`). The pre-commit checklist is simply the three\ncommands above.\n\n## License\n\nMIT © 2026 JSLEEKR. See [LICENSE](./LICENSE).\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fjsleekr%2Fskilldigest","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fjsleekr%2Fskilldigest","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fjsleekr%2Fskilldigest/lists"}