{"id":51193819,"url":"https://github.com/bokuweb/monomi","last_synced_at":"2026-06-27T18:02:49.834Z","repository":{"id":358221344,"uuid":"1240335734","full_name":"bokuweb/monomi","owner":"bokuweb","description":null,"archived":false,"fork":false,"pushed_at":"2026-05-25T10:53:48.000Z","size":469,"stargazers_count":1,"open_issues_count":1,"forks_count":0,"subscribers_count":0,"default_branch":"main","last_synced_at":"2026-06-04T00:41:45.150Z","etag":null,"topics":[],"latest_commit_sha":null,"homepage":null,"language":"Rust","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":null,"status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/bokuweb.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":null,"code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null,"zenodo":null,"notice":null,"maintainers":null,"copyright":null,"agents":null,"dco":null,"cla":null}},"created_at":"2026-05-16T02:51:16.000Z","updated_at":"2026-05-25T10:53:15.000Z","dependencies_parsed_at":null,"dependency_job_id":null,"html_url":"https://github.com/bokuweb/monomi","commit_stats":null,"previous_names":["bokuweb/monomi"],"tags_count":0,"template":false,"template_full_name":null,"purl":"pkg:github/bokuweb/monomi","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/bokuweb%2Fmonomi","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/bokuweb%2Fmonomi/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/bokuweb%2Fmonomi/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/bokuweb%2Fmonomi/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/bokuweb","download_url":"https://codeload.github.com/
bokuweb/monomi/tar.gz/refs/heads/main","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/bokuweb%2Fmonomi/sbom","scorecard":null,"host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":286080680,"owners_count":34862627,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2026-05-26T15:22:16.424Z","status":"online","status_checked_at":"2026-06-27T02:00:06.362Z","response_time":126,"last_error":null,"robots_txt_status":"success","robots_txt_updated_at":"2025-07-24T06:49:26.215Z","robots_txt_url":"https://github.com/robots.txt","online":true,"can_crawl_api":true,"host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":[],"created_at":"2026-06-27T18:02:49.105Z","updated_at":"2026-06-27T18:02:49.826Z","avatar_url":"https://github.com/bokuweb.png","language":"Rust","funding_links":[],"categories":[],"sub_categories":[],"readme":"# monomi 物見\n\n\u003e **monomi** (物見): *a scout*. Runs ahead of the line that\n\u003e [**sakimori**](https://github.com/bokuweb/sakimori) (防人) holds —\n\u003e looking over published packages before they reach a developer's\n\u003e machine.\n\nA two-stage supply-chain analyzer for **npm**, **crates.io**,\n**PyPI**, and **NuGet**. Stage 1 is fast deterministic Rust rules over the\npublished tarball; Stage 2 escalates ambiguous cases to an LLM\n(Claude / Ollama / LM Studio / OpenAI). 
Verdicts are persisted in\na Cloudflare R2 catalog keyed by tarball integrity hash, so\nconsumers (primarily sakimori's HTTPS proxy) can answer\n\"is this artifact safe to admit?\" with a single object-store GET.\n\nSee [plan.md](plan.md) for the design rationale,\n[architecture.md](architecture.md) for the concrete crate layout,\nand [docs/deploy.md](docs/deploy.md) for a copy-pasteable production\ndeployment recipe (R2 + systemd + rclone).\n\n## Status\n\n✅ Working end-to-end for npm + cargo + PyPI + NuGet:\n\n- Stage 1 deterministic rules (50 rules)\n- Stage 2 LLM adjudicator with fail-open semantics\n- Content-addressed R2 catalog layout (writes to filesystem;\n  `rclone`/`aws s3 sync` to R2)\n- npm `_changes` continuous feed daemon\n- Ecosystem-agnostic backfill\n\n## Install\n\n```bash\ncargo install --git https://github.com/bokuweb/monomi monomi-cli\n# Provides the `monomi` binary.\n```\n\n## CLI overview\n\n```text\nmonomi scan \u003ctarball\u003e                       # local .tgz or .crate\nmonomi scan-npm \u003cname\u003e@\u003cversion\u003e            # fetch from npm + scan\nmonomi scan-cargo \u003cname\u003e@\u003cversion\u003e          # fetch from crates.io + scan\nmonomi scan-pypi \u003cname\u003e==\u003cversion\u003e          # fetch from PyPI sdist + scan\nmonomi scan-nuget \u003cid\u003e@\u003cversion\u003e            # fetch from NuGet .nupkg + scan\nmonomi feed --catalog-dir \u003cdir\u003e             # daemon: subscribe to npm _changes\nmonomi backfill \u003clist\u003e --ecosystem npm|cargo|pypi|nuget --catalog-dir \u003cdir\u003e\nmonomi publish \u003cverdict.json\u003e --catalog-dir \u003cdir\u003e\nmonomi lookup \u003cname\u003e@\u003cversion\u003e --catalog-dir \u003cdir\u003e   # local catalog\nmonomi lookup \u003cname\u003e@\u003cversion\u003e --catalog-url \u003curl\u003e   # HTTP/R2 catalog\nmonomi diff \u003cname\u003e@\u003cv1\u003e \u003cname\u003e@\u003cv2\u003e                  # capability + finding delta\n```\n\nAdd 
`--stage1-only` to any command to skip the LLM step.\n\n## One-shot scan\n\n```bash\n$ monomi scan-npm left-pad@1.3.0 --stage1-only\n{\n  \"artifact\": {\n    \"ecosystem\": \"npm\",\n    \"name\": \"left-pad\",\n    \"version\": \"1.3.0\",\n    \"integrity\": { \"algo\": \"sha512\", \"digest_b64\": \"XI5MPzVNAp…\" }\n  },\n  \"stage1\": { \"findings\": [], \"score\": 0, \"verdict\": \"clean\" },\n  \"final_verdict\": { \"status\": \"clean\", \"confidence\": 0.9 }\n}\n# Exit code: 0 clean/warn, 1 block, 2 error\n```\n\n## Stage 2 — LLM adjudication\n\nmonomi auto-detects which LLM provider to use from environment vars\n(precedence: `ANTHROPIC_API_KEY` → `OLLAMA_HOST` → `OPENAI_API_KEY`).\nForce a specific provider with `--llm`:\n\n```bash\n# Anthropic (default if ANTHROPIC_API_KEY is set)\nexport ANTHROPIC_API_KEY=sk-ant-...\nmonomi scan-npm some-pkg@1.0.0\n\n# Local Ollama\nexport OLLAMA_HOST=http://localhost:11434\nmonomi --llm ollama --llm-model llama3.1 scan-npm some-pkg@1.0.0\n\n# Any OpenAI-compatible endpoint (vLLM, LM Studio, etc.)\nmonomi --llm openai \\\n       --llm-base-url http://localhost:1234/v1 \\\n       --llm-model qwen2.5-coder \\\n       scan-npm some-pkg@1.0.0\n```\n\nStage 2 is only invoked when Stage 1 finds something genuinely\nambiguous; clean and decisively-malicious cases short-circuit.\n\n## Building the catalog (R2)\n\n`monomi` writes verdicts to a directory in the canonical R2 layout.\nThe intended workflow is to write locally and mirror to R2:\n\n```bash\n# 1. Warm-start from a list of packages\necho -e \"left-pad\\nlodash\\nexpress\" \\\n  | monomi backfill - --ecosystem npm --catalog-dir ./catalog\n\n# 2. Run the npm change-stream daemon for ongoing updates\nmonomi feed --catalog-dir ./catalog --max-concurrent 8 \u0026\n\n# 3. 
Mirror to R2 (any S3-compatible target works)\nrclone sync ./catalog r2:my-bucket/ --transfers=16\n# OR\naws s3 sync ./catalog s3://my-bucket/ --endpoint-url https://\u003cacct\u003e.r2.cloudflarestorage.com\n```\n\nThe catalog layout:\n\n```text\ncatalog/\n├── verdicts/by-integrity/\u003calgo\u003e/\u003caa\u003e/\u003crest\u003e.json   # primary lookup\n├── verdicts/\u003ceco\u003e/\u003cname\u003e/\u003cversion\u003e.json            # convenience pointer\n├── index/latest.jsonl                              # rolling feed\n└── feed-state.json                                 # _changes cursor\n```\n\n## Reading from the catalog\n\n```bash\n# Local\nmonomi lookup left-pad@1.3.0 --catalog-dir ./catalog\n\n# Public R2 / any HTTP base URL\nmonomi lookup left-pad@1.3.0 --catalog-url https://catalog.example.com\n```\n\nLibrary consumers (such as sakimori's proxy) use `monomi-catalog`'s\n`HttpCatalogReader` directly:\n\n```rust\nuse monomi_catalog::{CatalogReader, HttpCatalogReader};\n\nlet reader = HttpCatalogReader::new(\"https://catalog.example.com\");\nif let Some(v) = reader.lookup_by_integrity(\u0026integrity).await? {\n    if v.final_verdict.status == Status::Block {\n        return forbidden_403();\n    }\n}\n```\n\n## Rules\n\nStage 1 ships with the rules listed below. Decisive Critical findings cause an\nimmediate **block** verdict; everything else defers to Stage 2.\n\n| ID         | Severity | Decisive? 
| What it catches |\n|------------|----------|-----------|-----------------|\n| `NPM001`   | Info     | no        | Any install lifecycle script |\n| `NPM002`   | High     | defer     | Lifecycle uses `child_process`/net |\n| `NPM004`   | High     | defer     | `process.env` bulk enumeration |\n| `NPM005`   | Critical | yes       | Large base64/hex blob + `eval` |\n| `NPM009`   | High     | defer     | Undeclared native binary |\n| `NPM010`   | Critical | yes       | Crypto-wallet drainer literal (Exodus / MetaMask ext-id / `wallet.dat` …) |\n| `NPM011`   | Critical | yes (lifecycle) / defer (source) | CI / registry token theft (`NPM_TOKEN`, `GITHUB_TOKEN`, `AWS_*`) |\n| `NPM012`   | High     | defer     | `bundleDependencies` declared (hides deps from `npm audit`) |\n| `NPM013`   | High     | defer     | Dynamic `require()` / `import()` with non-literal arg |\n| `NPM014`   | High     | defer     | Typosquat candidate (edit distance ≤ 2 to popular name + recent publish) |\n| `NPM015`   | Critical | yes       | Encoded `http` byte sequence (URL hidden as `[104, 116, 116, 112, …]`) |\n| `NPM016`   | Med/High | defer     | Publish-recency: brand-new package OR fresh version on an established package |\n| `NPM017`   | Critical | yes (lifecycle) / defer (source) | Fetch from `raw.githubusercontent.com` / Gist / GitLab raw / codeload |\n| `NPM018`   | Critical | yes       | Self-deleting payload (`fs.unlinkSync(__filename)`) |\n| `NPM019`   | Critical | yes       | `curl ... 
\\| sh` / `eval $(curl ...)` in install-time script |\n| `NPM020`   | Critical | yes       | `eval(atob(...))` / `new Function(atob(...))()` chain |\n| `NPM021`   | High     | defer     | Tarball ships files outside the `files` allow-list |\n| `NPM023`   | High     | defer     | Install-time outbound HTTP/fetch (`https.get` / `axios.get` / …) |\n| **`NPM022`** \\* | Critical/High | bidi=yes / zw=defer | Trojan Source bidi override / zero-width / mixed-script identifier |\n| **`NPM024`** \\* | Critical | yes       | Crypto-miner indicators (`stratum+tcp://`, known pools, CoinHive, XMR addr) |\n| **`NPM025`** | High | defer | DNS exfil: `dns.lookup(secret + '.attacker.com')` shape |\n| **`NPM026`** | Critical | yes | Executable payload smuggled into README / LICENSE / CHANGELOG |\n| **`NPM027`** | Medium | defer | publish-time hook (`prepublishOnly` etc) contains network/shell — possible CI compromise |\n| **`NPM028`** | High/Med | defer | Time-bomb activation — `Date.now() \u003e \u003cfuture epoch\u003e` or literal future date |\n| **`NPM030`** | Critical/High | mixed | Capability *newly introduced* vs prior versions (post-Stage1 diff against catalog history; decisive for `SelfDelete` / `CryptoMiner` / `WalletAccess` / `FsWritePersistence` / `RegistryWrite` / `SecretMaterial`, defers otherwise) |\n| **`NPM033`** | High | defer | Cryptocurrency private-key / mnemonic / seed-phrase literal in source (`@solana/web3.js` 2024 hijack shape) |\n| **`NPM034`** | Critical | yes | npm CLI (`npm publish` / `npm token` / `npx`) invoked from a lifecycle script — worm-propagation shape (Shai-Hulud 2024) |\n| **`NPM035`** | High | defer | Linux privesc / recon path literal: `/etc/shadow`, `/proc/self/environ`, `/root/.ssh/`, … |\n| **`NPM036`** | High | defer | `fs.chmodSync(p, 0o755)` / `chmod +x` inside an install-time script — fetch-and-run shape |\n| **`CARGO005`** | High | defer | Proc-macro source uses `std::process::Command` — runs at compile time in every downstream crate 
|\n| **`CARGO006`** | High | defer | Proc-macro source uses `std::fs` / `OpenOptions` — compile-time file IO |\n| **`CARGO007`** | High | defer | Proc-macro source uses `reqwest` / `std::net::TcpStream` — compile-time network |\n| `CARGO001` | Info     | no        | `build.rs` present |\n| `CARGO002` | High     | defer     | build.rs uses `Command::new` etc. |\n| **`CARGO003`** | High | defer     | Crate is a proc-macro (compile-time code in every downstream crate) |\n| **`CARGO004`** | High | defer     | build.rs uses `include_bytes!` / `include_str!` to embed file at compile time |\n| `PYPI001`  | Info     | no        | `setup.py` or non-stdlib build-backend |\n| `PYPI002`  | High     | defer     | setup.py uses `subprocess`/`socket` etc. |\n| `NUGET001` | Info     | no        | `tools/install.ps1` / `init.ps1` present |\n| `NUGET002` | High     | defer     | install.ps1 uses `Invoke-WebRequest` / `IEX` etc. |\n| **`NUGET003`** | High | defer     | Native DLL/EXE in `tools/` alongside install.ps1 |\n| `NPM006` * | Critical | yes       | Hardcoded cloud-metadata IP |\n| `NPM007` * | Critical | yes       | Known exfil endpoint (webhook.site, Discord, etc.) |\n| `NPM008` * | High     | defer     | Sensitive-path literal (~/.ssh/, LaunchAgents, …) |\n\n\\* These literal-pattern rules fire on every ecosystem (the\n`NPM` prefix is historical).\n\n## Architecture in one paragraph\n\n`monomi-core` defines the `Ecosystem` and `Rule` traits.\n`monomi-{npm,cargo,pypi,nuget}` are ecosystem clients (fetch +\nmanifest + walk + lifecycle extraction). `monomi-rules` is the\nshared rule set. `monomi-pipeline` wires Stage 1 + Stage 2 +\nverdict merge. `monomi-llm` is the Stage 2 adjudicator (Anthropic,\nOpenAI-compatible including Ollama / LM Studio, plus a Noop for\noffline mode). `monomi-catalog` is the content-addressed verdict\nstore (`LocalDirCatalog` + `HttpCatalogReader`). 
`monomi-feed` is\nthe npm change-stream daemon + multi-ecosystem backfill.\n`monomi-cli` ties them together.\n\n## Limitations\n\n- npm `_changes` is the only continuous feed (cargo and PyPI have\n  no equivalent push stream; use `backfill` for those).\n- `monomi-catalog` writes to a filesystem only; direct R2 SDK\n  writes are deferred to a future feature gate. `rclone` / `aws s3\n  sync` covers production needs without bundling an AWS SDK.\n- PyPI ecosystem covers sdists only (`.tar.gz`); wheel parsing\n  comes later.\n- proc-macro execution risk for cargo is not modeled per-crate\n  yet (it needs a resolve-graph view).\n- NuGet install scripts (`tools/install.ps1`) only run under legacy\n  `packages.config`; modern `PackageReference` ignores them. The\n  proxy can't tell which consumer will pick up the package, so\n  monomi flags them anyway.\n\n## License\n\nMIT OR Apache-2.0\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fbokuweb%2Fmonomi","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fbokuweb%2Fmonomi","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fbokuweb%2Fmonomi/lists"}