{"id":50873675,"url":"https://github.com/hinanohart/weightlock","last_synced_at":"2026-06-15T07:31:17.746Z","repository":{"id":361432101,"uuid":"1250390486","full_name":"hinanohart/weightlock","owner":"hinanohart","description":"AI Asset Compliance Gate — classify model-weight licenses (commercial-use / derivatives / gating / CONFLICT) and fail CI closed on non-commercial or unverifiable assets. pip + CLI, CPU-only. Not legal advice.","archived":false,"fork":false,"pushed_at":"2026-06-10T12:09:06.000Z","size":83,"stargazers_count":0,"open_issues_count":0,"forks_count":0,"subscribers_count":0,"default_branch":"main","last_synced_at":"2026-06-10T12:22:55.470Z","etag":null,"topics":["ai-bom","ci","compliance","huggingface","license-compliance","mlops","model-weights","supply-chain"],"latest_commit_sha":null,"homepage":null,"language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"mit","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/hinanohart.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null,"zenodo":null,"notice":"NOTICE","maintainers":null,"copyright":null,"agents":null,"dco":null,"cla":null}},"created_at":"2026-05-26T15:26:58.000Z","updated_at":"2026-06-10T12:09:30.000Z","dependencies_parsed_at":null,"dependency_job_id":null,"html_url":"https://github.com/hinanohart/weightlock","commit_stats":null,"previous_names":["hinanohart/weightlock"],"tags_count":1,"template":false,"template_full_name":null,"purl":"pkg:github/hinanohart/weightlock","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/hinanohart%2Fweightlock","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/hinanohart%2Fweightlock/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/hinanohart%2Fweightlock/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/hinanohart%2Fweightlock/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/hinanohart","download_url":"https://codeload.github.com/hinanohart/weightlock/tar.gz/refs/heads/main","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/hinanohart%2Fweightlock/sbom","scorecard":null,"host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":286080680,"owners_count":34353189,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2026-05-26T15:22:16.424Z","status":"online","status_checked_at":"2026-06-15T02:00:07.085Z","response_time":63,"last_error":null,"robots_txt_status":"success","robots_txt_updated_at":"2025-07-24T06:49:26.215Z","robots_txt_url":"https://github.com/robots.txt","online":true,"can_crawl_api":true,"host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["ai-bom","ci","compliance","huggingface","license-compliance","mlops","model-weights","supply-chain"],"created_at":"2026-06-15T07:31:16.647Z","updated_at":"2026-06-15T07:31:17.730Z","avatar_url":"https://github.com/hinanohart.png","language":"Python","funding_links":[],"categories":[],"sub_categories":[],"readme":"# weightlock\n\n**AI Asset Compliance Gate** — classify the commercial-use, derivative, gating\nand *conflict* status of model weights, and fail your CI **closed** on\nnon-commercial or unverifiable assets.\n\n\u003e ⚠️ **weightlock is NOT legal advice.** It is a best-effort engineering aid for\n\u003e spotting license risk early. Every verdict carries a `source_url` and a\n\u003e `confidence` level — final licensing decisions belong to your legal team.\n\n---\n\n## Why\n\n\"Open weights\" is not \"open source.\" Between 2024 and 2026 the model ecosystem\nfilled with both genuinely commercial-friendly weights (MIT / Apache-2.0) and\nlandmines: CC-BY-NC checkpoints, RAIL behavioral-use licenses, community\nlicenses with monthly-active-user caps (Llama, Qwen), and gated repos. SPDX was\nbuilt for source code and cannot express \"non-commercial weights\", \"use-based\nrestrictions\", or \"gated\". Today most teams check this by hand.\n\nweightlock turns that check into one command with a non-zero exit code, so a\nnon-commercial or unverifiable model can't slip into a commercial pipeline\nunnoticed.\n\nIndependent motivation for this gap:\n- *New Tools are Needed for Tracking Adoption and Adaptation of ML Models with Behavioral Use Clauses* — [arXiv:2505.22287](https://arxiv.org/abs/2505.22287)\n- *Permissive-Washing* (95.8% of permissively-labeled models lack full license text) — [arXiv:2602.08816](https://arxiv.org/abs/2602.08816)\n\n---\n\n## Install\n\n```bash\npip install weightlock          # core (CPU-only, no GPU, no heavy deps)\npip install \"weightlock[rich]\"  # prettier tables\n```\n\n---\n\n## Quickstart\n\n```bash\n# Check one HuggingFace repo\nweightlock check meta-llama/Llama-3.1-8B-Instruct\n\n# In CI: fail the job if any asset is not unconditionally commercial-usable\nweightlock check $(cat models.txt) --context commercial\n\n# Machine-readable output\nweightlock check facebook/musicgen-large --format json\n```\n\n### Exit codes (for `\u0026\u0026` chaining in CI)\n\n| code | meaning |\n|------|---------|\n| `0`  | all assets pass the policy |\n| `1`  | at least one asset violates the policy (the gate did its job) |\n| `2`  | an asset could not be resolved — **fail-closed** |\n| `3`  | invalid configuration |\n\n### Policy flags\n\n| flag | effect |\n|------|--------|\n| `--fail-on nc,gated,unknown,conflict` | which conditions fail the gate (this is the default) |\n| `--strict` | fail on `nc,unknown,conflict` (ignores gating) |\n| `--allow-unknown` | do not fail on unverifiable assets (opt out of fail-closed for `unknown`) |\n| `--context commercial` | treat *restricted* and *prohibited* commercial use as violations |\n| `--format table\\|json` | output format |\n\n`nc` means **not unconditionally commercial-usable**: `commercial_use` is\n`prohibited` or `restricted`, or outputs are non-commercial. \"Restricted\"\n(Llama \u003e700M-MAU, Gemma ToU, RAIL behavioral) counts, because for most orgs it\nis not a clean commercial \"yes\".\n\n---\n\n## How it works\n\n### What it classifies (6 axes + status)\n\n`commercial_use`, `derivatives`, `redistribution`, `gating`,\n`output_restriction`, `attribution` — plus an independent `status`:\n\n- **`ok`** — sources agree.\n- **`conflict`** — the host's declared license, the license body, and/or the\n  curated seed DB disagree. weightlock adopts the *more restrictive* value and\n  flags it. This is how it catches **permissive-washing** (a repo tagged\n  `apache-2.0` whose actual LICENSE body says \"non-commercial\"). `conflict`\n  fails the gate by default.\n- **`unknown`** — nothing could be verified; fail-closed.\n\n### Verdict resolution pipeline\n\n1. **HuggingFace Hub metadata** (declared license, gated) — primary, always current.\n2. **Curated seed DB overlay** — an *authoritative overlay* of ~20 high-authority\n   entries (Llama / Gemma / RAIL / CC-NC / gated families) where the tag alone\n   misleads. On disagreement → `conflict`. Not a substitute for the primary lookup.\n3. **License body / model-card text** — declared-vs-body cross-check.\n4. Nothing resolved → `unknown` (fail-closed).\n\n---\n\n## Architecture\n\n\u003cdiv align=\"center\"\u003e\n  \u003cimg src=\"docs/architecture.png\" alt=\"weightlock architecture\" width=\"840\"\u003e\n\u003c/div\u003e\n\n---\n\n## Honest limitations (v0.1.0a1, alpha)\n\n- **Not legal advice.** Best-effort, conservative, fail-closed.\n- Covers **directly named models** only. Transitive `base_model` scanning, a\n  GitHub Action wrapper and SARIF output are planned for v0.1.1.\n- `gating` is a **snapshot at fetch time**, not continuous monitoring.\n- The seed DB favors authority over coverage; unrecognized licenses return\n  `unknown` and fail-closed rather than guessing.\n- Dataset license classification and CycloneDX ML-BOM export are v0.2 scope.\n- License-body parsing is regex-based and conservative; it reports a conflict\n  only on unambiguous signals.\n- Seed-vs-host `conflict` is keyed on the **commercial-use axis** (the gate's\n  headline judgment). A disagreement on `derivatives` or `redistribution` alone\n  is still merged to the stricter value, but is not raised as a `conflict` until\n  v0.2.\n\n---\n\n## How weightlock differs\n\n- **vs. JFrog Curation** — JFrog can block models by policy in CI, but it is a\n  commercial product with Artifactory lock-in. weightlock is OSS, `pip`-installable\n  and CPU-only.\n- **vs. license-scanning tools** (ScanCode, ORT, licensecheck) — those target\n  *source code* licenses; they don't produce a commercial-use judgment for model\n  weights or a `--fail-on` CI gate.\n\n---\n\n## Development\n\n```bash\nuv venv \u0026\u0026 uv pip install -e \".[dev,rich]\"\nuv run pytest            # unit tests (no network)\nuv run pytest -m network # optional live HuggingFace Hub smoke test\nuv run ruff check .\n```\n\n---\n\n## License\n\nMIT. See [LICENSE](LICENSE).\n\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fhinanohart%2Fweightlock","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fhinanohart%2Fweightlock","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fhinanohart%2Fweightlock/lists"}