{"id":50363911,"url":"https://github.com/seasontemple/transmutary","last_synced_at":"2026-06-03T08:01:10.529Z","repository":{"id":361167707,"uuid":"1253348282","full_name":"SeasonTemple/transmutary","owner":"SeasonTemple","description":"Proactive open-source ecosystem intelligence — watch repositories and their dependencies, diagnose changes, and get pushed what matters before it becomes an incident.","archived":false,"fork":false,"pushed_at":"2026-05-30T02:08:46.000Z","size":460,"stargazers_count":0,"open_issues_count":0,"forks_count":0,"subscribers_count":0,"default_branch":"main","last_synced_at":"2026-05-30T03:21:43.869Z","etag":null,"topics":["devops","github","llm","monitoring","observability","prompt-injection","python","rss","supply-chain-security"],"latest_commit_sha":null,"homepage":"","language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"apache-2.0","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/SeasonTemple.png","metadata":{"files":{"readme":"README.md","changelog":"CHANGELOG.md","contributing":"CONTRIBUTING.md","funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null,"zenodo":null,"notice":null,"maintainers":null,"copyright":null,"agents":null,"dco":null,"cla":null}},"created_at":"2026-05-29T11:22:38.000Z","updated_at":"2026-05-30T02:08:48.000Z","dependencies_parsed_at":null,"dependency_job_id":null,"html_url":"https://github.com/SeasonTemple/transmutary","commit_stats":null,"previous_names":["seasontemple/transmutary"],"tags_count":2,"template":false,"template_full_name":null,"purl":"pkg:github/SeasonTemple/transmutary","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/SeasonTemple%2Ftransmutary","tags_url":"https://repos.ecosyste.ms/a
pi/v1/hosts/GitHub/repositories/SeasonTemple%2Ftransmutary/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/SeasonTemple%2Ftransmutary/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/SeasonTemple%2Ftransmutary/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/SeasonTemple","download_url":"https://codeload.github.com/SeasonTemple/transmutary/tar.gz/refs/heads/main","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/SeasonTemple%2Ftransmutary/sbom","scorecard":null,"host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":286080680,"owners_count":33854119,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2026-05-26T15:22:16.424Z","status":"online","status_checked_at":"2026-06-03T02:00:06.370Z","response_time":59,"last_error":null,"robots_txt_status":"success","robots_txt_updated_at":"2025-07-24T06:49:26.215Z","robots_txt_url":"https://github.com/robots.txt","online":true,"can_crawl_api":true,"host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["devops","github","llm","monitoring","observability","prompt-injection","python","rss","supply-chain-security"],"created_at":"2026-05-30T03:01:17.198Z","updated_at":"2026-06-03T08:01:10.522Z","avatar_url":"https://github.com/SeasonTemple.png","language":"Python","funding_links":[],"categories":[],"sub_categories":[],"readme":"\u003cdiv align=\"center\"\u003e\n\n# Transmutary\n\n**Proactive open-source ecosystem intelligence — continuously watch repositories and their dependencies, turn changes into diagnostic reports, and get pushed what matters before it becomes an 
incident.**\n\n\u003cimg src=\"assets/dashboard.png\" alt=\"Transmutary dashboard: watchlist, supply-chain alerts, trend candidates, and recent reports\" width=\"920\" /\u003e\n\n\u003csub\u003eDashboard + offline demo: observe repos, triage supply-chain alerts, promote trend candidates, and preview the full pipeline with zero credentials. \u003ca href=\"#dashboard\"\u003eOpen the dashboard →\u003c/a\u003e\u003c/sub\u003e\n\n[![License: Apache 2.0](https://img.shields.io/badge/License-Apache_2.0-blue.svg)](LICENSE)\n[![Python 3.9+](https://img.shields.io/badge/python-3.9%2B-blue.svg)](https://www.python.org/)\n[![CI](https://github.com/SeasonTemple/transmutary/actions/workflows/ci.yml/badge.svg)](https://github.com/SeasonTemple/transmutary/actions/workflows/ci.yml)\n[![Tests: 440 passing](https://img.shields.io/badge/tests-440_passing-brightgreen.svg)](#tests)\n\n[English](README.md) · [简体中文](README.zh-CN.md) · [Why](#why-transmutary) · [Dashboard](#dashboard) · [Try the demo](#try-the-demo) · [Getting started](#getting-started) · [How it works](#how-it-works) · [Releases](#releases--versioning)\n\n\u003c/div\u003e\n\n---\n\nA repository-observation system for external open-source **ecosystem intelligence**. 
It continuously watches a set of repositories and their dependencies, turns changes into readable diagnostic / explanatory reports, and pushes them to its subscribers — converting *reactive post-incident investigation* into *proactive awareness*.\n\n## Why Transmutary\n\nSubscribers are structurally late to changes in the ecosystem they depend on:\n\n- **Dependency breakage is found after the fact** — an upstream CLI tool changes, your internal gateway starts returning 504s, and someone manually asks an LLM to cross-check the two repos.\n- **No discovery channel for AI trends** — fast-rising tools surface on social media, not in any feed you own.\n- **Slow reaction to supply-chain attacks** — a malicious npm release lands before anyone notices.\n\nTransmutary closes these gaps with a pure-pull pipeline built entirely on free data sources: no webhooks and no paid APIs.\n\n## Observation modes\n\nThe system is **two collection pipelines + one shared delivery layer**, not a unified engine:\n\n- **Mode A · event-driven (watchlist)** — watches specific repos a subscriber maintains or depends on. On any change (release, issue surge, supply-chain advisory) it detects the change, diagnoses its source, and pushes a report by severity.\n- **Mode B · scheduled batch (trend radar)** — periodically scans a defined scope (MVP: AI domain), finds repos with rapidly rising stars, and emits explanatory summaries.\n\nThe two modes diverge only at the collection stage, then share `LLM report → channel delivery (private RSS / email)`. 
A repo discovered by Mode B can be **promoted** into Mode A's watchlist.\n\n## Try the demo\n\nSee the whole pipeline run in one command — **zero credentials, zero network, zero LLM**:\n\n\u003cimg src=\"assets/demo.gif\" alt=\"transmutary-demo: one offline command runs the full pipeline\" width=\"720\" /\u003e\n\n```bash\npip install -e .\ntransmutary-demo\n```\n\nIt feeds the *real* pipeline (`build_runtime` + the three ticks) built-in mock data through an `httpx.MockTransport` and a stub LLM, so nothing leaves the process: no GitHub token, no API key, no outbound HTTP, no model call. One pass produces a release diagnosis, an issue-surge diagnosis (with dependency-edge related context), a supply-chain alert from a deterministic OSV hit, and three trend explanations.\n\nArtifacts land in a fresh temp directory (printed at the top of the run; pass `--out DIR` to choose one) with the exact same private layout the real service writes — `0700` dirs, `0600` files:\n\n```\n\u003cartifact_root\u003e/\n├── octocat__hexbridge-cli/        # per-repo analysis archive (canonical, R24)\n│   └── \u003cts\u003e-diagnose.md\n├── _delivered/\n│   ├── immediate/                 # urgent route: diagnosis + supply-chain alert\n│   └── digest/                    # digest route: trend explanations\n└── _feed/\n    ├── immediate.atom.xml         # private RSS feeds, one per route\n    └── digest.atom.xml\n```\n\nThe run prints the artifact tree plus a couple of rendered-report excerpts so you can read the output it would deliver. Reproduce by copy-pasting the two commands above — no setup.\n\nTo refresh the README terminal GIF after CLI/demo copy changes:\n\n```bash\nvhs assets/demo.tape\n```\n\n`assets/demo.tape` records only the offline `transmutary-demo` command. 
The Web dashboard image (`assets/dashboard.png`) should be refreshed from a browser screenshot so the admin UI is captured as users see it.\n\n## How it works\n\n```\ncollect → clean → dedup → filter → report → deliver\n```\n\n- **Pure-pull architecture** — no webhooks (you can't create webhooks on third-party repos); Atom feeds + incremental REST polling instead.\n- **Clean before LLM** — structured checks (URL/content fingerprint, staleness, reachability) run first; only passing content reaches the LLM for chunk-level relevance filtering.\n- **L1 → L2 → L3 funnel** — cheap keyword/rule gating (L1), then embedding-cosine *semantic grouping* of survivors (L2, representative-linkage, zero-miss), then the expensive LLM-as-judge once per group (L3). Authoritative supply-chain / release signals bypass L2; if embedding is unavailable the funnel degrades to full L3 so nothing is dropped.\n- **Deterministic API, LLM only for semantics** — external APIs run through deterministic code; the LLM only does diagnosis / relevance / summarization. Security verdicts are cross-validated against deterministic OSV/GHSA hits.\n- **Tiered scheduling** — a single resident service with internal cadences: supply-chain (minutes), releases/issues (~10 min), trends (daily).\n- **Security baseline** — untrusted external content is structurally isolated from instructions (prompt-injection defense); credentials live only in env, never persisted; SSRF allowlist with no redirects; private access-controlled artifacts.\n\n## Getting started\n\n### Install\n\n```bash\npython -m venv .venv\n.venv/bin/pip install -e \".[dev]\"\n```\n\n### Configure\n\nCopy the example configs and fill them in. Non-LLM credentials (GitHub token, SMTP, RSS) are read from environment variables only. 
LLM credentials may come from env vars **or** `config/llm.yaml` (0600 permissions).\n\n```bash\ncp config/watchlist.example.yaml   config/watchlist.yaml\ncp config/trend_scope.example.yaml config/trend_scope.yaml\ncp config/delivery.example.yaml    config/delivery.yaml\nexport TRANSMUTARY_GITHUB_TOKEN=...      # read-only\n```\n\n**LLM configuration** — three options (pick one):\n\n```bash\n# Option 1: environment variables\nexport TRANSMUTARY_LLM_API_KEY=...       # any LiteLLM-supported provider\nexport TRANSMUTARY_LLM_BASE_URL=...      # optional: OpenAI/Anthropic-compatible endpoint\n\n# Option 2: interactive CLI wizard\n.venv/bin/transmutary config\n\n# Option 3: Dashboard Settings panel → LLM Configuration\n#           (available after starting the dashboard)\n```\n\n### Verify\n\n```bash\n.venv/bin/python -m pytest -q\n.venv/bin/ruff check src tests\n```\n\n## Configuration\n\n| File | Purpose |\n|---|---|\n| `config/watchlist.yaml` | Mode A repos + manual dependency edges |\n| `config/trend_scope.yaml` | Mode B scope filter (topics + keywords) |\n| `config/delivery.yaml` | DB/artifact paths, digest hour, optional RSS feed dir + SMTP recipients |\n| `config/llm.yaml` | LLM API key + optional base URL (0600, env vars take precedence) |\n\n## Output \u0026 storage\n\nTwo roots, both configured in `delivery.yaml`. 
All reports are private (files `0600`, dirs `0700`) and gitignored — nothing is committed.\n\n```\n\u003cartifact_root\u003e/\n├── \u003cowner\u003e__\u003crepo\u003e/                       # per-repo analysis archive (canonical, R24)\n│   └── \u003cts\u003e-\u003ckind\u003e.md                     #   citation-bearing record of each report\n├── _delivered/\u003croute\u003e/                    # channel-rendered reports\n│   └── \u003cowner\u003e__\u003crepo\u003e-\u003ckind\u003e.md          #   route = immediate (urgent) | digest\n└── _feed/\u003croute\u003e.atom.xml                 # private RSS feed, one per route\n\n\u003cstate_db_path\u003e  state.sqlite3  (SQLite, WAL)\n  event_fingerprint   event dedup (release / advisory / issue-cluster)\n  seen_set            rolling 7-day seen set (artifact diff)\n  issue_baseline      per-repo issue-rate baseline\n  collect_cursor      per-repo incremental since-cursor (survives restart)\n  star_snapshot       Mode B star snapshots (growth rate)\n  subscriber_token    per-subscriber RSS tokens (revoke / expiry)\n  promoted_repo       mode-B candidates promoted into the watchlist (F4)\n```\n\nOne pipeline pass (tick):\n\n```\ncollect (atom + incremental REST)\n  → dedup (event_fingerprint / seen_set)\n  → release → diagnose   |   issues → filter funnel (L1 rules → L2 semantic group → L3 judge) → diagnose\n  → diagnose (LLM + R18 quality gate + OSV/GHSA cross-validation)\n  → archive per-repo artifact  +  deliver (route → _delivered/\u003croute\u003e/ + RSS; email on immediate)\n  → persist state (advance cursor, update baseline, record fingerprints)\n```\n\nRouting is severity-driven: urgent (malware/critical) → `immediate` + email; everything else (or R18-downgraded) → `digest`.\n\n## Promotion (mode-B → mode-A)\n\nPromotion adds a mode-B trend candidate to the effective watchlist so it is observed by mode A. 
Use the `transmutary` CLI:\n\n```bash\ntransmutary promote owner/repo            # add to the watchlist (persisted)\ntransmutary promote owner/repo --source manual\ntransmutary demote owner/repo             # remove\ntransmutary list-watchlist                # config repos + promoted repos, with source\n```\n\nThe effective watchlist is `config watchlist ∪ promoted_repo`. The CLI runs in a separate process and only writes the shared `promoted_repo` table; a running service's periodic **reconcile** job (every 60s) full-syncs its per-repo jobs to the effective watchlist, so a promote/demote takes effect **without restarting** the service. Promotion never touches credentials.\n\n## Deployment\n\nRun the resident service (embedded tiered scheduler) via Docker:\n\n```bash\ndocker pull ghcr.io/seasontemple/transmutary:latest\ncp .env.example .env            # fill credentials (gitignored, never baked into the image)\n# prepare ./config/{watchlist,trend_scope,delivery}.yaml\n#   delivery.yaml: point state_db_path \u0026 artifact_root under /var/lib/transmutary\ndocker compose up -d\n# optional local dashboard/admin UI\ndocker compose --profile dashboard up -d dashboard\n```\n\nThe image runs as a non-root user; credentials come from `.env` at runtime; the state DB and private artifacts persist in the `transmutary-state` volume. The dashboard profile shares the same config mount and state volume, binds `127.0.0.1:8787` by default, and expects `TRANSMUTARY_ADMIN_TOKEN` for the settings UI. For public access, keep it behind HTTPS/auth/rate limiting. Without Docker, run the entrypoints directly: `transmutary-serve` and `transmutary-dashboard` (both read `TRANSMUTARY_CONFIG_DIR`, default `config`).\n\nRelease images are published to GHCR as\n`ghcr.io/seasontemple/transmutary:vX.Y.Z`, `ghcr.io/seasontemple/transmutary:X.Y.Z`,\nand `ghcr.io/seasontemple/transmutary:latest`. 
For a local source build, use\n`docker build -t transmutary:local .` and run compose with\n`TRANSMUTARY_IMAGE=transmutary:local`.\n\n## Dashboard\n\nA local web view over the system's runtime state and archived reports — open it in a browser to see the effective watchlist, recent diagnostic/explanatory reports, supply-chain alerts, trend candidates, per-repo runtime (issue baseline / star snapshots / cursor), and feed links.\n\n```bash\npip install -e \".[dashboard]\"     # adds jinja2 (Starlette/uvicorn are already core)\ntransmutary-dashboard             # serves on http://127.0.0.1:8787\n```\n\nIt reuses the existing Starlette stack and store interfaces. On localhost it can promote/demote Mode B candidates through a server-side confirmation flow; the POST only writes the shared `promoted_repo` table, so the resident service picks it up on the next reconcile pass without restart.\n\nThe Settings area is an authenticated admin control plane for non-secret config: add/remove tracked repos, add manual dependency edges, edit trend topics/keywords, and adjust email recipients / digest hour. It does **not** edit YAML files directly. Effective runtime config is `YAML base ∪ SQLite admin overrides ∪ promoted_repo`; the service reconcile job picks up repo-scope changes without restart. Provider secrets stay env-only: GitHub/SMTP/RSS/LLM credentials are never accepted through the UI, never rendered, and never persisted. The UI only shows configured/missing status.\n\nSecurity posture: binds `127.0.0.1` by default; a non-localhost bind is **refused** unless you pass `--allow-public`; public binds stay **read-only** unless you also pass `--allow-public-writes`. Settings writes require `TRANSMUTARY_ADMIN_TOKEN` login, a signed HttpOnly session cookie, double-submit CSRF, `SameSite=Strict`, Origin/Referer host check, same-origin form action, and server-side validation. 
Public admin use should still sit behind HTTPS, rate limits, and repo allow-lists.\n\nExternal repository content is HTML-escaped (XSS), dangerous source URLs are blanked, and credentials/tokens are never rendered. Filesystem paths (`state_db_path`, `artifact_root`, `feed_dir`) and provider credentials remain YAML/env-owned.\n\nThe UI is a modern sidebar dashboard (stat tiles, Sentry-style issue stream, severity encoded by colour + icon + text for accessibility), with a light/dark theme toggle and an EN/中文 language toggle (both remembered, both rendered server-side on first paint so there is no flash). A per-request CSP nonce keeps the inline theme bootstrap script precisely allow-listed without weakening the policy.\n\n**Agent-native access** — every data endpoint supports content negotiation: request `Accept: application/json` (or `?format=json`) to get the view-model as JSON instead of HTML, with the same credential/token exclusion as the HTML path. `GET /llms.txt` describes the endpoint surface for agents (no private data).\n\n### LAN / intranet access\n\nTo expose the dashboard on your local network (e.g. for team viewing or a reverse proxy on another machine), use the `dashboard-lan` profile:\n\n```bash\ndocker compose --profile dashboard-lan up -d dashboard-lan\n```\n\nThis binds `0.0.0.0:8787` on the host while reusing the same `.env`, config, and `transmutary-state` volume. The `--allow-public` flag is already set (it skips Host-header validation and sets `csrf_secure=False` so cookies work over plain HTTP). All hard gates remain: explicit opt-in, CSRF double-submit, and `TRANSMUTARY_ADMIN_TOKEN` authentication.\n\n**Important:** this mode accepts plaintext HTTP on your LAN. Only run it on a trusted network segment. 
For public internet access, place it behind an HTTPS reverse proxy with authentication and rate limiting (see security posture above).\n\n## Architecture \u0026 docs\n\n- Domain glossary: [`CONTEXT.md`](CONTEXT.md)\n- Requirements (brainstorm): [`docs/brainstorms/`](docs/brainstorms/)\n- Implementation plans: [`docs/plans/`](docs/plans/)\n\n## Releases \u0026 versioning\n\nReleases are version-automated with [python-semantic-release](https://python-semantic-release.readthedocs.io/). Version numbers, the changelog, tags, and GitHub Releases are derived from [Conventional Commits](https://www.conventionalcommits.org/) on `main`:\n\n- `feat:` → minor · `fix:` / `perf:` → patch · `BREAKING CHANGE:` → major.\n\nGitHub Release body is curated, not left to generated notes. Each published tag\nmust have a bilingual note at `docs/release-notes/vX.Y.Z.md` with both `## 中文`\nand `## English`; the release workflow applies that file after publishing.\nRelease assets include the Python wheel/sdist and a GHCR Docker image.\n\nEnable the local commit-message hook once after cloning:\n\n```bash\ngit config core.hooksPath .githooks\ngit config commit.template .gitmessage\n```\n\nSee [`CHANGELOG.md`](CHANGELOG.md) for machine-derived history and\n[`docs/release-workflow.md`](docs/release-workflow.md) for the release-note\npolicy.\n\n## Project\n\n### Status\n\n| Stage | Status |\n|---|---|\n| Requirements + plan | ✅ done (multi-round review) |\n| Phase 0 — shared skeleton (U1-U5, U14) | ✅ done |\n| Phase 1 — Mode A (collect / diagnose / deliver / supply-chain) | ✅ done · F1 real-repo milestone verified |\n| Phase 2 — Mode B (trend radar) | ✅ done |\n| Phase 3 — scheduling wiring (pipeline + service) | ✅ done |\n| Phase B — F4 promotion · deployment · L2 semantic grouping · critique→refine | ✅ done |\n| Offline demo (`transmutary-demo`) | ✅ done |\n| Web dashboard (`transmutary-dashboard`) | ✅ done |\n| Tests | ✅ 440 passing · ruff clean |\n\n### Roadmap\n\nDeferred by design: channel 
interface abstraction, Web secret storage/rotation, multi-user RBAC/OAuth, subscription config, live resident run controls. (The dashboard, admin non-secret settings UI, one-click promote/demote UI, L2 semantic grouping, and the optional critique→refine report pass are implemented — see [Dashboard](#dashboard) and [How it works: optional critique→refine](#how-it-works-optional-critiquerefine-r11).)\n\n### Tests\n\n```bash\n.venv/bin/python -m pytest -q      # 440 passing\n.venv/bin/ruff check src tests     # clean\n```\n\n### Contributing\n\nSee [CONTRIBUTING.md](CONTRIBUTING.md) for dev setup, the Conventional-Commits convention (enforced by `.githooks/commit-msg`), and the automated release process. PRs target `main`.\n\n### License\n\n[Apache-2.0](LICENSE) © SeasonTemple\n\n\n## How it works: optional critique→refine (R11)\n\nBoth diagnostic reports (mode A) and explanatory reports (mode B) support an\n**optional** three-stage quality pass: `synthesize (draft) → critique → refine`,\n**off by default**. It is enabled per-tick via `refine_reports=True` on\n`run_release_issue_tick` / `run_trend_tick`.\n\n- **Synthesize** — produce a single-pass draft (today's behavior).\n- **Critique** — have the model critique its own draft against the evidence\n  (unsupported claims / omissions / logic gaps).\n- **Refine** — rewrite the draft per the critique, without going beyond the\n  supplied evidence.\n\nThe critique/refine **instructions** go in the trusted system slot; the draft,\nthe critique, and the evidence go **only** in the untrusted data slot (injection\nisolation, KTD3). The key guarantee: **the revised text is NOT exempt from any\nsecurity control** — a revised diagnostic draft passes the exact same OSV/GHSA\ncross-validation + verdict sanitization + R18 source gate as the single-pass\ndraft; critique→refine runs only *as part of* draft generation and never bypasses the\ndownstream security pipeline (KTD-C). 
Any LLM failure in either stage degrades to\nthe draft, so a report is never lost to it (KTD-D). With `refine_reports=False`\n(the default), behavior is identical to before.\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fseasontemple%2Ftransmutary","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fseasontemple%2Ftransmutary","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fseasontemple%2Ftransmutary/lists"}