{"id":50454008,"url":"https://github.com/mizcausevic-dev/procurement-pulse-engine","last_synced_at":"2026-06-01T01:05:33.078Z","repository":{"id":360813139,"uuid":"1245936921","full_name":"mizcausevic-dev/procurement-pulse-engine","owner":"mizcausevic-dev","description":"The crawl + aggregate engine behind the AI Procurement Pulse. Probes a universe of vendor domains for the 11 Kinetic Gain Protocol Suite documents and produces the quarterly issue dataset. Issue #1: the zero baseline.","archived":false,"fork":false,"pushed_at":"2026-05-28T00:10:18.000Z","size":32,"stargazers_count":0,"open_issues_count":0,"forks_count":0,"subscribers_count":0,"default_branch":"main","last_synced_at":"2026-05-28T02:16:11.977Z","etag":null,"topics":["ai-governance","ai-procurement-pulse","crawler","data-journalism","javascript","kinetic-gain-protocol-suite","procurement","research","well-known"],"latest_commit_sha":null,"homepage":"https://pulse.kineticgain.com/","language":"JavaScript","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"mit","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/mizcausevic-dev.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null,"zenodo":null,"notice":null,"maintainers":null,"copyright":null,"agents":null,"dco":null,"cla":null}},"created_at":"2026-05-21T17:53:52.000Z","updated_at":"2026-05-28T00:10:23.000Z","dependencies_parsed_at":null,"dependency_job_id":null,"html_url":"https://github.com/mizcausevic-dev/procurement-pulse-engine","commit_stats":null,"previous_names":["mizcausevic-dev/procurement-pulse-engine"],"tags_count":null,"template":false,"template_full_name":null,"purl":"pkg:github/mizcau
sevic-dev/procurement-pulse-engine","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/mizcausevic-dev%2Fprocurement-pulse-engine","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/mizcausevic-dev%2Fprocurement-pulse-engine/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/mizcausevic-dev%2Fprocurement-pulse-engine/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/mizcausevic-dev%2Fprocurement-pulse-engine/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/mizcausevic-dev","download_url":"https://codeload.github.com/mizcausevic-dev/procurement-pulse-engine/tar.gz/refs/heads/main","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/mizcausevic-dev%2Fprocurement-pulse-engine/sbom","scorecard":null,"host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":286080680,"owners_count":33755379,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2026-05-26T15:22:16.424Z","status":"online","status_checked_at":"2026-05-31T02:00:06.040Z","response_time":95,"last_error":null,"robots_txt_status":"success","robots_txt_updated_at":"2025-07-24T06:49:26.215Z","robots_txt_url":"https://github.com/robots.txt","online":true,"can_crawl_api":true,"host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["ai-governance","ai-procurement-pulse","crawler","data-journalism","javascript","kinetic-gain-protocol-suite","procurement","research","well-known"],"created_at":"2026-06-01T01:05:33.010Z","updated_at":"2026-06-01T01:05:33.067Z","avatar_url":"https://github.com/mizcausevic-dev.png","language":"JavaS
cript","funding_links":[],"categories":[],"sub_categories":[],"readme":"# procurement-pulse-engine\n\n\u003e The crawl + aggregate engine behind the [AI Procurement Pulse](https://pulse.kineticgain.com/). Probes a universe of vendor domains for the eleven [Kinetic Gain Protocol Suite](https://github.com/mizcausevic-dev/kinetic-gain-protocol-suite) documents and produces the quarterly issue dataset.\n\n```bash\nnode src/run.mjs --issue issue-1 --concurrency 8 --timeout 6000\n# → data/issue-1.json  (the publishable aggregate)\n# → data/issue-1-raw.json  (per-domain ProbeResults)\n```\n\n## Status — [v0.4.0](https://github.com/mizcausevic-dev/procurement-pulse-engine/releases/tag/v0.4.0) (2026-05-28)\n\n- **Universe:** 1,408 unique domains across 49 verticals. Pre-Issue #5 expansion (2026-05-28) added 408 buyer-readable vendors across ten under-represented verticals: Customer Data Platform (25), eCommerce Platform (37), AI Coding \u0026 Developer AI (31), AI Agent Platform (36), AI Data Labeling (32), MarTech (63), Customer Service Platform (30), Document \u0026 eSignature (24), Mid-Market HR/Payroll (48), Vertical SaaS (82). Smoke-tested via Cloudflare DNS — 4 placeholders removed, 6 typos corrected (e.g. `weights-and-biases.com` → `wandb.com`, `mindbody.com` → `mindbodyonline.com`, `stack.com` → `constructconnect.com`). Previous footprint: 1,007 / 38 verticals; 37 at Issue #1. **Locked baseline snapshot** at [`data/baseline-2026-05-28.json`](data/baseline-2026-05-28.json) (1,007 domains, frozen — August Issue #5 measures the +408 expansion against this baseline so the original 1,007 stay comparable issue-to-issue). Snapshot: 1 publisher (KG), 0.10% rate, verifiedRate 1.0.\n- **Quarterly cadence pre-registered.** [`.github/workflows/quarterly-crawl.yml`](.github/workflows/quarterly-crawl.yml) auto-fires Aug/Nov/Feb/May 15 at 14:00 UTC, against universe.csv at HEAD, with drift vs the locked `data/issue-4-v04-full.json` baseline. 
The August fire is Issue #5.\n- **Per-spec discriminator on all 11 Suite paths.** Each spec now requires its canonical `*_version` field (`agent_card_version`, `incident_card_version`, etc.). Closes the Gatsby/SPA-catchall false positive surfaced in Issue #4 — `corporate.charter.com` dropped 82/100 → 0/100; canonical publisher `kineticgain.com` still 100/100.\n- **First verified signing posture.** All 11 kineticgain.com dogfooded `/.well-known/` docs are ed25519-signed against the public key at [`kineticgain.com/.well-known/pulse-signing.json`](https://kineticgain.com/.well-known/pulse-signing.json). Engine probe reports `{ verified: 11, unsigned: 0, invalid: 0 }`.\n- **Engine tests:** 15/15 pass on `main`.\n\n\u003e The sections below preserve the Issue #1 baseline narrative and the journey through Issues #2–#4. Issue #5 (the first true quarterly delta, August 2026) inherits the v0.4 discriminator + signing posture by default.\n\n## What it does\n\n1. Reads [`universe.csv`](universe.csv) — `domain,vertical,note`, the set of vendors measured this issue.\n2. Probes each domain's eleven well-known paths in parallel (bounded concurrency) using the vendored [`well-known-probe`](https://github.com/mizcausevic-dev/well-known-probe-js) core.\n3. Aggregates into headline rate, per-vertical rollups, per-spec adoption, and a leaderboard.\n4. Writes the issue dataset that [`procurement-pulse-landing`](https://github.com/mizcausevic-dev/procurement-pulse-landing) renders.\n\n## The Issue #1 baseline finding\n\nThe first real crawl (37 domains across 9 verticals — AI Platform, EdTech, HealthTech, FinTech, Enterprise SaaS, Data, Observability, Identity, plus the Kinetic Gain reference properties) returned:\n\n\u003e **0.0% publication rate. Zero domains — including kineticgain.com's own properties — publish any Suite document at `/.well-known/` yet.**\n\nThis is the honest starting line. The Suite shipped in 2025; adoption begins from zero. 
Issue #1 is **\"The Zero Baseline\"** — the instrument is calibrated, the universe is defined, and every future issue measures the climb. (Methodology note: an empty `index.json` counts as published; a `200` of the wrong shape does not. The zero is real, not a probe artifact — verified against the discriminator-bearing specs too.)\n\n## Usage\n\n```bash\nnpm run crawl -- --issue issue-1                  # full universe.csv\nnode src/run.mjs --issue issue-1 --limit 5        # first 5 domains (smoke test)\nnode src/run.mjs --issue issue-2 --concurrency 12 # next issue, more parallelism\n\n# diff two issues (aggregate deltas; add --from-raw/--to-raw for per-domain movers)\nnpm run drift -- --from data/issue-1.json --to data/issue-2.json \\\n  --from-raw data/issue-1-raw.json --to-raw data/issue-2-raw.json\n```\n\n| Flag | Default | Meaning |\n| --- | --- | --- |\n| `--issue` | `issue-1` | Output filename stem (`data/\u003cissue\u003e.json`) |\n| `--concurrency` | `8` | Parallel domain probes |\n| `--timeout` | `6000` | Per-fetch timeout (ms) |\n| `--limit` | `0` (all) | Cap the universe (for smoke tests) |\n\n## Output shape (`data/\u003cissue\u003e.json`)\n\n```jsonc\n{\n  \"issue\": \"issue-2\",\n  \"generatedAt\": \"2026-05-24T…\",\n  \"universe\": { \"total\": 350, \"verticals\": 18 },\n  \"headline\": { \"domainsPublishingAny\": 1, \"publicationRate\": 0.0029, \"avgScore\": 0.3 },\n  \"signatures\": { \"foundDocs\": 11, \"verifiedDocs\": 0, \"invalidDocs\": 0, \"verifiedRate\": 0 },\n  \"byVertical\": { \"AI Platform\": { \"domains\": 40, \"publicationRate\": 0, \"avgScore\": 0 }, … },\n  \"bySpec\": { \"aeo\": { \"label\": \"AEO Protocol\", \"publishers\": 1, \"rate\": 0.0029, \"verified\": 0 }, … },\n  \"leaderboard\": [ { \"domain\": \"kineticgain.com\", \"score\": 100, \"tier\": \"comprehensive\", \"published\": 11 }, … ]\n}\n```\n\n## Signature verification (ed25519)\n\nEach found document is checked for an ed25519 signature using Node's built-in `crypto` 
(no dependencies). A signed Suite document carries a top-level `signature` block:\n\n```jsonc\n\"signature\": {\n  \"algorithm\": \"ed25519\",\n  \"public_key\": \"\u003cbase64 SPKI-DER\u003e\",\n  \"signing_key_url\": \"https://vendor.example/.well-known/keys/2026.json\", // optional\n  \"value\": \"\u003cbase64 ed25519 signature\u003e\"\n}\n```\n\nThe signature covers the **canonical** serialization of the document with its own `signature` block removed (recursively key-sorted JSON, array order preserved). Each document resolves to one of three states, surfaced per-document and rolled up into `signatures` / `bySpec[].verified`:\n\n| Status | Meaning |\n| --- | --- |\n| `verified` | Signature present and verifies (tamper-evident). |\n| `unsigned` | No `signature` block. |\n| `invalid` | Signature present but fails verification, or malformed. |\n\nBy default the embedded `public_key` is used, which provides tamper-evidence only. Setting the `verifyKeyFetch` option (the `--verify-key-fetch` semantics) instead trusts the key fetched from `signing_key_url`, which adds provenance. This mirrors [`hash-attestation-rs`](https://github.com/mizcausevic-dev/hash-attestation-rs).\n\n## Drift (`src/drift.mjs`)\n\n`driftAggregate(prev, curr)` diffs two committed issue datasets (headline / per-vertical / per-spec / signature-rate deltas, and flags `new` / `dropped` verticals). `driftDomains(prevRaw, currRaw)` diffs the per-domain raw arrays to find movers: domains that newly publish, that stopped publishing, or whose score or per-spec results changed. The CLI writes `data/\u003cto\u003e-drift.json` and prints a human summary.\n\n## Issue markdown (`src/summarize.mjs`)\n\n`summarize.mjs` renders a Pulse-issue markdown body by filling token placeholders in [`docs/issues/ISSUE_TEMPLATE.md`](docs/issues/ISSUE_TEMPLATE.md) with the issue's aggregate JSON and (optional) drift JSON. 
Output goes to `docs/issues/\u003cstem\u003e.md`, ready to paste into a GitHub Issue or publish via pulse.kineticgain.com.\n\n```bash\nnode src/summarize.mjs --issue issue-2026-08 --baseline issue-4-v04-full --issue-number 5\n```\n\nThe quarterly-crawl workflow invokes the summarizer automatically after drift, so the scheduled August / November / February / May firings produce **JSON + ready-to-publish markdown** in a single commit. The staged preview at [`docs/issues/issue-5-staged-draft.md`](docs/issues/issue-5-staged-draft.md) shows what Issue #5 will look like — generated from the 2026-05-28 snapshot vs the locked v0.4 baseline as a pipeline dry-run.\n\n## Crawl etiquette\n\nGET-only requests to public, designed-to-be-fetched `/.well-known/` paths. Bounded concurrency, per-request timeout, `redirect: follow`. No authentication, no headless browser, no scraping of page content. The universe is published alongside each issue so the run is reproducible by anyone.\n\n## Roadmap\n\n- **Expand the universe** further toward ~1,200 domains. 
Issue #2 grew the lens from 37 → **350 domains across 18 verticals**; the next pass deepens toward Fortune 500 + top 100 K-12 EdTech + top 50 HealthTech AI.\n- ✅ **Signature verification** — ed25519 over the canonical document hash (`src/signature.mjs`).\n- ✅ **Drift tracking** — `src/drift.mjs` diffs each issue against the previous.\n- **Scheduled GH Action** that runs the crawl quarterly and commits the dataset, which triggers the landing-site rebuild.\n- **Sign the dogfooded docs** — sign kineticgain.com's own Suite documents so they flip from `unsigned` to `verified`.\n\n## Composes with\n\n| Repo | Relationship |\n| --- | --- |\n| [`well-known-probe-js`](https://github.com/mizcausevic-dev/well-known-probe-js) | The probe core (vendored into `src/probe.mjs`) |\n| [`procurement-pulse-landing`](https://github.com/mizcausevic-dev/procurement-pulse-landing) | Renders this engine's dataset at pulse.kineticgain.com |\n| [`kinetic-gain-protocol-suite`](https://github.com/mizcausevic-dev/kinetic-gain-protocol-suite) | The eleven specs measured |\n| [`aeo-crawler`](https://github.com/mizcausevic-dev/aeo-crawler) | Heavyweight crawler; this is the Pulse-specific lightweight aggregator |\n\n## License\n\nMIT.\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fmizcausevic-dev%2Fprocurement-pulse-engine","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fmizcausevic-dev%2Fprocurement-pulse-engine","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fmizcausevic-dev%2Fprocurement-pulse-engine/lists"}