https://github.com/wan-huiyan/dashboard-audit-toolkit

Synchronous Claude Code toolkit for auditing live data dashboards end-to-end — parallel cluster agents methodology + the most common fix-shape + single-metric depth audit + the squash-merge gotcha. Sister to wan-huiyan/overnight-workflows.
https://github.com/wan-huiyan/dashboard-audit-toolkit

claude-code claude-code-plugin claude-code-skill dashboard-audit data-dashboard fact-checking language-audit parallel-agents

Last synced: about 1 month ago
JSON representation

Host: GitHub
URL: https://github.com/wan-huiyan/dashboard-audit-toolkit
Owner: wan-huiyan
License: mit
Created: 2026-05-08T09:52:42.000Z (3 months ago)
Default Branch: main
Last Pushed: 2026-05-15T22:09:28.000Z (2 months ago)
Last Synced: 2026-05-28T17:27:09.008Z (about 2 months ago)
Topics: claude-code, claude-code-plugin, claude-code-skill, dashboard-audit, data-dashboard, fact-checking, language-audit, parallel-agents
Size: 24.4 KB
Stars: 0
Watchers: 0
Forks: 0
Open Issues: 0
Metadata Files:
- Readme: README.md
- License: LICENSE

Awesome Lists containing this project

README

# Dashboard Audit Toolkit

A synchronous toolkit of four [Claude Code](https://claude.com/claude-code) skills for **auditing a live data dashboard end-to-end and shipping the fixes** in a single ~30–45-minute round, without overnight runs or autonomous loops.

[![license](https://img.shields.io/github/license/wan-huiyan/dashboard-audit-toolkit)](LICENSE)
[![last commit](https://img.shields.io/github/last-commit/wan-huiyan/dashboard-audit-toolkit)](https://github.com/wan-huiyan/dashboard-audit-toolkit/commits)
[![Claude Code](https://img.shields.io/badge/Claude_Code-plugin-orange)](https://claude.com/claude-code)

## The four plugins

| Plugin | Role |
|---|---|
| [**whole-dashboard-factcheck-via-parallel-cluster-agents**](plugins/whole-dashboard-factcheck-via-parallel-cluster-agents/) | The methodology spine. Cluster a 6+ page dashboard by audience-purpose (3–4 clusters), dispatch one research subagent per cluster, synthesise into a single audit report with TL;DR + per-page detail + "what's solid" credibility anchor. |
| [**template-hardcoded-literal-vs-existing-payload**](plugins/template-hardcoded-literal-vs-existing-payload/) | The most common fix-shape produced by the audit: the page renders `~930` while the backend payload (which has been correct for months) says `94`. Diagnostic + zero-data-pipeline-change fix. |
| [**dashboard-metric-label-vs-sql-definition**](plugins/dashboard-metric-label-vs-sql-definition/) | Single-metric depth audit. When a stakeholder pushes back on what a KPI "means," read the SQL — not the rendered label, the variable name, or the column alias. |
| [**gh-squash-merge-closes-only-one-issue**](plugins/gh-squash-merge-closes-only-one-issue/) | Operational gotcha when running the audit→issues→fix-bundles flow at scale: GitHub squash-merge auto-closes only ONE issue per PR. Path A prevents; Path B recovers. |

These skills are designed as a set. The methodology spine generates the audit; the fix-shape skill resolves most of what it finds; the depth audit handles single-metric pushbacks; and the GitHub gotcha makes shipping the fixes mechanical.

## When to reach for this toolkit

- You have a **live data dashboard** (BigQuery, Postgres, JSON cache, baked-payload table) with **6+ user-facing pages** and a stated voice/persona doc.
- A **leadership review, client demo, or Phase 2 hand-over** is approaching, and you want confidence the rendered numbers match the source database AND the language lands for the stated audiences.
- You can sit synchronously for **30–45 minutes** of audit + a few hours of follow-up fixes — this is NOT an overnight workflow.

If your situation is different, see **Related repos** below.

## Why use these

A naive "review this dashboard" prompt produces a generic-sounding bullet list that misses the load-bearing problems. The most common failure modes:

- **Hardcoded literals next to correct payloads.** The template renders `~930 students` next to a baked payload that says `cohort_size=94`. Counselors export "930 to call" and Salesforce shows 94. Silent. No error. No broken build.
- **Display labels diverge from SQL definitions.** "Active Students" sounds like "students with recent activity" but the SQL is `COUNT(*) FROM predictions WHERE scoring_date = MAX(scoring_date) AND term_suppressed IS NULL` — i.e. "rows in today's scoring run, post-suppression." Two different things.
- **Audit shipped, but issues stay open.** PR squash-merges with `Closes #447, #448, #449, #450`. Only `#447` closes. Three confused minutes later, you realise GitHub's parser binds one keyword to one issue.
- **Single-agent shallow coverage.** Asking one agent to cover 14 pages produces shallow output and runs out of context. Asking page-by-page synchronously takes hours.

These plugins encode the patterns that fix each of these — extracted from real audits that caught real cohort-size and language errors before they reached real stakeholders.

## Core patterns (shared across the toolkit)

### 1. Cluster split by audience-purpose

Don't cluster by URL prefix. Cluster by **what each page is for** — typically `Headline / executive`, `Operational / action surfaces`, `Library / methodology / explorer`. Three clusters is the sweet spot for 12–16 pages; four for larger surfaces.

### 2. Diagnostic table schema

The load-bearing artefact is a per-page table:

| Element | Source | Claimed | Actual (DB) | Verdict |
|---|---|---|---|---|
| ... | payload-derived / hardcoded / mock-data / route-computed | what page renders | what BQ says | ✅ / ⚠️ / ❌ |

This schema makes mismatches scannable AND makes "what to fix first" mechanical. Reused across all four plugins.

### 3. Audit → issues → fix-bundles flow

Ship the audit doc as **PR-1 (report-only, docs label)** so it lands fast. File **one GitHub issue per P0 item** with labels at creation. Bundle fixes into **2–4 PRs** by *kind* (wiring fixes, jargon cleanup, copy edits) — not one mega-PR and not N mini-PRs. Each fix PR closes its bundle of issues with one keyword per issue (`Closes #X. Closes #Y.` — never `Closes #X, #Y`).

### 4. Fix in the consumer, not the producer

The most common fact-check finding produced by this audit is **the consumer never connected to a correct producer**. Don't reach for the bake / data pipeline / source-of-truth table. Add a `{{ variable }}` accessor in the template, thread it through the route handler, and ship in 30 LoC.

### 5. Read the SQL, not the label

When a stakeholder pushes back on what a KPI means, open the query. Quote the aggregation (`COUNT(*)`, `SUM`) and the **complete** `WHERE` clause. Both shape the meaning. Note every UI toggle that silently re-parametrises the count.

### 6. "What's solid" credibility anchor

Pure-negative reports erode trust in the audit's calibration. Always include a "What's solid" section ("baked KPIs match BQ exactly", "diagnostics floor logic correct", "AI Insights numbers are arithmetically right — only the labels are wrong") so the negative finds land credibly.

## Installation

```bash
# Add the marketplace
/plugin marketplace add wan-huiyan/dashboard-audit-toolkit

# Install one or all four plugins
/plugin install whole-dashboard-factcheck-via-parallel-cluster-agents@wan-huiyan-dashboard-audit-toolkit
/plugin install template-hardcoded-literal-vs-existing-payload@wan-huiyan-dashboard-audit-toolkit
/plugin install dashboard-metric-label-vs-sql-definition@wan-huiyan-dashboard-audit-toolkit
/plugin install gh-squash-merge-closes-only-one-issue@wan-huiyan-dashboard-audit-toolkit
```

Or clone directly:

```bash
git clone https://github.com/wan-huiyan/dashboard-audit-toolkit.git
cp -R dashboard-audit-toolkit/plugins/whole-dashboard-factcheck-via-parallel-cluster-agents ~/.claude/skills/
cp -R dashboard-audit-toolkit/plugins/template-hardcoded-literal-vs-existing-payload ~/.claude/skills/
cp -R dashboard-audit-toolkit/plugins/dashboard-metric-label-vs-sql-definition ~/.claude/skills/
cp -R dashboard-audit-toolkit/plugins/gh-squash-merge-closes-only-one-issue ~/.claude/skills/
```

## Quick start

> "Review this dashboard end-to-end. Fact-check the rendered numbers against the BigQuery payload table, and audit the language for `PRODUCT.md`'s two stated audiences. Three clusters: headline / operational / library. Report only — I'll triage what to fix."

The methodology-spine plugin will:

1. Map the user-facing routes + payload table + source-of-truth tables.
2. Cluster pages by audience-purpose (3–4 clusters).
3. Dispatch one research subagent per cluster in parallel — same prompt template, same diagnostic-table output schema.
4. Synthesise into a single Markdown audit report at `docs/reviews/__factcheck_language_audit.md`.

Once you triage, the other three plugins handle most of what the audit surfaces:

- Most fact-check P0s → `template-hardcoded-literal-vs-existing-payload`
- Stakeholder pushback on what a metric "means" → `dashboard-metric-label-vs-sql-definition`
- Issues that won't close after fix-PRs merge → `gh-squash-merge-closes-only-one-issue`

## Related repos

This toolkit is for **synchronous** audits — one ~30–45 min round of parallel cluster agents, with you in the loop. For different shapes:

- **Autonomous overnight runs** — [`wan-huiyan/overnight-workflows`](https://github.com/wan-huiyan/overnight-workflows). Two sister plugins:
- `overnight-review-client-delivery` — you already have a client deliverable that needs polishing + quality-gating before a morning hand-off. 8-agent review panel.
- `overnight-insight-discovery` — generate a client-facing insight brief from scratch, surfacing funnel leaks and surprise patterns from data.
- **Per-instance SHAP waterfall + post-hoc calibrator** — [`wan-huiyan/shap-waterfall-calibrator-skill`](https://github.com/wan-huiyan/shap-waterfall-calibrator-skill). One of the more common per-instance explainability bugs an audit will surface: the chart's Final bar disagrees with the score chip beside it, and the "rescale to fit" fix introduces sign-flip artifacts on a different slice of instances. The correct fix (point-by-point calibration of the cumulative path) and the SHAP-maintainer caveats.
- **Single-page sanitize for client send** — `client-readiness-pass-on-rendered-html` (separate skill, not in this toolkit). Different shape: one page (or a small subset) inside an internal-flavored generated site, sanitised so it can be sent directly to a non-technical client without rebuilding the generator.
- **Plan / requirements docs** — `document-review` (separate skill). Persona-agent review of plans/requirements rather than cluster agents over a live product.
- **Adversarial debate on a design** — `agent-review-panel` (separate skill). Multi-agent debate with a Supreme Judge.

## License

MIT — see [LICENSE](LICENSE).

ecosyste.ms

Data

Tools

Indexes

Applications

Experiments

Awesome

https://github.com/wan-huiyan/dashboard-audit-toolkit

Awesome Lists containing this project

README