{"id":44235708,"url":"https://github.com/cmdrvl/rvl","last_synced_at":"2026-02-25T00:10:51.672Z","repository":{"id":336034065,"uuid":"1147599636","full_name":"cmdrvl/rvl","owner":"cmdrvl","description":"rvl reveals the smallest set of numeric changes that explain what actually changed between two datasets — or confidently tells you nothing changed.","archived":false,"fork":false,"pushed_at":"2026-02-24T03:17:32.000Z","size":413,"stargazers_count":0,"open_issues_count":0,"forks_count":1,"subscribers_count":0,"default_branch":"main","last_synced_at":"2026-02-24T09:42:43.161Z","etag":null,"topics":["cli","csv","data","data-quality","data-validation","diff","finance","numerical-analysis","open-source","ops","rust","tooling"],"latest_commit_sha":null,"homepage":"https://cmdrvl.com","language":"Rust","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"mit","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/cmdrvl.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null,"zenodo":null,"notice":null,"maintainers":null,"copyright":null,"agents":"AGENTS.md","dco":null,"cla":null}},"created_at":"2026-02-02T01:24:06.000Z","updated_at":"2026-02-24T03:17:36.000Z","dependencies_parsed_at":null,"dependency_job_id":null,"html_url":"https://github.com/cmdrvl/rvl","commit_stats":null,"previous_names":["cmdrvl/rvl"],"tags_count":1,"template":false,"template_full_name":null,"purl":"pkg:github/cmdrvl/rvl","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/cmdrvl%2Frvl","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/cmdrvl%2Frvl/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/cmdrvl%2Frvl/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/cmdrvl%2Frvl/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/cmdrvl","download_url":"https://codeload.github.com/cmdrvl/rvl/tar.gz/refs/heads/main","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/cmdrvl%2Frvl/sbom","scorecard":null,"host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":286080680,"owners_count":29806147,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2026-02-24T22:43:48.403Z","status":"ssl_error","status_checked_at":"2026-02-24T22:43:18.536Z","response_time":75,"last_error":"SSL_connect returned=1 errno=0 peeraddr=140.82.121.5:443 state=error: unexpected eof while reading","robots_txt_status":"success","robots_txt_updated_at":"2025-07-24T06:49:26.215Z","robots_txt_url":"https://github.com/robots.txt","online":false,"can_crawl_api":true,"host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["cli","csv","data","data-quality","data-validation","diff","finance","numerical-analysis","open-source","ops","rust","tooling"],"created_at":"2026-02-10T09:26:55.664Z","updated_at":"2026-02-25T00:10:51.665Z","avatar_url":"https://github.com/cmdrvl.png","language":"Rust","funding_links":[],"categories":[],"sub_categories":[],"readme":"# rvl\n\n\u003cdiv align=\"center\"\u003e\n\n[![CI](https://github.com/cmdrvl/rvl/actions/workflows/ci.yml/badge.svg)](https://github.com/cmdrvl/rvl/actions/workflows/ci.yml)\n[![License: MIT](https://img.shields.io/badge/License-MIT-yellow.svg)](https://opensource.org/licenses/MIT)\n[![GitHub release](https://img.shields.io/github/v/release/cmdrvl/rvl)](https://github.com/cmdrvl/rvl/releases)\n\n**Reveal the smallest set of numeric changes that explain what actually changed.**\n\nNo AI. No inference. Pure deterministic arithmetic.\n\n```bash\nbrew install cmdrvl/tap/rvl\n```\n\n\u003c/div\u003e\n\n---\n\n## TL;DR\n\n**The Problem**: Comparing CSV exports by hand is slow and noisy — Excel hell, brittle scripts, eyeballing numbers. When two files differ, you need to know *what actually changed* and whether it matters.\n\n**The Solution**: One command, one verdict. `rvl` finds the smallest ranked set of numeric deltas that explain the change — or proves nothing changed — using deterministic arithmetic. Never probabilistic. Never ambiguous.\n\n### Why Use rvl?\n\n| Feature | What It Does |\n|---------|--------------|\n| **Ranked explanations** | Finds the fewest cells that account for ≥95% of total numeric change |\n| **Three clear outcomes** | REAL CHANGE, NO REAL CHANGE, or REFUSAL — never a partial answer |\n| **Tolerance-aware** | Ignores floating-point noise below your threshold — no false positives |\n| **Machine-readable** | `--json` output for pipelines, CI gates, and automation |\n| **Zero config** | Auto-detects delimiters, numeric formats, currency symbols, accounting parens |\n| **Deterministic** | Same inputs always produce the same output — no sampling, no heuristics |\n\n---\n\n## Quick Example\n\n```bash\n$ rvl old.csv new.csv --key id\n```\n\n```\nRVL\n\nREAL CHANGE\n\nCompared: old.csv -\u003e new.csv\nAlignment: key=id\nColumns: common=15 old_only=2 new_only=1\nChecked: 4,183 rows, 12 numeric columns (50,196 cells)\nDialect(old): delimiter=, quote=\" escape=none\nDialect(new): delimiter=, quote=\" escape=none\nRanking: abs(delta) (unscaled)\nSettings: threshold=95.0% tolerance=1e-9\n\n3 cells explain 95.2% of total numeric change (threshold 95.0%):\n\n1. NVDA.market_value  +1842100  (123 -\u003e 1842223)\n2. UST10Y.price       -0.37     (4.21 -\u003e 3.84)\n3. EURUSD.fx_rate     +0.0013   (1.0842 -\u003e 1.0855)\n\nEverything else in common numeric columns is \u003c= tolerance or in the tail (not required to reach threshold).\n```\n\nOut of 50,196 cells, **3 cells** explain 95.2% of all numeric change. That's the whole answer.\n\n```bash\n# No change? Proof:\n$ rvl old.csv old_copy.csv\n# → NO REAL CHANGE (exit 0), max delta 7e-10\n\n# Machine-readable:\n$ rvl old.csv new.csv --json | jq '.contributors[0]'\n\n# Exit code only (for scripts):\n$ rvl old.csv new.csv \u003e /dev/null 2\u003e\u00261\n$ echo $?  # 0 = no change, 1 = changed, 2 = refused\n```\n\n---\n\n## The Three Outcomes\n\n`rvl` always produces exactly one of three outcomes. There are no partial results, \"and N more\" buckets, or probabilistic scores.\n\n### 1. REAL CHANGE\n\nPrinted when the top contributors (up to 25) explain ≥ threshold of total numeric change.\n\n```\nRVL\n\nREAL CHANGE\n\nCompared: old.csv -\u003e new.csv\nAlignment: key=id\nColumns: common=15 old_only=2 new_only=1\nChecked: 4,183 rows, 12 numeric columns (50,196 cells)\nDialect(old): delimiter=, quote=\" escape=none\nDialect(new): delimiter=, quote=\" escape=none\nRanking: abs(delta) (unscaled)\nSettings: threshold=95.0% tolerance=1e-9\n\n3 cells explain 95.2% of total numeric change (threshold 95.0%):\n\n1. NVDA.market_value  +1842100  (123 -\u003e 1842223)\n2. UST10Y.price       -0.37     (4.21 -\u003e 3.84)\n3. EURUSD.fx_rate     +0.0013   (1.0842 -\u003e 1.0855)\n\nEverything else in common numeric columns is \u003c= tolerance or in the tail (not required to reach threshold).\n```\n\n**How to read this:**\n- **3 cells explain 95.2%** — only 3 numeric cells (out of 50,196) account for 95.2% of all numeric change.\n- **Contributors** — ranked by `abs(delta)`, largest first. Each shows the cell label (`row_id.column`), signed delta, and old → new values.\n- **Coverage** — cumulative share of total change (L1 distance). rvl prints the smallest prefix of contributors whose cumulative coverage reaches the threshold.\n- **Threshold** — if the top 25 contributors can't reach 95%, rvl refuses (`E_DIFFUSE`) instead of printing a misleading partial list.\n\n### 2. NO REAL CHANGE\n\nPrinted when all numeric deltas are within tolerance.\n\n```\nRVL\n\nNO REAL CHANGE\n\nCompared: old.csv -\u003e new.csv\nAlignment: row-order (no key)\nColumns: common=15 old_only=2 new_only=1\nChecked: 4,183 rows, 12 numeric columns (50,196 cells)\nDialect(old): delimiter=, quote=\" escape=none\nDialect(new): delimiter=, quote=\" escape=none\nRanking: abs(delta) (unscaled)\nSettings: threshold=95.0% tolerance=1e-9\nMax abs delta: 7e-10 (\u003c= tolerance 1e-9).\nNo numeric deltas above tolerance in common numeric columns.\n```\n\n**How to read this:**\n- **Max abs delta** — the largest absolute difference observed across all cells (before tolerance zeroing). Proves nothing slipped through.\n- This is a deterministic guarantee: every common numeric cell was checked.\n\n### 3. REFUSAL\n\nPrinted when rvl cannot produce a deterministic verdict. Always includes a concrete next step.\n\n```\nRVL ERROR (E_KEY_DUP)\n\nCompared: old.csv -\u003e new.csv\nAlignment: key=id\nDialect(old): delimiter=, quote=\" escape=none\nDialect(new): delimiter=, quote=\" escape=none\nSettings: threshold=95.0% tolerance=1e-9\n\nCannot align rows: key \"id\" is not unique in old.csv (first duplicate: \"A123\" at data record 184).\nNext: choose a unique key column or dedupe the data, then rerun.\n```\n\n**How to read this:**\n- **Error code** — machine-stable identifier (e.g., `E_KEY_DUP`). See [Refusal Codes](#refusal-codes).\n- **Example** — first concrete instance of the problem (file, record number, value).\n- **Next** — a concrete rerun command or remediation step. Refusals are operator handoffs, never dead ends.\n\n---\n\n## How It Works\n\n### Alignment\n\n**Row-order mode** (no `--key`): rows align by position. Requires identical non-blank row counts. If rvl detects that rows are shuffled (via key discovery), it refuses with `E_NEED_KEY` and suggests a `--key` to use.\n\n**Key mode** (`--key \u003ccolumn\u003e`): rows align by matching key values. Key values are ASCII-trimmed, must be non-empty and unique within each file, and must match exactly between files. Any violation produces a specific refusal (`E_NO_KEY`, `E_KEY_EMPTY`, `E_KEY_DUP`, `E_KEY_MISMATCH`).\n\n### Numeric Columns\n\nOnly columns present in **both** files are compared. Only numeric columns are diffed. A column is numeric if every aligned row is either missing on both sides or parseable finite numbers on both sides.\n\n**Supported numeric formats:**\n- Plain: `123`, `-123.45`, `1e6`, `-1.2E-3`\n- Thousands separators: `1,234`, `-1,234,567.89` (US-style, 3-digit groups)\n- Currency prefix: `$123.45`, `-$1,234.56`, `$-100`\n- Accounting parentheses: `(123.45)` → parsed as `-123.45`\n- Leading `+` is allowed: `+123`, `+$1,234.56`\n\n**Missing tokens** (case-insensitive): empty string, `-`, `NA`, `N/A`, `NULL`, `NAN`, `NONE`.\n\n### Tolerance\n\nAbsolute noise floor applied per-cell. If `abs(new - old) \u003c= tolerance`, the delta is treated as zero (no contribution). Default: `1e-9`. There is no relative/percentage tolerance in v0.\n\n`max_abs_delta` in the output tracks the largest raw delta observed (before zeroing) for transparency.\n\n### Threshold and Coverage\n\n- **Total change** = sum of all `abs(delta)` values above tolerance (L1 distance across all common numeric cells).\n- **Contribution** = `abs(delta)` for a single cell (after tolerance).\n- **Coverage** = sum of top contributor contributions / total change.\n- **Threshold** (default `0.95`) = minimum coverage required for a REAL CHANGE verdict.\n- **MAX_CONTRIBUTORS** = 25 (hard cap, not configurable in v0).\n\nIf the top 25 contributors can't reach the threshold, rvl refuses with `E_DIFFUSE` rather than printing an incomplete explanation. Lower the threshold explicitly if needed: `--threshold 0.80`.\n\n### Contributor Ranking\n\nContributors are ranked by `abs(delta)` descending (unscaled — large-magnitude columns dominate by design). Ties are broken by row ID ascending, then column name ascending (byte order). rvl prints only the smallest prefix of contributors whose cumulative coverage reaches the threshold.\n\n---\n\n## How rvl Compares\n\n| Capability | rvl | Excel / Sheets | `diff` / `csvdiff` | Custom pandas script |\n|------------|-----|----------------|---------------------|----------------------|\n| Ranked numeric explanation | ✅ Top-K with coverage proof | ❌ Manual | ❌ Row-level only | ⚠️ You write it |\n| Deterministic verdict | ✅ Always | ❌ Human judgment | ⚠️ Text diff only | ⚠️ You write it |\n| Tolerance handling | ✅ Built-in | ❌ Manual rounding | ❌ None | ⚠️ You write it |\n| Refusal on ambiguity | ✅ Never wrong, refuses instead | ❌ Silent errors | ❌ Garbage in/out | ❌ Crashes |\n| Auto-detects delimiters | ✅ | N/A | ❌ | ❌ |\n| Setup time | ✅ One curl command | N/A | ⚠️ Minutes | ❌ Hours |\n| Machine-readable output | ✅ `--json` | ❌ | ⚠️ Text only | ✅ |\n\n**When to use rvl:**\n- Monthly/quarterly reconciliation of CSV exports (holdings, positions, balances)\n- CI gate: did the pipeline output actually change?\n- Audit trail: prove what changed and by how much\n\n**When rvl might not be ideal:**\n- Non-numeric diffs (text columns, schema changes) — use [`shape`](https://github.com/cmdrvl/shape) for structural checks first\n- Files that don't fit in memory\n- Diffs where you need relative (percentage) tolerance — not yet supported in v0\n\n---\n\n## Installation\n\n### Homebrew (Recommended)\n\n```bash\nbrew install cmdrvl/tap/rvl\n```\n\n### Shell Script\n\n```bash\ncurl -fsSL https://raw.githubusercontent.com/cmdrvl/rvl/main/scripts/install.sh | bash\n```\n\n### Windows (PowerShell)\n\n```powershell\nSet-ExecutionPolicy -ExecutionPolicy Bypass -Scope Process -Force; iex ((New-Object System.Net.WebClient).DownloadString('https://raw.githubusercontent.com/cmdrvl/rvl/main/scripts/install.ps1'))\n```\n\n### From Source\n\n```bash\ncargo build --release\n./target/release/rvl --help\n```\n\nPrebuilt binaries are available for x86_64 and ARM64 on Linux, macOS, and Windows (x86_64). Each release includes SHA256 checksums, cosign signatures, and an SBOM.\n\n---\n\n## CLI Reference\n\n```\nrvl \u003cold.csv\u003e \u003cnew.csv\u003e [OPTIONS]\n```\n\n### Flags\n\n| Flag | Type | Default | Description |\n|------|------|---------|-------------|\n| `--key \u003ccolumn\u003e` | string | *(none)* | Align rows by key column value. Without this, rows align by position (1st↔1st, 2nd↔2nd, etc.). |\n| `--threshold \u003cfloat\u003e` | float | `0.95` | Coverage target (0 \u003c x ≤ 1.0). The minimum fraction of total numeric change that the top contributors must explain. |\n| `--tolerance \u003cfloat\u003e` | float | `1e-9` | Per-cell noise floor (x ≥ 0). Absolute deltas ≤ this value are treated as zero. |\n| `--delimiter \u003cdelim\u003e` | string | *(auto-detect)* | Force CSV delimiter for both files. See [Delimiter](#delimiter). |\n| `--capsule-out \u003cdir\u003e` | string | *(disabled)* | Write deterministic replay capsule artifacts (`manifest.json`, `old.csv`, `new.csv`, `output.txt`, `replay.sh`) to `\u003cdir\u003e/capsule-\u003cid\u003e/`. |\n| `--json` | flag | `false` | Emit a single JSON object on stdout instead of human-readable output. |\n\nInvalid `--threshold` or `--tolerance` values are CLI argument errors (exit 2).\n\n### Exit Codes\n\n| Code | Meaning |\n|------|---------|\n| `0` | NO REAL CHANGE |\n| `1` | REAL CHANGE |\n| `2` | REFUSAL or CLI error |\n\n### Output Routing\n\n| Mode | REAL CHANGE | NO REAL CHANGE | REFUSAL |\n|------|-------------|----------------|---------|\n| Human (default) | stdout | stdout | stderr |\n| `--json` | stdout | stdout | stdout |\n\nIn `--json` mode, stderr is reserved for process-level failures only (CLI parse errors, panics).\n\n---\n\n## Delimiter\n\n### Auto-Detection (default)\n\nEach file's delimiter is detected independently by sampling the header plus up to 200 data records (or ~64KB). Candidate delimiters are tried in order: `,` → `\\t` → `;` → `|` → `^`. The candidate with the best score (most records parsed, most consistent field count, most fields) wins.\n\nIf multiple candidates tie and produce different parsed output, rvl refuses with `E_DIALECT`. If they produce identical output, the tie breaks by candidate order (comma first).\n\nIf auto-detection yields only 1 column, rvl refuses with `E_DIALECT` (the file may use an unsupported delimiter).\n\n### `sep=` Directive\n\nIf the first non-blank line of a file is `sep=\u003cchar\u003e` (e.g., `sep=;`), rvl uses that delimiter for the file (unless `--delimiter` overrides it). The `sep=` line is skipped during parsing.\n\n### `--delimiter` (forced)\n\nOverrides both auto-detection and `sep=` directives for **both** files. Accepted values:\n\n| Format | Examples |\n|--------|----------|\n| Named | `comma`, `tab`, `semicolon`, `pipe`, `caret` (case-insensitive) |\n| Hex | `0x09` (tab), `0x1f` (unit separator), `0x2c` (comma) |\n| Single ASCII char | `,`, `\\|`, `;` |\n\nValid range: ASCII `0x01`–`0x7F`, excluding `\"` (`0x22`), `\\r` (`0x0D`), `\\n` (`0x0A`). Invalid values are CLI argument errors (exit 2). Use `tab` or `0x09`, not `\\t` (no escape sequences).\n\n---\n\n## Agent / CI Integration\n\nBoth `rvl` and [`shape`](https://github.com/cmdrvl/shape) are designed to be consumed by agents and pipelines, not just humans.\n\n### Agent workflow: shape → rvl\n\n```bash\n# 1. Structural gate (is comparison even valid?)\nshape old.csv new.csv --key id --json \u003e shape.json\nif [ $? -ne 0 ]; then\n  # INCOMPATIBLE or REFUSAL — read .reasons or .refusal for why\n  jq '.reasons // .refusal' shape.json\n  exit 1\nfi\n\n# 2. Numeric explanation (only if structurally compatible)\nrvl old.csv new.csv --key id --json \u003e rvl.json\n\n# 3. Agent extracts the verdict\noutcome=$(jq -r '.outcome' rvl.json)\nif [ \"$outcome\" = \"REAL_CHANGE\" ]; then\n  jq '.contributors[] | \"\\(.row_id).\\(.column): \\(.delta)\"' rvl.json\nfi\n```\n\n### What makes this agent-friendly\n\n- **Exit codes** — `0`/`1`/`2` map directly to pass/fail/error branching\n- **`--json`** — structured output an agent can parse without regex\n- **Refusals have next steps** — an agent can read `.refusal.code` and decide whether to retry with different flags or escalate\n- **`shape --describe`** — prints the tool's `operator.json` contract so an agent can discover invocation, flags, and exit codes without reading docs\n\n### Capsule replay workflow (agent swarms)\n\nUse capsules when you need a deterministic handoff between agents, CI jobs, or debugging sessions:\n\n```bash\n# 1. Produce the normal verdict and write a replay capsule sidecar\nrvl old.csv new.csv --key id --json --capsule-out ./capsules \u003e run.json\n\n# 2. Inspect generated capsule\nls ./capsules/capsule-*/\n# manifest.json old.csv new.csv output.txt replay.sh\n\n# 3. Re-run exactly from the capsule payload\ncd ./capsules/capsule-\u003cid\u003e\n./replay.sh \u003e replay.json\n```\n\n`manifest.json` includes:\n- original invocation args (`key`, `threshold`, `tolerance`, `delimiter`, `json`)\n- outcome and refusal code (if any)\n- contributor summary for REAL_CHANGE\n- replay command plus artifact hashes for integrity checks\n\nFor troubleshooting, compare `run.json` vs `replay.json` outcome/refusal code first; if they differ, the environment or binary changed.\n\n---\n\n## Scripting Examples\n\nCheck if files changed (exit code only):\n\n```bash\nrvl old.csv new.csv \u003e /dev/null 2\u003e\u00261\necho $?  # 0 = no change, 1 = changed, 2 = refused\n```\n\nExtract top contributor from JSON:\n\n```bash\nrvl old.csv new.csv --json | jq '.contributors[0]'\n```\n\nGet total change magnitude:\n\n```bash\nrvl old.csv new.csv --json | jq '.metrics.total_change'\n```\n\nHandle refusals programmatically:\n\n```bash\nrvl old.csv new.csv --json | jq 'select(.outcome == \"REFUSAL\") | .refusal'\n```\n\nForce a tab-delimited comparison with relaxed threshold:\n\n```bash\nrvl old.tsv new.tsv --delimiter tab --key account_id --threshold 0.80\n```\n\nGate a pipeline (shape before rvl):\n\n```bash\nshape old.csv new.csv --key loan_id --json \u003e shape.json \\\n  \u0026\u0026 rvl old.csv new.csv --key loan_id --json \u003e rvl.json\n```\n\n---\n\n## Refusal Codes\n\nEvery refusal includes the error code, first concrete example, and a `Next:` remediation step.\n\n| Code | Meaning | Next Step |\n|------|---------|-----------|\n| `E_IO` | File read error | Check file path and permissions |\n| `E_ENCODING` | Unsupported encoding (UTF-16/32 BOM or NUL bytes) | Convert/re-export as UTF-8 |\n| `E_CSV_PARSE` | CSV parse failure (invalid quoting/escaping) | Re-export as standard RFC4180 CSV |\n| `E_HEADERS` | Missing header, duplicate headers, or rows wider than header | Fix headers or re-export |\n| `E_DIALECT` | Delimiter ambiguous or undetectable | Use `--delimiter \u003cdelim\u003e` or add `sep=\u003cchar\u003e` to file |\n| `E_NO_KEY` | `--key` column not found in one or both files | Use a column name that exists in both files |\n| `E_KEY_EMPTY` | Empty key value in a non-blank row | Choose a key column with no empty values, or fill missing keys |\n| `E_KEY_DUP` | Duplicate key values within a file | Choose a unique key column or dedupe the data |\n| `E_KEY_MISMATCH` | Key sets differ between files (missing/extra keys) | Export comparable scopes or fix the join key |\n| `E_ROWCOUNT` | Row count mismatch (row-order mode) | Use `--key \u003ccolumn\u003e` for a missing/extra-keys report |\n| `E_NEED_KEY` | Detected row reorder without `--key` | Use `--key \u003csuggested\u003e` (rvl prints candidates) |\n| `E_MIXED_TYPES` | Column has both numeric and non-numeric values | Normalize column values to numeric or exclude the column |\n| `E_NO_NUMERIC` | No numeric columns in common | Ensure both files share at least one numeric column |\n| `E_MISSINGNESS` | Numeric value vs. missing token in aligned cell | Fill missing values or exclude the column |\n| `E_DIFFUSE` | Top 25 contributors can't reach threshold | Use `--threshold 0.80` (or lower) to accept less coverage |\n\n---\n\n## Troubleshooting\n\n### \"E_NEED_KEY\" even though rows look the same\n\nYour rows are in a different order between files. rvl detected this and refuses rather than silently comparing wrong row pairs. Use the `--key` it suggests:\n\n```bash\nrvl old.csv new.csv --key loan_id\n```\n\n### \"E_DIFFUSE\" — can't reach threshold\n\nChanges are spread across too many cells for the top 25 to explain 95%. This usually means a broad recalculation (e.g., FX revaluation). Lower the threshold:\n\n```bash\nrvl old.csv new.csv --threshold 0.80\n```\n\n### \"E_MIXED_TYPES\" on a column that looks numeric\n\nA cell in that column has a value rvl can't parse as a number (check for stray text, #N/A variants not in the missing list, or locale-specific formatting). The error message shows the first offending cell.\n\n### \"E_DIALECT\" — delimiter detection failed\n\nYour file uses an uncommon delimiter or has inconsistent field counts. Force the delimiter:\n\n```bash\nrvl old.csv new.csv --delimiter pipe      # for |\nrvl old.csv new.csv --delimiter 0x09      # for tab\nrvl old.csv new.csv --delimiter semicolon # for ;\n```\n\n### Large files are slow\n\nrvl loads both files into memory. For very large files (millions of rows), ensure sufficient RAM. There is no streaming mode in v0.\n\n---\n\n## Limitations\n\n| Limitation | Detail |\n|------------|--------|\n| **Numeric columns only** | rvl compares numbers. Text column changes are ignored — use `diff` or `shape` for structural checks. |\n| **Absolute tolerance only** | No relative/percentage tolerance in v0. A $0.01 delta on a $1M balance and a $0.01 balance are treated identically. |\n| **MAX_CONTRIBUTORS = 25** | Hard cap, not configurable in v0. If change is spread across \u003e25 cells, rvl refuses (`E_DIFFUSE`). |\n| **In-memory** | Both files are loaded fully into memory. No streaming mode yet. |\n| **Two files only** | No multi-file or directory comparison. |\n| **No column filtering** | All common numeric columns are compared. You can't exclude specific columns in v0. |\n\n---\n\n## FAQ\n\n### Why \"rvl\"?\n\nShort for *reveal*. The tool reveals what actually changed, cutting through the noise.\n\n### Is this just `diff` for CSVs?\n\nNo. `diff` shows you every line that's different. `rvl` tells you *which numeric changes matter* — the smallest set that explains the change. It's an explanation, not a diff.\n\n### What if my files have different columns?\n\nrvl compares only columns present in both files. Extra columns on either side are reported in the header but don't affect the verdict.\n\n### Can I use this in CI/CD?\n\nYes. Exit codes (0/1/2) and `--json` output are designed for automation. Gate on exit code, or parse the JSON for richer assertions.\n\n### What about non-US number formats (e.g., `1.234,56`)?\n\nNot supported in v0. rvl assumes US-style formatting (comma as thousands separator, period as decimal).\n\n### How does rvl relate to shape?\n\n[`shape`](https://github.com/cmdrvl/shape) checks structural compatibility (do columns match? is the key valid?). `rvl` checks numeric content (what changed and by how much?). Run `shape` first to validate structure, then `rvl` to explain changes.\n\n---\n\n\u003cdetails\u003e\n\u003csummary\u003e\u003cstrong\u003eJSON Output Reference\u003c/strong\u003e\u003c/summary\u003e\n\nA single JSON object on stdout. If the process fails before domain evaluation (e.g., invalid CLI args), JSON may not be emitted.\n\n```jsonc\n{\n  \"version\": \"rvl.v0\",\n  \"outcome\": \"REAL_CHANGE\",            // \"REAL_CHANGE\" | \"NO_REAL_CHANGE\" | \"REFUSAL\"\n  \"files\": {\n    \"old\": \"old.csv\",\n    \"new\": \"new.csv\"\n  },\n  \"alignment\": {\n    \"mode\": \"key\",                      // \"key\" | \"row_order\"\n    \"key_column\": \"u8:id\"              // encoded identifier, or null\n  },\n  \"dialect\": {\n    \"old\": { \"delimiter\": \",\", \"quote\": \"\\\"\", \"escape\": null },\n    \"new\": { \"delimiter\": \",\", \"quote\": \"\\\"\", \"escape\": null }\n  },\n  \"threshold\": 0.95,\n  \"tolerance\": 1e-9,\n  \"counts\": {\n    \"rows_old\": 4183,\n    \"rows_new\": 4183,\n    \"rows_aligned\": 4183,\n    \"columns_old\": 17,\n    \"columns_new\": 16,\n    \"columns_common\": 15,\n    \"columns_old_only\": 2,\n    \"columns_new_only\": 1,\n    \"numeric_columns\": 12,\n    \"numeric_cells_checked\": 50196,\n    \"numeric_cells_changed\": 3\n  },\n  \"metrics\": {\n    \"total_change\": 1842100.3713,       // L1 distance (sum of abs deltas above tolerance)\n    \"max_abs_delta\": 1842100.0,         // largest abs(delta) observed (pre-zeroing)\n    \"top_k_coverage\": 0.952             // coverage of top MAX_CONTRIBUTORS\n  },\n  \"limits\": {\n    \"max_contributors\": 25\n  },\n  \"contributors\": [                     // empty unless REAL_CHANGE\n    {\n      \"row_id\": \"u8:NVDA\",\n      \"column\": \"u8:market_value\",\n      \"old\": 123.0,\n      \"new\": 1842223.0,\n      \"delta\": 1842100.0,\n      \"contribution\": 1842100.0,\n      \"share\": 0.9998,                  // contribution / total_change\n      \"cumulative_share\": 0.9998\n    }\n    // ... more contributors, ranked by contribution desc\n  ],\n  \"refusal\": null                       // null unless REFUSAL\n  // When REFUSAL:\n  // \"refusal\": {\n  //   \"code\": \"E_KEY_DUP\",\n  //   \"message\": \"duplicate key values\",\n  //   \"detail\": { \"file\": \"old.csv\", \"key_samples\": [\"A123\"], ... }\n  // }\n}\n```\n\n### Identifier Encoding (JSON)\n\nRow IDs and column names in JSON use unambiguous encoding:\n- `u8:\u003cstring\u003e` — valid UTF-8 with no ASCII control bytes (e.g., `u8:NVDA`, `u8:market_value`)\n- `hex:\u003chex-bytes\u003e` — anything else (e.g., `hex:ff00ab`)\n\nCopy the encoded identifier directly into `--key` to avoid ambiguity.\n\n### Nullable Fields\n\nOn REFUSAL, `counts` and `metrics` fields may be `null` if they couldn't be computed (e.g., `rows_aligned` is `null` for `E_ROWCOUNT`; all `metrics` are `null` for `E_NEED_KEY`).\n\n\u003c/details\u003e\n\n---\n\n## Spec\n\nThe full specification is `docs/PLAN_RVL.md`. This README covers everything needed to use the tool; the spec adds implementation details, edge-case definitions, and testing requirements.\n\n## Development\n\n```bash\ncargo fmt --check\ncargo clippy --all-targets -- -D warnings\ncargo test\n```\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fcmdrvl%2Frvl","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fcmdrvl%2Frvl","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fcmdrvl%2Frvl/lists"}