https://github.com/semcod/regix
https://github.com/semcod/regix
Last synced: 8 days ago
JSON representation
- Host: GitHub
- URL: https://github.com/semcod/regix
- Owner: semcod
- License: other
- Created: 2026-03-31T11:30:25.000Z (2 months ago)
- Default Branch: main
- Last Pushed: 2026-05-27T16:04:09.000Z (12 days ago)
- Last Synced: 2026-05-27T17:09:56.701Z (12 days ago)
- Language: Python
- Size: 6.57 MB
- Stars: 1
- Watchers: 0
- Forks: 0
- Open Issues: 0
-
Metadata Files:
- Readme: README.md
- Changelog: CHANGELOG.md
- License: LICENSE
Awesome Lists containing this project
README
## AI Cost Tracking
   
  
- 🤖 **LLM usage:** $2.9316 (38 commits)
- 👤 **Human dev:** ~$1567 (15.7h @ $100/h, 30min dedup)
Generated on 2026-05-31 using [openrouter/qwen/qwen3-coder-next](https://openrouter.ai/qwen/qwen3-coder-next)
---
**Regix** is a Python library for detecting, measuring, and reporting code quality regressions between versions of a codebase. It compares static analysis metrics across git commits, branches, or local snapshots and pinpoints exactly which functions, classes, or lines have regressed — and by how much.
> Inspired by operational experience with [pyqual](https://github.com/semcod/pyqual): iterative quality-gate pipelines revealed that without structured regression tracking, metric changes across refactoring cycles are invisible until a gate breaks.
> **Why "Regix"?** From **reg**ression + **ix** (like fix/matrix vibe). Short, technical, memorable.
---
## Problem statement
When refactoring code over multiple iterations:
- A function's cyclomatic complexity improves in one commit, then regresses two commits later.
- Test coverage drops by 3 % after a merge — but no single file is obviously responsible.
- Maintainability index worsens across three files simultaneously; the cause is spread over dozens of small changes.
- A CI pipeline fails a quality gate without explaining *what changed* since the last passing run.
Standard linters and quality tools report the **current state**. They do not answer: *"Is this better or worse than before, and where exactly did it change?"*
**Regix answers that question.**
---
## What Regix does
```
git commit A ──► snapshot_A ─┐
├──► compare ──► regression report
git commit B ──► snapshot_B ─┘
```
For every function, class, and module it tracks:
| Metric | Unit | Regression when |
|---|---|---|
| Cyclomatic complexity (CC) | integer | increases |
| Maintainability index (MI) | 0–100 | decreases |
| Test coverage | % | decreases |
| Function length | lines | increases beyond threshold |
| Docstring coverage | % | decreases |
| LLM validation score | 0–1 | decreases (via vallm) |
| Import count | integer | increases |
| Fan-out (delegation depth) | integer | decreases (shell/stub regression) |
| Call count | integer | decreases (hollowing out) |
| Symbol count | integer | decreases (monolith collapse) |
| Parameter count | integer | increases (interface bloat) |
| Node type diversity | integer | decreases (structural uniformity) |
| Logic density | ratio | decreases (empty body regression) |
In addition to per-metric regressions, Regix detects **architectural smells** — cross-symbol patterns that indicate structural degradation even when individual metrics look acceptable:
| Smell | Pattern |
|---|---|
| `god_function` | One function absorbed CC from several that were deleted |
| `stub_regression` | `fan_out` and `call_count` drop to near-zero (mock/stub replacing real logic) |
| `monolith_collapse` | `symbol_count` drops while total file length grows |
| `shell_cluster` | Multiple functions simultaneously lose delegation depth |
| `logic_drain` | `logic_density` drops across ≥ 3 functions in a file |
Regressions are reported at **three granularity levels**:
1. **Line** — the exact line range that changed and caused the metric shift.
2. **Symbol** — the function or class whose metric crossed a threshold.
3. **Module** — the file-level aggregate delta.
---
## Key features
- **Git-native**: compares any two refs (`HEAD~1`, branch names, tags, commit SHAs).
- **Local-snapshot mode**: compare a dirty working tree against any historical ref without committing.
- **Multi-version history**: compute a regression timeline across N commits to see trends.
- **Configurable thresholds**: per-metric, per-file, or per-symbol severity rules.
- **Multiple backends**: lizard (CC), radon (MI), pytest-cov (coverage), vallm (LLM quality), structure (AST fan-out/call analysis), architecture (cross-symbol smell detection).
- **Machine-readable output**: JSON, YAML, or TOON format for CI integration.
- **Human-readable output**: rich terminal tables and diff-style reports.
- **Zero false-positives mode**: only report regressions that cross a configured threshold delta.
- **LLM patch gate (`review`)**: detect regressions only in changed hunks/symbols to reduce noisy failures.
- **Dependency-aware impact**: selective tests from changed files + transitive import dependents.
---
## Installation
```bash
pip install regix
```
With optional analysis backends:
```bash
pip install "regix[full]" # lizard + radon + vallm + pytest-cov
pip install "regix[lizard]" # CC + length analysis only
pip install "regix[coverage]" # pytest-cov integration only
pip install "regix[vallm]" # LLM-based quality scoring
```
Python requirement: **>= 3.10**
---
### Compare HEAD against previous commit
```bash
regix compare HEAD~1 HEAD
```
```
Regression Report: HEAD~1 → HEAD
════════════════════════════════════════════════════════════
pyqual/cli.py
▲ bulk_run_cmd CC 14 → 19 (+5) ⚠ WARNING
▲ _build_run_summary CC 12 → 18 (+6) ✗ ERROR
▼ run MI 45 → 38 (-7) ⚠ WARNING
pyqual/validation.py
▲ validate_config CC 18 → 26 (+8) ✗ ERROR
▲ validate_config len 92 → 110 (+18) ⚠ WARNING
Summary: 2 errors, 3 warnings, 0 improvements across 2 files
Gates: ✗ FAIL (cc_max=15 violated in 4 symbols)
```
### Compare two branches
```bash
regix compare main feature/refactor --format json > regression.json
```
### Track regression over last 10 commits
```bash
regix history --depth 10 --metric cc --metric coverage
```
```
Commit cc_avg cc_max coverage mi_avg
──────── ────── ────── ──────── ──────
abc1234 6.2 19 59.9 % 38.4 ← current HEAD
def5678 6.0 18 61.2 % 39.1
ghi9012 5.8 16 63.0 % 41.2 ← coverage regression starts here
jkl3456 5.7 15 65.1 % 42.0
...
```
### Use from Python
```python
from regix import Regix, RegressionConfig
cfg = RegressionConfig(
cc_max=15,
mi_min=20,
coverage_min=80,
)
rx = Regix(config=cfg, workdir=".")
report = rx.compare("HEAD~1", "HEAD")
for regression in report.regressions:
print(f"{regression.file}::{regression.symbol} "
f"{regression.metric} {regression.before} → {regression.after} "
f"({regression.severity})")
if report.has_errors:
raise SystemExit(1)
```
---
### Snapshot
A `Snapshot` is an immutable record of all tracked metrics for a codebase at a specific point in time. It is identified by a git ref (commit SHA, branch, tag) or the special value `"local"` for the current working tree.
```python
from regix import Snapshot
snap = Snapshot.from_ref("HEAD", workdir=".")
print(snap.metrics["pyqual/cli.py"]["bulk_run_cmd"]["cc"]) # 19
```
Snapshots are cached by content hash so repeated comparisons are fast.
### Regression
A `Regression` is a detected worsening of a metric between two snapshots, at symbol, module, or project level.
```python
@dataclass
class Regression:
file: str # relative path
symbol: str | None # function or class name, None = module-level
line: int | None # approximate line in the target ref
metric: str # "cc", "mi", "coverage", "length", ...
before: float
after: float
delta: float # after - before (positive = worse for cc/length)
severity: str # "error" | "warning" | "info"
threshold: float # the configured limit that was crossed
```
### Improvement
An `Improvement` is the mirror of a `Regression` — a metric that got *better*. Regix tracks both so you get a complete picture of a refactoring.
### Report
A `RegressionReport` aggregates all `Regression` and `Improvement` objects from a comparison, plus summary statistics, gate pass/fail status, and the two snapshot refs.
---
## CLI reference
```
regix [OPTIONS] COMMAND [ARGS]
Commands:
compare Compare metrics between two git refs or local state.
review Compare diff scope only (LLM/pre-commit regression gate).
history Show metric timeline across N historical commits.
snapshot Capture and store a snapshot without comparing.
diff Show symbol-level metric diff (like git diff but for metrics).
gates Check current state against configured quality gates.
report Re-render a stored comparison as a different format.
impact Analyze git changes and print or execute targeted selective test suites.
```
### `regix compare`
```
regix compare [REF_A] [REF_B] [OPTIONS]
REF_A Base ref (default: HEAD~1)
REF_B Target ref (default: HEAD or local if --local)
Options:
--local Compare REF_A against the current working tree
--config FILE Path to regix.yaml (default: regix.yaml or .regix/config.yaml)
--format Output format: rich | json | yaml | toon [default: rich]
--output FILE Write report to file instead of stdout
--metric TEXT Only report on specific metric(s), repeatable
--file TEXT Only report on specific file(s), repeatable
--symbol TEXT Only report on specific function/class, repeatable
--errors-only Suppress warnings, show errors only
--fail-on warning Exit code 1 on warnings too (default: errors only)
--no-improvements Suppress improvement entries from output
--depth INTEGER For history mode: number of commits to scan
--planfile Generate planfile tickets for detected regressions
```
### `regix history`
```
regix history [OPTIONS]
Options:
--depth INTEGER Number of commits to include [default: 20]
--ref TEXT Starting ref [default: HEAD]
--metric TEXT Metrics to include, repeatable
--file TEXT Filter to specific file(s)
--symbol TEXT Filter to specific symbol
--format rich | json | yaml | csv [default: rich]
--output FILE
```
### `regix review`
```
regix review [BASE] [TARGET] [OPTIONS]
Reviews working-tree/staged changes and reports only regressions overlapping
changed hunks/symbols. Useful for LLM patch validation and pre-commit gates.
Options:
--staged Review staged diff against BASE
--patch FILE Review unified-diff patch file
--config FILE Path to regix.yaml
--format rich | json | yaml | toon [default: rich]
--errors-only Suppress warnings
--fail-on warning Exit code 1 on warnings too
```
### `regix diff`
```
regix diff [REF_A] [REF_B] [OPTIONS]
Shows a symbol-by-symbol metric diff, similar to `git diff` but for
static analysis data instead of source lines.
Options:
--threshold FLOAT Only show symbols with delta >= threshold
--metric TEXT Filter to specific metric(s)
```
### `regix impact`
```
regix impact [OPTIONS]
Analyzes git modified files and dynamically determines which test suites are affected
by the changes (including PyTest, TestQL scenarios, and visual Playwright routes).
Supports automatic selective test suite execution.
Options:
--run Automatically run the selective targeted test suites
--dry-run Dry run the selective targeted test suites
--config FILE Path to regix.yaml
--workdir TEXT Specify the project root directory [default: .]
```
---
## Configuration
Create a `regix.yaml` at the project root:
```yaml
regix:
workdir: .
metrics:
cc_max: 15 # cyclomatic complexity per function
mi_min: 20 # maintainability index (radon)
coverage_min: 80 # test coverage (%)
length_max: 100 # function length in lines
docstring_min: 60 # docstring coverage (%)
thresholds:
delta_warn: 2 # minimum delta to emit a warning
delta_error: 5 # minimum delta to emit an error
backends:
cc: lizard # lizard | radon | both
mi: radon
coverage: pytest-cov
quality: vallm # optional LLM-based score
impact:
include_prefixes: [] # optional, empty = whole repo
ignore_globs:
- "**/.venv/**"
- "**/.tox/**"
- "**/dist/**"
test_patterns:
- "tests/test_{stem}.py"
- "tests/**/test_{stem}.py"
- "{dir}/tests/test_{stem}.py"
enable_import_graph: true
transitive_depth: 2
exclude:
- "tests/**"
- "examples/**"
- "docs/**"
- "**/migrations/**"
include:
- "src/**"
- "mypackage/**"
output:
format: rich # rich | json | yaml | toon
show_improvements: true
max_symbols: 50 # truncate long reports
gates:
on_regression: warn # warn | error | block
fail_exit_code: 1
```
See [configuration.md](configuration.md) for the full reference.
---
### GitHub Actions
```yaml
- name: Check for regressions
run: |
pip install "regix[full]"
regix compare ${{ github.event.pull_request.base.sha }} HEAD \
--format json --output .regix/report.json
regix gates --fail-on error
```
# .pre-commit-config.yaml
repos:
- repo: local
hooks:
- id: regix
name: Regression check
entry: regix compare HEAD --local --errors-only
language: python
pass_filenames: false
```
### pyqual integration
Regix ships a pyqual-compatible stage preset and gate collector:
```yaml
# pyqual.yaml
stages:
- name: regression-check
tool: regix
when: metrics_fail
metrics:
regression_errors_max: 0
regression_warnings_max: 5
```
The `regix` preset runs `regix compare HEAD~1 HEAD --format toon --output .regix/` and the gate collector reads `.regix/report.toon.yaml` for `regression_errors` and `regression_warnings` metrics.
### Planfile integration
Regix features first-class integration with the **Planfile** SDLC ticket standard. When comparing refs, passing the `--planfile` flag will automatically generate highly detailed task tickets for any detected regressions and sync them straight into `TODO.md`:
```bash
regix compare HEAD~1 HEAD --planfile
```
Each generated ticket contains:
- **Title**: Symbol and metric that regressed (e.g. `[Regression] CC worsened in bulk_run_cmd`).
- **Description**: Detailed metrics delta, before/after values, and the violated threshold.
- **Priority**: Dynamic mapping based on severity (`high` for errors, `normal` for warnings).
- **Labels**: Auto-labeled with `bug`, `regression`, `regix`, and the metric key (e.g. `cc`, `mi`, `coverage`).
- **Associated Files**: Linked directly to the regressed source file.
After ticket generation, Regix triggers `planfile sync markdown --direction to` to seamlessly update your local `TODO.md`.
---
## Design principles (lessons from pyqual)
These principles were derived from running iterative quality pipelines and observing failure modes:
1. **Fix the measurement before fixing the code.** If the tool counts unsupported files in the pass-rate denominator, the metric is misleading. Regix measures only what it can actually analyse.
2. **Report at symbol granularity, not file granularity.** A file-level CC average of 6.4 masks individual functions with CC 19+. Regix always links a regression to the smallest attributable unit.
3. **Distinguish regression from absolute violation.** A function with CC=18 that was CC=18 last week is not a *new* regression — it is an existing issue. A function that went CC=10 → CC=18 in this PR *is* a regression. Both should be reportable, but separately.
4. **Stagnation is a signal.** If a fix pass runs twice and produces identical metric values, continuing to iterate wastes time. Regix exposes `report.stagnated` so callers can exit early.
5. **Make the improvement visible too.** Refactoring that improves CC from 22 → 14 in three functions while worsening one from 12 → 16 still net-positive. Hiding improvements creates a discouraging feedback loop.
6. **Historical depth is essential.** A single before/after comparison misses the *trend*. The `history` command lets you see that coverage has been falling 1 % per week for six weeks, not just that it is below the gate today.
7. **Output must be machine-readable.** Every report format (JSON, YAML, TOON) is stable and versioned so downstream tools (dashboards, ticket systems, fix agents) can consume regression data programmatically.
---
### Running tests with tox
Regix uses `tox` for testing across multiple Python versions:
```bash
# Run with coverage, linting, and type checking
tox -e py313-full,lint,type
```
Supported environments: `py39`, `py310`, `py311`, `py312`, `py313`, `lint` (ruff), `type` (mypy).
### Automated quality pipeline (pyqual)
The project includes a `pyqual` pipeline configured in `pyqual.yaml` that automates:
1. **Baseline analysis** — code2llm extraction
2. **Validation** — vallm batch validation
3. **Testing** — pytest with coverage
4. **Regression fixing** — prefact + llx auto-fix
5. **Version bump** — automatic patch version increment
6. **Publishing** — build and upload to PyPI
Run the full pipeline:
```bash
pyqual run -c pyqual.yaml
```
The pipeline loops up to 5 iterations, automatically fixing issues until all quality gates pass:
- CC ≤ 10 per function
- vallm pass rate ≥ 50%
- Test coverage ≥ 80%
---
## Architecture
```
regix/
├── __init__.py # Public API: Regix, Snapshot, RegressionReport, ...
├── cli.py # Typer-based CLI (compare, history, diff, gates, ...)
├── config.py # RegressionConfig dataclass + YAML loader
├── snapshot.py # Snapshot capture, caching, git checkout logic
├── compare.py # Core comparison engine: metric deltas + ArchSmell detection
├── history.py # Multi-commit timeline builder
├── report.py # Rendering: rich tables, JSON/YAML/TOON serialisation
├── gates.py # Gate evaluation (pass/fail) from RegressionReport
├── backends/
│ ├── __init__.py
│ ├── lizard_backend.py # CC + function length via lizard
│ ├── radon_backend.py # MI + raw metrics via radon
│ ├── coverage_backend.py # Coverage via pytest-cov / .coverage file
│ ├── docstring_backend.py # Docstring coverage (AST, built-in)
│ ├── structure_backend.py # fan_out, call_count, symbol_count (AST, built-in)
│ ├── architecture_backend.py # Cross-symbol ArchSmell detection
│ └── vallm_backend.py # LLM quality score via vallm batch
├── git.py # Git helpers: checkout, worktree, diff, log
├── cache.py # Content-addressed snapshot cache (~/.cache/regix/)
└── integrations/
├── pyqual.py # pyqual preset + gate collector
└── github.py # GitHub Actions annotations formatter
```
See [architecture.md](architecture.md) for a detailed description of each module.
---
## Comparison with related tools
| Tool | What it measures | Regression detection | Git-aware | Symbol-level |
|------|-----------------|---------------------|-----------|--------------|
| **Regix** | CC, MI, coverage, length, quality | ✅ first-class | ✅ | ✅ |
| pylint | Style, errors | ❌ | ❌ | ❌ |
| radon | CC, MI raw | ❌ | ❌ | ✅ |
| lizard | CC, length | ❌ | ❌ | ✅ |
| pytest-cov | Coverage | ❌ | ❌ | ❌ |
| pyqual | Quality gates (pipeline) | partial (gate-level) | ❌ | ❌ |
| diff-cover | Coverage diff | partial (line-level) | ✅ | ❌ |
| xenon | CC thresholds | ❌ | ❌ | ❌ |
Regix is designed to sit **above** analysis tools (lizard, radon, vallm) as the comparison and regression-detection layer, and **alongside** gate tools (pyqual) as a complementary data source.
---
## Roadmap
- [x] `v0.1` — Core compare/history CLI, lizard + radon backends, JSON/rich output
- [x] `v0.2` — pytest-cov backend, configuration file, CI presets
- [x] `v0.3` — vallm backend, pyqual integration preset
- [x] `v0.4` — Symbol-level caching, incremental snapshots for large repos
- [ ] `v0.5` — Web dashboard (static HTML report), trend charts
- [ ] `v1.0` — Stable API, full docs, PyPI release
---
## License
Licensed under Apache-2.0.
## Status
_Last updated by [taskill](https://github.com/oqlos/taskill) at 2026-05-27 19:04 UTC_
| Metric | Value |
|---|---|
| HEAD | `d469837` |
| Coverage | — |
| Failing tests | — |
| Commits in last cycle | 34 |
> Refactors and docs updates across the project: CLI/test refactoring and configuration handling were improved, a configuration-management feature was added, and documentation/TODOs were refreshed. CI/test infra was updated (tox, pyqual) and multiple version bumps and lint auto-fix iterations were applied.