https://github.com/heidihelena/recoverlite-py

Python mirror of recoverlite: pre-data recovery tests for planned study designs (protocol-identical to the R package; results agree within Monte Carlo error)

estimands metascience monte-carlo power-analysis preregistration python research-methods simulation statistics study-design


Host: GitHub
URL: https://github.com/heidihelena/recoverlite-py
Owner: heidihelena
License: apache-2.0
Created: 2026-07-04T13:15:28.000Z (24 days ago)
Default Branch: main
Last Pushed: 2026-07-04T13:56:57.000Z (24 days ago)
Last Synced: 2026-07-04T15:20:34.770Z (24 days ago)
Topics: estimands, metascience, monte-carlo, power-analysis, preregistration, python, research-methods, simulation, statistics, study-design
Language: Python
Homepage: https://github.com/heidihelena/recoverlite
Size: 68.4 KB
Stars: 0
Watchers: 0
Forks: 0
Open Issues: 0
Metadata Files:
- Readme: README.md
- License: LICENSE


# recoverlite (Python)

[![CI](https://github.com/heidihelena/recoverlite-py/actions/workflows/ci.yaml/badge.svg)](https://github.com/heidihelena/recoverlite-py/actions/workflows/ci.yaml)
[![PyPI](https://img.shields.io/pypi/v/recoverlite.svg)](https://pypi.org/project/recoverlite/)
[![DOI](https://zenodo.org/badge/DOI/10.5281/zenodo.21195922.svg)](https://doi.org/10.5281/zenodo.21195922)
[![License: Apache 2.0](https://img.shields.io/badge/License-Apache_2.0-blue.svg)](LICENSE)

**Pre-data recovery tests for planned study designs**: the Python
mirror of the [R package](https://github.com/heidihelena/recoverlite).

A planned study can be unable to support its intended inferential claim
even when the researcher's substantive assumptions are correct.
`recoverlite` simulates a *declared* design–analysis pair over a crossed
scenario grid (null and target effects, each under declared and
pessimistically perturbed nuisance assumptions) and converts the
diagnosands into a **PASS / RISK / FAIL** verdict under a pre-specified,
versioned threshold profile.
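
As a rough illustration of what a crossed scenario grid looks like, the sketch below crosses two effect levels with two nuisance settings using only the standard library. It does not call `recoverlite`; the dictionary keys and the perturbed values (reliability 0.60, attrition 0.25) are assumptions made up for the example, not the package's defaults.

```python
from itertools import product

# Effect axis: the null and the target effect (0.40 is the SESOI used in the
# workflow example of this README).
effects = {"null": 0.0, "target": 0.40}

# Nuisance axis: declared values versus a pessimistic perturbation.
# The perturbed numbers are hypothetical, chosen only for illustration.
nuisance = {
    "declared": {"reliability": 0.70, "attrition": 0.15},
    "perturbed": {"reliability": 0.60, "attrition": 0.25},
}

# Cross the two axes: 2 effects x 2 nuisance settings = 4 scenarios.
scenario_grid = [
    {
        "effect_label": effect_label,
        "nuisance_label": nuisance_label,
        "effect": effects[effect_label],
        **nuisance[nuisance_label],
    }
    for effect_label, nuisance_label in product(effects, nuisance)
]

for scenario in scenario_grid:
    print(scenario)
```

Each dictionary describes one cell of the grid; a design would be simulated once per cell and the diagnosands compared across cells.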

## Mirror contract

This package is **protocol-identical** to the R implementation: same
declaration API, same scenario grid, same diagnosands (including the
exact decomposition *target bias = estimator bias + estimand drift*),
same threshold profiles (shared version string
`recoverlite-thresholds-0.2`), same verdict rule, same report structure.
Simulated numbers agree **within Monte Carlo error, not byte-identically**,
because R and Python cannot share RNG streams. The test suite enforces the
contract by reproducing the R package's archived worked-example
diagnosands within 4× the combined Monte Carlo standard error (MCSE).
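
The agreement criterion can be sketched as follows. This is not the package's test code: it assumes that "combined MCSE" means the root-sum-of-squares of the two implementations' MCSEs, and the function names and the example rejection rates are hypothetical.

```python
from math import sqrt


def proportion_mcse(p, sims):
    """Monte Carlo standard error of a simulated proportion (e.g. a rejection rate)."""
    return sqrt(p * (1.0 - p) / sims)


def agrees_within_mcse(value_a, mcse_a, value_b, mcse_b, multiplier=4.0):
    """True if two independent estimates differ by at most multiplier x combined MCSE."""
    # Assumption: the two MCSEs combine as a root-sum-of-squares, which is the
    # standard error of a difference between independent estimates.
    combined = sqrt(mcse_a ** 2 + mcse_b ** 2)
    return abs(value_a - value_b) <= multiplier * combined


sims = 2000
# Illustrative rejection rates only; these are not results from either package.
close = agrees_within_mcse(0.80, proportion_mcse(0.80, sims),
                           0.81, proportion_mcse(0.81, sims))
far = agrees_within_mcse(0.80, proportion_mcse(0.80, sims),
                         0.87, proportion_mcse(0.87, sims))
print(close, far)
```

With 2000 simulations per implementation, a gap of 0.01 in a rejection rate sits well inside the tolerance, while a gap of 0.07 does not.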

| Feature | R package | Python mirror |
|---|---|---|
| Two-arm trial (baseline, additive measurement error, MAR/MCAR attrition, noncompliance) | ✅ | ✅ |
| Complete-case linear model | ✅ | ✅ |
| MI baseline-adjusted estimator (Rubin + Barnard-Rubin) | ✅ | ✅ |
| Cluster trial, LMM (Wald-z / Satterthwaite / Kenward–Roger) | ✅ lme4 + lmerTest + pbkrtest | ✅ internal exact REML fitter |
| Cluster-level t-test | ✅ | ✅ |
| Fragility curves (effect + nuisance) | ✅ | ✅ |

The mixed-model machinery is self-contained (numpy/scipy only, no
statsmodels): a closed-form REML fitter for the random-intercept model
with Satterthwaite df (observed REML Hessian, the lmerTest algorithm)
and Kenward–Roger adjusted covariance + df (expected REML information,
the pbkrtest algorithm). Because these methods are deterministic given
data, they are validated against lmerTest/pbkrtest **to numerical
precision** (relative error ≤ 1e-4) on shared fixture datasets (balanced,
unbalanced, near-boundary, and null) in
[`tests/data/`](tests/data/). This is a stronger contract than the
MCSE-level simulation cross-checks.

## Install

```bash
pip install recoverlite            # numpy + scipy only
```

Not yet on PyPI? Install from source:
`pip install git+https://github.com/heidihelena/recoverlite-py`.

## The workflow in one block

```python
import recoverlite as rl

design = rl.declare_recovery(
    target=rl.target_estimand(
        estimand="ITT mean difference at 12 weeks",
        scale="latent-outcome standardized mean difference",
        sesoi=0.40,
    ),
    data_strategy=rl.two_arm_trial(n_per_arm=115),
    measurement=rl.measured_outcome(reliability=0.70),
    missingness=rl.attrition_model(rate=0.15, mechanism="differential"),
    answer_strategy=rl.planned_analysis(
        estimator="linear_model",
        formula="y_observed ~ treatment",
    ),
)

result = rl.recovery_test(design, sims=2000,
                          scenarios="confirmatory_grid", seed=1)

print(rl.verdict(result))   # PASS / RISK / FAIL (+ strict/lenient recompute)
rl.report(result)           # the standalone report always travels with it
```
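
To make the idea of a pre-data simulation concrete, here is a minimal standard-library sketch of the kind of computation such a test performs for one nuisance setting. It is not `recoverlite`'s implementation. It simplifies the declared design in three labeled ways: dropout is completely at random rather than differential, the test is a large-sample z-test on the difference in means rather than a fitted linear model, and the estimate is left in latent standard-deviation units. All function and key names are hypothetical.

```python
import random
import statistics
from math import sqrt


def simulate_once(rng, effect, n_per_arm=115, reliability=0.70, attrition=0.15):
    """Simulate one two-arm trial; return (estimate, standard_error)."""
    # Additive error with variance chosen so that, for a unit-variance latent
    # outcome, var(latent) / var(observed) equals the declared reliability.
    error_sd = sqrt(1.0 / reliability - 1.0)
    arms = []
    for shift in (0.0, effect):  # control arm, then treated arm
        observed = [
            rng.gauss(shift, 1.0) + rng.gauss(0.0, error_sd)
            for _ in range(n_per_arm)
            if rng.random() >= attrition  # simplification: MCAR dropout
        ]
        arms.append(observed)
    control, treated = arms
    estimate = statistics.fmean(treated) - statistics.fmean(control)
    se = sqrt(statistics.variance(treated) / len(treated)
              + statistics.variance(control) / len(control))
    return estimate, se


def diagnosands(effect, sims=500, seed=1):
    """Bias, rejection rate, and MCSE of the bias over repeated simulations."""
    rng = random.Random(seed)
    z_crit = statistics.NormalDist().inv_cdf(0.975)  # two-sided 5% z-test
    estimates, rejections = [], 0
    for _ in range(sims):
        estimate, se = simulate_once(rng, effect)
        estimates.append(estimate)
        rejections += abs(estimate / se) > z_crit
    return {
        "bias": statistics.fmean(estimates) - effect,
        "rejection_rate": rejections / sims,
        "mcse_bias": statistics.stdev(estimates) / sqrt(sims),
    }


null = diagnosands(effect=0.0)     # false-positive behaviour
target = diagnosands(effect=0.40)  # behaviour at the smallest effect of interest
print("null:  ", null)
print("target:", target)
```

Under these simplifications the additive error leaves the mean difference unbiased but widens its sampling distribution, so the rejection rate at the target effect falls below what a perfectly reliable outcome would give.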

Cluster designs use `cluster_trial()` with
`planned_analysis("lmm_random_intercept", "y_observed ~ treatment + (1 | cluster)")`
(Wald-z) or `"cluster_mean_ttest"`. Fragility curves
(`effect_fragility()`, `nuisance_fragility()`) are deliberately outside
the verdict.
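
One way to picture a nuisance fragility curve is as a diagnosand traced over a range of values for a single nuisance parameter. The sketch below does this analytically for the two-arm example, using a normal approximation to power as reliability varies. It is an assumption about what such a curve conveys, not the output of `nuisance_fragility()`, and `approx_power` is a hypothetical helper.

```python
from math import sqrt
from statistics import NormalDist


def approx_power(effect, n_per_arm, reliability, alpha=0.05):
    """Normal-approximation power of a two-sided test on a difference in means."""
    std_normal = NormalDist()
    z_crit = std_normal.inv_cdf(1.0 - alpha / 2.0)
    # Unit-variance latent outcome; additive measurement error inflates the
    # observed variance to 1 / reliability.
    se = sqrt(2.0 / (n_per_arm * reliability))
    shift = effect / se
    return std_normal.cdf(shift - z_crit) + std_normal.cdf(-shift - z_crit)


# Sweep reliability around the declared value of 0.70 (attrition ignored here).
reliabilities = [0.5, 0.6, 0.7, 0.8, 0.9]
curve = [(r, approx_power(0.40, 115, r)) for r in reliabilities]
for r, power in curve:
    print(f"reliability={r:.1f}  approx. power={power:.3f}")
```

Reading the curve shows how quickly the design degrades as the nuisance assumption worsens, which is information that complements a single verdict rather than feeding into it.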

## Citation

> Andersen, H. H. (2026). *Recovery before data: pre-data simulation
> diagnosis of planned study designs.* Working paper; preprint
> forthcoming. https://github.com/heidihelena/recoverlite

A PASS is evidence about the instrument, not about the world.

## License

[Apache License 2.0](LICENSE).
