https://github.com/kimmingul/samplesize-copilot
Sample-size & power calculator — Python package + Claude Code plugin
- Host: GitHub
- URL: https://github.com/kimmingul/samplesize-copilot
- Owner: kimmingul
- License: apache-2.0
- Created: 2026-05-25T04:19:15.000Z (27 days ago)
- Default Branch: main
- Last Pushed: 2026-05-25T13:34:11.000Z (26 days ago)
- Last Synced: 2026-05-25T15:24:01.497Z (26 days ago)
- Topics: biostatistics, claude-code, claude-code-plugin, clinical-trials, hypothesis-testing, power-analysis, power-calculation, python, sample-size, scipy, statistics, study-design
- Language: Python
- Homepage: https://kimmingul.github.io/samplesize-copilot/
- Size: 599 KB
- Stars: 0
- Watchers: 0
- Forks: 0
- Open Issues: 1
Metadata Files:
- Readme: README.md
- Changelog: CHANGELOG.md
- License: LICENSE
- Roadmap: docs/ROADMAP.md
- Notice: NOTICE
README
# samplesize-copilot — Sample-size and power calculations for clinical and applied research
A Python package + Claude Code plugin implementing 234 sample-size and power-calculation
methods validated against worked examples from established statistical references.
## Status
**v0.1 — 234 methods implemented and validated, 819 worked-example fixture tests passing.**
`samplesize doctor` passes 9/9 integrity checks across the registry, callables, plugin
manifest, and reporting templates. The roadmap is in `docs/ROADMAP.md`; the live coverage
matrix is in `docs/METHOD_COVERAGE.md`.
## Layout
```
samplesize-copilot/
├── samplesize/ # Python package — pure-Python calculators
│ ├── core/ # distributions, effect sizes, adjustments
│ ├── tests/ # per-method calculator modules
│ ├── reporting/ # plots, tables, protocol text, audit, R/SAS export
│ │ └── templates/ # i18n templates (protocol.en.yaml, protocol.ko.yaml, ...)
│ ├── registry/ # methods.json — categorical metadata only
│ ├── cli.py # `python -m samplesize ...`
│ └── doctor.py # `samplesize doctor` integrity checks
├── plugin/ # Claude Code plugin
│ ├── .claude-plugin/plugin.json
│ ├── skills/ # design / calculate / report / validate
│ ├── commands/ # /ss-design, /ss-calc, /ss-power, /ss-curve, /ss-report
│ └── agents/ # methodologist, calculator, validator
├── reference/ # Local-only knowledge base (gitignored, user-supplied)
│ └── ... # Validation reference material — not bundled in repo
├── tests/ # pytest suites
│ ├── validation/ # worked-example regression tests
│ └── unit/ # registry / doctor / signature parity
└── docs/ # ARCHITECTURE, ROADMAP, METHOD_COVERAGE, COOKBOOK, TROUBLESHOOTING
```
## Installation
```sh
pip install -e ".[dev]"
```
## Quick start
```sh
samplesize list # available methods
samplesize show two_sample_t_equal_var # full metadata + kwargs
samplesize calc two_sample_t_equal_var \
--json-args '{"mean1":10,"mean2":0,"sd":20,"alpha":0.05,"power":0.80,"sides":2}'
# → n1=64, n2=64, achieved_power=0.8015; audit JSON saved
# follow-ups on the audit just printed
AUDIT=$(ls -t .samplesize/audit/*.json | head -1)
samplesize report "$AUDIT" --kind power-curve --out curve.png
samplesize report "$AUDIT" --kind protocol --lang en
samplesize report "$AUDIT" --kind sensitivity --vary "sd=15,20,25,30"
samplesize report "$AUDIT" --kind r-code # pwr::pwr.t.test(...) equivalent
samplesize report "$AUDIT" --kind sas-code # PROC POWER equivalent
# sanity gate
samplesize doctor
```
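As a rough cross-check of the `two_sample_t_equal_var` call above, the textbook normal-approximation formula `n = 2 * ((z_alpha + z_beta) / d) ** 2` can be evaluated with the standard library alone. This is a sketch under stated assumptions, not the package's implementation: `n_per_group_normal` is a hypothetical helper, and because it ignores the t-distribution correction it returns 63 per group, one fewer than the 64 reported by the CLI.

```python
from math import ceil
from statistics import NormalDist


def n_per_group_normal(mean1, mean2, sd, alpha=0.05, power=0.80, sides=2):
    """Per-group n for comparing two means (normal approximation, equal variance)."""
    z = NormalDist()  # standard normal distribution
    z_alpha = z.inv_cdf(1 - alpha / sides)  # critical value for the test
    z_beta = z.inv_cdf(power)               # quantile matching the target power
    d = abs(mean1 - mean2) / sd             # standardized effect size
    return ceil(2 * ((z_alpha + z_beta) / d) ** 2)


# Same inputs as the quick-start CLI call.
print(n_per_group_normal(10, 0, 20))  # 63 (the t-based answer is 64)

# Rough analogue of `--vary "sd=15,20,25,30"`.
for sd in (15, 20, 25, 30):
    print(sd, n_per_group_normal(10, 0, sd))
```

The approximation is only useful as a sanity check; for reportable numbers, rely on the CLI output and its audit JSON.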
**More recipes.** `docs/COOKBOOK.md` has 15 worked study scenarios
(RCT, NI, equivalence, survival, Cox, McNemar, χ², ANOVA, correlation).
Hit an error? See `docs/TROUBLESHOOTING.md`.
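The `AUDIT=$(ls -t .samplesize/audit/*.json | head -1)` line in the quick start picks the newest audit file from the shell. A Python equivalent might look like the sketch below. `latest_audit` is a hypothetical helper, not part of the package, and the audit JSON schema is not documented in this README, so the example only lists the top-level keys rather than assuming any field names.

```python
import json
from pathlib import Path


def latest_audit(audit_dir=".samplesize/audit"):
    """Return the most recently modified *.json file in audit_dir, or None."""
    candidates = list(Path(audit_dir).glob("*.json"))
    if not candidates:
        return None
    return max(candidates, key=lambda p: p.stat().st_mtime)


if __name__ == "__main__":
    audit = latest_audit()
    if audit is None:
        print("no audit files found")
    else:
        record = json.loads(audit.read_text(encoding="utf-8"))
        # Inspect the structure instead of assuming specific fields.
        print(audit, sorted(record) if isinstance(record, dict) else type(record))
```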
## Using inside Claude Code (plugin)
Two ways to make the slash commands and skills available:
**Ephemeral — load for one session**:
```sh
claude --plugin-dir /path/to/samplesize-copilot/plugin
```
**Persistent — register the marketplace and install**:
```sh
claude plugin marketplace add kimmingul/samplesize-copilot # from GitHub
# …or from a local clone (repo root): claude plugin marketplace add /path/to/samplesize-copilot
claude plugin install samplesize-copilot@samplesize-copilot # requires CC ≥ 2.2
```
Once loaded, these commands work inside Claude Code:
- `/samplesize-copilot:ss-design ` — pick the right test
- `/samplesize-copilot:ss-calc ...` — run a calculation
- `/samplesize-copilot:ss-power ...` — solve for power at fixed N
- `/samplesize-copilot:ss-curve` — emit a power-curve PNG for the latest result
- `/samplesize-copilot:ss-report` — generate ICH E9 protocol / grant text
- `/samplesize-copilot:ss-validate ` — run worked-example validation tests
## Coverage
234 methods across:
- Means (one-sample, two-sample, paired, non-inferiority, equivalence, superiority-by-margin)
- Proportions (one, two, McNemar, NI/equivalence variants)
- Correlation (Pearson exact and Fisher-z)
- ANOVA / GLM (one-way F, chi-square)
- Survival (logrank Freedman, Cox regression Hsieh-Lavori)
- Group-sequential (O'Brien-Fleming, Pocock alpha-spending)
- Cluster-randomized (two means, two proportions, Donner-Klar)
- Cross-over (2×2 design)
- Phase II (Simon two-stage)
- ROC / diagnostic
- And more — see `docs/METHOD_COVERAGE.md`
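
To give a flavour of one listed method, the Fisher-z approach to correlation sample size uses the fact that `atanh(r)` is approximately normal with variance `1 / (n - 3)`. The sketch below uses the standard textbook formula. `n_correlation_fisher_z` is a hypothetical name, not the package's registry identifier or signature, and the packaged method may differ in rounding or in how it handles a non-zero null value.

```python
from math import atanh, ceil
from statistics import NormalDist


def n_correlation_fisher_z(r, alpha=0.05, power=0.80, sides=2):
    """Approximate n to detect a Pearson correlation r against H0: rho = 0."""
    z = NormalDist()
    z_alpha = z.inv_cdf(1 - alpha / sides)
    z_beta = z.inv_cdf(power)
    # Fisher z-transform: atanh(r) has approximate variance 1 / (n - 3).
    return ceil(((z_alpha + z_beta) / abs(atanh(r))) ** 2 + 3)


print(n_correlation_fisher_z(0.3))  # 85
```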
## Validation
819 fixture tests passing. Methods are validated against worked examples from
established statistical software references. Reference content itself is
user-supplied (see `reference/` — not bundled in this repository).
Fixtures live under `tests/validation/fixtures/.yaml`.
```sh
pytest tests/validation/
```
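A worked-example regression test compares a calculator's output with published values within a tolerance. The sketch below illustrates that idea only. The fixture shape (`inputs`, `expected`, `abs_tol`) and the `mismatches` helper are assumptions made for illustration, since the real YAML schema is not shown in this README. The expected values are the quick-start output.

```python
from math import isclose

# Hypothetical fixture shape; the real YAML schema is not shown in this README.
FIXTURE = {
    "method": "two_sample_t_equal_var",
    "inputs": {"mean1": 10, "mean2": 0, "sd": 20,
               "alpha": 0.05, "power": 0.80, "sides": 2},
    "expected": {"n1": 64, "n2": 64, "achieved_power": 0.8015},
    "abs_tol": 1e-4,
}


def mismatches(result, fixture):
    """Return {key: (got, want)} for each expected value the result misses."""
    bad = {}
    for key, want in fixture["expected"].items():
        got = result.get(key)
        if got is None or not isclose(got, want, abs_tol=fixture["abs_tol"]):
            bad[key] = (got, want)
    return bad


# Stand-in for a calculator's return value (the quick-start output).
result = {"n1": 64, "n2": 64, "achieved_power": 0.8015}
assert mismatches(result, FIXTURE) == {}
```

Returning the mismatching keys, rather than a bare boolean, keeps failure messages readable when a fixture checks several outputs at once.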
## License
Apache License 2.0 — see `LICENSE`.
## Acknowledgments
Method implementations draw on the primary statistical literature, including:
- Cohen, J. (1988). *Statistical Power Analysis for the Behavioral Sciences* (2nd ed.)
- Donner, A. & Klar, N. (1996). Statistical considerations in the design and analysis of community intervention trials.
- Hsieh, F. Y. & Lavori, P. W. (2000). Sample-size calculations for the Cox proportional hazards regression model with nonbinary covariates.
- Schoenfeld, D. (1981). The asymptotic properties of nonparametric tests for comparing survival distributions.
- Bonett, D. G. & Wright, T. A. (2000). Sample size requirements for estimating Pearson, Kendall and Spearman correlations.
- Hanley, J. A. & McNeil, B. J. (1982). The meaning and use of the area under a receiver operating characteristic (ROC) curve.
- Simon, R. (1989). Optimal two-stage designs for phase II clinical trials.
- Wang, S. K. & Tsiatis, A. A. (1987). Approximately optimal one-parameter boundaries for group sequential trials.
- Flack, V. F. et al. (1988). Sample size determinations for the two rater kappa statistic.