{"id":50976817,"url":"https://github.com/mbagalman/lattice-doe","last_synced_at":"2026-06-19T09:01:35.869Z","repository":{"id":349218309,"uuid":"1174903757","full_name":"mbagalman/lattice-doe","owner":"mbagalman","description":"Python code to create experimental designs optimized to meet statistical power targets","archived":false,"fork":false,"pushed_at":"2026-05-06T23:21:57.000Z","size":2791,"stargazers_count":0,"open_issues_count":0,"forks_count":0,"subscribers_count":0,"default_branch":"master","last_synced_at":"2026-05-07T00:33:30.384Z","etag":null,"topics":["abtesting","data","datascience","designofexperiments","experimentaldesign","statistics"],"latest_commit_sha":null,"homepage":"https://github.com/mbagalman/lattice-doe","language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"mit","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/mbagalman.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null,"zenodo":null,"notice":null,"maintainers":null,"copyright":null,"agents":null,"dco":null,"cla":null}},"created_at":"2026-03-07T01:09:10.000Z","updated_at":"2026-05-06T23:22:01.000Z","dependencies_parsed_at":null,"dependency_job_id":null,"html_url":"https://github.com/mbagalman/lattice-doe","commit_stats":null,"previous_names":["mbagalman/lattice-doe"],"tags_count":0,"template":false,"template_full_name":null,"purl":"pkg:github/mbagalman/lattice-doe","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/mbagalman%2Flattice-doe","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/mbagalman%2Flattice-doe/tags","releases_url":"https://repos.ecosyste.ms/api/v1/ho
sts/GitHub/repositories/mbagalman%2Flattice-doe/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/mbagalman%2Flattice-doe/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/mbagalman","download_url":"https://codeload.github.com/mbagalman/lattice-doe/tar.gz/refs/heads/master","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/mbagalman%2Flattice-doe/sbom","scorecard":null,"host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":286080680,"owners_count":34523991,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2026-05-26T15:22:16.424Z","status":"online","status_checked_at":"2026-06-19T02:00:06.005Z","response_time":61,"last_error":null,"robots_txt_status":"success","robots_txt_updated_at":"2025-07-24T06:49:26.215Z","robots_txt_url":"https://github.com/robots.txt","online":true,"can_crawl_api":true,"host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["abtesting","data","datascience","designofexperiments","experimentaldesign","statistics"],"created_at":"2026-06-19T09:01:34.717Z","updated_at":"2026-06-19T09:01:35.837Z","avatar_url":"https://github.com/mbagalman.png","language":"Python","funding_links":[],"categories":[],"sub_categories":[],"readme":"# Lattice DOE\n\n## Stop Guessing Your Way Through Experimental Design\n\nMost experimental design advice assumes a world that doesn't exist.\n\nClean factors. Unlimited sample sizes. Neatly separable effects.\n\nIn the real world, you get:\n- Too many variables\n- Not enough runs\n- Constraints nobody told you about until it was too late\n\nSo what happens? 
People either oversimplify the problem until it's wrong, or overcomplicate it until it's unusable. Sometimes they do both in the same meeting.\n\nLattice DOE is for that situation.\n\nIt helps you design experiments that are:\n- statistically powered\n- run-efficient\n- reproducible\n- realistic about constraints\n\nIn plain English: instead of doing a power analysis in one place, picking a design in another, and hoping they vaguely agree, Lattice DOE solves those decisions together.\n\n---\n\n## What This Is\n\n`lattice-doe` is a Python toolkit for designing **powered, efficient, structured experiments under real-world constraints**.\n\nIt searches for the **smallest experiment that can still hit your target power**, then optimizes the run locations under your chosen criterion (`I`, `D`, or `A`).\n\nIt's built for the messy middle:\n- When full factorial designs are impossible\n- When fractional factorial feels like guesswork\n- When \"just randomize it\" isn't good enough\n\nThis is about getting **maximum information from limited experiments** without pretending your situation is cleaner than it is.\n\nIt supports:\n- linear contrast power\n- global R² power\n- GLM contrast power for binomial and Poisson responses\n- multi-response design\n- blocked and split-plot structures\n- Python, CLI, app, API, and spreadsheet-driven workflows\n\n---\n\n## Who This Is For\n\n- Data scientists running experiments with multiple interacting variables\n- Analysts asked to \"design an experiment\" without a textbook setup\n- Teams with **tight budgets on experimental runs**\n- Anyone who has ever thought: *\"There has to be a better way to structure this\"*\n- Anyone tired of hearing \"just use a factorial\" from someone who is not paying for the runs\n\n---\n\n## A Concrete Example\n\nImagine a real study with 8 continuous variables and a budget for 32 experimental runs. 
A full factorial design would require far more runs, and random sampling leaves coverage gaps and hidden correlations.\n\nThe same logic applies at any scale. Here is a minimal 2-factor example you can run immediately — the API is identical whether you have 2 factors or 20:\n\n```python\nfrom lattice_doe import find_optimal_design, PowerContrastConfig, DesignOptions\nfrom lattice_doe.contrasts import contrast_from_scenarios\n\nformula = \"~ 1 + A + B + A:B\"\nfactors = {\n    \"A\": (-1.0, 1.0),\n    \"B\": (-1.0, 1.0),\n}\n\nL, delta = contrast_from_scenarios(\n    formula=formula,\n    factors=factors,\n    scenario_a={\"A\": -1.0, \"B\": 0.0},\n    scenario_b={\"A\": 1.0, \"B\": 0.0},\n    sesoi=2.0,  # smallest response-scale effect worth detecting (in sigma units)\n)\n\nresult = find_optimal_design(\n    formula=formula,\n    factors=factors,\n    power_cfg=PowerContrastConfig(L=L, delta=delta, power=0.80, sigma=1.0, max_n=50),\n    design_opts=DesignOptions(criterion=\"I\", auto_candidate=True),\n)\n\nprint(result[\"design_df\"])          # the optimal run matrix\nprint(result[\"report\"][\"n\"])        # minimum n that achieves 80% power\n```\n\nNow you have a design you can execute and defend.\n\n---\n\n## Why This Exists\n\nExperimental design isn't just a statistics problem. It's a **decision problem**.\n\nThe goal isn't elegance. The goal is making better decisions with limited information.\n\nMost tools optimise for theoretical purity. This one optimises for practical constraints, interpretability, and real-world usability.\n\n---\n\n## What Comes Next\n\nBelow you'll find full documentation, examples, and implementation details. If you're here to go deep, keep reading. 
If you're here because your last experiment was a mess, start with the [Quick Start Guide](docs/quickstart.md).\n\n---\n\n**Power-assured optimal experimental designs for linear and GLM models.**\n\nThe package automatically searches for the minimum sample size `n` that achieves your target power, then selects the best design at that `n` under your chosen criterion (`\"I\"` by default, or `\"D\"` / `\"A\"`). If the search hits practical limits first, it returns the best design found and reports that clearly.\n\n**Supported power modes:**\n\n| Mode | Config class | Test | Use case |\n|---|---|---|---|\n| Linear contrast | `PowerContrastConfig` | F-test on Lβ = δ | Detecting a specific effect in a linear model |\n| Global R² | `PowerR2Config` | Omnibus F-test | Testing whether the full model explains meaningful variance |\n| GLM Wald χ² | `PowerGLMContrastConfig` | Wald chi-square | Binomial (logistic) or Poisson response variables |\n| Multi-response | `MultiResponseOptions` | Per-response + combined | Simultaneously powering several responses |\n\n**Supported optimality criteria** (set via `DesignOptions.criterion`):\n\n- **I-optimality** (default) — minimises *average prediction variance* over the design region; preferred when prediction accuracy across the factor space matters most.\n- **D-optimality** — maximises `det(X'X)`; preferred when precise coefficient estimation is the primary goal.\n- **A-optimality** — minimises `trace((X'X)⁻¹)`; equalises coefficient-estimate variances.\n\nNew here? Start with the 10-minute guide: [Quick Start Guide](docs/quickstart.md).  \nLooking for task-oriented examples? 
See [Recipes](docs/recipes.md).\n\n---\n\n## Table of Contents\n\n- [Installation](#installation)\n- [Quick Start Guide (10 minutes)](docs/quickstart.md)\n- [Recipes](docs/recipes.md)\n- [Quick Start — Python API](#quick-start--python-api)\n- [Quick Start — CLI](#quick-start--cli)\n- [Streamlit Web UI](#streamlit-web-ui)\n- [Power Modes](#power-modes)\n- [Configuration Reference](#configuration-reference)\n- [Output Structure](#output-structure)\n- [Power Curves](#power-curves)\n- [Sensitivity Analysis \u0026 MDE](#sensitivity-analysis)\n- [Comparing Criteria](#comparing-criteria)\n- [Augmenting Designs](#augmenting-an-existing-design)\n- [Split-Plot Designs (Hard-to-Change Factors)](#split-plot-designs-hard-to-change-factors)\n- [Diagnostics](#diagnostics)\n- [Shareable Reports](#shareable-reports)\n- [Candidate Set \u0026 Algorithm Details](#candidate-set--algorithm-details)\n- [Reproducibility](#reproducibility)\n- [Troubleshooting](#troubleshooting)\n- [License](#license)\n\n---\n\n## Installation\n\nRequires Python ≥ 3.9.\n\nIf you just want the core optimizer, install that. 
If you want YAML, plots, reports, or the app, add the extras you actually need.\n\n```bash\n# Core install (from source)\npip install -e .\n\n# With CLI support (YAML configs)\npip install -e \".[cli]\"\n\n# With visualization (power curve plots — matplotlib + plotly)\npip install -e \".[viz]\"\n\n# With Streamlit web UI (interactive frontend)\npip install -e \".[app]\"\n\n# With shareable HTML report generation (Jinja2 + Pillow)\npip install -e \".[report]\"\n\n# With PDF export support (requires system-level weasyprint dependencies)\npip install -e \".[report-pdf]\"\n\n# With progress bars and Excel export\npip install -e \".[extras]\"\n\n# Everything at once\npip install -e \".[all]\"\n```\n\n**Core dependencies:** `numpy`, `scipy`, `pandas`, `patsy`\n\n---\n\n## Quick Start — Python API\n\nIf you work in notebooks or scripts, this is the fastest path from \"I need a design\" to something you can actually run.\n\n### Contrast-based power\n\nSpecify which linear combination of coefficients you want to detect and by how much.\n\n```python\nfrom lattice_doe import (\n    find_optimal_design,\n    PowerContrastConfig,\n    DesignOptions,\n)\n\nformula = \"~ 1 + A + B + A:B\"\nfactors = {\n    \"A\": [\"low\", \"high\"],   # 2-level categorical → Patsy encodes 1 dummy column\n    \"B\": (0.0, 10.0),       # continuous: (low, high) tuple\n}\n# With these factors, Patsy encodes p = 4 columns:\n#   [Intercept, A[T.high], B, A[T.high]:B]\n# so L must have exactly 4 columns.\n\n# Contrast: power to detect a B main effect of 0.5 (the null hypothesis is that it equals 0)\n# L must be (q x p) where p = number of columns in the Patsy model matrix\npower_cfg = PowerContrastConfig(\n    L=[[0, 0, 1, 0]],   # one-row contrast selecting the B main-effect coefficient\n    delta=[0.5],         # minimum detectable effect (same units as sigma)\n    alpha=0.05,\n    power=0.80,\n    sigma=1.0,\n    max_n=500,\n)\n\nopts = DesignOptions(\n    auto_candidate=True,   # adaptive candidate sizing (recommended)\n    starts=8,\n    
random_state=42,\n)\n\nresult = find_optimal_design(\n    formula=formula,\n    factors=factors,\n    power_cfg=power_cfg,\n    design_opts=opts,\n)\n\ndesign_df  = result[\"design_df\"]    # DataFrame: n-run optimal design\nbuckets_df = result[\"buckets_df\"]   # DataFrame: unique run allocations with counts\nreport     = result[\"report\"]       # dict: power, n, lambda, df, timing, etc.\n\nprint(f\"n = {report['n']},  achieved power = {report['achieved_power']:.3f}\")\nprint(buckets_df)\n```\n\n### Building contrasts from scenarios\n\nUse `contrast_from_scenarios` to construct `L` and `delta` automatically by comparing two factor settings:\n\n```python\nfrom lattice_doe.contrasts import contrast_from_scenarios\n\nscenario_a = {\"A\": \"low\",  \"B\": 5.0}\nscenario_b = {\"A\": \"high\", \"B\": 5.0}\n\nL, delta = contrast_from_scenarios(\n    formula=formula,\n    factors=factors,\n    scenario_a=scenario_a,\n    scenario_b=scenario_b,\n    sesoi=1.0,   # smallest effect of interest (in response units)\n)\n\npower_cfg = PowerContrastConfig(L=L, delta=delta, alpha=0.05, power=0.80, sigma=1.0)\n```\n\n### Global R² power\n\nTest whether the full model explains a meaningful proportion of variance.\n\n```python\nfrom lattice_doe import PowerR2Config\n\npower_cfg = PowerR2Config(\n    r2_target=0.15,   # detect R² ≥ 0.15\n    alpha=0.05,\n    power=0.80,\n    max_n=500,\n    lambda_mode=\"n\",  # \"n\" (default) or \"n_minus_p\" (more conservative)\n)\n\nresult = find_optimal_design(formula, factors, power_cfg, opts)\n```\n\n---\n\n## Quick Start — CLI\n\nIf you would rather keep the logic in a config file and the outputs on disk, the CLI is the cleaner option.\n\nInstall with `pip install -e \".[cli]\"` for YAML support, then run:\n\n```bash\n# Generate a starter config (no installation of PyYAML needed for this step)\nlattice --template contrast \u003e config.yml   # contrast mode template\nlattice --template r2      \u003e config.yml   # global R² mode 
template\n\n# Generate a design\nlattice --config config.yml --out ./output/design\n\n# With Excel output and verbose logging\nlattice --config config.yml --out ./output/design --excel -v\n\n# Validate config without running (dry run)\nlattice --config config.yml --dry-run\n```\n\n**Contrast mode config (`config.yml`):**\n\n```yaml\nformula: \"~ 1 + A + B + A:B\"\n\nfactors:\n  A: [low, high]           # 2-level categorical → Patsy encodes 1 dummy column\n  B: [0.0, 10.0]           # continuous [low, high]\n# With these factors, Patsy encodes p = 4 columns:\n#   [Intercept, A[T.high], B, A[T.high]:B]\n# so L must have exactly 4 columns.\n\n# Option 1: explicit contrast matrix\ncontrast:\n  L: [[0, 0, 1, 0]]        # selects the B main-effect coefficient (column index 2)\n  delta: [0.5]\n\n# Option 2: scenario-based (auto-builds L and delta — safer, formula-agnostic)\n# contrast:\n#   scenario_a: {A: low,  B: 5.0}\n#   scenario_b: {A: high, B: 5.0}\n#   sesoi: 1.0\n\nalpha: 0.05\npower: 0.80\nsigma: 1.0\n\ndesign:\n  auto_candidate: true\n  starts: 8\n  algo: fedorov\n  random_state: 42\n\noutput:\n  basename: my_design\n  excel: false\n```\n\n**Global R² mode config:**\n\n```yaml\nformula: \"~ 1 + A + B + A:B\"\nfactors:\n  A: [low, med, high]\n  B: [0.0, 10.0]\n\nr2_target: 0.15\nalpha: 0.05\npower: 0.80\n\ndesign:\n  auto_candidate: true\n  starts: 8\n  random_state: 42\n```\n\nThe CLI always writes `\u003cbasename\u003e_design.csv`, `\u003cbasename\u003e_buckets.csv`, and `\u003cbasename\u003e_report.json`. Pass `--excel` (or set `output.excel: true`) to also produce an `.xlsx` workbook.\n\n---\n\n## Streamlit Web UI\n\nAn interactive browser-based frontend lets you configure and run designs, explore sensitivity, compare criteria, and download results. 
It is useful when you want something more guided than a script, or when not everyone on the team wants to touch Python.\n\n### Local run\n\n```bash\n# Install the package with Streamlit and Plotly\npip install -e \".[app]\"\n\n# Launch (opens at http://localhost:8501)\nstreamlit run app/app.py\n```\n\n### Docker\n\n```bash\ndocker build -t lattice-doe .\ndocker run -p 8501:8501 lattice-doe\n# Open http://localhost:8501\n```\n\n### Streamlit Community Cloud (free hosting)\n\n1. Push this repository to GitHub.\n2. Go to [share.streamlit.io](https://share.streamlit.io) and click **New app**.\n3. Select your repository and set **Main file path** to `app/app.py`.\n4. Click **Deploy** — no secrets or environment variables required.\n\nFor a full walkthrough see [Quick Start Guide § 5](docs/quickstart.md#5-streamlit-web-ui).\n\n---\n\n## Power Modes\n\nPick the power model that matches the question you are actually asking. This library does not make you pretend every problem is the same kind of effect test.\n\n### Contrast-based (`PowerContrastConfig`)\n\nTests H₀: Lβ = 0 against H₁: Lβ = δ using an F-test on the linear contrast.\n\n**Noncentrality parameter:**\n\n```\nλ = δᵀ [L (X'X)⁻¹ Lᵀ]⁺ δ / σ²\n```\n\n- `df_num` = rank(L), `df_denom` = n − rank(X)\n- Supports multiple simultaneous contrasts (q \u003e 1 rows in L)\n- L and delta are validated for shape consistency and non-zero content\n\n### Global R² (`PowerR2Config`)\n\nTests H₀: R² = 0 (all slopes are zero) using the omnibus F-test.\n\n**Noncentrality parameter** (via Cohen's f² = R²/(1−R²)):\n\n| `lambda_mode` | Formula | Matches |\n|---|---|---|\n| `\"n\"` (default) | λ = f² · n | G\\*Power, statsmodels |\n| `\"n_minus_p\"` | λ = f² · (n − p) | More conservative |\n\n- `df_num` = number of slope parameters (intercept excluded, per G\\*Power convention)\n- `df_denom` = n − rank(X)\n\n### GLM Wald χ² (`PowerGLMContrastConfig`)\n\nTests H₀: Lβ = 0 using a Wald chi-square statistic for binomial (logistic) or Poisson 
GLM responses.\n\nThe design search uses a **null-based locally optimal** information matrix: `M = w · X'X` where `w = p₀(1 − p₀)` (binomial) or `w = μ₀` (Poisson). Because `w` is a positive scalar it cancels from I/D/A criteria, so the Fedorov exchange is structurally identical to OLS — only the power calculation changes.\n\n\u003e **Approximation scope.** The Fisher weight `w` is a single scalar evaluated at the null baseline and applied uniformly to every design point. This is accurate when the true operating point is close to the baseline. For designs with wide covariate ranges and substantial slope effects, the true per-point weights `wᵢ = p(xᵢ)(1−p(xᵢ))` will vary across the design, and the constant-weight approximation may over- or understate power. Validate results via simulation when slopes are large relative to the baseline.\n\n```python\nfrom lattice_doe import PowerGLMContrastConfig\nfrom lattice_doe.contrasts import contrast_from_scenarios\n\nL, delta = contrast_from_scenarios(formula, factors, scenario_a, scenario_b, sesoi=0.4)\n\npower_cfg = PowerGLMContrastConfig(\n    L=L,\n    delta=delta,           # effect on the linear-predictor (log-odds / log-rate) scale\n    baseline=0.30,         # p₀ for binomial (0 \u003c p₀ \u003c 1) or μ₀ for Poisson (\u003e 0)\n    family=\"binomial\",     # \"binomial\" or \"poisson\"\n    link=None,             # None → canonical link (logit / log); or explicit \"logit\" / \"log\"\n    alpha=0.05,\n    power=0.80,\n    max_n=500,\n)\n\nresult = find_optimal_design(formula, factors, power_cfg, opts)\n```\n\nUse the CLI template to get started:\n\n```bash\nlattice --template glm-binomial \u003e glm_config.yml\nlattice --template glm-poisson  \u003e glm_config.yml\n```\n\n---\n\n## Configuration Reference\n\n### `PowerContrastConfig`\n\n| Parameter | Type | Default | Description |\n|---|---|---|---|\n| `L` | array (q×p) | — | Contrast matrix; must match model matrix column count |\n| `delta` | array (q,) | — | Minimum 
detectable effects (one per contrast row) |\n| `alpha` | float | `0.05` | Significance level |\n| `power` | float | `0.80` | Target power |\n| `sigma` | float | `1.0` | Residual standard deviation |\n| `max_n` | int | `2000` | Hard cap on sample size search |\n| `tol_power` | float | `1e-3` | Convergence tolerance |\n| `max_iter` | int | `200` | Max n-search iterations |\n\n### `PowerR2Config`\n\n| Parameter | Type | Default | Description |\n|---|---|---|---|\n| `r2_target` | float | — | Target R² effect size (0, 1) |\n| `alpha` | float | `0.05` | Significance level |\n| `power` | float | `0.80` | Target power |\n| `max_n` | int | `2000` | Hard cap on sample size search |\n| `lambda_mode` | `\"n\"` \\| `\"n_minus_p\"` | `\"n\"` | Noncentrality convention |\n| `tol_power` | float | `1e-3` | Convergence tolerance |\n| `max_iter` | int | `200` | Max n-search iterations |\n\n### `PowerGLMContrastConfig`\n\n| Parameter | Type | Default | Description |\n|---|---|---|---|\n| `L` | array (q×p) | — | Contrast matrix; must match model matrix column count |\n| `delta` | array (q,) | — | Effect sizes on the linear-predictor scale (log-odds for binomial, log-rate for Poisson) |\n| `baseline` | float | — | Baseline mean on the response scale: probability ∈ (0, 1) for binomial; expected count \u003e 0 for Poisson |\n| `family` | `\"binomial\"` \\| `\"poisson\"` | `\"binomial\"` | Response distribution family |\n| `link` | `\"logit\"` \\| `\"log\"` \\| `None` | `None` | Link function; `None` selects the canonical link for the family |\n| `alpha` | float | `0.05` | Significance level |\n| `power` | float | `0.80` | Target power |\n| `max_n` | int | `2000` | Hard cap on sample size search |\n| `tol_power` | float | `1e-3` | Convergence tolerance |\n| `max_iter` | int | `200` | Max n-search iterations |\n\n### `DesignOptions`\n\n| Parameter | Type | Default | Description |\n|---|---|---|---|\n| `random_state` | int | `123` | Global random seed (must be an integer; `None` is not 
allowed) |\n| `algo` | `\"fedorov\"` \\| `\"coordinate\"` | `\"fedorov\"` | API-compatibility selector; both map to the internal Fedorov exchange engine |\n| `starts` | int | `5` | Number of random starts |\n| `max_iter` | int | `1000` | Max iterations per start |\n| `xtx_jitter` | float | `1e-8` | Ridge added to X'X for numerical stability |\n| `criterion` | str | `\"I\"` | Optimality criterion: `\"I\"` (minimise average prediction variance), `\"D\"` (maximise `det(X'X)`), or `\"A\"` (minimise `trace((X'X)⁻¹)`) |\n| `candidate_points` | int | `2000` | Fixed candidate size (when `auto_candidate=False`) |\n| `auto_candidate` | bool | `False` | Adaptively size the candidate set |\n| `cand_min` | int | `1000` | Minimum candidate points (auto mode) |\n| `cand_max` | int | `10000` | Maximum candidate points (auto mode) |\n| `cat_cells_cap` | int | `10000` | Cap on categorical cell enumeration |\n| `per_cell_alpha` | float | `1.5` | Candidate multiplier per categorical cell |\n| `per_cell_min` | int | `5` | Min points per cell (mixed designs) |\n| `per_cell_max` | int | `20` | Max points per cell (mixed designs) |\n| `allow_candidate_growth` | bool | `False` | Grow candidate once if conditioning is poor |\n| `growth_factor` | float | `2.0` | Multiplier applied when growing candidate |\n| `workers` | int \\| None | `None` | Parallel workers for starts (None = serial) |\n| `parallel_seed_stride` | int | `10000` | Seed offset between parallel workers |\n| `constraint_func` | callable \\| None | `None` | Row-level feasibility filter (Python callable) |\n| `constraint_expr` | str \\| None | `None` | Row-level feasibility filter as a string expression (YAML/JSON-friendly alternative to `constraint_func`; see [Feasibility constraints](#candidate-set--algorithm-details)) |\n| `n_blocks` | int \\| None | `None` | Number of blocks (≥ 2 enables blocking; `None` / `0` = unblocked) |\n| `split_plot` | `SplitPlotOptions` \\| None | `None` | Split-plot configuration for hard-to-change 
factors (see [Split-Plot Designs](#split-plot-designs-hard-to-change-factors)) |\n\n**Parallel starts note:** On macOS and Windows, set `workers \u003e 1` only inside `if __name__ == \"__main__\":` (standard `multiprocessing` requirement).\n\n---\n\n## Output Structure\n\n`find_optimal_design(...)` returns a dict with three keys: the design, the replication structure, and the audit trail for how the optimizer got there. (For multi-response designs, see `find_multiresponse_design(...)` which returns a different structure with `design`, `buckets`, `responses`, and flat summary fields.)\n\n### `result[\"design_df\"]` — `DataFrame`\n\nThe n-run optimal design (I-, D-, or A-optimal depending on `criterion`). Each row is a selected point from the candidate set. Duplicate rows represent replicated runs.\n\n### `result[\"buckets_df\"]` — `DataFrame`\n\nUnique run allocations with replication counts:\n\n| A | B | count |\n|---|---|---|\n| low | 3.2 | 3 |\n| high | 7.8 | 2 |\n| ... | ... | ... |\n\n### `result[\"report\"]` — `dict`\n\nKey metrics from the design search:\n\n| Key | Description |\n|---|---|\n| `n` | Final sample size |\n| `p` | Number of model parameters |\n| `df_num` | Numerator degrees of freedom |\n| `df_denom` | Denominator degrees of freedom |\n| `alpha` | Significance level used |\n| `target_power` | Requested power |\n| `achieved_power` | Power of the returned design |\n| `noncentrality_lambda` | Noncentrality parameter λ |\n| `criterion` | Optimality criterion |\n| `algo` | Search algorithm used |\n| `starts` | Number of starts configured |\n| `workers` | Number of parallel workers used |\n| `candidate_points` | Candidate set size used |\n| `elapsed_sec` | Wall-clock seconds for the full n-search (excludes file export) |\n| `search_strategy` | Phases executed, e.g. 
`\"bisection\"`, `\"bisection+growth\"`, `\"bisection+verification\"` |\n| `verify_window` | Number of n values checked in the Phase 2 downward scan (0 if only bisection ran) |\n| `random_state` | The `random_state` seed that was used (from `DesignOptions`) |\n| `warnings` | List of warning messages issued during search (empty list if none) |\n| `diagnostics` | Dict of design quality metrics (condition number, D-efficiency, etc.) |\n\n---\n\n## Power Curves\n\nUse power curves when you are not ready to commit to one design yet and want to see how much the answer moves as `n` or effect size changes.\n\n```python\nfrom lattice_doe import power_curve_by_n, power_curve_by_effect\n\n# Power vs. n (sweeps a range of n values)\ndf_n = power_curve_by_n(\n    formula=formula,\n    factors=factors,\n    power_cfg=power_cfg,\n    design_opts=opts,\n)\nprint(df_n)   # columns: n, power\n\n# Power vs. effect size (sweeps delta or r2_target at a fixed n)\ndf_eff = power_curve_by_effect(\n    formula=formula,\n    factors=factors,\n    n=30,            # fixed sample size for the sweep\n    power_cfg=power_cfg,\n    design_opts=opts,\n)\nprint(df_eff)   # columns: effect_scale,power (contrast) or r2_target,power (R²)\n```\n\nBoth functions respect `auto_candidate` in `DesignOptions`.\n\n### 2D power surface\n\nSweep two parameters simultaneously to produce a contour map — useful for understanding the joint sensitivity of your design to (n, effect), (effect, sigma), etc.\n\n```python\nfrom lattice_doe.power_curves import power_surface_2d\n\nresult = power_surface_2d(\n    formula=formula,\n    factors=factors,\n    power_cfg=power_cfg,\n    param1=\"n\",           # y-axis: 'n', 'effect', 'sigma', or 'alpha'\n    param1_range=(10, 80),\n    param2=\"effect\",      # x-axis: multiplier on delta (1.0 = nominal)\n    param2_range=(0.3, 2.0),\n    grid_points=20,\n    design_opts=opts,\n    plot=True,            # returns a filled contour figure\n)\n\nprint(result[\"data\"])          
# DataFrame: param1, param2, power, noncentrality_lambda\nprint(result[\"power_grid\"])    # 2D numpy array of power values\n# result[\"figure\"]             # matplotlib Figure with contour plot\n```\n\n**Notes on `param1` / `param2` semantics:**\n\n| Parameter | `PowerContrastConfig` | `PowerR2Config` |\n|---|---|---|\n| `\"n\"` | Sample size (integer) | Sample size (integer) |\n| `\"effect\"` | Scale multiplier on `delta` (1.0 = nominal) | Actual `r2_target` value |\n| `\"sigma\"` | Absolute σ value | ❌ not applicable |\n| `\"alpha\"` | Significance level | Significance level |\n\nWhen neither axis is `\"n\"`, the function builds one optimal design at a representative n and sweeps analytically (fast). When `\"n\"` is an axis, one optimal design is built per unique n value (expensive but cached).\n\n### Interactive charts (Plotly)\n\nAll four power-analysis functions accept an opt-in `plot_backend=\"plotly\"` parameter that returns a `plotly.graph_objects.Figure` instead of a matplotlib Figure.  The default (`\"matplotlib\"`) is unchanged — no existing code breaks.\n\n```bash\npip install -e \".[viz]\"   # includes plotly\u003e=5.0\n```\n\n```python\nfrom lattice_doe.power_curves import power_curve_by_n\n\nresult = power_curve_by_n(\n    formula=formula,\n    factors=factors,\n    power_cfg=power_cfg,\n    design_opts=opts,\n    plot=True,\n    plot_backend=\"plotly\",\n)\nfig = result[\"figure\"]   # plotly.graph_objects.Figure\nfig.show()               # interactive in Jupyter / browser\n```\n\nPlotly charts support hover tooltips, zoom/pan, and one-click PNG export (camera icon in the toolbar).  They also work directly in Streamlit:\n\n```python\nimport streamlit as st\nst.plotly_chart(result[\"figure\"])\n```\n\nThe same `plot_backend` parameter is available on `power_curve_by_effect`, `power_surface_2d`, and `power_sensitivity`.  
To access the figure from those functions call the implementation modules directly (the `lattice_doe` top-level wrappers discard the figure for backward compatibility):\n\n```python\nfrom lattice_doe.power_curves import power_curve_by_effect, power_surface_2d\nfrom lattice_doe import power_sensitivity\n```\n\n### Sensitivity analysis\n\nReveal how much power changes if a key assumption is wrong — without rebuilding any designs.\n\n```python\nfrom lattice_doe import power_sensitivity\n\n# Contrast mode: sweep sigma\nsensitivity = power_sensitivity(\n    formula=formula,\n    factors=factors,\n    power_cfg=power_cfg,          # PowerContrastConfig → sweeps sigma\n    design_df=result[\"design_df\"],\n    sigma_range=(0.5, 2.0),       # sweep σ from 0.5 to 2.0\n    sigma_points=30,\n    plot=True,\n)\nprint(sensitivity[\"data\"])           # DataFrame: sigma, power, noncentrality_lambda\nprint(sensitivity[\"nominal_power\"])  # power at the configured sigma\n# sensitivity[\"figure\"]              # matplotlib Figure\n\n# R² mode: sweep r2_target (sigma does not enter the R² power formula)\nsensitivity_r2 = power_sensitivity(\n    formula=formula,\n    factors=factors,\n    power_cfg=r2_power_cfg,       # PowerR2Config → sweeps r2_target\n    design_df=result[\"design_df\"],\n    r2_range=(0.05, 0.50),        # sweep R² from 5 % to 50 %\n    r2_points=30,\n    plot=True,\n)\nprint(sensitivity_r2[\"data\"])        # DataFrame: r2_target, power, noncentrality_lambda\nprint(sensitivity_r2[\"r2_nominal\"])  # the nominal r2_target from power_cfg\n```\n\n### Minimum detectable effect\n\nFind the smallest effect your design can detect at a given power — no new design needed.\n\n```python\nfrom lattice_doe import min_detectable_effect\n\n# Contrast mode: MDE expressed as a scale factor on delta\nmde = min_detectable_effect(\n    design_df=result[\"design_df\"],\n    formula=formula,\n    factors=factors,\n    power_cfg=power_cfg,       # PowerContrastConfig\n    
target_power=0.80,\n)\nprint(mde[\"mde\"])              # scale factor (1.0 = original delta is just detectable)\nprint(mde[\"achieved_power\"])   # power at the MDE\n\n# R² mode: MDE expressed as the minimum detectable r2_target\nmde_r2 = min_detectable_effect(\n    design_df=result[\"design_df\"],\n    formula=formula,\n    factors=factors,\n    power_cfg=r2_power_cfg,    # PowerR2Config\n    target_power=0.80,\n)\nprint(mde_r2[\"mde\"])           # minimum r2_target detectable at 80 % power\n```\n\n### Comparing criteria\n\nNot sure which optimality criterion is right for your study?  `compare_criteria` runs the full powered-design search under each of `\"I\"`, `\"D\"`, and `\"A\"` (or any subset) and returns a side-by-side summary in a single call.\n\n```python\nfrom lattice_doe import compare_criteria, DesignOptions\n\ncomparison = compare_criteria(\n    formula=formula,\n    factors=factors,\n    power_cfg=power_cfg,          # shared across all criteria\n    design_opts=DesignOptions(    # criterion field is overridden per run\n        auto_candidate=True,\n        starts=8,\n        random_state=42,\n    ),\n    criteria=[\"I\", \"D\", \"A\"],    # default; any non-empty subset is valid\n    plot=True,                    # side-by-side bar charts (requires matplotlib)\n)\n\nprint(comparison[\"summary\"])\n# criterion   n   achieved_power  elapsed_sec  condition_number  d_efficiency\n# I          24        0.814         1.23           12.5             0.81\n# D          22        0.823         1.17           10.2             1.00\n# A          23        0.811         1.19           11.8             0.94\n\n# Access the full result for a specific criterion\ni_design = comparison[\"results\"][\"I\"][\"design_df\"]\nd_report  = comparison[\"results\"][\"D\"][\"report\"]\n```\n\nThe function never mutates *design_opts* — it uses `dataclasses.replace` to create a per-criterion copy.  
When only two criteria are needed, pass a subset such as `criteria=[\"I\", \"D\"]`.\n\n---\n\n### Augmenting an existing design\n\nAdd runs to a design that already exists, fixing the original rows in place:\n\n```python\nfrom lattice_doe import augment_design, DesignOptions\n\n# Suppose existing_design is a DataFrame with 20 runs\naugmented, new_runs = augment_design(\n    design_df=existing_design,\n    m=5,                            # add 5 new runs\n    formula=formula,\n    factors=factors,\n    design_opts=DesignOptions(criterion=\"I\", random_state=42),\n)\n\nprint(f\"Original: {len(existing_design)} runs\")\nprint(f\"Augmented: {len(augmented)} runs\")\nprint(new_runs)                     # the 5 newly added rows\n```\n\n`augment_design` uses a greedy one-point-at-a-time exchange that optimises the same criterion (`\"I\"`, `\"D\"`, or `\"A\"`) as the original design search. It is fast but does not guarantee global optimality.\n\n---\n\n## Split-Plot Designs (Hard-to-Change Factors)\n\nWhen some factors are **hard to change** (HTC) between runs (oven temperature, batch composition, equipment configuration), a split-plot design groups runs into **whole plots** (WP) so that HTC factors are reset only once per group, while **easy-to-change (ETC)** sub-plot (SP) factors vary freely within each group.\n\nIgnoring this structure and fitting with ordinary least squares (OLS) overstates the precision of WP-factor estimates. This package uses a **GLS information matrix** (`X'V⁻¹X` where `V = η·ZZ' + I`) for both design search and power calculations, giving correct Type-I error and power for both WP and SP effects.\n\n### `SplitPlotOptions` reference\n\n| Parameter | Type | Default | Description |\n|---|---|---|---|\n| `htc_factors` | `List[str]` | — | Factor names that are hard-to-change (whole-plot factors). Must be a non-empty subset of the factor names passed to the API. |\n| `n_whole_plots` | int | — | Number of whole plots (outer randomization units). Must be ≥ 2. 
|\n| `eta` | float | `1.0` | Variance ratio σ²_wp / σ²_sp. Must be ≥ 0. `eta=0` degenerates to OLS. |\n| `subplots_per_wp` | int \\| None | `None` | Sub-plots per whole plot. `None` → auto-computed as `max(2, ceil(p / n_wp) + 1)`. |\n| `df_method` | `\"auto\"` \\| `\"conservative\"` \\| `\"sp_only\"` | `\"auto\"` | Denominator df assignment. `\"auto\"` classifies each contrast by stratum; `\"conservative\"` always uses WP df; `\"sp_only\"` always uses SP df. |\n\n### Python API\n\n```python\nfrom lattice_doe import (\n    find_optimal_design,\n    SplitPlotOptions,\n    DesignOptions,\n    PowerContrastConfig,\n    power_curve_by_wp,\n)\nfrom lattice_doe.contrasts import contrast_from_scenarios\n\nformula = \"~ 1 + A + B + C\"\nfactors = {\n    \"A\": (-1.0, 1.0),   # HTC: whole-plot factor\n    \"B\": (-1.0, 1.0),   # HTC: whole-plot factor\n    \"C\": (-1.0, 1.0),   # ETC: sub-plot factor\n}\n\nL, delta = contrast_from_scenarios(\n    formula, factors,\n    {\"A\": -1.0, \"B\": -1.0, \"C\": 0.0},\n    {\"A\":  1.0, \"B\":  1.0, \"C\": 0.0},\n    sesoi=1.0,\n)\npower_cfg = PowerContrastConfig(L=L, delta=delta, power=0.80, sigma=1.0, max_n=200)\n\nresult = find_optimal_design(\n    formula=formula,\n    factors=factors,\n    power_cfg=power_cfg,\n    design_opts=DesignOptions(\n        split_plot=SplitPlotOptions(\n            htc_factors=[\"A\", \"B\"],\n            n_whole_plots=6,\n            eta=1.5,           # σ²_wp / σ²_sp\n            subplots_per_wp=4, # optional; auto-computed if omitted\n            df_method=\"auto\",  # \"auto\" | \"conservative\" | \"sp_only\"\n        ),\n        starts=8,\n        random_state=42,\n    ),\n)\n\nprint(f\"n = {result['report']['n']},  power = {result['report']['achieved_power']:.3f}\")\nprint(result[\"report\"][\"split_plot\"])\n# {'n_whole_plots': 6, 'subplots_per_wp': 4, 'n_total': 24,\n#  'eta': 1.5, 'htc_factors': ['A', 'B'], 'etc_factors': ['C'], 'df_method': 'auto'}\n\n# The design DataFrame includes a 
__wp_id__ column\ndesign_df = result[\"design_df\"]\nprint(design_df[[\"__wp_id__\", \"A\", \"B\", \"C\"]].head(8))\n```\n\n### Power vs. number of whole plots\n\n```python\ndf = power_curve_by_wp(\n    formula=formula,\n    factors=factors,\n    power_cfg=power_cfg,\n    subplots_per_wp=4,\n    htc_factors=[\"A\", \"B\"],\n    eta=1.5,\n    wp_range=(3, 12),   # sweep n_whole_plots from 3 to 12\n    wp_points=10,\n    design_opts=DesignOptions(starts=5, random_state=42),\n)\nprint(df)  # columns: n_wp, n_total, power, noncentrality_lambda\n```\n\n### CLI (YAML config)\n\nAdd a `split_plot:` block inside the `design:` section:\n\n```yaml\nformula: \"~ 1 + A + B + C\"\nfactors:\n  A: [-1.0, 1.0]\n  B: [-1.0, 1.0]\n  C: [-1.0, 1.0]\n\ncontrast:\n  scenario_a: {A: -1.0, B: -1.0, C: 0.0}\n  scenario_b: {A:  1.0, B:  1.0, C: 0.0}\n  sesoi: 1.0\n\nalpha: 0.05\npower: 0.80\nsigma: 1.0\n\ndesign:\n  starts: 8\n  random_state: 42\n  split_plot:\n    htc_factors: [A, B]\n    n_whole_plots: 6\n    eta: 1.5\n    subplots_per_wp: 4    # omit for auto\n    df_method: auto       # auto | conservative | sp_only\n```\n\n### η-sensitivity sweep\n\nAssess how power degrades as the variance ratio η grows:\n\n```python\nfrom lattice_doe import power_sensitivity\n\nresult = find_optimal_design(formula, factors, power_cfg,\n    DesignOptions(split_plot=SplitPlotOptions(htc_factors=[\"A\",\"B\"], n_whole_plots=6, eta=1.5),\n                  starts=5, random_state=42))\n\nsens = power_sensitivity(\n    formula=formula,\n    factors=factors,\n    power_cfg=power_cfg,\n    design_df=result[\"design_df\"],\n    eta_range=(0.0, 5.0),  # sweep η from 0 (OLS) to 5\n    eta_points=25,\n)\nprint(sens[\"data\"])        # columns: eta, power, noncentrality_lambda\n```\n\n### Notes\n\n- `n_blocks` and `split_plot` cannot both be set (blocked split-plots are not yet supported).\n- `eta=0` degenerates to OLS: the GLS power equals the OLS power for that design.\n- The `\"conservative\"` df_method never 
produces anti-conservative power for WP-factor contrasts.\n- Set `criterion_ignore_vr=True` inside `SplitPlotOptions` to use the standard OLS criterion during design search while keeping GLS power calculations (useful for benchmarking only).\n- **Denominator df approximation.** df assignment uses a WP-vs-SP stratum classification heuristic, not a full Satterthwaite or Kenward-Roger small-sample correction. For balanced designs with a single variance component this gives exact df. For unbalanced designs or near-singular settings the heuristic can be conservative or anti-conservative; use `df_method=\"conservative\"` when in doubt.\n\n---\n\n## Diagnostics\n\nIf you want more than \"here is your design, good luck,\" export diagnostics alongside the main outputs:\n\n```python\nresult = find_optimal_design(\n    formula=formula,\n    factors=factors,\n    power_cfg=power_cfg,\n    design_opts=opts,\n    export_diagnostics_to=\"./output/\",   # folder path\n)\n```\n\nDiagnostics written to the output folder include:\n\n- **Condition number** — detects near-collinearity\n- **D-efficiency** — relative efficiency vs. D-optimal reference\n- **Leverage** — per-row hat values and summary statistics\n- **VIFs** — variance inflation factors per regressor\n- **I-criterion** — average prediction variance over the candidate region\n\nOutput formats: HTML tables, CSV, and optional plots.\n\n---\n\n## Shareable Reports\n\nGenerate a self-contained HTML file (no external dependencies, works offline) that summarises the design configuration, power metrics, design table, diagnostics, and an embedded power-curve figure. 
This is handy when you need to hand results to someone who does not want a notebook or CLI log.\n\n### Install\n\n```bash\npip install -e \".[report]\"          # HTML reports (Jinja2 + Pillow)\npip install -e \".[report-pdf]\"      # also enables PDF export via weasyprint\n```\n\n### Python API\n\n```python\nfrom lattice_doe import generate_report\n\ngenerate_report(\n    result=result,          # dict returned by find_optimal_design()\n    formula=formula,\n    factors=factors,\n    power_cfg=power_cfg,\n    output_path=\"./reports/my_design.html\",   # .html or .pdf\n)\n```\n\n### Inline with the optimizer\n\nPass `export_report_to=` directly to `find_optimal_design()` to write the report immediately after the design is found:\n\n```python\nresult = find_optimal_design(\n    formula=formula,\n    factors=factors,\n    power_cfg=power_cfg,\n    design_opts=opts,\n    export_report_to=\"./output/\",   # writes the default HTML report into this folder\n)\n\n# Path stored in result for reference\nprint(result[\"report\"][\"report_path\"])\n```\n\nIf report generation fails (e.g. `jinja2` not installed), the error message is stored in `result[\"report\"][\"report_path_error\"]` and the design result is still returned normally.\n\n### CLI\n\n```bash\nlattice --config my_config.yaml --out results --html-report\n# writes: results_report.html alongside results_design.csv, etc.\n```\n\nOr set it permanently in your YAML config:\n\n```yaml\noutput:\n  html_report: true\n```\n\n### PDF export\n\nReplace the `.html` extension with `.pdf`:\n\n```python\ngenerate_report(..., output_path=\"report.pdf\")\n```\n\n\u003e **Note:** PDF export requires `weasyprint`, which depends on system-level libraries (`libpango`, `libcairo`, `libgdk-pixbuf`). These are unavailable on Streamlit Community Cloud and some CI environments. 
Install with `pip install -e \".[report-pdf]\"` and follow the [weasyprint installation guide](https://doc.courtbouillon.org/weasyprint/stable/first_steps.html) for your OS.\n\n---\n\n## Google Sheets Integration\n\nConnect directly to a Google Spreadsheet to read your design config and write results back — no local YAML file needed. It is a practical option for teams who already live in spreadsheets.\n\n### Install\n\n```bash\npip install \"lattice-doe[sheets]\"\n```\n\n### Create a starter spreadsheet (once)\n\n```python\nfrom lattice_doe import create_sheet_template\n\n# Creates a new spreadsheet with Config/Results/Design/Buckets sheets pre-filled\nurl = create_sheet_template(\n    title=\"My DOE\",\n    credentials=\"service_account.json\",   # or None for OAuth2 browser flow\n    example=\"r2\",                         # \"r2\" or \"contrast\"\n)\nprint(url)  # open this URL, fill in your factors and formula, then run below\n```\n\n### Run from the spreadsheet\n\n```python\nfrom lattice_doe import sheets_run\n\nresult = sheets_run(url, credentials=\"service_account.json\")\nprint(f\"Optimal n = {result['report']['n']}\")\nprint(f\"Achieved power = {result['report']['achieved_power']:.3f}\")\n# Design/Results/Buckets sheets are now populated in the spreadsheet\n```\n\n### CLI\n\n```bash\nlattice --sheets \"https://docs.google.com/spreadsheets/d/…\" \\\n            --sheets-credentials service_account.json\n```\n\nIf `--sheets-credentials` is omitted, the `GOOGLE_APPLICATION_CREDENTIALS` environment variable is checked, then an OAuth2 browser flow is used as a fallback.\n\n### Authentication\n\n| `credentials` value | Auth mode |\n|---------------------|-----------|\n| `\"path/to/sa.json\"` | Service account — for CI/automation; share the spreadsheet with the SA email |\n| `None` | OAuth2 browser flow — opens a tab on first use, caches token in `~/.config/gspread/` |\n\n---\n\n## Candidate Set \u0026 Algorithm Details\n\n### Factor 
specifications\n\n```python\nfactors = {\n    \"Temperature\": (20.0, 80.0),       # continuous: 2-element numeric tuple/list\n    \"Catalyst\":    [\"A\", \"B\", \"C\"],    # categorical: list of levels\n    \"Time\":        (1.0, 5.0),         # continuous\n}\n```\n\nContinuous factors with exactly two numeric elements are sampled via Latin Hypercube. Categorical factors are enumerated as a Cartesian product (capped at `cat_cells_cap`). Mixed designs combine both.\n\n### Adaptive candidate sizing (`auto_candidate=True`)\n\nRecommended for most use cases. The package sizes the candidate set based on:\n\n- Number and type of factors\n- Categorical cell count (capped at `cat_cells_cap`)\n- `per_cell_alpha`, `per_cell_min`, `per_cell_max` multipliers\n- `cand_min` and `cand_max` bounds\n\n### Candidate growth\n\nIf `allow_candidate_growth=True`, the candidate set is grown by `growth_factor` once during the search if the design matrix condition number exceeds 10⁶. This is a safety net for difficult factor spaces.\n\n### Search algorithm\n\nDesign search uses an internal vectorised Fedorov point-exchange optimizer that operates\ndirectly on the Patsy model matrix and has no dependency on external design-of-experiments libraries.\n\nThe `algo` option (`\"fedorov\"` or `\"coordinate\"`) is currently retained for\nAPI compatibility; both settings route to this internal exchange implementation.\n\n### Feasibility constraints\n\nTwo equivalent ways to exclude infeasible candidate points:\n\n**Python callable** — full flexibility, for use in Python scripts:\n\n```python\ndef no_high_temp_low_time(row):\n    return not (row[\"Temperature\"] \u003e 70 and row[\"Time\"] \u003c 2)\n\nopts = DesignOptions(constraint_func=no_high_temp_low_time)\n```\n\n**String expression** — YAML/JSON-friendly, reproducible from config files without Python code:\n\n```python\n# In 
Python:\nopts = DesignOptions(constraint_expr=\"not (Temperature \u003e 70 and Time \u003c 2)\")\n\n# Compound and math expressions work too (the first assumes a Pressure factor is defined):\nopts = DesignOptions(constraint_expr=\"sqrt(Temperature) + Pressure \u003c= 20\")\nopts = DesignOptions(constraint_expr=\"Catalyst != 'C' or Time \u003c= 3\")\n```\n\nIn a YAML config file:\n\n```yaml\ndesign:\n  constraint_expr: \"not (Temperature \u003e 70 and Time \u003c 2)\"\n```\n\nAvailable functions and constants inside `constraint_expr`: `abs`, `min`, `max`, `round`, `sqrt`, `log`, `log2`, `log10`, `exp`, `floor`, `ceil`, `pi`.  The expression is evaluated with each factor's column name as a local variable.  No imports or arbitrary code execution are permitted.\n\nIf both `constraint_func` and `constraint_expr` are set, `constraint_expr` takes precedence.\n\n---\n\n## Reproducibility\n\nFix `random_state` in `DesignOptions` to reproduce candidate generation, design search, and parallel start assignments exactly. If you need to defend why a design changed, start here.\n\n---\n\n## Troubleshooting\n\nMost failures here are informative. The package is usually telling you that the model, the search limits, or the contrast definition needs another look. The three most common errors and how to fix them are below.\n\n### `ValueError: power_cfg.max_n (N) must be greater than the number of model parameters p (M).`\n\n**What it means.** The cap on the search range (`max_n`) is too small relative to the number of model parameters `p`. Patsy expanded the formula into `M` columns (intercept, main effects, interactions, dummy levels for categoricals), and any usable design needs more runs than parameters. 
The check fires before the search starts, so no design is returned.\n\n**How to inspect `p` for your formula** without running the search:\n\n```python\nfrom lattice_doe.candidate import build_candidate\nfrom lattice_doe.model_matrix import build_model_matrix\n\nformula = \"~ 1 + A + B + A:B + C\"\nfactors = {\"A\": (-1.0, 1.0), \"B\": (-1.0, 1.0), \"C\": [\"low\", \"med\", \"high\"]}\n\ncand = build_candidate(factors, candidate_points=20, seed=0)\nX, names = build_model_matrix(formula, cand)\nprint(f\"p = {X.shape[1]}\")\nprint(\"columns:\", names)\n```\n\n**Fixes.** Raise `max_n` in the power config (e.g. `PowerContrastConfig(..., max_n=200)`), or simplify the formula by dropping interactions or collapsing categorical levels.\n\n### `ValueError: Contrast L has X columns but model has p_treat=Y treatment parameters.`\n\n**What it means.** Your contrast matrix `L` has the wrong number of columns. `L` must have one column per parameter in the **Patsy-encoded** model matrix — including the intercept, dummy columns for categorical factors, and interaction terms. Hand-written `L` arrays are the usual culprit.\n\n**How to inspect the column layout:**\n\n```python\nfrom lattice_doe.candidate import build_candidate\nfrom lattice_doe.model_matrix import build_model_matrix\n\ncand = build_candidate(factors, candidate_points=20, seed=0)\nX, names = build_model_matrix(formula, cand)\nfor i, name in enumerate(names):\n    print(f\"  column {i}: {name}\")\n```\n\n**Fix (recommended).** Don't write `L` by hand. 
Use `contrast_from_scenarios`, which always produces a correctly-shaped row for the encoded matrix:\n\n```python\nfrom lattice_doe.contrasts import contrast_from_scenarios\n\nL, delta = contrast_from_scenarios(\n    formula=formula,\n    factors=factors,\n    scenario_a={\"A\": -1.0, \"B\": 0.0},\n    scenario_b={\"A\":  1.0, \"B\": 0.0},\n    sesoi=2.0,\n)\n```\n\nIf you do need a hand-written `L`, rebuild it to match the printed `names` list — one entry per column.\n\n### `RuntimeWarning: Design generation finished without converging to target power.`\n\n**What it means.** The search stopped without the design hitting the target power. The full warning message names the limit that was hit (`max_iter` or `max_n`), the achieved power, and the final `n`. The best design found is still returned — `result[\"design_df\"]` and `result[\"report\"]` are both populated — but `report[\"achieved_power\"]` is below `power_cfg.power` and the warning text lands in `report[\"warnings\"]`.\n\n**How to inspect what happened:**\n\n```python\nresult = find_optimal_design(\n    formula=formula,\n    factors=factors,\n    power_cfg=PowerContrastConfig(L=L, delta=delta, power=0.80, sigma=1.0, max_n=50),\n    design_opts=DesignOptions(criterion=\"I\"),\n)\n\nprint(\"achieved :\", result[\"report\"][\"achieved_power\"])\nprint(\"target   :\", result[\"report\"][\"target_power\"])\nprint(\"strategy :\", result[\"report\"][\"search_strategy\"])\nprint(\"warnings :\", result[\"report\"][\"warnings\"])\n```\n\n**Fixes, in order of preference.**\n- Raise `max_n` to give the search more room.\n- Raise `starts` in `DesignOptions` to do more random restarts (helps when the search keeps landing in local optima).\n- Relax `power_cfg.power` to a target you can actually hit.\n- Increase the SESOI (`delta`) — if you're asking for an effect smaller than the noise can resolve at any reasonable `n`, no amount of design optimization will save you.\n\n### Other common failures\n\n**Poor conditioning / 
near-singular X'X.** Enable `allow_candidate_growth=True`, increase `candidate_points`, or bump `xtx_jitter` slightly (e.g. `1e-6`).\n\n**Parallelism on macOS / Windows.** Guard `workers \u003e 1` calls inside `if __name__ == \"__main__\":`.\n\n---\n\n## License\n\nMIT\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fmbagalman%2Flattice-doe","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fmbagalman%2Flattice-doe","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fmbagalman%2Flattice-doe/lists"}