{"id":35370153,"url":"https://github.com/igerber/diff-diff","last_synced_at":"2026-04-20T02:07:47.041Z","repository":{"id":331382355,"uuid":"1126412969","full_name":"igerber/diff-diff","owner":"igerber","description":"A Python library for Difference-in-Differences (DiD) causal inference analysis with an sklearn-like API and statsmodels-style outputs.","archived":false,"fork":false,"pushed_at":"2026-02-16T00:08:12.000Z","size":2658,"stargazers_count":94,"open_issues_count":1,"forks_count":12,"subscribers_count":2,"default_branch":"main","last_synced_at":"2026-02-16T00:13:23.822Z","etag":null,"topics":["analytics","causal-inference","data-science","difference-in-differences","econometrics","economics"],"latest_commit_sha":null,"homepage":"https://diff-diff.readthedocs.io","language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":null,"status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/igerber.png","metadata":{"files":{"readme":"README.md","changelog":"CHANGELOG.md","contributing":null,"funding":null,"license":null,"code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":"ROADMAP.md","authors":null,"dei":null,"publiccode":null,"codemeta":null,"zenodo":null,"notice":null,"maintainers":null,"copyright":null,"agents":null,"dco":null,"cla":null}},"created_at":"2026-01-01T21:27:51.000Z","updated_at":"2026-02-15T23:30:13.000Z","dependencies_parsed_at":null,"dependency_job_id":null,"html_url":"https://github.com/igerber/diff-diff","commit_stats":null,"previous_names":["igerber/diff-diff"],"tags_count":30,"template":false,"template_full_name":null,"purl":"pkg:github/igerber/diff-diff","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/igerber%2Fdiff-diff","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/igerber%2Fdiff-diff/tags","releases_url":"ht
tps://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/igerber%2Fdiff-diff/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/igerber%2Fdiff-diff/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/igerber","download_url":"https://codeload.github.com/igerber/diff-diff/tar.gz/refs/heads/main","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/igerber%2Fdiff-diff/sbom","scorecard":null,"host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":286080680,"owners_count":29494605,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2026-02-16T00:00:57.352Z","status":"ssl_error","status_checked_at":"2026-02-15T23:56:34.338Z","response_time":118,"last_error":"SSL_connect returned=1 errno=0 peeraddr=140.82.121.6:443 state=error: unexpected eof while reading","robots_txt_status":"success","robots_txt_updated_at":"2025-07-24T06:49:26.215Z","robots_txt_url":"https://github.com/robots.txt","online":false,"can_crawl_api":true,"host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["analytics","causal-inference","data-science","difference-in-differences","econometrics","economics"],"created_at":"2026-01-02T02:52:26.150Z","updated_at":"2026-04-20T02:07:47.028Z","avatar_url":"https://github.com/igerber.png","language":"Python","funding_links":[],"categories":["Causal Inference and Econometrics"],"sub_categories":["Frontier Tools"],"readme":"# diff-diff\n\n[![PyPI version](https://img.shields.io/pypi/v/diff-diff.svg)](https://pypi.org/project/diff-diff/)\n[![Python 
versions](https://img.shields.io/pypi/pyversions/diff-diff.svg)](https://pypi.org/project/diff-diff/)\n[![License: MIT](https://img.shields.io/badge/License-MIT-blue.svg)](https://opensource.org/licenses/MIT)\n[![Downloads](https://img.shields.io/pypi/dm/diff-diff.svg)](https://pypi.org/project/diff-diff/)\n[![Documentation](https://readthedocs.org/projects/diff-diff/badge/?version=stable)](https://diff-diff.readthedocs.io/en/stable/)\n[![DOI](https://zenodo.org/badge/DOI/10.5281/zenodo.19646175.svg)](https://doi.org/10.5281/zenodo.19646175)\n\nA Python library for Difference-in-Differences (DiD) causal inference analysis with an sklearn-like API and statsmodels-style outputs.\n\n## Installation\n\n```bash\npip install diff-diff\n```\n\nOr install from source:\n\n```bash\ngit clone https://github.com/igerber/diff-diff.git\ncd diff-diff\npip install -e .\n```\n\n## Quick Start\n\n```python\nimport pandas as pd\nfrom diff_diff import DifferenceInDifferences  # or: DiD\n\n# Create sample data\ndata = pd.DataFrame({\n    'outcome': [10, 11, 15, 18, 9, 10, 12, 13],\n    'treated': [1, 1, 1, 1, 0, 0, 0, 0],\n    'post': [0, 0, 1, 1, 0, 0, 1, 1]\n})\n\n# Fit the model\ndid = DifferenceInDifferences()\nresults = did.fit(data, outcome='outcome', treatment='treated', time='post')\n\n# View results\nprint(results)  # DiDResults(ATT=3.0000, SE=1.7321, p=0.1583)\nresults.print_summary()\n```\n\nOutput:\n```\n======================================================================\n             Difference-in-Differences Estimation Results\n======================================================================\n\nObservations:                      8\nTreated units:                     4\nControl units:                     4\nR-squared:                    0.9055\n\n----------------------------------------------------------------------\nParameter           Estimate    Std. Err.     
t-stat      P\u003e|t|\n----------------------------------------------------------------------\nATT                   3.0000       1.7321      1.732     0.1583\n----------------------------------------------------------------------\n\n95% Confidence Interval: [-1.8089, 7.8089]\n\nSignif. codes: '***' 0.001, '**' 0.01, '*' 0.05, '.' 0.1\n======================================================================\n```\n\n## For AI Agents\n\nIf you are an AI agent or LLM using this library, call `diff_diff.get_llm_guide()` for a concise API reference with an 8-step practitioner workflow (based on Baker et al. 2025). The workflow ensures rigorous DiD analysis — not just calling `fit()`, but testing assumptions, running sensitivity analysis, and checking robustness.\n\n```python\nfrom diff_diff import get_llm_guide\n\nget_llm_guide()                 # concise API reference\nget_llm_guide(\"practitioner\")   # 8-step workflow (Baker et al. 2025)\nget_llm_guide(\"full\")           # comprehensive documentation\n```\n\nThe guides are bundled in the wheel, so they are accessible from a `pip install` with no network access required.\n\nAfter estimation, call `practitioner_next_steps(results)` for context-aware guidance on remaining diagnostic steps.\n\n## For Data Scientists\n\nMeasuring campaign lift? Evaluating a product launch? 
diff-diff handles the causal inference so you can focus on the business question.\n\n- **[Which method fits my problem?](docs/practitioner_decision_tree.rst)** - Start from your business scenario (campaign in some markets, staggered rollout, survey data) and find the right estimator\n- **[Getting started for practitioners](docs/practitioner_getting_started.rst)** - End-to-end walkthrough: marketing campaign -\u003e causal estimate -\u003e stakeholder-ready result\n- **[Brand awareness survey tutorial](docs/tutorials/17_brand_awareness_survey.ipynb)** - Full example with complex survey design, brand funnel analysis, and staggered rollouts\n- **Have BRFSS/ACS/CPS individual records?** Use [`aggregate_survey()`](docs/api/prep.rst) to roll respondent-level microdata into a geographic-period panel with inverse-variance precision weights. The returned second-stage design uses analytic weights (`aweight`), so it works directly with `DifferenceInDifferences`, `TwoWayFixedEffects`, `MultiPeriodDiD`, `SunAbraham`, `ContinuousDiD`, and `EfficientDiD` (estimators marked **Full** in the [survey support matrix](docs/choosing_estimator.rst))\n\n### Experimental preview: `BusinessReport` and `DiagnosticReport`\n\ndiff-diff ships two preview classes, `BusinessReport` and `DiagnosticReport`, that produce plain-English output and a structured `to_dict()` schema from any fitted result. **Both are experimental in this release** — wording, verdict thresholds, and schema shape will change as the library learns from real practitioner usage. 
Do not anchor downstream tooling on the schema yet; the experimental flag is noted in the CHANGELOG.\n\n```python\nfrom diff_diff import CallawaySantAnna, BusinessReport\n\ncs = CallawaySantAnna(base_period=\"universal\").fit(\n    df, outcome=\"revenue\", unit=\"store\", time=\"month\",\n    first_treat=\"first_treat\", aggregate=\"event_study\",\n)\nreport = BusinessReport(\n    cs,\n    outcome_label=\"Revenue per store\",\n    outcome_unit=\"$\",\n    business_question=\"Did the loyalty program lift revenue?\",\n    treatment_label=\"the loyalty program\",\n    # Optional: pass the panel + column names so the auto-constructed\n    # DiagnosticReport can run data-dependent checks (2x2 pre-trends,\n    # Goodman-Bacon decomposition, EfficientDiD Hausman pretest).\n    # Without these the auto path still runs but skips those checks.\n    data=df,\n    outcome=\"revenue\",\n    unit=\"store\",\n    time=\"month\",\n    first_treat=\"first_treat\",\n)\nprint(report.summary())\n```\n\n`BusinessReport` auto-constructs a `DiagnosticReport` so the summary mentions pre-trends, sensitivity, and design-effect findings in one call. Methodology (phrasing rules, verdict thresholds, schema stability) is documented in [docs/methodology/REPORTING.md](docs/methodology/REPORTING.md). Feedback on wording, applicability, and missing diagnostics is welcome — this is the part of the library most likely to evolve in the next few releases.\n\nAlready know DiD? 
The [academic quickstart](docs/quickstart.rst) and [estimator guide](docs/choosing_estimator.rst) cover the full technical details.\n\n## Features\n\n- **sklearn-like API**: Familiar `fit()` interface with `get_params()` and `set_params()`\n- **Pythonic results**: Easy access to coefficients, standard errors, and confidence intervals\n- **Multiple interfaces**: Column names or R-style formulas\n- **Robust inference**: Heteroskedasticity-robust (HC1) and cluster-robust standard errors\n- **Wild cluster bootstrap**: Valid inference with few clusters (\u003c50) using Rademacher, Webb, or Mammen weights\n- **Panel data support**: Two-way fixed effects estimator for panel designs\n- **Multi-period analysis**: Event-study style DiD with period-specific treatment effects\n- **Staggered adoption**: Callaway-Sant'Anna (2021), Sun-Abraham (2021), Borusyak-Jaravel-Spiess (2024) imputation, Two-Stage DiD (Gardner 2022), Stacked DiD (Wing, Freedman \u0026 Hollingsworth 2024), Efficient DiD (Chen, Sant'Anna \u0026 Xie 2025), and Wooldridge ETWFE (2021/2023) estimators for heterogeneous treatment timing\n- **Reversible (non-absorbing) treatments**: de Chaisemartin-D'Haultfœuille `DID_M` estimator for treatments that switch on AND off over time (marketing campaigns, seasonal promotions, on/off policy cycles) — the only library option for non-absorbing treatments\n- **Triple Difference (DDD)**: Ortiz-Villavicencio \u0026 Sant'Anna (2025) estimators with proper covariate handling\n- **Synthetic DiD**: Combined DiD with synthetic control for improved robustness\n- **Triply Robust Panel (TROP)**: Factor-adjusted DiD with synthetic weights (Athey et al. 
2025)\n- **Event study plots**: Publication-ready visualization of treatment effects\n- **Parallel trends testing**: Multiple methods including equivalence tests\n- **Goodman-Bacon decomposition**: Diagnose TWFE bias by decomposing into 2x2 comparisons\n- **Placebo tests**: Comprehensive diagnostics including fake timing, fake group, permutation, and leave-one-out tests\n- **Honest DiD sensitivity analysis**: Rambachan-Roth (2023) bounds and breakdown analysis for parallel trends violations\n- **Pre-trends power analysis**: Roth (2022) minimum detectable violation (MDV) and power curves for pre-trends tests\n- **Power analysis**: MDE, sample size, and power calculations for study design; simulation-based power for any estimator\n- **Data prep utilities**: Helper functions for common data preparation tasks\n- **Survey microdata aggregation**: `aggregate_survey()` rolls individual-level survey data (BRFSS, ACS, CPS, NHANES) into geographic-period panels with design-based precision weights for second-stage DiD\n- **Validated against R**: Benchmarked against `did`, `synthdid`, and `fixest` packages (see [benchmarks](docs/benchmarks.rst))\n\n## Estimator Aliases\n\nAll estimators have short aliases for convenience:\n\n| Alias | Full Name | Method |\n|-------|-----------|--------|\n| `DiD` | `DifferenceInDifferences` | Basic 2x2 DiD |\n| `TWFE` | `TwoWayFixedEffects` | Two-way fixed effects |\n| `EventStudy` | `MultiPeriodDiD` | Event study / multi-period |\n| `CS` | `CallawaySantAnna` | Callaway \u0026 Sant'Anna (2021) |\n| `SA` | `SunAbraham` | Sun \u0026 Abraham (2021) |\n| `BJS` | `ImputationDiD` | Borusyak, Jaravel \u0026 Spiess (2024) |\n| `Gardner` | `TwoStageDiD` | Gardner (2022) two-stage |\n| `SDiD` | `SyntheticDiD` | Synthetic DiD |\n| `DDD` | `TripleDifference` | Triple difference |\n| `CDiD` | `ContinuousDiD` | Continuous treatment DiD |\n| `Stacked` | `StackedDiD` | Stacked DiD |\n| `Bacon` | `BaconDecomposition` | Goodman-Bacon decomposition |\n| `EDiD` | 
`EfficientDiD` | Efficient DiD |\n| `ETWFE` | `WooldridgeDiD` | Wooldridge ETWFE (2021/2023) |\n| `DCDH` | `ChaisemartinDHaultfoeuille` | de Chaisemartin \u0026 D'Haultfœuille (2020) — reversible treatments |\n\n`TROP` already uses its short canonical name and needs no alias.\n\n## Tutorials\n\nWe provide Jupyter notebook tutorials in `docs/tutorials/`:\n\n| Notebook | Description |\n|----------|-------------|\n| `01_basic_did.ipynb` | Basic 2x2 DiD, formula interface, covariates, fixed effects, cluster-robust SE, wild bootstrap |\n| `02_staggered_did.ipynb` | Staggered adoption with Callaway-Sant'Anna and Sun-Abraham, group-time effects, aggregation methods, Bacon decomposition |\n| `03_synthetic_did.ipynb` | Synthetic DiD, unit/time weights, inference methods, regularization |\n| `04_parallel_trends.ipynb` | Testing parallel trends, equivalence tests, placebo tests, diagnostics |\n| `05_honest_did.ipynb` | Honest DiD sensitivity analysis, bounds, breakdown values, visualization |\n| `06_power_analysis.ipynb` | Power analysis, MDE, sample size calculations, simulation-based power |\n| `07_pretrends_power.ipynb` | Pre-trends power analysis (Roth 2022), MDV, power curves |\n| `08_triple_diff.ipynb` | Triple Difference (DDD) estimation with proper covariate handling |\n| `09_real_world_examples.ipynb` | Real-world data examples (Card-Krueger, Castle Doctrine, Divorce Laws) |\n| `10_trop.ipynb` | Triply Robust Panel (TROP) estimation with factor model adjustment |\n| `11_imputation_did.ipynb` | Imputation DiD (Borusyak et al. 2024), pre-trend test, efficiency comparison |\n| `12_two_stage_did.ipynb` | Two-Stage DiD (Gardner 2022), GMM sandwich variance, per-observation effects |\n| `13_stacked_did.ipynb` | Stacked DiD (Wing et al. 2024), Q-weights, sub-experiment inspection, trimming, clean control definitions |\n| `15_efficient_did.ipynb` | Efficient DiD (Chen et al. 
2025), optimal weighting, PT-All vs PT-Post, efficiency gains, bootstrap inference |\n| `16_survey_did.ipynb` | Survey-aware DiD with complex sampling designs (strata, PSU, FPC, weights), replicate weights, subpopulation analysis, DEFF diagnostics |\n| `17_brand_awareness_survey.ipynb` | Measuring campaign impact on brand awareness with survey data — naive vs. survey-corrected comparison, brand funnel analysis, staggered rollouts, stakeholder communication |\n\n## Data Preparation\n\ndiff-diff provides utility functions to help prepare your data for DiD analysis. These functions handle common data transformation tasks like creating treatment indicators, reshaping panel data, and validating data formats.\n\n### Generate Sample Data\n\nCreate synthetic data with a known treatment effect for testing and learning:\n\n```python\nfrom diff_diff import generate_did_data, DifferenceInDifferences\n\n# Generate panel data with 100 units, 4 periods, and a treatment effect of 5\ndata = generate_did_data(\n    n_units=100,\n    n_periods=4,\n    treatment_effect=5.0,\n    treatment_fraction=0.5,  # 50% of units are treated\n    treatment_period=2,       # Treatment starts at period 2\n    seed=42\n)\n\n# Verify the estimator recovers the treatment effect\ndid = DifferenceInDifferences()\nresults = did.fit(data, outcome='outcome', treatment='treated', time='post')\nprint(f\"Estimated ATT: {results.att:.2f} (true: 5.0)\")\n```\n\n### Create Treatment Indicators\n\nConvert categorical variables or numeric thresholds to binary treatment indicators:\n\n```python\nfrom diff_diff import make_treatment_indicator\n\n# From categorical variable\ndf = make_treatment_indicator(\n    data,\n    column='state',\n    treated_values=['CA', 'NY', 'TX']  # These states are treated\n)\n\n# From numeric threshold (e.g., firms above median size)\ndf = make_treatment_indicator(\n    data,\n    column='firm_size',\n    threshold=data['firm_size'].median()\n)\n\n# Treat units below threshold\ndf = 
make_treatment_indicator(\n    data,\n    column='income',\n    threshold=50000,\n    above_threshold=False  # Units with income \u003c= 50000 are treated\n)\n```\n\n### Create Post-Treatment Indicators\n\nConvert time/date columns to binary post-treatment indicators:\n\n```python\nfrom diff_diff import make_post_indicator\n\n# From specific post-treatment periods\ndf = make_post_indicator(\n    data,\n    time_column='year',\n    post_periods=[2020, 2021, 2022]\n)\n\n# From treatment start date\ndf = make_post_indicator(\n    data,\n    time_column='year',\n    treatment_start=2020  # All years \u003e= 2020 are post-treatment\n)\n\n# Works with datetime columns\ndf = make_post_indicator(\n    data,\n    time_column='date',\n    treatment_start='2020-01-01'\n)\n```\n\n### Reshape Wide to Long Format\n\nConvert wide-format data (one row per unit, multiple time columns) to long format:\n\n```python\nfrom diff_diff import wide_to_long\n\n# Wide format: columns like sales_2019, sales_2020, sales_2021\nwide_df = pd.DataFrame({\n    'firm_id': [1, 2, 3],\n    'industry': ['tech', 'retail', 'tech'],\n    'sales_2019': [100, 150, 200],\n    'sales_2020': [110, 160, 210],\n    'sales_2021': [120, 170, 220]\n})\n\n# Convert to long format for DiD\nlong_df = wide_to_long(\n    wide_df,\n    value_columns=['sales_2019', 'sales_2020', 'sales_2021'],\n    id_column='firm_id',\n    time_name='year',\n    value_name='sales',\n    time_values=[2019, 2020, 2021]\n)\n# Result: 9 rows (3 firms × 3 years), columns: firm_id, year, sales, industry\n```\n\n### Balance Panel Data\n\nEnsure all units have observations for all time periods:\n\n```python\nfrom diff_diff import balance_panel\n\n# Keep only units with complete data (drop incomplete units)\nbalanced = balance_panel(\n    data,\n    unit_column='firm_id',\n    time_column='year',\n    method='inner'\n)\n\n# Include all unit-period combinations (creates NaN for missing)\nbalanced = balance_panel(\n    data,\n    
unit_column='firm_id',\n    time_column='year',\n    method='outer'\n)\n\n# Fill missing values\nbalanced = balance_panel(\n    data,\n    unit_column='firm_id',\n    time_column='year',\n    method='fill',\n    fill_value=0  # Or None for forward/backward fill\n)\n```\n\n### Validate Data\n\nCheck that your data meets DiD requirements before fitting:\n\n```python\nfrom diff_diff import validate_did_data\n\n# Validate and get informative error messages\nresult = validate_did_data(\n    data,\n    outcome='sales',\n    treatment='treated',\n    time='post',\n    unit='firm_id',      # Optional: for panel-specific validation\n    raise_on_error=False  # Return dict instead of raising\n)\n\nif result['valid']:\n    print(\"Data is ready for DiD analysis!\")\n    print(f\"Summary: {result['summary']}\")\nelse:\n    print(\"Issues found:\")\n    for error in result['errors']:\n        print(f\"  - {error}\")\n\nfor warning in result['warnings']:\n    print(f\"Warning: {warning}\")\n```\n\n### Summarize Data by Groups\n\nGet summary statistics for each treatment-time cell:\n\n```python\nfrom diff_diff import summarize_did_data\n\nsummary = summarize_did_data(\n    data,\n    outcome='sales',\n    treatment='treated',\n    time='post'\n)\nprint(summary)\n```\n\nOutput:\n```\n                        n      mean       std       min       max\nControl - Pre        250  100.5000   15.2340   65.0000  145.0000\nControl - Post       250  105.2000   16.1230   68.0000  152.0000\nTreated - Pre        250  101.2000   14.8900   67.0000  143.0000\nTreated - Post       250  115.8000   17.5600   72.0000  165.0000\nDiD Estimate           -    9.9000         -         -         -\n```\n\n### Create Event Time for Staggered Designs\n\nFor designs where treatment occurs at different times:\n\n```python\nfrom diff_diff import create_event_time\n\n# Add event-time column relative to treatment timing\ndf = create_event_time(\n    data,\n    time_column='year',\n    
treatment_time_column='treatment_year'\n)\n# Result: event_time = -2, -1, 0, 1, 2 relative to treatment\n```\n\n### Aggregate to Cohort Means\n\nAggregate unit-level data for visualization:\n\n```python\nfrom diff_diff import aggregate_to_cohorts\n\ncohort_data = aggregate_to_cohorts(\n    data,\n    unit_column='firm_id',\n    time_column='year',\n    treatment_column='treated',\n    outcome='sales'\n)\n# Result: mean outcome by treatment group and period\n```\n\n### Rank Control Units\n\nSelect the best control units for DiD or Synthetic DiD analysis by ranking them based on pre-treatment outcome similarity:\n\n```python\nfrom diff_diff import rank_control_units, generate_did_data\n\n# Generate sample data\ndata = generate_did_data(n_units=50, n_periods=6, seed=42)\n\n# Rank control units by their similarity to treated units\nranking = rank_control_units(\n    data,\n    unit_column='unit',\n    time_column='period',\n    outcome_column='outcome',\n    treatment_column='treated',\n    n_top=10  # Return top 10 controls\n)\n\nprint(ranking[['unit', 'quality_score', 'pre_trend_rmse']])\n```\n\nOutput:\n```\n   unit  quality_score  pre_trend_rmse\n0    35         1.0000          0.4521\n1    42         0.9234          0.5123\n2    28         0.8876          0.5892\n...\n```\n\nWith covariates for matching:\n\n```python\n# Add covariate-based matching\nranking = rank_control_units(\n    data,\n    unit_column='unit',\n    time_column='period',\n    outcome_column='outcome',\n    treatment_column='treated',\n    covariates=['size', 'age'],  # Match on these too\n    outcome_weight=0.7,          # 70% weight on outcome trends\n    covariate_weight=0.3         # 30% weight on covariate similarity\n)\n```\n\nFilter data for SyntheticDiD using top controls:\n\n```python\nfrom diff_diff import SyntheticDiD\n\n# Get top control units\ntop_controls = ranking['unit'].tolist()\n\n# Filter data to treated + top controls\nfiltered_data = data[\n    (data['treated'] == 1) | 
(data['unit'].isin(top_controls))\n]\n\n# Fit SyntheticDiD with selected controls\nsdid = SyntheticDiD()\nresults = sdid.fit(\n    filtered_data,\n    outcome='outcome',\n    treatment='treated',\n    unit='unit',\n    time='period',\n    post_periods=[3, 4, 5]\n)\n```\n\n## Usage\n\n### Basic DiD with Column Names\n\n```python\nfrom diff_diff import DifferenceInDifferences\n\ndid = DifferenceInDifferences(robust=True, alpha=0.05)\nresults = did.fit(\n    data,\n    outcome='sales',\n    treatment='treated',\n    time='post_policy'\n)\n\n# Access results\nprint(f\"ATT: {results.att:.4f}\")\nprint(f\"Standard Error: {results.se:.4f}\")\nprint(f\"P-value: {results.p_value:.4f}\")\nprint(f\"95% CI: {results.conf_int}\")\nprint(f\"Significant: {results.is_significant}\")\n```\n\n### Using Formula Interface\n\n```python\n# R-style formula syntax\nresults = did.fit(data, formula='outcome ~ treated * post')\n\n# Explicit interaction syntax\nresults = did.fit(data, formula='outcome ~ treated + post + treated:post')\n\n# With covariates\nresults = did.fit(data, formula='outcome ~ treated * post + age + income')\n```\n\n### Including Covariates\n\n```python\nresults = did.fit(\n    data,\n    outcome='outcome',\n    treatment='treated',\n    time='post',\n    covariates=['age', 'income', 'education']\n)\n```\n\n### Fixed Effects\n\nUse `fixed_effects` for low-dimensional categorical controls (creates dummy variables):\n\n```python\n# State and industry fixed effects\nresults = did.fit(\n    data,\n    outcome='sales',\n    treatment='treated',\n    time='post',\n    fixed_effects=['state', 'industry']\n)\n\n# Access fixed effect coefficients\nstate_coefs = {k: v for k, v in results.coefficients.items() if k.startswith('state_')}\n```\n\nUse `absorb` for high-dimensional fixed effects (more efficient, uses within-transformation):\n\n```python\n# Absorb firm-level fixed effects (efficient for many firms)\nresults = did.fit(\n    data,\n    outcome='sales',\n    
treatment='treated',\n    time='post',\n    absorb=['firm_id']\n)\n```\n\nCombine covariates with fixed effects:\n\n```python\nresults = did.fit(\n    data,\n    outcome='sales',\n    treatment='treated',\n    time='post',\n    covariates=['size', 'age'],           # Linear controls\n    fixed_effects=['industry'],            # Low-dimensional FE (dummies)\n    absorb=['firm_id']                     # High-dimensional FE (absorbed)\n)\n```\n\n### Cluster-Robust Standard Errors\n\n```python\ndid = DifferenceInDifferences(cluster='state')\nresults = did.fit(\n    data,\n    outcome='outcome',\n    treatment='treated',\n    time='post'\n)\n```\n\n### Wild Cluster Bootstrap\n\nWhen you have few clusters (\u003c50), standard cluster-robust SEs are biased. Wild cluster bootstrap provides valid inference even with 5-10 clusters.\n\n```python\n# Use wild bootstrap for inference\ndid = DifferenceInDifferences(\n    cluster='state',\n    inference='wild_bootstrap',\n    n_bootstrap=999,\n    bootstrap_weights='rademacher',  # or 'webb' for \u003c10 clusters, 'mammen'\n    seed=42\n)\nresults = did.fit(data, outcome='y', treatment='treated', time='post')\n\n# Results include bootstrap-based SE and p-value\nprint(f\"ATT: {results.att:.3f} (SE: {results.se:.3f})\")\nprint(f\"P-value: {results.p_value:.4f}\")\nprint(f\"95% CI: {results.conf_int}\")\nprint(f\"Inference method: {results.inference_method}\")\nprint(f\"Number of clusters: {results.n_clusters}\")\n```\n\n**Weight types:**\n- `'rademacher'` - Default, ±1 with p=0.5, good for most cases\n- `'webb'` - 6-point distribution, recommended for \u003c10 clusters\n- `'mammen'` - Two-point distribution, alternative to Rademacher\n\nWorks with `DifferenceInDifferences` and `TwoWayFixedEffects` estimators.\n\n### Two-Way Fixed Effects (Panel Data)\n\n```python\nfrom diff_diff import TwoWayFixedEffects\n\ntwfe = TwoWayFixedEffects()\nresults = twfe.fit(\n    panel_data,\n    outcome='outcome',\n    treatment='treated',\n    
time='year',\n    unit='firm_id'\n)\n```\n\n### Multi-Period DiD (Event Study)\n\nFor settings with multiple pre- and post-treatment periods. Estimates treatment × period\ninteractions for ALL periods (pre and post), enabling parallel trends assessment:\n\n```python\nfrom diff_diff import MultiPeriodDiD\n\n# Fit full event study with pre and post period effects\ndid = MultiPeriodDiD()\nresults = did.fit(\n    panel_data,\n    outcome='sales',\n    treatment='treated',\n    time='period',\n    post_periods=[3, 4, 5],      # Periods 3-5 are post-treatment\n    reference_period=2,          # Last pre-period (e=-1 convention)\n    unit='unit_id',              # Optional: warns if staggered adoption detected\n)\n\n# Pre-period effects test parallel trends (should be ≈ 0)\nfor period, effect in results.pre_period_effects.items():\n    print(f\"Pre {period}: {effect.effect:.3f} (SE: {effect.se:.3f})\")\n\n# Post-period effects estimate dynamic treatment effects\nfor period, effect in results.post_period_effects.items():\n    print(f\"Post {period}: {effect.effect:.3f} (SE: {effect.se:.3f})\")\n\n# View average treatment effect across post-periods\nprint(f\"Average ATT: {results.avg_att:.3f}\")\nprint(f\"Average SE: {results.avg_se:.3f}\")\n\n# Full summary with pre and post period effects\nresults.print_summary()\n```\n\nOutput:\n```\n================================================================================\n            Multi-Period Difference-in-Differences Estimation Results\n================================================================================\n\nObservations:                      600\nPre-treatment periods:             3\nPost-treatment periods:            3\n\n--------------------------------------------------------------------------------\nAverage Treatment Effect\n--------------------------------------------------------------------------------\nAverage ATT       5.2000       0.8234      6.315      
0.0000\n--------------------------------------------------------------------------------\n95% Confidence Interval: [3.5862, 6.8138]\n\nPeriod-Specific Effects:\n--------------------------------------------------------------------------------\nPeriod            Effect     Std. Err.     t-stat      P\u003e|t|\n--------------------------------------------------------------------------------\n3                 4.5000       0.9512      4.731      0.0000***\n4                 5.2000       0.8876      5.858      0.0000***\n5                 5.9000       0.9123      6.468      0.0000***\n--------------------------------------------------------------------------------\n\nSignif. codes: '***' 0.001, '**' 0.01, '*' 0.05, '.' 0.1\n================================================================================\n```\n\n### Staggered Difference-in-Differences (Callaway-Sant'Anna)\n\nWhen treatment is adopted at different times by different units, traditional TWFE estimators can be biased. The Callaway-Sant'Anna estimator provides unbiased estimates with staggered adoption.\n\n```python\nfrom diff_diff import CallawaySantAnna\n\n# Panel data with staggered treatment\n# 'first_treat' = period when unit was first treated (0 if never treated)\ncs = CallawaySantAnna()\nresults = cs.fit(\n    panel_data,\n    outcome='sales',\n    unit='firm_id',\n    time='year',\n    first_treat='first_treat',  # 0 for never-treated, else first treatment year\n    aggregate='event_study'      # Compute event study effects\n)\n\n# View results\nresults.print_summary()\n\n# Access group-time effects ATT(g,t)\nfor (group, time), effect in results.group_time_effects.items():\n    print(f\"Cohort {group}, Period {time}: {effect['effect']:.3f}\")\n\n# Event study effects (averaged by relative time)\nfor rel_time, effect in results.event_study_effects.items():\n    print(f\"e={rel_time}: {effect['effect']:.3f} (SE: {effect['se']:.3f})\")\n\n# Convert to DataFrame\ndf = 
results.to_dataframe(level='event_study')\n```\n\nOutput:\n```\n=====================================================================================\n          Callaway-Sant'Anna Staggered Difference-in-Differences Results\n=====================================================================================\n\nTotal observations:                     600\nTreated units:                           35\nControl units:                           15\nTreatment cohorts:                        3\nTime periods:                             8\nControl group:                never_treated\n\n-------------------------------------------------------------------------------------\n                  Overall Average Treatment Effect on the Treated\n-------------------------------------------------------------------------------------\nParameter         Estimate     Std. Err.     t-stat      P\u003e|t|   Sig.\n-------------------------------------------------------------------------------------\nATT                 2.5000       0.3521       7.101     0.0000   ***\n-------------------------------------------------------------------------------------\n\n95% Confidence Interval: [1.8099, 3.1901]\n\n-------------------------------------------------------------------------------------\n                          Event Study (Dynamic) Effects\n-------------------------------------------------------------------------------------\nRel. Period       Estimate     Std. Err.     t-stat      P\u003e|t|   Sig.\n-------------------------------------------------------------------------------------\n0                   2.1000       0.4521       4.645     0.0000   ***\n1                   2.5000       0.4123       6.064     0.0000   ***\n2                   2.8000       0.5234       5.349     0.0000   ***\n-------------------------------------------------------------------------------------\n\nSignif. codes: '***' 0.001, '**' 0.01, '*' 0.05, '.' 
0.1\n=====================================================================================\n```\n\n**When to use Callaway-Sant'Anna vs TWFE:**\n\n| Scenario | Use TWFE | Use Callaway-Sant'Anna |\n|----------|----------|------------------------|\n| All units treated at same time | ✓ | ✓ |\n| Staggered adoption, homogeneous effects | ✓ | ✓ |\n| Staggered adoption, heterogeneous effects | ✗ | ✓ |\n| Need event study with staggered timing | ✗ | ✓ |\n| Fewer than ~20 treated units | ✓ | Depends on design |\n\n**Parameters:**\n\n```python\nCallawaySantAnna(\n    control_group='never_treated',  # or 'not_yet_treated'\n    anticipation=0,                  # Periods before treatment with effects\n    estimation_method='dr',          # 'dr', 'ipw', or 'reg'\n    alpha=0.05,                      # Significance level\n    cluster=None,                    # Column for cluster SEs\n    n_bootstrap=0,                   # Bootstrap iterations (0 = analytical SEs)\n    bootstrap_weights='rademacher',  # 'rademacher', 'mammen', or 'webb'\n    seed=None                        # Random seed\n)\n```\n\n**Multiplier bootstrap for inference:**\n\nWith few clusters or when analytical standard errors may be unreliable, use the multiplier bootstrap for valid inference. 
This implements the approach from Callaway \u0026 Sant'Anna (2021).\n\n```python\n# Bootstrap inference with 999 iterations\ncs = CallawaySantAnna(\n    n_bootstrap=999,\n    bootstrap_weights='rademacher',  # or 'mammen', 'webb'\n    seed=42\n)\nresults = cs.fit(\n    data,\n    outcome='sales',\n    unit='firm_id',\n    time='year',\n    first_treat='first_treat',\n    aggregate='event_study'\n)\n\n# Access bootstrap results\nprint(f\"Overall ATT: {results.overall_att:.3f}\")\nprint(f\"Bootstrap SE: {results.bootstrap_results.overall_att_se:.3f}\")\nprint(f\"Bootstrap 95% CI: {results.bootstrap_results.overall_att_ci}\")\nprint(f\"Bootstrap p-value: {results.bootstrap_results.overall_att_p_value:.4f}\")\n\n# Event study bootstrap inference\nfor rel_time, se in results.bootstrap_results.event_study_ses.items():\n    ci = results.bootstrap_results.event_study_cis[rel_time]\n    print(f\"e={rel_time}: SE={se:.3f}, 95% CI=[{ci[0]:.3f}, {ci[1]:.3f}]\")\n```\n\n**Bootstrap weight types:**\n- `'rademacher'` - Default, ±1 with p=0.5, good for most cases\n- `'mammen'` - Two-point distribution matching first 3 moments\n- `'webb'` - Six-point distribution, recommended for very few clusters (\u003c10)\n\n**Covariate adjustment for conditional parallel trends:**\n\nWhen parallel trends only holds conditional on covariates, use the `covariates` parameter:\n\n```python\n# Doubly robust estimation with covariates\ncs = CallawaySantAnna(estimation_method='dr')  # 'dr', 'ipw', or 'reg'\nresults = cs.fit(\n    data,\n    outcome='sales',\n    unit='firm_id',\n    time='year',\n    first_treat='first_treat',\n    covariates=['size', 'age', 'industry'],  # Covariates for conditional PT\n    aggregate='event_study'\n)\n```\n\n### Sun-Abraham Interaction-Weighted Estimator\n\nThe Sun-Abraham (2021) estimator provides an alternative to Callaway-Sant'Anna using an interaction-weighted (IW) regression approach. 
Running both estimators serves as a useful robustness check—when they agree, results are more credible.\n\n```python\nfrom diff_diff import SunAbraham\n\n# Basic usage\nsa = SunAbraham()\nresults = sa.fit(\n    panel_data,\n    outcome='sales',\n    unit='firm_id',\n    time='year',\n    first_treat='first_treat'  # 0 for never-treated, else first treatment year\n)\n\n# View results\nresults.print_summary()\n\n# Event study effects (by relative time to treatment)\nfor rel_time, effect in results.event_study_effects.items():\n    print(f\"e={rel_time}: {effect['effect']:.3f} (SE: {effect['se']:.3f})\")\n\n# Overall ATT\nprint(f\"Overall ATT: {results.overall_att:.3f} (SE: {results.overall_se:.3f})\")\n\n# Cohort weights (how each cohort contributes to each event-time estimate)\nfor rel_time, weights in results.cohort_weights.items():\n    print(f\"e={rel_time}: {weights}\")\n```\n\n**Parameters:**\n\n```python\nSunAbraham(\n    control_group='never_treated',  # or 'not_yet_treated'\n    anticipation=0,                  # Periods before treatment with effects\n    alpha=0.05,                      # Significance level\n    cluster=None,                    # Column for cluster SEs\n    n_bootstrap=0,                   # Bootstrap iterations (0 = analytical SEs)\n    bootstrap_weights='rademacher',  # 'rademacher', 'mammen', or 'webb'\n    seed=None                        # Random seed\n)\n```\n\n**Bootstrap inference:**\n\n```python\n# Bootstrap inference with 999 iterations\nsa = SunAbraham(\n    n_bootstrap=999,\n    bootstrap_weights='rademacher',\n    seed=42\n)\nresults = sa.fit(\n    data,\n    outcome='sales',\n    unit='firm_id',\n    time='year',\n    first_treat='first_treat'\n)\n\n# Access bootstrap results\nprint(f\"Overall ATT: {results.overall_att:.3f}\")\nprint(f\"Bootstrap SE: {results.bootstrap_results.overall_att_se:.3f}\")\nprint(f\"Bootstrap 95% CI: {results.bootstrap_results.overall_att_ci}\")\nprint(f\"Bootstrap p-value: 
{results.bootstrap_results.overall_att_p_value:.4f}\")\n```\n\n**When to use Sun-Abraham vs Callaway-Sant'Anna:**\n\n| Aspect | Sun-Abraham | Callaway-Sant'Anna |\n|--------|-------------|-------------------|\n| Approach | Interaction-weighted regression | 2x2 DiD aggregation |\n| Efficiency | More efficient under homogeneous effects | More robust to heterogeneity |\n| Weighting | Weights by cohort share at each relative time | Weights by sample size |\n| Use case | Robustness check, regression-based inference | Primary staggered DiD estimator |\n\n**Both estimators should give similar results when:**\n- Treatment effects are relatively homogeneous across cohorts\n- Parallel trends holds\n\n**Running both as robustness check:**\n\n```python\nfrom diff_diff import CallawaySantAnna, SunAbraham\n\n# Callaway-Sant'Anna\ncs = CallawaySantAnna()\ncs_results = cs.fit(data, outcome='y', unit='unit', time='time', first_treat='first_treat')\n\n# Sun-Abraham\nsa = SunAbraham()\nsa_results = sa.fit(data, outcome='y', unit='unit', time='time', first_treat='first_treat')\n\n# Compare\nprint(f\"Callaway-Sant'Anna ATT: {cs_results.overall_att:.3f}\")\nprint(f\"Sun-Abraham ATT: {sa_results.overall_att:.3f}\")\n\n# If results differ substantially, investigate heterogeneity\n```\n\n### Borusyak-Jaravel-Spiess Imputation Estimator\n\nThe Borusyak et al. 
(2024) imputation estimator is the **efficient** estimator for staggered DiD under parallel trends, producing ~50% shorter confidence intervals than Callaway-Sant'Anna and 2-3.5x shorter than Sun-Abraham under homogeneous treatment effects.\n\n```python\nfrom diff_diff import ImputationDiD, imputation_did\n\n# Basic usage\nest = ImputationDiD()\nresults = est.fit(data, outcome='outcome', unit='unit',\n                  time='period', first_treat='first_treat')\nresults.print_summary()\n\n# Event study\nresults = est.fit(data, outcome='outcome', unit='unit',\n                  time='period', first_treat='first_treat',\n                  aggregate='event_study')\n\n# Pre-trend test (Equation 9)\npt = results.pretrend_test(n_leads=3)\nprint(f\"F-stat: {pt['f_stat']:.3f}, p-value: {pt['p_value']:.4f}\")\n\n# Convenience function\nresults = imputation_did(data, 'outcome', 'unit', 'period', 'first_treat',\n                         aggregate='all')\n```\n\n**Parameters:**\n\n```python\nImputationDiD(\n    anticipation=0,         # Number of anticipation periods\n    alpha=0.05,             # Significance level\n    cluster=None,           # Cluster variable (defaults to unit)\n    n_bootstrap=0,          # Bootstrap iterations (0=analytical inference)\n    seed=None,              # Random seed\n    horizon_max=None,       # Max event-study horizon\n    aux_partition=\"cohort_horizon\",  # Variance partition: \"cohort_horizon\", \"cohort\", \"horizon\"\n)\n```\n\n**When to use Imputation DiD vs Callaway-Sant'Anna:**\n\n| Aspect | Imputation DiD | Callaway-Sant'Anna |\n|--------|---------------|-------------------|\n| Efficiency | Most efficient under homogeneous effects | Less efficient but more robust to heterogeneity |\n| Control group | Always uses all untreated obs | Choice of never-treated or not-yet-treated |\n| Inference | Conservative variance (Theorem 3) | Multiplier bootstrap |\n| Pre-trends | Built-in F-test (Equation 9) | Separate testing |\n\n### Two-Stage DiD (Gardner 
2022)\n\nTwo-Stage DiD addresses TWFE bias in staggered adoption designs by estimating unit and time fixed effects on untreated observations only, then regressing the residualized outcomes on treatment indicators. Point estimates match the Imputation DiD estimator (Borusyak et al. 2024); the key difference is that Two-Stage DiD uses a GMM sandwich variance estimator that accounts for first-stage estimation error, while Imputation DiD uses a conservative variance (Theorem 3).\n\n```python\nfrom diff_diff import TwoStageDiD\n\n# Basic usage\nest = TwoStageDiD()\nresults = est.fit(data, outcome='outcome', unit='unit', time='period', first_treat='first_treat')\nresults.print_summary()\n```\n\n**Event study:**\n\n```python\nfrom diff_diff import plot_event_study\n\n# Event study aggregation with visualization\nresults = est.fit(data, outcome='outcome', unit='unit', time='period',\n                  first_treat='first_treat', aggregate='event_study')\nplot_event_study(results)\n```\n\n**Parameters:**\n\n```python\nTwoStageDiD(\n    anticipation=0,                   # Periods of anticipation effects\n    alpha=0.05,                       # Significance level for CIs\n    cluster=None,                     # Column for cluster-robust SEs (defaults to unit)\n    n_bootstrap=0,                    # Bootstrap iterations (0 = analytical GMM SEs)\n    seed=None,                        # Random seed\n    rank_deficient_action='warn',     # 'warn', 'error', or 'silent'\n    horizon_max=None,                 # Max event-study horizon\n)\n```\n\n**When to use Two-Stage DiD vs Imputation DiD:**\n\n| Aspect | Two-Stage DiD | Imputation DiD |\n|--------|--------------|---------------|\n| Point estimates | Identical | Identical |\n| Variance | GMM sandwich (accounts for first-stage error) | Conservative (Theorem 3, may overcover) |\n| Intuition | Residualize then regress | Impute counterfactuals then aggregate |\n| Reference impl. 
| R `did2s` package | R `didimputation` package |\n\nBoth estimators are efficient under homogeneous treatment effects and produce shorter confidence intervals than Callaway-Sant'Anna or Sun-Abraham.\n\n### Stacked DiD (Wing, Freedman \u0026 Hollingsworth 2024)\n\nStacked DiD addresses TWFE bias in staggered adoption settings by constructing a \"clean\" comparison dataset for each treatment cohort and stacking these datasets together. Each cohort's sub-experiment compares units treated at that cohort's timing against units that are not yet treated (or never treated) within a symmetric event-study window. This avoids the \"bad comparisons\" problem in TWFE while retaining a regression-based framework that practitioners familiar with event studies will find intuitive.\n\n```python\nfrom diff_diff import StackedDiD, generate_staggered_data, stacked_did\n\n# Generate sample data\ndata = generate_staggered_data(n_units=200, n_periods=12,\n                                cohort_periods=[4, 6, 8], seed=42)\n\n# Fit stacked DiD with event study\nest = StackedDiD(kappa_pre=2, kappa_post=2)\nresults = est.fit(data, outcome='outcome', unit='unit',\n                  time='period', first_treat='first_treat',\n                  aggregate='event_study')\nresults.print_summary()\n\n# Access stacked data for custom analysis\nstacked = results.stacked_data\n\n# Convenience function\nresults = stacked_did(data, 'outcome', 'unit', 'period', 'first_treat',\n                      kappa_pre=2, kappa_post=2, aggregate='event_study')\n```\n\n**Parameters:**\n\n```python\nStackedDiD(\n    kappa_pre=1,                          # Pre-treatment event-study periods\n    kappa_post=1,                         # Post-treatment event-study periods\n    weighting='aggregate',                # 'aggregate', 'population', or 'sample_share'\n    clean_control='not_yet_treated',      # 'not_yet_treated', 'strict', or 'never_treated'\n    cluster='unit',                       # 
'unit' or 'unit_subexp'\n    alpha=0.05,                           # Significance level\n    anticipation=0,                       # Anticipation periods\n    rank_deficient_action='warn',         # 'warn', 'error', or 'silent'\n)\n```\n\n\u003e **Note:** Group aggregation (`aggregate='group'`) is not supported because the pooled\n\u003e stacked regression cannot produce cohort-specific effects. Use `CallawaySantAnna` or\n\u003e `ImputationDiD` for cohort-level estimates.\n\n**When to use Stacked DiD vs Callaway-Sant'Anna:**\n\n| Aspect | Stacked DiD | Callaway-Sant'Anna |\n|--------|-------------|-------------------|\n| Approach | Stack cohort sub-experiments, run pooled TWFE | 2x2 DiD aggregation |\n| Symmetric windows | Enforced via kappa_pre / kappa_post | Not required |\n| Control group | Not-yet-treated (default) or never-treated | Never-treated or not-yet-treated |\n| Covariates | Passed to pooled regression | Doubly robust / IPW |\n| Intuition | Familiar event-study regression | Nonparametric aggregation |\n\n**Convenience function:**\n\n```python\n# One-liner estimation\nresults = stacked_did(\n    data,\n    outcome='outcome',\n    unit='unit',\n    time='period',\n    first_treat='first_treat',\n    kappa_pre=3,\n    kappa_post=3,\n    aggregate='event_study'\n)\n```\n\n### Efficient DiD (Chen, Sant'Anna \u0026 Xie 2025)\n\nEfficient DiD achieves the semiparametric efficiency bound for ATT estimation in staggered adoption designs along the **no-covariate path**, producing tighter confidence intervals than standard estimators when the stronger PT-All assumption holds. It optimally weights across all valid comparison groups and baselines via the inverse covariance matrix Omega*. 
A doubly-robust covariate path is also available: it is consistent if either the outcome regression or the sieve propensity ratio is correctly specified, but the linear OLS outcome regression does not generically attain the efficiency bound unless the conditional mean is linear in the covariates.\n\n```python\nfrom diff_diff import EfficientDiD, generate_staggered_data\n\n# Generate sample data\ndata = generate_staggered_data(n_units=300, n_periods=10,\n                                cohort_periods=[4, 6, 8], seed=42)\n\n# Fit with PT-All (overidentified, tighter SEs)\nedid = EfficientDiD(pt_assumption=\"all\")\nresults = edid.fit(data, outcome='outcome', unit='unit',\n                   time='period', first_treat='first_treat',\n                   aggregate='all')\nresults.print_summary()\n\n# PT-Post mode (matches CS for post-treatment effects)\nedid_post = EfficientDiD(pt_assumption=\"post\")\nresults_post = edid_post.fit(data, outcome='outcome', unit='unit',\n                              time='period', first_treat='first_treat')\n```\n\n**Parameters:**\n\n```python\nEfficientDiD(\n    pt_assumption='all',            # 'all' (overidentified) or 'post' (matches CS post-treatment ATT)\n    alpha=0.05,                     # Significance level\n    n_bootstrap=0,                  # Bootstrap iterations (0 = analytical only)\n    bootstrap_weights='rademacher', # 'rademacher', 'mammen', or 'webb'\n    seed=None,                      # Random seed\n    anticipation=0,                 # Anticipation periods\n)\n```\n\n\u003e **Note:** EfficientDiD supports covariate adjustment via a doubly-robust path\n\u003e (sieve-based propensity score ratios and a linear OLS outcome regression).\n\u003e The DR property gives consistency if either the OR or the PS is correctly\n\u003e specified, but the OLS working model for the outcome regression does not\n\u003e generically attain the semiparametric efficiency bound. 
The unqualified\n\u003e efficiency-bound claim applies to the no-covariate path only. See the\n\u003e `covariates` parameter on `fit()` and `docs/methodology/REGISTRY.md`.\n\n**When to use Efficient DiD vs Callaway-Sant'Anna:**\n\n| Aspect | Efficient DiD | Callaway-Sant'Anna |\n|--------|--------------|-------------------|\n| Approach | Optimal EIF-based weighting | Separate 2x2 DiD aggregation |\n| PT assumption | PT-All (stronger) or PT-Post | Conditional PT |\n| Efficiency | Achieves semiparametric bound on the no-covariate path; DR covariate path is consistent but does not generically attain the bound under a linear OLS outcome regression | Not efficient |\n| Covariates | Supported (doubly robust, sieve-based PS + linear OLS OR) | Supported (OR, IPW, DR) |\n| When to choose | Maximum efficiency, PT-All credible | Covariates needed, weaker PT |\n\n### de Chaisemartin-D'Haultfœuille (dCDH) for Reversible Treatments\n\n`ChaisemartinDHaultfoeuille` (alias `DCDH`) is the only library estimator that handles **non-absorbing (reversible) treatments** — treatment can switch on AND off over time. 
This makes it a natural fit for marketing campaigns, seasonal promotions, and on/off policy cycles.\n\nThe estimator ships `DID_M` (= `DID_1` at horizon `l = 1`), the full multi-horizon event study `DID_l` for `l = 1..L_max` via the `L_max` parameter, residualization-style covariate adjustment (`controls`), group-specific linear trends (`trends_linear`), state-set-specific trends (`trends_nonparam`), heterogeneity testing, non-binary treatment, HonestDiD sensitivity integration on placebos, and survey support via Taylor-series linearization.\n\n```python\nfrom diff_diff import ChaisemartinDHaultfoeuille\nfrom diff_diff.prep import generate_reversible_did_data\n\n# Generate a reversible-treatment panel\ndata = generate_reversible_did_data(\n    n_groups=80, n_periods=6, pattern=\"single_switch\", seed=42,\n)\n\n# Fit the estimator\nest = ChaisemartinDHaultfoeuille()\nresults = est.fit(\n    data,\n    outcome=\"outcome\",\n    group=\"group\",\n    time=\"period\",\n    treatment=\"treatment\",\n)\nresults.print_summary()\n\n# Decomposition\nprint(f\"DID_M (overall):  {results.overall_att:.3f}\")\nprint(f\"DID_+ (joiners):  {results.joiners_att:.3f}\")\nprint(f\"DID_- (leavers):  {results.leavers_att:.3f}\")\nprint(f\"Placebo (DID^pl): {results.placebo_effect:.3f}\")\n```\n\n**Parameters:**\n\n```python\nChaisemartinDHaultfoeuille(\n    alpha=0.05,                   # Significance level\n    n_bootstrap=0,                # 0 = analytical SE only; \u003e0 = multiplier bootstrap\n    bootstrap_weights=\"rademacher\",  # 'rademacher', 'mammen', or 'webb'\n    seed=None,                    # Random seed for bootstrap\n    placebo=True,                 # Auto-compute single-lag placebo\n    twfe_diagnostic=True,         # Auto-compute TWFE decomposition diagnostic\n    drop_larger_lower=True,       # Drop multi-switch groups (matches R DIDmultiplegtDYN)\n    rank_deficient_action=\"warn\", # Used by TWFE diagnostic OLS\n)\n```\n\n**What you get back on the results object:**\n\n| Field | 
Description |\n|-------|-------------|\n| `overall_att`, `overall_se`, `overall_conf_int` | `DID_M` when `L_max=None`; cost-benefit `delta` when `L_max \u003e 1` (delta-method SE from per-horizon SEs) |\n| `joiners_att`, `leavers_att` | Decomposition into the joiners (`DID_+`) and leavers (`DID_-`) views |\n| `placebo_effect` | Single-lag placebo (`DID_M^pl`) point estimate |\n| `per_period_effects` | Per-period decomposition with explicit A11-violation flags |\n| `twfe_weights`, `twfe_fraction_negative`, `twfe_sigma_fe`, `twfe_beta_fe` | Theorem 1 decomposition diagnostic |\n| `n_groups_dropped_crossers`, `n_groups_dropped_singleton_baseline` | Filter counts (multi-switch groups dropped before estimation; singleton-baseline groups excluded from variance) |\n| `n_groups_dropped_never_switching` | Backwards-compatibility metadata. Never-switching groups participate in the variance via stable-control roles; this field is no longer a filter count. |\n\n**Multi-horizon event study** (pass `L_max` to `fit()`):\n\n```python\nresults = est.fit(data, outcome=\"outcome\", group=\"group\",\n                  time=\"period\", treatment=\"treatment\", L_max=5)\n\n# Per-horizon effects with analytical SE\nfor horizon in sorted(results.event_study_effects):\n    e = results.event_study_effects[horizon]\n    print(f\"  l={horizon}: DID_l={e['effect']:.3f} (SE={e['se']:.3f})\")\n\n# Cost-benefit delta (becomes overall_att when L_max \u003e 1)\nprint(f\"Cost-benefit delta: {results.cost_benefit_delta['delta']:.3f}\")\n\n# Normalized effects: DID^n_l = DID_l / l (for binary treatment)\nfor horizon in sorted(results.normalized_effects):\n    print(f\"  DID^n_{horizon} = {results.normalized_effects[horizon]['effect']:.3f}\")\n\n# Event study DataFrame (includes placebos as negative horizons)\ndf = results.to_dataframe(\"event_study\")\n\n# Plot (integrates with plot_event_study)\nfrom diff_diff import plot_event_study\nplot_event_study(results)\n```\n\n**Standalone TWFE decomposition 
diagnostic** (without fitting the full estimator):\n\n```python\nfrom diff_diff import twowayfeweights\n\ndiagnostic = twowayfeweights(\n    data, outcome=\"outcome\", group=\"group\", time=\"period\", treatment=\"treatment\",\n)\nprint(f\"Plain TWFE coefficient: {diagnostic.beta_fe:.3f}\")\nprint(f\"Fraction of negative weights: {diagnostic.fraction_negative:.3f}\")\nprint(f\"sigma_fe (sign-flipping threshold): {diagnostic.sigma_fe:.3f}\")\n```\n\n\u003e **Note:** Placebo SE is `NaN` for the single-period `DID_M^pl` (`L_max=None`) because the per-period aggregation path has no influence-function derivation; the point estimate is meaningful for visual pre-trends inspection. Multi-horizon dynamic placebos `DID^{pl}_l` (`L_max \u003e= 1`) have valid analytical SE via the same cohort-recentered plug-in variance as the positive horizons, with bootstrap SE available when `n_bootstrap \u003e 0`. See `docs/methodology/REGISTRY.md` for the full contract.\n\n\u003e **Note:** By default (`drop_larger_lower=True`), the estimator drops groups whose treatment switches more than once before estimation. This matches R `DIDmultiplegtDYN`'s default and is required for the analytical variance formula to be consistent with the point estimate. Each drop emits an explicit warning.\n\n\u003e **Note:** The estimator requires panels with a **balanced baseline** (every group observed at the first global period) and **no interior period gaps**. Late-entry groups (missing the baseline) raise `ValueError`; interior-gap groups are dropped with a warning; terminally-missing groups (early exit / right-censoring) are retained and contribute from their observed periods only. 
This is a documented deviation from R `DIDmultiplegtDYN`, which supports unbalanced panels - see [`docs/methodology/REGISTRY.md`](docs/methodology/REGISTRY.md) for the rationale, the defensive guards that make terminal missingness safe, and workarounds for unbalanced inputs.\n\n\u003e **Note:** Survey design is supported via Taylor-series linearization on `pweight` with strata / PSU / FPC. Replicate-weight variance and PSU-level bootstrap for dCDH are a planned extension. The `aggregate` parameter still raises `NotImplementedError`.\n\n### Triple Difference (DDD)\n\nTriple Difference (DDD) is used when treatment requires satisfying two criteria: belonging to a treated **group** AND being in an eligible **partition**. The `TripleDifference` class implements the methodology from Ortiz-Villavicencio \u0026 Sant'Anna (2025), which correctly handles covariate adjustment (unlike naive implementations).\n\n```python\nfrom diff_diff import TripleDifference, triple_difference\n\n# Basic usage\nddd = TripleDifference(estimation_method='dr')  # doubly robust (recommended)\nresults = ddd.fit(\n    data,\n    outcome='wages',\n    group='policy_state',       # 1=state enacted policy, 0=control state\n    partition='female',         # 1=women (affected by policy), 0=men\n    time='post'                 # 1=post-policy, 0=pre-policy\n)\n\n# View results\nresults.print_summary()\nprint(f\"ATT: {results.att:.3f} (SE: {results.se:.3f})\")\n\n# With covariates (properly incorporated, unlike naive DDD)\nresults = ddd.fit(\n    data,\n    outcome='wages',\n    group='policy_state',\n    partition='female',\n    time='post',\n    covariates=['age', 'education', 'experience']\n)\n```\n\n**Estimation methods:**\n\n| Method | Description | When to use |\n|--------|-------------|-------------|\n| `\"dr\"` | Doubly robust | Recommended. 
Consistent if either outcome or propensity model is correct |\n| `\"reg\"` | Regression adjustment | Simple outcome regression with full interactions |\n| `\"ipw\"` | Inverse probability weighting | When propensity score model is well-specified |\n\n```python\n# Compare estimation methods\nfor method in ['reg', 'ipw', 'dr']:\n    est = TripleDifference(estimation_method=method)\n    res = est.fit(data, outcome='y', group='g', partition='p', time='t')\n    print(f\"{method}: ATT={res.att:.3f} (SE={res.se:.3f})\")\n```\n\n**Convenience function:**\n\n```python\n# One-liner estimation\nresults = triple_difference(\n    data,\n    outcome='wages',\n    group='policy_state',\n    partition='female',\n    time='post',\n    covariates=['age', 'education'],\n    estimation_method='dr'\n)\n```\n\n**Why use DDD instead of DiD?**\n\nDDD allows for violations of parallel trends that are:\n- Group-specific (e.g., economic shocks in treatment states)\n- Partition-specific (e.g., trends affecting women everywhere)\n\nAs long as these biases are additive, DDD differences them out. 
The key assumption is that the *differential* trend between eligible and ineligible units would be the same across groups.\n\n### Event Study Visualization\n\nCreate publication-ready event study plots:\n\n```python\nimport pandas as pd\nfrom diff_diff import plot_event_study, MultiPeriodDiD, CallawaySantAnna, SunAbraham\n\n# From MultiPeriodDiD (full event study with pre and post period effects)\ndid = MultiPeriodDiD()\nresults = did.fit(data, outcome='y', treatment='treated',\n                  time='period', post_periods=[3, 4, 5], reference_period=2)\nplot_event_study(results, title=\"Treatment Effects Over Time\")\n\n# From CallawaySantAnna (with event study aggregation)\ncs = CallawaySantAnna()\nresults = cs.fit(data, outcome='y', unit='unit', time='period',\n                 first_treat='first_treat', aggregate='event_study')\nplot_event_study(results, title=\"Staggered DiD Event Study (CS)\")\n\n# From SunAbraham\nsa = SunAbraham()\nresults = sa.fit(data, outcome='y', unit='unit', time='period',\n                 first_treat='first_treat')\nplot_event_study(results, title=\"Staggered DiD Event Study (SA)\")\n\n# From a DataFrame\ndf = pd.DataFrame({\n    'period': [-2, -1, 0, 1, 2],\n    'effect': [0.1, 0.05, 0.0, 2.5, 2.8],\n    'se': [0.3, 0.25, 0.0, 0.4, 0.45]\n})\nplot_event_study(df, reference_period=0)\n\n# With customization\nax = plot_event_study(\n    results,\n    title=\"Dynamic Treatment Effects\",\n    xlabel=\"Years Relative to Treatment\",\n    ylabel=\"Effect on Sales ($1000s)\",\n    color=\"#2563eb\",\n    marker=\"o\",\n    shade_pre=True,           # Shade pre-treatment region\n    show_zero_line=True,      # Horizontal line at y=0\n    show_reference_line=True, # Vertical line at reference period\n    figsize=(10, 6),\n    show=False                # Don't call plt.show(), return axes\n)\n```\n\n### Synthetic Difference-in-Differences\n\nSynthetic DiD combines the strengths of Difference-in-Differences and Synthetic Control methods by re-weighting control 
units to better match treated units' pre-treatment outcomes.\n\n```python\nfrom diff_diff import SyntheticDiD\n\n# Fit Synthetic DiD model\nsdid = SyntheticDiD()\nresults = sdid.fit(\n    panel_data,\n    outcome='gdp_growth',\n    treatment='treated',\n    unit='state',\n    time='year',\n    post_periods=[2015, 2016, 2017, 2018]\n)\n\n# View results\nresults.print_summary()\nprint(f\"ATT: {results.att:.3f} (SE: {results.se:.3f})\")\n\n# Examine unit weights (which control units matter most)\nweights_df = results.get_unit_weights_df()\nprint(weights_df.head(10))\n\n# Examine time weights\ntime_weights_df = results.get_time_weights_df()\nprint(time_weights_df)\n```\n\nOutput:\n```\n===========================================================================\n         Synthetic Difference-in-Differences Estimation Results\n===========================================================================\n\nObservations:                      500\nTreated units:                       1\nControl units:                      49\nPre-treatment periods:               6\nPost-treatment periods:              4\nRegularization (lambda):        0.0000\nPre-treatment fit (RMSE):       0.1234\n\n---------------------------------------------------------------------------\nParameter         Estimate     Std. Err.     t-stat      P\u003e|t|\n---------------------------------------------------------------------------\nATT                 2.5000       0.4521      5.530      0.0000\n---------------------------------------------------------------------------\n\n95% Confidence Interval: [1.6139, 3.3861]\n\n---------------------------------------------------------------------------\n                   Top Unit Weights (Synthetic Control)\n---------------------------------------------------------------------------\n  Unit state_12: 0.3521\n  Unit state_5: 0.2156\n  Unit state_23: 0.1834\n  Unit state_8: 0.1245\n  Unit state_31: 0.0892\n  (8 units with weight \u003e 0.001)\n\nSignif. 
codes: '***' 0.001, '**' 0.01, '*' 0.05, '.' 0.1\n===========================================================================\n```\n\n#### When to Use Synthetic DiD Over Vanilla DiD\n\nUse Synthetic DiD instead of standard DiD when:\n\n1. **Few treated units**: When you have only one or a small number of treated units (e.g., a single state passed a policy), standard DiD averages across all controls equally. Synthetic DiD finds the optimal weighted combination of controls.\n\n   ```python\n   # Example: California passed a policy, want to estimate its effect\n   # Standard DiD would compare CA to the average of all other states\n   # Synthetic DiD finds states that together best match CA's pre-treatment trend\n   ```\n\n2. **Parallel trends is questionable**: When treated and control groups have different pre-treatment levels or trends, Synthetic DiD can construct a better counterfactual by matching the pre-treatment trajectory.\n\n   ```python\n   # Example: A tech hub city vs rural areas\n   # Rural areas may not be a good comparison on average\n   # Synthetic DiD can weight urban/suburban controls more heavily\n   ```\n\n3. **Heterogeneous control units**: When control units are very different from each other, equal weighting (as in standard DiD) is suboptimal.\n\n   ```python\n   # Example: Comparing a treated developing country to other countries\n   # Some control countries may be much more similar economically\n   # Synthetic DiD upweights the most comparable controls\n   ```\n\n4. 
**You want transparency**: Synthetic DiD provides explicit unit weights showing which controls contribute most to the comparison.\n\n   ```python\n   # See exactly which units are driving the counterfactual\n   print(results.get_unit_weights_df())\n   ```\n\n**Key differences from standard DiD:**\n\n| Aspect | Standard DiD | Synthetic DiD |\n|--------|--------------|---------------|\n| Control weighting | Equal (1/N) | Optimized to match pre-treatment |\n| Time weighting | Equal across periods | Can emphasize informative periods |\n| N treated required | Can be many | Works with 1 treated unit |\n| Parallel trends | Assumed | Partially relaxed via matching |\n| Interpretability | Simple average | Explicit weights |\n\n**Parameters:**\n\n```python\nSyntheticDiD(\n    zeta_omega=None,        # Unit weight regularization (None = auto-computed from data)\n    zeta_lambda=None,       # Time weight regularization (None = auto-computed from data)\n    alpha=0.05,             # Significance level\n    variance_method=\"placebo\",  # \"placebo\" (default, matches R) or \"bootstrap\"\n    n_bootstrap=200,        # Replications for SE estimation\n    seed=None               # Random seed for reproducibility\n)\n```\n\n### Triply Robust Panel (TROP)\n\nTROP (Athey, Imbens, Qu \u0026 Viviano 2025) extends Synthetic DiD by adding interactive fixed effects (factor model) adjustment. It's particularly useful when there are unobserved time-varying confounders with a factor structure that could bias standard DiD or SDID estimates.\n\nTROP combines three robustness components:\n1. **Nuclear norm regularized factor model**: Estimates interactive fixed effects L_it via soft-thresholding\n2. **Exponential distance-based unit weights**: ω_j = exp(-λ_unit × distance(j,i))\n3. 
**Exponential time decay weights**: θ_s = exp(-λ_time × |s-t|)\n\nTuning parameters are selected via leave-one-out cross-validation (LOOCV).\n\n```python\nfrom diff_diff import TROP, trop\n\n# Fit TROP model with automatic tuning via LOOCV\ntrop_est = TROP(\n    lambda_time_grid=[0.0, 0.5, 1.0, 2.0],  # Time decay grid\n    lambda_unit_grid=[0.0, 0.5, 1.0, 2.0],  # Unit distance grid\n    lambda_nn_grid=[0.0, 0.1, 1.0],          # Nuclear norm grid\n    n_bootstrap=200\n)\n# Note: TROP infers treatment periods from the treatment indicator column.\n# The 'treated' column must be an absorbing state (D=1 for all periods\n# during and after treatment starts for each unit).\nresults = trop_est.fit(\n    panel_data,\n    outcome='gdp_growth',\n    treatment='treated',\n    unit='state',\n    time='year'\n)\n\n# View results\nresults.print_summary()\nprint(f\"ATT: {results.att:.3f} (SE: {results.se:.3f})\")\nprint(f\"Effective rank: {results.effective_rank:.2f}\")\n\n# Selected tuning parameters\nprint(f\"λ_time: {results.lambda_time:.2f}\")\nprint(f\"λ_unit: {results.lambda_unit:.2f}\")\nprint(f\"λ_nn: {results.lambda_nn:.2f}\")\n\n# Examine unit effects\nunit_effects = results.get_unit_effects_df()\nprint(unit_effects.head(10))\n```\n\nOutput:\n```\n===========================================================================\n         Triply Robust Panel (TROP) Estimation Results\n               Athey, Imbens, Qu \u0026 Viviano (2025)\n===========================================================================\n\nObservations:                      500\nTreated units:                       1\nControl units:                      49\nTreated observations:                4\nPre-treatment periods:               6\nPost-treatment periods:              4\n\n---------------------------------------------------------------------------\n             Tuning Parameters (selected via LOOCV)\n---------------------------------------------------------------------------\nLambda (time 
decay):               1.0000\nLambda (unit distance):            0.5000\nLambda (nuclear norm):             0.1000\nEffective rank:                      2.35\nLOOCV score:                     0.012345\nVariance method:                bootstrap\nBootstrap replications:              200\n\n---------------------------------------------------------------------------\nParameter         Estimate     Std. Err.     t-stat      P\u003e|t|\n---------------------------------------------------------------------------\nATT                 2.5000       0.3892      6.424      0.0000   ***\n---------------------------------------------------------------------------\n\n95% Confidence Interval: [1.7372, 3.2628]\n\nSignif. codes: '***' 0.001, '**' 0.01, '*' 0.05, '.' 0.1\n===========================================================================\n```\n\n#### When to Use TROP Over Synthetic DiD\n\nUse TROP when you suspect **factor structure** in the data—unobserved confounders that affect outcomes differently across units and time:\n\n| Scenario | Use SDID | Use TROP |\n|----------|----------|----------|\n| Simple parallel trends | ✓ | ✓ |\n| Unobserved factors (e.g., economic cycles) | May be biased | ✓ |\n| Strong unit-time interactions | May be biased | ✓ |\n| Low-dimensional confounding | ✓ | ✓ |\n\n**Example scenarios where TROP excels:**\n- Regional economic shocks that affect states differently based on industry composition\n- Global trends that impact countries differently based on their economic structure\n- Common factors in financial data (market risk, interest rates, etc.)\n\n**How TROP works:**\n\n1. **Factor estimation**: Estimates interactive fixed effects L_it using nuclear norm regularization (encourages low-rank structure)\n2. **Unit weights**: Exponential distance-based weighting ω_j = exp(-λ_unit × d(j,i)) where d(j,i) is the RMSE of outcome differences\n3. 
**Time weights**: Exponential decay weighting θ_s = exp(-λ_time × |s-t|) based on proximity to treatment\n4. **ATT computation**: τ = Y_it - α_i - β_t - L_it for treated observations\n\n```python\n# Compare TROP vs SDID under factor confounding\nfrom diff_diff import SyntheticDiD, TROP\n\n# Synthetic DiD (may be biased with factors)\nsdid = SyntheticDiD()\nsdid_results = sdid.fit(data, outcome='y', treatment='treated',\n                        unit='unit', time='time', post_periods=[5, 6, 7])\n\n# TROP (accounts for factors)\n# Note: TROP infers treatment periods from the treatment indicator column\n# (D=1 for treated observations, D=0 for control)\ntrop_est = TROP()  # Uses default grids with LOOCV selection\ntrop_results = trop_est.fit(data, outcome='y', treatment='treated',\n                            unit='unit', time='time')\n\nprint(f\"SDID estimate: {sdid_results.att:.3f}\")\nprint(f\"TROP estimate: {trop_results.att:.3f}\")\nprint(f\"Effective rank: {trop_results.effective_rank:.2f}\")\n```\n\n**Tuning parameter grids:**\n\n```python\n# Custom tuning grids (searched via LOOCV)\n# Named trop_custom so the imported trop() function is not shadowed\ntrop_custom = TROP(\n    lambda_time_grid=[0.0, 0.1, 0.5, 1.0, 2.0, 5.0],  # Time decay\n    lambda_unit_grid=[0.0, 0.1, 0.5, 1.0, 2.0, 5.0],  # Unit distance\n    lambda_nn_grid=[0.0, 0.01, 0.1, 1.0, 10.0]        # Nuclear norm\n)\n\n# Fixed tuning parameters (skip LOOCV search)\ntrop_fixed = TROP(\n    lambda_time_grid=[1.0],   # Single value = fixed\n    lambda_unit_grid=[1.0],   # Single value = fixed\n    lambda_nn_grid=[0.1]      # Single value = fixed\n)\n```\n\n**Parameters:**\n\n```python\nTROP(\n    method='local',             # Estimation method: 'local' (default) or 'global'\n    lambda_time_grid=None,      # Time decay grid (default: [0, 0.1, 0.5, 1, 2, 5])\n    lambda_unit_grid=None,      # Unit distance grid (default: [0, 0.1, 0.5, 1, 2, 5])\n    lambda_nn_grid=None,        # Nuclear norm grid (default: [0, 0.01, 0.1, 1, 10])\n    max_iter=100,               # Max iterations for factor 
estimation\n    tol=1e-6,                   # Convergence tolerance\n    alpha=0.05,                 # Significance level\n    n_bootstrap=200,            # Bootstrap replications\n    seed=None                   # Random seed\n)\n```\n\n**Estimation methods:**\n- `'local'` (default): Per-observation model fitting following Algorithm 2 of the paper. Computes observation-specific weights and fits a model for each treated observation, then averages the individual treatment effects. More flexible but computationally intensive.\n- `'global'`: Global weighted least squares optimization. Fits a single model on control observations with global weights, then computes per-observation treatment effects as residuals. Faster but uses global rather than observation-specific weights.\n\n**Convenience function:**\n\n```python\nfrom diff_diff import trop\n\n# One-liner estimation with default tuning grids\n# Note: TROP infers treatment periods from the treatment indicator\nresults = trop(\n    data,\n    outcome='y',\n    treatment='treated',\n    unit='unit',\n    time='time',\n    n_bootstrap=200\n)\n```\n\n## Working with Results\n\n### Export Results\n\n```python\n# As dictionary\nresults.to_dict()\n# {'att': 3.5, 'se': 1.26, 'p_value': 0.037, ...}\n\n# As DataFrame\ndf = results.to_dataframe()\n```\n\n### Check Significance\n\n```python\nif results.is_significant:\n    print(f\"Effect is significant at {did.alpha} level\")\n\n# Get significance stars\nprint(f\"ATT: {results.att:.4f}{results.significance_stars}\")\n# ATT: 3.5000*\n```\n\n### Access Full Regression Output\n\n```python\n# All coefficients\nresults.coefficients\n# {'const': 9.5, 'treated': 1.0, 'post': 2.5, 'treated:post': 3.5}\n\n# Variance-covariance matrix\nresults.vcov\n\n# Residuals and fitted values\nresults.residuals\nresults.fitted_values\n\n# R-squared\nresults.r_squared\n```\n\n## Checking Assumptions\n\n### Parallel Trends\n\n**Simple slope-based test:**\n\n```python\nfrom diff_diff.utils import check_parallel_trends\n\ntrends = 
check_parallel_trends(\n    data,\n    outcome='outcome',\n    time='period',\n    treatment_group='treated'\n)\n\nprint(f\"Treated trend: {trends['treated_trend']:.4f}\")\nprint(f\"Control trend: {trends['control_trend']:.4f}\")\nprint(f\"Difference p-value: {trends['p_value']:.4f}\")\n```\n\n**Robust distributional test (Wasserstein distance):**\n\n```python\nfrom diff_diff.utils import check_parallel_trends_robust\n\nresults = check_parallel_trends_robust(\n    data,\n    outcome='outcome',\n    time='period',\n    treatment_group='treated',\n    unit='firm_id',              # Unit identifier for panel data\n    pre_periods=[2018, 2019],    # Pre-treatment periods\n    n_permutations=1000          # Permutations for p-value\n)\n\nprint(f\"Wasserstein distance: {results['wasserstein_distance']:.4f}\")\nprint(f\"Wasserstein p-value: {results['wasserstein_p_value']:.4f}\")\nprint(f\"KS test p-value: {results['ks_p_value']:.4f}\")\nprint(f\"Parallel trends plausible: {results['parallel_trends_plausible']}\")\n```\n\nThe Wasserstein (Earth Mover's) distance compares the full distribution of outcome changes, not just means. This is more robust to:\n- Non-normal distributions\n- Heterogeneous effects across units\n- Outliers\n\n**Equivalence testing (TOST):**\n\n```python\nfrom diff_diff.utils import equivalence_test_trends\n\nresults = equivalence_test_trends(\n    data,\n    outcome='outcome',\n    time='period',\n    treatment_group='treated',\n    unit='firm_id',\n    equivalence_margin=0.5       # Define \"practically equivalent\"\n)\n\nprint(f\"Mean difference: {results['mean_difference']:.4f}\")\nprint(f\"TOST p-value: {results['tost_p_value']:.4f}\")\nprint(f\"Trends equivalent: {results['equivalent']}\")\n```\n\n### Honest DiD Sensitivity Analysis (Rambachan-Roth)\n\nPre-trends tests have low power and can exacerbate bias. 
**Honest DiD** (Rambachan \u0026 Roth 2023) provides sensitivity analysis showing how robust your results are to violations of parallel trends.\n\n```python\nfrom diff_diff import HonestDiD, MultiPeriodDiD\n\n# First, fit a full event study (pre + post period effects)\ndid = MultiPeriodDiD()\nevent_results = did.fit(\n    data,\n    outcome='outcome',\n    treatment='treated',\n    time='period',\n    post_periods=[5, 6, 7, 8, 9],\n    reference_period=4,          # Last pre-period (e=-1 convention)\n)\n\n# Compute honest bounds with relative magnitudes restriction\n# M=1 means post-treatment violations can be up to 1x the worst pre-treatment violation\nhonest = HonestDiD(method='relative_magnitude', M=1.0)\nhonest_results = honest.fit(event_results)\n\nprint(honest_results.summary())\nprint(f\"Original estimate: {honest_results.original_estimate:.4f}\")\nprint(f\"Robust 95% CI: [{honest_results.ci_lb:.4f}, {honest_results.ci_ub:.4f}]\")\nprint(f\"Effect robust to violations: {honest_results.is_significant}\")\n```\n\n**Sensitivity analysis over M values:**\n\n```python\n# How do results change as we allow larger violations?\nsensitivity = honest.sensitivity_analysis(\n    event_results,\n    M_grid=[0, 0.5, 1.0, 1.5, 2.0]\n)\n\nprint(sensitivity.summary())\nprint(f\"Breakdown value: M = {sensitivity.breakdown_M}\")\n# Breakdown = smallest M where the robust CI includes zero\n```\n\n**Breakdown value:**\n\nThe breakdown value tells you how robust your conclusion is:\n\n```python\nbreakdown = honest.breakdown_value(event_results)\nif breakdown \u003e= 1.0:\n    print(\"Result holds even if post-treatment violations are as bad as pre-treatment\")\nelse:\n    print(f\"Result requires violations smaller than {breakdown:.1f}x pre-treatment\")\n```\n\n**Smoothness restriction (alternative approach):**\n\n```python\n# Bounds second differences of trend violations\n# M=0 means linear extrapolation of pre-trends\nhonest_smooth = HonestDiD(method='smoothness', 
M=0.5)\nsmooth_results = honest_smooth.fit(event_results)\n```\n\n**Visualization:**\n\n```python\nfrom diff_diff import plot_sensitivity, plot_honest_event_study\n\n# Plot sensitivity analysis\nplot_sensitivity(sensitivity, title=\"Sensitivity to Parallel Trends Violations\")\n\n# Event study with honest confidence intervals\nplot_honest_event_study(event_results, honest_results)\n```\n\n### Pre-Trends Power Analysis (Roth 2022)\n\nA passing pre-trends test doesn't mean parallel trends holds—it may just mean the test has low power. **Pre-Trends Power Analysis** (Roth 2022) answers: \"What violations could my pre-trends test have detected?\"\n\n```python\nfrom diff_diff import PreTrendsPower, MultiPeriodDiD\n\n# First, fit a full event study\ndid = MultiPeriodDiD()\nevent_results = did.fit(\n    data,\n    outcome='outcome',\n    treatment='treated',\n    time='period',\n    post_periods=[5, 6, 7, 8, 9],\n    reference_period=4,\n)\n\n# Analyze pre-trends test power\npt = PreTrendsPower(alpha=0.05, power=0.80)\npower_results = pt.fit(event_results)\n\nprint(power_results.summary())\nprint(f\"Minimum Detectable Violation (MDV): {power_results.mdv:.4f}\")\nprint(f\"Power to detect violations of size MDV: {power_results.power:.1%}\")\n```\n\n**Key concepts:**\n\n- **Minimum Detectable Violation (MDV)**: Smallest violation magnitude that would be detected with your target power (e.g., 80%). 
Passing the pre-trends test does NOT rule out violations up to this size.\n- **Power**: Probability of detecting a violation of a given size if it exists.\n- **Violation types**: Linear trend, constant violation, last-period only, or custom patterns.\n\n**Power curve visualization:**\n\n```python\nfrom diff_diff import plot_pretrends_power\n\n# Generate power curve across violation magnitudes\ncurve = pt.power_curve(event_results)\n\n# Plot the power curve\nplot_pretrends_power(curve, title=\"Pre-Trends Test Power Curve\")\n\n# Or from the curve object directly\ncurve.plot()\n```\n\n**Different violation patterns:**\n\n```python\nimport numpy as np\nfrom diff_diff import PreTrendsPower\n\n# Linear trend violations (default) - most common assumption\npt_linear = PreTrendsPower(violation_type='linear')\n\n# Constant violation in all pre-periods\npt_constant = PreTrendsPower(violation_type='constant')\n\n# Violation only in the last pre-period (sharp break)\npt_last = PreTrendsPower(violation_type='last_period')\n\n# Custom violation pattern\ncustom_weights = np.array([0.1, 0.3, 0.6])  # Increasing violations\npt_custom = PreTrendsPower(violation_type='custom', violation_weights=custom_weights)\n```\n\n**Combining with HonestDiD:**\n\nPre-trends power analysis and HonestDiD are complementary:\n1. **Pre-trends power** tells you what the test could have detected\n2. 
**HonestDiD** tells you how robust your results are to violations\n\n```python\nfrom diff_diff import HonestDiD, PreTrendsPower\n\n# If MDV is large relative to your estimated effect, be cautious\npt = PreTrendsPower()\npower_results = pt.fit(event_results)\nsensitivity = pt.sensitivity_to_honest_did(event_results)\nprint(sensitivity['interpretation'])\n\n# Use HonestDiD for robust inference\nhonest = HonestDiD(method='relative_magnitude', M=1.0)\nhonest_results = honest.fit(event_results)\n```\n\n### Placebo Tests\n\nPlacebo tests help validate the parallel trends assumption by checking whether effects appear where they shouldn't (before treatment or in untreated groups).\n\n**Fake timing test:**\n\n```python\nfrom diff_diff import run_placebo_test\n\n# Test: Is there an effect before treatment actually occurred?\n# Actual treatment is at period 3 (post_periods=[3, 4, 5])\n# We test if a \"fake\" treatment at period 1 shows an effect\nresults = run_placebo_test(\n    data,\n    outcome='outcome',\n    treatment='treated',\n    time='period',\n    test_type='fake_timing',\n    fake_treatment_period=1,  # Pretend treatment was in period 1\n    post_periods=[3, 4, 5]    # Actual post-treatment periods\n)\n\nprint(results.summary())\n# If parallel trends hold, placebo_effect should be ~0 and not significant\nprint(f\"Placebo effect: {results.placebo_effect:.3f} (p={results.p_value:.3f})\")\nprint(f\"Is significant (bad): {results.is_significant}\")\n```\n\n**Fake group test:**\n\n```python\n# Test: Is there an effect among never-treated units?\n# Get some control unit IDs to use as \"fake treated\"\ncontrol_units = data[data['treated'] == 0]['firm_id'].unique()[:5]\n\nresults = run_placebo_test(\n    data,\n    outcome='outcome',\n    treatment='treated',\n    time='period',\n    unit='firm_id',\n    test_type='fake_group',\n    fake_treatment_group=list(control_units),  # List of control unit IDs\n    post_periods=[3, 4, 5]\n)\n```\n\n**Permutation 
test:**\n\n```python\n# Randomly reassign treatment and compute distribution of effects\n# Note: requires a binary post indicator (use 'post' column, not 'period')\nresults = run_placebo_test(\n    data,\n    outcome='outcome',\n    treatment='treated',\n    time='post',           # Binary post-treatment indicator\n    unit='firm_id',\n    test_type='permutation',\n    n_permutations=1000,\n    seed=42\n)\n\nprint(f\"Original effect: {results.original_effect:.3f}\")\nprint(f\"Permutation p-value: {results.p_value:.4f}\")\n# Low p-value indicates the effect is unlikely to be due to chance\n```\n\n**Leave-one-out sensitivity:**\n\n```python\n# Test sensitivity to individual treated units\n# Note: requires a binary post indicator (use 'post' column, not 'period')\nresults = run_placebo_test(\n    data,\n    outcome='outcome',\n    treatment='treated',\n    time='post',           # Binary post-treatment indicator\n    unit='firm_id',\n    test_type='leave_one_out'\n)\n\n# Check if any single unit drives the result\nprint(results.leave_one_out_effects)  # Effect when each unit is dropped\n```\n\n**Run all placebo tests:**\n\n```python\nfrom diff_diff import run_all_placebo_tests\n\n# Comprehensive diagnostic suite\n# Note: This function runs fake_timing tests on pre-treatment periods.\n# The permutation and leave_one_out tests require a binary post indicator,\n# so they may return errors if the data use a multi-period time column.\nall_results = run_all_placebo_tests(\n    data,\n    outcome='outcome',\n    treatment='treated',\n    time='period',\n    unit='firm_id',\n    pre_periods=[0, 1, 2],\n    post_periods=[3, 4, 5],\n    n_permutations=500,\n    seed=42\n)\n\nfor test_name, result in all_results.items():\n    if hasattr(result, 'p_value'):\n        print(f\"{test_name}: p={result.p_value:.3f}, significant={result.is_significant}\")\n    elif isinstance(result, dict) and 'error' in result:\n        print(f\"{test_name}: Error - {result['error']}\")\n```\n\n## API 
Reference\n\n### DifferenceInDifferences\n\n```python\nDifferenceInDifferences(\n    robust=True,      # Use HC1 robust standard errors\n    cluster=None,     # Column for cluster-robust SEs\n    alpha=0.05        # Significance level for CIs\n)\n```\n\n**Methods:**\n\n| Method | Description |\n|--------|-------------|\n| `fit(data, outcome, treatment, time, ...)` | Fit the DiD model |\n| `summary()` | Get formatted summary string |\n| `print_summary()` | Print summary to stdout |\n| `get_params()` | Get estimator parameters (sklearn-compatible) |\n| `set_params(**params)` | Set estimator parameters (sklearn-compatible) |\n\n**fit() Parameters:**\n\n| Parameter | Type | Description |\n|-----------|------|-------------|\n| `data` | DataFrame | Input data |\n| `outcome` | str | Outcome variable column name |\n| `treatment` | str | Treatment indicator column (0/1) |\n| `time` | str | Post-treatment indicator column (0/1) |\n| `formula` | str | R-style formula (alternative to column names) |\n| `covariates` | list | Linear control variables |\n| `fixed_effects` | list | Categorical FE columns (creates dummies) |\n| `absorb` | list | High-dimensional FE (within-transformation) |\n\n### DiDResults\n\n**Attributes:**\n\n| Attribute | Description |\n|-----------|-------------|\n| `att` | Average Treatment effect on the Treated |\n| `se` | Standard error of ATT |\n| `t_stat` | T-statistic |\n| `p_value` | P-value for H0: ATT = 0 |\n| `conf_int` | Tuple of (lower, upper) confidence bounds |\n| `n_obs` | Number of observations |\n| `n_treated` | Number of treated units |\n| `n_control` | Number of control units |\n| `r_squared` | R-squared of regression |\n| `coefficients` | Dictionary of all coefficients |\n| `is_significant` | Boolean for significance at alpha |\n| `significance_stars` | String of significance stars |\n\n**Methods:**\n\n| Method | Description |\n|--------|-------------|\n| `summary(alpha)` | Get formatted summary string |\n| `print_summary(alpha)` | Print 
summary to stdout |\n| `to_dict()` | Convert to dictionary |\n| `to_dataframe()` | Convert to pandas DataFrame |\n\n### MultiPeriodDiD\n\n```python\nMultiPeriodDiD(\n    robust=True,      # Use HC1 robust standard errors\n    cluster=None,     # Column for cluster-robust SEs\n    alpha=0.05        # Significance level for CIs\n)\n```\n\n**fit() Parameters:**\n\n| Parameter | Type | Description |\n|-----------|------|-------------|\n| `data` | DataFrame | Input data |\n| `outcome` | str | Outcome variable column name |\n| `treatment` | str | Treatment indicator column (0/1) |\n| `time` | str | Time period column (multiple values) |\n| `post_periods` | list | List of post-treatment period values |\n| `covariates` | list | Linear control variables |\n| `fixed_effects` | list | Categorical FE columns (creates dummies) |\n| `absorb` | list | High-dimensional FE (within-transformation) |\n| `reference_period` | any | Omitted period (default: last pre-period, e=-1 convention) |\n| `unit` | str | Unit identifier column (for staggered adoption warning) |\n\n### MultiPeriodDiDResults\n\n**Attributes:**\n\n| Attribute | Description |\n|-----------|-------------|\n| `period_effects` | Dict mapping periods to PeriodEffect objects (pre and post, excluding reference) |\n| `avg_att` | Average ATT across post-treatment periods only |\n| `avg_se` | Standard error of average ATT |\n| `avg_t_stat` | T-statistic for average ATT |\n| `avg_p_value` | P-value for average ATT |\n| `avg_conf_int` | Confidence interval for average ATT |\n| `n_obs` | Number of observations |\n| `pre_periods` | List of pre-treatment periods |\n| `post_periods` | List of post-treatment periods |\n| `reference_period` | The omitted reference period (coefficient = 0 by construction) |\n| `interaction_indices` | Dict mapping period → column index in VCV (for sub-VCV extraction) |\n| `pre_period_effects` | Property: pre-period effects only (for parallel trends assessment) |\n| `post_period_effects` | Property: 
post-period effects only |\n\n**Methods:**\n\n| Method | Description |\n|--------|-------------|\n| `get_effect(period)` | Get PeriodEffect for specific period |\n| `summary(alpha)` | Get formatted summary string |\n| `print_summary(alpha)` | Print summary to stdout |\n| `to_dict()` | Convert to dictionary |\n| `to_dataframe()` | Convert to pandas DataFrame |\n\n### PeriodEffect\n\n**Attributes:**\n\n| Attribute | Description |\n|-----------|-------------|\n| `period` | Time period identifier |\n| `effect` | Treatment effect estimate |\n| `se` | Standard error |\n| `t_stat` | T-statistic |\n| `p_value` | P-value |\n| `conf_int` | Confidence interval |\n| `is_significant` | Boolean for significance at 0.05 |\n| `significance_stars` | String of significance stars |\n\n### SyntheticDiD\n\n```python\nSyntheticDiD(\n    zeta_omega=None,        # Unit weight regularization (None = auto from data)\n    zeta_lambda=None,       # Time weight regularization (None = auto from data)\n    alpha=0.05,             # Significance level for CIs\n    variance_method=\"placebo\",  # \"placebo\" (R default) or \"bootstrap\"\n    n_bootstrap=200,        # Replications for SE estimation\n    seed=None               # Random seed for reproducibility\n)\n```\n\n**fit() Parameters:**\n\n| Parameter | Type | Description |\n|-----------|------|-------------|\n| `data` | DataFrame | Panel data |\n| `outcome` | str | Outcome variable column name |\n| `treatment` | str | Treatment indicator column (0/1) |\n| `unit` | str | Unit identifier column |\n| `time` | str | Time period column |\n| `post_periods` | list | List of post-treatment period values |\n| `covariates` | list | Covariates to residualize out |\n\n### SyntheticDiDResults\n\n**Attributes:**\n\n| Attribute | Description |\n|-----------|-------------|\n| `att` | Average Treatment effect on the Treated |\n| `se` | Standard error (bootstrap or placebo-based) |\n| `t_stat` | T-statistic |\n| `p_value` | P-value |\n| `conf_int` | 
Confidence interval |\n| `n_obs` | Number of observations |\n| `n_treated` | Number of treated units |\n| `n_control` | Number of control units |\n| `unit_weights` | Dict mapping control unit IDs to weights |\n| `time_weights` | Dict mapping pre-treatment periods to weights |\n| `pre_periods` | List of pre-treatment periods |\n| `post_periods` | List of post-treatment periods |\n| `pre_treatment_fit` | RMSE of synthetic vs treated in pre-period |\n| `placebo_effects` | Array of placebo effect estimates |\n\n**Methods:**\n\n| Method | Description |\n|--------|-------------|\n| `summary(alpha)` | Get formatted summary string |\n| `print_summary(alpha)` | Print summary to stdout |\n| `to_dict()` | Convert to dictionary |\n| `to_dataframe()` | Convert to pandas DataFrame |\n| `get_unit_weights_df()` | Get unit weights as DataFrame |\n| `get_time_weights_df()` | Get time weights as DataFrame |\n\n### TROP\n\n```python\nTROP(\n    method='local',            # Estimation method: 'local' (default) or 'global'\n    lambda_time_grid=None,     # Time decay grid (default: [0, 0.1, 0.5, 1, 2, 5])\n    lambda_unit_grid=None,     # Unit distance grid (default: [0, 0.1, 0.5, 1, 2, 5])\n    lambda_nn_grid=None,       # Nuclear norm grid (default: [0, 0.01, 0.1, 1, 10])\n    max_iter=100,              # Max iterations for factor estimation\n    tol=1e-6,                  # Convergence tolerance\n    alpha=0.05,                # Significance level for CIs\n    n_bootstrap=200,           # Bootstrap replications (minimum 2; TROP requires bootstrap for SEs)\n    seed=None                  # Random seed\n)\n```\n\n**fit() Parameters:**\n\n| Parameter | Type | Description |\n|-----------|------|-------------|\n| `data` | DataFrame | Panel data |\n| `outcome` | str | Outcome variable column name |\n| `treatment` | str | Treatment indicator column (0/1 absorbing state) |\n| `unit` | str | Unit identifier column |\n| `time` | str | Time period column |\n\nNote: TROP infers treatment periods from the treatment indicator column. 
The treatment column should be an absorbing state indicator where D=1 for all periods during and after treatment starts.\n\n### TROPResults\n\n**Attributes:**\n\n| Attribute | Description |\n|-----------|-------------|\n| `att` | Average Treatment effect on the Treated |\n| `se` | Standard error (bootstrap) |\n| `t_stat` | T-statistic |\n| `p_value` | P-value |\n| `conf_int` | Confidence interval |\n| `n_obs` | Number of observations |\n| `n_treated` | Number of treated units |\n| `n_control` | Number of control units |\n| `n_treated_obs` | Number of treated unit-time observations |\n| `unit_effects` | Dict mapping unit IDs to fixed effects |\n| `time_effects` | Dict mapping periods to fixed effects |\n| `treatment_effects` | Dict mapping (unit, time) to individual effects |\n| `lambda_time` | Selected time decay parameter |\n| `lambda_unit` | Selected unit distance parameter |\n| `lambda_nn` | Selected nuclear norm parameter |\n| `factor_matrix` | Low-rank factor matrix L (n_periods x n_units) |\n| `effective_rank` | Effective rank of factor matrix |\n| `loocv_score` | LOOCV score for selected parameters |\n| `n_pre_periods` | Number of pre-treatment periods |\n| `n_post_periods` | Number of post-treatment periods |\n| `bootstrap_distribution` | Bootstrap distribution (if bootstrap) |\n\n**Methods:**\n\n| Method | Description |\n|--------|-------------|\n| `summary(alpha)` | Get formatted summary string |\n| `print_summary(alpha)` | Print summary to stdout |\n| `to_dict()` | Convert to dictionary |\n| `to_dataframe()` | Convert to pandas DataFrame |\n| `get_unit_effects_df()` | Get unit fixed effects as DataFrame |\n| `get_time_effects_df()` | Get time fixed effects as DataFrame |\n| `get_treatment_effects_df()` | Get individual treatment effects as DataFrame |\n\n### SunAbraham\n\n```python\nSunAbraham(\n    control_group='never_treated',  # or 'not_yet_treated'\n    anticipation=0,           # Periods of anticipation effects\n    alpha=0.05,               # 
Significance level for CIs\n    cluster=None,             # Column for cluster-robust SEs\n    n_bootstrap=0,            # Bootstrap iterations (0 = analytical SEs)\n    bootstrap_weights='rademacher',  # 'rademacher', 'mammen', or 'webb'\n    seed=None                 # Random seed\n)\n```\n\n**fit() Parameters:**\n\n| Parameter | Type | Description |\n|-----------|------|-------------|\n| `data` | DataFrame | Panel data |\n| `outcome` | str | Outcome variable column name |\n| `unit` | str | Unit identifier column |\n| `time` | str | Time period column |\n| `first_treat` | str | Column with first treatment period (0 for never-treated) |\n| `covariates` | list | Covariate column names |\n\n### SunAbrahamResults\n\n**Attributes:**\n\n| Attribute | Description |\n|-----------|-------------|\n| `event_study_effects` | Dict mapping relative time to effect info |\n| `overall_att` | Overall average treatment effect |\n| `overall_se` | Standard error of overall ATT |\n| `overall_t_stat` | T-statistic for overall ATT |\n| `overall_p_value` | P-value for overall ATT |\n| `overall_conf_int` | Confidence interval for overall ATT |\n| `cohort_weights` | Dict mapping relative time to cohort weights |\n| `groups` | List of treatment cohorts |\n| `time_periods` | List of all time periods |\n| `n_obs` | Total number of observations |\n| `n_treated_units` | Number of ever-treated units |\n| `n_control_units` | Number of never-treated units |\n| `is_significant` | Boolean for significance at alpha |\n| `significance_stars` | String of significance stars |\n| `bootstrap_results` | SABootstrapResults (if bootstrap enabled) |\n\n**Methods:**\n\n| Method | Description |\n|--------|-------------|\n| `summary(alpha)` | Get formatted summary string |\n| `print_summary(alpha)` | Print summary to stdout |\n| `to_dataframe(level)` | Convert to DataFrame ('event_study' or 'cohort') |\n\n### ImputationDiD\n\n```python\nImputationDiD(\n    anticipation=0,                   # Periods of 
anticipation effects\n    alpha=0.05,                       # Significance level for CIs\n    cluster=None,                     # Column for cluster-robust SEs\n    n_bootstrap=0,                    # Bootstrap iterations (0 = analytical)\n    bootstrap_weights='rademacher',   # 'rademacher', 'mammen', or 'webb'\n    seed=None,                        # Random seed\n    rank_deficient_action='warn',     # 'warn', 'error', or 'silent'\n    horizon_max=None,                 # Max event-study horizon\n    aux_partition='cohort_horizon',   # Variance partition\n)\n```\n\n**fit() Parameters:**\n\n| Parameter | Type | Description |\n|-----------|------|-------------|\n| `data` | DataFrame | Panel data |\n| `outcome` | str | Outcome variable column name |\n| `unit` | str | Unit identifier column |\n| `time` | str | Time period column |\n| `first_treat` | str | First treatment period column (0 for never-treated) |\n| `covariates` | list | Covariate column names |\n| `aggregate` | str | Aggregation: None, \"event_study\", \"group\", \"all\" |\n| `balance_e` | int | Balance event study to this many pre-treatment periods |\n\n### ImputationDiDResults\n\n**Attributes:**\n\n| Attribute | Description |\n|-----------|-------------|\n| `overall_att` | Overall average treatment effect on the treated |\n| `overall_se` | Standard error (conservative, Theorem 3) |\n| `overall_t_stat` | T-statistic |\n| `overall_p_value` | P-value for H0: ATT = 0 |\n| `overall_conf_int` | Confidence interval |\n| `event_study_effects` | Dict of relative time -\u003e effect dict (if `aggregate='event_study'` or `'all'`) |\n| `group_effects` | Dict of cohort -\u003e effect dict (if `aggregate='group'` or `'all'`) |\n| `treatment_effects` | DataFrame of unit-level imputed treatment effects |\n| `n_treated_obs` | Number of treated observations |\n| `n_untreated_obs` | Number of untreated observations |\n\n**Methods:**\n\n| Method | Description |\n|--------|-------------|\n| `summary(alpha)` | Get formatted 
summary string |\n| `print_summary(alpha)` | Print summary to stdout |\n| `to_dataframe(level)` | Convert to DataFrame ('observation', 'event_study', 'group') |\n| `pretrend_test(n_leads)` | Run pre-trend F-test (Equation 9) |\n\n### TwoStageDiD\n\n```python\nTwoStageDiD(\n    anticipation=0,                   # Periods of anticipation effects\n    alpha=0.05,                       # Significance level for CIs\n    cluster=None,                     # Column for cluster-robust SEs (defaults to unit)\n    n_bootstrap=0,                    # Bootstrap iterations (0 = analytical GMM SEs)\n    bootstrap_weights='rademacher',   # 'rademacher', 'mammen', or 'webb'\n    seed=None,                        # Random seed\n    rank_deficient_action='warn',     # 'warn', 'error', or 'silent'\n    horizon_max=None,                 # Max event-study horizon\n)\n```\n\n**fit() Parameters:**\n\n| Parameter | Type | Description |\n|-----------|------|-------------|\n| `data` | DataFrame | Panel data |\n| `outcome` | str | Outcome variable column name |\n| `unit` | str | Unit identifier column |\n| `time` | str | Time period column |\n| `first_treat` | str | First treatment period column (0 for never-treated) |\n| `covariates` | list | Covariate column names |\n| `aggregate` | str | Aggregation: None, \"event_study\", \"group\", \"all\" |\n| `balance_e` | int | Balance event study to this many pre-treatment periods |\n\n### TwoStageDiDResults\n\n**Attributes:**\n\n| Attribute | Description |\n|-----------|-------------|\n| `overall_att` | Overall average treatment effect on the treated |\n| `overall_se` | Standard error (GMM sandwich variance) |\n| `overall_t_stat` | T-statistic |\n| `overall_p_value` | P-value for H0: ATT = 0 |\n| `overall_conf_int` | Confidence interval |\n| `event_study_effects` | Dict of relative time -\u003e effect dict (if `aggregate='event_study'` or `'all'`) |\n| `group_effects` | Dict of cohort -\u003e effect dict (if `aggregate='group'` or `'all'`) |\n| 
`treatment_effects` | DataFrame of unit-level treatment effects |\n| `n_treated_obs` | Number of treated observations |\n| `n_untreated_obs` | Number of untreated observations |\n\n**Methods:**\n\n| Method | Description |\n|--------|-------------|\n| `summary(alpha)` | Get formatted summary string |\n| `print_summary(alpha)` | Print summary to stdout |\n| `to_dataframe(level)` | Convert to DataFrame ('observation', 'event_study', 'group') |\n\n### StackedDiD\n\n```python\nStackedDiD(\n    kappa_pre=1,                          # Pre-treatment event-study periods\n    kappa_post=1,                         # Post-treatment event-study periods\n    weighting='aggregate',                # 'aggregate', 'population', or 'sample_share'\n    clean_control='not_yet_treated',      # 'not_yet_treated', 'strict', or 'never_treated'\n    cluster='unit',                       # 'unit' or 'unit_subexp'\n    alpha=0.05,                           # Significance level\n    anticipation=0,                       # Anticipation periods\n    rank_deficient_action='warn',         # 'warn', 'error', or 'silent'\n)\n```\n\n**fit() Parameters:**\n\n| Parameter | Type | Description |\n|-----------|------|-------------|\n| `data` | DataFrame | Panel data |\n| `outcome` | str | Outcome variable column name |\n| `unit` | str | Unit identifier column |\n| `time` | str | Time period column |\n| `first_treat` | str | First treatment period column (0 for never-treated) |\n| `population` | str, optional | Population column (required if weighting='population') |\n| `aggregate` | str | Aggregation: None, `\"simple\"`, or `\"event_study\"` |\n\n### StackedDiDResults\n\n**Attributes:**\n\n| Attribute | Description |\n|-----------|-------------|\n| `overall_att` | Overall average treatment effect on the treated |\n| `overall_se` | Standard error |\n| `overall_t_stat` | T-statistic |\n| `overall_p_value` | P-value for H0: ATT = 0 |\n| `overall_conf_int` | Confidence interval |\n| `event_study_effects` | 
Dict of relative time -\u003e effect dict (if `aggregate='event_study'`) |\n| `stacked_data` | The stacked dataset used for estimation |\n| `n_treated_obs` | Number of treated observations |\n| `n_untreated_obs` | Number of untreated (clean control) observations |\n| `n_cohorts` | Number of treatment cohorts |\n| `kappa_pre` | Pre-treatment window used |\n| `kappa_post` | Post-treatment window used |\n\n**Methods:**\n\n| Method | Description |\n|--------|-------------|\n| `summary(alpha)` | Get formatted summary string |\n| `print_summary(alpha)` | Print summary to stdout |\n| `to_dataframe(level)` | Convert to DataFrame ('event_study') |\n\n### TripleDifference\n\n```python\nTripleDifference(\n    estimation_method='dr',   # 'dr' (doubly robust), 'reg', or 'ipw'\n    robust=True,              # Use HC1 robust standard errors\n    cluster=None,             # Column for cluster-robust SEs\n    alpha=0.05,               # Significance level for CIs\n    pscore_trim=0.01          # Propensity score trimming threshold\n)\n```\n\n**fit() Parameters:**\n\n| Parameter | Type | Description |\n|-----------|------|-------------|\n| `data` | DataFrame | Input data |\n| `outcome` | str | Outcome variable column name |\n| `group` | str | Group indicator column (0/1): 1=treated group |\n| `partition` | str | Partition/eligibility indicator column (0/1): 1=eligible |\n| `time` | str | Time indicator column (0/1): 1=post-treatment |\n| `covariates` | list | Covariate column names for adjustment |\n\n### TripleDifferenceResults\n\n**Attributes:**\n\n| Attribute | Description |\n|-----------|-------------|\n| `att` | Average Treatment effect on the Treated |\n| `se` | Standard error of ATT |\n| `t_stat` | T-statistic |\n| `p_value` | P-value for H0: ATT = 0 |\n| `conf_int` | Tuple of (lower, upper) confidence bounds |\n| `n_obs` | Total number of observations |\n| `n_treated_eligible` | Obs in treated group \u0026 eligible partition |\n| `n_treated_ineligible` | Obs in treated group 
\u0026 ineligible partition |\n| `n_control_eligible` | Obs in control group \u0026 eligible partition |\n| `n_control_ineligible` | Obs in control group \u0026 ineligible partition |\n| `estimation_method` | Method used ('dr', 'reg', or 'ipw') |\n| `group_means` | Dict of cell means for diagnostics |\n| `pscore_stats` | Propensity score statistics (IPW/DR only) |\n| `is_significant` | Boolean for significance at alpha |\n| `significance_stars` | String of significance stars |\n\n**Methods:**\n\n| Method | Description |\n|--------|-------------|\n| `summary(alpha)` | Get formatted summary string |\n| `print_summary(alpha)` | Print summary to stdout |\n| `to_dict()` | Convert to dictionary |\n| `to_dataframe()` | Convert to pandas DataFrame |\n\n### HonestDiD\n\n```python\nHonestDiD(\n    method='relative_magnitude',  # 'relative_magnitude' or 'smoothness'\n    M=None,               # Restriction parameter (default: 1.0 for RM, 0.0 for SD)\n    alpha=0.05,           # Significance level for CIs\n    l_vec=None            # Linear combination vector for target parameter\n)\n```\n\n**fit() Parameters:**\n\n| Parameter | Type | Description |\n|-----------|------|-------------|\n| `results` | MultiPeriodDiDResults | Results from MultiPeriodDiD.fit() |\n| `M` | float | Restriction parameter (overrides constructor value) |\n\n**Methods:**\n\n| Method | Description |\n|--------|-------------|\n| `fit(results, M)` | Compute bounds for given event study results |\n| `sensitivity_analysis(results, M_grid)` | Compute bounds over grid of M values |\n| `breakdown_value(results, tol)` | Find smallest M where CI includes zero |\n\n### HonestDiDResults\n\n**Attributes:**\n\n| Attribute | Description |\n|-----------|-------------|\n| `original_estimate` | Point estimate under parallel trends |\n| `lb` | Lower bound of identified set |\n| `ub` | Upper bound of identified set |\n| `ci_lb` | Lower bound of robust confidence interval |\n| `ci_ub` | Upper bound of robust confidence 
interval |\n| `ci_width` | Width of robust CI |\n| `M` | Restriction parameter used |\n| `method` | Restriction method ('relative_magnitude' or ","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Figerber%2Fdiff-diff","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Figerber%2Fdiff-diff","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Figerber%2Fdiff-diff/lists"}