{"id":15026971,"url":"https://github.com/py-why/econml","last_synced_at":"2025-12-13T18:00:55.042Z","repository":{"id":37524049,"uuid":"131646318","full_name":"py-why/EconML","owner":"py-why","description":"ALICE (Automated Learning and Intelligence for Causation and Economics) is a Microsoft Research project aimed at applying Artificial Intelligence concepts to economic decision making. One of its  goals is to build a toolkit that combines state-of-the-art machine learning techniques with econometrics in order to bring automation to complex causal inference problems. To date, the ALICE Python SDK (econml) implements orthogonal machine learning algorithms such as the double machine learning work of Chernozhukov et al. This toolkit is designed to measure the causal effect of some treatment variable(s) t on an outcome variable y, controlling for a set of features x.","archived":false,"fork":false,"pushed_at":"2025-05-01T20:52:04.000Z","size":47688,"stargazers_count":4079,"open_issues_count":382,"forks_count":745,"subscribers_count":78,"default_branch":"main","last_synced_at":"2025-05-01T21:49:05.536Z","etag":null,"topics":["causal-inference","causality","econometrics","economics","machine-learning","treatment-effects"],"latest_commit_sha":null,"homepage":"https://www.microsoft.com/en-us/research/project/alice/","language":"Jupyter 
Notebook","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"other","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/py-why.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":"SECURITY.md","support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null,"zenodo":null}},"created_at":"2018-04-30T21:02:52.000Z","updated_at":"2025-05-01T09:45:44.000Z","dependencies_parsed_at":"2023-02-14T09:30:55.966Z","dependency_job_id":"bb3d8154-cf17-4299-9393-de39a3aea9df","html_url":"https://github.com/py-why/EconML","commit_stats":{"total_commits":429,"total_committers":39,"mean_commits":11.0,"dds":0.4312354312354313,"last_synced_commit":"ababb7ea30522585cb50da8106916518e22ca23f"},"previous_names":["microsoft/econml"],"tags_count":33,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/py-why%2FEconML","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/py-why%2FEconML/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/py-why%2FEconML/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/py-why%2FEconML/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/py-why","download_url":"https://codeload.github.com/py-why/EconML/tar.gz/refs/heads/main","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":252751838,"owners_count":21798695,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub
/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["causal-inference","causality","econometrics","economics","machine-learning","treatment-effects"],"created_at":"2024-09-24T20:05:31.771Z","updated_at":"2025-12-13T18:00:55.025Z","avatar_url":"https://github.com/py-why.png","language":"Jupyter Notebook","funding_links":[],"categories":[],"sub_categories":[],"readme":"[![Build status](https://github.com/py-why/EconML/actions/workflows/ci.yml/badge.svg)](https://github.com/py-why/EconML/actions/workflows/ci.yml)\n[![PyPI version](https://img.shields.io/pypi/v/econml.svg)](https://pypi.org/project/econml/)\n[![PyPI wheel](https://img.shields.io/pypi/wheel/econml.svg)](https://pypi.org/project/econml/)\n[![Supported Python versions](https://img.shields.io/pypi/pyversions/econml.svg)](https://pypi.org/project/econml/)\n\n\u003ch1\u003e\n\u003ca href=\"https://www.pywhy.org/EconML/\"\u003e\n\u003cimg src=\"doc/econml-logo-icon.png\" width=\"80px\" align=\"left\" style=\"margin-right: 10px;\", alt=\"econml-logo\"\u003e \n\u003c/a\u003e EconML: A Python Package for ML-Based Heterogeneous Treatment Effects Estimation\n\u003c/h1\u003e\n\n**EconML** is a Python package for estimating heterogeneous treatment effects from observational data via machine learning. This package was designed and built as part of the [ALICE project](https://www.microsoft.com/en-us/research/project/alice/) at Microsoft Research with the goal to combine state-of-the-art machine learning \ntechniques with econometrics to bring automation to complex causal inference problems. 
The promise of EconML:\n\n* Implement recent techniques in the literature at the intersection of econometrics and machine learning\n* Maintain flexibility in modeling the effect heterogeneity (via techniques such as random forests, boosting, lasso and neural nets), while preserving the causal interpretation of the learned model and often offering valid confidence intervals\n* Use a unified API\n* Build on standard Python packages for Machine Learning and Data Analysis\n\nOne of the biggest promises of machine learning is to automate decision making in a multitude of domains. At the core of many data-driven personalized decision scenarios is the estimation of heterogeneous treatment effects: what is the causal effect of an intervention on an outcome of interest for a sample with a particular set of features? In a nutshell, this toolkit is designed to measure the causal effect of some treatment variable(s) `T` on an outcome \nvariable `Y`, controlling for a set of features `X, W`, and to estimate how that effect varies as a function of `X`. The methods implemented are applicable even with observational (non-experimental or historical) datasets. For the estimation results to have a causal interpretation, some methods assume no unobserved confounders (i.e. there is no unobserved variable not included in `X, W` that simultaneously has an effect on both `T` and `Y`), while others assume access to an instrument `Z` (i.e. an observed variable `Z` that has an effect on the treatment `T` but no direct effect on the outcome `Y`). 
Most methods provide confidence intervals and inference results.\n\nFor detailed information about the package, consult the documentation at https://www.pywhy.org/EconML/.\n\nFor information on use cases and background material on causal inference and heterogeneous treatment effects see our webpage at https://www.microsoft.com/en-us/research/project/econml/\n\n\u003cdetails\u003e\n\u003csummary\u003e\u003cstrong\u003e\u003cem\u003eTable of Contents\u003c/em\u003e\u003c/strong\u003e\u003c/summary\u003e\n\n- [News](#news)\n- [Getting Started](#getting-started)\n  - [Installation](#installation)\n  - [Usage Examples](#usage-examples)\n    - [Estimation Methods](#estimation-methods)\n    - [Interpretability](#interpretability)\n    - [Causal Model Selection and Cross-Validation](#causal-model-selection-and-cross-validation)\n    - [Inference](#inference)\n    - [Policy Learning](#policy-learning)\n- [For Developers](#for-developers)\n  - [Running the tests](#running-the-tests)\n  - [Generating the documentation](#generating-the-documentation)\n- [Blogs and Publications](#blogs-and-publications)\n- [Citation](#citation)\n- [Contributing and Feedback](#contributing-and-feedback)\n- [Community](#community)\n- [References](#references)\n\n\u003c/details\u003e\n\n# News\n\nIf you'd like to contribute to this project, see the [Help Wanted](#finding-issues-to-help-with) section below.\n\n**July 10, 2025:** Release v0.16.0, see release notes [here](https://github.com/py-why/EconML/releases/tag/v0.16.0)\n\n\u003cdetails\u003e\u003csummary\u003ePrevious releases\u003c/summary\u003e\n\n**July 3, 2024:** Release v0.15.1, see release notes [here](https://github.com/py-why/EconML/releases/tag/v0.15.1)\n\n**February 12, 2024:** Release v0.15.0, see release notes [here](https://github.com/py-why/EconML/releases/tag/v0.15.0)\n\n**November 11, 2023:** Release v0.15.0b1, see release notes [here](https://github.com/py-why/EconML/releases/tag/v0.15.0b1)\n\n**May 19, 2023:** Release 
v0.14.1, see release notes [here](https://github.com/py-why/EconML/releases/tag/v0.14.1)\n\n**November 16, 2022:** Release v0.14.0, see release notes [here](https://github.com/py-why/EconML/releases/tag/v0.14.0)\n\n**June 17, 2022:** Release v0.13.1, see release notes [here](https://github.com/py-why/EconML/releases/tag/v0.13.1)\n\n**January 31, 2022:** Release v0.13.0, see release notes [here](https://github.com/py-why/EconML/releases/tag/v0.13.0)\n\n**August 13, 2021:** Release v0.12.0, see release notes [here](https://github.com/py-why/EconML/releases/tag/v0.12.0)\n\n**August 5, 2021:** Release v0.12.0b6, see release notes [here](https://github.com/py-why/EconML/releases/tag/v0.12.0b6)\n\n**August 3, 2021:** Release v0.12.0b5, see release notes [here](https://github.com/py-why/EconML/releases/tag/v0.12.0b5)\n\n**July 9, 2021:** Release v0.12.0b4, see release notes [here](https://github.com/py-why/EconML/releases/tag/v0.12.0b4)\n\n**June 25, 2021:** Release v0.12.0b3, see release notes [here](https://github.com/py-why/EconML/releases/tag/v0.12.0b3)\n\n**June 18, 2021:** Release v0.12.0b2, see release notes [here](https://github.com/py-why/EconML/releases/tag/v0.12.0b2)\n\n**June 7, 2021:** Release v0.12.0b1, see release notes [here](https://github.com/py-why/EconML/releases/tag/v0.12.0b1)\n\n**May 18, 2021:** Release v0.11.1, see release notes [here](https://github.com/py-why/EconML/releases/tag/v0.11.1)\n\n**May 8, 2021:** Release v0.11.0, see release notes [here](https://github.com/py-why/EconML/releases/tag/v0.11.0)\n\n**March 22, 2021:** Release v0.10.0, see release notes [here](https://github.com/py-why/EconML/releases/tag/v0.10.0)\n\n**March 11, 2021:** Release v0.9.2, see release notes [here](https://github.com/py-why/EconML/releases/tag/v0.9.2)\n\n**March 3, 2021:** Release v0.9.1, see release notes [here](https://github.com/py-why/EconML/releases/tag/v0.9.1)\n\n**February 20, 2021:** Release v0.9.0, see release notes 
[here](https://github.com/py-why/EconML/releases/tag/v0.9.0)\n\n**January 20, 2021:** Release v0.9.0b1, see release notes [here](https://github.com/py-why/EconML/releases/tag/v0.9.0b1)\n\n**November 20, 2020:** Release v0.8.1, see release notes [here](https://github.com/py-why/EconML/releases/tag/v0.8.1)\n\n**November 18, 2020:** Release v0.8.0, see release notes [here](https://github.com/py-why/EconML/releases/tag/v0.8.0)\n\n**September 4, 2020:** Release v0.8.0b1, see release notes [here](https://github.com/py-why/EconML/releases/tag/v0.8.0b1)\n\n**March 6, 2020:** Release v0.7.0, see release notes [here](https://github.com/py-why/EconML/releases/tag/v0.7.0)\n\n**February 18, 2020:** Release v0.7.0b1, see release notes [here](https://github.com/py-why/EconML/releases/tag/v0.7.0b1)\n\n**January 10, 2020:** Release v0.6.1, see release notes [here](https://github.com/py-why/EconML/releases/tag/v0.6.1)\n\n**December 6, 2019:** Release v0.6, see release notes [here](https://github.com/py-why/EconML/releases/tag/v0.6)\n\n**November 21, 2019:** Release v0.5, see release notes [here](https://github.com/py-why/EconML/releases/tag/v0.5). \n\n**June 3, 2019:** Release v0.4, see release notes [here](https://github.com/py-why/EconML/releases/tag/v0.4). 
\n\n**May 3, 2019:** Release v0.3, see release notes [here](https://github.com/py-why/EconML/releases/tag/v0.3).\n\n**April 10, 2019:** Release v0.2, see release notes [here](https://github.com/py-why/EconML/releases/tag/v0.2).\n\n**March 6, 2019:** Release v0.1, welcome to have a try and provide feedback.\n\n\u003c/details\u003e\n\n# Getting Started\n\n## Installation\n\nInstall the latest release from [PyPI](https://pypi.org/project/econml/):\n```\npip install econml\n```\nTo install from source, see [For Developers](#for-developers) section below.\n\n## Usage Examples\n### Estimation Methods\n\n\u003cdetails\u003e\n  \u003csummary\u003eDouble Machine Learning (aka RLearner) (click to expand)\u003c/summary\u003e\n\n  * Linear final stage\n\n  ```Python\n  from econml.dml import LinearDML\n  from sklearn.linear_model import LassoCV\n  from econml.inference import BootstrapInference\n\n  est = LinearDML(model_y=LassoCV(), model_t=LassoCV())\n  ### Estimate with OLS confidence intervals\n  est.fit(Y, T, X=X, W=W) # W -\u003e high-dimensional confounders, X -\u003e features\n  treatment_effects = est.effect(X_test)\n  lb, ub = est.effect_interval(X_test, alpha=0.05) # OLS confidence intervals\n\n  ### Estimate with bootstrap confidence intervals\n  est.fit(Y, T, X=X, W=W, inference='bootstrap')  # with default bootstrap parameters\n  est.fit(Y, T, X=X, W=W, inference=BootstrapInference(n_bootstrap_samples=100))  # or customized\n  lb, ub = est.effect_interval(X_test, alpha=0.05) # Bootstrap confidence intervals\n  ```\n\n  * Sparse linear final stage\n\n  ```Python\n  from econml.dml import SparseLinearDML\n  from sklearn.linear_model import LassoCV\n\n  est = SparseLinearDML(model_y=LassoCV(), model_t=LassoCV())\n  est.fit(Y, T, X=X, W=W) # X -\u003e high dimensional features\n  treatment_effects = est.effect(X_test)\n  lb, ub = est.effect_interval(X_test, alpha=0.05) # Confidence intervals via debiased lasso\n  ```\n\n  * Generic Machine Learning last stage\n  \n  
```Python\n  from econml.dml import NonParamDML\n  from sklearn.ensemble import RandomForestRegressor, RandomForestClassifier\n\n  est = NonParamDML(model_y=RandomForestRegressor(),\n                    model_t=RandomForestClassifier(),\n                    model_final=RandomForestRegressor(),\n                    discrete_treatment=True)\n  est.fit(Y, T, X=X, W=W) \n  treatment_effects = est.effect(X_test)\n  ```\n\n\u003c/details\u003e\n\n\u003cdetails\u003e\n  \u003csummary\u003eDynamic Double Machine Learning (click to expand)\u003c/summary\u003e\n\n  ```Python\n  from econml.panel.dml import DynamicDML\n  # Use defaults\n  est = DynamicDML()\n  # Or specify hyperparameters\n  est = DynamicDML(model_y=LassoCV(cv=3), \n                   model_t=LassoCV(cv=3), \n                   cv=3)\n  est.fit(Y, T, X=X, W=None, groups=groups, inference=\"auto\")\n  # Effects\n  treatment_effects = est.effect(X_test)\n  # Confidence intervals\n  lb, ub = est.effect_interval(X_test, alpha=0.05)\n  ```\n\u003c/details\u003e\n\n\u003cdetails\u003e\n  \u003csummary\u003eCausal Forests (click to expand)\u003c/summary\u003e\n\n  ```Python\n  from econml.dml import CausalForestDML\n  from sklearn.linear_model import LassoCV\n  # Use defaults\n  est = CausalForestDML()\n  # Or specify hyperparameters\n  est = CausalForestDML(criterion='het', n_estimators=500,       \n                        min_samples_leaf=10, \n                        max_depth=10, max_samples=0.5,\n                        discrete_treatment=False,\n                        model_t=LassoCV(), model_y=LassoCV())\n  est.fit(Y, T, X=X, W=W)\n  treatment_effects = est.effect(X_test)\n  # Confidence intervals via Bootstrap-of-Little-Bags for forests\n  lb, ub = est.effect_interval(X_test, alpha=0.05)\n  ```\n\u003c/details\u003e\n\n\n\u003cdetails\u003e\n  \u003csummary\u003eOrthogonal Random Forests (click to expand)\u003c/summary\u003e\n\n  ```Python\n  from econml.orf import DMLOrthoForest, DROrthoForest\n  from 
econml.sklearn_extensions.linear_model import WeightedLasso, WeightedLassoCV\n  # Use defaults\n  est = DMLOrthoForest()\n  est = DROrthoForest()\n  # Or specify hyperparameters\n  est = DMLOrthoForest(n_trees=500, min_leaf_size=10,\n                       max_depth=10, subsample_ratio=0.7,\n                       lambda_reg=0.01,\n                       discrete_treatment=False,\n                       model_T=WeightedLasso(alpha=0.01), model_Y=WeightedLasso(alpha=0.01),\n                       model_T_final=WeightedLassoCV(cv=3), model_Y_final=WeightedLassoCV(cv=3))\n  est.fit(Y, T, X=X, W=W)\n  treatment_effects = est.effect(X_test)\n  # Confidence intervals via Bootstrap-of-Little-Bags for forests\n  lb, ub = est.effect_interval(X_test, alpha=0.05)\n  ```\n\u003c/details\u003e\n\n\u003cdetails\u003e\n\n\u003csummary\u003eMeta-Learners (click to expand)\u003c/summary\u003e\n  \n  * XLearner\n\n  ```Python\n  from econml.metalearners import XLearner\n  from sklearn.ensemble import GradientBoostingClassifier, GradientBoostingRegressor\n\n  est = XLearner(models=GradientBoostingRegressor(),\n                propensity_model=GradientBoostingClassifier(),\n                cate_models=GradientBoostingRegressor())\n  est.fit(Y, T, X=np.hstack([X, W]))\n  treatment_effects = est.effect(np.hstack([X_test, W_test]))\n\n  # Fit with bootstrap confidence interval construction enabled\n  est.fit(Y, T, X=np.hstack([X, W]), inference='bootstrap')\n  treatment_effects = est.effect(np.hstack([X_test, W_test]))\n  lb, ub = est.effect_interval(np.hstack([X_test, W_test]), alpha=0.05) # Bootstrap CIs\n  ```\n  \n  * SLearner\n\n  ```Python\n  from econml.metalearners import SLearner\n  from sklearn.ensemble import GradientBoostingRegressor\n\n  est = SLearner(overall_model=GradientBoostingRegressor())\n  est.fit(Y, T, X=np.hstack([X, W]))\n  treatment_effects = est.effect(np.hstack([X_test, W_test]))\n  ```\n\n  * TLearner\n\n  ```Python\n  from econml.metalearners import 
TLearner\n  from sklearn.ensemble import GradientBoostingRegressor\n\n  est = TLearner(models=GradientBoostingRegressor())\n  est.fit(Y, T, X=np.hstack([X, W]))\n  treatment_effects = est.effect(np.hstack([X_test, W_test]))\n  ```\n\u003c/details\u003e\n\n\u003cdetails\u003e\n\u003csummary\u003eDoubly Robust Learners (click to expand)\n\u003c/summary\u003e\n\n* Linear final stage\n\n```Python\nfrom econml.dr import LinearDRLearner\nfrom sklearn.ensemble import GradientBoostingRegressor, GradientBoostingClassifier\n\nest = LinearDRLearner(model_propensity=GradientBoostingClassifier(),\n                      model_regression=GradientBoostingRegressor())\nest.fit(Y, T, X=X, W=W)\ntreatment_effects = est.effect(X_test)\nlb, ub = est.effect_interval(X_test, alpha=0.05)\n```\n\n* Sparse linear final stage\n\n```Python\nfrom econml.dr import SparseLinearDRLearner\nfrom sklearn.ensemble import GradientBoostingRegressor, GradientBoostingClassifier\n\nest = SparseLinearDRLearner(model_propensity=GradientBoostingClassifier(),\n                            model_regression=GradientBoostingRegressor())\nest.fit(Y, T, X=X, W=W)\ntreatment_effects = est.effect(X_test)\nlb, ub = est.effect_interval(X_test, alpha=0.05)\n```\n\n* Nonparametric final stage\n\n```Python\nfrom econml.dr import ForestDRLearner\nfrom sklearn.ensemble import GradientBoostingRegressor, GradientBoostingClassifier\n\nest = ForestDRLearner(model_propensity=GradientBoostingClassifier(),\n                      model_regression=GradientBoostingRegressor())\nest.fit(Y, T, X=X, W=W) \ntreatment_effects = est.effect(X_test)\nlb, ub = est.effect_interval(X_test, alpha=0.05)\n```\n\u003c/details\u003e\n\n\u003cdetails\u003e\n\u003csummary\u003eDouble Machine Learning with Instrumental Variables (click to expand)\u003c/summary\u003e\n\n* Orthogonal instrumental variable learner\n\n```Python\nfrom econml.iv.dml import OrthoIV\n\nest = OrthoIV(projection=False, \n              discrete_treatment=True, \n              
discrete_instrument=True)\nest.fit(Y, T, Z=Z, X=X, W=W)\ntreatment_effects = est.effect(X_test)\nlb, ub = est.effect_interval(X_test, alpha=0.05) # OLS confidence intervals\n```\n* Nonparametric double machine learning with instrumental variable\n\n```Python\nfrom econml.iv.dml import NonParamDMLIV\n\nest = NonParamDMLIV(discrete_treatment=True, \n                    discrete_instrument=True,\n                    model_final=RandomForestRegressor())\nest.fit(Y, T, Z=Z, X=X, W=W) # no analytical confidence interval available\ntreatment_effects = est.effect(X_test)\n```\n\u003c/details\u003e\n\n\u003cdetails\u003e\n\u003csummary\u003eDoubly Robust Machine Learning with Instrumental Variables (click to expand)\u003c/summary\u003e\n\n* Linear final stage\n```Python\nfrom econml.iv.dr import LinearDRIV\n\nest = LinearDRIV(discrete_instrument=True, discrete_treatment=True)\nest.fit(Y, T, Z=Z, X=X, W=W)\ntreatment_effects = est.effect(X_test)\nlb, ub = est.effect_interval(X_test, alpha=0.05) # OLS confidence intervals\n```\n\n* Sparse linear final stage\n\n```Python\nfrom econml.iv.dr import SparseLinearDRIV\n\nest = SparseLinearDRIV(discrete_instrument=True, discrete_treatment=True)\nest.fit(Y, T, Z=Z, X=X, W=W)\ntreatment_effects = est.effect(X_test)\nlb, ub = est.effect_interval(X_test, alpha=0.05) # Debiased lasso confidence intervals\n```\n\n* Nonparametric final stage\n```Python\nfrom econml.iv.dr import ForestDRIV\n\nest = ForestDRIV(discrete_instrument=True, discrete_treatment=True)\nest.fit(Y, T, Z=Z, X=X, W=W)\ntreatment_effects = est.effect(X_test)\n# Confidence intervals via Bootstrap-of-Little-Bags for forests\nlb, ub = est.effect_interval(X_test, alpha=0.05) \n```\n\n* Linear intent-to-treat (discrete instrument, discrete treatment)\n\n```Python\nfrom econml.iv.dr import LinearIntentToTreatDRIV\nfrom sklearn.ensemble import GradientBoostingRegressor, GradientBoostingClassifier\n\nest = LinearIntentToTreatDRIV(model_y_xw=GradientBoostingRegressor(),\n         
                     model_t_xwz=GradientBoostingClassifier(),\n                              flexible_model_effect=GradientBoostingRegressor())\nest.fit(Y, T, Z=Z, X=X, W=W)\ntreatment_effects = est.effect(X_test)\nlb, ub = est.effect_interval(X_test, alpha=0.05) # OLS confidence intervals\n```\n\u003c/details\u003e\n\nSee the \u003ca href=\"#references\"\u003eReferences\u003c/a\u003e section for more details.\n\n### Interpretability\n\u003cdetails\u003e\n  \u003csummary\u003eTree Interpreter of the CATE model (click to expand)\u003c/summary\u003e\n  \n  ```Python\n  from econml.cate_interpreter import SingleTreeCateInterpreter\n  intrp = SingleTreeCateInterpreter(include_model_uncertainty=True, max_depth=2, min_samples_leaf=10)\n  # We interpret the CATE model's behavior based on the features used for heterogeneity\n  intrp.interpret(est, X)\n  # Plot the tree\n  plt.figure(figsize=(25, 5))\n  intrp.plot(feature_names=['A', 'B', 'C', 'D'], fontsize=12)\n  plt.show()\n  ```\n  ![image](notebooks/images/dr_cate_tree.png)\n  \n\u003c/details\u003e\n\n\u003cdetails\u003e\n  \u003csummary\u003ePolicy Interpreter of the CATE model (click to expand)\u003c/summary\u003e\n\n  ```Python\n  from econml.cate_interpreter import SingleTreePolicyInterpreter\n  # We find a tree-based treatment policy based on the CATE model\n  intrp = SingleTreePolicyInterpreter(risk_level=0.05, max_depth=2, min_samples_leaf=1,min_impurity_decrease=.001)\n  intrp.interpret(est, X, sample_treatment_costs=0.2)\n  # Plot the tree\n  plt.figure(figsize=(25, 5))\n  intrp.plot(feature_names=['A', 'B', 'C', 'D'], fontsize=12)\n  plt.show()\n  ```\n  ![image](notebooks/images/dr_policy_tree.png)\n\n\u003c/details\u003e\n\n\u003cdetails\u003e\n  \u003csummary\u003eSHAP values for the CATE model (click to expand)\u003c/summary\u003e\n\n  ```Python\n  import shap\n  from econml.dml import CausalForestDML\n  est = CausalForestDML()\n  est.fit(Y, T, X=X, W=W)\n  shap_values = est.shap_values(X)\n  
shap.summary_plot(shap_values['Y0']['T0'])\n  ```\n\n\u003c/details\u003e\n\n\n### Causal Model Selection and Cross-Validation\n\n\n\u003cdetails\u003e\n  \u003csummary\u003eCausal model selection with the `RScorer` (click to expand)\u003c/summary\u003e\n\n  ```Python\n  from econml.score import RScorer\n\n  # split data in train-validation\n  X_train, X_val, T_train, T_val, Y_train, Y_val = train_test_split(X, T, y, test_size=.4)\n\n  # define list of CATE estimators to select among\n  reg = lambda: RandomForestRegressor(min_samples_leaf=20)\n  clf = lambda: RandomForestClassifier(min_samples_leaf=20)\n  models = [('ldml', LinearDML(model_y=reg(), model_t=clf(), discrete_treatment=True,\n                               cv=3)),\n            ('xlearner', XLearner(models=reg(), cate_models=reg(), propensity_model=clf())),\n            ('dalearner', DomainAdaptationLearner(models=reg(), final_models=reg(), propensity_model=clf())),\n            ('slearner', SLearner(overall_model=reg())),\n            ('drlearner', DRLearner(model_propensity=clf(), model_regression=reg(),\n                                    model_final=reg(), cv=3)),\n            ('rlearner', NonParamDML(model_y=reg(), model_t=clf(), model_final=reg(),\n                                     discrete_treatment=True, cv=3)),\n            ('dml3dlasso', DML(model_y=reg(), model_t=clf(),\n                               model_final=LassoCV(cv=3, fit_intercept=False),\n                               discrete_treatment=True,\n                               featurizer=PolynomialFeatures(degree=3),\n                               cv=3))\n  ]\n\n  # fit cate models on train data\n  models = [(name, mdl.fit(Y_train, T_train, X=X_train)) for name, mdl in models]\n\n  # score cate models on validation data\n  scorer = RScorer(model_y=reg(), model_t=clf(),\n                   discrete_treatment=True, cv=3, mc_iters=2, mc_agg='median')\n  scorer.fit(Y_val, T_val, X=X_val)\n  rscore = [scorer.score(mdl) for _, mdl in 
models]\n  # select the best model\n  mdl, _ = scorer.best_model([mdl for _, mdl in models])\n  # create weighted ensemble model based on score performance\n  mdl, _ = scorer.ensemble([mdl for _, mdl in models])\n  ```\n\n\u003c/details\u003e\n\n\u003cdetails\u003e\n  \u003csummary\u003eFirst Stage Model Selection (click to expand)\u003c/summary\u003e\n\nEconML's cross-fitting estimators provide built-in functionality for first-stage model selection.  This support can work with existing sklearn model selection classes such as `LassoCV` or `GridSearchCV`, or you can pass a list of models to choose the best from among them when cross-fitting.\n\n```Python\nfrom econml.dml import LinearDML\nfrom sklearn import clone\nfrom sklearn.ensemble import RandomForestRegressor\nfrom sklearn.linear_model import LassoCV\nfrom sklearn.model_selection import GridSearchCV\n\ncv_model = GridSearchCV(\n              estimator=RandomForestRegressor(),\n              param_grid={\n                  \"max_depth\": [3, None],\n                  \"n_estimators\": (10, 30, 50, 100, 200),\n                  \"max_features\": (2, 4, 6),\n              },\n              cv=5,\n           )\n\nest = LinearDML(model_y=cv_model, # use sklearn's grid search to select the best Y model \n                model_t=[RandomForestRegressor(), LassoCV()]) # use built-in model selection to choose between forest and linear models for T model\n```\n\n\n\u003c/details\u003e\n\n### Inference\n\nWhenever inference is enabled, one can get a more structured `InferenceResults` object with more detailed inference information, such\nas p-values and z-statistics. When the CATE model is linear and parametric, a `summary()` method is also available. 
For instance:\n\n  ```Python\n  from econml.dml import LinearDML\n  # Use defaults\n  est = LinearDML()\n  est.fit(Y, T, X=X, W=W)\n  # Get the effect inference summary, which includes the standard error, z test score, p value, and confidence interval given each sample X[i]\n  est.effect_inference(X_test).summary_frame(alpha=0.05, value=0, decimals=3)\n  # Get the population summary for the entire sample X\n  est.effect_inference(X_test).population_summary(alpha=0.1, value=0, decimals=3, tol=0.001)\n  #  Get the parameter inference summary for the final model\n  est.summary()\n  ```\n  \n  \u003cdetails\u003e\u003csummary\u003eExample Output (click to expand)\u003c/summary\u003e\n  \n  ```Python\n  # Get the effect inference summary, which includes the standard error, z test score, p value, and confidence interval given each sample X[i]\n  est.effect_inference(X_test).summary_frame(alpha=0.05, value=0, decimals=3)\n  ```\n  ![image](notebooks/images/summary_frame.png)\n  \n  ```Python\n  # Get the population summary for the entire sample X\n  est.effect_inference(X_test).population_summary(alpha=0.1, value=0, decimals=3, tol=0.001)\n  ```\n  ![image](notebooks/images/population_summary.png)\n  \n  ```Python\n  #  Get the parameter inference summary for the final model\n  est.summary()\n  ```\n  ![image](notebooks/images/summary.png)\n  \n  \u003c/details\u003e\n  \n\n### Policy Learning\n\nYou can also perform direct policy learning from observational data, using the doubly robust method for offline\npolicy learning. 
These methods directly predict a recommended treatment, without internally fitting an explicit\nmodel of the conditional average treatment effect.\n\n\u003cdetails\u003e\n  \u003csummary\u003eDoubly Robust Policy Learning (click to expand)\u003c/summary\u003e\n\n```Python\nimport matplotlib.pyplot as plt\nfrom econml.policy import DRPolicyTree, DRPolicyForest\nfrom sklearn.ensemble import RandomForestRegressor\n\n# fit a single binary decision tree policy\npolicy = DRPolicyTree(max_depth=1, min_impurity_decrease=0.01, honest=True)\npolicy.fit(Y, T, X=X, W=W)\n# predict the recommended treatment\nrecommended_T = policy.predict(X)\n# plot the binary decision tree\nplt.figure(figsize=(10,5))\npolicy.plot()\n# get feature importances\nimportances = policy.feature_importances_\n\n# fit a binary decision forest\npolicy = DRPolicyForest(max_depth=1, min_impurity_decrease=0.01, honest=True)\npolicy.fit(Y, T, X=X, W=W)\n# predict the recommended treatment\nrecommended_T = policy.predict(X)\n# plot the first tree in the ensemble\nplt.figure(figsize=(10,5))\npolicy.plot(0)\n# get feature importances\nimportances = policy.feature_importances_\n```\n\n\n  ![image](images/policy_tree.png)\n\u003c/details\u003e\n\nTo see more complex examples, go to the [notebooks](https://github.com/py-why/EconML/tree/main/notebooks) section of the repository. For a more detailed description of the treatment effect estimation algorithms, see the EconML [documentation](https://www.pywhy.org/EconML/).\n\n# For Developers\n\nYou can get started by cloning this repository. We use \n[setuptools](https://setuptools.readthedocs.io/en/latest/index.html) for building and distributing our package.\nWe rely on some recent features of setuptools, so make sure to upgrade to a recent version with\n`pip install setuptools --upgrade`.  
Then from your local copy of the repository you can run `pip install -e .` to get started (but depending on what you're doing you might want to install with extras instead, like `pip install -e .[plt]` if you want to use matplotlib integration, or you can use  `pip install -e .[all]` to include all extras).\n\n## Pre-commit hooks\n\nWe use the [pre-commit](https://pre-commit.com/) framework to enforce code style and run checks before every commit. To install the pre-commit hooks, make sure you have pre-commit installed (`pip install pre-commit`) and then run `pre-commit install` in the root of the repository. This will install the hooks and run them automatically before every commit. If you want to run the hooks manually, you can run `pre-commit run --all-files`.\n\n## Finding issues to help with\n\nIf you're looking to contribute to the project, we have a number of issues tagged with the [`up for grabs`](https://github.com/py-why/EconML/issues?q=is%3Aopen+is%3Aissue+label%3A%22up+for+grabs%22) and [`help wanted`](https://github.com/py-why/EconML/issues?q=is%3Aopen+is%3Aissue+label%3A%22help+wanted%22) labels. \"Up for grabs\" issues are ones that we think that people without a lot of experience in our codebase may be able to help with, while \"Help wanted\" issues are valuable improvements to the library that our team currently does not have time to prioritize where we would greatly appreciate community-initiated PRs, but which might be more involved.\n\n## Running the tests\n\nThis project uses [pytest](https://docs.pytest.org/) to run tests for continuous integration.  It is also possible to use `pytest` to run tests locally, but this isn't recommended because it will take an extremely long time and some tests are specific to certain environments or scenarios that have additional dependencies.  
However, if you'd like to do this anyway, to run all tests locally after installing the package you can use `pip install pytest pytest-xdist pytest-cov coverage[toml]` (as well as `pip install jupyter jupyter-client nbconvert nbformat seaborn xgboost tqdm` for the dependencies to run all of our notebooks as tests) followed by `python -m pytest`.\n\nBecause running all tests can be very time-consuming, we recommend running only the relevant subset of tests when developing locally.  The easiest way to do this is to rely on `pytest`'s compatibility with `unittest`, so you can just run `python -m unittest econml.tests.test_module` to run all tests in a given module, or `python -m unittest econml.tests.test_module.TestClass` to run all tests in a given class.  You can also run `python -m unittest econml.tests.test_module.TestClass.test_method` to run a single test method.\n\n## Generating the documentation\n\nThis project's documentation is generated via [Sphinx](https://www.sphinx-doc.org/en/main/index.html).  Note that we use [graphviz](https://graphviz.org/)'s \n`dot` application to produce some of the images in our documentation, so you should make sure that `dot` is installed and in your path.\n\nTo generate a local copy of the documentation from a clone of this repository, just run `python setup.py build_sphinx -W -E -a`, which will build the documentation and place it under the `build/sphinx/html` path. \n\nThe reStructuredText files that make up the documentation are stored in the [docs directory](https://github.com/py-why/EconML/tree/main/doc); module documentation is automatically generated by the Sphinx build process.\n\n## Release process\n\nWe use GitHub Actions to build and publish the package and documentation.  To create a new release, an admin should perform the following steps:\n\n1. Update the version number in `econml/_version.py` and add a mention of the new version in the news section of this file and commit the changes.\n2. 
Manually run the publish_package.yml workflow to build and publish the package to PyPI.\n3. Manually run the publish_docs.yml workflow to build and publish the documentation.\n4. Under https://github.com/py-why/EconML/releases, create a new release with a corresponding tag, and update the release notes.\n\n# Blogs and Publications\n\n* May 2021: [Be Careful When Interpreting Predictive Models in Search of Causal Insights](https://towardsdatascience.com/be-careful-when-interpreting-predictive-models-in-search-of-causal-insights-e68626e664b6)\n\n* June 2019: [Treatment Effects with Instruments paper](https://arxiv.org/pdf/1905.10176.pdf)\n\n* May 2019: [Open Data Science Conference Workshop](https://odsc.com/speakers/machine-learning-estimation-of-heterogeneous-treatment-effect-the-microsoft-econml-library/) \n\n* 2018: [Orthogonal Random Forests paper](http://proceedings.mlr.press/v97/oprescu19a.html)\n\n* 2017: [DeepIV paper](http://proceedings.mlr.press/v70/hartford17a/hartford17a.pdf)\n\n# Citation\n\nIf you use EconML in your research, please cite us as follows:\n\n   Keith Battocchi, Eleanor Dillon, Maggie Hei, Greg Lewis, Paul Oka, Miruna Oprescu, Vasilis Syrgkanis. **EconML: A Python Package for ML-Based Heterogeneous Treatment Effects Estimation.** https://github.com/py-why/EconML, 2019. Version 0.x.\n\nBibTex:\n\n```\n@misc{econml,\n  author={Keith Battocchi, Eleanor Dillon, Maggie Hei, Greg Lewis, Paul Oka, Miruna Oprescu, Vasilis Syrgkanis},\n  title={{EconML}: {A Python Package for ML-Based Heterogeneous Treatment Effects Estimation}},\n  howpublished={https://github.com/py-why/EconML},\n  note={Version 0.x},\n  year={2019}\n}\n```\n\n# Contributing and Feedback\n\nThis project welcomes contributions and suggestions.  We use the [DCO bot](https://github.com/apps/dco) to enforce a [Developer Certificate of Origin](https://developercertificate.org/) which requires users to sign-off on their commits.  
This is a simple way to certify that you wrote or otherwise have the right to submit the code you are contributing to the project.  Git provides a `-s` command line option to include this automatically when you commit via `git commit`.\n\nIf you forget to sign one of your commits, the DCO bot will provide specific instructions along with the failed check; alternatively, you can use `git commit --amend -s` to add the sign-off to your last commit or `git rebase --signoff` to sign all of the commits in the branch, after which you can force push the changes to your branch with `git push --force-with-lease`.\n\nThis project has adopted the [PyWhy Code of Conduct](https://github.com/py-why/governance/blob/main/CODE-OF-CONDUCT.md).\n\n# Community\n\n\u003ca href=\"https://pywhy.org/\"\u003e\n\u003cimg src=\"doc/spec/img/pywhy-logo.png\" width=\"80px\" align=\"left\" style=\"margin-right: 10px;\" alt=\"pywhy-logo\"\u003e\n\u003c/a\u003e\n\nEconML is a part of [PyWhy](https://www.pywhy.org/), an organization with a mission to build an open-source ecosystem for causal machine learning.\n\nPyWhy also has a [Discord](https://discord.gg/cSBGb3vsZb), which serves as a space for like-minded causal machine learning researchers and practitioners of all experience levels to come together to ask and answer questions, discuss new features, and share ideas.\n\nWe invite you to join us at regular office hours and community calls in the Discord.\n\n# References\n\nAthey, Susan, and Stefan Wager.\n**Policy learning with observational data.**\n[*Econometrica 89.1, 133-161*](https://doi.org/10.3982/ECTA15732), 2021.\n\nX Nie, S Wager.\n**Quasi-Oracle Estimation of Heterogeneous Treatment Effects.**\n[*Biometrika 108.2, 299-319*](https://doi.org/10.1093/biomet/asaa076), 2021.\n\nV. Syrgkanis, V. Lei, M. Oprescu, M. Hei, K. Battocchi, G. 
Lewis.\n**Machine Learning Estimation of Heterogeneous Treatment Effects with Instruments.**\n[*Proceedings of the 33rd Conference on Neural Information Processing Systems (NeurIPS)*](https://arxiv.org/abs/1905.10176), 2019.\n**(Spotlight Presentation)**\n\nD. Foster, V. Syrgkanis.\n**Orthogonal Statistical Learning.**\n[*Proceedings of the 32nd Annual Conference on Learning Theory (COLT)*](https://arxiv.org/pdf/1901.09036.pdf), 2019.\n**(Best Paper Award)**\n\nM. Oprescu, V. Syrgkanis and Z. S. Wu.\n**Orthogonal Random Forest for Causal Inference.**\n[*Proceedings of the 36th International Conference on Machine Learning (ICML)*](http://proceedings.mlr.press/v97/oprescu19a.html), 2019.\n\nS. Künzel, J. Sekhon, J. Bickel and B. Yu.\n**Metalearners for estimating heterogeneous treatment effects using machine learning.**\n[*Proceedings of the National Academy of Sciences, 116(10), 4156-4165*](https://www.pnas.org/content/116/10/4156), 2019.\n\nS. Athey, J. Tibshirani, S. Wager.\n**Generalized random forests.**\n[*Annals of Statistics, 47, no. 2, 1148--1178*](https://projecteuclid.org/euclid.aos/1547197251), 2019.\n\nV. Chernozhukov, D. Nekipelov, V. Semenova, V. Syrgkanis.\n**Plug-in Regularized Estimation of High-Dimensional Parameters in Nonlinear Semiparametric Models.**\n[*Arxiv preprint arxiv:1806.04823*](https://arxiv.org/abs/1806.04823), 2018.\n\nS. Wager, S. Athey.\n**Estimation and Inference of Heterogeneous Treatment Effects using Random Forests.**\n[*Journal of the American Statistical Association, 113:523, 1228-1242*](https://www.tandfonline.com/doi/citedby/10.1080/01621459.2017.1319839), 2018.\n\nJason Hartford, Greg Lewis, Kevin Leyton-Brown, and Matt Taddy. **Deep IV: A flexible approach for counterfactual prediction.** [*Proceedings of the 34th International Conference on Machine Learning, ICML'17*](http://proceedings.mlr.press/v70/hartford17a/hartford17a.pdf), 2017.\n\nV. Chernozhukov, D. Chetverikov, M. Demirer, E. Duflo, C. Hansen, and W. Newey. 
**Double Machine Learning for Treatment and Causal Parameters.** [*ArXiv preprint arXiv:1608.00060*](https://arxiv.org/abs/1608.00060), 2016.\n\nDudik, M., Erhan, D., Langford, J., \u0026 Li, L.\n**Doubly robust policy evaluation and optimization.**\n[*Statistical Science, 29(4), 485-511*](https://projecteuclid.org/journals/statistical-science/volume-29/issue-4/Doubly-Robust-Policy-Evaluation-and-Optimization/10.1214/14-STS500.full), 2014.\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fpy-why%2Feconml","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fpy-why%2Feconml","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fpy-why%2Feconml/lists"}