Ecosyste.ms: Awesome
An open API service indexing awesome lists of open source software.
https://github.com/soda-inria/caussim
Simulations for predictive model selection in causal inference
- Host: GitHub
- URL: https://github.com/soda-inria/caussim
- Owner: soda-inria
- License: BSD-3-Clause
- Created: 2023-01-05T07:52:12.000Z (almost 2 years ago)
- Default Branch: main
- Last Pushed: 2024-09-15T18:34:53.000Z (3 months ago)
- Last Synced: 2024-09-16T18:03:51.100Z (3 months ago)
- Language: Python
- Size: 27 MB
- Stars: 13
- Watchers: 2
- Forks: 1
- Open Issues: 0
Metadata Files:
- Readme: README.md
- License: LICENSE.txt
Awesome Lists containing this project
README
How to select predictive models for causal inference
=====================================================

Overview
--------
This package contains simulations for causal inference, estimators for the ATE
and the CATE, as well as the code for the experiments described in the paper
"How to select predictive models for causal inference?".

Package Features
----------------

The package code is contained in [caussim](caussim/):
- `estimation` contains CATE and ATE estimators usable with any scikit-learn compatible base estimator, together with meta-learners such as TLearner, SLearner, or RLearner (a generic sketch follows this list).
- `simulations` contains simulations with basis expansions (Nystroem and splines are available).
- `experiences` is used to run extensive evaluations of causal metrics on ACIC 2016 and handcrafted simulations.
- `reports` contains the scripts used to derive the figures and tables presented in the paper. The main results are obtained by launching the `causal_scores_evaluation.py` report script (see the Reports section below).
- `utils.py` contains plotting utilities.
- `pdistances` contains naive implementations of the MMD, Total Variation, and Jensen-Shannon divergences used to measure population overlap.
- `demos` contains notebooks used to create the toy example and the risk maps for the 2D simulations.
- `data` contains utilities to load the semi-simulated datasets (ACIC 2016, ACIC 2018, TWINS). A dedicated [README](data/README.md) is available in the root data folder.
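To give an idea of what a meta-learner does, the T-learner fits one outcome model per treatment arm and takes the difference of their predictions as the CATE. Below is a minimal sketch in plain scikit-learn; it only illustrates the meta-learner idea, and caussim's actual class names and signatures may differ:

```python
# A minimal T-learner sketch in plain scikit-learn (not caussim's API).
import numpy as np
from sklearn.base import clone
from sklearn.ensemble import GradientBoostingRegressor

rng = np.random.default_rng(0)

# Toy data: covariates X, binary treatment a, outcome y with a true effect of +1.
X = rng.normal(size=(500, 3))
a = rng.binomial(1, 0.5, size=500)
y = X[:, 0] + a * 1.0 + rng.normal(scale=0.1, size=500)

base = GradientBoostingRegressor()
mu_0 = clone(base).fit(X[a == 0], y[a == 0])  # outcome model on controls
mu_1 = clone(base).fit(X[a == 1], y[a == 1])  # outcome model on treated

cate = mu_1.predict(X) - mu_0.predict(X)  # per-unit CATE estimate
ate = cate.mean()                         # ATE as the average CATE
print(f"estimated ATE = {ate:.2f}")       # close to the true effect of 1
```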
Experiences
-----------

Experience outputs are mainly CSVs (one for each sampled dataset). To launch an experience, run `python scripts/experiences/` and it should output the CSVs in a dedicated folder in the corresponding subfolder `data/experiences//`.
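Since each experience writes one CSV per sampled dataset, downstream analysis typically starts by concatenating them. A hedged sketch: the `data/experiences/` layout is taken from the description above, but the experience subfolder name is an assumption to adapt to the script you actually ran (pandas comes in as a dependency of seaborn):

```python
# Concatenate the per-dataset CSVs produced by one experience run.
from pathlib import Path
import pandas as pd

experience_dir = Path("data/experiences/causal_scores_evaluation")  # assumed name
frames = [pd.read_csv(f) for f in sorted(experience_dir.glob("*.csv"))]
results = pd.concat(frames, ignore_index=True)
print(results.shape)
```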
**🔎 To replicate the main experience of the paper (Section 5)**, launch the script
[scripts/experiences/causal_scores_evaluation.py](scripts/experiences/causal_scores_evaluation.py).
Make sure that the configuration for the datasets at the beginning of the file
is:

```python
from caussim.experiences.base_config import DATASET_GRID_FULL_EXPES
DATASET_GRID = DATASET_GRID_FULL_EXPES
```

📢 Note that the results of Section 5 are already provided in the Zenodo archive [`experiences.zip`](https://zenodo.org/records/13765465?token=eyJhbGciOiJIUzUxMiJ9.eyJpZCI6ImJmNTFlOWNjLTUxOTYtNGFjNS04YjVjLTIyZWFjMmNhZjQyMyIsImRhdGEiOnt9LCJyYW5kb20iOiJlOTZjZGE4ZmQzNDFkMWUxNTJhYzI0YWI1ZjUxNGViMyJ9.vPuJgBw0A0w02InS9ovWRShKUGTDk4w6k2uwYBZklRiC-p7hlVvZOOyvpg6wsJ6T5MBW30vUCsL_UdBSCmmFMw).
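If you start from the Zenodo archive instead of re-running the experiences, something like the following unpacks it in place. This is a sketch under the assumption that the archive has been downloaded to `data/experiences.zip` (the path referenced below) and unpacks into `data/experiences/`:

```python
# Unpack the downloaded Zenodo archive into the expected data/ layout.
import zipfile

with zipfile.ZipFile("data/experiences.zip") as archive:
    archive.extractall("data/")
```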
Reports
-------

Report outputs are mainly the figures for the paper. To obtain the results, run `pytest scripts/reports/` and it should output the figures in one or several corresponding folders in `figures/`.
The main report type is a pytest function contained in the `reports/causal_scores_evaluation.py` script. For each macro-dataset, it plots the results of running a given set of candidate estimators with a fixed nuisance estimator on several generation processes of the macro-dataset (often hundreds of sampled datasets).
**🔎 To replicate the main figure of the paper (Figure 3)**, launch the script
[scripts/reports/_1_r_risk_domination.py](scripts/reports/_1_r_risk_domination.py).
It should take some time because of the high number of simulation results. Make
sure that the appropriate experience results exist. The ones used in the paper
are provided in [`experiences.zip`](data/experiences.zip).

```shell
pytest scripts/reports/causal_scores_evaluation.py
```

Installation
============

We recommend the use of poetry and Python >= 3.9 to manage dependencies.
You can install caussim via
[poetry](https://python-poetry.org/):
```shell
poetry install
```

or via [pip](https://pip.pypa.io/). In this case you also need to install the dependencies listed in the `pyproject.toml`:
```shell
pip install caussim
```

Dependencies
------------

```toml
python = ">=3.9, <3.11"
python-dotenv = "^0.15.0"
click = "^8.0.1"
yapf = "^0.31.0"
matplotlib = "^3.4.2"
numpy = "^1.20.3"
seaborn = "^0.11.1"
jupytext = "^1.11.5"
rope = "^0.19.0"
scikit-learn = "^1.0"
jedi = "^0.18.0"
tqdm = "^4.62.3"
tabulate = "^0.8.9"
statsmodels = "^0.13.1"
pyarrow = "^6.0.1"
submitit = "^1.4.1"
rpy2 = "^3.4.5"
moepy = "^1.1.4"
```
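As a quick sanity check after installation, the core scientific dependencies can be verified to resolve; a small snippet, with the package names taken from the list above:

```python
# Print the installed versions of the core dependencies.
from importlib.metadata import version

for pkg in ["numpy", "scikit-learn", "matplotlib", "seaborn", "statsmodels"]:
    print(f"{pkg}=={version(pkg)}")
```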