Ecosyste.ms: Awesome
An open API service indexing awesome lists of open source software.
https://github.com/eschmidt42/bcg
First steps using Bayesian causal graphical models.
- Host: GitHub
- URL: https://github.com/eschmidt42/bcg
- Owner: eschmidt42
- License: apache-2.0
- Created: 2020-04-15T13:50:27.000Z (over 4 years ago)
- Default Branch: master
- Last Pushed: 2022-07-22T06:00:10.000Z (over 2 years ago)
- Last Synced: 2023-03-09T08:56:22.727Z (over 1 year ago)
- Language: Jupyter Notebook
- Homepage: https://eschmidt42.github.io/bcg/
- Size: 21.4 MB
- Stars: 2
- Watchers: 2
- Forks: 1
- Open Issues: 2
Metadata Files:
- Readme: README.md
- Contributing: CONTRIBUTING.md
- License: LICENSE
README
# Bayesian Causal Graphs - BCG
> Playing with Bayesian Causal Graphs.

The main references so far have been the following:
**References**
References for the dowhy package:
* Documentation: [here](https://microsoft.github.io/dowhy/)
* GitHub: [here](https://github.com/microsoft/dowhy)

Material which helped my introductory reading so far:
* degeneratestate.org: [Pt I - potential outcomes](http://www.degeneratestate.org/posts/2018/Mar/24/causal-inference-with-python-part-1-potential-outcomes/) (contains definitions of ATE, ATT and ATC, among other things)
* degeneratestate.org: [Pt II - causal graphs & the backdoor criterion](http://www.degeneratestate.org/posts/2018/Jul/10/causal-inference-with-python-part-2-causal-graphical-models/)
* this youtube video [on backdoor paths](https://www.youtube.com/watch?v=F8vcki-uWJc)
* Rubin et al. 2005 on [causal inference and potential outcomes](http://www.stat.unipg.it/stanghellini/rubinjasa2005.pdf)
* wiki article on the [average treatment effect (ATE)](https://en.wikipedia.org/wiki/Average_treatment_effect)
* this wiki article [on propensity score matching](https://en.wikipedia.org/wiki/Propensity_score_matching)

**Brief list of concepts I found useful**:
* at the center of interest is the calculation of *potential outcomes* / *counterfactuals* $\rightarrow$ what $Y$ would have been had $X$ been different (a question asked after all observations have been made) $\rightarrow$ e.g. we may want to know the difference in effect between taking a certain drug and taking a placebo (see the sketch after this list)
* potential outcomes and counterfactuals can be seen as the same thing, see [Rubin et al. 2005](http://www.stat.unipg.it/stanghellini/rubinjasa2005.pdf) (really more of a useful thing to know when reading the literature - man, was I confused before knowing this)
* potential outcomes can be computed without a graphical model, but graphical models help by guiding which variables may legitimately be conditioned on
* graphical models themselves require justification
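To make the first bullet concrete, here is a small numerical sketch (hypothetical, not part of this repo): both potential outcomes are constructed explicitly for a toy drug-vs-placebo setting, and the true ATE is compared with the estimate one gets from a randomized assignment.

```python
# Hypothetical illustration of potential outcomes and the ATE, not part of this repo.
import numpy as np

rng = np.random.default_rng(0)
n = 10_000

y0 = rng.normal(loc=0.0, scale=1.0, size=n)        # potential outcome under placebo, Y(0)
y1 = y0 + rng.normal(loc=2.0, scale=0.5, size=n)   # potential outcome under the drug, Y(1)

ate = (y1 - y0).mean()                             # true ATE = E[Y(1) - Y(0)]

# In reality only one potential outcome per unit is observed; under randomized
# assignment the difference in group means still estimates the ATE.
treated = rng.random(n) < 0.5
estimate = y1[treated].mean() - y0[~treated].mean()

print(f"true ATE: {ate:.3f}, randomized-experiment estimate: {estimate:.3f}")
```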
**Technical note**: In order to run this notebook, you'll need a symbolic link from `/notebooks/bcg` to `./bcg`.

## Small example using $ Y = a \cdot X_0 ^ b + c + X_1 $ to generate observations
```python
import pandas as pd
from scipy import stats

n = 1000
a, b, c = 1.5, 1., 0
target = 'Y'
obs = pd.DataFrame(columns=['X0', 'X1', target])
obs['X0'] = stats.uniform.rvs(loc=-1, scale=2, size=n)
# obs['X1'] = stats.norm.rvs(loc=0, scale=.1, size=n)
obs['X1'] = stats.uniform.rvs(loc=-.5, scale=1, size=n)
obs[target] = a * obs.X0 ** b + c + obs.X1

obs.head()
```
|   | X0        | X1        | Y         |
|---|-----------|-----------|-----------|
| 0 | -0.806061 |  0.370908 | -0.838184 |
| 1 |  0.667759 |  0.228029 |  1.229667 |
| 2 |  0.158370 |  0.352362 |  0.589917 |
| 3 |  0.103955 | -0.018955 |  0.136977 |
| 4 | -0.755793 | -0.387319 | -1.521008 |
### Inspecting the observed data
```python
plot_target_vs_rest(obs)
```

![png](docs/images/output_6_0.png)
![png](docs/images/output_6_1.png)
```python
plot_var_hists(obs)
```

![png](docs/images/output_7_0.png)
![png](docs/images/output_7_1.png)
![png](docs/images/output_7_2.png)
```python
show_correlations(obs)
```

![png](docs/images/output_8_0.png)
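`plot_target_vs_rest`, `plot_var_hists` and `show_correlations` are helpers from this repo. Roughly equivalent plots can be produced with plain pandas/matplotlib (a sketch, assuming only the `obs` DataFrame created above):

```python
# Hypothetical stand-ins for the repo's plotting helpers, using plain pandas/matplotlib.
import matplotlib.pyplot as plt

# target vs. each remaining column (similar in spirit to plot_target_vs_rest)
for col in ['X0', 'X1']:
    obs.plot.scatter(x=col, y='Y', alpha=.3, title=f'Y vs {col}')

# per-variable histograms (similar in spirit to plot_var_hists)
obs.hist(bins=30, figsize=(10, 3), layout=(1, 3))

# pairwise correlations (similar in spirit to show_correlations)
plt.matshow(obs.corr())
plt.xticks(range(obs.shape[1]), obs.columns)
plt.yticks(range(obs.shape[1]), obs.columns)
plt.colorbar()
plt.show()
```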
## Generating probably the simplest possible causal graphical model
```python
gg = GraphGenerator(obs)
print(f'target var: {gg.target}, not target vars: {", ".join(gg.not_targets)}')
```

target var: Y, not target vars: X0, X1
```python
g = gg.get_only_Xi_to_Y()
gml = gg.get_gml(g)
```

```python
gg.vis_g(g)
```

![png](docs/images/output_12_0.png)
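`GraphGenerator` is part of this repo and not reproduced here. For orientation, a hand-written GML string encoding the same minimal structure (`X0 -> Y`, `X1 -> Y`), of the kind `dowhy.CausalModel` accepts as its `graph` argument, might look like this (a sketch, not the repo's implementation):

```python
# Hypothetical hand-written GML for the minimal graph X0 -> Y, X1 -> Y.
gml_sketch = """
graph [
  directed 1
  node [ id "X0" label "X0" ]
  node [ id "X1" label "X1" ]
  node [ id "Y"  label "Y"  ]
  edge [ source "X0" target "Y" ]
  edge [ source "X1" target "Y" ]
]
"""
```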
```python
treatment = ['X0', ] # 'X1'
causal_model = dw.CausalModel(data=obs, treatment=treatment,
outcome=target, graph=gml)
```

INFO:dowhy.causal_graph:If this is observed data (not from a randomized experiment), there might always be missing confounders. Adding a node named "Unobserved Confounders" to reflect this.
INFO:dowhy.causal_model:Model to find the causal effect of treatment ['X0'] on outcome ['Y']

Note how `CausalModel` added an unobserved confounder variable `U`.
```python
causal_model.view_model()
```

WARNING:dowhy.causal_graph:Warning: Pygraphviz cannot be loaded. Check that graphviz and pygraphviz are installed.
INFO:dowhy.causal_graph:Using Matplotlib for plotting

![png](docs/images/output_15_1.png)
### Identifying the estimand
```python
identified_estimand = causal_model.identify_effect(proceed_when_unidentifiable=True)
print(identified_estimand)
```

INFO:dowhy.causal_identifier:Common causes of treatment and outcome:['U']
WARNING:dowhy.causal_identifier:If this is observed data (not from a randomized experiment), there might always be missing confounders. Causal effect cannot be identified perfectly.
INFO:dowhy.causal_identifier:Continuing by ignoring these unobserved confounders because proceed_when_unidentifiable flag is True.
INFO:dowhy.causal_identifier:Instrumental variables for treatment and outcome:[]

Estimand type: nonparametric-ate
### Estimand : 1
Estimand name: backdoor
Estimand expression:
d
─────(Expectation(Y))
d[X₀]
Estimand assumption 1, Unconfoundedness: If U→{X0} and U→Y then P(Y|X0,,U) = P(Y|X0,)
### Estimand : 2
Estimand name: iv
No such variable found!
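For this particular data-generating process the backdoor estimand can be evaluated by hand (a quick check, not dowhy output): with $b = 1$ and $X_1$ drawn independently of $X_0$, the expression above reduces to the coefficient $a$:

$$\frac{\partial}{\partial X_0}\,\mathbb{E}\left[Y \mid X_0\right] = \frac{\partial}{\partial X_0}\left(a X_0^{b} + c + \mathbb{E}[X_1]\right) = a\,b\,X_0^{\,b-1} \overset{b=1}{=} a = 1.5$$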
### Computing the causal treatment estimate
```python
method_name = 'backdoor.linear_regression'

effect_kwargs = dict(
method_name=method_name,
control_value = 0,
treatment_value = 1,
target_units = 'ate',
test_significance = True
)
causal_estimate = causal_model.estimate_effect(identified_estimand,
**effect_kwargs)
```

INFO:dowhy.causal_estimator:INFO: Using Linear Regression Estimator
INFO:dowhy.causal_estimator:b: Y~X0+X0*X1

```python
print(causal_estimate)
```

*** Causal Estimate ***
## Target estimand
Estimand type: nonparametric-ate
### Estimand : 1
Estimand name: backdoor
Estimand expression:
d
─────(Expectation(Y))
d[X₀]
Estimand assumption 1, Unconfoundedness: If U→{X0} and U→Y then P(Y|X0,,U) = P(Y|X0,)
### Estimand : 2
Estimand name: iv
No such variable found!
## Realized estimand
b: Y~X0+X0*X1
## Estimate
Value: 1.4816271269256485
## Statistical Significance
p-value: <0.001
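Since the data were generated with $a = 1.5$ and $b = 1$, the estimate of roughly 1.48 is close to the true effect. As an independent sanity check (a sketch, assuming statsmodels is installed; not part of this repo), a plain OLS fit of `Y` on `X0` and `X1` should recover a coefficient on `X0` near 1.5:

```python
# Hypothetical sanity check: ordinary least squares on the generated observations.
import statsmodels.api as sm

X = sm.add_constant(obs[['X0', 'X1']])   # intercept plus both covariates
ols_fit = sm.OLS(obs['Y'], X).fit()

print(ols_fit.params['X0'])              # expected to be close to a = 1.5
```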
### Trying to poke holes into the causal treatment effect
```python
method_name = 'placebo_treatment_refuter'

refute_kwargs = dict(
method_name=method_name,
placebo_type = "permute", # relevant for placebo refutation
)

refute_res = causal_model.refute_estimate(identified_estimand,
causal_estimate,
**refute_kwargs)
```

INFO:dowhy.causal_estimator:INFO: Using Linear Regression Estimator
INFO:dowhy.causal_estimator:b: Y~placebo+placebo*X1

```python
print(refute_res)
```

Refute: Use a Placebo Treatment
Estimated effect:(1.4816271269256485,)
New effect:(0.10464178347001926,)
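Replacing the treatment with a permuted placebo drives the estimated effect from about 1.48 down to about 0.10, which is what one would hope for if the original estimate is not spurious. dowhy ships further refuters; a sketch of the random-common-cause check follows (assuming the dowhy version used here exposes the `random_common_cause` method, same call signature as above):

```python
# Hypothetical additional refutation: add a randomly generated common cause and
# check whether the estimated effect stays roughly the same.
refute_rcc = causal_model.refute_estimate(identified_estimand,
                                          causal_estimate,
                                          method_name='random_common_cause')
print(refute_rcc)
```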