https://github.com/kougioulis/cs-673-project
Deep End-to-end Causal Inference for Time-series - Project for CS-673 (Intro to Deep Generative Models)
causal-discovery causality generative-models
- Host: GitHub
- URL: https://github.com/kougioulis/cs-673-project
- Owner: kougioulis
- License: gpl-3.0
- Created: 2024-05-21T10:51:42.000Z (about 1 year ago)
- Default Branch: main
- Last Pushed: 2024-06-17T11:53:16.000Z (about 1 year ago)
- Last Synced: 2025-03-27T09:04:12.565Z (3 months ago)
- Topics: causal-discovery, causality, generative-models
- Language: Jupyter Notebook
- Homepage:
- Size: 135 MB
- Stars: 1
- Watchers: 1
- Forks: 0
- Open Issues: 0
Metadata Files:
- Readme: README.md
- License: LICENSE
# 🎓 CS-673 Project (Intro to Deep Generative Models)
---
## 📜 Overview
The task of **Causal Discovery** is to recover the true DAG $\mathcal{G}$ from a dataset $D$ whose distribution satisfies $p_D \sim \mathcal{G}$. Since [ZARX18] recast causal discovery as a continuous optimization problem with an acyclicity constraint, much attention has shifted to deep learning-based causal discovery algorithms.
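
For reference, the continuous program of [ZARX18] can be written as follows. The symbols are taken from that paper, not from this repository: $W$ is a weighted adjacency matrix over $d$ variables, $F$ is a score function (for example a regularized least-squares loss), and $\circ$ is the element-wise product.

$$
\min_{W \in \mathbb{R}^{d \times d}} F(W) \quad \text{subject to} \quad h(W) = \operatorname{tr}\left(e^{W \circ W}\right) - d = 0
$$

The constraint $h(W) = 0$ holds exactly when the graph encoded by $W$ has no directed cycles, which is what allows gradient-based optimizers to replace a combinatorial search over DAGs.
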
In this project, we select **DECI (Deep End-to-end Causal Inference)** [GAF+22], a state-of-the-art causal discovery algorithm for i.i.d. observational data. DECI is a variational inference model that models the exogenous noise with a normalizing flow.
Since the literature on deep learning-based causal discovery for time-series data is still limited, we apply DECI to time-series data by converting it into lagged cross-sectional data. Finally, we evaluate its performance against non-deep-learning baselines (PCMCI, PCMCI+, etc.) on synthetic time-series data with a known ground-truth causal graph [LKSS20].
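
The idea of lagged cross-sectional data can be illustrated with a minimal sketch. It assumes the series is a NumPy array of shape `(T, d)`; the helper name `make_lagged` and the column layout are illustrative assumptions, not necessarily what the notebooks in this repository do.

```python
import numpy as np


def make_lagged(series, max_lag):
    """Turn a (T, d) time series into lagged cross-sectional rows.

    Hypothetical helper for illustration. Each output row corresponds to
    one time step t >= max_lag and holds the values at t, t-1, ...,
    t-max_lag side by side, so column block k contains the lag-k values.
    """
    series = np.asarray(series, dtype=float)
    if series.ndim != 2:
        raise ValueError("series must have shape (T, d)")
    n_steps = series.shape[0]
    if not 0 <= max_lag < n_steps:
        raise ValueError("max_lag must satisfy 0 <= max_lag < T")
    # Block k is the series shifted back by k steps, trimmed to equal length.
    blocks = [series[max_lag - k : n_steps - k] for k in range(max_lag + 1)]
    return np.hstack(blocks)


if __name__ == "__main__":
    toy = np.arange(10).reshape(5, 2)  # T = 5 time steps, d = 2 variables
    print(make_lagged(toy, max_lag=2))
```

Each row can then be treated as one sample by a method designed for i.i.d. data, with the caveat that consecutive rows overlap and are therefore not truly independent.
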

---
## 📚 References
- [**GAF+22**] Tomas Geffner, Javier Antoran, Adam Foster, Wenbo Gong, Chao Ma, Emre Kiciman, Amit Sharma, Angus Lamb, Martin Kukla, Nick Pawlowski, et al. *Deep end-to-end causal inference*. arXiv preprint arXiv:2202.02195, 2022.
- [**LKSS20**] Andrew R. Lawrence, Marcus Kaiser, Rui Sampaio, and Maksim Sipos. *Data generating process to evaluate causal discovery techniques for time series data*. Causal Discovery & Causality-Inspired Machine Learning Workshop at Neural Information Processing Systems, 2020.
- [**VCB22**] Matthew J Vowels, Necati Cihan Camgoz, and Richard Bowden. *D’ya like DAGs? A survey on structure learning and causal discovery*. ACM Computing Surveys, 55(4):1–36, 2022.
- [**ZARX18**] Xun Zheng, Bryon Aragam, Pradeep K Ravikumar, and Eric P Xing. *DAGs with NO TEARS: Continuous optimization for structure learning*. Advances in Neural Information Processing Systems, 31, 2018.

---
## 🛠️ Setup Instructions
There are two separate environments that need to be configured to reproduce this project: **CDML** and **DECI (causica)**.
### 🐍 Creating Virtual Environments
You can create both virtual environments, with their respective requirements, from the provided `.yml` files. For example, with an Anaconda installation, run the following in your shell:
1. For CDML:
```sh
conda env create -f environment-cdml.yml
```
2. For causica:
```sh
conda env create -f environment-causica.yml
```
### 📌 Environment Details
- The first environment runs on Python 3.8.19.
- The second environment runs on Python 3.10.1, 🔥 PyTorch 1.13.0 and PyTorch Lightning 2.2.2.
### 🥰 Reproducing the experiments
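
Before running the notebooks, it can help to confirm that the active environment matches the versions listed under Environment Details. The following is a minimal sketch using only the standard library; the helper names are hypothetical, and the distribution names `torch`, `lightning` and `pytorch-lightning` are assumptions about how the packages are installed, since the environment files are not reproduced here.

```python
import sys
from importlib import metadata

# Versions stated in this README for the causica environment.
EXPECTED = {"python": "3.10.1", "torch": "1.13.0", "lightning": "2.2.2"}


def installed_version(*dist_names):
    """Return the version of the first installed distribution, else None."""
    for name in dist_names:
        try:
            return metadata.version(name)
        except metadata.PackageNotFoundError:
            continue
    return None


def environment_report():
    """Map each component to (found version, expected version, match flag)."""
    found = {
        "python": ".".join(str(part) for part in sys.version_info[:3]),
        "torch": installed_version("torch"),
        # Assumption: Lightning may be installed under either name.
        "lightning": installed_version("lightning", "pytorch-lightning"),
    }
    report = {}
    for key, expected in EXPECTED.items():
        version = found[key]
        # Ignore local build suffixes such as "+cu117" when comparing.
        matches = version is not None and version.split("+")[0] == expected
        report[key] = (version, expected, matches)
    return report


if __name__ == "__main__":
    for key, (version, expected, matches) in environment_report().items():
        status = "ok" if matches else "differs"
        print(f"{key}: found {version}, expected {expected} ({status})")
```

A mismatch is not necessarily an error, but it is a useful first thing to check if a notebook fails to import or to load a pre-trained model.
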
- Run `generate_dataset.ipynb` to generate a CDML configuration, plot the causal graph and generate the corresponding time-lagged dataset.
- Run `experiments.ipynb` to run a DECI model on a CDML configuration, compute the metrics, and compare the result with both the ground-truth graph and PCMCI.
- Run `RunAll.ipynb` to evaluate all pre-trained DECI models on each dataset available in the `datasets` folder and compare them with PCMCI.
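
The comparison against a ground-truth graph mentioned above can be illustrated with a short sketch. Which metrics the notebooks actually compute is not stated here, so the choice of structural Hamming distance (SHD) and edge-level precision, recall and F1 is an assumption, as are the function names. Graphs are assumed to be square binary adjacency matrices where entry `[i, j] = 1` means an edge from `i` to `j`.

```python
import numpy as np


def edge_metrics(true_adj, pred_adj):
    """Edge-level precision, recall and F1 for directed binary graphs."""
    true_adj = np.asarray(true_adj, dtype=bool)
    pred_adj = np.asarray(pred_adj, dtype=bool)
    if true_adj.shape != pred_adj.shape:
        raise ValueError("adjacency matrices must have the same shape")
    tp = int(np.sum(true_adj & pred_adj))   # edges found correctly
    fp = int(np.sum(~true_adj & pred_adj))  # predicted but not true
    fn = int(np.sum(true_adj & ~pred_adj))  # true but not predicted
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return {"tp": tp, "fp": fp, "fn": fn,
            "precision": precision, "recall": recall, "f1": f1}


def shd(true_adj, pred_adj):
    """Structural Hamming distance; a reversed edge counts as one error."""
    true_adj = np.asarray(true_adj, dtype=bool)
    pred_adj = np.asarray(pred_adj, dtype=bool)
    if true_adj.shape != pred_adj.shape:
        raise ValueError("adjacency matrices must have the same shape")
    mismatches = int(np.sum(true_adj != pred_adj))
    # A true edge i->j predicted as j->i produces two mismatched entries;
    # subtract one per such reversal so it is counted once.
    reversed_edges = true_adj & pred_adj.T & ~true_adj.T & ~pred_adj
    return mismatches - int(np.sum(reversed_edges))


if __name__ == "__main__":
    truth = [[0, 1, 0], [0, 0, 1], [0, 0, 0]]  # 0->1, 1->2
    guess = [[0, 1, 1], [0, 0, 0], [0, 1, 0]]  # 0->1, 0->2, 2->1
    print(edge_metrics(truth, guess))
    print("SHD:", shd(truth, guess))
```

In the toy example the prediction contains one correct edge, one extra edge and one reversed edge, which gives an SHD of 2 under this counting convention.
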
---
Enjoy! 🚀