{"id":15032841,"url":"https://github.com/fentechsolutions/causaldiscoverytoolbox","last_synced_at":"2025-05-15T00:07:30.277Z","repository":{"id":37370001,"uuid":"92927706","full_name":"FenTechSolutions/CausalDiscoveryToolbox","owner":"FenTechSolutions","description":"Package for causal inference in graphs and in the pairwise settings. Tools for graph structure recovery and dependencies are included.","archived":false,"fork":false,"pushed_at":"2024-04-02T12:12:22.000Z","size":14565,"stargazers_count":1173,"open_issues_count":70,"forks_count":202,"subscribers_count":36,"default_branch":"master","last_synced_at":"2025-05-15T00:07:18.229Z","etag":null,"topics":["algorithm","causal-discovery","causal-inference","causal-models","causality","graph","graph-structure-recovery","inference","machine-learning","python","toolbox"],"latest_commit_sha":null,"homepage":"https://fentechsolutions.github.io/CausalDiscoveryToolbox/html/index.html","language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"mit","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/FenTechSolutions.png","metadata":{"files":{"readme":"README.md","changelog":"CHANGELOG.md","contributing":"CONTRIBUTING.md","funding":null,"license":"LICENSE.md","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null}},"created_at":"2017-05-31T09:12:45.000Z","updated_at":"2025-05-09T14:45:02.000Z","dependencies_parsed_at":"2023-01-26T06:30:53.047Z","dependency_job_id":"ee185ccc-8262-406f-a1e2-caf8657fe013","html_url":"https://github.com/FenTechSolutions/CausalDiscoveryToolbox","commit_stats":{"total_commits":673,"total_committers":27,"mean_commits":"24.925925925925927","dds":0.4947994056463596,"last_synced_commit":"d0bc352534dcbfac19a84a1bb05f33fe311378d2"},"previous_names":[],"ta
gs_count":0,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/FenTechSolutions%2FCausalDiscoveryToolbox","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/FenTechSolutions%2FCausalDiscoveryToolbox/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/FenTechSolutions%2FCausalDiscoveryToolbox/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/FenTechSolutions%2FCausalDiscoveryToolbox/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/FenTechSolutions","download_url":"https://codeload.github.com/FenTechSolutions/CausalDiscoveryToolbox/tar.gz/refs/heads/master","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":254249197,"owners_count":22039029,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["algorithm","causal-discovery","causal-inference","causal-models","causality","graph","graph-structure-recovery","inference","machine-learning","python","toolbox"],"created_at":"2024-09-24T20:19:33.950Z","updated_at":"2025-05-15T00:07:25.266Z","avatar_url":"https://github.com/FenTechSolutions.png","language":"Python","funding_links":[],"categories":[],"sub_categories":[],"readme":"![](docs/banner.png)\n\nThe Causal Discovery Toolbox is a package for causal inference in graphs and in the pairwise settings for Python\u003e=3.5. Tools for graph structure recovery and dependencies are included. 
The package is based on Numpy, Scikit-learn, Pytorch and R.\n\n[![Build Status](https://travis-ci.org/FenTechSolutions/CausalDiscoveryToolbox.svg?branch=master)](https://travis-ci.org/FenTechSolutions/CausalDiscoveryToolbox)\n[![Dev Status](https://travis-ci.org/FenTechSolutions/CausalDiscoveryToolbox.svg?branch=dev)](https://travis-ci.org/FenTechSolutions/CausalDiscoveryToolbox)\n[![codecov](https://codecov.io/gh/FenTechSolutions/CausalDiscoveryToolbox/branch/master/graph/badge.svg)](https://codecov.io/gh/FenTechSolutions/CausalDiscoveryToolbox)\n[![Hex.pm](https://img.shields.io/badge/License-MIT-blue.svg?maxAge=259200)](https://raw.githubusercontent.com/FenTechSolutions/CausalDiscoveryToolbox/master/LICENSE.md)\n[![version](https://img.shields.io/badge/version-0.6.0-yellow.svg?maxAge=259200)](#)\n![PyPI - Downloads](https://img.shields.io/pypi/dm/cdt.svg)\n\nIt implements lots of algorithms for graph structure recovery (including algorithms from the __bnlearn__, __pcalg__ packages), mainly based out of observational data.\n\n## [Check out the documentation here](https://fentechsolutions.github.io/CausalDiscoveryToolbox/html/index.html) \n## [Please cite us if you use our software](https://arxiv.org/abs/1903.02278)\n\n[A tutorial is available here](https://fentechsolutions.github.io/CausalDiscoveryToolbox/html/tutorial.html)\n\nInstall it using pip: (See more details on installation below)\n```sh\npip install cdt\n```\n## Docker images\nDocker images are available, including all the dependencies, and enabled functionalities:\n\n|       Branch     |                                                                 master                                                                 |                                                                  dev                                                                 
|\n|:----------------:|:--------------------------------------------------------------------------------------------------------------------------------------:|:------------------------------------------------------------------------------------------------------------------------------------:|\n| Python 3.6 - CPU |       [![d36cpu](https://img.shields.io/badge/docker-0.6.0-0db7ed.svg?maxAge=259200)](https://hub.docker.com/r/fentechai/cdt/)      |       [![d36cpudev](https://img.shields.io/badge/docker-latest-0db7ed.svg?maxAge=259200)](https://hub.docker.com/r/fentechai/cdt-dev)       |\n| Python 3.6 - GPU | [![d36gpu](https://img.shields.io/badge/nvidia--docker-0.6.0-76b900.svg?maxAge=259200)](https://hub.docker.com/r/fentechai/nv-cdt/) |  [![d36gpudev](https://img.shields.io/badge/nvidia--docker-latest-76b900.svg?maxAge=259200)](https://hub.docker.com/r/fentechai/nv-cdt-dev) |\n\n## Installation\n\nThe package requires Python \u003e=3.5, as well as the libraries listed in the [requirements file](https://github.com/FenTechSolutions/CausalDiscoveryToolbox/blob/master/requirements.txt). Some optional functions and options need additional libraries to become available. Here is a quick install guide for the package, from the minimal install up to the full installation. \n\n**Note**: A (mini/ana)conda environment makes installing all those packages easier and is therefore recommended for non-expert users. \n\n### Install PyTorch\nSome of the key algorithms in the _cdt_ package use PyTorch, so it must be installed. 
\nCheck out the PyTorch website to install the version suited to your hardware configuration: http://pytorch.org\n\n### Install the CausalDiscoveryToolbox package\nThe package is available on PyPI:\n```sh\npip install cdt\n```\nAlternatively, install it from source:\n```sh\n$ git clone https://github.com/FenTechSolutions/CausalDiscoveryToolbox.git  # Download the package \n$ cd CausalDiscoveryToolbox\n$ pip install -r requirements.txt  # Install the requirements\n$ python setup.py install develop --user\n```\n**The package is then up and running! You can run most of the algorithms in the CausalDiscoveryToolbox, although you might get warnings that some additional features are not available.**\n\nFrom now on, you can import the library using:\n```python\nimport cdt\n```\nCheck out the package structure and more info on the package itself [here](https://github.com/FenTechSolutions/CausalDiscoveryToolbox/blob/master/documentation.md).  \n\n### Additional : R and R libraries\nTo access additional algorithms from R packages such as bnlearn, kpcalg, pcalg, ... within the _cdt_ framework, R must be installed.\n\nThe before-install section of the [travis.yml](https://github.com/FenTechSolutions/CausalDiscoveryToolbox/blob/master/.travis.yml) file shows how to install all R dependencies on Debian-based distributions. 
\nThe [r-requirements file](https://github.com/FenTechSolutions/CausalDiscoveryToolbox/blob/master/r_requirements.txt) notes all of the R packages used by the toolbox.\n\n\nHere is an example of installation script of the R packages on Ubuntu 20.04:\n\n``` sh\napt-get -qq update\nDEBIAN_FRONTEND=noninteractive apt-get install -y tzdata\napt-get -qq install dialog apt-utils -y\napt-get install apt-transport-https -y\napt-get install -qq software-properties-common -y\napt-get -qq update\napt-key adv --keyserver keyserver.ubuntu.com --recv-keys E298A3A825C0D65DFD57CBB651716619E084DAB9\nadd-apt-repository 'deb https://cloud.r-project.org/bin/linux/ubuntu bionic-cran35/' -y\napt-get -qq update\n\napt-get -qq install r-base -y\napt-get -qq install libssl-dev -y\napt-get -qq install libgmp3-dev  -y\napt-get -qq install git -y\napt-get -qq install build-essential  -y\napt-get -qq install libv8-dev  -y\napt-get -qq install libcurl4-openssl-dev -y\napt-get -qq install libgsl-dev -y\n\nRscript -e 'install.packages(c(\"V8\"),repos=\"http://cran.us.r-project.org\", quiet=TRUE, verbose=FALSE)'\nRscript -e 'install.packages(c(\"sfsmisc\"),repos=\"http://cran.us.r-project.org\", quiet=TRUE, verbose=FALSE)'\nRscript -e 'install.packages(c(\"clue\"),repos=\"http://cran.us.r-project.org\", quiet=TRUE, verbose=FALSE)'\nRscript -e 'install.packages(\"https://cran.r-project.org/src/contrib/Archive/randomForest/randomForest_4.6-14.tar.gz\", repos=NULL, type=\"source\")'\nRscript -e 'install.packages(c(\"lattice\"),repos=\"http://cran.us.r-project.org\", quiet=TRUE, verbose=FALSE)'\nRscript -e 'install.packages(c(\"devtools\"),repos=\"http://cran.us.r-project.org\", quiet=TRUE, verbose=FALSE)'\nRscript -e 'install.packages(c(\"MASS\"),repos=\"http://cran.us.r-project.org\", quiet=TRUE, verbose=FALSE)'\nRscript -e 'install.packages(\"BiocManager\")'\nRscript -e 'BiocManager::install(c(\"igraph\"))'\nRscript -e 
'install.packages(\"https://cran.r-project.org/src/contrib/Archive/fastICA/fastICA_1.2-2.tar.gz\", repos=NULL, type=\"source\")'\nRscript -e 'BiocManager::install(c(\"SID\", \"bnlearn\", \"pcalg\", \"kpcalg\", \"glmnet\", \"mboost\"))'\nRscript -e 'install.packages(\"https://cran.r-project.org/src/contrib/Archive/CAM/CAM_1.0.tar.gz\", repos=NULL, type=\"source\")'\nRscript -e 'install.packages(\"https://cran.r-project.org/src/contrib/sparsebnUtils_0.0.8.tar.gz\", repos=NULL, type=\"source\")'\nRscript -e 'BiocManager::install(c(\"ccdrAlgorithm\", \"discretecdAlgorithm\"))'\n\napt-get -qq install libxml2-dev -y\nRscript -e 'install.packages(\"devtools\")'\nRscript -e 'library(devtools); install_github(\"cran/CAM\"); install_github(\"cran/momentchi2\"); install_github(\"Diviyan-Kalainathan/RCIT\", quiet=TRUE, verbose=FALSE)'\nRscript -e 'install.packages(\"https://cran.r-project.org/src/contrib/Archive/sparsebn/sparsebn_0.1.2.tar.gz\", repos=NULL, type=\"source\")'\n```\n\n\n## Overview\n### General package structure\nThe following figure shows how the package and its algorithms are structured:\n\n\n```\n   cdt package\n   |\n   |- independence\n   |  |- graph (Inferring the skeleton from data)\n   |  |  |- Lasso variants (Randomized Lasso[1], Glasso[2], HSICLasso[3])\n   |  |  |- FSGNN (CGNN[12] variant for feature selection)\n   |  |  |- Skeleton recovery using feature selection algorithms (RFECV[5], LinearSVR[6], RRelief[7], ARD[8,9], DecisionTree)\n   |  |\n   |  |- stats (pairwise methods for dependency)\n   |     |- Correlation (Pearson, Spearman, KendallTau)\n   |     |- Kernel based (NormalizedHSIC[10])\n   |     |- Mutual information based (MIRegression, Adjusted Mutual Information[11], Normalized mutual information[11])\n   |\n   |- data\n   |  |- CausalPairGenerator (Generate causal pairs)\n   |  |- AcyclicGraphGenerator (Generate FCM-based graphs)\n   |  |- load_dataset (load standard benchmark datasets)\n   |\n   |- causality\n   |  |- graph (methods for 
graph inference)\n   |  |  |- CGNN[12]\n   |  |  |- PC[13]\n   |  |  |- GES[13]\n   |  |  |- GIES[13]\n   |  |  |- LiNGAM[13]\n   |  |  |- CAM[13]\n   |  |  |- GS[23]\n   |  |  |- IAMB[24]\n   |  |  |- MMPC[25]\n   |  |  |- SAM[26]\n   |  |  |- CCDr[27]\n   |  |\n   |  |- pairwise (methods for pairwise inference)\n   |     |- ANM[14] (Additive Noise Model)\n   |     |- IGCI[15] (Information Geometric Causal Inference)\n   |     |- RCC[16] (Randomized Causation Coefficient)\n   |     |- NCC[17] (Neural Causation Coefficient)\n   |     |- GNN[12] (Generative Neural Network -- Part of CGNN )\n   |     |- Bivariate fit (Baseline method of regression)\n   |     |- Jarfo[20]\n   |     |- CDS[20]\n   |     |- RECI[28]\n   |\n   |- metrics (Implements the metrics for graph scoring)\n   |  |- Precision Recall\n   |  |- SHD\n   |  |- SID [29]\n   |\n   |- utils\n      |- Settings -\u003e SETTINGS class (hardware settings)\n      |- loss -\u003e MMD loss [21, 22] \u0026 various other loss functions\n      |- io -\u003e for importing data formats\n      |- graph -\u003e graph utilities\n\n\n\n```\n\n### Hardware and algorithm settings\nThe toolbox has a SETTINGS class that defines the hardware settings. Those settings are unique and their default parameters are defined in **_cdt/utils/Settings_**.\n\nThese parameters are accessible and overridable via accessing the class:\n\n```python\nimport cdt\ncdt.SETTINGS\n```\n\nMoreover, the hardware parameters are detected and defined automatically (including number of GPUs, CPUs, available optional packages) at the **import** of the package using the **cdt.utils.Settings.autoset_settings** method, run at startup.\n\n### The graph class\nThe whole package revolves around using the **DiGraph** and **Graph** classes from the **networkx** package.\n\n### References\n\n- [1] Wang, S., Nan, B., Rosset, S., \u0026 Zhu, J. (2011). Random lasso. The annals of applied statistics, 5(1), 468.\n- [2] Friedman, J., Hastie, T., \u0026 Tibshirani, R. 
(2008). Sparse inverse covariance estimation with the graphical lasso. Biostatistics, 9(3), 432-441.\n- [3] Yamada, M., Jitkrittum, W., Sigal, L., Xing, E. P., \u0026 Sugiyama, M. (2014). High-dimensional feature selection by feature-wise kernelized lasso. Neural computation, 26(1), 185-207.\n- [4] Feizi, S., Marbach, D., Médard, M., \u0026 Kellis, M. (2013). Network deconvolution as a general method to distinguish direct dependencies in networks. Nature biotechnology, 31(8), 726-733.\n- [5] Guyon, I., Weston, J., Barnhill, S., \u0026 Vapnik, V. (2002). Gene selection for cancer classification using support vector machines. Machine learning, 46(1), 389-422.\n- [6] Vapnik, V., Golowich, S. E., \u0026 Smola, A. J. (1997). Support vector method for function approximation, regression estimation and signal processing. In Advances in neural information processing systems (pp. 281-287).  \n- [7] Kira, K., \u0026 Rendell, L. A. (1992, July). The feature selection problem: Traditional methods and a new algorithm. In Aaai (Vol. 2, pp. 129-134).\n- [8] MacKay,  D.  J.  (1992). Bayesian interpolation. Neural Computation, 4, 415–447.\n- [9] Neal, R. M. (1996). Bayesian learning for neural networks. No. 118 in Lecture Notes in Statistics. New York: Springer.\n- [10] Gretton, A., Bousquet, O., Smola, A., \u0026 Scholkopf, B. (2005, October). Measuring statistical dependence with Hilbert-Schmidt norms. In ALT (Vol. 16, pp. 63-78).\n- [11] Vinh, N. X., Epps, J., \u0026 Bailey, J. (2010). Information theoretic measures for clusterings comparison: Variants, properties, normalization and correction for chance. Journal of Machine Learning Research, 11(Oct), 2837-2854.\n- [12] Goudet, O., Kalainathan, D., Caillou, P., Lopez-Paz, D., Guyon, I., Sebag, M., ... \u0026 Tubaro, P. (2017). Learning functional causal models with generative neural networks. arXiv preprint arXiv:1709.05321.\n- [13] Spirtes, P., Glymour, C., Scheines, R. (2000). Causation, Prediction, and Search. MIT press.  
\n- [14] Hoyer, P. O., Janzing, D., Mooij, J. M., Peters, J., \u0026 Schölkopf, B. (2009). Nonlinear causal discovery with additive noise models. In Advances in neural information processing systems (pp. 689-696).\n- [15] Janzing, D., Mooij, J., Zhang, K., Lemeire, J., Zscheischler, J., Daniušis, P., ... \u0026 Schölkopf, B. (2012). Information-geometric approach to inferring causal directions. Artificial Intelligence, 182, 1-31.\n- [16] Lopez-Paz, D., Muandet, K., Schölkopf, B., \u0026 Tolstikhin, I. (2015, June). Towards a learning theory of cause-effect inference. In International Conference on Machine Learning (pp. 1452-1461).  \n- [17] Lopez-Paz, D., Nishihara, R., Chintala, S., Schölkopf, B., \u0026 Bottou, L. (2017, July). Discovering causal signals in images. In Proceedings of CVPR.  \n- [18] Stegle, O., Janzing, D., Zhang, K., Mooij, J. M., \u0026 Schölkopf, B. (2010). Probabilistic latent variable models for distinguishing between cause and effect. In Advances in Neural Information Processing Systems (pp. 1687-1695).\n- [19] Zhang, K., \u0026 Hyvärinen, A. (2009, June). On the identifiability of the post-nonlinear causal model. In Proceedings of the twenty-fifth conference on uncertainty in artificial intelligence (pp. 647-655). AUAI Press.\n- [20] Fonollosa, J. A. (2016). Conditional distribution variability measures for causality detection. arXiv preprint arXiv:1601.06680.\n- [21] Gretton, A., Borgwardt, K. M., Rasch, M. J., Schölkopf, B., \u0026 Smola, A. (2012). A kernel two-sample test. Journal of Machine Learning Research, 13(Mar), 723-773.\n- [22] Li, Y., Swersky, K., \u0026 Zemel, R. (2015). Generative moment matching networks. In Proceedings of the 32nd International Conference on Machine Learning (ICML-15) (pp. 1718-1727).  \n- [23] Margaritis D (2003). Learning Bayesian Network Model Structure from Data . Ph.D. thesis, School of Computer Science, Carnegie-Mellon University, Pittsburgh, PA. 
Available as Technical Report CMU-CS-03-153\n- [24] Tsamardinos I, Aliferis CF, Statnikov A (2003). “Algorithms for Large Scale Markov Blanket Discovery”. In “Proceedings of the Sixteenth International Florida Artificial Intelligence Research Society Conference”, pp. 376-381. AAAI Press.\n- [25] Tsamardinos I, Aliferis CF, Statnikov A (2003). “Time and Sample Efficient Discovery of Markov Blankets and Direct Causal Relations”. In “KDD ’03: Proceedings of the Ninth ACM SIGKDD International Conference on Knowledge Discovery and Data Mining”, pp. 673-678. ACM. Tsamardinos I, Brown LE, Aliferis CF (2006). “The Max-Min Hill-Climbing Bayesian Network Structure Learning Algorithm”. Machine Learning,65(1), 31-78.\n- [26] Kalainathan, Diviyan \u0026 Goudet, Olivier \u0026 Guyon, Isabelle \u0026 Lopez-Paz, David \u0026 Sebag, Michèle. (2018). SAM: Structural Agnostic Model, Causal Discovery and Penalized Adversarial Learning.\n- [27] Aragam, B., \u0026 Zhou, Q. (2015). Concave penalized estimation of sparse Gaussian Bayesian networks. Journal of Machine Learning Research, 16, 2273-2328.\n- [28] Bloebaum, P., Janzing, D., Washio, T., Shimizu, S., \u0026 Schoelkopf, B. (2018, March). Cause-Effect Inference by Comparing Regression Errors. In International Conference on Artificial Intelligence and Statistics (pp. 900-909).\n- [29] Structural Intervention Distance (SID) for Evaluating Causal Graphs, Jonas Peters, Peter Bühlmann: https://arxiv.org/abs/1306.1043\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Ffentechsolutions%2Fcausaldiscoverytoolbox","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Ffentechsolutions%2Fcausaldiscoverytoolbox","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Ffentechsolutions%2Fcausaldiscoverytoolbox/lists"}