{"id":37076511,"url":"https://github.com/kevinsbello/iscan","last_synced_at":"2026-01-14T08:59:49.496Z","repository":{"id":193975605,"uuid":"657716727","full_name":"kevinsbello/iscan","owner":"kevinsbello","description":"A Python 3 package for identifying distribution shifts (a.k.a feature-shifts) between datasets. Official implementation of the paper: \"iSCAN: Identifying Causal Mechanism Shifts among Nonlinear Additive Noise Models\".","archived":false,"fork":false,"pushed_at":"2024-07-05T09:39:25.000Z","size":278,"stargazers_count":4,"open_issues_count":0,"forks_count":2,"subscribers_count":1,"default_branch":"master","last_synced_at":"2025-09-21T09:48:49.306Z","etag":null,"topics":["bayesian-networks","causal-mechanisms","causal-models","difference-dag","difference-network","distribution-shift","intervention-target-estimation","neurips-2023","root-cause-analysis","score-matching"],"latest_commit_sha":null,"homepage":"https://iscan-dag.readthedocs.io/en/latest/","language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"apache-2.0","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/kevinsbello.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null}},"created_at":"2023-06-23T17:17:10.000Z","updated_at":"2025-05-05T01:04:43.000Z","dependencies_parsed_at":"2023-09-24T11:44:40.315Z","dependency_job_id":"7698081b-0cc2-4bf8-8ed6-7276e7db32ba","html_url":"https://github.com/kevinsbello/iscan","commit_stats":null,"previous_names":["kevinsbello/iscan"],"tags_count":1,"template":false,"template_full_name":null,"purl":"pkg:github/kevinsbello/iscan","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/kevinsbello%2Fiscan","tags_url":"https://repos.ecosyste.ms
/api/v1/hosts/GitHub/repositories/kevinsbello%2Fiscan/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/kevinsbello%2Fiscan/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/kevinsbello%2Fiscan/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/kevinsbello","download_url":"https://codeload.github.com/kevinsbello/iscan/tar.gz/refs/heads/master","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/kevinsbello%2Fiscan/sbom","scorecard":null,"host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":286080680,"owners_count":28414732,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2026-01-14T08:38:59.149Z","status":"ssl_error","status_checked_at":"2026-01-14T08:38:43.588Z","response_time":107,"last_error":"SSL_connect returned=1 errno=0 peeraddr=140.82.121.6:443 state=error: unexpected eof while reading","robots_txt_status":"success","robots_txt_updated_at":"2025-07-24T06:49:26.215Z","robots_txt_url":"https://github.com/robots.txt","online":false,"can_crawl_api":true,"host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["bayesian-networks","causal-mechanisms","causal-models","difference-dag","difference-network","distribution-shift","intervention-target-estimation","neurips-2023","root-cause-analysis","score-matching"],"created_at":"2026-01-14T08:59:48.933Z","updated_at":"2026-01-14T08:59:49.491Z","avatar_url":"https://github.com/kevinsbello.png","language":"Python","funding_links":[],"categories":[],"sub_categories":[],"readme":"# 
![iSCAN](https://raw.githubusercontent.com/kevinsbello/iscan/master/logo/iscan.png)\n\n\u003cdiv align=center\u003e\n  \u003ca href=\"https://pypi.org/project/iscan-dag\"\u003e\u003cimg src=\"https://img.shields.io/pypi/v/iscan-dag\"\u003e\u003c/a\u003e\n  \u003ca href=\"https://pypi.org/project/iscan-dag\"\u003e\u003cimg src=\"https://img.shields.io/pypi/pyversions/iscan-dag\"\u003e\u003c/a\u003e\n  \u003ca href=\"https://pypi.org/project/iscan-dag\"\u003e\u003cimg src=\"https://img.shields.io/pypi/wheel/iscan-dag\"\u003e\u003c/a\u003e\n  \u003ca href=\"https://pepy.tech/project/iscan-dag\"\u003e\u003cimg src=\"https://pepy.tech/badge/iscan-dag\"\u003e\u003c/a\u003e\n  \u003ca href=\"https://pypi.org/project/iscan-dag\"\u003e\u003cimg src=\"https://img.shields.io/pypi/l/iscan-dag\"\u003e\u003c/a\u003e\n\u003c/div\u003e\n\n\nThe `iscan-dag` library is a Python 3 package designed for detecting which variables, if any, have undergone a causal mechanism shift *given multiple datasets*. \n\niSCAN operates through a systematic process:\n\n1. For each dataset, iSCAN initially evaluates the Hessian of the data distribution at each sample. This step helps identify the leaf variables (nodes) for all the datasets.\n2. Subsequently, for the identified leaf variable, iSCAN evaluates at each sample the Hessian of the data distribution for the pooled data (resembling a mixture distribution). Then, based on the variance of the Hessian values, iSCAN determines if the given leaf node has undergone a mechanism shift (termed **shifted node**).\n\nThe steps above are applied iteratively, eliminating the identified leaf variable across all datasets at each iteration. See [`iscan.est_node_shifts`](https://iscan-dag.readthedocs.io/en/latest/api/iscan/shifted_nodes/est_node_shifts/) for more details.\n\nAs an optional step, the library also includes a function to detect structural changes (termed **shifted edges**). 
As a by-product of detecting shifted nodes, iSCAN also estimates a topological ordering of the causal variables. This ordering makes it possible to apply recent methods for variable (parent) selection. The current implementation of iSCAN employs [`FOCI`](https://cran.r-project.org/web/packages/FOCI/index.html) to identify the parent set of shifted nodes in each dataset. See [`iscan.est_struct_shifts`](https://iscan-dag.readthedocs.io/en/latest/api/iscan/shifted_edges/est_struct_shifts/) for more details.\n\n\n## Citation\n\nThis is an implementation of the following paper:\n\n[1] Chen T., Bello K., Aragam B., Ravikumar P. (2023). [\"iSCAN: Identifying Causal Mechanism Shifts among Nonlinear Additive Noise Models\"][iscan]. [Advances in Neural Information Processing Systems](https://nips.cc/Conferences/2023/).\n\n[iscan]: https://arxiv.org/abs/2306.17361\n\nIf you find this code useful, please consider citing:\n\n### BibTeX\n\n```bibtex\n@article{chen2023iscan,\n  title={iSCAN: identifying causal mechanism shifts among nonlinear additive noise models},\n  author={Chen, Tianyu and Bello, Kevin and Aragam, Bryon and Ravikumar, Pradeep},\n  journal={Advances in Neural Information Processing Systems},\n  volume={36},\n  year={2023}\n}\n```\n\n## Features\n\n- Shifted nodes are detected without the need to estimate the DAG structure for each dataset.\n- iSCAN is agnostic to the choice of estimator for the score's Jacobian. The current implementation is based on a kernelized Stein's estimator. 
See [`stein_hess`](https://iscan-dag.readthedocs.io/en/latest/api/iscan/score_estimator/stein_hess/) for details.\n- iSCAN's time complexity does not depend on the density of the underlying graph. For a large number of variables, it runs faster than methods such as DCI or UT-IGSP because it avoids (non)parametric conditional independence tests.\n\n## Getting Started\n\n### Install the package\n\nWe recommend using a virtual environment via `virtualenv` or `conda`, and installing the `iscan-dag` package with `pip`:\n\n```bash\n$ pip install -U iscan-dag\n```\n\n### Using iSCAN\n\nSee an example of how to use iSCAN in this [iPython notebook][example].\n\n[example]: https://github.com/kevinsbello/iscan/blob/master/example/example.ipynb\n\n## An Overview of iSCAN\n\nWe propose a new method for directly identifying changes (shifts) of causal mechanisms from multiple heterogeneous datasets, which are assumed to originate from related structural causal models (SCMs) over the same set of variables.\n\niSCAN assumes that each **SCM belongs to the general class of nonlinear additive noise models** (ANMs), thereby generalizing prior work that assumed linear models. We assume that each dataset is generated from an interventional (observational if no variables are intervened on) distribution of an underlying graph $G^*$. See the figure below.\n\n\u003cimg width=\"1335\" alt=\"\" src=\"https://github.com/kevinsbello/iscan/assets/6846921/ecebed13-8968-4a5e-a404-4b110b5eefd6\"\u003e\n\n\nIn [[1]][iscan], we prove that the Hessian of the log-density function of the **mixture distribution** reveals information about changes (shifts) in general non-parametric functional mechanisms for the leaf variables. This allows for the detection of shifted nodes. Our method leads to significant improvements in identifying shifted nodes.\n\n**Theorem 1 (see [[1]][iscan]).** \nLet $h$ be the index of the environment (dataset), and $p^h(x)$ denote the pdf of the $h$-th environment. 
Let $q(x)$ be the pdf of the mixture distribution of all $H$ environments such that $q(x) = \\sum_h w_h p^h(x)$, with mixture weights $w_h$. Also, let $s(x) = \\nabla \\log q(x)$ be the associated score function. Then, under mild assumptions, if $j$ is a leaf variable in all environments, we have:\n\n$$\nj \\text{ is a shifted node } \\iff \\text{Var}_{q}\\left[ \\frac{\\partial s_j(X)}{\\partial x_j} \\right] \u003e 0.\n$$\n\n\n## Requirements\n\n- Python 3.6+\n- `numpy`\n- `igraph`\n- `torch`\n- `scikit-learn`\n- `rpy2` (R interface used to call the `FOCI` library)\n- `GPy` (library for sampling from Gaussian processes)\n- `kneed` (used for the elbow heuristic)\n- `pandas`\n\n## Contents\n\n- `score_estimator.py`: Estimates the diagonal of the Hessian of $\\log p(x)$ at the provided sample points.\n- `utils.py`: Utility functions for generating synthetic data and evaluating the results.\n- `shifted_nodes.py`: Implements iSCAN, providing detected shifted nodes.\n- `shifted_edges.py`: Implements the discovery of structural changes (shifted edges).\n- `my_foci.R`: R implementation that uses `FOCI` for finding parents based on given nodes and topological order.\n\n## Acknowledgements\n\nWe thank the authors of [SCORE](https://github.com/paulrolland1307/SCORE/tree/main) for making their code available. Part of our code is based on their implementation, especially the `score_estimator.py` file.\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fkevinsbello%2Fiscan","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fkevinsbello%2Fiscan","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fkevinsbello%2Fiscan/lists"}