# DAGs with NO TEARS :no_entry_sign::droplet:

**[Update 12/8/22]** Interested in faster and more accurate structure learning? See our new [DAGMA](https://github.com/kevinsbello/dagma) library from [NeurIPS 2022](https://arxiv.org/abs/2209.08037).

This is an implementation of the following papers:

[1] Zheng, X., Aragam, B., Ravikumar, P., & Xing, E. P. (2018). [DAGs with NO TEARS: Continuous optimization for structure learning](https://arxiv.org/abs/1803.01422) ([NeurIPS 2018](https://nips.cc/Conferences/2018/), Spotlight).

[2] Zheng, X., Dan, C., Aragam, B., Ravikumar, P., & Xing, E. P. (2020). [Learning
sparse nonparametric DAGs](https://arxiv.org/abs/1909.13189) ([AISTATS 2020](https://aistats.org/), to appear).

If you find this code useful, please consider citing:
```
@inproceedings{zheng2018dags,
author = {Zheng, Xun and Aragam, Bryon and Ravikumar, Pradeep and Xing, Eric P.},
booktitle = {Advances in Neural Information Processing Systems},
title = {{DAGs with NO TEARS: Continuous Optimization for Structure Learning}},
year = {2018}
}
```

```
@inproceedings{zheng2020learning,
author = {Zheng, Xun and Dan, Chen and Aragam, Bryon and Ravikumar, Pradeep and Xing, Eric P.},
booktitle = {International Conference on Artificial Intelligence and Statistics},
title = {{Learning sparse nonparametric DAGs}},
year = {2020}
}
```

## tl;dr Structure learning in <60 lines

Check out [`linear.py`](notears/linear.py) for a complete, end-to-end implementation of the NOTEARS algorithm in fewer than **60 lines**.

This includes L2, Logistic, and Poisson loss functions with L1 penalty.
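
The snippet below sketches the same pipeline programmatically, using the simulation helpers in `utils.py` and `notears_linear` from `linear.py` (argument names follow the current code and may change):
```python
from notears import utils
from notears.linear import notears_linear

utils.set_random_seed(1)

# Simulate a random 20-node ER DAG with 20 edges and 100 Gaussian samples.
B_true = utils.simulate_dag(20, 20, 'ER')            # binary adjacency matrix
W_true = utils.simulate_parameter(B_true)            # assign edge weights
X = utils.simulate_linear_sem(W_true, 100, 'gauss')

# l1-regularized NOTEARS with the least-squares (L2) loss.
W_est = notears_linear(X, lambda1=0.1, loss_type='l2')

# Compare the estimated binary graph against the ground truth.
print(utils.count_accuracy(B_true, W_est != 0))
```
Swapping `loss_type` to `'logistic'` or `'poisson'` selects the other losses mentioned above.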

## Introduction

A directed acyclic graphical model (aka Bayesian network) with `d` nodes defines a
distribution over a random vector of size `d`.
We are interested in the Bayesian Network Structure Learning (BNSL) problem:
given `n` samples from such a distribution, how do we estimate the graph `G`?

A major challenge of BNSL is enforcing the directed acyclic graph (DAG)
constraint, which is **combinatorial**.
While existing approaches rely on local heuristics,
we introduce a fundamentally different strategy: we formulate it as a purely
**continuous** optimization problem over real matrices that avoids this
combinatorial constraint entirely.
In other words, instead of optimizing over the combinatorial space of DAGs,

    min_{W ∈ ℝ^{d×d}} F(W)   subject to   G(W) ∈ DAGs,

we solve the equivalent continuous program

    min_{W ∈ ℝ^{d×d}} F(W)   subject to   h(W) = 0,

where `h` is a *smooth* function whose level set exactly characterizes the
space of DAGs.
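
Concretely, the characterization from [1] uses `h(W) = tr(exp(W ∘ W)) − d`, where `∘` is the elementwise product: `h(W) = 0` exactly when the weighted adjacency matrix `W` corresponds to a DAG. Below is a small numerical check of this property (an illustrative sketch, not part of the package API):
```python
import numpy as np
from scipy.linalg import expm

def acyclicity(W: np.ndarray) -> float:
    """Return h(W) = tr(exp(W * W)) - d, which is 0 iff W encodes a DAG."""
    d = W.shape[0]
    return np.trace(expm(W * W)) - d   # W * W is the elementwise (Hadamard) square

# A 3-node chain (a DAG) gives h = 0; a 2-cycle gives h > 0.
W_chain = np.array([[0., 1., 0.],
                    [0., 0., 1.],
                    [0., 0., 0.]])
W_cycle = np.array([[0., 1.],
                    [1., 0.]])
print(acyclicity(W_chain))  # ~0.0
print(acyclicity(W_cycle))  # > 0
```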

## Requirements

- Python 3.6+
- `numpy`
- `scipy`
- `python-igraph`: Install [igraph C core](https://igraph.org/c/) and `pkg-config` first.
- `torch`: Optional, only used for nonlinear model.

## Contents (New version)

- `linear.py` - the 60-line implementation of NOTEARS with l1 regularization for various losses
- `nonlinear.py` - nonlinear NOTEARS using MLP or basis expansion
- `locally_connected.py` - special layer structure used for MLP
- `lbfgsb_scipy.py` - wrapper for scipy's LBFGS-B
- `utils.py` - graph simulation, data simulation, and accuracy evaluation
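
For the nonlinear variant, a usage sketch along the lines of the demo in `nonlinear.py` might look as follows; the names `NotearsMLP`, `notears_nonlinear`, and `simulate_nonlinear_sem` are taken from the files listed above and assume `torch` is installed:
```python
import torch
from notears import utils
from notears.nonlinear import NotearsMLP, notears_nonlinear

torch.set_default_dtype(torch.double)  # match the float64 data produced by utils
utils.set_random_seed(123)

# Simulate a 10-node ER graph and 200 samples from a nonlinear (MLP) SEM.
B_true = utils.simulate_dag(10, 20, 'ER')
X = utils.simulate_nonlinear_sem(B_true, 200, 'mlp')

# One MLP per node: d inputs -> 10 hidden units -> 1 output.
model = NotearsMLP(dims=[X.shape[1], 10, 1], bias=True)
W_est = notears_nonlinear(model, X, lambda1=0.01, lambda2=0.01)
print(utils.count_accuracy(B_true, W_est != 0))
```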

## Running a simple demo

The simplest way to try out NOTEARS is to run a simple example:
```bash
$ git clone https://github.com/xunzheng/notears.git
$ cd notears/
$ python notears/linear.py
```
This runs the l1-regularized NOTEARS on a randomly generated 20-node Erdos-Renyi graph with 100 samples.
Within a few seconds, you should see output like this:
```
{'fdr': 0.0, 'tpr': 1.0, 'fpr': 0.0, 'shd': 0, 'nnz': 20}
```
The data, ground truth graph, and the estimate will be stored in `X.csv`, `W_true.csv`, and `W_est.csv`.
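
If you want to inspect the results afterwards, the files are plain comma-delimited matrices (as written by `np.savetxt` in `linear.py`), so a sketch like the following reloads them and recomputes the metrics printed above:
```python
import numpy as np
from notears import utils

X = np.loadtxt('X.csv', delimiter=',')            # n x d data matrix
W_true = np.loadtxt('W_true.csv', delimiter=',')  # ground-truth weighted adjacency
W_est = np.loadtxt('W_est.csv', delimiter=',')    # estimated weighted adjacency

# count_accuracy expects binary adjacency matrices.
print(utils.count_accuracy(W_true != 0, W_est != 0))
```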

## Running as a command

Alternatively, if you have a CSV data file `X.csv`, you can install the package and run the algorithm as a command:
```bash
$ pip install git+https://github.com/xunzheng/notears
$ notears_linear X.csv
```
The output graph will be stored in `W_est.csv`.

## Examples: Erdos-Renyi graph

- Ground truth: `d = 20` nodes, `2d = 40` expected edges.

*(Figure: `ER2_W_true` — ground-truth weighted adjacency matrix)*

- Estimate with `n = 1000` samples:
`lambda = 0`, `lambda = 0.1`, and `FGS` (baseline).

*(Figure: `ER2_W_est_n1000` — estimates with `lambda = 0`, `lambda = 0.1`, and FGS)*

Both `lambda = 0` and `lambda = 0.1` are close to the ground truth graph
when `n` is large.

- Estimate with `n = 20` samples:
`lambda = 0`, `lambda = 0.1`, and `FGS` (baseline).

*(Figure: `ER2_W_est_n20` — estimates with `lambda = 0`, `lambda = 0.1`, and FGS)*

When `n` is small, `lambda = 0` performs worse while
`lambda = 0.1` remains accurate, showing the advantage of L1-regularization.
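
A rough way to reproduce this comparison (without the FGS baseline) is to sweep `n` and `lambda` with the same helpers used in the demo; the setup below mirrors the ER2 configuration (`d = 20`, `2d = 40` expected edges) and is only a sketch:
```python
from notears import utils
from notears.linear import notears_linear

utils.set_random_seed(0)
B_true = utils.simulate_dag(20, 40, 'ER')   # ER2: 2d expected edges
W_true = utils.simulate_parameter(B_true)

for n in (20, 1000):
    X = utils.simulate_linear_sem(W_true, n, 'gauss')
    for lam in (0.0, 0.1):
        W_est = notears_linear(X, lambda1=lam, loss_type='l2')
        acc = utils.count_accuracy(B_true, W_est != 0)
        print(f"n={n:4d}  lambda={lam:.1f}  shd={acc['shd']}  tpr={acc['tpr']:.2f}")
```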

## Examples: Scale-free graph

- Ground truth: `d = 20` nodes, `4d = 80` expected edges.

*(Figure: `SF4_W_true` — ground-truth weighted adjacency matrix)*

The degree distribution is significantly different from that of the Erdos-Renyi graph.
One nice property of our method is that it is agnostic to the underlying
graph structure.

- Estimate with `n = 1000` samples:
`lambda = 0`, `lambda = 0.1`, and `FGS` (baseline).

*(Figure: `SF4_W_est_n1000` — estimates with `lambda = 0`, `lambda = 0.1`, and FGS)*

The observation is similar to the Erdos-Renyi case:
both `lambda = 0` and `lambda = 0.1` accurately estimate the ground truth
when `n` is large.

- Estimate with `n = 20` samples:
`lambda = 0`, `lambda = 0.1`, and `FGS` (baseline).

*(Figure: `SF4_W_est_n20` — estimates with `lambda = 0`, `lambda = 0.1`, and FGS)*

Similarly, `lambda = 0` suffers from small `n` while
`lambda = 0.1` remains accurate, showing the advantage of L1-regularization.

## Other implementations

- Python: https://github.com/jmoss20/notears
- Tensorflow with Python: https://github.com/ignavier/notears-tensorflow