Ecosyste.ms: Awesome
An open API service indexing awesome lists of open source software.

https://github.com/drdeford/computational_experiments_on_stirling_numbers_of_uniform_trees
Python code for sampling and evaluating cycle covers of trees
- Host: GitHub
- URL: https://github.com/drdeford/computational_experiments_on_stirling_numbers_of_uniform_trees
- Owner: drdeford
- Created: 2019-06-12T19:47:00.000Z (over 5 years ago)
- Default Branch: master
- Last Pushed: 2023-03-29T22:43:06.000Z (almost 2 years ago)
- Last Synced: 2023-05-11T09:23:30.522Z (over 1 year ago)
- Language: Jupyter Notebook
- Size: 9.62 MB
- Stars: 0
- Watchers: 2
- Forks: 0
- Open Issues: 0
Metadata Files:
- Readme: README.md
README
# Stirling Numbers of Uniform Trees and Related Computational Experiments
Python code for sampling and evaluating cycle covers of trees. The sections below provide brief descriptions of the code inside each script or notebook in each subfolder of Scripts.
## ./Scripts/Exact\_Computation/
### **FKT.py**
An implementation of the FKT algorithm for constructing a Pfaffian orientation of a planar graph.
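The point of a Pfaffian orientation is that perfect matchings can then be counted with a determinant: if $A$ is the skew-symmetric signed adjacency matrix of the orientation, then $\det(A)$ is the square of the number of perfect matchings. A minimal numeric check, assuming a Pfaffian orientation of the 4-cycle is already in hand (this sketch does not reproduce the FKT construction itself):

```python
import numpy as np

# 4-cycle 0-1-2-3-0 with a Pfaffian orientation (the single bounded face
# has an odd number of clockwise edges): 0->1, 1->2, 2->3, 0->3.
A = np.zeros((4, 4))
for u, v in [(0, 1), (1, 2), (2, 3), (0, 3)]:
    A[u, v], A[v, u] = 1, -1

# For a Pfaffian orientation, det(A) = (number of perfect matchings)^2.
print(round(np.linalg.det(A) ** 0.5))  # prints 2; C_4 has two matchings
```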
### **uniform\_matching.py**
An implementation of the uniform sampling method from the paper ["Random generation of combinatorial structures from a uniform distribution"](https://www.sciencedirect.com/science/article/pii/030439758690174X), by Jerrum, Valiant, and Vazirani.
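Jerrum, Valiant, and Vazirani reduce uniform generation to counting via self-reducibility: fix one piece of the structure with probability proportional to the number of completions, then recurse. A minimal sketch of that reduction for perfect matchings, using an exact brute-force counter (so it is only practical for small graphs; the script's actual counting routine may differ):

```python
import random
import networkx as nx

def count_perfect_matchings(G):
    # Brute-force counter: match the smallest vertex to each neighbor
    # in turn and recurse. Exponential time; small graphs only.
    nodes = sorted(G.nodes)
    if not nodes:
        return 1
    u = nodes[0]
    total = 0
    for v in list(G.neighbors(u)):
        H = G.copy()
        H.remove_nodes_from([u, v])
        total += count_perfect_matchings(H)
    return total

def sample_uniform_matching(G):
    # Self-reducible sampling: choose the partner of the smallest vertex
    # with probability proportional to the number of completions.
    # Assumes G has at least one perfect matching.
    G = G.copy()
    matching = []
    while G.number_of_nodes() > 0:
        u = sorted(G.nodes)[0]
        nbrs = list(G.neighbors(u))
        weights = []
        for v in nbrs:
            H = G.copy()
            H.remove_nodes_from([u, v])
            weights.append(count_perfect_matchings(H))
        v = random.choices(nbrs, weights=weights)[0]
        matching.append((u, v))
        G.remove_nodes_from([u, v])
    return matching

print(sample_uniform_matching(nx.cycle_graph(6)))  # one of C_6's two matchings
```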
### **uniform\_cycle\_cover.py**
An implementation of a uniform sampling method for cycle covers of planar bipartite graphs.
### **uniform\_matching\_symbolic.py**
A method for enumerating the $k$-th Stirling numbers of the first kind for trees using a permanent-determinant approach.
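As a point of reference for the symbolic computation, the classical unsigned Stirling numbers of the first kind arise as coefficients of a rising factorial, and the same coefficient-extraction pattern applies once the tree-specific permanent is expanded. A quick sympy illustration of the classical identity (the tree computation itself is in the script):

```python
import sympy as sp

x = sp.symbols("x")
n = 5

# Unsigned Stirling numbers of the first kind c(n, k) are the coefficients
# of x^k in the rising factorial x (x + 1) ... (x + n - 1).
rising = sp.expand(sp.Mul(*[x + i for i in range(n)]))
print([rising.coeff(x, k) for k in range(1, n + 1)])  # [24, 50, 35, 10, 1]
```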
### **spanning\_tree\_metrics.py**
An implementation of an MCMC version of the cycle basis walk on spanning trees with Metropolis-Hastings for interpolating between stars and paths.
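A minimal sketch of such a walk, assuming an illustrative score (the number of leaves, which is $2$ for a path and $n - 1$ for a star; the script's actual weighting may differ): propose a basis exchange, then accept or reject with the Metropolis rule. The proposal is symmetric because the reverse move re-creates the same cycle, so the plain Metropolis ratio suffices.

```python
import math
import random
import networkx as nx

def leaf_count(T):
    # Illustrative statistic: paths have 2 leaves, stars have n - 1.
    return sum(1 for v in T if T.degree(v) == 1)

def mh_tree_step(T, G, beta=1.0):
    # Basis exchange: add a random non-tree edge of G, then delete a
    # random other edge on the unique cycle this creates.
    old_score = leaf_count(T)
    non_tree = [e for e in G.edges if not T.has_edge(*e)]
    u, v = random.choice(non_tree)
    T.add_edge(u, v)
    cycle = [tuple(sorted(e)) for e in nx.find_cycle(T, source=u)]
    a, b = random.choice([e for e in cycle if e != tuple(sorted((u, v)))])
    T.remove_edge(a, b)
    # Metropolis filter on exp(beta * leaf_count): beta > 0 pushes toward
    # stars, beta < 0 toward paths; revert the exchange on rejection.
    if random.random() >= math.exp(beta * (leaf_count(T) - old_score)):
        T.add_edge(a, b)
        T.remove_edge(u, v)

G = nx.complete_graph(12)
T = nx.path_graph(12)  # a spanning tree of G to start from
for _ in range(10_000):
    mh_tree_step(T, G, beta=0.5)
print(leaf_count(T))
```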
## ./Scripts/Probabilistic\_Approach/
### **Probabilistic\_Approach.ipynb**
In this notebook, a random tree $T$ is generated. For this tree, $m$ trials are run: in each trial, a $\\{0,1\\}$-column vector with random entries is generated, and the trial is counted as a "success" if the vector contains exactly $k-1$ ones. Code for generating results for $n \in \\{7, 8, \ldots, 15\\}$ and $m \in \\{10000, 20000\\}$ is included. The notebook also produces average running times for $n \in \\{7, \ldots, 15, 24, 25, 26\\}$ and $m \in \\{10000, 20000\\}$.
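A sketch of the Monte Carlo loop described above; the length of the random vector (here taken to be $n - 1$, one entry per edge of $T$) and the decoupling from the specific tree $T$ are assumptions on our part:

```python
import random

def estimate_success_rate(n, k, m=10_000):
    # Count trials whose random {0,1}-vector has exactly k - 1 ones.
    # ASSUMPTION: vector length n - 1; the notebook's trials are tied to
    # a specific random tree T, which this sketch does not reproduce.
    successes = 0
    for _ in range(m):
        v = [random.randint(0, 1) for _ in range(n - 1)]
        if sum(v) == k - 1:
            successes += 1
    return successes / m

print(estimate_success_rate(n=10, k=5, m=20_000))
```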
### **probabilistic\_approach--average\_running\_times.py**
This Python script computes the average running times for $n \in \\{5, 6, \ldots, 30\\}$ and $m \in \\{10000, 20000\\}$ for the code in the **Probabilistic\_Approach.ipynb** notebook. The running time of this script is long.
### **probabilistic\_approach--uniform\_sampling.py**
This script computes the difference between the exact value of the $k$-th Stirling number and its probabilistic approximation for 100 trees of order $n$ sampled uniformly using Wilson's algorithm, for $n \in \\{7, \ldots, 19\\}$ and $\lceil \frac{n}{2} \rceil \leq k \leq n - 2$, and writes a separate `.csv` file for each $n$.
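Wilson's algorithm samples a uniformly random spanning tree via loop-erased random walks; on the complete graph $K_n$ this yields a uniform labeled tree of order $n$. A self-contained sketch (the script may use a library routine instead):

```python
import random

def wilson_uniform_tree(n):
    # Uniform labeled tree on {0, ..., n-1} via Wilson's algorithm on K_n.
    root = 0
    in_tree = [False] * n
    in_tree[root] = True
    nxt = [None] * n
    for start in range(n):
        # Random-walk from `start` until the walk hits the current tree;
        # overwriting nxt[] along the way erases loops implicitly.
        v = start
        while not in_tree[v]:
            nxt[v] = random.choice([u for u in range(n) if u != v])
            v = nxt[v]
        # Attach the loop-erased path to the tree.
        v = start
        while not in_tree[v]:
            in_tree[v] = True
            v = nxt[v]
    return [(v, nxt[v]) for v in range(n) if v != root]

print(wilson_uniform_tree(8))  # 7 edges of a uniformly random tree
```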
### **analysis--probabilistic\_approach--uniform\_sampling.py**
This script computes the mean, standard deviation, and skewness of the differences computed in **probabilistic\_approach--uniform\_sampling.py** for each $n$ and each $k$.
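A minimal version of that summary step, assuming (hypothetically) one CSV per $n$ with columns `k` and `difference`; the repository's actual file names and schema may differ:

```python
import pandas as pd
from scipy.stats import skew

# Hypothetical schema: one file per n with columns "k" and "difference".
df = pd.read_csv("differences_n_12.csv")
summary = df.groupby("k")["difference"].agg(["mean", "std", skew])
print(summary)
```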
## ./Scripts/Autocorrelation/
### **Uniform\_Trees--Autocorrelation.ipynb**
This notebook generates the autocorrelation plots for uniform sampling from the tree space based on degree, maximum degree, betweenness, and closeness with $100000$ iterations, lags up to $1000000$, $p = 0.01$, $q \in \\{0.2, 0.4, 0.6, 0.8\\}$, and $r \in \\{0.2, 0.4, 0.6, 0.8\\}$, where points are displayed in increments of $50$. The blue ribbons in these plots show the $95\%$ confidence intervals, with standard deviations computed according to Bartlett's formula ([statsmodels.graphics.tsaplots.plot\_acf](https://www.statsmodels.org/dev/generated/statsmodels.graphics.tsaplots.plot_acf.html)).
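The plotting call itself is standard; a sketch with a placeholder series standing in for a chain statistic such as maximum degree (the notebook's actual chains come from the tree sampler):

```python
import numpy as np
import matplotlib.pyplot as plt
from statsmodels.graphics.tsaplots import plot_acf

rng = np.random.default_rng(0)
# Placeholder AR(1) series standing in for a statistic of the chain.
chain = np.empty(100_000)
chain[0] = 0.0
for t in range(1, chain.size):
    chain[t] = 0.99 * chain[t - 1] + rng.standard_normal()

# alpha=0.05 draws the 95% Bartlett confidence ribbon.
plot_acf(chain, lags=1000, alpha=0.05)
plt.show()
```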
### **uniform_trees--autocorrelation--longer_runs.py**
This Python script generates the autocorrelation plots for $p = 0.1$, $q, r \in \\{0.2, 0.4, 0.6, 0.8\\}$, and $10000000$ iterations for the code in the **Uniform\_Trees--Autocorrelation.ipynb** notebook. The running time of this script is very long.
Autocorrelation plots are included for global betweenness centrality and global closeness centrality.
## ./Scripts/Classification\_Single\_Predictor/ and ./Scripts/Classification_all_predictors/
### **Classification--{Betweenness, Closeness, Stirling, All}--{Full_Set, Sampling}.ipynb**
These Jupyter notebooks use statistical learning methods to classify trees into two classes (path-like or star-like) with global betweenness centrality (**Betweenness**), global closeness centrality (**Closeness**), the Stirling numbers of the first kind (**Stirling**), or all three (**All**) as predictors. Two data sets are used: the first (**Full_Set**) consists of all non-isomorphic trees of order 12, and the second (**Sampling**) consists of 500 non-isomorphic trees of order 18 sampled using the `networkx.nonisomorphic_trees` function.
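A condensed sketch of this pipeline, with a stand-in class rule (star-like iff the maximum degree exceeds half the order) and mean vertex centralities as the global predictors; the notebooks' actual labels, features, and models may differ:

```python
import networkx as nx
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

def features_and_label(T):
    # Stand-in global predictors: mean betweenness and mean closeness.
    bet = np.mean(list(nx.betweenness_centrality(T).values()))
    clo = np.mean(list(nx.closeness_centrality(T).values()))
    # Stand-in class rule: star-like iff max degree > n / 2.
    label = int(max(d for _, d in T.degree()) > T.number_of_nodes() / 2)
    return (bet, clo), label

trees = list(nx.nonisomorphic_trees(12))  # all 551 trees of order 12
X, y = zip(*(features_and_label(T) for T in trees))
X_tr, X_te, y_tr, y_te = train_test_split(
    np.array(X), np.array(y), random_state=0)
clf = LogisticRegression().fit(X_tr, y_tr)
print("train:", clf.score(X_tr, y_tr), "test:", clf.score(X_te, y_te))
```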
### **Classification--All--Uniform\_Sampling.ipynb**
This Jupyter notebook uses statistical learning methods to classify trees into two classes (path-like or star-like) with global betweenness centrality, global closeness centrality, and the Stirling numbers of the first kind as predictors. The data set consists of 500 trees of order 18 sampled uniformly from the space of all such trees.
## ./Scripts/Regression_Subset_Predictors/ and ./Scripts/Regression_all_predictors/
### **Regression--{All, Subset}--{Full_Set, Sampling, Uniform\_Sampling}.ipynb** and **Regression--Tree--{All, Subset}--{Full_Set, Sampling, Uniform\_Sampling}.ipynb**
These Jupyter notebooks use statistical learning methods to predict the Stirling numbers of the first kind using $\log_{10}(P(T; 2, 1))$ (the base-10 logarithm of the distinguishing polynomial of $T$ evaluated at $x = 2$ and $y = 1$), global closeness centrality, global betweenness centrality, and class as predictors. Since $\log_{10}(P(T; 2, 1))$ is the main predictor, the subset of predictors that excludes it is considered as well (**Subset**).
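A sketch of the fit-and-score scaffold these notebooks share. The file name and column names below are hypothetical stand-ins (the distinguishing-polynomial feature in particular is computed by the repository's own code, not reproduced here):

```python
import pandas as pd
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import train_test_split

# Hypothetical file and schema; the repo's actual data layout may differ.
df = pd.read_csv("trees_order_18.csv")
X = df[["log10_P21", "closeness", "betweenness", "class"]]
y = df["log_stirling"]

X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)
model = LinearRegression().fit(X_tr, y_tr)
print("train R^2:", model.score(X_tr, y_tr))
print("test  R^2:", model.score(X_te, y_te))
```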
## ./Scripts/
### **Classification--Data_Visualization--R.ipynb** and **Regression--Data_Visualization--R.ipynb**
These Jupyter notebooks, written in R, create plots comparing the training and testing scores of the classification, regression, and tree-based regression methods used in the preceding notebooks.
#### Classification
Score-comparison plots for three data sets, each with all four predictors: all trees of order 12; a sample of trees of order 18 drawn with `nx.nonisomorphic_trees`; and a sample of trees of order 18 sampled uniformly.
#### Regression
Score-comparison plots for the same three data sets, once with all four predictors and once with three of the predictors.
#### Tree-Based
Score-comparison plots for the same three data sets, once with all four predictors and once with three of the predictors.
### **tree\_functions\_2.py**
This file contains all of the functions defined by the authors and used in the notebooks above.