Ecosyste.ms: Awesome
An open API service indexing awesome lists of open source software.

https://github.com/drdeford/computational_experiments_on_stirling_numbers_of_uniform_trees
Python code for sampling and evaluating cycle covers of trees
- Host: GitHub
- URL: https://github.com/drdeford/computational_experiments_on_stirling_numbers_of_uniform_trees
- Owner: drdeford
- Created: 2019-06-12T19:47:00.000Z (over 5 years ago)
- Default Branch: master
- Last Pushed: 2023-03-29T22:43:06.000Z (almost 2 years ago)
- Last Synced: 2023-05-11T09:23:30.522Z (over 1 year ago)
- Language: Jupyter Notebook
- Size: 9.62 MB
- Stars: 0
- Watchers: 2
- Forks: 0
- Open Issues: 0
Metadata Files:
- Readme: README.md
README
# Stirling Numbers of Uniform Trees and Related Computational Experiments
Python code for sampling and evaluating cycle covers of trees. The sections below provide brief descriptions of the code inside each script or notebook in each subfolder of Scripts.
## ./Scripts/Exact\_Computation/
### **FKT.py**
An implementation of the FKT algorithm for constructing a Pfaffian orientation of a planar graph.
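The point of a Pfaffian orientation is that perfect matchings can then be counted with a determinant: if $A$ is the skew-symmetric signed adjacency matrix of the orientation, then $\det(A)$ is the square of the number of perfect matchings. A minimal numeric check, assuming a Pfaffian orientation of the 4-cycle is already in hand (this sketch does not reproduce the FKT construction itself):

```python
import numpy as np

# 4-cycle 0-1-2-3-0 with a Pfaffian orientation (the single bounded face
# has an odd number of clockwise edges): 0->1, 1->2, 2->3, 0->3.
A = np.zeros((4, 4))
for u, v in [(0, 1), (1, 2), (2, 3), (0, 3)]:
    A[u, v], A[v, u] = 1, -1

# For a Pfaffian orientation, det(A) = (number of perfect matchings)^2.
print(round(np.linalg.det(A) ** 0.5))  # prints 2; C_4 has two matchings
```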
### **uniform\_matching.py**
An implementation of the uniform sampling method from the paper ["Random generation of combinatorial structures from a uniform distribution"](https://www.sciencedirect.com/science/article/pii/030439758690174X), by Jerrum, Valiant, and Vazirani.
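Jerrum, Valiant, and Vazirani reduce uniform generation to counting via self-reducibility: fix one piece of the structure with probability proportional to the number of completions, then recurse. A minimal sketch of that reduction for perfect matchings, using an exact brute-force counter (so it is only practical for small graphs; the script's actual counting routine may differ):

```python
import random
import networkx as nx

def count_perfect_matchings(G):
    # Brute-force counter: match the smallest vertex to each neighbor
    # in turn and recurse. Exponential time; small graphs only.
    nodes = sorted(G.nodes)
    if not nodes:
        return 1
    u = nodes[0]
    total = 0
    for v in list(G.neighbors(u)):
        H = G.copy()
        H.remove_nodes_from([u, v])
        total += count_perfect_matchings(H)
    return total

def sample_uniform_matching(G):
    # Self-reducible sampling: choose the partner of the smallest vertex
    # with probability proportional to the number of completions.
    # Assumes G has at least one perfect matching.
    G = G.copy()
    matching = []
    while G.number_of_nodes() > 0:
        u = sorted(G.nodes)[0]
        nbrs = list(G.neighbors(u))
        weights = []
        for v in nbrs:
            H = G.copy()
            H.remove_nodes_from([u, v])
            weights.append(count_perfect_matchings(H))
        v = random.choices(nbrs, weights=weights)[0]
        matching.append((u, v))
        G.remove_nodes_from([u, v])
    return matching

print(sample_uniform_matching(nx.cycle_graph(6)))  # one of C_6's two matchings
```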
### **uniform\_cycle\_cover.py**
An implementation of a uniform sampling method for cycle covers of planar bipartite graphs.
### **uniform\_matching\_symbolic.py**
A method for enumerating the $k$-th Stirling numbers of the first kind for trees using a permanent-determinant approach.
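As a point of reference for the symbolic computation, the classical unsigned Stirling numbers of the first kind arise as coefficients of a rising factorial, and the same coefficient-extraction pattern applies once the tree-specific permanent is expanded. A quick sympy illustration of the classical identity (the tree computation itself is in the script):

```python
import sympy as sp

x = sp.symbols("x")
n = 5

# Unsigned Stirling numbers of the first kind c(n, k) are the coefficients
# of x^k in the rising factorial x (x + 1) ... (x + n - 1).
rising = sp.expand(sp.Mul(*[x + i for i in range(n)]))
print([rising.coeff(x, k) for k in range(1, n + 1)])  # [24, 50, 35, 10, 1]
```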
### **spanning\_tree\_metrics.py**
An implementation of an MCMC version of the cycle basis walk on spanning trees with Metropolis-Hastings for interpolating between stars and paths.
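A minimal sketch of such a walk, assuming an illustrative score (the number of leaves, which is $2$ for a path and $n - 1$ for a star; the script's actual weighting may differ): propose a basis exchange, then accept or reject with the Metropolis rule. The proposal is symmetric because the reverse move re-creates the same cycle, so the plain Metropolis ratio suffices.

```python
import math
import random
import networkx as nx

def leaf_count(T):
    # Illustrative statistic: paths have 2 leaves, stars have n - 1.
    return sum(1 for v in T if T.degree(v) == 1)

def mh_tree_step(T, G, beta=1.0):
    # Basis exchange: add a random non-tree edge of G, then delete a
    # random other edge on the unique cycle this creates.
    old_score = leaf_count(T)
    non_tree = [e for e in G.edges if not T.has_edge(*e)]
    u, v = random.choice(non_tree)
    T.add_edge(u, v)
    cycle = [tuple(sorted(e)) for e in nx.find_cycle(T, source=u)]
    a, b = random.choice([e for e in cycle if e != tuple(sorted((u, v)))])
    T.remove_edge(a, b)
    # Metropolis filter on exp(beta * leaf_count): beta > 0 pushes toward
    # stars, beta < 0 toward paths; revert the exchange on rejection.
    if random.random() >= math.exp(beta * (leaf_count(T) - old_score)):
        T.add_edge(a, b)
        T.remove_edge(u, v)

G = nx.complete_graph(12)
T = nx.path_graph(12)  # a spanning tree of G to start from
for _ in range(10_000):
    mh_tree_step(T, G, beta=0.5)
print(leaf_count(T))
```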
## ./Scripts/Probabilistic\_Approach/
### **Probabilistic\_Approach.ipynb**
In this notebook, a random tree $T$ is generated. For this tree, $m$ trials are run: in each trial, a $\\{0,1\\}$-column vector with random entries is generated, and the trial is counted as a "success" if the vector contains exactly $k-1$ ones. Code for generating results for $n \in \\{7, 8, \ldots, 15\\}$ and $m \in \\{10000, 20000\\}$ is included. The notebook also produces average running times for $n \in \\{7, \ldots, 15, 24, 25, 26\\}$ and $m \in \\{10000, 20000\\}$.
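A sketch of the Monte Carlo loop described above; the length of the random vector (here taken to be $n - 1$, one entry per edge of $T$) and the decoupling from the specific tree $T$ are assumptions on our part:

```python
import random

def estimate_success_rate(n, k, m=10_000):
    # Count trials whose random {0,1}-vector has exactly k - 1 ones.
    # ASSUMPTION: vector length n - 1; the notebook's trials are tied to
    # a specific random tree T, which this sketch does not reproduce.
    successes = 0
    for _ in range(m):
        v = [random.randint(0, 1) for _ in range(n - 1)]
        if sum(v) == k - 1:
            successes += 1
    return successes / m

print(estimate_success_rate(n=10, k=5, m=20_000))
```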
### **probabilistic\_approach--average\_running\_times.py**
This Python script computes the average running times for $n \in \\{5, 6, \ldots, 30\\}$ and $m \in \\{10000, 20000\\}$ for the code in the **Probabilistic\_Approach.ipynb** notebook. The running time of this script is long.
### **probabilistic\_approach--uniform\_sampling.py**
This script computes the difference between the exact value of the $k$-th Stirling number and its probabilistic approximation for 100 trees of order $n$ sampled uniformly using Wilson's algorithm, for $n \in \\{7, \ldots, 19\\}$ and $\lceil \frac{n}{2} \rceil \leq k \leq n - 2$, and writes a separate `.csv` file for each $n$.
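Wilson's algorithm samples a uniformly random spanning tree via loop-erased random walks; on the complete graph $K_n$ this yields a uniform labeled tree of order $n$. A self-contained sketch (the script may use a library routine instead):

```python
import random

def wilson_uniform_tree(n):
    # Uniform labeled tree on {0, ..., n-1} via Wilson's algorithm on K_n.
    root = 0
    in_tree = [False] * n
    in_tree[root] = True
    nxt = [None] * n
    for start in range(n):
        # Random-walk from `start` until the walk hits the current tree;
        # overwriting nxt[] along the way erases loops implicitly.
        v = start
        while not in_tree[v]:
            nxt[v] = random.choice([u for u in range(n) if u != v])
            v = nxt[v]
        # Attach the loop-erased path to the tree.
        v = start
        while not in_tree[v]:
            in_tree[v] = True
            v = nxt[v]
    return [(v, nxt[v]) for v in range(n) if v != root]

print(wilson_uniform_tree(8))  # 7 edges of a uniformly random tree
```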
### **analysis--probabilistic\_approach--uniform\_sampling.py**
This script computes the mean, standard deviation, and skewness of the differences computed in **probabilistic\_approach--uniform\_sampling.py** for each $n$ and each $k$.
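A minimal version of that summary step, assuming (hypothetically) one CSV per $n$ with columns `k` and `difference`; the repository's actual file names and schema may differ:

```python
import pandas as pd
from scipy.stats import skew

# Hypothetical schema: one file per n with columns "k" and "difference".
df = pd.read_csv("differences_n_12.csv")
summary = df.groupby("k")["difference"].agg(["mean", "std", skew])
print(summary)
```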
## ./Scripts/Autocorrelation/
### **Uniform\_Trees--Autocorrelation.ipynb**
This notebook generates the autocorrelation plots for uniform sampling from the tree space based on degree, maximum degree, betweenness, and closeness with $100000$ iterations, lags up to $1000000$, $p = 0.01$, $q \in \\{0.2, 0.4, 0.6, 0.8\\}$, and $r \in \\{0.2, 0.4, 0.6, 0.8\\}$, where points are displayed in increments of $50$. The blue ribbons in these plots show the $95\%$ confidence intervals, with standard deviations computed according to Bartlett's formula ([statsmodels.graphics.tsaplots.plot\_acf](https://www.statsmodels.org/dev/generated/statsmodels.graphics.tsaplots.plot_acf.html)).
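The plotting call itself is standard; a sketch with a placeholder series standing in for a chain statistic such as maximum degree (the notebook's actual chains come from the tree sampler):

```python
import numpy as np
import matplotlib.pyplot as plt
from statsmodels.graphics.tsaplots import plot_acf

rng = np.random.default_rng(0)
# Placeholder AR(1) series standing in for a statistic of the chain.
chain = np.empty(100_000)
chain[0] = 0.0
for t in range(1, chain.size):
    chain[t] = 0.99 * chain[t - 1] + rng.standard_normal()

# alpha=0.05 draws the 95% Bartlett confidence ribbon.
plot_acf(chain, lags=1000, alpha=0.05)
plt.show()
```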
### **uniform_trees--autocorrelation--longer_runs.py**
This Python script generates the autocorrelation plots for $p = 0.1$, $q, r \in \\{0.2, 0.4, 0.6, 0.8\\}$, and $10000000$ iterations for the code in the **Uniform\_Trees--Autocorrelation.ipynb** notebook. The running time of this script is very long.
Autocorrelation plots are included for global betweenness centrality and global closeness centrality.
## ./Scripts/Classification\_Single\_Predictor/ and ./Scripts/Classification_all_predictors/
### **Classification--{Betweenness, Closeness, Stirling, All}--{Full_Set, Sampling}.ipynb**
These Jupyter notebooks use statistical learning methods to classify trees into two classes (path-like or star-like) with global betweenness centrality (**Betweenness**), global closeness centrality (**Closeness**), the Stirling numbers of the first kind (**Stirling**), or all three (**All**) as predictors. Two data sets are used: the first (**Full_Set**) consists of all non-isomorphic trees of order 12, and the second (**Sampling**) consists of 500 non-isomorphic trees of order 18 sampled using the `networkx.nonisomorphic_trees` function.
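A condensed sketch of this pipeline, with a stand-in class rule (star-like iff the maximum degree exceeds half the order) and mean vertex centralities as the global predictors; the notebooks' actual labels, features, and models may differ:

```python
import networkx as nx
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

def features_and_label(T):
    # Stand-in global predictors: mean betweenness and mean closeness.
    bet = np.mean(list(nx.betweenness_centrality(T).values()))
    clo = np.mean(list(nx.closeness_centrality(T).values()))
    # Stand-in class rule: star-like iff max degree > n / 2.
    label = int(max(d for _, d in T.degree()) > T.number_of_nodes() / 2)
    return (bet, clo), label

trees = list(nx.nonisomorphic_trees(12))  # all 551 trees of order 12
X, y = zip(*(features_and_label(T) for T in trees))
X_tr, X_te, y_tr, y_te = train_test_split(
    np.array(X), np.array(y), random_state=0)
clf = LogisticRegression().fit(X_tr, y_tr)
print("train:", clf.score(X_tr, y_tr), "test:", clf.score(X_te, y_te))
```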
### **Classification--All--Uniform\_Sampling.ipynb**
This Jupyter notebook uses statistical learning methods to classify trees into two classes (path-like or star-like) with global betweenness centrality, global closeness centrality, and the Stirling numbers of the first kind as predictors. The data set consists of 500 trees of order 18 sampled uniformly from the space of all such trees.
## ./Scripts/Regression_Subset_Predictors/ and ./Scripts/Regression_all_predictors/
### **Regression--{All, Subset}--{Full_Set, Sampling, Uniform\_Sampling}.ipynb** and **Regression--Tree--{All, Subset}--{Full_Set, Sampling, Uniform\_Sampling}.ipynb**
These Jupyter notebooks use statistical learning methods to predict the Stirling numbers of the first kind using $\log_{10}(P(T; 2, 1))$ (the base-10 logarithm of the distinguishing polynomial of $T$ evaluated at $x = 2$ and $y = 1$), global closeness centrality, global betweenness centrality, and class as predictors. Since $\log_{10}(P(T; 2, 1))$ is the main predictor, the subset of predictors that excludes it is considered as well (**Subset**).
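A sketch of the fit-and-score scaffold these notebooks share. The file name and column names below are hypothetical stand-ins (the distinguishing-polynomial feature in particular is computed by the repository's own code, not reproduced here):

```python
import pandas as pd
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import train_test_split

# Hypothetical file and schema; the repo's actual data layout may differ.
df = pd.read_csv("trees_order_18.csv")
X = df[["log10_P21", "closeness", "betweenness", "class"]]
y = df["log_stirling"]

X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)
model = LinearRegression().fit(X_tr, y_tr)
print("train R^2:", model.score(X_tr, y_tr))
print("test  R^2:", model.score(X_te, y_te))
```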
## ./Scripts/
### **Classification--Data_Visualization--R.ipynb** and **Regression--Data_Visualization--R.ipynb**
These Jupyter notebooks, written in R, create plots comparing the training and testing scores of the classification, regression, and tree-based regression methods used in the preceding notebooks.
#### Classification
Score-comparison plots for three data sets, each with all four predictors: all trees of order 12; a sample of trees of order 18 drawn with `nx.nonisomorphic_trees`; and a sample of trees of order 18 sampled uniformly.
#### Regression
Score-comparison plots for the same three data sets, once with all four predictors and once with three of the predictors.
#### Tree-Based
Score-comparison plots for the same three data sets, once with all four predictors and once with three of the predictors.
### **tree\_functions\_2.py**
This file contains all of the functions defined by the authors and used in the notebooks above.