https://github.com/stan-dev/mathematicastan
A Mathematica package to interact with CmdStan
- Host: GitHub
- URL: https://github.com/stan-dev/mathematicastan
- Owner: stan-dev
- License: gpl-3.0
- Created: 2016-08-26T09:59:09.000Z (over 9 years ago)
- Default Branch: master
- Last Pushed: 2024-08-15T11:48:10.000Z (over 1 year ago)
- Last Synced: 2024-10-29T14:21:57.070Z (about 1 year ago)
- Topics: interface, mathematica, stan
- Language: Mathematica
- Size: 1.32 MB
- Stars: 27
- Watchers: 29
- Forks: 8
- Open Issues: 1
Metadata Files:
- Readme: README.org
- Funding: .github/FUNDING.yml
- License: LICENSE
#+OPTIONS: toc:nil todo:nil pri:nil tags:nil ^:nil tex:t
#+TITLE: MathematicaStan v2.2
#+SUBTITLE: A Mathematica (v11+) package to interact with CmdStan
#+AUTHOR: Picaud Vincent
[[https://zenodo.org/doi/10.5281/zenodo.10810144][file:https://zenodo.org/badge/66637604.svg]]
* Table of contents :TOC_3:noexport:
- [[#introduction][Introduction]]
- [[#news][News]]
- [[#2024-08-13][2024-08-13]]
- [[#2020-12-21][2020-12-21]]
- [[#2019-06-28][2019-06-28]]
- [[#installation][Installation]]
- [[#the-stan-cmdstan-shell-interface][The Stan CmdStan shell interface]]
- [[#the-mathematica-cmdstan-package][The Mathematica CmdStan package]]
- [[#first-run][First run]]
- [[#tutorial-1-linear-regression][Tutorial 1, linear regression]]
- [[#introduction-1][Introduction]]
- [[#stan-code][Stan code]]
- [[#code-compilation][Code compilation]]
- [[#simulated-data][Simulated data]]
- [[#create-the-datar-data-file][Create the =data.R= data file]]
- [[#run-stan-likelihood-maximization][Run Stan, likelihood maximization]]
- [[#load-the-csv-result-file][Load the CSV result file]]
- [[#run-stan-variational-bayes][Run Stan, Variational Bayes]]
- [[#more-about-option-management][More about Option management]]
- [[#overwriting-default-values][Overwriting default values]]
- [[#reading-customized-values][Reading customized values]]
- [[#erasing-customized-option-values][Erasing customized option values]]
- [[#tutorial-2-linear-regression-with-more-than-one-predictor][Tutorial 2, linear regression with more than one predictor]]
- [[#parameter-arrays][Parameter arrays]]
- [[#simulated-data-1][Simulated data]]
- [[#exporting-data][Exporting data]]
- [[#run-stan-hmc-sampling][Run Stan, HMC sampling]]
- [[#load-the-csv-result-file-1][Load the CSV result file]]
- [[#unit-tests][Unit tests]]
* Introduction
*MathematicaStan* is a package to interact with [[http://mc-stan.org/interfaces/cmdstan][CmdStan]] from
Mathematica.
It is developed under *Linux* and is compatible with *Mathematica v11+*.
It should also work under *MacOS* and *Windows*.
*Author & contact:* picaud.vincent at gmail.com
** News
*** 2024-08-13
*New MathematicaStan version 2.2!*
*Package tested with the latest CmdStan v2.35.0, Mathematica 11.2, Linux*
- Added some screenshots to the install procedure section
- CmdStan syntax changes have been taken into account:
|----------------------------+-----------------------------------|
| old | current |
|----------------------------+-----------------------------------|
| <- | = |
| increment_log_prob(...) | target += ... |
| int y[N]; | array[N] int y; |
|----------------------------+-----------------------------------|
- Checked that unit tests and examples work.
*** 2020-12-21
*New MathematicaStan version 2.1!*
This version has been fixed and should now run under Windows.
I would like to thank *Ali Ghaderi*, who had the patience to help me
debug the Windows version (I do not have access to this OS). Nothing
would have been possible without him. All possibly remaining bugs are
mine.
As a reminder, also note that one should not use paths or filenames
with spaces (=Make= really does not like that). This advice also holds
under Linux and MacOS. See [[https://stackoverflow.com/questions/9838384/can-gnu-make-handle-filenames-with-spaces][SO:can-gnu-make-handle-filenames-with-spaces]]
for example.
*** 2019-06-28
*New MathematicaStan version 2.0!*
This version uses Mathematica v11 and has been completely refactored.
*Caveat:* breaking changes!
*Note*: the "old" MathematicaStan version based on Mathematica v8.0 is now archived in
the [[https://github.com/stan-dev/MathematicaStan/tree/v1][v1 git branch]].
* Installation
** The Stan CmdStan shell interface
First you must install [[http://mc-stan.org/interfaces/cmdstan][CmdStan]]. Once this is done you get a directory containing files such as:
#+BEGIN_EXAMPLE
bin doc examples Jenkinsfile LICENSE make makefile README.md runCmdStanTests.py src stan test-all.sh
#+END_EXAMPLE
With my configuration *CmdStan* is installed in:
#+BEGIN_EXAMPLE
~/ExternalSoftware/cmdstan-2.35.0
#+END_EXAMPLE
For Windows users it is possibly something like:
#+BEGIN_EXAMPLE
C:\\Users\\USER_NAME\\Documents\\R\\cmdstan-?.??.?
#+END_EXAMPLE
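To check that the CmdStan install works before touching Mathematica, you can build and run the bundled Bernoulli example from the CmdStan directory (standard CmdStan commands; adapt the path to your own install):
#+BEGIN_EXAMPLE
cd ~/ExternalSoftware/cmdstan-2.35.0
make examples/bernoulli/bernoulli
./examples/bernoulli/bernoulli sample data file=examples/bernoulli/bernoulli.data.json
#+END_EXAMPLE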
** The Mathematica CmdStan package
To install the Mathematica CmdStan package:
- open the =CmdStan.m= file with Mathematica.
- install it using the Mathematica Notebook *File->Install* menu.
Fill in the pop-up windows as follows:
[[file:figures/install.png]]
** First run
The first time you use the package, import it:
#+BEGIN_SRC mathematica :eval never
<<CmdStan`
#+END_SRC
** Simulated data
Let's simulate some data:
#+BEGIN_SRC mathematica :eval never
σ = 3; α = 1; β = 2;
n = 20;
X = Range[n];
Y = α + β*X + RandomVariate[NormalDistribution[0, σ], n];
Show[Plot[α + β*x, {x, Min[X], Max[X]}],
ListPlot[Transpose@{X, Y}, PlotStyle -> Red]]
#+END_SRC
[[file:figures/linRegData.png][file:./figures/linRegData.png]]
** Create the =data.R= data file
The data are stored in a =Association= and then exported thanks to the
=ExportStanData= function.
#+BEGIN_SRC mathematica :eval never
stanData = <|"N" -> n, "x" -> X, "y" -> Y|>;
stanDataFile = ExportStanData[stanExeFile, stanData]
#+END_SRC
#+BEGIN_EXAMPLE
/tmp/linear_regression.data.R
#+END_EXAMPLE
*Note:* this function returns the created file
name =/tmp/linear_regression.data.R=. Its first argument, =stanExeFile=
is simply the Stan executable file name with its path. The
=ExportStanData[]= function modifies the file name extension and
replace it with ".data.R", but you can use it with
any file name:
#+BEGIN_SRC mathematica :eval never
ExportStanData["my_custom_path/my_custom_filename.data.R",stanData]
#+END_SRC
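For reference, the generated =.data.R= file uses the classic Rdump text format, one assignment per variable; for this example it should look roughly like the following (the =y= values depend on your random draw, elided here with "..."):
#+BEGIN_EXAMPLE
N <- 20
x <- c(1, 2, 3, ..., 20)
y <- c(4.1, 6.8, ..., 42.3)
#+END_EXAMPLE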
** Run Stan, likelihood maximization
We are now able to run the =stanExeFile= executable.
Let's start by maximizing the likelihood
#+BEGIN_SRC mathematica :eval never
stanResultFile = RunStan[stanExeFile, OptimizeDefaultOptions]
#+END_SRC
#+BEGIN_EXAMPLE
Running: /tmp/linear_regression method=optimize data file=/tmp/linear_regression.data.R output file=/tmp/linear_regression.csv
method = optimize
optimize
algorithm = lbfgs (Default)
lbfgs
init_alpha = 0.001 (Default)
tol_obj = 9.9999999999999998e-13 (Default)
tol_rel_obj = 10000 (Default)
tol_grad = 1e-08 (Default)
tol_rel_grad = 10000000 (Default)
tol_param = 1e-08 (Default)
history_size = 5 (Default)
iter = 2000 (Default)
save_iterations = 0 (Default)
id = 0 (Default)
data
file = /tmp/linear_regression.data.R
init = 2 (Default)
random
seed = 2775739062
output
file = /tmp/linear_regression.csv
diagnostic_file = (Default)
refresh = 100 (Default)
Initial log joint probability = -8459.75
Iter log prob ||dx|| ||grad|| alpha alpha0 # evals Notes
19 -32.5116 0.00318011 0.00121546 0.9563 0.9563 52
Optimization terminated normally:
Convergence detected: relative gradient magnitude is below tolerance
#+END_EXAMPLE
The =stanResultFile= variable now contains the CSV result file name:
#+BEGIN_EXAMPLE
/tmp/linear_regression.csv
#+END_EXAMPLE
*Note:* again, if you do not want any printed output, use the =StanVerbose->False= option.
#+BEGIN_SRC mathematica :eval never
stanResultFile = RunStan[stanExeFile, OptimizeDefaultOptions,StanVerbose->False]
#+END_SRC
*Note:* the method we use is defined by the second argument,
=OptimizeDefaultOptions=. If you want to use Variational Bayes or HMC
sampling you must use
#+BEGIN_SRC mathematica :eval never
RunStan[stanExeFile, VariationalDefaultOptions]
#+END_SRC
or
#+BEGIN_SRC mathematica :eval never
RunStan[stanExeFile, SampleDefaultOptions]
#+END_SRC
*Note*: option management will be detailed later in this tutorial.
** Load the CSV result file
To load the CSV result file, do
#+BEGIN_SRC mathematica :eval never
stanResult = ImportStanResult[stanResultFile]
#+END_SRC
which prints
#+BEGIN_EXAMPLE
file: /tmp/linear_regression.csv
meta: lp__
parameter: alpha , beta , sigma
#+END_EXAMPLE
To access the estimated variables α, β and σ, simply do:
#+BEGIN_SRC mathematica :eval never
GetStanResultMeta[stanResult, "lp__"]
αe=GetStanResult[stanResult, "alpha"]
βe=GetStanResult[stanResult, "beta"]
σe=GetStanResult[stanResult, "sigma"]
#+END_SRC
which prints:
#+BEGIN_EXAMPLE
{-32.5116}
{2.51749}
{1.83654}
{3.08191}
#+END_EXAMPLE
*Note*: as likelihood maximization only provides a point estimate,
the returned values are lists of *one* number.
You can plot the estimated line:
#+BEGIN_SRC mathematica :eval never
Show[Plot[{αe + βe*x, α + β*x}, {x, Min[X],Max[X]}, PlotLegends -> "Expressions"],
ListPlot[Transpose@{X, Y}, PlotStyle -> Red]]
#+END_SRC
[[file:./figures/linRegEstimate.png]]
** Run Stan, Variational Bayes
We want to solve the same problem but using variational inference.
As explained before we must use
#+BEGIN_SRC mathematica :eval never
stanResultFile = RunStan[stanExeFile, VariationalDefaultOptions]
#+END_SRC
instead of
#+BEGIN_SRC mathematica :eval never
stanResultFile = RunStan[stanExeFile, OptimizeDefaultOptions]
#+END_SRC
However, please note that running this command will overwrite
=stanResultFile=, the file where results are exported. To avoid
this we can change the output file name by modifying option values.
The default option values are stored in the write-protected
=VariationalDefaultOptions= variable.
To modify them we must first copy this protected symbol:
#+BEGIN_SRC mathematica :eval never
myOpt=VariationalDefaultOptions
#+END_SRC
which prints
#+BEGIN_EXAMPLE
method=variational
#+END_EXAMPLE
The option values are printed when you run the =RunStan= command:
#+BEGIN_EXAMPLE
method = variational
variational
algorithm = meanfield (Default)
meanfield
iter = 10000 (Default)
grad_samples = 1 (Default)
elbo_samples = 100 (Default)
eta = 1 (Default)
adapt
engaged = 1 (Default)
iter = 50 (Default)
tol_rel_obj = 0.01 (Default)
eval_elbo = 100 (Default)
output_samples = 1000 (Default)
id = 0 (Default)
data
file = (Default)
init = 2 (Default)
random
seed = 2784129612
output
file = output.csv (Default)
diagnostic_file = (Default)
refresh = 100 (Default)
#+END_EXAMPLE
We have to modify the =output file= option value. This can be done by:
#+BEGIN_SRC mathematica :eval never
myOpt = SetStanOption[myOpt, "output.file", FileNameJoin[{Directory[], "myOutputFile.csv"}]]
#+END_SRC
which prints:
#+BEGIN_EXAMPLE
method=variational output file=/tmp/myOutputFile.csv
#+END_EXAMPLE
Now we can run Stan:
#+BEGIN_SRC mathematica :eval never
myOutputFile=RunStan[stanExeFile, myOpt, StanVerbose -> False]
#+END_SRC
which should print:
#+BEGIN_EXAMPLE
/tmp/myOutputFile.csv
#+END_EXAMPLE
Now import this CSV file:
#+BEGIN_SRC mathematica :eval never
myResult = ImportStanResult[myOutputFile]
#+END_SRC
which prints:
#+BEGIN_EXAMPLE
file: /tmp/myOutputFile.csv
meta: lp__ , log_p__ , log_g__
parameter: alpha , beta , sigma
#+END_EXAMPLE
As before you can use:
#+BEGIN_SRC mathematica :eval never
GetStanResult[myResult,"alpha"]
#+END_SRC
to get the =alpha= parameter value, but now you will get a list of 1000 samples:
#+BEGIN_EXAMPLE
{2.03816, 0.90637, ..., ..., 1.22068, 1.66392}
#+END_EXAMPLE
Instead of the full sample list we are often interested in the sample
mean, variance, etc. You can get these quantities as follows:
#+BEGIN_SRC mathematica :eval never
GetStanResult[Mean, myResult, "alpha"]
GetStanResult[Variance, myResult, "alpha"]
#+END_SRC
which prints:
#+BEGIN_EXAMPLE
2.0353
0.317084
#+END_EXAMPLE
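The first argument is not limited to =Mean= and =Variance=; assuming =GetStanResult= accepts an arbitrary summary function there (the pure function below is such an assumption), you can for instance compute a 90% credible interval from the posterior quantiles:
#+BEGIN_SRC mathematica :eval never
(* 5% and 95% posterior quantiles of alpha *)
GetStanResult[Quantile[#, {0.05, 0.95}] &, myResult, "alpha"]
#+END_SRC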
You can also get the sample histogram as simply as:
#+BEGIN_SRC mathematica :eval never
GetStanResult[Histogram, myResult, "alpha"]
#+END_SRC
[[file:figures/linRegHisto.png][file:./figures/linRegHisto.png]]
** More about Option management
*** Overwriting default values
We now provide further details concerning the option-related functions.
To recap, the first step is to copy the write-protected default option
values. For example, to modify default MCMC option values the first
step is:
#+BEGIN_SRC mathematica :eval never
myOpt = SampleDefaultOptions
#+END_SRC
The available options are:
#+begin_example
method = sample (Default)
sample
num_samples = 1000 (Default)
num_warmup = 1000 (Default)
save_warmup = 0 (Default)
thin = 1 (Default)
adapt
engaged = 1 (Default)
gamma = 0.050000000000000003 (Default)
delta = 0.80000000000000004 (Default)
kappa = 0.75 (Default)
t0 = 10 (Default)
init_buffer = 75 (Default)
term_buffer = 50 (Default)
window = 25 (Default)
algorithm = hmc (Default)
hmc
engine = nuts (Default)
nuts
max_depth = 10 (Default)
metric = diag_e (Default)
metric_file = (Default)
stepsize = 1 (Default)
stepsize_jitter = 0 (Default)
id = 0 (Default)
data
file = /tmp/linear_regression.data.R
init = 2 (Default)
random
seed = 3714706817 (Default)
output
file = /tmp/linear_regression.csv
diagnostic_file = (Default)
refresh = 100 (Default)
sig_figs = -1 (Default)
#+end_example
If we want to modify:
#+begin_example
method = sample (Default)
sample
num_samples = 1000 (Default)
num_warmup = 1000 (Default)
#+end_example
and
#+begin_example
method = sample (Default)
sample
algorithm = hmc (Default)
hmc
engine = nuts (Default)
nuts
max_depth = 10 (Default)
#+end_example
you must proceed as follows: for each hierarchy level use "." as a
separator, and when a level selects a value (such as =algorithm=hmc=),
keep the "=" together with its value. With our example this gives:
#+BEGIN_SRC mathematica :eval never
myOpt = SetStanOption[myOpt, "adapt.num_samples", 2000]
myOpt = SetStanOption[myOpt, "adapt.num_warmup", 1500]
myOpt = SetStanOption[myOpt, "algorithm=hmc.engine=nuts.max_depth", 5]
#+END_SRC
Now you can run the sampler with these new option values:
#+BEGIN_SRC mathematica :eval never
stanResultFile = RunStan[stanExeFile, myOpt]
#+END_SRC
which should print:
#+begin_example
method = sample (Default)
sample
num_samples = 2000
num_warmup = 1500
save_warmup = 0 (Default)
thin = 1 (Default)
adapt
engaged = 1 (Default)
gamma = 0.050000000000000003 (Default)
delta = 0.80000000000000004 (Default)
kappa = 0.75 (Default)
t0 = 10 (Default)
init_buffer = 75 (Default)
term_buffer = 50 (Default)
window = 25 (Default)
algorithm = hmc (Default)
hmc
engine = nuts (Default)
nuts
max_depth = 5
metric = diag_e (Default)
metric_file = (Default)
stepsize = 1 (Default)
stepsize_jitter = 0 (Default)
id = 0 (Default)
data
file = /tmp/linear_regression.data.R
init = 2 (Default)
random
seed = 3720771451 (Default)
output
file = /tmp/linear_regression.csv
diagnostic_file = (Default)
refresh = 100 (Default)
sig_figs = -1 (Default)
stanc_version = stanc3 b25c0b64
stancflags =
Gradient evaluation took 1.3e-05 seconds
1000 transitions using 10 leapfrog steps per transition would take 0.13 seconds.
Adjust your expectations accordingly!
Iteration: 1 / 3500 [ 0%] (Warmup)
Iteration: 100 / 3500 [ 2%] (Warmup)
Iteration: 200 / 3500 [ 5%] (Warmup)
Iteration: 300 / 3500 [ 8%] (Warmup)
Iteration: 400 / 3500 [ 11%] (Warmup)
Iteration: 500 / 3500 [ 14%] (Warmup)
Iteration: 600 / 3500 [ 17%] (Warmup)
Iteration: 700 / 3500 [ 20%] (Warmup)
Iteration: 800 / 3500 [ 22%] (Warmup)
Iteration: 900 / 3500 [ 25%] (Warmup)
Iteration: 1000 / 3500 [ 28%] (Warmup)
Iteration: 1100 / 3500 [ 31%] (Warmup)
Iteration: 1200 / 3500 [ 34%] (Warmup)
Iteration: 1300 / 3500 [ 37%] (Warmup)
Iteration: 1400 / 3500 [ 40%] (Warmup)
Iteration: 1500 / 3500 [ 42%] (Warmup)
Iteration: 1501 / 3500 [ 42%] (Sampling)
Iteration: 1600 / 3500 [ 45%] (Sampling)
Iteration: 1700 / 3500 [ 48%] (Sampling)
Iteration: 1800 / 3500 [ 51%] (Sampling)
Iteration: 1900 / 3500 [ 54%] (Sampling)
Iteration: 2000 / 3500 [ 57%] (Sampling)
Iteration: 2100 / 3500 [ 60%] (Sampling)
Iteration: 2200 / 3500 [ 62%] (Sampling)
Iteration: 2300 / 3500 [ 65%] (Sampling)
Iteration: 2400 / 3500 [ 68%] (Sampling)
Iteration: 2500 / 3500 [ 71%] (Sampling)
Iteration: 2600 / 3500 [ 74%] (Sampling)
Iteration: 2700 / 3500 [ 77%] (Sampling)
Iteration: 2800 / 3500 [ 80%] (Sampling)
Iteration: 2900 / 3500 [ 82%] (Sampling)
Iteration: 3000 / 3500 [ 85%] (Sampling)
Iteration: 3100 / 3500 [ 88%] (Sampling)
Iteration: 3200 / 3500 [ 91%] (Sampling)
Iteration: 3300 / 3500 [ 94%] (Sampling)
Iteration: 3400 / 3500 [ 97%] (Sampling)
Iteration: 3500 / 3500 [100%] (Sampling)
Elapsed Time: 0.053 seconds (Warm-up)
0.094 seconds (Sampling)
0.147 seconds (Total)
#+end_example
You can check that the new option values have been taken into account:
#+begin_example
num_samples = 2000
num_warmup = 1500
algorithm = hmc (Default)
hmc
engine = nuts (Default)
nuts
max_depth = 5
#+end_example
*** Reading customized values
You can get back the modified values as follows:
#+BEGIN_SRC mathematica :eval never
GetStanOption[myOpt, "adapt.num_warmup"]
GetStanOption[myOpt, "algorithm=hmc.engine=nuts.max_depth"]
#+END_SRC
which prints
#+BEGIN_EXAMPLE
1500
5
#+END_EXAMPLE
*Caveat*: if the option was not defined (by =SetStanOption=) the function
returns =$Failed=.
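A minimal defensive sketch, falling back to a default value when the option was never customized (the fallback value 1000 is just an illustration):
#+BEGIN_SRC mathematica :eval never
(* use 1000 when "adapt.num_warmup" was not set with SetStanOption *)
With[{v = GetStanOption[myOpt, "adapt.num_warmup"]},
 If[v === $Failed, 1000, v]]
#+END_SRC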
*** Erasing customized option values
To erase an option value (and revert to its default value) use:
#+BEGIN_SRC mathematica :eval never
myOpt = RemoveStanOption[myOpt, "algorithm=hmc.engine=nuts.max_depth"]
#+END_SRC
which prints
#+BEGIN_EXAMPLE
method=sample adapt num_samples=2000 num_warmup=1500
#+END_EXAMPLE
* Tutorial 2, linear regression with more than one predictor
** Parameter arrays
Up to now the parameters alpha, beta and sigma were *scalars*. We will
now see how to handle parameters that are vectors or matrices.
We use the second section of the [[https://mc-stan.org/docs/2_19/stan-users-guide/linear-regression.html][linear regression]] example, entitled
"Matrix notation and vectorization".
The β parameter is now a vector of size K.
#+BEGIN_SRC mathematica :eval never
stanCode = "data {
  int<lower=0> N;   // number of data items
  int<lower=0> K;   // number of predictors
  matrix[N, K] x;   // predictor matrix
  vector[N] y;      // outcome vector
}
parameters {
  real alpha;           // intercept
  vector[K] beta;       // coefficients for predictors
  real<lower=0> sigma;  // error scale
}
model {
  y ~ normal(x * beta + alpha, sigma); // likelihood
}";
stanCodeFile = ExportStanCode["linear_regression_vect.stan", stanCode];
stanExeFile = CompileStanCode[stanCodeFile];
#+END_SRC
** Simulated data
Here we use {x,x²,x³} as predictors, with coefficients
β = {2,0.1,0.01}, so that the model is
y = α + β1 x + β2 x² + β3 x³ + ε
where ε follows a normal distribution.
#+BEGIN_SRC mathematica :eval never
σ = 3; α = 1; β1 = 2; β2 = 0.1; β3 = 0.01;
n = 20;
X = Range[n];
Y = α + β1*X + β2*X^2 + β3*X^3 + RandomVariate[NormalDistribution[0, σ], n];
Show[Plot[α + β1*x + β2*x^2 + β3*x^3, {x, Min[X], Max[X]}],
ListPlot[Transpose@{X, Y}, PlotStyle -> Red]]
#+END_SRC
[[file:figures/linReg2Data.png][file:./figures/linReg2Data.png]]
** Exporting data
The expression
y = α + β1 x + β2 x² + β3 x³ + ε
is convenient for random variable manipulations. However, in practical
computations where we have to evaluate:
y[i] = α + β1 x[i] + β2 (x[i])² + β3 (x[i])³ + ε[i], for i = 1..N
it is more convenient to rewrite this in a "vectorized form":
*y* = *α* + *X.β* + *ε*
where *X* is an NxK matrix whose j-th column X[:,j] = (x[:])^j is the j-th predictor,
and *α* is a vector of size N with all components equal to α.
Thus data is exported as follows:
#+BEGIN_SRC mathematica :eval never
stanData = <|"N" -> n, "K" -> 3, "x" -> Transpose[{X,X^2,X^3}], "y" -> Y|>;
stanDataFile = ExportStanData[stanExeFile, stanData]
#+END_SRC
*Note:* as Mathematica stores its matrices row by row (the C
language convention), we have to transpose ={X,X^2,X^3}= to get the
right matrix X.
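You can check the effect of =Transpose= on a small throwaway example (=xs= below is not the tutorial's =X=):
#+BEGIN_SRC mathematica :eval never
xs = Range[3];
Transpose[{xs, xs^2, xs^3}]
(* {{1, 1, 1}, {2, 4, 8}, {3, 9, 27}}: each row is one observation *)
#+END_SRC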
** Run Stan, HMC sampling
We can now run Stan using the Hamiltonian Monte Carlo (HMC) method:
#+BEGIN_SRC mathematica :eval never
stanResultFile = RunStan[stanExeFile, SampleDefaultOptions]
#+END_SRC
which prints:
#+BEGIN_EXAMPLE
Running: /tmp/linear_regression_vect method=sample data file=/tmp/linear_regression_vect.data.R output file=/tmp/linear_regression_vect.csv
method = sample (Default)
sample
num_samples = 1000 (Default)
num_warmup = 1000 (Default)
save_warmup = 0 (Default)
thin = 1 (Default)
adapt
engaged = 1 (Default)
gamma = 0.050000000000000003 (Default)
delta = 0.80000000000000004 (Default)
kappa = 0.75 (Default)
t0 = 10 (Default)
init_buffer = 75 (Default)
term_buffer = 50 (Default)
window = 25 (Default)
algorithm = hmc (Default)
hmc
engine = nuts (Default)
nuts
max_depth = 10 (Default)
metric = diag_e (Default)
metric_file = (Default)
stepsize = 1 (Default)
stepsize_jitter = 0 (Default)
id = 0 (Default)
data
file = /tmp/linear_regression_vect.data.R
init = 2 (Default)
random
seed = 3043713420
output
file = /tmp/linear_regression_vect.csv
diagnostic_file = (Default)
refresh = 100 (Default)
Gradient evaluation took 4e-05 seconds
1000 transitions using 10 leapfrog steps per transition would take 0.4 seconds.
Adjust your expectations accordingly!
Iteration: 1 / 2000 [ 0%] (Warmup)
Iteration: 100 / 2000 [ 5%] (Warmup)
Iteration: 200 / 2000 [ 10%] (Warmup)
Iteration: 300 / 2000 [ 15%] (Warmup)
Iteration: 400 / 2000 [ 20%] (Warmup)
Iteration: 500 / 2000 [ 25%] (Warmup)
Iteration: 600 / 2000 [ 30%] (Warmup)
Iteration: 700 / 2000 [ 35%] (Warmup)
Iteration: 800 / 2000 [ 40%] (Warmup)
Iteration: 900 / 2000 [ 45%] (Warmup)
Iteration: 1000 / 2000 [ 50%] (Warmup)
Iteration: 1001 / 2000 [ 50%] (Sampling)
Iteration: 1100 / 2000 [ 55%] (Sampling)
Iteration: 1200 / 2000 [ 60%] (Sampling)
Iteration: 1300 / 2000 [ 65%] (Sampling)
Iteration: 1400 / 2000 [ 70%] (Sampling)
Iteration: 1500 / 2000 [ 75%] (Sampling)
Iteration: 1600 / 2000 [ 80%] (Sampling)
Iteration: 1700 / 2000 [ 85%] (Sampling)
Iteration: 1800 / 2000 [ 90%] (Sampling)
Iteration: 1900 / 2000 [ 95%] (Sampling)
Iteration: 2000 / 2000 [100%] (Sampling)
Elapsed Time: 0.740037 seconds (Warm-up)
0.60785 seconds (Sampling)
1.34789 seconds (Total)
#+END_EXAMPLE
** Load the CSV result file
As before,
#+BEGIN_SRC mathematica :eval never
stanResult = ImportStanResult[stanResultFile]
#+END_SRC
loads the generated CSV file and prints:
#+BEGIN_EXAMPLE
file: /tmp/linear_regression_vect.csv
meta: lp__ , accept_stat__ , stepsize__ , treedepth__ , n_leapfrog__ , divergent__ , energy__
parameter: alpha , beta 3, sigma
#+END_EXAMPLE
Compared to the scalar case, the important thing to notice is the
=beta 3=. It means that β is no longer a scalar but a vector of size 3.
*Note*: here β is a vector, but if it had been a 3x5 matrix we would
have had =beta 3x5= printed instead.
A call to
#+BEGIN_SRC mathematica :eval never
GetStanResult[stanResult, "beta"]
#+END_SRC
returns a vector of size 3, where each component is a list of 1000
samples (for β1, β2 and β3).
As before, it is generally useful to summarize these samples with
functions like the mean or a histogram:
#+BEGIN_SRC mathematica :eval never
GetStanResult[Mean, stanResult, "beta"]
GetStanResult[Histogram, stanResult, "beta"]
#+END_SRC
prints:
#+BEGIN_EXAMPLE
{3.30321, -0.010088, 0.0126913}
#+END_EXAMPLE
and plots:
[[file:figures/linReg2Histo.png][file:./figures/linReg2Histo.png]]
This is the moment to digress about Keys. If you try:
#+BEGIN_SRC mathematica :eval never
StanResultKeys[stanResult]
StanResultMetaKeys[stanResult]
#+END_SRC
this will print:
#+BEGIN_EXAMPLE
{"alpha", "beta.1", "beta.2", "beta.3", "sigma"}
{"lp__", "accept_stat__", "stepsize__", "treedepth__", "n_leapfrog__", "divergent__", "energy__"}
#+END_EXAMPLE
These functions are useful to get the complete list of keys. Note
that, as β is a 1D array of size 3, we have =beta.1, beta.2, beta.3=. If
β were a NxM matrix, the list of keys would have been =beta.1.1,
beta.1.2, ..., beta.N.M=.
There is also *reduced keys* functions:
#+BEGIN_SRC mathematica :eval never
StanResultReducedKeys[stanResult]
StanResultReducedMetaKeys[stanResult]
#+END_SRC
which print
#+BEGIN_EXAMPLE
{"alpha", "beta", "sigma"}
{"lp__", "accept_stat__", "stepsize__", "treedepth__", "n_leapfrog__", "divergent__", "energy__"}
#+END_EXAMPLE
As you can see, the *reduced keys* functions collapse keys associated
with arrays by discarding their indices.
When accessing a parameter you can work at the component level or globally:
#+BEGIN_SRC mathematica :eval never
GetStanResult[Mean, stanResult, "beta.2"]
GetStanResult[Mean, stanResult, "beta"]
#+END_SRC
which prints
#+BEGIN_EXAMPLE
-0.010088
{3.30321, -0.010088, 0.0126913}
#+END_EXAMPLE
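As a final sketch, the posterior-mean estimates can be used to plot the fitted curve against the data, reusing the tutorial's =X=, =Y= and the access patterns shown above:
#+BEGIN_SRC mathematica :eval never
αe = GetStanResult[Mean, stanResult, "alpha"];
{βe1, βe2, βe3} = GetStanResult[Mean, stanResult, "beta"];
Show[Plot[αe + βe1*x + βe2*x^2 + βe3*x^3, {x, Min[X], Max[X]}],
 ListPlot[Transpose@{X, Y}, PlotStyle -> Red]]
#+END_SRC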
* Unit tests
You can run [[file:tests/CmdStan_test.wl]] to check that everything works
as expected.