https://github.com/stan-dev/mathematicastan
A Mathematica package to interact with CmdStan
- Host: GitHub
- URL: https://github.com/stan-dev/mathematicastan
- Owner: stan-dev
- License: gpl-3.0
- Created: 2016-08-26T09:59:09.000Z (over 9 years ago)
- Default Branch: master
- Last Pushed: 2024-08-15T11:48:10.000Z (over 1 year ago)
- Last Synced: 2024-10-29T14:21:57.070Z (about 1 year ago)
- Topics: interface, mathematica, stan
- Language: Mathematica
- Size: 1.32 MB
- Stars: 27
- Watchers: 29
- Forks: 8
- Open Issues: 1
Metadata Files:
- Readme: README.org
- Funding: .github/FUNDING.yml
- License: LICENSE
#+OPTIONS: toc:nil todo:nil pri:nil tags:nil ^:nil tex:t
#+TITLE: MathematicaStan v2.2
#+SUBTITLE: A Mathematica (v11+) package to interact with CmdStan
#+AUTHOR: Picaud Vincent
[[https://zenodo.org/doi/10.5281/zenodo.10810144][file:https://zenodo.org/badge/66637604.svg]]
* Table of contents :TOC_3:noexport:
- [[#introduction][Introduction]]
- [[#news][News]]
- [[#2024-08-13][2024-08-13]]
- [[#2020-12-21][2020-12-21]]
- [[#2019-06-28][2019-06-28]]
- [[#installation][Installation]]
- [[#the-stan-cmdstan-shell-interface][The Stan CmdStan shell interface]]
- [[#the-mathematica-cmdstan-package][The Mathematica CmdStan package]]
- [[#first-run][First run]]
- [[#tutorial-1-linear-regression][Tutorial 1, linear regression]]
- [[#introduction-1][Introduction]]
- [[#stan-code][Stan code]]
- [[#code-compilation][Code compilation]]
- [[#simulated-data][Simulated data]]
- [[#create-the-datar-data-file][Create the =data.R= data file]]
- [[#run-stan-likelihood-maximization][Run Stan, likelihood maximization]]
- [[#load-the-csv-result-file][Load the CSV result file]]
- [[#run-stan-variational-bayes][Run Stan, Variational Bayes]]
- [[#more-about-option-management][More about Option management]]
- [[#overwriting-default-values][Overwriting default values]]
- [[#reading-customized-values][Reading customized values]]
- [[#erasing-customized-option-values][Erasing customized option values]]
- [[#tutorial-2-linear-regression-with-more-than-one-predictor][Tutorial 2, linear regression with more than one predictor]]
- [[#parameter-arrays][Parameter arrays]]
- [[#simulated-data-1][Simulated data]]
- [[#exporting-data][Exporting data]]
- [[#run-stan-hmc-sampling][Run Stan, HMC sampling]]
- [[#load-the-csv-result-file-1][Load the CSV result file]]
- [[#unit-tests][Unit tests]]
* Introduction
*MathematicaStan* is a package to interact with [[http://mc-stan.org/interfaces/cmdstan][CmdStan]] from
Mathematica.
It is developed under *Linux* and is compatible with *Mathematica v11+*.
It should also work under *MacOS* and *Windows*.
*Author & contact:* picaud.vincent at gmail.com
** News
*** 2024-08-13
*New MathematicaStan version 2.2!*
*Package tested with the latest CmdStan v2.35.0, Mathematica 11.2, Linux*
- Added some screenshots to the install procedure section
- CmdStan syntax changes have been taken into account:
|----------------------------+-----------------------------------|
| old | current |
|----------------------------+-----------------------------------|
| <- | = |
| increment_log_prob(...) | target += ... |
| int y[N]; | array[N] int y; |
|----------------------------+-----------------------------------|
- Checked that unit tests and examples work.
*** 2020-12-21
*New MathematicaStan version 2.1!*
This version has been fixed and should now run under Windows.
I would like to thank *Ali Ghaderi*, who had the patience to help me
debug the Windows version (I do not have access to this OS). Nothing
would have been possible without him. All possibly remaining bugs are
mine.
As a reminder, also note that one should not use paths or filenames
with spaces (=Make= really does not like that). This advice also holds
under Linux and MacOS. See [[https://stackoverflow.com/questions/9838384/can-gnu-make-handle-filenames-with-spaces][SO:can-gnu-make-handle-filenames-with-spaces]]
for example.
*** 2019-06-28
*New MathematicaStan version 2.0!*
This version uses Mathematica v11 and has been completely refactored.
*Caveat:* breaking changes!
*Note*: the "old" MathematicaStan version based on Mathematica v8.0 is now archived in
the [[https://github.com/stan-dev/MathematicaStan/tree/v1][v1 git branch]].
* Installation
** The Stan CmdStan shell interface
First you must install [[http://mc-stan.org/interfaces/cmdstan][CmdStan]]. Once this is done you get a directory containing files such as:
#+BEGIN_EXAMPLE
bin doc examples Jenkinsfile LICENSE make makefile README.md runCmdStanTests.py src stan test-all.sh
#+END_EXAMPLE
With my configuration *CmdStan* is installed in:
#+BEGIN_EXAMPLE
~/ExternalSoftware/cmdstan-2.35.0
#+END_EXAMPLE
For Windows users it is possibly something like:
#+BEGIN_EXAMPLE
C:\\Users\\USER_NAME\\Documents\\R\\cmdstan-?.??.?
#+END_EXAMPLE
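To check that the CmdStan install works before touching Mathematica, you can build and run the bundled Bernoulli example from the CmdStan directory (standard CmdStan commands; adapt the path to your own install):
#+BEGIN_EXAMPLE
cd ~/ExternalSoftware/cmdstan-2.35.0
make examples/bernoulli/bernoulli
./examples/bernoulli/bernoulli sample data file=examples/bernoulli/bernoulli.data.json
#+END_EXAMPLE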
** The Mathematica CmdStan package
To install the Mathematica CmdStan package:
- open the =CmdStan.m= file with Mathematica.
- install it using the Mathematica Notebook *File->Install* menu.
Fill in the pop-up windows as follows:
[[file:figures/install.png]]
** First run
The first time you use the package, import it:
#+BEGIN_SRC mathematica :eval never
<<CmdStan`
#+END_SRC
** Simulated data
Let's simulate some data:
#+BEGIN_SRC mathematica :eval never
σ = 3; α = 1; β = 2;
n = 20;
X = Range[n];
Y = α + β*X + RandomVariate[NormalDistribution[0, σ], n];
Show[Plot[α + β*x, {x, Min[X], Max[X]}],
ListPlot[Transpose@{X, Y}, PlotStyle -> Red]]
#+END_SRC
[[file:figures/linRegData.png][file:./figures/linRegData.png]]
** Create the =data.R= data file
The data are stored in a =Association= and then exported thanks to the
=ExportStanData= function.
#+BEGIN_SRC mathematica :eval never
stanData = <|"N" -> n, "x" -> X, "y" -> Y|>;
stanDataFile = ExportStanData[stanExeFile, stanData]
#+END_SRC
#+BEGIN_EXAMPLE
/tmp/linear_regression.data.R
#+END_EXAMPLE
*Note:* this function returns the created file
name =/tmp/linear_regression.data.R=. Its first argument, =stanExeFile=
is simply the Stan executable file name with its path. The
=ExportStanData[]= function modifies the file name extension and
replace it with ".data.R", but you can use it with
any file name:
#+BEGIN_SRC mathematica :eval never
ExportStanData["my_custom_path/my_custom_filename.data.R",stanData]
#+END_SRC
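For reference, the generated =.data.R= file uses the classic Rdump text format, one assignment per variable; for this example it should look roughly like the following (the =y= values depend on your random draw, elided here with "..."):
#+BEGIN_EXAMPLE
N <- 20
x <- c(1, 2, 3, ..., 20)
y <- c(4.1, 6.8, ..., 42.3)
#+END_EXAMPLE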
** Run Stan, likelihood maximization
We are now able to run the =stanExeFile= executable.
Let's start by maximizing the likelihood
#+BEGIN_SRC mathematica :eval never
stanResultFile = RunStan[stanExeFile, OptimizeDefaultOptions]
#+END_SRC
#+BEGIN_EXAMPLE
Running: /tmp/linear_regression method=optimize data file=/tmp/linear_regression.data.R output file=/tmp/linear_regression.csv
method = optimize
optimize
algorithm = lbfgs (Default)
lbfgs
init_alpha = 0.001 (Default)
tol_obj = 9.9999999999999998e-13 (Default)
tol_rel_obj = 10000 (Default)
tol_grad = 1e-08 (Default)
tol_rel_grad = 10000000 (Default)
tol_param = 1e-08 (Default)
history_size = 5 (Default)
iter = 2000 (Default)
save_iterations = 0 (Default)
id = 0 (Default)
data
file = /tmp/linear_regression.data.R
init = 2 (Default)
random
seed = 2775739062
output
file = /tmp/linear_regression.csv
diagnostic_file = (Default)
refresh = 100 (Default)
Initial log joint probability = -8459.75
Iter log prob ||dx|| ||grad|| alpha alpha0 # evals Notes
19 -32.5116 0.00318011 0.00121546 0.9563 0.9563 52
Optimization terminated normally:
Convergence detected: relative gradient magnitude is below tolerance
#+END_EXAMPLE
The =stanResultFile= variable now contains the CSV result file name:
#+BEGIN_EXAMPLE
/tmp/linear_regression.csv
#+END_EXAMPLE
*Note:* again, if you do not want any printed output, use the =StanVerbose->False= option.
#+BEGIN_SRC mathematica :eval never
stanResultFile = RunStan[stanExeFile, OptimizeDefaultOptions,StanVerbose->False]
#+END_SRC
*Note:* the method we use is defined by the second argument,
=OptimizeDefaultOptions=. If you want to use Variational Bayes or HMC
sampling you must use
#+BEGIN_SRC mathematica :eval never
RunStan[stanExeFile, VariationalDefaultOptions]
#+END_SRC
or
#+BEGIN_SRC mathematica :eval never
RunStan[stanExeFile, SampleDefaultOptions]
#+END_SRC
*Note*: option management will be detailed later in this tutorial.
** Load the CSV result file
To load the CSV result file, do
#+BEGIN_SRC mathematica :eval never
stanResult = ImportStanResult[stanResultFile]
#+END_SRC
which prints
#+BEGIN_EXAMPLE
file: /tmp/linear_regression.csv
meta: lp__
parameter: alpha , beta , sigma
#+END_EXAMPLE
To access the estimated variables α, β and σ, simply do:
#+BEGIN_SRC mathematica :eval never
GetStanResultMeta[stanResult, "lp__"]
αe=GetStanResult[stanResult, "alpha"]
βe=GetStanResult[stanResult, "beta"]
σe=GetStanResult[stanResult, "sigma"]
#+END_SRC
which prints:
#+BEGIN_EXAMPLE
{-32.5116}
{2.51749}
{1.83654}
{3.08191}
#+END_EXAMPLE
*Note*: as likelihood maximization only provides a point estimate,
the returned values are lists of *one* number.
You can plot the estimated line:
#+BEGIN_SRC mathematica :eval never
Show[Plot[{αe + βe*x, α + β*x}, {x, Min[X],Max[X]}, PlotLegends -> "Expressions"],
ListPlot[Transpose@{X, Y}, PlotStyle -> Red]]
#+END_SRC
[[file:./figures/linRegEstimate.png]]
** Run Stan, Variational Bayes
We want to solve the same problem but using variational inference.
As explained before we must use
#+BEGIN_SRC mathematica :eval never
stanResultFile = RunStan[stanExeFile, VariationalDefaultOptions]
#+END_SRC
instead of
#+BEGIN_SRC mathematica :eval never
stanResultFile = RunStan[stanExeFile, OptimizeDefaultOptions]
#+END_SRC
However, please note that running this command will overwrite
=stanResultFile=, the file where results are exported. To avoid
this we can change the output file name by modifying option values.
The default option values are stored in the write-protected
=VariationalDefaultOptions= variable.
To modify them we must first copy this protected symbol:
#+BEGIN_SRC mathematica :eval never
myOpt=VariationalDefaultOptions
#+END_SRC
which prints
#+BEGIN_EXAMPLE
method=variational
#+END_EXAMPLE
The option values are printed when you run the =RunStan= command:
#+BEGIN_EXAMPLE
method = variational
variational
algorithm = meanfield (Default)
meanfield
iter = 10000 (Default)
grad_samples = 1 (Default)
elbo_samples = 100 (Default)
eta = 1 (Default)
adapt
engaged = 1 (Default)
iter = 50 (Default)
tol_rel_obj = 0.01 (Default)
eval_elbo = 100 (Default)
output_samples = 1000 (Default)
id = 0 (Default)
data
file = (Default)
init = 2 (Default)
random
seed = 2784129612
output
file = output.csv (Default)
diagnostic_file = (Default)
refresh = 100 (Default)
#+END_EXAMPLE
We have to modify the =output file= option value. This can be done by:
#+BEGIN_SRC mathematica :eval never
myOpt = SetStanOption[myOpt, "output.file", FileNameJoin[{Directory[], "myOutputFile.csv"}]]
#+END_SRC
which prints:
#+BEGIN_EXAMPLE
method=variational output file=/tmp/myOutputFile.csv
#+END_EXAMPLE
Now we can run Stan:
#+BEGIN_SRC mathematica :eval never
myOutputFile=RunStan[stanExeFile, myOpt, StanVerbose -> False]
#+END_SRC
which should print:
#+BEGIN_EXAMPLE
/tmp/myOutputFile.csv
#+END_EXAMPLE
Now import this CSV file:
#+BEGIN_SRC mathematica :eval never
myResult = ImportStanResult[myOutputFile]
#+END_SRC
which prints:
#+BEGIN_EXAMPLE
file: /tmp/myOutputFile.csv
meta: lp__ , log_p__ , log_g__
parameter: alpha , beta , sigma
#+END_EXAMPLE
As before you can use:
#+BEGIN_SRC mathematica :eval never
GetStanResult[myResult,"alpha"]
#+END_SRC
to get the =alpha= parameter value, but now you will get a list of 1000 samples:
#+BEGIN_EXAMPLE
{2.03816, 0.90637, ..., ..., 1.22068, 1.66392}
#+END_EXAMPLE
Instead of the full sample list we are often interested in the sample
mean, variance, etc. You can get these quantities as follows:
#+BEGIN_SRC mathematica :eval never
GetStanResult[Mean, myResult, "alpha"]
GetStanResult[Variance, myResult, "alpha"]
#+END_SRC
which prints:
#+BEGIN_EXAMPLE
2.0353
0.317084
#+END_EXAMPLE
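The first argument is not limited to =Mean= and =Variance=; assuming =GetStanResult= accepts an arbitrary summary function there (the pure function below is such an assumption), you can for instance compute a 90% credible interval from the posterior quantiles:
#+BEGIN_SRC mathematica :eval never
(* 5% and 95% posterior quantiles of alpha *)
GetStanResult[Quantile[#, {0.05, 0.95}] &, myResult, "alpha"]
#+END_SRC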
You can also get the sample histogram as simply as:
#+BEGIN_SRC mathematica :eval never
GetStanResult[Histogram, myResult, "alpha"]
#+END_SRC
[[file:figures/linRegHisto.png][file:./figures/linRegHisto.png]]
** More about Option management
*** Overwriting default values
We now provide further details concerning the option-related functions.
To recap, the first step is to copy the write-protected default option
values. For example, to modify default MCMC option values the first
step is:
#+BEGIN_SRC mathematica :eval never
myOpt = SampleDefaultOptions
#+END_SRC
The available options are:
#+begin_example
method = sample (Default)
sample
num_samples = 1000 (Default)
num_warmup = 1000 (Default)
save_warmup = 0 (Default)
thin = 1 (Default)
adapt
engaged = 1 (Default)
gamma = 0.050000000000000003 (Default)
delta = 0.80000000000000004 (Default)
kappa = 0.75 (Default)
t0 = 10 (Default)
init_buffer = 75 (Default)
term_buffer = 50 (Default)
window = 25 (Default)
algorithm = hmc (Default)
hmc
engine = nuts (Default)
nuts
max_depth = 10 (Default)
metric = diag_e (Default)
metric_file = (Default)
stepsize = 1 (Default)
stepsize_jitter = 0 (Default)
id = 0 (Default)
data
file = /tmp/linear_regression.data.R
init = 2 (Default)
random
seed = 3714706817 (Default)
output
file = /tmp/linear_regression.csv
diagnostic_file = (Default)
refresh = 100 (Default)
sig_figs = -1 (Default)
#+end_example
If we want to modify:
#+begin_example
method = sample (Default)
sample
num_samples = 1000 (Default)
num_warmup = 1000 (Default)
#+end_example
and
#+begin_example
method = sample (Default)
sample
algorithm = hmc (Default)
hmc
engine = nuts (Default)
nuts
max_depth = 10 (Default)
#+end_example
you must proceed as follows: for each hierarchy level use "." as a
separator, and when a level selects a value (such as =algorithm=hmc=),
keep the "=" together with its value. With our example this gives:
#+BEGIN_SRC mathematica :eval never
myOpt = SetStanOption[myOpt, "adapt.num_samples", 2000]
myOpt = SetStanOption[myOpt, "adapt.num_warmup", 1500]
myOpt = SetStanOption[myOpt, "algorithm=hmc.engine=nuts.max_depth", 5]
#+END_SRC
Now you can run the sampler with these new option values:
#+BEGIN_SRC mathematica :eval never
stanResultFile = RunStan[stanExeFile, myOpt]
#+END_SRC
which should print:
#+begin_example
method = sample (Default)
sample
num_samples = 2000
num_warmup = 1500
save_warmup = 0 (Default)
thin = 1 (Default)
adapt
engaged = 1 (Default)
gamma = 0.050000000000000003 (Default)
delta = 0.80000000000000004 (Default)
kappa = 0.75 (Default)
t0 = 10 (Default)
init_buffer = 75 (Default)
term_buffer = 50 (Default)
window = 25 (Default)
algorithm = hmc (Default)
hmc
engine = nuts (Default)
nuts
max_depth = 5
metric = diag_e (Default)
metric_file = (Default)
stepsize = 1 (Default)
stepsize_jitter = 0 (Default)
id = 0 (Default)
data
file = /tmp/linear_regression.data.R
init = 2 (Default)
random
seed = 3720771451 (Default)
output
file = /tmp/linear_regression.csv
diagnostic_file = (Default)
refresh = 100 (Default)
sig_figs = -1 (Default)
stanc_version = stanc3 b25c0b64
stancflags =
Gradient evaluation took 1.3e-05 seconds
1000 transitions using 10 leapfrog steps per transition would take 0.13 seconds.
Adjust your expectations accordingly!
Iteration: 1 / 3500 [ 0%] (Warmup)
Iteration: 100 / 3500 [ 2%] (Warmup)
Iteration: 200 / 3500 [ 5%] (Warmup)
Iteration: 300 / 3500 [ 8%] (Warmup)
Iteration: 400 / 3500 [ 11%] (Warmup)
Iteration: 500 / 3500 [ 14%] (Warmup)
Iteration: 600 / 3500 [ 17%] (Warmup)
Iteration: 700 / 3500 [ 20%] (Warmup)
Iteration: 800 / 3500 [ 22%] (Warmup)
Iteration: 900 / 3500 [ 25%] (Warmup)
Iteration: 1000 / 3500 [ 28%] (Warmup)
Iteration: 1100 / 3500 [ 31%] (Warmup)
Iteration: 1200 / 3500 [ 34%] (Warmup)
Iteration: 1300 / 3500 [ 37%] (Warmup)
Iteration: 1400 / 3500 [ 40%] (Warmup)
Iteration: 1500 / 3500 [ 42%] (Warmup)
Iteration: 1501 / 3500 [ 42%] (Sampling)
Iteration: 1600 / 3500 [ 45%] (Sampling)
Iteration: 1700 / 3500 [ 48%] (Sampling)
Iteration: 1800 / 3500 [ 51%] (Sampling)
Iteration: 1900 / 3500 [ 54%] (Sampling)
Iteration: 2000 / 3500 [ 57%] (Sampling)
Iteration: 2100 / 3500 [ 60%] (Sampling)
Iteration: 2200 / 3500 [ 62%] (Sampling)
Iteration: 2300 / 3500 [ 65%] (Sampling)
Iteration: 2400 / 3500 [ 68%] (Sampling)
Iteration: 2500 / 3500 [ 71%] (Sampling)
Iteration: 2600 / 3500 [ 74%] (Sampling)
Iteration: 2700 / 3500 [ 77%] (Sampling)
Iteration: 2800 / 3500 [ 80%] (Sampling)
Iteration: 2900 / 3500 [ 82%] (Sampling)
Iteration: 3000 / 3500 [ 85%] (Sampling)
Iteration: 3100 / 3500 [ 88%] (Sampling)
Iteration: 3200 / 3500 [ 91%] (Sampling)
Iteration: 3300 / 3500 [ 94%] (Sampling)
Iteration: 3400 / 3500 [ 97%] (Sampling)
Iteration: 3500 / 3500 [100%] (Sampling)
Elapsed Time: 0.053 seconds (Warm-up)
0.094 seconds (Sampling)
0.147 seconds (Total)
#+end_example
You can check that the new option values have been taken into account:
#+begin_example
num_samples = 2000
num_warmup = 1500
algorithm = hmc (Default)
hmc
engine = nuts (Default)
nuts
max_depth = 5
#+end_example
*** Reading customized values
You can get back the modified values as follows:
#+BEGIN_SRC mathematica :eval never
GetStanOption[myOpt, "adapt.num_warmup"]
GetStanOption[myOpt, "algorithm=hmc.engine=nuts.max_depth"]
#+END_SRC
which prints
#+BEGIN_EXAMPLE
1500
5
#+END_EXAMPLE
*Caveat*: if the option was not defined (by =SetStanOption=) the function
returns =$Failed=.
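A minimal defensive sketch, falling back to a default value when the option was never customized (the fallback value 1000 is just an illustration):
#+BEGIN_SRC mathematica :eval never
(* use 1000 when "adapt.num_warmup" was not set with SetStanOption *)
With[{v = GetStanOption[myOpt, "adapt.num_warmup"]},
 If[v === $Failed, 1000, v]]
#+END_SRC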
*** Erasing customized option values
To erase an option value (and revert to its default value) use:
#+BEGIN_SRC mathematica :eval never
myOpt = RemoveStanOption[myOpt, "algorithm=hmc.engine=nuts.max_depth"]
#+END_SRC
which prints
#+BEGIN_EXAMPLE
method=sample adapt num_samples=2000 num_warmup=1500
#+END_EXAMPLE
* Tutorial 2, linear regression with more than one predictor
** Parameter arrays
Up to now the parameters alpha, beta and sigma were *scalars*. We will
now see how to handle parameters that are vectors or matrices.
We use the second section of the [[https://mc-stan.org/docs/2_19/stan-users-guide/linear-regression.html][linear regression]] example, entitled
"Matrix notation and vectorization".
The β parameter is now a vector of size K.
#+BEGIN_SRC mathematica :eval never
stanCode = "data {
  int<lower=0> N;   // number of data items
  int<lower=0> K;   // number of predictors
  matrix[N, K] x;   // predictor matrix
  vector[N] y;      // outcome vector
}
parameters {
  real alpha;           // intercept
  vector[K] beta;       // coefficients for predictors
  real<lower=0> sigma;  // error scale
}
model {
  y ~ normal(x * beta + alpha, sigma); // likelihood
}";
stanCodeFile = ExportStanCode["linear_regression_vect.stan", stanCode];
stanExeFile = CompileStanCode[stanCodeFile];
#+END_SRC
** Simulated data
Here we use {x,x²,x³} as predictors, with coefficients
β = {2,0.1,0.01}, so that the model is
y = α + β1 x + β2 x² + β3 x³ + ε
where ε follows a normal distribution.
#+BEGIN_SRC mathematica :eval never
σ = 3; α = 1; β1 = 2; β2 = 0.1; β3 = 0.01;
n = 20;
X = Range[n];
Y = α + β1*X + β2*X^2 + β3*X^3 + RandomVariate[NormalDistribution[0, σ], n];
Show[Plot[α + β1*x + β2*x^2 + β3*x^3, {x, Min[X], Max[X]}],
ListPlot[Transpose@{X, Y}, PlotStyle -> Red]]
#+END_SRC
[[file:figures/linReg2Data.png][file:./figures/linReg2Data.png]]
** Exporting data
The expression
y = α + β1 x + β2 x² + β3 x³ + ε
is convenient for random variable manipulations. However, in practical
computations where we have to evaluate:
y[i] = α + β1 x[i] + β2 (x[i])² + β3 (x[i])³ + ε[i], for i = 1..N
it is more convenient to rewrite this in a "vectorized form":
*y* = *α* + *X.β* + *ε*
where *X* is an NxK matrix whose j-th column X[:,j] = (x[:])^j is the j-th predictor,
and *α* is a vector of size N with all components equal to α.
Thus data is exported as follows:
#+BEGIN_SRC mathematica :eval never
stanData = <|"N" -> n, "K" -> 3, "x" -> Transpose[{X,X^2,X^3}], "y" -> Y|>;
stanDataFile = ExportStanData[stanExeFile, stanData]
#+END_SRC
*Note:* as Mathematica stores its matrices row by row (the C
language convention), we have to transpose ={X,X^2,X^3}= to get the
right matrix X.
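You can check the effect of =Transpose= on a small throwaway example (=xs= below is not the tutorial's =X=):
#+BEGIN_SRC mathematica :eval never
xs = Range[3];
Transpose[{xs, xs^2, xs^3}]
(* {{1, 1, 1}, {2, 4, 8}, {3, 9, 27}}: each row is one observation *)
#+END_SRC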
** Run Stan, HMC sampling
We can now run Stan using the Hamiltonian Monte Carlo (HMC) method:
#+BEGIN_SRC mathematica :eval never
stanResultFile = RunStan[stanExeFile, SampleDefaultOptions]
#+END_SRC
which prints:
#+BEGIN_EXAMPLE
Running: /tmp/linear_regression_vect method=sample data file=/tmp/linear_regression_vect.data.R output file=/tmp/linear_regression_vect.csv
method = sample (Default)
sample
num_samples = 1000 (Default)
num_warmup = 1000 (Default)
save_warmup = 0 (Default)
thin = 1 (Default)
adapt
engaged = 1 (Default)
gamma = 0.050000000000000003 (Default)
delta = 0.80000000000000004 (Default)
kappa = 0.75 (Default)
t0 = 10 (Default)
init_buffer = 75 (Default)
term_buffer = 50 (Default)
window = 25 (Default)
algorithm = hmc (Default)
hmc
engine = nuts (Default)
nuts
max_depth = 10 (Default)
metric = diag_e (Default)
metric_file = (Default)
stepsize = 1 (Default)
stepsize_jitter = 0 (Default)
id = 0 (Default)
data
file = /tmp/linear_regression_vect.data.R
init = 2 (Default)
random
seed = 3043713420
output
file = /tmp/linear_regression_vect.csv
diagnostic_file = (Default)
refresh = 100 (Default)
Gradient evaluation took 4e-05 seconds
1000 transitions using 10 leapfrog steps per transition would take 0.4 seconds.
Adjust your expectations accordingly!
Iteration: 1 / 2000 [ 0%] (Warmup)
Iteration: 100 / 2000 [ 5%] (Warmup)
Iteration: 200 / 2000 [ 10%] (Warmup)
Iteration: 300 / 2000 [ 15%] (Warmup)
Iteration: 400 / 2000 [ 20%] (Warmup)
Iteration: 500 / 2000 [ 25%] (Warmup)
Iteration: 600 / 2000 [ 30%] (Warmup)
Iteration: 700 / 2000 [ 35%] (Warmup)
Iteration: 800 / 2000 [ 40%] (Warmup)
Iteration: 900 / 2000 [ 45%] (Warmup)
Iteration: 1000 / 2000 [ 50%] (Warmup)
Iteration: 1001 / 2000 [ 50%] (Sampling)
Iteration: 1100 / 2000 [ 55%] (Sampling)
Iteration: 1200 / 2000 [ 60%] (Sampling)
Iteration: 1300 / 2000 [ 65%] (Sampling)
Iteration: 1400 / 2000 [ 70%] (Sampling)
Iteration: 1500 / 2000 [ 75%] (Sampling)
Iteration: 1600 / 2000 [ 80%] (Sampling)
Iteration: 1700 / 2000 [ 85%] (Sampling)
Iteration: 1800 / 2000 [ 90%] (Sampling)
Iteration: 1900 / 2000 [ 95%] (Sampling)
Iteration: 2000 / 2000 [100%] (Sampling)
Elapsed Time: 0.740037 seconds (Warm-up)
0.60785 seconds (Sampling)
1.34789 seconds (Total)
#+END_EXAMPLE
** Load the CSV result file
As before,
#+BEGIN_SRC mathematica :eval never
stanResult = ImportStanResult[stanResultFile]
#+END_SRC
loads the generated CSV file and prints:
#+BEGIN_EXAMPLE
file: /tmp/linear_regression_vect.csv
meta: lp__ , accept_stat__ , stepsize__ , treedepth__ , n_leapfrog__ , divergent__ , energy__
parameter: alpha , beta 3, sigma
#+END_EXAMPLE
Compared to the scalar case, the important thing to notice is the
=beta 3=. It means that β is no longer a scalar but a vector of size 3.
*Note*: here β is a vector, but if it had been a 3x5 matrix we would
have had =beta 3x5= printed instead.
A call to
#+BEGIN_SRC mathematica :eval never
GetStanResult[stanResult, "beta"]
#+END_SRC
returns a vector of size 3, where each component is a list of 1000
samples (for β1, β2 and β3).
As before, it is generally useful to summarize these samples with
functions like the mean or a histogram:
#+BEGIN_SRC mathematica :eval never
GetStanResult[Mean, stanResult, "beta"]
GetStanResult[Histogram, stanResult, "beta"]
#+END_SRC
prints:
#+BEGIN_EXAMPLE
{3.30321, -0.010088, 0.0126913}
#+END_EXAMPLE
and plots:
[[file:figures/linReg2Histo.png][file:./figures/linReg2Histo.png]]
This is the moment to digress about Keys. If you try:
#+BEGIN_SRC mathematica :eval never
StanResultKeys[stanResult]
StanResultMetaKeys[stanResult]
#+END_SRC
this will print:
#+BEGIN_EXAMPLE
{"alpha", "beta.1", "beta.2", "beta.3", "sigma"}
{"lp__", "accept_stat__", "stepsize__", "treedepth__", "n_leapfrog__", "divergent__", "energy__"}
#+END_EXAMPLE
These functions are useful to get the complete list of keys. Note
that, as β is a 1D array of size 3, we have =beta.1, beta.2, beta.3=. If
β were a NxM matrix, the list of keys would have been =beta.1.1,
beta.1.2, ..., beta.N.M=.
There is also *reduced keys* functions:
#+BEGIN_SRC mathematica :eval never
StanResultReducedKeys[stanResult]
StanResultReducedMetaKeys[stanResult]
#+END_SRC
which print
#+BEGIN_EXAMPLE
{"alpha", "beta", "sigma"}
{"lp__", "accept_stat__", "stepsize__", "treedepth__", "n_leapfrog__", "divergent__", "energy__"}
#+END_EXAMPLE
As you can see, the *reduced keys* functions collapse keys associated
with arrays by discarding their indices.
When accessing a parameter you can work at the component level or globally:
#+BEGIN_SRC mathematica :eval never
GetStanResult[Mean, stanResult, "beta.2"]
GetStanResult[Mean, stanResult, "beta"]
#+END_SRC
which prints
#+BEGIN_EXAMPLE
-0.010088
{3.30321, -0.010088, 0.0126913}
#+END_EXAMPLE
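As a final sketch, the posterior-mean estimates can be used to plot the fitted curve against the data, reusing the tutorial's =X=, =Y= and the access patterns shown above:
#+BEGIN_SRC mathematica :eval never
αe = GetStanResult[Mean, stanResult, "alpha"];
{βe1, βe2, βe3} = GetStanResult[Mean, stanResult, "beta"];
Show[Plot[αe + βe1*x + βe2*x^2 + βe3*x^3, {x, Min[X], Max[X]}],
 ListPlot[Transpose@{X, Y}, PlotStyle -> Red]]
#+END_SRC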
* Unit tests
You can run [[file:tests/CmdStan_test.wl]] to check that everything works
as expected.