An open API service indexing awesome lists of open source software.

https://github.com/ankargren/bayesnorm

Efficient sampling of normal posterior distributions
https://github.com/ankargren/bayesnorm

cpp package r rcpp rcpparmadillo

Last synced: 23 days ago
JSON representation

Efficient sampling of normal posterior distributions

Awesome Lists containing this project

README

          

---
output:
github_document
---

```{r setup, include=FALSE}
knitr::opts_chunk$set(echo = TRUE)
```

# bayesnorm

*Efficient sampling of normal posterior distributions*

[![Build Status](https://travis-ci.org/ankargren/bayesnorm.svg?branch=master)](https://travis-ci.org/ankargren/bayesnorm)[![Coverage status](https://codecov.io/gh/ankargren/bayesnorm/branch/master/graph/badge.svg)](https://codecov.io/github/ankargren/bayesnorm?branch=master)

## About

The `bayesnorm` package provides two functions, `rmvn_bcm` and `rmvn_rue`, which
allow for efficient sampling from normal posterior distributions. The posterior distribution
should have the form `mu = Sigma * Phi' * alpha`, where the posterior
covariance matrix is `Sigma = (Phi' * Phi + D^{-1})^{-1}` and `D` is
a diagonal matrix. This is the ubiquitous form of normal posteriors. It shows up
in standard Bayesian linear regression as well as when using scale mixtures such
as the horseshoe prior or other priors from the global-local shrinkage family. The idea
of this package is to provide C++ headers so that sampling routines specifically
tailored for the form of the posterior can be employed, which can speed up computations
notably. The main use of the package is therefore for building and implementing
Gibbs samplers. In addition to the C++ headers, simple wrappers are available for
use from R directly.

## Installation

The package is currently GitHub only and can be installed using `devtools`:
```{r, eval = FALSE}
devtools::install_github("ankargren/bayesnorm")
```

## Sampling routines

The `rmvn_bcm` is appropriate when `n

p`.

The sampling routines are based on the proposals by Bhattacharya, Chakraborty
and Mallick (2016) and Rue (2001). The former is based on
an idea of avoiding operations in the `p` dimension in favor of working
in the `n` dimension, which is why it is preferrable when `n

%
group_by(expr) %>%
summarize(median = median(time)/1e6) %>%
arrange(as.character(expr)) %>%
pull(median)

}
```

Before plotting, put the data into long format:
```{r}
theme_set(theme_minimal())
cbPalette <- c("#999999", "#E69F00", "#56B4E9", "#009E73",
"#F0E442", "#0072B2", "#D55E00", "#CC79A7")
plot_df <- as_tibble(combs) %>%
gather(bcm:rue, key = "Method", value = "Milliseconds (log scale)")
```

First, we plot the results letting the `x` axis be the number of covariates `p`. For each facet, `n` is fixed.
```{r}
plot_df %>%
ggplot(aes(x = p, y = `Milliseconds (log scale)`)) +
geom_line(aes(color = Method)) +
facet_grid(n~.,labeller = labeller(.rows = label_both, .cols = label_both)) +
scale_y_continuous(trans = "log2") +
scale_color_manual(values = cbPalette)
```

What the figure is showing is that:

1. using the `rmvn_rue()` function, specifically tailored for this specific posterior distribution, is always better than computing `mu` and `Sigma` and then drawing from the posterior
2. `rmvn_rue()` and `mgcv::rmvn()` are unaffected across facets, i.e. the sample size does not affect the computational time
3. if `n` is small relative to `p`, the `rmvn_bcm()` function offers a substantial speed improvement

In the second figure, we instead fix `p` and let the `x` axis be the sample size `n`:
```{r}
plot_df %>%
ggplot(aes(x = n, y = `Milliseconds (log scale)`)) +
geom_line(aes(color = Method)) +
facet_grid(p~.,labeller = labeller(.rows = label_both, .cols = label_both)) +
scale_y_continuous(trans = "log2") +
scale_color_manual(values = cbPalette)

```

The second figure reiterates the point that you can always do better than naive sampling where `mu` and `Sigma` are computed explicitly. The take-away message is that notable speed improvements can be obtained by using one of the two sampling routines offered in the package when sampling from normal posterior distributions. If `p>n`, the `rmvn_bcm` function is the more faster alternative whereas `rmvn_rue` is faster otherwise.

## Example 3: Bayesian linear regression

To see what the gains are in estimating a model, the package RcppDist provides an example of standard Bayesian linear regression. In standard Bayesian linear regression, the posterior mean and variance is constant and can be pre-computed outside of the MCMC loop. However, using scale mixtures and other hierarchical priors (e.g. normal-gamma, horseshoe, Dirichlet-Laplace, etc) the diagonal `D` matrix changes at every iteration, and as such the moments of the conditional posterior can no longer be pre-computed. To mimic this situation but without introducing unnecessary details, we will use the RcppDist linear regression example but fix the error variance to 1 and with the alteration that we compute `mu` and `Sigma` at every iteration (as you would need to do with a hierarchical prior). The RcppDist example with this modification is:

```{Rcpp bayeslm, cache = TRUE}
#include
#include
// [[Rcpp::depends(RcppArmadillo, RcppDist)]]
// [[Rcpp::export]]
arma::mat bayeslm(const arma::vec& y, const arma::mat& x,
const int iters = 1000) {
int p = x.n_cols;
arma::vec d = arma::vec(p, arma::fill::ones); // prior variance is 1
arma::mat xtx, Sigma, mu;

// Storage
arma::mat beta_draws(iters, p); // Object to store beta draws in
for ( int iter = 0; iter < iters; ++iter ) {
xtx = x.t() * x; // X'X
xtx.diag() += arma::pow(d, -1.0); // add D^{-1}

Sigma = xtx.i(); // the inverse is Sigma
mu = Sigma * x.t() * y; // compute mu

beta_draws.row(iter) = rmvnorm(1, mu, Sigma);
}
return beta_draws;
}
```
In the code, the diagonal of `D` is taken to be 1, implying that we have a standard normal prior on all of the regression parameters.

To see what use of the `bayesnorm` means in terms of efficiency in this situation, we can create a similar function where we use `mvn_rue()` to sample from the posterior:
```{Rcpp bayeslm_rue, cache = TRUE}
#include
#include
// [[Rcpp::depends(RcppArmadillo, bayesnorm)]]
// [[Rcpp::export]]
arma::mat bayeslm_rue(const arma::vec& y, const arma::mat& x,
const int iters = 1000) {
int p = x.n_cols;
arma::vec d = arma::vec(p, arma::fill::ones);
arma::mat beta_draws(p, iters);
for ( int iter = 0; iter < iters; ++iter ) {
beta_draws.col(iter) = mvn_rue(x, d, y);
}
return beta_draws;
}
```

We will try a sample size of 500 and 100 covariates:
```{r, cache = TRUE}
n <- 500
p <- 100

X <- matrix(rnorm(n * p), n, p)
y <- matrix(rnorm(n), n, 1)

microbenchmark::microbenchmark(bayeslm(y, X), bayeslm_rue(y, X), times = 10)
```

Using the `mvn_rue()` function yields about a modest 10% speed improvement.

If we instead study the `p>n` case, improvements are more sizable. Create the same MCMC function but now using `mvn_bcm()` for sampling:
```{Rcpp bayeslm_bcm, cache = TRUE}
#include
#include
// [[Rcpp::depends(RcppArmadillo, bayesnorm)]]
// [[Rcpp::export]]
arma::mat bayeslm_bcm(const arma::vec& y, const arma::mat& x,
const int iters = 1000) {
int p = x.n_cols;
arma::vec d = arma::vec(p, arma::fill::ones);
arma::mat beta_draws(p, iters);
for ( int iter = 0; iter < iters; ++iter ) {
beta_draws.col(iter) = mvn_bcm(x, d, y);
}
return beta_draws;
}
```

Setting the sample size to 50 and keeping 100 as the number of covariates shows a more impressive improvement in computational efficiency:
```{r, cache = TRUE}
n <- 50
p <- 100

X <- matrix(rnorm(n * p), n, p)
y <- matrix(rnorm(n), n, 1)

microbenchmark::microbenchmark(bayeslm(y, X), bayeslm_bcm(y, X), times = 10)
```

## Incorporation into other packages

To use the sampling routines in other RcppArmadillo-based packages, all that is needed is to:

- add `bayesnorm` in the `LinkingTo` field in the `DESCRIPTION`
- add `#include ` at the top of the file calling the functions (or in the package's header file)
- the C++ functions are named `mvn_bcm` and `mvn_rue`

### References

Bhattacharya, A., Chakraborty, A. and Mallick, B. (2016) Fast
sampling with Gaussian scale mixture priors in high-dimensional regression,
*Biometrika*, 103(4):985-991, [doi:10.1093/biomet/asw042](https://doi.org/10.1093/biomet/asw042)

Rue, H. (2001) Fast sampling of Gaussian Markov random fiels, *Journal of the Royal Statistical Society: Series B*, 63, 325-339, [doi:10.1111/1467-9868.00288](https://doi.org/10.1111/1467-9868.00288)