An open API service indexing awesome lists of open source software.

https://github.com/stan-dev/posteriordb-r


https://github.com/stan-dev/posteriordb-r

Last synced: 2 months ago
JSON representation

Awesome Lists containing this project

README

          

---
output:
md_document:
variant: markdown_github
---

[![R-CMD-check](https://github.com/stan-dev/posteriordb-r/actions/workflows/check-release.yaml/badge.svg)](https://github.com/stan-dev/posteriordb-r/actions/workflows/check-release.yaml) [![Codecov test coverage](https://codecov.io/gh/stan-dev/posteriordb-r/branch/main/graph/badge.svg)](https://codecov.io/gh/stan-dev/posteriordb-r?branch=main)

# `posteriordb`: an R package to work with `posteriordb`

This repository contains the R package to efficiently work with the [posteriordb](https://github.com/stan-dev/posteriordb) repository. The R package includes convenience functions to access data, model code and information for individual posteriors, models, data and draws.

## Installation

To install only the R package and then access the posteriors remotely, install the package from GitHub using the `remotes` package.

```{r, eval = FALSE}
remotes::install_github("stan-dev/posteriordb-r")
```

To load the package, just run.

```{r}
library(posteriordb)
```

## Connect to the posterior database

First, we create the posterior database connection to use. Here we want to use the database locally. We assume the `posteriordb` repo has been cloned and is accessible locally.

```{r, eval=FALSE}
my_pdb <- pdb_local()
```

```{r, eval=TRUE, echo=FALSE}
# Store the PDB_PATH in .env
dotenv::load_dot_env()
my_pdb <- pdb_local(Sys.getenv("PDB_PATH"))
```

The above code requires that your working directory be the cloned repository's main folder. Otherwise, we can use the `path` argument in `pdb_local()` to point to the local posterior database. We can also set the environment variable `PBD_PATH` to handle the connection. For more details, see `?pdb`.

The most straightforward approach is to use the GitHub repository directly to access the database.

```{r, eval=FALSE}
my_pdb <- pdb_github()
```

When you have a connection to the posterior database of choice, you can access the data, models etc., using the same functionality.

## Contributing content using R

If you want to contribute to a posteriordb, see the vignette [vignettes/contributing](https://htmlpreview.github.io/?https://github.com/stan-dev/posteriordb-r/blob/main/vignettes/contributing.html).

## Access content

To list the posteriors in the database, use `posterior_names()`.

```{r}
pos <- posterior_names(my_pdb)
head(pos)
```

In the same fashion, we can list data and models included in the database as

```{r}
mn <- model_names(my_pdb)
head(mn)

dn <- data_names(my_pdb)
head(dn)
```

We can also get all information on each posterior as a table with

```{r}
pos <- posteriors_tbl_df(my_pdb)
head(pos)
```

The posterior's name is made up of the data and model fitted
to the data. Together, these two uniquely define a posterior distribution.
To access a posterior object, we can use the posterior name.

```{r}
po <- posterior("eight_schools-eight_schools_centered", my_pdb)
```

From the posterior object, we can access data, model code (i.e., Stan code in this case) and other useful information.

```{r}
dat <- pdb_data(po)
dat

code <- stan_code(po)
code
```

We can also access the paths to data after they have been unzipped and copied to the cache directory set in `pdb` (the R temp directory by default).

```{r}
dfp <- data_file_path(po)
dfp

scfp <- stan_code_file_path(po)
scfp
```

We can also access information regarding the model and the data used to compute the posterior.

```{r}
data_info(po)
model_info(po)
```

Note that the references reference BibTeX items found in `content/references/references.bib`.

We can access most of the posterior information as a `tbl_df` using

```{r}
tbl <- posteriors_tbl_df(my_pdb)
head(tbl)
```

In addition, we can also access a list of posteriors with `filter_posteriors()`. The filtering function follows dplyr filter semantics.

```{r}
pos <- filter_posteriors(pdb = my_pdb, data_name == "eight_schools")
pos
```

To access reference posterior draws, we use `reference_posterior_draws()`.

```{r}
rpd <- reference_posterior_draws(po)
```

The function `reference_posterior_draws()` returns a posterior `draws_list` object that can be summarized and transformed using the `posterior` package.

```{r}
posterior::summarize_draws(rpd)
```

To access information on the reference posterior we can use `reference_posterior_draws_info()` or use `info()` on the reference posterior. The posterior reference draws return information on how the reference posterior was computed.

```{r}
rpi <- reference_posterior_draws_info(po)
rpi

info(rpd)
```