{"id":13401270,"url":"https://github.com/mjskay/tidybayes","last_synced_at":"2025-05-15T15:03:36.789Z","repository":{"id":29851821,"uuid":"33396684","full_name":"mjskay/tidybayes","owner":"mjskay","description":"Bayesian analysis + tidy data + geoms (R package)","archived":false,"fork":false,"pushed_at":"2024-04-24T03:25:13.000Z","size":240400,"stargazers_count":704,"open_issues_count":45,"forks_count":58,"subscribers_count":24,"default_branch":"master","last_synced_at":"2024-04-25T00:51:03.345Z","etag":null,"topics":["bayesian-data-analysis","brms","ggplot2","jags","r","r-package","stan","tidy-data","visualization"],"latest_commit_sha":null,"homepage":"http://mjskay.github.io/tidybayes","language":"R","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"gpl-3.0","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/mjskay.png","metadata":{"files":{"readme":"README.Rmd","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null}},"created_at":"2015-04-04T06:35:31.000Z","updated_at":"2024-06-18T17:07:34.852Z","dependencies_parsed_at":"2023-02-18T20:00:18.604Z","dependency_job_id":"f7ca1461-ccde-452e-8fc5-9011e9ababf3","html_url":"https://github.com/mjskay/tidybayes","commit_stats":{"total_commits":1035,"total_committers":5,"mean_commits":207.0,"dds":0.02898550724637683,"last_synced_commit":"b9f8517b72887baf4545d5c19e198e6991c2baf0"},"previous_names":[],"tags_count":16,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/mjskay%2Ftidybayes","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/mjskay%2Ftidybayes/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/reposit
ories/mjskay%2Ftidybayes/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/mjskay%2Ftidybayes/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/mjskay","download_url":"https://codeload.github.com/mjskay/tidybayes/tar.gz/refs/heads/master","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":247704569,"owners_count":20982298,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["bayesian-data-analysis","brms","ggplot2","jags","r","r-package","stan","tidy-data","visualization"],"created_at":"2024-07-30T19:01:00.707Z","updated_at":"2025-04-07T18:10:42.852Z","avatar_url":"https://github.com/mjskay.png","language":"R","funding_links":[],"categories":["R"],"sub_categories":[],"readme":"---\noutput: github_document\n---\n\n```{r chunk_options, include=FALSE}\nknitr::opts_chunk$set(\n  fig.path = \"man/figures/README/\"\n)\nknitr::opts_chunk$set(\n  fig.retina = 2\n)\nif (requireNamespace(\"ragg\", quietly = TRUE)) {\n  knitr::opts_chunk$set(\n    dev = \"ragg_png\"\n  )\n} else if (capabilities(\"cairo\")) {\n  knitr::opts_chunk$set(\n    dev = \"png\",\n    dev.args = list(png = list(type = \"cairo\"))\n  )\n}\ndir.create(\"README_models\", showWarnings = FALSE)\n```\n\n# tidybayes: Bayesian analysis + tidy data + geoms\n\n[![R build status](https://github.com/mjskay/tidybayes/workflows/R-CMD-check/badge.svg)](https://github.com/mjskay/tidybayes/actions)\n[![Coverage 
status](https://codecov.io/gh/mjskay/tidybayes/branch/master/graph/badge.svg)](https://codecov.io/github/mjskay/tidybayes?branch=master)\n[![CRAN status](https://www.r-pkg.org/badges/version/tidybayes)](https://cran.r-project.org/package=tidybayes)\n![Download count](https://cranlogs.r-pkg.org/badges/last-month/tidybayes)\n[![DOI](https://zenodo.org/badge/33396684.svg)](https://zenodo.org/badge/latestdoi/33396684)\n\n![Preview of tidybayes plots](man/figures/preview.gif)\n\n[tidybayes](https://mjskay.github.io/tidybayes/) is an R package that aims to make it easy to integrate popular Bayesian \nmodeling methods into a tidy data + ggplot workflow. It builds on top of (and re-exports)\nseveral functions for visualizing uncertainty from its sister package, \n[ggdist](https://mjskay.github.io/ggdist/).\n\n[Tidy](https://dx.doi.org/10.18637/jss.v059.i10)\ndata frames (one observation per row) are particularly convenient for use\nin a variety of R data manipulation and visualization packages. However,\nwhen using Bayesian modeling functions like JAGS or Stan in R, we often have\nto translate this data into a form the model understands, and then after\nrunning the model, translate the resulting sample (or predictions) into a more tidy\nformat for use with other R functions.  `tidybayes` aims to simplify these \ntwo common (often tedious) operations:\n\n* __Composing data__ for use with the model. This often means translating\n  data from a `data.frame` into a `list`, making sure `factors` are encoded as\n  numerical data, adding variables to store the length of indices, etc. This\n  package helps automate these operations using the `compose_data()` function, which\n  automatically handles data types like `numeric`, `logical`, `factor`, and `ordinal`, \n  and allows easy extensions for converting other data types into a format the\n  model understands by providing your own implementation of the generic `as_data_list()`.\n\n* __Extracting tidy draws__ from the model. 
This often means extracting indices\n  from parameters with names like `\"b[1,1]\"`, `\"b[1,2]\"` into separate columns\n  of a data frame, like `i = c(1,1,..)` and `j = c(1,2,...)`. More tediously,\n  sometimes these indices actually correspond to levels of a factor in the original\n  data; e.g. `\"x[1]\"` might correspond to a value of `x` for the first level of\n  some factor. We provide several straightforward ways to convert draws from a\n  variable with indices into useful long-format \n  (\"[tidy](https://dx.doi.org/10.18637/jss.v059.i10)\") \n  data frames, with automatic back-conversion of common data types (factors, logicals)\n  using the `spread_draws()` and `gather_draws()` functions, including automatic \n  recovery of factor levels corresponding to variable indices. In most cases this\n  kind of long-format data is much easier to use with other data-manipulation and \n  plotting packages (e.g., `dplyr`, `tidyr`, `ggplot2`) than the format provided \n  by default from the model. See `vignette(\"tidybayes\")` for examples.\n\n`tidybayes` also provides some additional functionality for data manipulation\nand visualization tasks common to many models:\n\n* __Extracting tidy fits and predictions__ from models. For models like those\n  provided by `rstanarm` and `brms`, `tidybayes` provides a tidy analog of the\n  `posterior_epred()`, `posterior_predict()`, and `posterior_linpred()` functions,\n  called `add_epred_draws()`, `add_predicted_draws()`, and `add_linpred_draws()`.\n  These functions are modeled after the `modelr::add_predictions()`\n  function, and turn a grid of predictions into a long-format data frame of\n  draws from either the fits or predictions from a model. These functions make\n  it straightforward to generate arbitrary fit lines from a model. See\n  `vignette(\"tidy-brms\")` or `vignette(\"tidy-rstanarm\")` for examples.\n  \n* __Summarizing posterior distributions__ from models. 
`tidybayes` re-exports the \n  `ggdist::point_interval()` family of functions (`median_qi()`, `mean_qi()`, `mode_hdi()`, etc.),\n  which are methods for generating point summaries and intervals that are designed with tidy workflows \n  in mind. They can generate point summaries plus an arbitrary number of probability \n  intervals *from* tidy data frames of draws, they *return* tidy data frames,\n  and they **respect data frame groups**. `tidybayes` also provides an implementation\n  of `posterior::summarise_draws()` for use with grouped data frames, such as those\n  returned by the `tidybayes::XXX_draws` functions.\n\n* __Visualizing priors and posteriors__. The focus on tidy data makes the output from tidybayes\n  easy to visualize using `ggplot`. While existing `geom`s (like `ggplot2::geom_pointrange()` and \n  `ggplot2::geom_linerange()`) can give useful output, the output from `tidybayes` is designed to work\n  well with several geoms and stats in its sister package, `ggdist`. These geoms have sensible\n  defaults suitable for visualizing posterior point summaries and intervals (`ggdist::geom_pointinterval()`,\n  `ggdist::stat_pointinterval()`), visualizing\n  distributions with point summaries and intervals (the `ggdist::stat_sample_slabinterval()` family of\n  stats, including eye plots, half-eye plots, CCDF bar plots, gradient plots, dotplots, and histograms), \n  and visualizing fit lines with an arbitrary number of uncertainty bands (`ggdist::geom_lineribbon()` \n  and `ggdist::stat_lineribbon()`). Priors can also be visualized in the same way using the \n  `ggdist::stat_slabinterval()` family of stats. The `ggdist::geom_dotsinterval()` family also\n  automatically finds good binning parameters for dotplots, and can be used to easily construct\n  quantile dotplots of posteriors (see example in this document). 
For convenience, `tidybayes`\n  re-exports the `ggdist` stats and geoms.\n  \n  ![The slabinterval family of geoms and stats](man/figures/slabinterval_family.png)\n\n  See `vignette(\"slabinterval\", package = \"ggdist\")` for more information.\n\n* __Extracting and visualizing data frames of random variables__ from models.\n  `tidybayes` also provides `XXX_rvars` functions as alternatives to the \n  `XXX_draws` functions, such as `spread_rvars()`, `add_predicted_rvars()`, etc. \n  These functions instead return tidy data frames of `posterior::rvar()`s, a\n  vectorized random variable data type (see `vignette(\"rvar\", package = \"posterior\")`\n  for more about `rvar`s). Combined with the `ggdist::stat_slabinterval()`\n  and `ggdist::stat_lineribbon()` geometries, these functions make it\n  easy to extract samples from distributions, manipulate them, and visualize them;\n  this format may have significant advantages in terms of memory required for \n  large models. See `vignette(\"tidy-posterior\")` for examples.\n\n* __Comparing a variable across levels of a factor__, which often means first\n  generating pairs of levels of a factor (according to some desired set of \n  comparisons) and then computing a function over the value of the comparison\n  variable for those pairs of levels. 
Assuming your data is in the format\n  returned by `spread_draws`, the `compare_levels` function allows comparison\n  across levels to be made easily.\n\nFinally, `tidybayes` aims to fit into common workflows through __compatibility with\nother packages__:\n\n* Its core functions for returning tidy data frames of draws are built on top\n  of `posterior::as_draws_df()`.\n\n* Drop-in functions to translate tidy column names used by `tidybayes` to/from names used\n  by other common packages and functions, including column names used by\n  `ggmcmc::ggs` (via `to_ggmcmc_names` and `from_ggmcmc_names`) and column names used by\n  `broom::tidy` (via `to_broom_names` and `from_broom_names`), which makes comparison\n  with results of other models straightforward.\n  \n* The `unspread_draws` and `ungather_draws` functions invert\n  `spread_draws` and `gather_draws`, aiding compatibility with other Bayesian\n  plotting packages (notably `bayesplot`).\n  \n* The `gather_emmeans_draws` function turns the output from `emmeans::emmeans`\n  (formerly `lsmeans`) into long-format data frames (when applied to supported\n  model types, like `MCMCglmm` and `rstanarm` models).\n\n  \n\n## Supported model types\n\n`tidybayes` aims to support a variety of models with a uniform interface. Currently supported models include \n[rstan](https://cran.r-project.org/package=rstan),\n[cmdstanr](https://mc-stan.org/cmdstanr/),\n[brms](https://cran.r-project.org/package=brms), \n[rstanarm](https://cran.r-project.org/package=rstanarm), \n[runjags](https://cran.r-project.org/package=runjags),\n[rjags](https://cran.r-project.org/package=rjags), \n[jagsUI](https://cran.r-project.org/package=jagsUI), \n[coda::mcmc and coda::mcmc.list](https://cran.r-project.org/package=coda),\n[posterior::draws](https://mc-stan.org/posterior/),\n[MCMCglmm](https://cran.r-project.org/package=MCMCglmm), \nand anything with its own `as.mcmc.list` implementation. 
If you install the [tidybayes.rethinking](https://mjskay.github.io/tidybayes.rethinking/) package, models from the [rethinking](https://github.com/rmcelreath/rethinking) package are also supported.\n\n\n## Installation\n\nYou can install the currently-released version from CRAN with this R\ncommand:\n\n```{r install, eval=FALSE}\ninstall.packages(\"tidybayes\")\n```\n\nAlternatively, you can install the latest development version from GitHub with these R\ncommands:\n\n```{r install_github, eval=FALSE}\ninstall.packages(\"devtools\")\ndevtools::install_github(\"mjskay/tidybayes\")\n```\n\n## Examples\n\nThis example shows the use of tidybayes with the Stan modeling language; however, tidybayes supports many other model types, such as JAGS, brms, rstanarm, and (theoretically) any model type supported by `coda::as.mcmc.list`.\n\n```{r setup, message = FALSE, warning = FALSE}\nlibrary(magrittr)\nlibrary(dplyr)\nlibrary(ggplot2)\nlibrary(rstan)\nlibrary(tidybayes)\nlibrary(emmeans)\nlibrary(broom)\nlibrary(brms)\nlibrary(modelr)\nlibrary(forcats)\nlibrary(cowplot)\nlibrary(RColorBrewer)\nlibrary(gganimate)\n\ntheme_set(theme_tidybayes() + panel_border())\n```\n\n```{r hidden_options, include=FALSE}\nrstan_options(auto_write = TRUE)\noptions(mc.cores = parallel::detectCores())\n\n#misc options\noptions(width = 90)\n```\n\nImagine this dataset:\n\n```{r make_data}\nset.seed(5)\nn = 10\nn_condition = 5\nABC =\n  tibble(\n    condition = factor(rep(c(\"A\",\"B\",\"C\",\"D\",\"E\"), n)),\n    response = rnorm(n * 5, c(0,1,2,1,-1), 0.5)\n  )\n\nABC %\u003e%\n  ggplot(aes(x = response, y = condition)) +\n  geom_point(alpha = 0.5) +\n  ylab(\"condition\")\n```\n\nA hierarchical model of this data might fit an overall mean across the conditions (`overall_mean`), the standard deviation of the condition means (`condition_mean_sd`), the mean within each condition (`condition_mean[condition]`) and the standard deviation of the responses given a condition mean 
(`response_sd`):\n\n```{stan abc_model, output.var = \"ABC_stan\", results = \"hide\", cache = TRUE}\ndata {\n  int\u003clower=1\u003e n;\n  int\u003clower=1\u003e n_condition;\n  int\u003clower=1, upper=n_condition\u003e condition[n];\n  real response[n];\n}\nparameters {\n  real overall_mean;\n  vector[n_condition] condition_zoffset;\n  real\u003clower=0\u003e response_sd;\n  real\u003clower=0\u003e condition_mean_sd;\n}\ntransformed parameters {\n  vector[n_condition] condition_mean;\n  condition_mean = overall_mean + condition_zoffset * condition_mean_sd;\n}\nmodel {\n  response_sd ~ cauchy(0, 1);         // =\u003e half-cauchy(0, 1)\n  condition_mean_sd ~ cauchy(0, 1);   // =\u003e half-cauchy(0, 1)\n  overall_mean ~ normal(0, 5);\n  condition_zoffset ~ normal(0, 1);   // =\u003e condition_mean ~ normal(overall_mean, condition_mean_sd)\n  for (i in 1:n) {\n    response[i] ~ normal(condition_mean[condition[i]], response_sd);\n  }\n}\n```\n\n### Composing data for input to model: `compose_data`\n\nWe have compiled and loaded this model into the variable `ABC_stan`. 
Rather than munge the data into a format Stan likes ourselves, we will use the `tidybayes::compose_data()` function, which takes our `ABC` data frame and automatically generates a list of the following elements:\n\n* `n`: number of observations in the data frame\n* `n_condition`: number of levels of the condition factor\n* `condition`: a vector of integers indicating the condition of each observation\n* `response`: a vector of observations\n\nSo we can skip right to modeling:\n\n```{r abc_sampling}\nm = sampling(ABC_stan, data = compose_data(ABC), control = list(adapt_delta = 0.99))\n```\n\n### Getting tidy draws from the model: `spread_draws`\n\nWe decorate the fitted model using `tidybayes::recover_types()`, which will ensure that numeric indices (like `condition`) are translated back into factors when we extract data:\n\n```{r recover_types}\nm %\u003c\u003e% recover_types(ABC)\n```\n\nNow we can extract variables of interest using `spread_draws`, which automatically parses indices, converts them back into their original format, and turns them into data frame columns. This function accepts a symbolic specification of Stan variables using the same syntax you would use to index columns in Stan. For example, we can extract the condition means and the residual standard deviation:\n\n```{r spread_draws}\nm %\u003e%\n  spread_draws(condition_mean[condition], response_sd) %\u003e%\n  head(15)  # just show the first few rows\n```\n\nThe condition numbers are automatically turned back into text (\"A\", \"B\", \"C\", ...) and split into their own column. 
A long-format data frame is returned with a row for every draw $\\times$ every combination of indices across all variables given to `spread_draws`; for example, because `response_sd` here is not indexed by `condition`, within the same draw it has the same value for each row corresponding to a different `condition` (some other formats supported by `tidybayes` are discussed in `vignette(\"tidybayes\")`; in particular, the format returned by `gather_draws`).\n\n\n### Plotting posteriors as eye plots: `stat_eye()`\n\nAutomatic splitting of indices into columns makes it easy to plot the condition means here. We will employ the `ggdist::stat_eye()` geom, which combines a violin plot of the posterior density with the median and the 66% and 95% quantile intervals to give an \"eye plot\" of the posterior. The point and interval types are customizable using the `point_interval()` family of functions. A \"half-eye\" plot (non-mirrored density) is also available as `ggdist::stat_halfeye()`. All tidybayes geometries automatically detect their appropriate orientation, though this can be overridden with the `orientation` parameter if the detection fails.\n\n```{r stat_eye}\nm %\u003e%\n  spread_draws(condition_mean[condition]) %\u003e%\n  ggplot(aes(x = condition_mean, y = condition)) +\n  stat_eye()\n```\n\nOr one can employ the similar \"half-eye\" plot:\n\n```{r stat_halfeye}\nm %\u003e%\n  spread_draws(condition_mean[condition]) %\u003e%\n  ggplot(aes(x = condition_mean, y = condition)) +\n  stat_halfeye()\n```\n\nA variety of other stats and geoms for visualizing priors and posteriors are available; see `vignette(\"slabinterval\", package = \"ggdist\")` for an overview of them.\n\n### Plotting posteriors as quantile dotplots\n\nIntervals are nice if the alpha level happens to line up with whatever decision you are trying to make, but getting a shape of the posterior is better (hence eye plots, above). 
On the other hand, making inferences from density plots is imprecise (estimating the area of one shape as a proportion of another is a hard perceptual task). Reasoning about probability in frequency formats is easier, motivating [quantile dotplots](https://github.com/mjskay/when-ish-is-my-bus/blob/master/quantile-dotplots.md) ([Kay et al. 2016](https://doi.org/10.1145/2858036.2858558), [Fernandes et al. 2018](https://doi.org/10.1145/3173574.3173718)), which also allow precise estimation of arbitrary intervals (down to the dot resolution of the plot, 100 in the example below). \n\nWithin the slabinterval family of geoms in tidybayes are the `dots` and `dotsinterval` families, which automatically determine appropriate bin sizes for dotplots and can calculate quantiles from samples to construct quantile dotplots. `ggdist::stat_dots()` is the variant designed for use on samples:\n\n```{r quantile_dotplots}\nm %\u003e%\n  spread_draws(condition_mean[condition]) %\u003e%\n  ggplot(aes(x = condition_mean, y = condition)) +\n  stat_dots(quantiles = 100) \n```\n\nThe idea is to get away from thinking about the posterior as indicating one canonical point or interval, but instead to represent it as (say) 100 approximately equally likely points.\n\n### Point and interval summaries\n\nThe functions `ggdist::median_qi()`, `ggdist::mean_qi()`, `ggdist::mode_hdi()`, etc. (the `point_interval` functions) give tidy output of point summaries and intervals:\n\n```{r median_qi}\nm %\u003e%\n  spread_draws(condition_mean[condition]) %\u003e%\n  median_qi(condition_mean)\n```\n\n\n### Comparison to other models via compatibility with `broom`\n\nTranslation functions like `ggdist::to_broom_names()`, `ggdist::from_broom_names()`, `ggdist::to_ggmcmc_names()`, etc. can be used to translate between common tidy format data frames with different naming schemes. 
This makes it easy, for example, to compare point summaries and intervals between `tidybayes` output and models that are supported by `broom::tidy`.\n\nFor example, let's compare against ordinary least squares (OLS) regression:\n\n```{r broom_ols}\nlinear_results = \n  lm(response ~ condition, data = ABC) %\u003e% \n  emmeans(~ condition) %\u003e% \n  tidy(conf.int = TRUE) %\u003e%\n  mutate(model = \"OLS\")\nlinear_results\n```\n\nUsing `ggdist::to_broom_names()`, we'll convert the output from `median_qi` (which uses names `.lower` and `.upper`) to use names from `broom` (`conf.low` and `conf.high`) so that comparison with output from `broom::tidy` is easy:\n\n```{r broom_tidybayes}\nbayes_results = m %\u003e%\n  spread_draws(condition_mean[condition]) %\u003e%\n  median_qi(estimate = condition_mean) %\u003e%\n  to_broom_names() %\u003e%\n  mutate(model = \"Bayes\")\nbayes_results\n```\n\nThis makes it easy to bind the two results together and plot them:\n\n```{r broom_bind}\nbind_rows(linear_results, bayes_results) %\u003e%\n  ggplot(aes(y = condition, x = estimate, xmin = conf.low, xmax = conf.high, color = model)) +\n  geom_pointinterval(position = position_dodge(width = .3))\n```\n\nShrinkage towards the overall mean is visible in the Bayesian results.\n\n### Posterior prediction and complex custom plots\n\nThe tidy data format returned by `spread_draws` also facilitates additional computation on variables followed by the construction of more complex custom plots. 
For example, we can generate posterior predictions easily, and use the `.width` argument (passed internally to `median_qi`) to generate any number of intervals from the posterior predictions, then plot them alongside point summaries and the data:\n\n```{r pp_intervals}\nm %\u003e%\n  spread_draws(condition_mean[condition], response_sd) %\u003e%\n  mutate(prediction = rnorm(n(), condition_mean, response_sd)) %\u003e%\n  ggplot(aes(y = condition)) +\n  \n  # posterior predictive intervals\n  stat_interval(aes(x = prediction), .width = c(.5, .8, .95)) +\n  scale_color_brewer() +\n  \n  # median and quantile intervals of condition mean\n  stat_pointinterval(aes(x = condition_mean), .width = c(.66, .95), position = position_nudge(y = -0.2)) +\n  \n  # data\n  geom_point(aes(x = response), data = ABC)\n```\n\nThis plot shows the posterior median and the 66% and 95% quantile credible intervals of each condition mean (point + black line); 95%, 80%, and 50% posterior predictive intervals (blue); and the data.\n\n\n### Fit curves\n\nFor models that support it (like `rstanarm` and `brms` models), we can also use the `add_epred_draws()` or `add_predicted_draws()` functions to generate distributions of posterior means or predictions. 
Combined with the functions from the `modelr` package, this makes it easy to generate fit curves.\n\nLet's fit a slightly naive model to miles per gallon versus horsepower in the `mtcars` dataset:\n\n```{r m_mpg, results = \"hide\", message = FALSE, warning = FALSE, cache = TRUE}\nm_mpg = brm(\n  mpg ~ log(hp), \n  data = mtcars, \n  family = lognormal,\n\n  file = \"README_models/m_mpg.rds\" # cache model (can be removed)  \n)\n```\n\nNow we will use `modelr::data_grid`, `tidybayes::add_predicted_draws()`, and `ggdist::stat_lineribbon()` to generate a fit curve with multiple probability bands:\n\n```{r pp_bands}\nmtcars %\u003e%\n  data_grid(hp = seq_range(hp, n = 101)) %\u003e%\n  add_predicted_draws(m_mpg) %\u003e%\n  ggplot(aes(x = hp, y = mpg)) +\n  stat_lineribbon(aes(y = .prediction), .width = c(.99, .95, .8, .5), color = \"#08519C\") +\n  geom_point(data = mtcars, size = 2) +\n  scale_fill_brewer()\n```\n\n`ggdist::stat_lineribbon(aes(y = .prediction), .width = c(.99, .95, .8, .5))` is one of several shortcut geoms that simplify common combinations of `tidybayes` functions and `ggplot` geoms. It is roughly equivalent to the following:\n\n```r\n  stat_summary(\n    aes(y = .prediction, fill = forcats::fct_rev(ordered(after_stat(.width))), group = -after_stat(.width)), \n    geom = \"ribbon\", point_interval = median_qi, fun.args = list(.width = c(.99, .95, .8, .5))\n  ) +\n  stat_summary(aes(y = .prediction), fun = median, geom = \"line\", color = \"red\", linewidth = 1.25)\n```\n\nBecause this is all tidy data, if you want to build a model with interactions among different categorical variables (say, a different curve for automatic and manual transmissions), you can easily generate predictions faceted over that variable. 
Then you could use the existing faceting features built in to ggplot to plot them.\n\nSuch a model might be:\n\n```{r m_mpg_am, results = \"hide\", message = FALSE, warning = FALSE, cache = TRUE}\nm_mpg_am = brm(\n  mpg ~ log(hp) * am, \n  data = mtcars, \n  family = lognormal,\n\n  file = \"README_models/m_mpg_am.rds\" # cache model (can be removed)  \n)\n```\n\nThen we can generate and plot predictions as before (differences from above are highlighted as comments):\n\n```{r pp_bands_facet}\nmtcars %\u003e%\n  data_grid(hp = seq_range(hp, n = 101), am) %\u003e%    # add am to the prediction grid\n  add_predicted_draws(m_mpg_am) %\u003e%\n  ggplot(aes(x = hp, y = mpg)) +\n  stat_lineribbon(aes(y = .prediction), .width = c(.99, .95, .8, .5), color = \"#08519C\") +\n  geom_point(data = mtcars) +\n  scale_fill_brewer() +\n  facet_wrap(~ am)                                  # facet by am\n```\n\nOr, if you would like overplotted posterior fit lines, you can instead use `tidybayes::add_epred_draws()` to get draws from conditional means (expectations of the posterior predictive, thus `epred`), select some reasonable number of them (say `ndraws = 100`), and then plot them:\n\n```{r spaghetti}\nmtcars %\u003e%\n  data_grid(hp = seq_range(hp, n = 200), am) %\u003e%\n  # NOTE: this shows the use of ndraws to subsample within add_epred_draws()\n  # ONLY do this IF you are planning to make spaghetti plots, etc.\n  # NEVER subsample to a small sample to plot intervals, densities, etc.\n  add_epred_draws(m_mpg_am, ndraws = 100) %\u003e%   # sample 100 means from the posterior\n  ggplot(aes(x = hp, y = mpg)) +\n  geom_line(aes(y = .epred, group = .draw), alpha = 1/20, color = \"#08519C\") +\n  geom_point(data = mtcars) +\n  facet_wrap(~ am)\n```\n\nAnimated hypothetical outcome plots (HOPs) can also be easily constructed by using `gganimate`:\n\n```{r hops}\nset.seed(12345)\nndraws = 50\n\np = mtcars %\u003e%\n  data_grid(hp = seq_range(hp, n = 50), am) %\u003e%\n  # NOTE: this 
shows the use of ndraws to subsample within add_epred_draws()\n  # ONLY do this IF you are planning to make spaghetti plots, etc.\n  # NEVER subsample to a small sample to plot intervals, densities, etc.\n  add_epred_draws(m_mpg_am, ndraws = ndraws) %\u003e%\n  ggplot(aes(x = hp, y = mpg)) +\n  geom_line(aes(y = .epred, group = .draw), color = \"#08519C\") +\n  geom_point(data = mtcars) +\n  facet_wrap(~ am, labeller = label_both) +\n  transition_states(.draw, 0, 1) +\n  shadow_mark(past = TRUE, future = TRUE, alpha = 1/20, color = \"gray50\")\n\nanimate(p, nframes = ndraws, fps = 2.5, width = 672, height = 480, units = \"px\", res = 100, dev = \"ragg_png\")\n```\n\nSee `vignette(\"tidybayes\")` for a variety of additional examples and more explanation of how it works.\n\n\n## Feedback, issues, and contributions\n\nI welcome feedback, suggestions, issues, and contributions! Contact me at \u003cmjskay@northwestern.edu\u003e. If you have found a bug, please file it [here](https://github.com/mjskay/tidybayes/issues/new) with minimal code to reproduce the issue. Pull requests should be filed against the [`dev`](https://github.com/mjskay/tidybayes/tree/dev) branch.\n\n`tidybayes` grew out of helper functions I wrote to make my own analysis pipelines tidier. Over time it has expanded to cover more use cases I have encountered, but I would love to make it cover more!\n\n## Citing `tidybayes`\n\nMatthew Kay (`r format(Sys.Date(), \"%Y\")`). _tidybayes: Tidy Data and Geoms for Bayesian Models_. R package version `r getNamespaceVersion(\"tidybayes\")`, \u003chttps://mjskay.github.io/tidybayes/\u003e.\nDOI: [10.5281/zenodo.1308151](https://doi.org/10.5281/zenodo.1308151).\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fmjskay%2Ftidybayes","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fmjskay%2Ftidybayes","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fmjskay%2Ftidybayes/lists"}