{"id":17949315,"url":"https://github.com/helske/bssm","last_synced_at":"2025-04-30T14:10:33.325Z","repository":{"id":43459562,"uuid":"53692028","full_name":"helske/bssm","owner":"helske","description":"Bayesian Inference of State Space Models","archived":false,"fork":false,"pushed_at":"2024-09-08T10:46:39.000Z","size":66452,"stargazers_count":40,"open_issues_count":2,"forks_count":14,"subscribers_count":6,"default_branch":"main","last_synced_at":"2024-10-29T09:18:40.910Z","etag":null,"topics":["bayesian-inference","cpp","markov-chain-monte-carlo","particle-filter","r","state-space","time-series"],"latest_commit_sha":null,"homepage":"","language":"HTML","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":null,"status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/helske.png","metadata":{"files":{"readme":"README.Rmd","changelog":"NEWS.md","contributing":".github/CONTRIBUTING.md","funding":null,"license":null,"code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":"codemeta.json"}},"created_at":"2016-03-11T19:36:44.000Z","updated_at":"2024-10-03T09:50:21.000Z","dependencies_parsed_at":"2023-10-16T20:25:21.729Z","dependency_job_id":"267a3128-cd1e-4540-a7b4-77072946974a","html_url":"https://github.com/helske/bssm","commit_stats":{"total_commits":858,"total_committers":9,"mean_commits":95.33333333333333,"dds":0.4883449883449883,"last_synced_commit":"232b2fe37f94c29c05ddc11580dda795c02b210d"},"previous_names":[],"tags_count":7,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/helske%2Fbssm","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/helske%2Fbssm/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/helske%2Fbssm/releases",
"manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/helske%2Fbssm/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/helske","download_url":"https://codeload.github.com/helske/bssm/tar.gz/refs/heads/main","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":243510125,"owners_count":20302295,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["bayesian-inference","cpp","markov-chain-monte-carlo","particle-filter","r","state-space","time-series"],"created_at":"2024-10-29T09:16:10.823Z","updated_at":"2025-03-14T02:05:56.388Z","avatar_url":"https://github.com/helske.png","language":"HTML","funding_links":[],"categories":[],"sub_categories":[],"readme":"---\noutput: github_document\n---\n\n\u003c!-- README.md is generated from README.Rmd. 
Please edit that file --\u003e\n\n```{r, include = FALSE}\nknitr::opts_chunk$set(\n  collapse = TRUE,\n  comment = \"#\u003e\",\n  fig.path = \"man/figures/README-\",\n  out.width = \"100%\",\n  cache = TRUE\n)\n```\n\n```{r srr-tags, eval = FALSE, echo = FALSE}\n#' @srrstats {G1.2} Contains project status badge.\n#' @srrstats {G1.4,G1.4a} Package uses roxygen2 for documentation.\n#' @srrstats {G2.0, G2.0a, G2.1, G2.1a, G2.2, G2.4, G2.4a, G2.4b, G2.4c, G2.6} \n#' Input types and shapes are tested and checked with autotest and converted \n#' explicitly when necessary.\n#' \n#' @srrstats {G2.3, G2.3a, G2.3b} match.arg and tolower are used where \n#' applicable.\n#' @srrstats {G1.0, G1.3, G1.4, G1.4a, G1.5, G1.6} General \n#' documentation, addressed by the vignettes and the corresponding R \n#' Journal paper.\n#' @srrstats {G1.1} This is the first software to implement the IS-MCMC by \n#' Vihola, Helske, and Franks (2020) and first R package to implement delayed \n#' acceptance pseudo-marginal MCMC for state space models. The IS-MCMC method \n#' is also available in [walker](github.com/helske/walker) package for a \n#' limited class of time-varying GLMss (a small subset of the models \n#' supported by this package). Some of the functionality for exponential family \n#' state space models is also available in [KFAS](github.com/helske/KFAS), and \n#' those models can be converted easily to bssm format for Bayesian analysis.\n#' @srrstats {G2.4, G2.4a, G2.4b, G2.4c, G2.6} Explicit conversions are used \n#' where necessary.\n#' \n#' @srrstats {G2.14, G2.14a, G2.14b, G2.14c, G2.15, G2.16} Missing observations \n#' (y) are handled automatically as per SSM theory, whereas missing values are \n#' not allowed elsewhere. 
Imputing or ignoring them does not make sense in a time \n#' series context.\n#'\n#' @srrstats {G3.0} No floating point equality comparisons are made.\n#'\n#' @srrstats {G5.4, G5.4a, G5.4b, G5.4c, G5.5, G5.6, G5.6a, G5.6b, G5.7} and\n#' @srrstats {BS4.0, BS4.1} The algorithms work as defined per Vihola, Helske, \n#' Franks (2020) (all simulations were implemented with the bssm package) and \n#' Helske and Vihola (2021). Full replication of the results would take \n#' days/weeks (but see also bsm_ng, negbin_series and several testthat tests).\n#'\n#' @srrstats {G5.8, G5.8a, G5.8b, G5.8c, G5.8d} Tested with autotest and the \n#' testthat tests.\n#' @srrstats {G5.9, G5.9a, G5.9b} Tested with autotest and the testthat tests.\n#'\n#' @srrstats {BS1.0, BS1.1, BS1.2, BS1.2a, BS1.2b, BS1.3b} Addressed in the \n#' models.R, run_mcmc.R, in vignettes and in the R Journal paper.\n#'\n#' @srrstats {BS2.1, BS2.1a, BS2.6} Tested and demonstrated by autotest and \n#' package examples/tests.\n#' @srrstats {BS7.4, BS7.4a} The scales do not matter (in terms of runtime) \n#' in random walk Metropolis nor in particle filters, as long as numerical \n#' issues are not encountered\n```\n\n# bssm\n\n\u003c!-- badges: start --\u003e\n[![Project Status: Active - The project has reached a stable, usable state and is being actively developed](https://www.repostatus.org/badges/latest/active.svg)](https://www.repostatus.org/#active)\n[![Status at rOpenSci Software Peer Review](https://badges.ropensci.org/489_status.svg)](https://github.com/ropensci/software-review/issues/489)\n[![R-CMD-check](https://github.com/helske/bssm/workflows/R-CMD-check/badge.svg)](https://github.com/helske/bssm/actions)\n[![Codecov test coverage](https://codecov.io/gh/helske/bssm/graph/badge.svg)](https://app.codecov.io/gh/helske/bssm)\n[![CRAN version](http://www.r-pkg.org/badges/version/bssm)]( 
https://CRAN.R-project.org/package=bssm)\n[![downloads](https://cranlogs.r-pkg.org/badges/bssm)](https://cranlogs.r-pkg.org/badges/bssm)\n\n\u003c!-- badges: end --\u003e\n\nThe `bssm` R package provides efficient methods for Bayesian inference of state \nspace models via particle Markov chain Monte Carlo and importance sampling type \nweighted MCMC. \nCurrently Gaussian, Poisson, binomial, negative binomial, and Gamma observation \ndensities with linear-Gaussian state dynamics, as well as general non-linear \nGaussian models and discretely observed latent diffusion processes are \nsupported.\n\nFor details, see \n\n* [The bssm paper in The R Journal](https://journal.r-project.org/archive/2021/RJ-2021-103/index.html)\n* [Package vignettes at CRAN](https://CRAN.R-project.org/package=bssm) \n* Paper on [Importance sampling type estimators based on approximate marginal Markov chain Monte Carlo](https://onlinelibrary.wiley.com/doi/abs/10.1111/sjos.12492)\n\nThere are also a couple of posters and a talk related to the IS-correction methodology and the bssm package: \n\n* [UseR!2021 talk slides](https://jounihelske.netlify.app/talk/user2021/)    \n* [SMC 2017 workshop: Accelerating MCMC with an approximation ](http://users.jyu.fi/~jovetale/posters/SMC2017)\n* [UseR!2017: Bayesian non-Gaussian state space models in R](http://users.jyu.fi/~jovetale/posters/user2017.pdf)\n\nThe `bssm` package was originally developed with the support of the Academy of Finland grants 284513, 312605, 311877, and 331817. Current development is focused on increased usability. For recent changes, see the NEWS file.\n\n### Citing the package \n\nIf you use the `bssm` package in publications, please cite the corresponding R Journal paper:\n\nJouni Helske and Matti Vihola (2021). \"bssm: Bayesian Inference of Non-linear and Non-Gaussian State Space Models in R.\" The R Journal (2021) 13:2, pages 578-589. 
https://journal.r-project.org/archive/2021/RJ-2021-103/index.html\n\n## Installation\n\nYou can install the released version of bssm from [CRAN](https://CRAN.R-project.org) with:\n\n```{r, eval=FALSE}\ninstall.packages(\"bssm\")\n```\n\nAnd the development version from [GitHub](https://github.com/) with:\n\n```{r, eval=FALSE}\n# install.packages(\"devtools\")\ndevtools::install_github(\"helske/bssm\")\n```\nOr from R-universe with:\n\n```{r, eval = FALSE}\ninstall.packages(\"bssm\", repos = \"https://helske.r-universe.dev\")\n```\n\n## Example\n\nConsider the daily air quality measurements in New York from May to September 1973, available in the `datasets` package. Let's try to predict the missing ozone levels with a simple linear-Gaussian local linear trend model with temperature and wind as explanatory variables (missing response variables are handled naturally in the state space modelling framework; however, missing values in covariates are normally not allowed):\n\n```{r example}\nlibrary(\"bssm\")\nlibrary(\"dplyr\")\nlibrary(\"ggplot2\")\nset.seed(1)\n\ndata(\"airquality\", package = \"datasets\")\n\n# Covariates as matrix. For complex cases, check out the as_bssm function\nxreg \u003c- airquality |\u003e select(Wind, Temp) |\u003e as.matrix()\n\nmodel \u003c- bsm_lg(airquality$Ozone,\n  xreg = xreg,  \n  # Define priors for hyperparameters (i.e. 
not the states), see ?bssm_prior\n  # Initial value followed by parameters of the prior distribution\n  beta = normal_prior(rep(0, ncol(xreg)), 0, 1),\n  sd_y = gamma_prior(1, 2, 0.01),\n  sd_level = gamma_prior(1, 2, 0.01), \n  sd_slope = gamma_prior(1, 2, 0.01))\n\nfit \u003c- run_mcmc(model, iter = 20000, burnin = 5000)\nfit\n\nobs \u003c- data.frame(Time = 1:nrow(airquality),\n  Ozone = airquality$Ozone) |\u003e filter(!is.na(Ozone))\n\npred \u003c- fitted(fit, model)\npred |\u003e\n  ggplot(aes(x = Time, y = Mean)) + \n  geom_ribbon(aes(ymin = `2.5%`, ymax = `97.5%`), \n    alpha = 0.5, fill = \"steelblue\") + \n  geom_line() + \n  geom_point(data = obs, \n    aes(x = Time, y = Ozone), colour = \"Tomato\") +\n  theme_bw()\n\n```\n\nThe same model, but now assuming that the observations follow a Gamma distribution:\n```{r gamma-example}\n\nmodel2 \u003c- bsm_ng(airquality$Ozone,\n  xreg = xreg,  \n  beta = normal(rep(0, ncol(xreg)), 0, 1),\n  distribution = \"gamma\",\n  phi = gamma_prior(1, 2, 0.01),\n  sd_level = gamma_prior(1, 2, 0.1), \n  sd_slope = gamma_prior(1, 2, 0.1))\n\nfit2 \u003c- run_mcmc(model2, iter = 20000, burnin = 5000, particles = 10)\nfit2\n```\n\nComparison:\n```{r compare}\npred2 \u003c- fitted(fit2, model2)\n\nbind_rows(list(Gaussian = pred, Gamma = pred2), .id = \"Model\") |\u003e\n  ggplot(aes(x = Time, y = Mean)) + \n  geom_ribbon(aes(ymin = `2.5%`, ymax = `97.5%`, fill = Model), \n    alpha = 0.25) + \n  geom_line(aes(colour = Model)) + \n  geom_point(data = obs, \n    aes(x = Time, y = Ozone)) +\n  theme_bw()\n```\n\n\nNow let's assume that we also want to use the solar radiation variable as a predictor for ozone. As it contains a few missing values, we cannot use it directly. As the number of missing time points is very small, simple imputation would likely be acceptable, but let's consider another approach. For simplicity, the slope terms of the previous models are now omitted, and we focus on the Gaussian case. 
Let $\\mu_t$ be the true solar radiation at time $t$. Now for ozone $O_t$ we assume the following model:\n\n$O_t = D_t + \\alpha_t + \\beta_S \\mu_t + \\sigma_\\epsilon \\epsilon_t$\\\n$\\alpha_{t+1} = \\alpha_t + \\sigma_\\eta\\eta_t$\\\n$\\alpha_1 \\sim N(0, 100^2\\textrm{I})$,\\\nwhere $D_t = \\beta X_t$ contains regression terms related to wind and temperature, $\\alpha_t$ is the time-varying intercept term, and $\\beta_S$ is the effect of solar radiation $\\mu_t$.\n\nNow for the observed solar radiation $S_t$ we assume \n\n$S_t = \\mu_t$\\\n$\\mu_{t+1} = \\mu_t + \\sigma_\\xi\\xi_t,$\\\n$\\mu_1 \\sim N(0, 100^2)$,\\\ni.e. we assume a simple random walk for $\\mu$, which we observe either without error or not at all (there is no error term in the observation equation $S_t=\\mu_t$).\n\nWe combine these two models as a bivariate Gaussian model with `ssm_mlg`:\n\n```{r missing-values}\n# predictors (not including solar radiation) for ozone\nxreg \u003c- airquality |\u003e select(Wind, Temp) |\u003e as.matrix()\n\n# Function which outputs new model components given the parameter vector theta\nupdate_fn \u003c- function(theta) {\n  D \u003c- rbind(t(xreg %*% theta[1:2]), 1)\n  Z \u003c- matrix(c(1, 0, theta[3], 1), 2, 2)\n  R \u003c- diag(exp(theta[4:5]))\n  H \u003c- diag(c(exp(theta[6]), 0))\n  # add third dimension so we have p x n x 1, p x m x 1, p x p x 1 arrays\n  dim(Z)[3] \u003c- dim(R)[3] \u003c- dim(H)[3] \u003c- 1\n  list(D = D, Z = Z, R = R, H = H)\n}\n\n# Function for log-prior density\nprior_fn \u003c- function(theta) {\n  sum(dnorm(theta[1:3], 0, 10, log = TRUE)) + \n    sum(dgamma(exp(theta[4:6]), 2, 0.01, log = TRUE)) + \n    sum(theta[4:6]) # log-jacobian\n}\n\ninit_theta \u003c- c(0, 0, 0, log(50), log(5), log(20))\ncomps \u003c- update_fn(init_theta)\n\nmodel \u003c- ssm_mlg(y = cbind(Ozone = airquality$Ozone, Solar = airquality$Solar.R),\n  Z = comps$Z, D = comps$D, H = comps$H, T = diag(2), R = comps$R, \n  a1 = rep(0, 2), P1 = diag(100, 2), 
init_theta = init_theta, \n  state_names = c(\"alpha\", \"mu\"), update_fn = update_fn,\n  prior_fn = prior_fn)\n\nfit \u003c- run_mcmc(model, iter = 60000, burnin = 10000)\nfit\n```\n\nDraw predictions:\n```{r bivariate-fig}\npred \u003c- fitted(fit, model)\n\nobs \u003c- data.frame(Time = 1:nrow(airquality),\n  Solar = airquality$Solar.R) |\u003e filter(!is.na(Solar))\n\npred |\u003e filter(Variable == \"Solar\") |\u003e\n  ggplot(aes(x = Time, y = Mean)) + \n  geom_ribbon(aes(ymin = `2.5%`, ymax = `97.5%`), \n    alpha = 0.5, fill = \"steelblue\") + \n  geom_line() + \n  geom_point(data = obs, \n    aes(x = Time, y = Solar), colour = \"Tomato\") +\n  theme_bw()\n\n\nobs \u003c- data.frame(Time = 1:nrow(airquality),\n  Ozone = airquality$Ozone) |\u003e filter(!is.na(Ozone))\n\npred |\u003e filter(Variable == \"Ozone\") |\u003e\n  ggplot(aes(x = Time, y = Mean)) + \n  geom_ribbon(aes(ymin = `2.5%`, ymax = `97.5%`), \n    alpha = 0.5, fill = \"steelblue\") + \n  geom_line() +  \n  geom_point(data = obs, \n    aes(x = Time, y = Ozone), colour = \"Tomato\") +\n  theme_bw()\n```\n\nSee more examples in the paper, vignettes, and in the docs.\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fhelske%2Fbssm","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fhelske%2Fbssm","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fhelske%2Fbssm/lists"}