https://github.com/fchamroukhi/samurais
StAtistical Models for the UnsupeRvised segmentAtIon of time-Series
- Host: GitHub
- URL: https://github.com/fchamroukhi/samurais
- Owner: fchamroukhi
- Created: 2019-07-07T23:20:22.000Z (over 6 years ago)
- Default Branch: master
- Last Pushed: 2020-01-22T17:31:47.000Z (about 6 years ago)
- Last Synced: 2025-10-22T03:59:07.070Z (3 months ago)
- Topics: artificial-intelligence, change-point-detection, data-science, dynamic-programming, em-algorithm, hidden-markov-models, hidden-process-regression, human-activity-recognition, latent-variable-models, model-selection, multivariate-timeseries, newton-raphson, piecewise-regression, statistical-inference, statistical-learning, time-series-analysis, time-series-clustering
- Language: R
- Homepage:
- Size: 10.8 MB
- Stars: 11
- Watchers: 2
- Forks: 1
- Open Issues: 0
Metadata Files:
- Readme: README.Rmd
---
output: github_document
bibliography: bibliography.bib
csl: chicago-author-date.csl
nocite: '@*'
---
```{r, include = FALSE}
knitr::opts_chunk$set(
collapse = TRUE,
comment = "#>",
fig.align = "center",
fig.path = "man/figures/README-"
)
```
# **SaMUraiS**: **S**t**A**tistical **M**odels for the **U**nsupe**R**vised segment**A**t**I**on of time-**S**eries
samurais is an open-source toolbox (available in R and in Matlab) that includes
many original, flexible, and user-friendly statistical latent variable models
and unsupervised algorithms to segment and represent time-series data
(univariate or multivariate) and, more generally, longitudinal data that
include regime changes.
Our samurais mainly use the following efficient "sword" packages to segment
data: Regression with Hidden Logistic Process (**RHLP**), Hidden Markov Model
Regression (**HMMR**), Piecewise Regression (**PWR**), Multivariate 'RHLP'
(**MRHLP**), and Multivariate 'HMMR' (**MHMMR**).
The models and algorithms were developed and written in Matlab by Faicel
Chamroukhi, then translated and designed as R packages by Florian Lecocq,
Marius Bartcus and Faicel Chamroukhi.
# Installation
You can install the **samurais** package from
[GitHub](https://github.com/fchamroukhi/SaMUraiS) with:
```{r, eval = FALSE}
# install.packages("devtools")
devtools::install_github("fchamroukhi/SaMUraiS")
```
To also build the *vignettes*, which contain usage examples, use the command below instead:
```{r, eval = FALSE}
# install.packages("devtools")
devtools::install_github("fchamroukhi/SaMUraiS",
build_opts = c("--no-resave-data", "--no-manual"),
build_vignettes = TRUE)
```
Use the following command to display vignettes:
```{r, eval = FALSE}
browseVignettes("samurais")
```
# Usage
```{r, message = FALSE}
library(samurais)
```
## RHLP
```{r, echo = TRUE}
# Application to a toy data set
data("univtoydataset")
x <- univtoydataset$x
y <- univtoydataset$y
K <- 5 # Number of regimes (mixture components)
p <- 3 # Dimension of beta (order of the polynomial regressors)
q <- 1 # Dimension of w (order of the logistic regression: to be set to 1 for segmentation)
variance_type <- "heteroskedastic" # "heteroskedastic" or "homoskedastic" model
n_tries <- 1
max_iter <- 1500
threshold <- 1e-6
verbose <- TRUE
verbose_IRLS <- FALSE
rhlp <- emRHLP(X = x, Y = y, K, p, q, variance_type, n_tries,
max_iter, threshold, verbose, verbose_IRLS)
rhlp$summary()
rhlp$plot()
```
```{r, echo = TRUE}
# Application to a real data set
data("univrealdataset")
x <- univrealdataset$x
y <- univrealdataset$y2
K <- 5 # Number of regimes (mixture components)
p <- 3 # Dimension of beta (order of the polynomial regressors)
q <- 1 # Dimension of w (order of the logistic regression: to be set to 1 for segmentation)
variance_type <- "heteroskedastic" # "heteroskedastic" or "homoskedastic" model
n_tries <- 1
max_iter <- 1500
threshold <- 1e-6
verbose <- TRUE
verbose_IRLS <- FALSE
rhlp <- emRHLP(X = x, Y = y, K, p, q, variance_type, n_tries,
max_iter, threshold, verbose, verbose_IRLS)
rhlp$summary()
rhlp$plot()
```
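The comment on `q` in the chunks above says the logistic order should be set to 1 for segmentation. A minimal base-R sketch of why, assuming the usual multinomial logistic (softmax) form for the regime probabilities: when the scores are linear in `x`, the most probable regime can only change at isolated points, so hard assignments form contiguous segments. The function `logistic_probs` and the values of `w0` and `w1` are hypothetical and are not part of the samurais API.

```r
# Softmax regime probabilities with scores that are linear in x (q = 1).
# Hypothetical helper for illustration only; not a samurais function.
logistic_probs <- function(x, w0, w1) {
  # One row per time point, one column per regime: w0[k] + w1[k] * x[i]
  scores <- outer(x, w1) +
    matrix(w0, nrow = length(x), ncol = length(w0), byrow = TRUE)
  # Subtract each row's maximum before exponentiating, for numerical stability
  scores <- scores - apply(scores, 1, max)
  expscores <- exp(scores)
  expscores / rowSums(expscores)
}

x <- seq(0, 1, length.out = 200)
w0 <- c(0, -12, -40)  # made-up intercepts, one per regime
w1 <- c(0, 40, 80)    # made-up slopes, one per regime

probs <- logistic_probs(x, w0, w1)

# Hard segmentation: assign each point to its most probable regime
z <- max.col(probs, ties.method = "first")

# First x value of each new segment
change_points <- x[which(diff(z) != 0) + 1]
```

With these made-up weights the scores of regimes 1 and 2 cross at `x = 0.3` and those of regimes 2 and 3 cross at `x = 0.7`, so `change_points` falls just after those two values. Larger slopes in `w1` would make the transitions between regimes sharper.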
## HMMR
```{r, echo = TRUE}
# Application to a toy data set
data("univtoydataset")
x <- univtoydataset$x
y <- univtoydataset$y
K <- 5 # Number of regimes (states)
p <- 3 # Dimension of beta (order of the polynomial regressors)
variance_type <- "heteroskedastic" # "heteroskedastic" or "homoskedastic" model
n_tries <- 1
max_iter <- 1500
threshold <- 1e-6
verbose <- TRUE
hmmr <- emHMMR(X = x, Y = y, K, p, variance_type,
n_tries, max_iter, threshold, verbose)
hmmr$summary()
hmmr$plot(what = c("smoothed", "regressors", "loglikelihood"))
```
```{r, echo = TRUE}
# Application to a real data set
data("univrealdataset")
x <- univrealdataset$x
y <- univrealdataset$y2
K <- 5 # Number of regimes (states)
p <- 3 # Dimension of beta (order of the polynomial regressors)
variance_type <- "heteroskedastic" # "heteroskedastic" or "homoskedastic" model
n_tries <- 1
max_iter <- 1500
threshold <- 1e-6
verbose <- TRUE
hmmr <- emHMMR(X = x, Y = y, K, p, variance_type,
n_tries, max_iter, threshold, verbose)
hmmr$summary()
hmmr$plot(what = c("smoothed", "regressors", "loglikelihood"))
```
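To make the HMMR setting concrete, the base-R sketch below simulates data of the kind such a model describes: a hidden Markov chain picks the regime at each time point, and each regime has its own polynomial regression and noise level. All parameter values are made up, and the left-to-right transition matrix is an assumption chosen for readability rather than something the package requires.

```r
set.seed(1)

n <- 300
x <- seq(0, 1, length.out = n)
K <- 3  # number of regimes (states)

# Hypothetical left-to-right transition matrix: stay, or move to the next state
trans <- matrix(c(0.98, 0.02, 0.00,
                  0.00, 0.98, 0.02,
                  0.00, 0.00, 1.00), nrow = K, byrow = TRUE)
init <- c(1, 0, 0)  # the chain starts in regime 1

# Regression coefficients (rows: intercept, slope; columns: regimes)
beta <- cbind(c(0, 1), c(2, -1), c(-1, 3))
sigma <- c(0.1, 0.3, 0.2)  # one noise standard deviation per regime

# Simulate the hidden regime sequence
z <- integer(n)
z[1] <- sample.int(K, 1, prob = init)
for (i in 2:n) {
  z[i] <- sample.int(K, 1, prob = trans[z[i - 1], ])
}

# Regime-specific polynomial mean plus regime-specific Gaussian noise
Phi <- cbind(1, x)
mu <- rowSums(Phi * t(beta[, z]))
y <- mu + sigma[z] * rnorm(n)
```

Fitting `emHMMR` to such `x` and `y` would then amount to recovering `z`, `beta` and `sigma` from the observations alone.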
## PWR
```{r, echo = TRUE}
# Application to a toy data set
data("univtoydataset")
x <- univtoydataset$x
y <- univtoydataset$y
K <- 5 # Number of segments
p <- 3 # Polynomial degree
pwr <- fitPWRFisher(X = x, Y = y, K, p)
pwr$summary()
pwr$plot()
```
```{r, echo = TRUE}
# Application to a real data set
data("univrealdataset")
x <- univrealdataset$x
y <- univrealdataset$y2
K <- 5 # Number of segments
p <- 3 # Polynomial degree
pwr <- fitPWRFisher(X = x, Y = y, K, p)
pwr$summary()
pwr$plot()
```
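Piecewise regression with a fixed number of segments can be solved exactly by dynamic programming. The base-R sketch below illustrates the idea under simple assumptions (least-squares cost, contiguous segments); it is not the code behind `fitPWRFisher`, and the helper names `segment_cost` and `piecewise_dp` are hypothetical.

```r
# Residual sum of squares of a degree-p polynomial fitted to points i..j
segment_cost <- function(x, y, i, j, p) {
  Phi <- outer(x[i:j], 0:p, `^`)
  sum(lm.fit(Phi, y[i:j])$residuals^2)
}

# Optimal split of the series into K contiguous segments (hypothetical helper)
piecewise_dp <- function(x, y, K, p) {
  n <- length(y)
  min_len <- p + 1  # need at least p + 1 points to fit a degree-p polynomial

  # cost[i, j]: cost of a single segment covering points i..j
  cost <- matrix(Inf, n, n)
  for (i in 1:n) {
    for (j in i:n) {
      if (j - i + 1 >= min_len) cost[i, j] <- segment_cost(x, y, i, j, p)
    }
  }

  # best[k, j]: minimal cost of splitting points 1..j into k segments
  best <- matrix(Inf, K, n)
  last_start <- matrix(NA_integer_, K, n)
  best[1, ] <- cost[1, ]
  last_start[1, ] <- 1L
  for (k in seq_len(K)[-1]) {
    for (j in 1:n) {
      for (s in seq_len(j)[-1]) {  # s: first index of the k-th segment
        candidate <- best[k - 1, s - 1] + cost[s, j]
        if (candidate < best[k, j]) {
          best[k, j] <- candidate
          last_start[k, j] <- s
        }
      }
    }
  }

  # Backtrack from the last segment to recover every segment start
  starts <- integer(K)
  j <- n
  for (k in K:1) {
    starts[k] <- last_start[k, j]
    j <- starts[k] - 1L
  }
  list(starts = starts, cost = best[K, n])
}

# Three constant levels with a small deterministic perturbation
x <- seq(0, 1, length.out = 60)
y <- rep(c(0, 5, 1), each = 20) + 0.05 * sin(seq_len(60))
fit <- piecewise_dp(x, y, K = 3, p = 0)
fit$starts  # segments start at indices 1, 21 and 41
```

The sketch precomputes every segment cost, which takes on the order of `n^2` small regressions. That is fine for short series but becomes slow for long ones.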
## MRHLP
```{r, echo = TRUE}
# Application to a toy data set
data("multivtoydataset")
x <- multivtoydataset$x
y <- multivtoydataset[,c("y1", "y2", "y3")]
K <- 5 # Number of regimes (mixture components)
p <- 1 # Dimension of beta (order of the polynomial regressors)
q <- 1 # Dimension of w (order of the logistic regression: to be set to 1 for segmentation)
variance_type <- "heteroskedastic" # "heteroskedastic" or "homoskedastic" model
n_tries <- 1
max_iter <- 1500
threshold <- 1e-6
verbose <- TRUE
verbose_IRLS <- FALSE
mrhlp <- emMRHLP(X = x, Y = y, K, p, q, variance_type, n_tries,
max_iter, threshold, verbose, verbose_IRLS)
mrhlp$summary()
mrhlp$plot()
```
```{r, echo = TRUE}
# Application to a real data set (human activity recognition data)
data("multivrealdataset")
x <- multivrealdataset$x
y <- multivrealdataset[,c("y1", "y2", "y3")]
K <- 5 # Number of regimes (mixture components)
p <- 3 # Dimension of beta (order of the polynomial regressors)
q <- 1 # Dimension of w (order of the logistic regression: to be set to 1 for segmentation)
variance_type <- "heteroskedastic" # "heteroskedastic" or "homoskedastic" model
n_tries <- 1
max_iter <- 1500
threshold <- 1e-6
verbose <- TRUE
verbose_IRLS <- FALSE
mrhlp <- emMRHLP(X = x, Y = y, K, p, q, variance_type, n_tries,
max_iter, threshold, verbose, verbose_IRLS)
mrhlp$summary()
mrhlp$plot()
```
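In the multivariate models, each regime is described by a matrix of regression coefficients (one column per output dimension) and a covariance matrix. As a rough illustration of that building block, the base-R sketch below fits a single regime by ordinary least squares on simulated data. The EM weighting across regimes is deliberately left out, and every name and value here is an assumption made for the example.

```r
set.seed(2)

n <- 100
x <- seq(0, 1, length.out = n)
p <- 2  # polynomial order
d <- 3  # number of output dimensions

# Polynomial design matrix with columns 1, x, x^2
Phi <- outer(x, 0:p, `^`)

# Made-up coefficient matrix: (p + 1) rows, one column per output dimension
B_true <- cbind(c(1, 2, 0), c(0, -1, 3), c(2, 0, -2))
Y <- Phi %*% B_true + matrix(rnorm(n * d, sd = 0.1), n, d)

# Least-squares estimate of the coefficient matrix (normal equations)
B_hat <- solve(crossprod(Phi), crossprod(Phi, Y))

# Maximum-likelihood estimate of the d x d noise covariance
resid <- Y - Phi %*% B_hat
Sigma_hat <- crossprod(resid) / n
```

A "homoskedastic" variant would share one covariance matrix across regimes, while a "heteroskedastic" one would estimate a separate `Sigma_hat` per regime.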
## MHMMR
```{r, echo = TRUE}
# Application to a simulated data set
data("multivtoydataset")
x <- multivtoydataset$x
y <- multivtoydataset[,c("y1", "y2", "y3")]
K <- 5 # Number of regimes (states)
p <- 1 # Dimension of beta (order of the polynomial regressors)
variance_type <- "heteroskedastic" # "heteroskedastic" or "homoskedastic" model
n_tries <- 1
max_iter <- 1500
threshold <- 1e-6
verbose <- TRUE
mhmmr <- emMHMMR(X = x, Y = y, K, p, variance_type, n_tries,
max_iter, threshold, verbose)
mhmmr$summary()
mhmmr$plot(what = c("smoothed", "regressors", "loglikelihood"))
```
```{r, echo = TRUE}
# Application to a real data set (human activity recognition data)
data("multivrealdataset")
x <- multivrealdataset$x
y <- multivrealdataset[,c("y1", "y2", "y3")]
K <- 5 # Number of regimes (states)
p <- 3 # Dimension of beta (order of the polynomial regressors)
variance_type <- "heteroskedastic" # "heteroskedastic" or "homoskedastic" model
n_tries <- 1
max_iter <- 1500
threshold <- 1e-6
verbose <- TRUE
mhmmr <- emMHMMR(X = x, Y = y, K, p, variance_type, n_tries,
max_iter, threshold, verbose)
mhmmr$summary()
mhmmr$plot(what = c("smoothed", "regressors", "loglikelihood"))
```
# Model selection
samurais also implements model selection procedures to select an optimal model
based on information criteria including **BIC**, **AIC** and **ICL**.
Selection can be performed over the following two parameters:
* **K**: The number of regimes (segments);
* **p**: The order of the polynomial regression.
The examples below illustrate model selection on the provided data sets.
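As a small illustration of criterion-based selection, the base-R sketch below chooses a polynomial order `p` for a single-regime regression by maximizing BIC. It uses the "larger is better" convention, log-likelihood minus `nu * log(n) / 2` with `nu` free parameters. Whether samurais uses exactly this convention is an assumption here, and the data are simulated.

```r
set.seed(3)

n <- 200
x <- seq(-1, 1, length.out = n)
y <- 1 - 2 * x + 3 * x^2 + rnorm(n, sd = 0.2)  # simulated with order 2

orders <- 0:5
bic <- sapply(orders, function(p) {
  # Fit a polynomial of order p (intercept-only model when p = 0)
  fit <- if (p == 0) lm(y ~ 1) else lm(y ~ poly(x, p, raw = TRUE))
  loglik <- as.numeric(logLik(fit))
  nu <- p + 2  # p + 1 regression coefficients plus one variance
  loglik - nu * log(n) / 2
})

# Order with the highest BIC
best_p <- orders[which.max(bic)]
```

Selecting over both `K` and `p`, as the `select*` functions do, follows the same pattern on a two-dimensional grid of candidate models.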
## RHLP
Let's select an RHLP model for the following time series:
```{r, message = FALSE}
data("univtoydataset")
x <- univtoydataset$x
y <- univtoydataset$y
plot(x, y, type = "l", xlab = "x", ylab = "Y")
```
```{r, message = FALSE}
selectedrhlp <- selectRHLP(X = x, Y = y, Kmin = 2, Kmax = 6, pmin = 0, pmax = 3)
selectedrhlp$plot(what = "estimatedsignal")
```
## HMMR
Let's select an HMMR model for the following time series:
```{r, message = FALSE}
data("univtoydataset")
x <- univtoydataset$x
y <- univtoydataset$y
plot(x, y, type = "l", xlab = "x", ylab = "Y")
```
```{r, message = FALSE}
selectedhmmr <- selectHMMR(X = x, Y = y, Kmin = 2, Kmax = 6, pmin = 0, pmax = 3)
selectedhmmr$plot(what = "smoothed")
```
## MRHLP
Let's select an MRHLP model for the following multivariate time series:
```{r}
data("multivtoydataset")
x <- multivtoydataset$x
y <- multivtoydataset[, c("y1", "y2", "y3")]
matplot(x, y, type = "l", xlab = "x", ylab = "Y", lty = 1)
```
```{r, message = FALSE}
selectedmrhlp <- selectMRHLP(X = x, Y = y, Kmin = 2, Kmax = 6, pmin = 0, pmax = 3)
selectedmrhlp$plot(what = "estimatedsignal")
```
## MHMMR
Let's select an MHMMR model for the following multivariate time series:
```{r}
data("multivtoydataset")
x <- multivtoydataset$x
y <- multivtoydataset[, c("y1", "y2", "y3")]
matplot(x, y, type = "l", xlab = "x", ylab = "Y", lty = 1)
```
```{r, message = FALSE}
selectedmhmmr <- selectMHMMR(X = x, Y = y, Kmin = 2, Kmax = 6, pmin = 0, pmax = 3)
selectedmhmmr$plot(what = "smoothed")
```
# References