{"id":32201049,"url":"https://github.com/fchamroukhi/samurais","last_synced_at":"2025-10-22T03:59:21.958Z","repository":{"id":56936781,"uuid":"195705969","full_name":"fchamroukhi/SaMUraiS","owner":"fchamroukhi","description":"StAtistical Models for the UnsupeRvised segmentAion of tIme-Series","archived":false,"fork":false,"pushed_at":"2020-01-22T17:31:47.000Z","size":11331,"stargazers_count":11,"open_issues_count":0,"forks_count":1,"subscribers_count":2,"default_branch":"master","last_synced_at":"2025-10-22T03:59:07.070Z","etag":null,"topics":["artificial-intelligence","change-point-detection","data-science","dynamic-programming","em-algorithm","hidden-markov-models","hidden-process-regression","human-activity-recognition","latent-variable-models","model-selection","multivariate-timeseries","newton-raphson","piecewise-regression","statistical-inference","statistical-learning","time-series-analysis","time-series-clustering"],"latest_commit_sha":null,"homepage":"","language":"R","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":null,"status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/fchamroukhi.png","metadata":{"files":{"readme":"README.Rmd","changelog":null,"contributing":null,"funding":null,"license":null,"code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null}},"created_at":"2019-07-07T23:20:22.000Z","updated_at":"2025-03-14T02:01:36.000Z","dependencies_parsed_at":"2022-08-21T01:10:28.925Z","dependency_job_id":null,"html_url":"https://github.com/fchamroukhi/SaMUraiS","commit_stats":null,"previous_names":[],"tags_count":1,"template":false,"template_full_name":null,"purl":"pkg:github/fchamroukhi/SaMUraiS","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/fchamroukhi%2FSaMUraiS","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/fchamroukhi%2FSaMUraiS/tags","releases_url":"htt
ps://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/fchamroukhi%2FSaMUraiS/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/fchamroukhi%2FSaMUraiS/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/fchamroukhi","download_url":"https://codeload.github.com/fchamroukhi/SaMUraiS/tar.gz/refs/heads/master","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/fchamroukhi%2FSaMUraiS/sbom","scorecard":null,"host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":280376547,"owners_count":26320275,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","status":"online","status_checked_at":"2025-10-22T02:00:06.515Z","response_time":63,"last_error":null,"robots_txt_status":"success","robots_txt_updated_at":"2025-07-24T06:49:26.215Z","robots_txt_url":"https://github.com/robots.txt","online":true,"can_crawl_api":true,"host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["artificial-intelligence","change-point-detection","data-science","dynamic-programming","em-algorithm","hidden-markov-models","hidden-process-regression","human-activity-recognition","latent-variable-models","model-selection","multivariate-timeseries","newton-raphson","piecewise-regression","statistical-inference","statistical-learning","time-series-analysis","time-series-clustering"],"created_at":"2025-10-22T03:59:20.931Z","updated_at":"2025-10-22T03:59:21.953Z","avatar_url":"https://github.com/fchamroukhi.png","language":"R","funding_links":[],"categories":[],"sub_categories":[],"readme":"---\noutput: github_document\nbibliography: bibliography.bib\ncsl: 
chicago-author-date.csl\nnocite: '@*'\n---\n\n\u003c!-- README.md is generated from README.Rmd. Please edit that file --\u003e\n\n```{r, include = FALSE}\nknitr::opts_chunk$set(\n  collapse = TRUE,\n  comment = \"#\u003e\",\n  fig.align = \"center\",\n  fig.path = \"man/figures/README-\"\n)\n```\n\n# **SaMUraiS**: **S**t**A**tistical **M**odels for the **U**nsupe**R**vised segment**A**t**I**on of time-**S**eries\n\n\u003c!-- badges: start --\u003e\n[![Travis build status](https://travis-ci.org/fchamroukhi/SaMUraiS.svg?branch=master)](https://travis-ci.org/fchamroukhi/SaMUraiS)\n[![CRAN versions](https://www.r-pkg.org/badges/version/samurais)](https://CRAN.R-project.org/package=samurais)\n[![CRAN logs](https://cranlogs.r-pkg.org/badges/samurais)](https://CRAN.R-project.org/package=samurais)\n\u003c!-- badges: end --\u003e\n\nsamurais is an open-source toolbox (available in R and in Matlab) including \nmany original and flexible user-friendly statistical latent variable models \nand unsupervised algorithms to segment and represent time-series data \n(univariate or multivariate) and, more generally, longitudinal data that \ninclude regime changes.\n\nOur samurais mainly use the following efficient \"sword\" packages to segment \ndata: Regression with Hidden Logistic Process (**RHLP**), Hidden Markov Model\nRegression (**HMMR**), Piece-Wise regression (**PWR**), Multivariate 'RHLP'\n(**MRHLP**), and Multivariate 'HMMR' (**MHMMR**).\n\nThe models and algorithms are developed and written in Matlab by Faicel \nChamroukhi, and translated and designed into R packages by Florian Lecocq, \nMarius Bartcus and Faicel Chamroukhi.\n\n# Installation\n\nYou can install the **samurais** package from\n[GitHub](https://github.com/fchamroukhi/SaMUraiS) with:\n\n```{r, eval = FALSE}\n# install.packages(\"devtools\")\ndevtools::install_github(\"fchamroukhi/SaMUraiS\")\n```\n\nTo build *vignettes* for examples of usage, type the command below instead:\n\n```{r, eval = FALSE}\n# 
install.packages(\"devtools\")\ndevtools::install_github(\"fchamroukhi/SaMUraiS\", \n                         build_opts = c(\"--no-resave-data\", \"--no-manual\"), \n                         build_vignettes = TRUE)\n```\n\nUse the following command to display vignettes:\n\n```{r, eval = FALSE}\nbrowseVignettes(\"samurais\")\n```\n\n# Usage\n\n```{r, message = FALSE}\nlibrary(samurais)\n```\n\n\u003cdetails\u003e\n  \u003csummary\u003eRHLP\u003c/summary\u003e\n\n```{r, echo = TRUE}\n# Application to a toy data set\ndata(\"univtoydataset\")\nx \u003c- univtoydataset$x\ny \u003c- univtoydataset$y\n\nK \u003c- 5 # Number of regimes (mixture components)\np \u003c- 3 # Dimension of beta (order of the polynomial regressors)\nq \u003c- 1 # Dimension of w (order of the logistic regression: to be set to 1 for segmentation)\nvariance_type \u003c- \"heteroskedastic\" # \"heteroskedastic\" or \"homoskedastic\" model\n\nn_tries \u003c- 1\nmax_iter = 1500\nthreshold \u003c- 1e-6\nverbose \u003c- TRUE\nverbose_IRLS \u003c- FALSE\n\nrhlp \u003c- emRHLP(X = x, Y = y, K, p, q, variance_type, n_tries, \n               max_iter, threshold, verbose, verbose_IRLS)\n\nrhlp$summary()\n\nrhlp$plot()\n```\n\n```{r, echo = TRUE}\n# Application to a real data set\ndata(\"univrealdataset\")\nx \u003c- univrealdataset$x\ny \u003c- univrealdataset$y2\n\nK \u003c- 5 # Number of regimes (mixture components)\np \u003c- 3 # Dimension of beta (order of the polynomial regressors)\nq \u003c- 1 # Dimension of w (order of the logistic regression: to be set to 1 for segmentation)\nvariance_type \u003c- \"heteroskedastic\" # \"heteroskedastic\" or \"homoskedastic\" model\n\nn_tries \u003c- 1\nmax_iter = 1500\nthreshold \u003c- 1e-6\nverbose \u003c- TRUE\nverbose_IRLS \u003c- FALSE\n\nrhlp \u003c- emRHLP(X = x, Y = y, K, p, q, variance_type, n_tries, \n               max_iter, threshold, verbose, verbose_IRLS)\n\nrhlp$summary()\n\nrhlp$plot()\n```\n\u003c/details\u003e\n\n\u003cdetails\u003e\n  
\u003csummary\u003eHMMR\u003c/summary\u003e\n\n```{r, echo = TRUE}\n# Application to a toy data set\ndata(\"univtoydataset\")\nx \u003c- univtoydataset$x\ny \u003c- univtoydataset$y\n\nK \u003c- 5 # Number of regimes (states)\np \u003c- 3 # Dimension of beta (order of the polynomial regressors)\nvariance_type \u003c- \"heteroskedastic\" # \"heteroskedastic\" or \"homoskedastic\" model\n\nn_tries \u003c- 1\nmax_iter \u003c- 1500\nthreshold \u003c- 1e-6\nverbose \u003c- TRUE\n\nhmmr \u003c- emHMMR(X = x, Y = y, K, p, variance_type, \n               n_tries, max_iter, threshold, verbose)\n\nhmmr$summary()\n\nhmmr$plot(what = c(\"smoothed\", \"regressors\", \"loglikelihood\"))\n```\n\n\n```{r, echo = TRUE}\n# Application to a real data set\ndata(\"univrealdataset\")\nx \u003c- univrealdataset$x\ny \u003c- univrealdataset$y2\n\nK \u003c- 5 # Number of regimes (states)\np \u003c- 3 # Dimension of beta (order of the polynomial regressors)\nvariance_type \u003c- \"heteroskedastic\" # \"heteroskedastic\" or \"homoskedastic\" model\n\nn_tries \u003c- 1\nmax_iter \u003c- 1500\nthreshold \u003c- 1e-6\nverbose \u003c- TRUE\n\nhmmr \u003c- emHMMR(X = x, Y = y, K, p, variance_type, \n               n_tries, max_iter, threshold, verbose)\n\nhmmr$summary()\n\nhmmr$plot(what = c(\"smoothed\", \"regressors\", \"loglikelihood\"))\n```\n\n\u003c/details\u003e\n\n\u003cdetails\u003e\n  \u003csummary\u003ePWR\u003c/summary\u003e\n\n```{r, echo = TRUE}\n# Application to a toy data set\ndata(\"univtoydataset\")\nx \u003c- univtoydataset$x\ny \u003c- univtoydataset$y\n\nK \u003c- 5 # Number of segments\np \u003c- 3 # Polynomial degree\n\npwr \u003c- fitPWRFisher(X = x, Y = y, K, p)\n\npwr$summary()\n\npwr$plot()\n```\n\n\n```{r, echo = TRUE}\n# Application to a real data set\ndata(\"univrealdataset\")\nx \u003c- univrealdataset$x\ny \u003c- univrealdataset$y2\n\nK \u003c- 5 # Number of segments\np \u003c- 3 # Polynomial degree\n\npwr \u003c- fitPWRFisher(X = x, Y = y, K, 
p)\n\npwr$summary()\n\npwr$plot()\n```\n\n\u003c/details\u003e\n\n\u003cdetails\u003e\n\u003csummary\u003eMRHLP\u003c/summary\u003e\n\n```{r, echo = TRUE}\n# Application to a toy data set\ndata(\"multivtoydataset\")\nx \u003c- multivtoydataset$x\ny \u003c- multivtoydataset[,c(\"y1\", \"y2\", \"y3\")]\n\nK \u003c- 5 # Number of regimes (mixture components)\np \u003c- 1 # Dimension of beta (order of the polynomial regressors)\nq \u003c- 1 # Dimension of w (order of the logistic regression: to be set to 1 for segmentation)\nvariance_type \u003c- \"heteroskedastic\" # \"heteroskedastic\" or \"homoskedastic\" model\n\nn_tries \u003c- 1\nmax_iter \u003c- 1500\nthreshold \u003c- 1e-6\nverbose \u003c- TRUE\nverbose_IRLS \u003c- FALSE\n\nmrhlp \u003c- emMRHLP(X = x, Y = y, K, p, q, variance_type, n_tries, \n                 max_iter, threshold, verbose, verbose_IRLS)\n\nmrhlp$summary()\n\nmrhlp$plot()\n```\n\n```{r, echo = TRUE}\n# Application to a real data set (human activity recognition data)\ndata(\"multivrealdataset\")\nx \u003c- multivrealdataset$x\ny \u003c- multivrealdataset[,c(\"y1\", \"y2\", \"y3\")]\n\nK \u003c- 5 # Number of regimes (mixture components)\np \u003c- 3 # Dimension of beta (order of the polynomial regressors)\nq \u003c- 1 # Dimension of w (order of the logistic regression: to be set to 1 for segmentation)\nvariance_type \u003c- \"heteroskedastic\" # \"heteroskedastic\" or \"homoskedastic\" model\n\nn_tries \u003c- 1\nmax_iter \u003c- 1500\nthreshold \u003c- 1e-6\nverbose \u003c- TRUE\nverbose_IRLS \u003c- FALSE\n\nmrhlp \u003c- emMRHLP(X = x, Y = y, K, p, q, variance_type, n_tries, \n                 max_iter, threshold, verbose, verbose_IRLS)\n\nmrhlp$summary()\n\nmrhlp$plot()\n```\n\u003c/details\u003e\n\n\u003cdetails\u003e\n  \u003csummary\u003eMHMMR\u003c/summary\u003e\n\n```{r, echo = TRUE}\n# Application to a simulated data set\ndata(\"multivtoydataset\")\nx \u003c- multivtoydataset$x\ny \u003c- multivtoydataset[,c(\"y1\", \"y2\", 
\"y3\")]\n\nK \u003c- 5 # Number of regimes (states)\np \u003c- 1 # Dimension of beta (order of the polynomial regressors)\nvariance_type \u003c- \"heteroskedastic\" # \"heteroskedastic\" or \"homoskedastic\" model\n\nn_tries \u003c- 1\nmax_iter \u003c- 1500\nthreshold \u003c- 1e-6\nverbose \u003c- TRUE\n\nmhmmr \u003c- emMHMMR(X = x, Y = y, K, p, variance_type, n_tries, \n                 max_iter, threshold, verbose)\n\nmhmmr$summary()\n\nmhmmr$plot(what = c(\"smoothed\", \"regressors\", \"loglikelihood\"))\n```\n\n```{r, echo = TRUE}\n# Application to a real data set (human activity recognition data)\ndata(\"multivrealdataset\")\nx \u003c- multivrealdataset$x\ny \u003c- multivrealdataset[,c(\"y1\", \"y2\", \"y3\")]\n\nK \u003c- 5 # Number of regimes (states)\np \u003c- 3 # Dimension of beta (order of the polynomial regressors)\nvariance_type \u003c- \"heteroskedastic\" # \"heteroskedastic\" or \"homoskedastic\" model\n\nn_tries \u003c- 1\nmax_iter \u003c- 1500\nthreshold \u003c- 1e-6\nverbose \u003c- TRUE\n\nmhmmr \u003c- emMHMMR(X = x, Y = y, K, p, variance_type, n_tries, \n                 max_iter, threshold, verbose)\n\nmhmmr$summary()\n\nmhmmr$plot(what = c(\"smoothed\", \"regressors\", \"loglikelihood\"))\n```\n\n\u003c/details\u003e\n\n# Model selection\n\nsamurais also implements model selection procedures to select an optimal model \nbased on information criteria including **BIC**, **AIC** and **ICL**.\n\nThe selection can be done for the two following parameters:\n\n * **K**: The number of regimes (segments);\n * **p**: The order of the polynomial regression.\n\nInstructions below can be used to illustrate the model on provided simulated \nand real data sets.\n\n\u003cdetails\u003e\n  \u003csummary\u003eRHLP\u003c/summary\u003e\n\nLet's select a RHLP model for the following time series:\n\n```{r, message = FALSE}\ndata(\"univtoydataset\")\nx = univtoydataset$x\ny = univtoydataset$y\n\nplot(x, y, type = \"l\", xlab = \"x\", ylab = \"Y\")\n```\n\n```{r, 
message = FALSE}\nselectedrhlp \u003c- selectRHLP(X = x, Y = y, Kmin = 2, Kmax = 6, pmin = 0, pmax = 3)\n\nselectedrhlp$plot(what = \"estimatedsignal\")\n```\n\n\u003c/details\u003e\n\n\u003cdetails\u003e\n  \u003csummary\u003eHMMR\u003c/summary\u003e\n\nLet's select a HMMR model for the following time series:\n\n```{r, message = FALSE}\ndata(\"univtoydataset\")\nx = univtoydataset$x\ny = univtoydataset$y\n\nplot(x, y, type = \"l\", xlab = \"x\", ylab = \"Y\")\n```\n\n```{r, message = FALSE}\nselectedhmmr \u003c- selectHMMR(X = x, Y = y, Kmin = 2, Kmax = 6, pmin = 0, pmax = 3)\n\nselectedhmmr$plot(what = \"smoothed\")\n```\n\n\u003c/details\u003e\n\n\u003cdetails\u003e\n\u003csummary\u003eMRHLP\u003c/summary\u003e\n\nLet's select a MRHLP model for the following multivariate time series:\n\n\u003cbr /\u003e\n\n```{r}\ndata(\"multivtoydataset\")\nx \u003c- multivtoydataset$x\ny \u003c- multivtoydataset[, c(\"y1\", \"y2\", \"y3\")]\nmatplot(x, y, type = \"l\", xlab = \"x\", ylab = \"Y\", lty = 1)\n```\n\n```{r, message = FALSE}\nselectedmrhlp \u003c- selectMRHLP(X = x, Y = y, Kmin = 2, Kmax = 6, pmin = 0, pmax = 3)\n\nselectedmrhlp$plot(what = \"estimatedsignal\")\n```\n\n\u003c/details\u003e\n\n\u003cdetails\u003e\n  \u003csummary\u003eMHMMR\u003c/summary\u003e\n\nLet's select a MHMMR model for the following multivariate time series:\n\n```{r}\ndata(\"multivtoydataset\")\nx \u003c- multivtoydataset$x\ny \u003c- multivtoydataset[, c(\"y1\", \"y2\", \"y3\")]\nmatplot(x, y, type = \"l\", xlab = \"x\", ylab = \"Y\", lty = 1)\n```\n\n```{r, message = FALSE}\nselectedmhmmr \u003c- selectMHMMR(X = x, Y = y, Kmin = 2, Kmax = 6, pmin = 0, pmax = 3)\n\nselectedmhmmr$plot(what = \"smoothed\")\n```\n\n\u003c/details\u003e\n\n# 
References\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Ffchamroukhi%2Fsamurais","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Ffchamroukhi%2Fsamurais","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Ffchamroukhi%2Fsamurais/lists"}