{"id":18520162,"url":"https://github.com/mlr-org/mlr3forecast","last_synced_at":"2025-04-09T09:32:41.223Z","repository":{"id":248905213,"uuid":"829542113","full_name":"mlr-org/mlr3forecast","owner":"mlr-org","description":"Time series forecasting for mlr3","archived":false,"fork":false,"pushed_at":"2025-03-30T18:00:21.000Z","size":2390,"stargazers_count":5,"open_issues_count":5,"forks_count":0,"subscribers_count":5,"default_branch":"main","last_synced_at":"2025-04-02T12:12:50.535Z","etag":null,"topics":["forecasting","machine-learning","mlr3","r","r-package","time-series"],"latest_commit_sha":null,"homepage":"http://mlr3forecast.mlr-org.com/","language":"R","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"lgpl-3.0","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/mlr-org.png","metadata":{"files":{"readme":"README.Rmd","changelog":"NEWS.md","contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null},"funding":{"github":"mlr-org"}},"created_at":"2024-07-16T16:33:00.000Z","updated_at":"2025-03-30T17:58:22.000Z","dependencies_parsed_at":"2025-01-17T12:28:48.340Z","dependency_job_id":"f8a18168-720d-4227-b824-b4885b63c0ae","html_url":"https://github.com/mlr-org/mlr3forecast","commit_stats":null,"previous_names":["mlr-org/mlr3forecast"],"tags_count":0,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/mlr-org%2Fmlr3forecast","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/mlr-org%2Fmlr3forecast/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/mlr-org%2Fmlr3forecast/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/mlr-org%2F
mlr3forecast/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/mlr-org","download_url":"https://codeload.github.com/mlr-org/mlr3forecast/tar.gz/refs/heads/main","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":248012599,"owners_count":21033226,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["forecasting","machine-learning","mlr3","r","r-package","time-series"],"created_at":"2024-11-06T17:18:49.406Z","updated_at":"2025-04-09T09:32:41.216Z","avatar_url":"https://github.com/mlr-org.png","language":"R","funding_links":["https://github.com/sponsors/mlr-org"],"categories":[],"sub_categories":[],"readme":"---\noutput: github_document\n---\n\n```{r, include = FALSE}\nknitr::opts_chunk$set(\n  collapse = TRUE,\n  comment = \"#\u003e\",\n  fig.path = \"man/figures/README-\",\n  out.width = \"100%\"\n)\n\nlgr::get_logger(\"mlr3\")$set_threshold(\"warn\")\noptions(datatable.print.class = FALSE, datatable.print.keys = FALSE)\nlibrary(data.table)\nlibrary(mlr3misc)\n```\n\n\n# mlr3forecast\n\nExtending mlr3 to time series forecasting.\n\n\u003c!-- badges: start --\u003e\n[![Lifecycle: experimental](https://img.shields.io/badge/lifecycle-experimental-orange.svg)](https://lifecycle.r-lib.org/articles/stages.html#experimental)\n[![RCMD Check](https://github.com/mlr-org/mlr3forecast/actions/workflows/rcmdcheck.yaml/badge.svg)](https://github.com/mlr-org/mlr3forecast/actions/workflows/rcmdcheck.yaml)\n[![CRAN 
status](https://www.r-pkg.org/badges/version/mlr3forecast)](https://CRAN.R-project.org/package=mlr3forecast)\n[![StackOverflow](https://img.shields.io/badge/stackoverflow-mlr3-orange.svg)](https://stackoverflow.com/questions/tagged/mlr3)\n[![Mattermost](https://img.shields.io/badge/chat-mattermost-orange.svg)](https://lmmisld-lmu-stats-slds.srv.mwn.de/mlr_invite/)\n\u003c!-- badges: end --\u003e\n\n\u003e This package is in an early stage of development and should be considered experimental.\n\u003e If you are interested in experimenting with it, we welcome your feedback!\n\n## Installation\n\nInstall the development version from [GitHub](https://github.com/):\n\n```{r, eval = FALSE}\n# install.packages(\"pak\")\npak::pak(\"mlr-org/mlr3forecast\")\n```\n\n## Usage\n\nThe goal of mlr3forecast is to extend mlr3 to time series forecasting.\nThis is achieved by introducing new classes and methods for forecasting tasks,\nlearners, and resamplers. For now, forecasting tasks and learners are restricted\nto time series regression tasks, but might be extended to classification tasks\nin the future.\n\nWe have two goals: one is to support traditional forecasting learners and the\nother is to support machine learning forecasting, i.e. using regression\nlearners and applying them to forecasting tasks. 
The design of the latter is\nstill in flux and may change.\n\n### Example: forecasting with forecast learner\n\nCurrently, we support native forecasting learners from the forecast package.\nIn the future, we plan to support more forecasting learners.\n\n```{r, message = FALSE}\nlibrary(mlr3forecast)\n\ntask = tsk(\"airpassengers\")\nlearner = lrn(\"fcst.auto_arima\")$train(task)\nprediction = learner$predict(task, 140:144)\nprediction$score(msr(\"regr.rmse\"))\nnewdata = generate_newdata(task, 12L)\nlearner$predict_newdata(newdata, task)\n\n# works with quantile response\nlearner = lrn(\"fcst.auto_arima\",\n  predict_type = \"quantiles\",\n  quantiles = c(0.1, 0.15, 0.5, 0.85, 0.9),\n  quantile_response = 0.5\n)$train(task)\nlearner$predict_newdata(newdata, task)\n```\n\n### Example: forecasting with regression learner\n\n```{r, message = FALSE}\nlibrary(mlr3learners)\n\ntask = tsk(\"airpassengers\")\n# we have to remove the date feature for regression learners\ntask$select(setdiff(task$feature_names, \"date\"))\nflrn = ForecastLearner$new(lrn(\"regr.ranger\"), 1:12)$train(task)\nnewdata = data.frame(passengers = rep(NA_real_, 3L))\nprediction = flrn$predict_newdata(newdata, task)\nprediction\nprediction = flrn$predict(task, 142:144)\nprediction\nprediction$score(msr(\"regr.rmse\"))\n\nflrn = ForecastLearner$new(lrn(\"regr.ranger\"), 1:12)\nresampling = rsmp(\"forecast_holdout\", ratio = 0.9)\nrr = resample(task, flrn, resampling)\nrr$aggregate(msr(\"regr.rmse\"))\n\nresampling = rsmp(\"forecast_cv\")\nrr = resample(task, flrn, resampling)\nrr$aggregate(msr(\"regr.rmse\"))\n```\n\nOr with some feature engineering using mlr3pipelines:\n\n```{r}\nlibrary(mlr3pipelines)\n\ngraph = ppl(\"convert_types\", \"Date\", \"POSIXct\") %\u003e\u003e%\n  po(\"datefeatures\",\n    param_vals = list(\n      week_of_year = FALSE,\n      day_of_year = FALSE,\n      day_of_month = FALSE,\n      day_of_week = FALSE,\n      is_day = FALSE,\n      hour = FALSE,\n      minute = FALSE,\n  
    second = FALSE\n    )\n  )\ntask = tsk(\"airpassengers\")\nflrn = ForecastLearner$new(lrn(\"regr.ranger\"), 1:12)\nglrn = as_learner(graph %\u003e\u003e% flrn)$train(task)\nprediction = glrn$predict(task, 142:144)\nprediction$score(msr(\"regr.rmse\"))\n```\n\n### Example: forecasting electricity demand\n\n```{r, message = FALSE}\nlibrary(mlr3learners)\nlibrary(mlr3pipelines)\n\ntask = tsk(\"electricity\")\ngraph = ppl(\"convert_types\", \"Date\", \"POSIXct\") %\u003e\u003e%\n  po(\"datefeatures\",\n    param_vals = list(\n      year = FALSE,\n      is_day = FALSE,\n      hour = FALSE,\n      minute = FALSE,\n      second = FALSE\n    )\n  )\nflrn = ForecastLearner$new(lrn(\"regr.ranger\"), 1:3)\nglrn = as_learner(graph %\u003e\u003e% flrn)$train(task)\n\nmax_date = task$data()[.N, date]\nnewdata = data.frame(\n  date = max_date + 1:14,\n  demand = rep(NA_real_, 14L),\n  temperature = 26,\n  holiday = c(TRUE, rep(FALSE, 13L))\n)\nprediction = glrn$predict_newdata(newdata, task)\nprediction\n```\n\n### Example: global forecasting (longitudinal data)\n\n```{r, message = FALSE}\nlibrary(mlr3learners)\nlibrary(mlr3pipelines)\nlibrary(tsibble)\n\ntask = tsibbledata::aus_livestock |\u003e\n  as.data.table() |\u003e\n  setnames(tolower) |\u003e\n  _[, month := as.Date(month)] |\u003e\n  _[, .(count = sum(count)), by = .(state, month)] |\u003e\n  setorder(state, month) |\u003e\n  as_task_fcst(\n    id = \"aus_livestock\",\n    target = \"count\",\n    order = \"month\",\n    key = \"state\",\n    freq = \"monthly\"\n  )\n\ngraph = ppl(\"convert_types\", \"Date\", \"POSIXct\") %\u003e\u003e%\n  po(\"datefeatures\",\n    param_vals = list(\n      week_of_year = FALSE,\n      day_of_week = FALSE,\n      day_of_month = FALSE,\n      day_of_year = FALSE,\n      is_day = FALSE,\n      hour = FALSE,\n      minute = FALSE,\n      second = FALSE\n    )\n  )\ntask = graph$train(task)[[1L]]\n\nflrn = ForecastLearner$new(lrn(\"regr.ranger\"), 1:3)$train(task)\nprediction = 
flrn$predict(task, 4460:4464)\nprediction$score(msr(\"regr.rmse\"))\n\nflrn = ForecastLearner$new(lrn(\"regr.ranger\"), 1:3)\nresampling = rsmp(\"forecast_holdout\", ratio = 0.9)\nrr = resample(task, flrn, resampling)\nrr$aggregate(msr(\"regr.rmse\"))\n```\n\n### Example: global vs local forecasting\n\nIn machine learning forecasting, the difference between forecasting a single time series\nand forecasting longitudinal data is often referred to as local versus global forecasting.\n\n```{r, eval = FALSE}\n# TODO: find better task example, since the effect is minor here\n\ngraph = ppl(\"convert_types\", \"Date\", \"POSIXct\") %\u003e\u003e%\n  po(\"datefeatures\",\n    param_vals = list(\n      week_of_year = FALSE,\n      day_of_week = FALSE,\n      day_of_month = FALSE,\n      day_of_year = FALSE,\n      is_day = FALSE,\n      hour = FALSE,\n      minute = FALSE,\n      second = FALSE\n    )\n  )\n\n# local forecasting\ntask = tsibbledata::aus_livestock |\u003e\n  as.data.table() |\u003e\n  setnames(tolower) |\u003e\n  _[, month := as.Date(month)] |\u003e\n  _[state == \"Western Australia\", .(count = sum(count)), by = .(month)] |\u003e\n  setorder(month) |\u003e\n  as_task_fcst(id = \"aus_livestock\", target = \"count\", order = \"month\")\ntask = graph$train(task)[[1L]]\nflrn = ForecastLearner$new(lrn(\"regr.ranger\"), 1L)$train(task)\ntab = task$backend$data(\n  rows = task$row_ids,\n  cols = c(task$backend$primary_key, \"month.year\")\n)\nsetnames(tab, c(\"row_id\", \"year\"))\nrow_ids = tab[year \u003e= 2015, row_id]\nprediction = flrn$predict(task, row_ids)\nprediction$score(msr(\"regr.rmse\"))\n\n# global forecasting\ntask = tsibbledata::aus_livestock |\u003e\n  as.data.table() |\u003e\n  setnames(tolower) |\u003e\n  _[, month := as.Date(month)] |\u003e\n  _[, .(count = sum(count)), by = .(state, month)] |\u003e\n  setorder(state, month) |\u003e\n  as_task_fcst(id = \"aus_livestock\", target = \"count\", order = \"month\", key = \"state\")\ntask = 
graph$train(task)[[1L]]\ntask$col_roles$key = \"state\"\nflrn = ForecastLearner$new(lrn(\"regr.ranger\"), 1L)$train(task)\ntab = task$backend$data(\n  rows = task$row_ids,\n  cols = c(task$backend$primary_key, \"month.year\", \"state\")\n)\nsetnames(tab, c(\"row_id\", \"year\", \"state\"))\nrow_ids = tab[year \u003e= 2015 \u0026 state == \"Western Australia\", row_id]\nprediction = flrn$predict(task, row_ids)\nprediction$score(msr(\"regr.rmse\"))\n```\n\n### Example: Custom PipeOps\n\n```{r, eval = FALSE}\nlibrary(mlr3learners)\nlibrary(mlr3pipelines)\n\ntask = tsk(\"airpassengers\")\npop = po(\"fcst.lag\", lags = 1:12)\nnew_task = pop$train(list(task))[[1L]]\nnew_task$data()\n\ntask = tsk(\"airpassengers\")\ngraph = po(\"fcst.lag\", lags = 1:12) %\u003e\u003e%\n  ppl(\"convert_types\", \"Date\", \"POSIXct\") %\u003e\u003e%\n  po(\"datefeatures\",\n    param_vals = list(\n      week_of_year = FALSE,\n      day_of_week = FALSE,\n      day_of_month = FALSE,\n      day_of_year = FALSE,\n      is_day = FALSE,\n      hour = FALSE,\n      minute = FALSE,\n      second = FALSE\n    )\n  )\nflrn = ForecastRecursiveLearner$new(lrn(\"regr.ranger\"))\nglrn = as_learner(graph %\u003e\u003e% flrn)$train(task)\nprediction = glrn$predict(task, 142:144)\nprediction$score(msr(\"regr.rmse\"))\n\nnewdata = generate_newdata(task, 12L)\nglrn$predict_newdata(newdata, task)\n```\n\n### Example: common target transformations\n\nSome common target transformations in forecasting are:\n\n- differencing (WIP)\n- log transformation, see example below\n- power transformations such as [Box-Cox](https://mlr3pipelines.mlr-org.com/reference/mlr_pipeops_boxcox.html) and [Yeo-Johnson](https://mlr3pipelines.mlr-org.com/reference/mlr_pipeops_yeojohnson.html),\n  currently only supported as feature transformations, not as target transformations\n- scaling/normalization, see [here](https://mlr3pipelines.mlr-org.com/reference/mlr_pipeops_targettrafoscalerange.html)\n\n```{r, eval = FALSE}\ntrafo = 
po(\"targetmutate\",\n  param_vals = list(\n    trafo = function(x) log(x),\n    inverter = function(x) list(response = exp(x$response))\n  )\n)\n\ngraph = po(\"fcst.lag\", lags = 1:12) %\u003e\u003e%\n  ppl(\"convert_types\", \"Date\", \"POSIXct\") %\u003e\u003e%\n  po(\"datefeatures\",\n    param_vals = list(\n      week_of_year = FALSE,\n      day_of_week = FALSE,\n      day_of_month = FALSE,\n      day_of_year = FALSE,\n      is_day = FALSE,\n      hour = FALSE,\n      minute = FALSE,\n      second = FALSE\n    )\n  )\n\ntask = tsk(\"airpassengers\")\nflrn = ForecastRecursiveLearner$new(lrn(\"regr.ranger\"))\nglrn = as_learner(graph %\u003e\u003e% flrn)\npipeline = ppl(\"targettrafo\", graph = glrn, trafo_pipeop = trafo)\nglrn = as_learner(pipeline)$train(task)\nprediction = glrn$predict(task, 142:144)\nprediction$score(msr(\"regr.rmse\"))\n```\n\n```{r, eval = FALSE}\ngraph = po(\"fcst.lag\", lags = 1:12) %\u003e\u003e%\n  ppl(\"convert_types\", \"Date\", \"POSIXct\") %\u003e\u003e%\n  po(\"datefeatures\",\n    param_vals = list(\n      week_of_year = FALSE,\n      day_of_week = FALSE,\n      day_of_month = FALSE,\n      day_of_year = FALSE,\n      is_day = FALSE,\n      hour = FALSE,\n      minute = FALSE,\n      second = FALSE\n    )\n  )\n\ntask = tsk(\"airpassengers\")\nflrn = ForecastRecursiveLearner$new(lrn(\"regr.ranger\"))\nglrn = as_learner(graph %\u003e\u003e% flrn)\ntrafo = po(\"fcst.targetdiff\", lags = 12L)\npipeline = ppl(\"targettrafo\", graph = glrn, trafo_pipeop = trafo)\nglrn = as_learner(pipeline)$train(task)\nprediction = glrn$predict(task, 142:144)\nprediction$score(msr(\"regr.rmse\"))\n```\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fmlr-org%2Fmlr3forecast","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fmlr-org%2Fmlr3forecast","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fmlr-org%2Fmlr3forecast/lists"}