{"id":15284118,"url":"https://github.com/mlr-org/mlr3db","last_synced_at":"2026-02-28T16:02:04.678Z","repository":{"id":46049535,"uuid":"151068496","full_name":"mlr-org/mlr3db","owner":"mlr-org","description":"Data Backends to let mlr3 work transparently with (remote) data bases","archived":false,"fork":false,"pushed_at":"2026-02-27T19:14:05.000Z","size":3871,"stargazers_count":23,"open_issues_count":4,"forks_count":1,"subscribers_count":14,"default_branch":"main","last_synced_at":"2026-02-27T23:43:34.155Z","etag":null,"topics":["bigquery","data-backend","database","duckdb","machine-learning","mariadb","mlr3","mysql","odbc","postgresql","r","r-package","spark","sqlite"],"latest_commit_sha":null,"homepage":"https://mlr3db.mlr-org.com","language":"R","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"lgpl-3.0","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/mlr-org.png","metadata":{"funding":{"github":"mlr-org"},"files":{"readme":"README.Rmd","changelog":"NEWS.md","contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null,"zenodo":null,"notice":null,"maintainers":null,"copyright":null,"agents":null,"dco":null,"cla":null}},"created_at":"2018-10-01T10:03:12.000Z","updated_at":"2025-10-10T08:33:02.000Z","dependencies_parsed_at":"2025-04-12T23:24:53.546Z","dependency_job_id":"f27d4bb2-5a4c-4f76-a2a8-a4c5715bbbdd","html_url":"https://github.com/mlr-org/mlr3db","commit_stats":{"total_commits":217,"total_committers":5,"mean_commits":43.4,"dds":"0.18433179723502302","last_synced_commit":"f0e0355d288182522cf54cb7121a5b35c3fafc5d"},"previous_names":[],"tags_count":12,"template":false,"template_full_name":null,"purl":"pkg:github/mlr-org/mlr3db","repository_url":"https://repos.ecosyste.ms/
api/v1/hosts/GitHub/repositories/mlr-org%2Fmlr3db","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/mlr-org%2Fmlr3db/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/mlr-org%2Fmlr3db/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/mlr-org%2Fmlr3db/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/mlr-org","download_url":"https://codeload.github.com/mlr-org/mlr3db/tar.gz/refs/heads/main","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/mlr-org%2Fmlr3db/sbom","scorecard":null,"host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":286080680,"owners_count":29941797,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2026-02-28T13:49:17.081Z","status":"ssl_error","status_checked_at":"2026-02-28T13:48:50.396Z","response_time":90,"last_error":"SSL_connect returned=1 errno=0 peeraddr=140.82.121.5:443 state=error: unexpected eof while reading","robots_txt_status":"success","robots_txt_updated_at":"2025-07-24T06:49:26.215Z","robots_txt_url":"https://github.com/robots.txt","online":false,"can_crawl_api":true,"host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["bigquery","data-backend","database","duckdb","machine-learning","mariadb","mlr3","mysql","odbc","postgresql","r","r-package","spark","sqlite"],"created_at":"2024-09-30T14:49:50.411Z","updated_at":"2026-02-28T16:02:04.647Z","avatar_url":"https://github.com/mlr-org.png","language":"R","funding_links":["https://github.com/sponsors/mlr-org"],"categories":[],"sub_categories":[],"readme":"---\noutput: github_document\n---\n\n```{r, include = 
FALSE}\nknitr::opts_chunk$set(\n  collapse = TRUE,\n  comment = \"#\u003e\",\n  fig.path = \"man/figures/README-\",\n  out.width = \"100%\"\n)\nlgr::get_logger(\"mlr3\")$set_threshold(\"warn\")\n```\n\n# mlr3db\n\n\u003c!-- badges: start --\u003e\n[![r-cmd-check](https://github.com/mlr-org/mlr3db/actions/workflows/r-cmd-check.yml/badge.svg)](https://github.com/mlr-org/mlr3db/actions/workflows/r-cmd-check.yml)\n[![CRAN Status](https://www.r-pkg.org/badges/version-ago/mlr3db)](https://cran.r-project.org/package=mlr3db)\n[![Mattermost](https://img.shields.io/badge/chat-mattermost-orange.svg)](https://lmmisld-lmu-stats-slds.srv.mwn.de/mlr_invite/)\n\u003c!-- badges: end --\u003e\n\nPackage website: [release](https://mlr3db.mlr-org.com/) | [dev](https://mlr3db.mlr-org.com/dev/)\n\nExtends the [mlr3](https://mlr3.mlr-org.com/) package with a DataBackend to transparently work with databases.\nThree additional backends are currently implemented:\n\n* `DataBackendDplyr`: Relies internally on the abstraction of [dplyr](https://dplyr.tidyverse.org/) and [dbplyr](https://dbplyr.tidyverse.org/).\n    This allows working on a broad range of DBMS, such as SQLite, MySQL, MariaDB, or PostgreSQL.\n* `DataBackendDuckDB`: Connector to [duckdb](https://cran.r-project.org/package=duckdb).\n  This includes support for Parquet files (see example below).\n* `DataBackendPolars`: Connector to [polars](https://pola-rs.github.io/r-polars/).\n\nTo construct the backends, you have to establish a connection to the DBMS yourself with the [DBI](https://cran.r-project.org/package=DBI) package.\nFor the serverless SQLite and DuckDB, we provide the converters `as_sqlite_backend()` and `as_duckdb_backend()`.\n\n\n## Installation\n\nYou can install the released version of mlr3db from [CRAN](https://CRAN.R-project.org) with:\n\n```{r, eval = FALSE}\ninstall.packages(\"mlr3db\")\n```\n\nAnd the development version from [GitHub](https://github.com/) with:\n\n```{r, eval = FALSE}\n# 
install.packages(\"devtools\")\ndevtools::install_github(\"mlr-org/mlr3db\")\n```\n\n## Example\n\n### DataBackendDplyr\n\n```{r}\nlibrary(\"mlr3db\")\n\n# Create a classification task:\ntask = tsk(\"spam\")\n\n# Convert the task backend from an in-memory backend (DataBackendDataTable)\n# to an out-of-memory SQLite backend via DataBackendDplyr.\n# A temporary file is used here to store the database.\ntask$backend = as_sqlite_backend(task$backend, path = tempfile())\n\n# Resample a classification tree using a 3-fold CV.\n# The requested data will be queried and fetched from the database in the background.\nresample(task, lrn(\"classif.rpart\"), rsmp(\"cv\", folds = 3))\n```\n\n### DataBackendDuckDB\n\n```{r}\nlibrary(\"mlr3db\")\n\n# Get an example Parquet file from the package install directory:\n# spam dataset (tsk(\"spam\")) stored as a Parquet file\nfile = system.file(file.path(\"extdata\", \"spam.parquet\"), package = \"mlr3db\")\n\n# Create a backend on the file\nbackend = as_duckdb_backend(file)\n\n# Construct classification task on the constructed backend\ntask = as_task_classif(backend, target = \"type\")\n\n# Resample a classification tree using a 3-fold CV.\n# The requested data will be queried and fetched from the database in the background.\nresample(task, lrn(\"classif.rpart\"), rsmp(\"cv\", folds = 3))\n```\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fmlr-org%2Fmlr3db","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fmlr-org%2Fmlr3db","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fmlr-org%2Fmlr3db/lists"}