{"id":18612380,"url":"https://github.com/winvector/rqdatatable","last_synced_at":"2025-07-26T05:10:33.183Z","repository":{"id":56934206,"uuid":"135368264","full_name":"WinVector/rqdatatable","owner":"WinVector","description":"Implement the rquery piped query algebra in R using data.table. Distributed under choice of GPL-2 or GPL-3 license.","archived":false,"fork":false,"pushed_at":"2023-08-20T05:23:41.000Z","size":42384,"stargazers_count":38,"open_issues_count":0,"forks_count":3,"subscribers_count":9,"default_branch":"main","last_synced_at":"2025-03-25T06:12:26.788Z","etag":null,"topics":[],"latest_commit_sha":null,"homepage":"https://winvector.github.io/rqdatatable/","language":"R","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"other","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/WinVector.png","metadata":{"files":{"readme":"README.Rmd","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null}},"created_at":"2018-05-30T00:49:26.000Z","updated_at":"2025-01-02T20:52:20.000Z","dependencies_parsed_at":"2024-06-21T13:09:54.193Z","dependency_job_id":null,"html_url":"https://github.com/WinVector/rqdatatable","commit_stats":{"total_commits":406,"total_committers":1,"mean_commits":406.0,"dds":0.0,"last_synced_commit":"66838f888aad1373012428bc68eb306de8d8917d"},"previous_names":[],"tags_count":29,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/WinVector%2Frqdatatable","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/WinVector%2Frqdatatable/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/WinVector%2Frqdatatable/releases","manifests_url
":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/WinVector%2Frqdatatable/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/WinVector","download_url":"https://codeload.github.com/WinVector/rqdatatable/tar.gz/refs/heads/main","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":248316041,"owners_count":21083369,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":[],"created_at":"2024-11-07T03:16:51.407Z","updated_at":"2025-04-10T23:31:16.302Z","avatar_url":"https://github.com/WinVector.png","language":"R","readme":"---\noutput: github_document\n---\n\n\u003c!-- README.md is generated from README.Rmd. Please edit that file --\u003e\n\n[![CRAN_Status_Badge](https://www.r-pkg.org/badges/version/rqdatatable)](https://cran.r-project.org/package=rqdatatable)\n[![status](https://tinyverse.netlify.com/badge/rqdatatable)](https://CRAN.R-project.org/package=rqdatatable)\n\n\n![](https://github.com/WinVector/rqdatatable/raw/master/tools/rqdatatable.png)\n\n[`rqdatatable`](https://github.com/WinVector/rqdatatable) is an implementation of\nthe [`rquery`](https://github.com/WinVector/rquery) piped Codd-style relational algebra \nhosted on [`data.table`](https://rdatatable.gitlab.io/data.table/).  
`rquery` allows the expression\nof complex transformations as a series of relational operators, and\n`rqdatatable` implements the operators using `data.table`.\n\nA `Python` version of `rquery`/`rqdatatable` is under initial development as [`data_algebra`](https://github.com/WinVector/data_algebra).\n\nFor example,\nscoring a logistic regression model (which requires grouping, ordering, and ranking)\nis organized as follows.  For more on this example please see \n[\"Let’s Have Some Sympathy For The Part-time R User\"](https://win-vector.com/2017/08/04/lets-have-some-sympathy-for-the-part-time-r-user/).\n\n```{r}\nlibrary(\"rqdatatable\")\n```\n\n\n```{r}\n# data example\ndL \u003c- build_frame(\n   \"subjectID\", \"surveyCategory\"     , \"assessmentTotal\" |\n   1          , \"withdrawal behavior\", 5                 |\n   1          , \"positive re-framing\", 2                 |\n   2          , \"withdrawal behavior\", 3                 |\n   2          , \"positive re-framing\", 4                 )\n```\n\n\n```{r}\nscale \u003c- 0.237\n\n# example rquery pipeline\nrquery_pipeline \u003c- local_td(dL) %.\u003e%\n  extend_nse(.,\n             probability :=\n               exp(assessmentTotal * scale))  %.\u003e% \n  normalize_cols(.,\n                 \"probability\",\n                 partitionby = 'subjectID') %.\u003e%\n  pick_top_k(.,\n             k = 1,\n             partitionby = 'subjectID',\n             orderby = c('probability', 'surveyCategory'),\n             reverse = c('probability', 'surveyCategory')) %.\u003e% \n  rename_columns(., c('diagnosis' = 'surveyCategory')) %.\u003e%\n  select_columns(., c('subjectID', \n                      'diagnosis', \n                      'probability')) %.\u003e%\n  orderby(., cols = 'subjectID')\n```\n\nWe can show the expanded form of the query tree.\n\n```{r, comment=\"\"}\ncat(format(rquery_pipeline))\n```\n\nAnd execute it using `data.table`.\n\n```{r}\nex_data_table(rquery_pipeline)\n```\n\nOne can also apply 
the pipeline to new tables.\n\n```{r}\nbuild_frame(\n   \"subjectID\", \"surveyCategory\"     , \"assessmentTotal\" |\n   7          , \"withdrawal behavior\", 5                 |\n   7          , \"positive re-framing\", 20                ) %.\u003e%\n  rquery_pipeline\n```\n\n\nInitial benchmarking of `rqdatatable` is very favorable (notes [here](https://win-vector.com/2018/06/03/rqdatatable-rquery-powered-by-data-table/)).\n\nTo install `rqdatatable` please use `install.packages(\"rqdatatable\")`.\n\nSome related work includes:\n\n * [`data.table`](https://rdatatable.gitlab.io/data.table/)\n * [`Polars`](https://www.pola.rs)\n * [`data algebra`](https://github.com/WinVector/data_algebra)\n * [`disk.frame`](https://github.com/DiskFrame/disk.frame)\n * [`dbplyr`](https://dbplyr.tidyverse.org)\n * [`dplyr`](https://dplyr.tidyverse.org)\n * [`dtplyr`](https://github.com/tidyverse/dtplyr)\n * [`maditr`](https://github.com/gdemin/maditr)\n * [`nc`](https://github.com/tdhock/nc)\n * [`poorman`](https://github.com/nathaneastwood/poorman)\n * [`rquery`](https://github.com/WinVector/rquery)\n * [`SparkR`](https://CRAN.R-project.org/package=SparkR)\n * [`sparklyr`](https://spark.rstudio.com)\n * [`sqldf`](https://github.com/ggrothendieck/sqldf)\n * [`table.express`](https://github.com/asardaes/table.express)\n * [`tidyfast`](https://github.com/TysonStanley/tidyfast)\n * [`tidyfst`](https://github.com/hope-data-science/tidyfst)\n * [`tidyquery`](https://github.com/ianmcook/tidyquery)\n * [`tidyr`](https://tidyr.tidyverse.org)\n * [`tidytable`](https://github.com/markfairbanks/tidytable) (formerly `gdt`/`tidydt`)\n\n--\n\nNote `rqdatatable` has an \"immediate mode\" which allows direct application of pipeline stages without\npre-assembling the pipeline. \"Immediate mode\" is a convenience for ad-hoc analyses, and has some negative\nperformance impact, so we encourage users to build pipelines for most work.  
Some notes on the issue can be found\n[here](https://github.com/WinVector/rqdatatable/blob/master/extras/ImmediateIssue.md).\n\n`rqdatatable` implements the `rquery` grammar in the style of a \"Turing or Cook reduction\" (implementing the result in terms of multiple oracle calls to the related system).\n\n`rqdatatable` is intended for \"simple column names\". In particular, because `rqdatatable` often uses `eval()` to work over `data.table`, escape characters such as \"`\\`\" and \"`\\\\`\" are not reliable in column names.  Also, `rqdatatable` does not support tables with no columns.\n\n","funding_links":[],"categories":[],"sub_categories":[],"project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fwinvector%2Frqdatatable","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fwinvector%2Frqdatatable","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fwinvector%2Frqdatatable/lists"}