{"id":13858291,"url":"https://github.com/tidyverse/dtplyr","last_synced_at":"2025-12-12T01:03:08.351Z","repository":{"id":37514198,"uuid":"53366659","full_name":"tidyverse/dtplyr","owner":"tidyverse","description":"Data table backend for dplyr","archived":false,"fork":false,"pushed_at":"2025-01-24T17:43:43.000Z","size":11043,"stargazers_count":672,"open_issues_count":33,"forks_count":58,"subscribers_count":30,"default_branch":"main","last_synced_at":"2025-04-28T11:52:52.094Z","etag":null,"topics":["datatable","dplyr","r"],"latest_commit_sha":null,"homepage":"https://dtplyr.tidyverse.org","language":"R","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"other","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/tidyverse.png","metadata":{"files":{"readme":"README.Rmd","changelog":"NEWS.md","contributing":".github/CONTRIBUTING.md","funding":null,"license":"LICENSE","code_of_conduct":".github/CODE_OF_CONDUCT.md","threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":".github/SUPPORT.md","governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null}},"created_at":"2016-03-07T23:28:16.000Z","updated_at":"2025-04-04T04:44:29.000Z","dependencies_parsed_at":"2023-07-19T08:31:26.341Z","dependency_job_id":"730ea9bf-a834-4648-b570-539b320eea5b","html_url":"https://github.com/tidyverse/dtplyr","commit_stats":{"total_commits":1643,"total_committers":43,"mean_commits":38.2093023255814,"dds":0.4303104077906269,"last_synced_commit":"0fa0d0459dd284fba48007fde512d0d489f569b5"},"previous_names":["hadley/dtplyr"],"tags_count":12,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/tidyverse%2Fdtplyr","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/tidyverse%2Fdtplyr/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/tidyverse%2Fdtplyr/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/tidyverse%2Fdtplyr/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/tidyverse","download_url":"https://codeload.github.com/tidyverse/dtplyr/tar.gz/refs/heads/main","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":254020658,"owners_count":22000757,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["datatable","dplyr","r"],"created_at":"2024-08-05T03:02:03.035Z","updated_at":"2025-12-12T01:03:08.304Z","avatar_url":"https://github.com/tidyverse.png","language":"R","funding_links":[],"categories":["R"],"sub_categories":[],"readme":"---\noutput: github_document\n---\n\n\u003c!-- README.md is generated from README.Rmd. Please edit that file --\u003e\n\n```{r, include = FALSE}\nknitr::opts_chunk$set(\n  collapse = TRUE,\n  comment = \"#\u003e\",\n  fig.path = \"man/figures/README-\",\n  out.width = \"100%\"\n)\n```\n\n# dtplyr \u003ca href='https://dtplyr.tidyverse.org'\u003e\u003cimg src='man/figures/logo.png' align=\"right\" height=\"138\" /\u003e\u003c/a\u003e\n\n\u003c!-- badges: start --\u003e\n[![CRAN status](https://www.r-pkg.org/badges/version/dtplyr)](https://cran.r-project.org/package=dtplyr)\n[![R-CMD-check](https://github.com/tidyverse/dtplyr/actions/workflows/R-CMD-check.yaml/badge.svg)](https://github.com/tidyverse/dtplyr/actions/workflows/R-CMD-check.yaml)\n[![Codecov test coverage](https://codecov.io/gh/tidyverse/dtplyr/graph/badge.svg)](https://app.codecov.io/gh/tidyverse/dtplyr)\n\u003c!-- badges: end --\u003e\n\n## Overview\n\n\u003ca href=\"https://rdatatable-community.github.io/The-Raft/posts/2024-08-01-seal_of_approval-dtplyr/\"\u003e\u003cimg src='man/figures/dt-seal.png' align=\"right\" width=\"200\" height=\"157\" alt=\"data.table seal of approval\"/\u003e\u003c/a\u003edtplyr provides a [data.table](http://r-datatable.com/) backend for dplyr. The goal of dtplyr is to allow you to write dplyr code that is automatically translated to the equivalent, but usually much faster, data.table code.\n\nSee `vignette(\"translation\")` for details of the current translations, and  [table.express](https://github.com/asardaes/table.express) and [rqdatatable](https://github.com/WinVector/rqdatatable/) for related work.\n\n## Installation\n\nYou can install from CRAN with:\n\n```R\ninstall.packages(\"dtplyr\")\n```\n\nOr try the development version from GitHub with:\n\n```R\n# install.packages(\"pak\")\npak::pak(\"tidyverse/dtplyr\")\n```\n\n## Usage\n\nTo use dtplyr, you must at least load dtplyr and dplyr. You may also want to load [data.table](http://r-datatable.com/) so you can access the other goodies that it provides:\n\n```{r setup}\nlibrary(data.table)\nlibrary(dtplyr)\nlibrary(dplyr, warn.conflicts = FALSE)\n```\n\nThen use `lazy_dt()` to create a \"lazy\" data table that tracks the operations performed on it.\n\n```{r}\nmtcars2 \u003c- lazy_dt(mtcars)\n```\n\nYou can preview the transformation (including the generated data.table code) by printing the result:\n\n```{r}\nmtcars2 %\u003e%\n  filter(wt \u003c 5) %\u003e%\n  mutate(l100k = 235.21 / mpg) %\u003e% # liters / 100 km\n  group_by(cyl) %\u003e%\n  summarise(l100k = mean(l100k))\n```\n\nBut generally you should reserve this only for debugging, and use `as.data.table()`, `as.data.frame()`, or `as_tibble()` to indicate that you're done with the transformation and want to access the results:\n\n```{r}\nmtcars2 %\u003e%\n  filter(wt \u003c 5) %\u003e%\n  mutate(l100k = 235.21 / mpg) %\u003e% # liters / 100 km\n  group_by(cyl) %\u003e%\n  summarise(l100k = mean(l100k)) %\u003e%\n  as_tibble()\n```\n\n## Why is dtplyr slower than data.table?\n\nThere are two primary reasons that dtplyr will always be somewhat slower than data.table:\n\n* Each dplyr verb must do some work to convert dplyr syntax to data.table\n  syntax. This takes time proportional to the complexity of the input code,\n  not the input _data_, so should be a negligible overhead for large datasets.\n  [Initial benchmarks][benchmark] suggest that the overhead should be under\n  1ms per dplyr call.\n\n* To match dplyr semantics, `mutate()` does not modify in place by default.\n  This means that most expressions involving `mutate()` must make a copy\n  that would not be necessary if you were using data.table directly.\n  (You can opt out of this behaviour in `lazy_dt()` with `immutable = FALSE`).\n\n[benchmark]: https://dtplyr.tidyverse.org/articles/translation.html#performance\n\n## Code of Conduct\n\nPlease note that the dtplyr project is released with a [Contributor Code of Conduct](https://dtplyr.tidyverse.org/CODE_OF_CONDUCT.html). By contributing to this project, you agree to abide by its terms.\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Ftidyverse%2Fdtplyr","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Ftidyverse%2Fdtplyr","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Ftidyverse%2Fdtplyr/lists"}