{"id":2189144,"url":"https://github.com/emilyriederer/dbtplyr","last_synced_at":"2026-02-25T14:41:04.612Z","repository":{"id":40340549,"uuid":"336423410","full_name":"emilyriederer/dbtplyr","owner":"emilyriederer","description":"dbt package mimicking dplyr select-helpers semantics","archived":false,"fork":false,"pushed_at":"2024-08-22T23:54:37.000Z","size":661,"stargazers_count":141,"open_issues_count":3,"forks_count":10,"subscribers_count":1,"default_branch":"main","last_synced_at":"2025-10-11T22:09:08.097Z","etag":null,"topics":["dbt","dplyr","macros","sql"],"latest_commit_sha":null,"homepage":"https://emilyriederer.github.io/dbtplyr","language":null,"has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"apache-2.0","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/emilyriederer.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null}},"created_at":"2021-02-06T00:40:16.000Z","updated_at":"2025-10-06T09:34:47.000Z","dependencies_parsed_at":"2024-08-23T00:58:46.577Z","dependency_job_id":"d6d5166d-ffd2-4e80-96e7-9e6d4868293d","html_url":"https://github.com/emilyriederer/dbtplyr","commit_stats":null,"previous_names":[],"tags_count":7,"template":false,"template_full_name":null,"purl":"pkg:github/emilyriederer/dbtplyr","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/emilyriederer%2Fdbtplyr","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/emilyriederer%2Fdbtplyr/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/emilyriederer%2Fdbtplyr/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/emilyriederer%2Fdbtp
lyr/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/emilyriederer","download_url":"https://codeload.github.com/emilyriederer/dbtplyr/tar.gz/refs/heads/main","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/emilyriederer%2Fdbtplyr/sbom","scorecard":null,"host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":281043400,"owners_count":26434444,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","status":"online","status_checked_at":"2025-10-25T02:00:06.499Z","response_time":81,"last_error":null,"robots_txt_status":"success","robots_txt_updated_at":"2025-07-24T06:49:26.215Z","robots_txt_url":"https://github.com/robots.txt","online":true,"can_crawl_api":true,"host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["dbt","dplyr","macros","sql"],"created_at":"2024-01-22T11:01:06.305Z","updated_at":"2025-10-26T00:56:45.006Z","avatar_url":"https://github.com/emilyriederer.png","language":null,"funding_links":[],"categories":["Others"],"sub_categories":[],"readme":"## dbtplyr\n\nThis add-on package enhances `dbt` by providing macros which programmatically select columns\nbased on their column names. 
It is inspired by the [`across()` function](https://www.tidyverse.org/blog/2020/04/dplyr-1-0-0-colwise/) \nand the [`select helpers`](https://tidyselect.r-lib.org/reference/select_helpers.html) in the R package `dplyr`.\n\n`dplyr` (\u003e= 1.0.0) has helpful semantics for selecting and applying transformations to variables based on their names.\nFor example, if one wishes to take the *sum* of all variables with name prefixes of `N` and the mean of all variables with\nname prefixes of `IND` in the dataset `mydata`, they may write:\n\n```\nsummarize(\n  mydata, \n  across( starts_with('N'), sum),\n  across( starts_with('IND'), mean)\n)\n```\n\nThis package enables us to similarly write `dbt` data models with commands like:\n\n```\n{% set cols = dbtplyr.get_column_names( ref('mydata') ) %}\n{% set cols_n = dbtplyr.starts_with('N', cols) %}\n{% set cols_ind = dbtplyr.starts_with('IND', cols) %}\n\nselect\n\n  {{ dbtplyr.across(cols_n, \"sum({{var}}) as {{var}}_tot\") }},\n  {{ dbtplyr.across(cols_ind, \"mean({{var}}) as {{var}}_avg\") }}\n\nfrom {{ ref('mydata') }}\n```\n\nwhich `dbt` then compiles to standard SQL. \n\nAlternatively, to protect against cases where no column names matched the pattern provided \n(e.g. 
no variables start with `N` so `cols_n` is an empty list), one may instead internalize the final comma\nso that it is only compiled to SQL when relevant by using the `final_comma` parameter of `across`.\n\n```\n  {{ dbtplyr.across(cols_n, \"sum({{var}}) as {{var}}_tot\", final_comma = true) }}\n```\n\n\nNote that, slightly more `dplyr`-like, you may also write:\n\n```\nselect\n\n  {{ dbtplyr.across(dbtplyr.starts_with('N', ref('mydata')), \"sum({{var}}) as {{var}}_tot\") }},\n  {{ dbtplyr.across(dbtplyr.starts_with('IND', ref('mydata')), \"mean({{var}}) as {{var}}_avg\") }}\n\nfrom {{ ref('mydata') }}\n```\n\nBut, as each function call is a bit longer than the equivalent `dplyr` code, I personally find the first form more readable.\n\n## Macros\n\nThe complete list of included macros is:\n\n**Functions to apply operations across columns**\n\n- `across(var_list, script_string, final_comma)`\n- `c_across(var_list, script_string)`\n\n**Functions to evaluate conditions across columns**\n\n- `if_any(var_list, script_string)`\n- `if_all(var_list, script_string)`\n\n**Functions to subset columns by naming conventions**\n\n- `starts_with(string, relation or list)` \n- `ends_with(string, relation or list)`\n- `contains(string, relation or list)`\n- `not_contains(string, relation or list)`\n- `one_of(string_list, relation or list)`\n- `not_one_of(string_list, relation or list)`\n- `matches(string, relation)`\n- `everything(relation)`\n- `where(fn, relation)` where `fn` is the string name of a [Column type-checker](https://docs.getdbt.com/reference/dbt-classes/#column) (e.g. 
\"is_number\")\n\nNote that all of the select-helper functions that take a relation as an argument can optionally be passed a list of names instead.\n\nDocumentation for these functions is available on the [package website](https://emilyriederer.github.io/dbtplyr/) and in the [`macros/macro.yml`](https://github.com/emilyriederer/dbtplyr/blob/main/macros/macro.yml) file.","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Femilyriederer%2Fdbtplyr","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Femilyriederer%2Fdbtplyr","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Femilyriederer%2Fdbtplyr/lists"}