{"id":13857400,"url":"https://github.com/ropensci/tarchetypes","last_synced_at":"2025-05-16T07:06:29.025Z","repository":{"id":39643887,"uuid":"282774543","full_name":"ropensci/tarchetypes","owner":"ropensci","description":"Archetypes for targets and pipelines","archived":false,"fork":false,"pushed_at":"2025-05-05T18:53:35.000Z","size":2142,"stargazers_count":144,"open_issues_count":0,"forks_count":21,"subscribers_count":7,"default_branch":"main","last_synced_at":"2025-05-05T19:54:36.293Z","etag":null,"topics":["data-science","high-performance-computing","peer-reviewed","pipeline","r","r-package","r-targetopia","reproducibility","rstats","targets","workflow"],"latest_commit_sha":null,"homepage":"https://docs.ropensci.org/tarchetypes","language":"R","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"other","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/ropensci.png","metadata":{"files":{"readme":"README.Rmd","changelog":"NEWS.md","contributing":"CONTRIBUTING.md","funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":"codemeta.json","zenodo":null}},"created_at":"2020-07-27T02:25:17.000Z","updated_at":"2025-05-05T18:53:39.000Z","dependencies_parsed_at":"2023-09-21T19:56:48.382Z","dependency_job_id":"d6d666d0-6476-441e-8a2f-5b15a065cf71","html_url":"https://github.com/ropensci/tarchetypes","commit_stats":{"total_commits":629,"total_committers":10,"mean_commits":62.9,"dds":0.2209856915739269,"last_synced_commit":"c1149146265953710e29cb0cf38d01815eebf1b7"},"previous_names":[],"tags_count":35,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/ropensci%2Ftarchetypes","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories
/ropensci%2Ftarchetypes/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/ropensci%2Ftarchetypes/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/ropensci%2Ftarchetypes/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/ropensci","download_url":"https://codeload.github.com/ropensci/tarchetypes/tar.gz/refs/heads/main","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":254485063,"owners_count":22078767,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["data-science","high-performance-computing","peer-reviewed","pipeline","r","r-package","r-targetopia","reproducibility","rstats","targets","workflow"],"created_at":"2024-08-05T03:01:35.616Z","updated_at":"2025-05-16T07:06:28.975Z","avatar_url":"https://github.com/ropensci.png","language":"R","funding_links":[],"categories":["R"],"sub_categories":[],"readme":"---\noutput: github_document\n---\n\n```{r, include = FALSE}\nknitr::opts_chunk$set(\n  collapse = TRUE,\n  comment = \"#\u003e\",\n  fig.path = \"man/figures/README-\",\n  out.width = \"100%\"\n)\n```\n\n# tarchetypes \u003cimg src='man/figures/logo.png' align=\"right\" height=\"139\"/\u003e\n\n[![ropensci](https://badges.ropensci.org/401_status.svg)](https://github.com/ropensci/software-review/issues/401)\n[![zenodo](https://zenodo.org/badge/282774543.svg)](https://zenodo.org/badge/latestdoi/282774543)\n[![R 
Targetopia](https://img.shields.io/badge/R_Targetopia-member-blue?style=flat\u0026labelColor=gray)](https://wlandau.github.io/targetopia/)\n[![CRAN](https://www.r-pkg.org/badges/version/tarchetypes)](https://CRAN.R-project.org/package=tarchetypes)\n[![status](https://www.repostatus.org/badges/latest/active.svg)](https://www.repostatus.org/#active)\n[![check](https://github.com/ropensci/tarchetypes/actions/workflows/check.yaml/badge.svg)](https://github.com/ropensci/tarchetypes/actions?query=workflow%3Acheck)\n[![codecov](https://codecov.io/gh/ropensci/tarchetypes/branch/main/graph/badge.svg?token=3T5DlLwUVl)](https://app.codecov.io/gh/ropensci/tarchetypes)\n[![lint](https://github.com/ropensci/tarchetypes/actions/workflows/lint.yaml/badge.svg)](https://github.com/ropensci/tarchetypes/actions?query=workflow%3Alint)\n\nThe `tarchetypes` R package is a collection of target and pipeline archetypes for the [`targets`](https://github.com/ropensci/targets) package. These archetypes express complicated pipelines with concise syntax, which enhances readability and thus reproducibility. Archetypes are possible because of the flexible metaprogramming capabilities of [`targets`](https://github.com/ropensci/targets). In [`targets`](https://github.com/ropensci/targets), one can define a target as an object outside the central pipeline, and the [`tar_target_raw()`](https://docs.ropensci.org/targets/reference/tar_target_raw.html) function completely avoids non-standard evaluation. That means anyone can write their own niche interfaces for specialized projects. 
`tarchetypes` aims to include the most common and versatile archetypes and usage patterns.\n\n## Grouped data frames\n\n`tarchetypes` has functions for easy dynamic branching over subsets of data frames:\n\n* `tar_group_by()`: define row groups using `dplyr::group_by()` semantics.\n* `tar_group_select()`: define row groups using `tidyselect` semantics.\n* `tar_group_count()`: define a given number of row groups.\n* `tar_group_size()`: define row groups of a given size.\n\nIf you define a target with one of these functions, all downstream dynamic targets will automatically branch over the row groups.\n\n```{r, echo = FALSE}\ntargets::tar_script({\n  produce_data \u003c- function() {\n    expand.grid(var1 = c(\"a\", \"b\"), var2 = c(\"c\", \"d\"), rep = c(1, 2, 3))\n  }\n  list(\n    tarchetypes::tar_group_by(data, produce_data(), var1, var2),\n    tar_target(group, data, pattern = map(data))\n  )\n})\n```\n\n```{r, eval = FALSE}\n# _targets.R file:\nlibrary(targets)\nlibrary(tarchetypes)\nproduce_data \u003c- function() {\n  expand.grid(var1 = c(\"a\", \"b\"), var2 = c(\"c\", \"d\"), rep = c(1, 2, 3))\n}\nlist(\n  tar_group_by(data, produce_data(), var1, var2),\n  tar_target(group, data, pattern = map(data))\n)\n```\n\n```{r}\n# R console:\nlibrary(targets)\ntar_make()\n\n# First row group:\ntar_read(group, branches = 1)\n\n# Second row group:\ntar_read(group, branches = 2)\n```\n\n## Literate programming\n\nConsider the following R Markdown report.\n\n```{r, echo = FALSE, comment = \"\"}\nlines \u003c- c(\n  \"---\",\n  \"title: report\",\n  \"output: html_document\",\n  \"---\",\n  \"\",\n  \"```{r}\",\n  \"library(targets)\",\n  \"tar_read(dataset)\",\n  \"```\"\n)\ncat(lines, sep = \"\\n\")\n```\n\nWe want to define a target to render the report. And because the report calls `tar_read(dataset)`, this target needs to depend on `dataset`. 
Without `tarchetypes`, it is cumbersome to set up the pipeline correctly.\n\n```{r, eval = FALSE}\n# _targets.R\nlibrary(targets)\nlist(\n  tar_target(dataset, data.frame(x = letters)),\n  tar_target(\n    report, {\n      # Explicitly mention the symbol `dataset`.\n      list(dataset)\n      # Return relative paths to keep the project portable.\n      fs::path_rel(\n        # Need to return/track all input/output files.\n        c( \n          rmarkdown::render(\n            input = \"report.Rmd\",\n            # Always run from the project root\n            # so the report can find _targets/.\n            knit_root_dir = getwd(),\n            quiet = TRUE\n          ),\n          \"report.Rmd\"\n        )\n      )\n    },\n    # Track the input and output files.\n    format = \"file\",\n    # Avoid building small reports on HPC.\n    deployment = \"main\"\n  )\n)\n```\n\nWith `tarchetypes`, we can simplify the pipeline with the `tar_render()` archetype.\n\n```{r, eval = FALSE}\n# _targets.R\nlibrary(targets)\nlibrary(tarchetypes)\nlist(\n  tar_target(dataset, data.frame(x = letters)),\n  tar_render(report, \"report.Rmd\")\n)\n```\n\nAbove, `tar_render()` scans code chunks for mentions of targets in `tar_load()` and `tar_read()`, and it enforces the dependency relationships it finds. In our case, it reads `report.Rmd` and then forces `report` to depend on `dataset`. That way, `tar_make()` always processes `dataset` before `report`, and it automatically reruns `report.Rmd` whenever `dataset` changes.\n\n## Alternative pipeline syntax\n\n[`tar_plan()`](https://docs.ropensci.org/tarchetypes/reference/tar_plan.html) is a drop-in replacement for [`drake_plan()`](https://docs.ropensci.org/drake/reference/drake_plan.html) in the [`targets`](https://github.com/ropensci/targets) ecosystem. 
\nIt lets users write targets as name/command pairs without having to call [`tar_target()`](https://docs.ropensci.org/targets/reference/tar_target.html).\n\n```{r, eval = FALSE}\ntar_plan(\n  tar_file(raw_data_file, \"data/raw_data.csv\", format = \"file\"),\n  # Simple drake-like syntax:\n  raw_data = read_csv(raw_data_file, col_types = cols()),\n  data = raw_data %\u003e%\n    mutate(Ozone = replace_na(Ozone, mean(Ozone, na.rm = TRUE))),\n  hist = create_plot(data),\n  fit = biglm(Ozone ~ Wind + Temp, data),\n  # Needs tar_render() because it is a target archetype:\n  tar_render(report, \"report.Rmd\")\n)\n```\n\n## Installation\n\nType | Source | Command\n---|---|---\nRelease | CRAN | `install.packages(\"tarchetypes\")`\nDevelopment | GitHub | `remotes::install_github(\"ropensci/tarchetypes\")`\nDevelopment | rOpenSci | `install.packages(\"tarchetypes\", repos = \"https://dev.ropensci.org\")`\n\n## Documentation\n\nFor specific documentation on `tarchetypes`, including the help files of all user-side functions, please visit the [reference website](https://docs.ropensci.org/tarchetypes/). For documentation on [`targets`](https://github.com/ropensci/targets) in general, please visit the [`targets` reference website](https://docs.ropensci.org/targets/). 
Many of the linked resources use `tarchetypes` functions such as [`tar_render()`](https://docs.ropensci.org/tarchetypes/reference/tar_render.html).\n\n## Help\n\nPlease read the [help guide](https://books.ropensci.org/targets/help.html) to learn how best to ask for help using `targets` and `tarchetypes`.\n\n## Code of conduct\n\nPlease note that this package is released with a [Contributor Code of Conduct](https://ropensci.org/code-of-conduct/).\n\n## Citation\n\n```{r}\ncitation(\"tarchetypes\")\n```\n\n```{r, echo = FALSE}\nunlink(\"_targets.R\")\ntar_destroy()\n```\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fropensci%2Ftarchetypes","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fropensci%2Ftarchetypes","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fropensci%2Ftarchetypes/lists"}