https://github.com/cynkra/populate
Safe Wrappers around 'dplyr::mutate()'
- Host: GitHub
- URL: https://github.com/cynkra/populate
- Owner: cynkra
- Created: 2022-11-21T13:51:42.000Z (almost 3 years ago)
- Default Branch: main
- Last Pushed: 2022-11-24T19:36:59.000Z (almost 3 years ago)
- Last Synced: 2025-04-12T20:12:22.311Z (6 months ago)
- Language: R
- Size: 7.81 KB
- Stars: 3
- Watchers: 3
- Forks: 1
- Open Issues: 1
Metadata Files:
- Readme: README.Rmd
---
output: github_document
---

```{r, include = FALSE}
knitr::opts_chunk$set(
  collapse = TRUE,
  comment = "#>",
  fig.path = "man/figures/README-",
  out.width = "100%"
)
```

# populate

Draft for discussion.

This issue was closed: https://github.com/tidyverse/dplyr/issues/6547

Is this extension package useful?

{populate} provides stricter wrappers around `dplyr::mutate()`:

* `populate()` makes sure the input's ptype is preserved
  * no column type change
  * no column addition
  * no column deletion (`mutate()` can delete with `NULL`)
  * cast by default, assert optionally
* `collate()` only creates new columns

Do we need more, or should we keep it simple?

* `abrogate()` to remove columns?
* `sublimate()`, like `populate()` but allowing ptype changes (still no new column creation)?
* `summarize()` variants?

Is it better to use a subclass of data frame (as in the linked GitHub issue), so we can make safe versions of
everything, including functions like `count()`, `unnest()`, joins ...

With a subclass we'd lose autocompletion of the new arguments, and it might be more confusing than having new verbs.

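The "cast by default, assert optionally" rule listed above can be illustrated with {vctrs}. This is a minimal sketch, not {populate}'s actual implementation: `cast_like()` is a hypothetical helper, and it assumes that strict mode means asserting the type instead of casting to it.

```r
library(vctrs)

# Hypothetical helper, not part of {populate}:
# cast `new` to the type of `old`, or (strict = TRUE) require that
# `new` already has exactly the type of `old`.
cast_like <- function(new, old, strict = FALSE) {
  if (strict) {
    # assert: fail unless the prototypes already match
    vec_assert(new, ptype = vec_ptype(old))
    return(new)
  }
  # cast: convert when vctrs allows it, fail otherwise
  vec_cast(new, to = old)
}

# integer is cast to double
cast_like(3:4, c(1, 2))

# character is cast to factor when the values are known levels
cast_like(c("b", "b"), factor(c("a", "b")))

# double can't be cast to character
try(cast_like(1, "a"))

# in strict mode, integer is rejected where double is expected
try(cast_like(3:4, c(1, 2), strict = TRUE))
```

Relying on `vctrs::vec_cast()` keeps the conversion rules consistent with the rest of the tidyverse, which may be why the list above is phrased in terms of ptypes.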
## Installation
You can install the development version of populate like so:
``` r
devtools::install_github("cynkra/populate")
```

## Examples

```{r example, error = TRUE}
library(populate)

data <- tibble::tibble(
  a = letters[1:2],
  b = c(1, 2),
  c = factor(letters[1:2]),
  d = as.Date(c("2022-01-01", "2022-01-02")),
  e = vctrs::list_of(cars)
)

# can't create a column if it exists
collate(data, a = 1)

# but we can create new columns
collate(data, ee = 1)

# we can't create a new column with populate()
populate(data, ee = 1)

# can't cast double to character
populate(data, a = 1)

# casting integer to double
populate(data, b = 3:4)

# doesn't work if `.strict` is `TRUE`
populate(data, b = 3:4, .strict = TRUE)

# casting character to factor with allowed levels
populate(data, c = c("b", "b"))

# can't cast because wrong levels
populate(data, c = c("b", "d"))

# datetimes are cast to date
populate(data, d = lubridate::as_datetime(c("2022-01-01", "2022-01-02")))

# characters can't be cast to date
populate(data, d = c("2022-01-01", "2022-01-02"))

# using list_of allowed us to prevent corrupting our data silently
populate(data, e = list(iris))

# and we don't have to bother with list_of anymore if we feed the right format
populate(data, e = list(head(cars)))
```