Ecosyste.ms: Awesome

An open API service indexing awesome lists of open source software.

Awesome Lists | Featured Topics | Projects

https://github.com/rossellhayes/fauxnaif

Convert Values to NA
https://github.com/rossellhayes/fauxnaif

Last synced: about 2 months ago
JSON representation

Convert Values to NA

Awesome Lists containing this project

README

        

---
output: github_document
---

```{r, include = FALSE}
knitr::opts_chunk$set(
collapse = TRUE,
comment = "#>",
fig.path = "man/figures/README-",
out.width = "100%"
)

options(tibble.print_min = 5, tibble.print_max = 5)

library(ipa)
# remotes::install_github("GuangchuangYu/badger")
library(badger)
```

# fauxnaif

`r badge_cran_release(color = "brightgreen")`
`r badge_lifecycle("stable")`
`r badge_license(color = "blueviolet")`
`r badge_github_actions(action = "R-CMD-check")`
`r badge_codecov()`
`r badge_dependencies()`

***faux-naïf*** ([`r ipa::sampa("/%foU.naI\"if/")`](https://en.wikipedia.org/wiki/Help:IPA/English)): a person who pretends to be simple or innocent

**fauxnaif**: an R package for simplifying data by pretending values are `NA`

## Overview

**fauxnaif** provides an extension to `dplyr::na_if()`.
Unlike [**dplyr**](https://github.com/tidyverse/dplyr)'s `na_if()`,
`na_if_in()` allows you to specify multiple values to be replaced with
`NA` using a single function.
**fauxnaif** also includes a complementary function `na_if_not()` to specify values to keep.

## Installation

You can install `fauxnaif` from [CRAN](https://cran.r-project.org/package=fauxnaif):

```{r eval = FALSE}
install.packages("fauxanif")
```

Or the development version from [GitHub](https://github.com/rossellhayes/fauxnaif):

```{r eval = FALSE}
# install.packages("remotes")
remotes::install_github("rossellhayes/fauxnaif")
```

## Usage

```{r, message = FALSE}
library(dplyr)
library(fauxnaif)
```

### The basics

Let's say we want to remove an unwanted negative value from a vector of numbers

```{r}
-1:10
```

We can replace -1...

... explicitly:
```{r}
na_if_in(-1:10, -1)
```

... by specifying values to keep:
```{r}
na_if_not(-1:10, 0:10)
```

... using a formula:
```{r}
na_if_in(-1:10, ~ . < 0)
```

### A little more complex

```{r}
messy_string <- c("abc", "", "def", "NA", "ghi", 42, "jkl", "NULL", "mno")
```

We can replace unwanted values...

... one at a time:
```{r}
na_if_in(messy_string, "")
```

... or all at once:
```{r}
na_if_in(messy_string, "", "NA", "NULL", 1:100)
na_if_in(messy_string, c("", "NA", "NULL", 1:100))
na_if_in(messy_string, list("", "NA", "NULL", 1:100))
```

... or using a clever formula:
```{r}
grepl("[a-z]{3,}", messy_string)
na_if_not(messy_string, ~ grepl("[a-z]{3,}", .))
```

### With data frames

```{r include = FALSE}
faux_census <- fauxnaif::faux_census %>%
select(state, age, income, gender) %>%
filter(
state == "Canada" |
age < 18 |
age > 120 |
income == 9999999
)
```

```{r}
faux_census
```

na_if_in() is particularly useful inside `dplyr::mutate()`:
```{r}
faux_census %>%
mutate(
income = na_if_in(income, 9999999),
age = na_if_in(age, ~ . < 18, ~ . > 120),
state = na_if_not(state, ~ grepl("^[A-Z]{2,}$", .)),
gender = na_if_in(gender, ~ nchar(.) > 20)
)
```

Or you can use `dplyr::across()` on data frames:
```{r}
faux_census %>%
mutate(
across(age, na_if_in, ~ . < 18, ~ . > 120),
across(state, na_if_not, ~ grepl("^[A-Z]{2,}$", .)),
across(where(is.character), na_if_in, ~ nchar(.) > 20),
across(everything(), na_if_in, 9999999)
)
```

---

Hex sticker fonts are [Bodoni* by indestructible type*](https://github.com/indestructible-type/Bodoni) and
[Source Code Pro by Adobe](https://github.com/adobe-fonts/source-code-pro).

Image adapted from icon made by [Freepik](https://www.freepik.com) from
[flaticon.com](https://www.flaticon.com/free-icon/paper-shredder_1701401).

Please note that **fauxnaif** is released with a
[Contributor Code of Conduct](https://www.contributor-covenant.org/version/2/0/code_of_conduct/).