https://github.com/joeroe/controller
Tools for working with controlled vocabularies in R
- Host: GitHub
- URL: https://github.com/joeroe/controller
- Owner: joeroe
- License: other
- Created: 2021-04-26T12:54:48.000Z (over 3 years ago)
- Default Branch: main
- Last Pushed: 2023-05-12T12:50:54.000Z (over 1 year ago)
- Last Synced: 2024-11-15T23:44:07.634Z (about 2 months ago)
- Topics: authority-control, controlled-vocabularies, r, r-package
- Language: R
- Homepage: https://controller.joeroe.io/
- Size: 1.27 MB
- Stars: 0
- Watchers: 3
- Forks: 0
- Open Issues: 7
Metadata Files:
- Readme: README.Rmd
- Changelog: NEWS.md
- License: LICENSE
README
---
output: github_document
---

```{r, include = FALSE}
knitr::opts_chunk$set(
  collapse = TRUE,
  comment = "#>",
  fig.path = "man/figures/README-",
  out.width = "100%"
)
```

# controller

[![Project Status: WIP – Initial development is in progress, but there has not yet been a stable, usable release suitable for the public.](https://www.repostatus.org/badges/latest/wip.svg)](https://www.repostatus.org/#wip)
[![R-CMD-check](https://github.com/joeroe/controller/actions/workflows/R-CMD-check.yaml/badge.svg)](https://github.com/joeroe/controller/actions/workflows/R-CMD-check.yaml)
[![Test coverage](https://codecov.io/gh/joeroe/controller/graph/badge.svg)](https://app.codecov.io/gh/joeroe/controller)

**controller** is a collection of functions for working with controlled vocabularies in R.
It introduces the `control()` verb, which recodes values in a vector using a lookup table of preferred and variant terms (a *thesaurus*).

## Installation

You can install the development version of controller from GitHub using the [remotes](https://remotes.r-lib.org/) package:

``` r
remotes::install_github("joeroe/controller")
```

## Example

A common data-tidying problem is standardising variant terms for the same concept.
Imagine we have a dataset that uses a number of different names for shades of the same colour.
As data analysts, we naturally want to recode the data to eliminate this messy creativity, for example using [dplyr::recode()](https://dplyr.tidyverse.org/reference/recode.html):

```{r eg-dplyr}
library(dplyr, warn.conflicts = FALSE)
shades <- c("daffodil", "purple", "magenta", "azure", "navy", "violet")

recode(shades,
       daffodil = "yellow",
       purple = "purple",
       magenta = "pink",
       azure = "blue",
       navy = "blue",
       violet = "purple")
```

But recoding this way can be tedious, especially if there are a large number of terms.
With `control()`, we can instead use a data frame containing a thesaurus to replace the values:

```{r eg-controller}
library(controller)
data("colour_thesaurus")

control(shades, colour_thesaurus)
```

`control()` also supports fuzzy matching, removing the need to exhaustively list variants for common causes of differing terminology.
For example, to perform a case-insensitive match to the thesaurus:

```{r eg-ci}
shades <- toupper(shades)
control_ci(shades, colour_thesaurus)
```
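
To make the idea of a thesaurus lookup concrete, here is a minimal base R sketch of recoding variant terms to preferred terms, with an optional case-insensitive mode. It is not how controller is implemented: the helper `recode_with_thesaurus()`, its `ignore_case` argument, and the column names `variant` and `preferred` are all hypothetical, and the small thesaurus is made up for illustration rather than taken from `colour_thesaurus`.

```r
# Hypothetical thesaurus: one row per variant term and its preferred term.
# The column names `variant` and `preferred` are assumptions for this sketch.
thesaurus <- data.frame(
  variant   = c("daffodil", "magenta", "azure", "navy", "violet"),
  preferred = c("yellow", "pink", "blue", "blue", "purple"),
  stringsAsFactors = FALSE
)

# Recode `x` by looking each value up among the variant terms.
# Values with no matching variant are returned unchanged.
recode_with_thesaurus <- function(x, thesaurus, ignore_case = FALSE) {
  key      <- if (ignore_case) tolower(x) else x
  variants <- if (ignore_case) tolower(thesaurus$variant) else thesaurus$variant
  idx <- match(key, variants)  # NA where no variant matches
  ifelse(is.na(idx), x, thesaurus$preferred[idx])
}

shades <- c("daffodil", "purple", "magenta", "azure", "navy", "violet")

# Exact matching
recode_with_thesaurus(shades, thesaurus)

# Case-insensitive matching on upper-cased input
recode_with_thesaurus(toupper(shades), thesaurus, ignore_case = TRUE)
```

In this sketch, terms that are missing from the thesaurus pass through untouched, which is a design choice of the example only; how `control()` and `control_ci()` treat unmatched values is described in the package documentation.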