https://github.com/btskinner/crosswalkr
Rename and encode variables using external crosswalk files
- Host: GitHub
- URL: https://github.com/btskinner/crosswalkr
- Owner: btskinner
- License: other
- Created: 2017-10-14T22:46:47.000Z (over 7 years ago)
- Default Branch: master
- Last Pushed: 2024-03-11T14:43:25.000Z (about 1 year ago)
- Last Synced: 2024-04-04T18:10:20.949Z (about 1 year ago)
- Topics: crosswalk, encode, labels, r, rename
- Language: R
- Homepage: https://www.btskinner.io/crosswalkr
- Size: 388 KB
- Stars: 8
- Watchers: 4
- Forks: 1
- Open Issues: 0
Metadata Files:
- Readme: README.Rmd
- License: LICENSE
- Code of conduct: CODE_OF_CONDUCT.md
Awesome Lists containing this project
- jimsghstars - btskinner/crosswalkr - Rename and encode variables using external crosswalk files (R)
README
---
title: crosswalkr
output: md_document
---

# crosswalkr

```{r, include = FALSE}
options(width = 100)
```

## Overview

This package offers a pair of functions, `renamefrom()` and
`encodefrom()`, for renaming and encoding data frames using external
crosswalk files. It is especially useful when constructing master
data sets from multiple smaller data sets that do not name or encode
variables consistently across files. The package is based on the `renamefrom` and
`encodefrom` [Stata commands written by Sally Hudson and
team](https://github.com/slhudson/rename-and-encode).

## Installation

Install the latest release version from CRAN with
```{r, eval = FALSE}
install.packages('crosswalkr')
```

Install the latest development version from GitHub with
```{r, eval = FALSE}
devtools::install_github('btskinner/crosswalkr')
```

## Usage

```{r, message = FALSE}
library(crosswalkr)
library(dplyr)
library(haven)
```
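
A crosswalk can also live in an external file. As a minimal sketch using only base R, the chunk below writes a small CSV crosswalk to a temporary file and reads it back to show the expected layout: one row per variable, with a column of current names, a column of new names, and an optional column of labels. The temporary path and the column names `old_name`, `new_name`, and `label` are illustrative choices, not requirements.

```{r}
## write an illustrative crosswalk to a temporary CSV file
## (file location and column names are assumptions made for this example)
cw_path <- tempfile(fileext = '.csv')
writeLines(c('old_name,new_name,label',
             'state,stname,Full state name',
             'fips,stfips,FIPS code'),
           cw_path)

## read it back to check the layout
cw_csv <- read.csv(cw_path, stringsAsFactors = FALSE)
cw_csv
```

A data frame read this way can be passed as `cw_file` in the same way as a hand-built crosswalk. The package documentation should also describe passing a file path directly to `cw_file`; check `?renamefrom` for the supported formats before relying on that.
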
```{r}
## starting data frame
df <- data.frame(state = c('Kentucky','Tennessee','Virginia'),
                 fips = c(21,47,51),
                 region = c('South','South','South'))
df

## crosswalk with which to convert old names to new names with labels
cw <- data.frame(old_name = c('state','fips'),
                 new_name = c('stname','stfips'),
                 label = c('Full state name', 'FIPS code'))
cw
```

### Renaming

Convert old variable names to new names and add labels from crosswalk.
```{r}
df1 <- renamefrom(df, cw_file = cw, raw = old_name, clean = new_name, label = label)
df1
```

Convert old variable names to new names using old names as labels
(ignoring labels in crosswalk).
```{r}
df2 <- renamefrom(df, cw_file = cw, raw = old_name, clean = new_name, name_label = TRUE)
df2
```

Convert old variable names to new names, but keep unmatched old names
in the data frame.
```{r}
df3 <- renamefrom(df, cw_file = cw, raw = old_name, clean = new_name, drop_extra = FALSE)
df3
```

### Encoding

```{r}
## starting data frame
df <- data.frame(state = c('Kentucky','Tennessee','Virginia'),
                 stfips = c(21,47,51),
                 cenregnm = c('South','South','South'))
df

## use state crosswalk data file from package
cw <- get(data(stcrosswalk))
cw
```

Create a new column with factor-encoded values.
```{r}
df$state2 <- encodefrom(df, var = state, cw_file = cw, raw = stname, clean = stfips, label = stabbr)
df
```

Create a new column with labelled values.
```{r}
## convert to tbl_df
df <- tibble::as_tibble(df)
df$state3 <- encodefrom(df, var = state, cw_file = cw, raw = stname, clean = stfips, label = stabbr)
```

Create a new column with factor-encoded values (ignoring the fact that `df` is a tibble).
```{r}
df$state4 <- encodefrom(df, var = state, cw_file = cw, raw = stname, clean = stfips, label = stabbr, ignore_tibble = TRUE)
```

Show factors with labels:
```{r}
as_factor(df)
```
Show factors without labels:
```{r}
zap_labels(df)
```
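
To make the mechanics concrete, here is a rough base-R sketch of the lookups that a crosswalk implies. It is an illustration under simple assumptions (every value to encode appears in the crosswalk, and unmatched columns are dropped when renaming) and is not how `renamefrom()` or `encodefrom()` are implemented. The objects `toy`, `cw_names`, and `cw_codes` are made up for this example.

```{r}
## toy data and crosswalks (hypothetical, for illustration only)
toy <- data.frame(state = c('Kentucky', 'Tennessee', 'Virginia'),
                  fips = c(21, 47, 51),
                  region = c('South', 'South', 'South'))
cw_names <- data.frame(old_name = c('state', 'fips'),
                       new_name = c('stname', 'stfips'))
cw_codes <- data.frame(stname = c('Kentucky', 'Tennessee', 'Virginia'),
                       stfips = c(21, 47, 51),
                       stabbr = c('KY', 'TN', 'VA'))

## renaming: keep only columns found in the crosswalk, then swap in new names
idx <- match(names(toy), cw_names$old_name)
renamed <- toy[, !is.na(idx), drop = FALSE]
names(renamed) <- cw_names$new_name[idx[!is.na(idx)]]
names(renamed)

## encoding: look up the clean code for each raw value, then build a factor
## whose levels are the clean codes and whose labels come from the crosswalk
codes <- cw_codes$stfips[match(toy$state, cw_codes$stname)]
encoded <- factor(codes, levels = cw_codes$stfips, labels = cw_codes$stabbr)
encoded
```

Both steps reduce to `match()` against the crosswalk's raw column, which is why a single crosswalk file can be reused across many input data sets.
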