https://github.com/epicentre-msf/datadict
- Host: GitHub
- URL: https://github.com/epicentre-msf/datadict
- Owner: epicentre-msf
- License: other
- Created: 2021-08-16T17:22:35.000Z (over 3 years ago)
- Default Branch: main
- Last Pushed: 2024-05-03T12:18:13.000Z (12 months ago)
- Last Synced: 2024-08-13T07:11:32.444Z (9 months ago)
- Language: R
- Homepage: https://epicentre-msf.github.io/datadict/
- Size: 6.35 MB
- Stars: 4
- Watchers: 5
- Forks: 1
- Open Issues: 0
Metadata Files:
- Readme: README.Rmd
- License: LICENSE
Awesome Lists containing this project
- jimsghstars - epicentre-msf/datadict - (R)
README
---
output: github_document
---

```{r, echo = FALSE}
knitr::opts_chunk$set(
collapse = TRUE,
comment = "#>",
fig.path = "man/figures/"
)
options(digits = 4, width = 120)
```

# datadict: Data dictionary tools for the OCA data sharing platform

[Lifecycle: experimental](https://www.tidyverse.org/lifecycle/#experimental)
[GitHub Actions](https://github.com/epicentre-msf/datadict/actions)
[Codecov](https://codecov.io/gh/epicentre-msf/datadict?branch=main)

### Installation

Install from GitHub with:
```{r, eval=FALSE}
# install.packages("remotes")
remotes::install_github("epicentre-msf/datadict")
```

### Example usage

#### Generate data dictionary from ODK template
The `dict_from_odk()` function generates an OCA-style data dictionary from an
ODK template. Both the 'survey' and 'choices' sheets of the ODK template are
required as inputs.

```{r}
library(datadict)
library(readxl)

# path to example ODK template (a WHO mortality survey)
path_data <- system.file("extdata", package = "datadict")
path_odk_template <- file.path(path_data, "WHOVA2016_v1_5_3_ODK.xlsx")

# read 'survey' sheet and 'choices' sheet
odk_survey <- readxl::read_xlsx(path_odk_template, sheet = "survey")
odk_choices <- readxl::read_xlsx(path_odk_template, sheet = "choices")

# derive OCA-style data dictionary
dict <- dict_from_odk(odk_survey, odk_choices)

# examine first few rows/cols
dict[1:5,1:4]
```

#### Generate data dictionary from REDCap template

The `dict_from_redcap()` function can be used to generate an OCA-style data
dictionary from a REDCap data dictionary. The input dictionary can be exported
directly from a REDCap project website or fetched via the API using e.g. the R
package [redcap](https://github.com/epicentre-msf/redcap).

```{r}
# path to example REDCap template
path_data <- system.file("extdata", package = "datadict")
path_redcap_dict <- file.path(path_data, "REDCapDataDictionaryDemo.csv")

# read dictionary
redcap_dict <- read.csv(path_redcap_dict)

# derive OCA-style data dictionary
dict <- dict_from_redcap(redcap_dict)

# examine first few rows/cols
dict[1:5,1:5]
```

#### Generate data dictionary template from a dataset

The `dict_from_data()` function generates a template OCA-style data dictionary
from a dataset. The template may require further processing. Data types are
based on the class of each column in the input dataset, e.g.:

| Column class in R | Dictionary data type |
| ------------------|----------------------|
| Date | Date |
| POSIX | Datetime |
| logical | Logical |
| integer | Numeric |
| numeric | Numeric |
| factor | Coded list |
| character | Coded list or Free text (see argument `factor_threshold`) |

```{r}
# path to example dataset
path_data <- system.file("extdata", package = "datadict")
path_linelist <- file.path(path_data, "linelist_cleaned.xlsx")

# read data
dat <- readxl::read_xlsx(path_linelist)

# derive OCA-style data dictionary template
dict <- dict_from_data(dat)

# examine first few rows/cols
dict[1:7,1:5]
```
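
To make the class-to-type table above more concrete, here is a minimal base-R sketch of that kind of mapping. It is not the implementation used by `dict_from_data()`. The helper `guess_dict_type()`, the example data frame, and the output column names are hypothetical. The rule applied to character columns (at most `factor_threshold` distinct values means "Coded list") is an assumption about what the threshold does, so check the function's documentation for the actual definition.

```r
# Hypothetical helper (not part of datadict): guess a dictionary data type
# from the class of a single column, following the table above.
guess_dict_type <- function(x, factor_threshold = 10) {
  if (inherits(x, "Date")) {
    "Date"
  } else if (inherits(x, "POSIXt")) {
    "Datetime"
  } else if (is.logical(x)) {
    "Logical"
  } else if (is.factor(x)) {
    "Coded list"
  } else if (is.numeric(x)) {
    # covers both integer and double columns
    "Numeric"
  } else if (is.character(x)) {
    # Assumption: few distinct values suggest a coded list, many suggest free text
    n_unique <- length(unique(x[!is.na(x)]))
    if (n_unique <= factor_threshold) "Coded list" else "Free text"
  } else {
    NA_character_
  }
}

# Small made-up dataset with one column per class in the table
dat_example <- data.frame(
  onset     = as.Date(c("2021-01-05", "2021-01-09", "2021-02-11")),
  admitted  = as.POSIXct(
    c("2021-01-06 10:30:00", "2021-01-10 08:00:00", "2021-02-12 14:15:00"),
    tz = "UTC"
  ),
  confirmed = c(TRUE, FALSE, NA),
  age       = c(34L, 5L, 61L),
  weight_kg = c(70.2, 18.5, 82.0),
  outcome   = factor(c("recovered", "died", "recovered")),
  sex       = c("f", "m", "f"),
  notes     = c("fever since Monday", "referred from clinic", "no symptoms reported"),
  stringsAsFactors = FALSE
)

# Apply the helper to every column; a threshold of 2 is used here only so the
# tiny example separates 'sex' (2 distinct values) from 'notes' (3 distinct values)
dict_types <- data.frame(
  variable = names(dat_example),
  type = unname(vapply(dat_example, guess_dict_type, character(1), factor_threshold = 2)),
  stringsAsFactors = FALSE
)

print(dict_types)
```

In this sketch `sex` is classed as a coded list and `notes` as free text. A template produced this way would still need manual review, as noted above for the output of `dict_from_data()`.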