An open API service indexing awesome lists of open source software.

https://github.com/annakrystalli/metadatar

Tools for creating metadata files for reproducible research.
https://github.com/annakrystalli/metadatar

Last synced: 4 days ago
JSON representation

Tools for creating metadata files for reproducible research.

Awesome Lists containing this project

README

        

---
output: github_document
---

```{r setup, include = FALSE}
knitr::opts_chunk$set(
collapse = TRUE,
comment = "#>",
fig.path = "man/figures/README-",
out.width = "100%"
)
```
# metadatar

[![Travis build status](https://travis-ci.org/annakrystalli/metadatar.svg?branch=master)](https://travis-ci.org/annakrystalli/metadatar) [![lifecycle](https://img.shields.io/badge/lifecycle-experimental-orange.svg)](https://www.tidyverse.org/lifecycle/#experimental)

The goal of metadatar is to help produce minimum metadata files to document datasets in simple formats that can form building blocks of more complex metadata formats (eg. EML, rdf).

## Installation

You can install the developent version of metadatar from GitHub with:

``` r
#install.packages("devtools")
devtools::install_github("annakrystalli/metadatar")
```

## Example

This is a basic example which shows you how to create a metadata table for the [**gapminder**](https://github.com/jennybc/gapminder) dataset

```{r example}
library(gapminder)
library(metadatar)
str(gapminder)
```

```{r}
meta_shell <- mt_create_meta_shell(gapminder)
knitr::kable(meta_shell)
```

I've focused on recognized column headers to make it easier to create an EML object down the line and on the core columns required but additional ones can be added.

### Attributes associated with all variables:

- attributeName (required, free text field)
- attributeDefinition (required, free text field)
- columnClasses (required, `"numeric"`, `"character"`, `"factor"`, `"ordered"`,
or `"Date"`, case sensitive)


### `columnClasses` dependant attributes

- For `numeric` (ratio or interval) data:
- unit (required, see [eml-unitTypeDefinitions](https://knb.ecoinformatics.org/#external//emlparser/docs/eml-2.1.1/./eml-unitTypeDefinitions.html) and [working with units](https://github.com/ropensci/EML/blob/master/vignettes/working-with-units.Rmd))
- For `character` (textDomain) data:
- definition (required)
- For `dateTime` data:
- formatString (required)
e.g for date `11-03-2001` formatString would be `"DD-MM-YYYY"`

- I use the columns `code` and `levels` to store information on factors. I use `";"` to separate code and level descriptions. These can be extracted by `metadatar` function `mt_extract_attr_factors()` later on.

## Complete metadata table

```{r, message=FALSE, warning=FALSE}
meta_df <- readr::read_csv(system.file("extdata", "gapminder_meta.csv", package="metadatar"))
```

```{r}
knitr::kable(meta_df)
```

## Extracting factors

```{r}
mt_extract_factors_tbl(meta_df)
```

## Data visualisation utilities

Create more descriptive variable labels for plot axes/titles or tables

```{r}
mt_label(meta_df, var = "gdpPercap")
```

***

Please note that this project is released with a [Contributor Code of Conduct](CODE_OF_CONDUCT.md).
By participating in this project you agree to abide by its terms.