An open API service indexing awesome lists of open source software.

https://github.com/stufield/wranglr

The wranglr package contains general functions necessary to manipulate and wrangle internal R representations of proteomic data into convenient forms for analysis.
https://github.com/stufield/wranglr

Last synced: about 1 year ago
JSON representation

The wranglr package contains general functions necessary to manipulate and wrangle internal R representations of proteomic data into convenient forms for analysis.

Awesome Lists containing this project

README

          

---
output: github_document
---

```{r setup, include = FALSE}
options(width = 100)
Sys.setlocale("LC_COLLATE", "en_US.UTF-8") # ensure common sorting envir
knitr::opts_chunk$set(
collapse = TRUE,
comment = "#>",
fig.path = "man/figures/README-",
out.width = "100%"
)
ver <- desc::desc_get_version(".")
ver <- paste0("https://img.shields.io/badge/Version-", ver,
"-success.svg?style=flat&logo=github")
sample_df <- wranglr:::sample_df
```

# The `wranglr` package

![GitHub version](`r ver`)
[![CRAN status](http://www.r-pkg.org/badges/version/wranglr)](https://cran.r-project.org/package=wranglr)
[![R-CMD-check](https://github.com/stufield/wranglr/workflows/R-CMD-check/badge.svg)](https://github.com/stufield/wranglr/actions)
[![](https://cranlogs.r-pkg.org/badges/grand-total/wranglr)](https://cran.r-project.org/package=wranglr)
[![Codecov test coverage](https://codecov.io/gh/stufield/wranglr/branch/main/graph/badge.svg)](https://app.codecov.io/gh/stufield/wranglr?branch=main)
[![Lifecycle: stable](https://img.shields.io/badge/lifecycle-stable-brightgreen.svg)](https://lifecycle.r-lib.org/articles/stages.html#stable)
[![License: MIT](https://img.shields.io/badge/License-MIT-blue.svg)](https://choosealicense.com/licenses/mit/)

## Overview

The `wranglr` package contains general functions
necessary to manipulate and wrangle internal `R` representations
of proteomic data into convenient forms for analysis.

----------------

## Installation

```{r install-git, eval = FALSE}
# current dev version
remotes::install_github("stufield/wranglr")

# or a specific version
remotes::install_github("stufield/wranglr@v0.0.1")
```

----------------

## Usage

To load `wranglr` simply make a call to `library()` as usual:

```{r load, message = FALSE}
library(wranglr)
```

## Help summary of the package

```{r help, eval = FALSE}
library(help = wranglr)
```

-------------

## Useful functions in `wranglr`

### Transforming Data

* `center_scale()`
* `create_recipe()`

```{r center-scale}
scaled <- center_scale(mtcars)
apply(feature_matrix(scaled), 2, mean) |> sum() # mean = 0
apply(feature_matrix(scaled), 2, sd) # sd = 1

# `create_recipe()`
rcp <- create_recipe(mtcars)
rcp
```

-----------

```{r helpr, echo = FALSE}
library(helpr) # needs get_outliers()
```

### Imputing Data

* `impute_outliers()`, `get_outliers()` (`helpr`)
* `imputeNAs()`
* `impute_predictors()`

```{r imputing}
# Outliers
x <- withr::with_seed(1, rnorm(10)) # normal
x <- c(x, 100) # add outlier
get_outliers(x) # index of the outlier

# parameters stored in attributes (parametric only)
attributes(get_outliers(x, type = "para"))

# Impute 11th value
impute_outliers(x)

# Impute NAs
x <- withr::with_seed(1, rnorm(6))
x[ c(3, 5) ] <- NA
median(x, na.rm = TRUE)

imputeNAs(x)

table(imputeNAs(x))

# Predictors
x <- data.frame(a = 1:3, b = 4:6, c = 7:9, d = c(1.23, 4.56, 7.89))
tbl <- tibble::tribble(
~ Feature, ~ xtrm_max, ~ impute_max, ~ xtrm_min, ~ impute_min,
"a", NA, NA, NA, NA,
"b", 5, 5, 0, 1,
"c", 9, 7, 7.1, 7.1
)
impute_predictors(x, tbl)
```

-----------

### Binding Data

* `bind_intersect()`
* `bind_union()`

```{r bind}
df1 <- data.frame(a = 1, b = 2, c = 3, row.names = "A")
df2 <- data.frame(a = 4, b = 5, d = 6, row.names = "B")
df3 <- data.frame(a = 7, b = 8, e = 9, row.names = "C")
list_df <- list(a = df1, b = df2, c = df3)
list_df

bind_intersect(list_df)

bind_union(list_df)
```

-----------

### Selecting Feature Data

* `refactor_data()`
* `feature_matrix()`

```{r refactor}
df <- data.frame(a = factor(c("a", "b")), b = 1:2L)
foo <- df[df$a == "a", ]
foo

levels(foo$a) # 2 levels! "b" is a ghost level

bar <- refactor_data(foo)
levels(bar$a) # 1 level now
```

-----------

### Sequence IDs and Annotations

* `lookup_anno()`
* `seq_lookup()`
* `seqify()`

```{r seq-lookup}
seqs <- withr::with_seed(101, sample(names(sample_df), 10L))
seqs

# NAs for those analytes dropped from menu
seq_lookup(seqs)

seqify(seqs)

# Pass `tbl` containing annotations
# to reconstitute those missing ones
anno <- attr(sample_df, "anno")
seq_lookup(seqs, tbl = anno)
```