https://github.com/const-ae/lemur.utils

R package with helper functions to process the output of lemur

---
output: github_document
---

```{r, include = FALSE}
knitr::opts_chunk$set(
  collapse = TRUE,
  comment = "#>",
  fig.path = "man/figures/README-",
  out.width = "100%"
)

options(max.print=10)
```

# lemur.utils

Helper functions to manage the output of [`lemur`](https://www.bioconductor.org/packages/lemur/).

## Installation

You can install the development version of lemur.utils from [GitHub](https://github.com/) with:

``` r
# install.packages("devtools")
devtools::install_github("const-ae/lemur.utils")
```

# Disclaimer

This package is at a very early stage of development and the API is not considered stable.

# Documentation

I will demonstrate the functions using the data by Kang et al. (2018).

```{r, message=FALSE, paged.print=FALSE}
library(lemur)
library(lemur.utils)
library(SingleCellExperiment)
library(tidyverse)
set.seed(1)

# Prepare the data
sce <- muscData::Kang18_8vs8()
logcounts(sce) <- transformGamPoi::shifted_log_transform(sce)
hvg <- order(-rowVars(logcounts(sce)))
sce <- sce[hvg[1:500],]

fit <- lemur(sce, design = ~ stim, n_embedding = 10, verbose = FALSE)
fit <- align_harmony(fit)
fit <- test_de(fit, contrast = cond(stim = "stim") - cond(stim = "ctrl"))
nei <- find_de_neighborhoods(fit, group_by = vars(ind))

as_tibble(nei)
```

### Neighborhood helpers

#### `neighborhoods_to_long_data()`

Convert the neighborhood column from the output of `lemur::find_de_neighborhoods` to a tidy tibble.

```{r, paged.print=FALSE}
# By default the long data contains all gene / cell combinations
neighborhoods_to_long_data(nei, fit = fit)
# `only_keep_inside` filters out `inside == FALSE` and produces a smaller tibble
neighborhoods_to_long_data(nei, fit = fit, only_keep_inside = TRUE)
```
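
To make the long format concrete, here is a minimal base R sketch of the idea on toy data. It is not the package's implementation: the gene and cell names are made up, and it assumes that the `neighborhood` column holds one vector of cell names per gene and that the long table has one row per gene / cell pair with a logical `inside` flag.

```r
# Toy stand-in for the `neighborhood` list column (hypothetical names):
# one vector of cell names per gene
cells <- paste0("cell_", 1:5)
neighborhoods <- list(
  geneA = c("cell_1", "cell_2"),
  geneB = c("cell_2", "cell_4", "cell_5")
)

# One row per gene / cell combination
long <- expand.grid(cell = cells, name = names(neighborhoods),
                    stringsAsFactors = FALSE)

# Flag whether the cell lies inside the neighborhood of the gene
long$inside <- mapply(function(gene, cell) cell %in% neighborhoods[[gene]],
                      long$name, long$cell, USE.NAMES = FALSE)
long <- long[, c("name", "cell", "inside")]

# Dropping the `inside == FALSE` rows gives the smaller table
long_inside <- long[long$inside, ]

nrow(long)         # 2 genes * 5 cells = 10 rows
nrow(long_inside)  # 2 + 3 = 5 rows
```

The full table grows with the product of genes and cells, whereas the filtered table only grows with the total neighborhood size, which is why the filtered version is usually much smaller.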

#### `neighborhoods_to_matrix()`

Convert the neighborhood column from the output of `lemur::find_de_neighborhoods` to a 0/1 matrix.

```{r, paged.print=FALSE}
neighborhoods_to_matrix(nei, fit = fit)
```
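
The 0/1 encoding can be illustrated with a small base R sketch on toy data. The names are hypothetical, and the orientation (genes in rows, cells in columns) is an assumption made for the illustration, not a statement about what `neighborhoods_to_matrix()` returns.

```r
# Toy neighborhoods (hypothetical names): one vector of cell names per gene
cells <- paste0("cell_", 1:5)
neighborhoods <- list(
  geneA = c("cell_1", "cell_2"),
  geneB = c("cell_2", "cell_4", "cell_5")
)

# Entry [g, c] is 1 if cell c is inside the neighborhood of gene g, else 0
mat <- t(vapply(neighborhoods,
                function(nb) as.integer(cells %in% nb),
                FUN.VALUE = integer(length(cells))))
colnames(mat) <- cells

mat
rowSums(mat)  # number of cells per neighborhood: geneA = 2, geneB = 3
```

A matrix like this is handy for linear algebra, for example `mat %*% x` sums a per-cell quantity `x` over each neighborhood.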

#### `count_labels_per_neighborhood()`

Count the occurrences of a cell label per neighborhood.

```{r, paged.print=FALSE}
count_labels_per_neighborhood(nei, labels = vars(cell), fit = fit)
```
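
Conceptually, the counting boils down to tabulating the labels of the cells inside each neighborhood. The following base R sketch shows this on toy data; all names, including the output columns `counts` and `total_counts`, are assumptions chosen for the illustration and may differ from the package's output.

```r
# Toy data (hypothetical names): six cells with a cell type label each
cells <- paste0("cell_", 1:6)
label <- c("T", "T", "B", "B", "NK", "T")
neighborhoods <- list(
  geneA = c("cell_1", "cell_2", "cell_3"),
  geneB = c("cell_3", "cell_4", "cell_5", "cell_6")
)

# Fix the label levels so that labels with zero cells are still reported
lvls <- sort(unique(label))
label_fct <- factor(label, levels = lvls)

counts <- do.call(rbind, lapply(names(neighborhoods), function(gene) {
  inside <- cells %in% neighborhoods[[gene]]
  data.frame(
    name = gene,
    label = lvls,
    counts = as.vector(table(label_fct[inside])),  # cells inside with this label
    total_counts = as.vector(table(label_fct))     # all cells with this label
  )
}))

counts
```

Comparing `counts` against `total_counts` shows which labels are over- or under-represented in a neighborhood.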

### Make tidy

#### `fit_pivot_longer`

`fit_pivot_longer` works on `lemur_fit` objects. For `SingleCellExperiment` objects, use `sce_pivot_longer`, which works analogously but has slightly different defaults.

Be careful when using this function: the output tibble has `n_genes * n_cells` rows, which for `n_genes = 1e4` and `n_cells = 1e4` means 100 million rows.

```{r, paged.print=FALSE}
# Select genes by name
fit_pivot_longer(fit, genes = "FTH1")

# Select genes by index
fit_pivot_longer(fit, genes = 1:3)

# Select genes with filter statement wrapped in `vars`
fit_pivot_longer(fit, genes = vars(str_starts(SYMBOL, "HSP")))

# Select cells by cell type or condition (note the `|`)
fit_pivot_longer(fit, genes = 1:10, cells = vars(cell == "CD4 T cells" | stim == "ctrl"))
```
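
To get a feeling for the output size before calling the function, the row count can be worked out by hand. The sketch below only does the arithmetic; the memory figure assumes 8 bytes per double and counts a single numeric column, so the real tibble with several columns will be larger.

```r
n_genes <- 1e4
n_cells <- 1e4

# One row per gene / cell combination
n_rows <- n_genes * n_cells
format(n_rows, big.mark = ",", scientific = FALSE)

# Approximate size of ONE numeric column (8 bytes per double), in GB
gb_per_column <- n_rows * 8 / 1e9
gb_per_column

# Restricting `genes` shrinks the output proportionally,
# e.g. 10 genes instead of 10,000
n_rows_subset <- 10 * n_cells
n_rows_subset
```

Selecting a handful of genes (or a subset of cells) keeps the result at a size that is comfortable to plot and filter.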

### Plotting helpers

#### `scale_color_de` / `scale_fill_de`

```{r, paged.print=FALSE}
fit_pivot_longer(fit, genes = 1, assays = "DE", reduced_dims = "embedding") %>%
  ggplot(aes(x = embedding[,1], y = embedding[,2])) +
    geom_point(aes(color = DE)) +
    scale_color_de(qlimits = 0.1)
```

### Projection onto a reference dataset

#### `transfer_col_data`

After integrating a query and a reference dataset, transfer the annotation from the reference data to the query data.
The `ref` and `query` data must contain a shared embedding, produced by manually integrating them with Seurat, harmony, or LEMUR.

```{r, paged.print=FALSE}
# This is a completely simulated example to demonstrate how to call the `transfer_col_data` function
ref_sce <- SingleCellExperiment(list(logcounts = matrix(1, nrow = 400, ncol = 300)),
  colData = DataFrame(celltype = sample(c("A", "B", "C"), size = 300, replace = TRUE),
                      origin = sample(c("foo", "bar"), size = 300, replace = TRUE)),
  reducedDims = list(embedding = t(matrix(rnorm(10 * 300), nrow = 10, ncol = 300))))

transfer_col_data(ref_sce, fit, columns = vars(celltype, origin))
```
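
The idea behind transferring annotations through a shared embedding can be sketched with a plain nearest-neighbor lookup. This is a simplified stand-in written in base R, not the algorithm used by `transfer_col_data`; the data and all variable names are made up, and cells are stored in rows for convenience.

```r
set.seed(1)

# Reference: two well-separated groups of cells in a 2D shared embedding
ref_embedding <- rbind(
  matrix(rnorm(6, mean = 0, sd = 0.1), ncol = 2),  # three cells of type "A"
  matrix(rnorm(6, mean = 5, sd = 0.1), ncol = 2)   # three cells of type "B"
)
ref_celltype <- rep(c("A", "B"), each = 3)

# Query: two unlabeled cells in the same embedding space
query_embedding <- rbind(c(0.1, -0.1), c(4.9, 5.2))

# For each query cell, find the index of the closest reference cell
nearest_ref <- apply(query_embedding, 1, function(q) {
  sq_dist <- rowSums(sweep(ref_embedding, 2, q)^2)  # squared Euclidean distance
  which.min(sq_dist)
})

# Copy the annotation of the nearest reference cell
query_celltype <- ref_celltype[nearest_ref]
query_celltype
```

The lookup is only meaningful if both datasets live in the same coordinate system, which is why the shared embedding from the integration step is required.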

# Session Info

```{r}
sessionInfo()
```