https://github.com/const-ae/lemur.utils

R package with helper functions to process the output of lemur

Host: GitHub
URL: https://github.com/const-ae/lemur.utils
Owner: const-ae
Created: 2024-06-20T13:49:45.000Z (about 1 year ago)
Default Branch: main
Last Pushed: 2024-07-30T08:43:24.000Z (11 months ago)
Last Synced: 2025-02-17T01:42:15.493Z (4 months ago)
Language: R
Size: 275 KB
Stars: 0
Watchers: 1
Forks: 0
Open Issues: 0
Metadata Files:
- Readme: README.Rmd
- Changelog: NEWS.md

README

---
output: github_document
---

```{r, include = FALSE}
knitr::opts_chunk$set(
  collapse = TRUE,
  comment = "#>",
  fig.path = "man/figures/README-",
  out.width = "100%"
)
options(max.print=10)
```

# lemur.utils

Helper functions to manage the output of [`lemur`](https://www.bioconductor.org/packages/lemur/).

## Installation

You can install the development version of lemur.utils from [GitHub](https://github.com/) with:

``` r
# install.packages("devtools")
devtools::install_github("const-ae/lemur.utils")
```

# Disclaimer

This package is in a very early stage of development, and the API is not considered stable.

# Documentation

I will demonstrate the functions using the data by Kang et al. (2018).

```{r, message=FALSE, paged.print=FALSE}
library(lemur)
library(lemur.utils)
library(SingleCellExperiment)
library(tidyverse)
set.seed(1)

# Prepare the data
sce <- muscData::Kang18_8vs8()
logcounts(sce) <- transformGamPoi::shifted_log_transform(sce)
hvg <- order(-rowVars(logcounts(sce)))
sce <- sce[hvg[1:500],]

fit <- lemur(sce, design = ~ stim, n_embedding = 10, verbose = FALSE)
fit <- align_harmony(fit)
fit <- test_de(fit, contrast = cond(stim = "stim") - cond(stim = "ctrl"))
nei <- find_de_neighborhoods(fit, group_by = vars(ind))
as_tibble(nei)
```

### Neighborhood helpers

#### `neighborhoods_to_long_data()`

Convert the neighborhood column from the output of `lemur::find_de_neighborhoods` to a tidy tibble.

```{r, paged.print=FALSE}
# By default the long data contains all gene / cell combinations
neighborhoods_to_long_data(nei, fit = fit)

# `only_keep_inside` filters out `inside == FALSE` and produces a smaller tibble
neighborhoods_to_long_data(nei, fit = fit, only_keep_inside = TRUE)
```
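To illustrate what "long data" means here, the following base-R sketch builds a comparable table by hand. It is not the package's implementation: the toy `neighborhood` list (one vector of cell names per gene) and the column names `name` and `cell` are assumptions made for illustration; only the `inside` flag is taken from the description above.

```r
# Hypothetical stand-in for the `neighborhood` list column: one entry per gene,
# each holding the names of the cells inside that gene's neighborhood.
all_cells <- paste0("cell_", 1:5)
neighborhood <- list(
  geneA = c("cell_1", "cell_2"),
  geneB = c("cell_2", "cell_4", "cell_5")
)

# One row per gene / cell combination
long <- expand.grid(name = names(neighborhood), cell = all_cells,
                    stringsAsFactors = FALSE)

# Flag whether the cell belongs to the neighborhood of the gene in that row
long$inside <- unname(mapply(function(g, cl) cl %in% neighborhood[[g]],
                             long$name, long$cell))

nrow(long)                       # all combinations: 2 genes * 5 cells
inside_only <- long[long$inside, ]
nrow(inside_only)                # only the rows with `inside == TRUE`
```

Keeping only the `inside` rows mirrors the idea behind `only_keep_inside = TRUE`: the table shrinks from every gene / cell pair to just the pairs that are part of a neighborhood.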

#### `neighborhoods_to_matrix()`

Convert the neighborhood column from the output of `lemur::find_de_neighborhoods` to a 0/1 matrix. 

```{r, paged.print=FALSE}
neighborhoods_to_matrix(nei, fit = fit)
```
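As a rough picture of such a 0/1 membership matrix, here is a base-R sketch on toy data. The orientation (genes in rows, cells in columns) and the toy `neighborhood` list are assumptions for illustration and may differ from what `neighborhoods_to_matrix()` actually returns.

```r
# Hypothetical toy input: one vector of cell names per gene
all_cells <- paste0("cell_", 1:5)
neighborhood <- list(
  geneA = c("cell_1", "cell_2"),
  geneB = c("cell_2", "cell_4", "cell_5")
)

# Rows are genes, columns are cells; an entry is 1 if the cell is inside
mat <- t(vapply(neighborhood,
                function(cells) as.integer(all_cells %in% cells),
                integer(length(all_cells))))
colnames(mat) <- all_cells

mat
rowSums(mat)  # number of cells per neighborhood
```

A matrix like this makes set operations cheap, for example `mat %*% t(mat)` counts the cells shared between the neighborhoods of two genes.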

#### `count_labels_per_neighborhood()`

Count the occurrences of a cell label per neighborhood.

```{r, paged.print=FALSE}
count_labels_per_neighborhood(nei, labels = vars(cell), fit = fit)
```
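Conceptually, counting labels per neighborhood is a cross-tabulation of cell labels within each neighborhood. The sketch below shows that idea in base R on toy data; the output column names (`name`, `label`, `n_inside`) and the toy inputs are hypothetical and are not taken from the package.

```r
# Hypothetical toy inputs: a label per cell and one neighborhood per gene
cell_type <- c(cell_1 = "T", cell_2 = "T", cell_3 = "B",
               cell_4 = "B", cell_5 = "NK")
neighborhood <- list(
  geneA = c("cell_1", "cell_2"),
  geneB = c("cell_2", "cell_4", "cell_5")
)

# For each gene, tabulate the labels of the cells inside its neighborhood.
# Using a factor with all levels keeps labels that occur zero times.
counts <- do.call(rbind, lapply(names(neighborhood), function(g) {
  labels_inside <- factor(cell_type[neighborhood[[g]]],
                          levels = unique(cell_type))
  tab <- table(labels_inside)
  data.frame(name = g, label = names(tab), n_inside = as.integer(tab))
}))

counts
```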

### Make tidy

#### `fit_pivot_longer`

`fit_pivot_longer` works on `lemur_fit` objects. For `SingleCellExperiment` objects, use the `sce_pivot_longer` function, which works analogously but has slightly different defaults.

Be careful when using this function: the output tibble has `n_genes * n_cells` rows, which for `n_genes = 1e4` and `n_cells = 1e4` amounts to 100 million rows.

```{r, paged.print=FALSE}
# Select genes by name
fit_pivot_longer(fit, genes = "FTH1")

# Select genes by index
fit_pivot_longer(fit, genes = 1:3)

# Select genes with filter statement wrapped in `vars`
fit_pivot_longer(fit, genes = vars(str_starts(SYMBOL, "HSP")))

# Select cells with a filter statement on cell type or condition
fit_pivot_longer(fit, genes = 1:10, cells = vars(cell == "CD4 T cells" | stim == "ctrl"))
```
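The size warning for this function can be made concrete with a small base-R sketch. It pivots a toy matrix into long format and then estimates the memory for a single numeric column at the dimensions mentioned in this section. The toy matrix and the column names `gene`, `cell`, and `value` are made up for illustration; this is not how `fit_pivot_longer` is implemented.

```r
# Toy expression matrix: 2 genes x 3 cells
expr <- matrix(1:6, nrow = 2,
               dimnames = list(c("g1", "g2"), c("c1", "c2", "c3")))

# Pivot to long format: one row per gene / cell combination
long <- as.data.frame(as.table(expr), responseName = "value")
names(long) <- c("gene", "cell", "value")
nrow(long) == nrow(expr) * ncol(expr)

# The same rule applied to a realistic dataset
n_genes <- 1e4
n_cells <- 1e4
n_rows <- n_genes * n_cells     # 100 million rows

# A double needs 8 bytes, so each numeric column alone takes roughly
gib_per_column <- n_rows * 8 / 1024^3
gib_per_column                  # a bit under 1 GiB per numeric column
```

Restricting `genes` and `cells` before pivoting, as in the calls shown in this section, keeps the table at a manageable size.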

### Plotting helpers

#### `scale_color_de` / `scale_fill_de`

```{r, paged.print=FALSE}
fit_pivot_longer(fit, genes = 1, assays = "DE", reduced_dims = "embedding") %>%
  ggplot(aes(x = embedding[,1], y = embedding[,2])) +
    geom_point(aes(color = DE)) +
    scale_color_de(qlimits = 0.1)
```

### Projection onto a reference dataset

#### `transfer_col_data`

After integrating a query and a reference dataset, transfer the annotation from the reference data to the query data. 

The `ref` and `query` data must contain a shared embedding, produced by manually integrating them with Seurat, harmony, or LEMUR.

```{r, paged.print=FALSE}
# This is a completely simulated example to demonstrate how to call the `transfer_col_data` function
ref_sce <- SingleCellExperiment(list(logcounts = matrix(1, nrow = 400, ncol = 300)),
                                colData = DataFrame(celltype = sample(c("A", "B", "C"), size = 300, replace = TRUE),
                                                    origin = sample(c("foo", "bar"), size = 300, replace = TRUE)),
                                reducedDims = list(embedding = t(matrix(rnorm(10 * 300), nrow = 10, ncol = 300))))
transfer_col_data(ref_sce, fit, columns = vars(celltype, origin))
```
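One simple way to think about annotation transfer in a shared embedding is a nearest-neighbor lookup: each query cell receives the label of the closest reference cell. The base-R sketch below shows that idea on simulated 2D data. It is an assumption for illustration only; `transfer_col_data` may use a different strategy internally, and all object names below are made up.

```r
set.seed(1)

# Simulated reference embedding: two well-separated clusters with known labels
ref_emb <- rbind(matrix(rnorm(20, mean = -5), ncol = 2),
                 matrix(rnorm(20, mean = 5), ncol = 2))
ref_label <- rep(c("A", "B"), each = 10)

# Three query cells living in the same embedding space
query_emb <- rbind(c(-5, -5), c(5, 5), c(4, 6))

# For each query cell, find the index of the closest reference cell
nearest <- apply(query_emb, 1, function(q) {
  squared_dist <- rowSums(sweep(ref_emb, 2, q)^2)
  which.min(squared_dist)
})

# Copy the annotation of that reference cell to the query cell
query_label <- ref_label[nearest]
query_label
```

The lookup is only meaningful if both datasets really share the embedding, which is why the integration step has to happen before the transfer.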

# Session Info

```{r}
sessionInfo()
```
