Ecosyste.ms: Awesome
An open API service indexing awesome lists of open source software.
https://github.com/const-ae/lemur.utils
R package with helper functions to process the output of lemur
https://github.com/const-ae/lemur.utils
Last synced: 9 days ago
JSON representation
R package with helper functions to process the output of lemur
- Host: GitHub
- URL: https://github.com/const-ae/lemur.utils
- Owner: const-ae
- Created: 2024-06-20T13:49:45.000Z (7 months ago)
- Default Branch: main
- Last Pushed: 2024-07-30T08:43:24.000Z (5 months ago)
- Last Synced: 2024-11-06T14:00:50.158Z (about 2 months ago)
- Language: R
- Size: 275 KB
- Stars: 0
- Watchers: 1
- Forks: 0
- Open Issues: 0
-
Metadata Files:
- Readme: README.Rmd
- Changelog: NEWS.md
Awesome Lists containing this project
README
---
output: github_document
---```{r, include = FALSE}
knitr::opts_chunk$set(
collapse = TRUE,
comment = "#>",
fig.path = "man/figures/README-",
out.width = "100%"
)options(max.print=10)
```# lemur.utils
Helper functions to manage the output of [`lemur`](https://www.bioconductor.org/packages/lemur/).
## Installation
You can install the development version of lemur.utils from [GitHub](https://github.com/) with:
``` r
# install.packages("devtools")
devtools::install_github("const-ae/lemur.utils")
```# Disclaimer
This package is in an very early stage of development and the API is not considered stable.
# Documentation
I will demonstrate the functions using the data by Kang et al. (2018).
```{r, message=FALSE, paged.print=FALSE}
library(lemur)
library(lemur.utils)
library(SingleCellExperiment)
library(tidyverse)
set.seed(1)# Prepare the data
sce <- muscData::Kang18_8vs8()
logcounts(sce) <- transformGamPoi::shifted_log_transform(sce)
hvg <- order(-rowVars(logcounts(sce)))
sce <- sce[hvg[1:500],]fit <- lemur(sce, design = ~ stim, n_embedding = 10, verbose = FALSE)
fit <- align_harmony(fit)
fit <- test_de(fit, contrast = cond(stim = "stim") - cond(stim = "ctrl"))
nei <- find_de_neighborhoods(fit, group_by = vars(ind))as_tibble(nei)
```### Neighborhood helpers
#### `neighborhoods_to_long_data()`
Convert the neighborhood column from the output of `lemur::find_de_neighborhoods` to a tidy tibble.
```{r, paged.print=FALSE}
# By default the long data contains all gene / cell combinations
neighborhoods_to_long_data(nei, fit = fit)
# `only_keep_inside` filters out `inside == FALSE` and produces a smaller tibble
neighborhoods_to_long_data(nei, fit = fit, only_keep_inside = TRUE)
```#### `neighborhoods_to_matrix()`
Convert the neighborhood column from the output of `lemur::find_de_neighborhoods` to a 0/1 matrix.
```{r, paged.print=FALSE}
neighborhoods_to_matrix(nei, fit = fit)
```#### `count_labels_per_neighborhood()`
Count the occurrences of a cell label per neighborhood.
```{r, paged.print=FALSE}
count_labels_per_neighborhood(nei, labels = vars(cell), fit = fit)
```### Make tidy
#### `fit_pivot_longer`
The `fit_pivot_longer` works on the `lemur_fit` objects. For `SingleCellExperiment` objects use the `sce_pivot_longer` function, which works analogously but has slightly different defaults.
Be careful when using this function: the output tibble will have `n_genes * n_cells` rows, which for `n_genes = 1e4` and `n_cells = 1e4` produces a 100 million rows.
```{r, paged.print=FALSE}
# Select genes by name
fit_pivot_longer(fit, genes = "FTH1")# Select genes by index
fit_pivot_longer(fit, genes = 1:3)# Select genes with filter statement wrapped in `vars`
fit_pivot_longer(fit, genes = vars(str_starts(SYMBOL, "HSP")))# Select cells by cell type and condition
fit_pivot_longer(fit, genes = 1:10, cells = vars(cell == "CD 4 T cells" | stim == "ctrl"))
```### Plotting helpers
#### `scale_color_de` / `scale_fill_de`
```{r, paged.print=FALSE}
fit_pivot_longer(fit, genes = 1, assays = "DE", reduced_dims = "embedding") %>%
ggplot(aes(x = embedding[,1], y = embedding[,2])) +
geom_point(aes(color = DE)) +
scale_color_de(qlimits = 0.1)
```### Projection onto a reference dataset
#### `transfer_col_data`
After integrating a query and a reference dataset, transfer the annotation from the reference data to the query data.
The `ref` and `query` data must contain a shared embedding, produced by manually integrating them with Seurat, harmony, or LEMUR.```{r, paged.print=FALSE}
# This is a completely simulated example to demonstrate how to call the `transfer_col_data` function
ref_sce <- SingleCellExperiment(list(logcounts = matrix(1, nrow = 400, ncol = 300)),
colData = DataFrame(celltype = sample(c("A", "B", "C"), size = 300, replace = TRUE),
origin = sample(c("foo", "bar"), size = 300, replace = TRUE)),
reducedDims = list(embedding = t(matrix(rnorm(10 * 300), nrow = 10, ncol = 300))))transfer_col_data(ref_sce, fit, columns = vars(celltype, origin))
```# Session Info
```{r}
sessionInfo()
```