https://github.com/vjcitn/scvir
experimental interface between R and scvi-tools
bioconductor cite-seq scverse
- Host: GitHub
- URL: https://github.com/vjcitn/scvir
- Owner: vjcitn
- Created: 2023-01-26T20:18:50.000Z (over 2 years ago)
- Default Branch: main
- Last Pushed: 2024-03-20T12:03:33.000Z (about 1 year ago)
- Last Synced: 2024-05-11T07:41:43.187Z (about 1 year ago)
- Topics: bioconductor, cite-seq, scverse
- Language: R
- Homepage: https://vjcitn.github.io/scviR
- Size: 24.4 MB
- Stars: 6
- Watchers: 2
- Forks: 0
- Open Issues: 3
Metadata Files:
- Readme: README.md
# scviR
The scviR package provides an
experimental interface between R and [scvi-tools](https://docs.scvi-tools.org/en/stable/).

Our first release addresses the use of the [totalVI
model for CITE-seq data](https://docs.scvi-tools.org/en/stable/user_guide/models/totalvi.html).

- The scviR vignette works through a chunk of the Colab tutorial
for [scvi-tools 0.19.0](https://colab.research.google.com/github/scverse/scvi-tutorials/blob/0.20.0/totalVI.ipynb); 0.20.0 employs muon, and this has not been addressed yet.
- scviR defines python infrastructure via the [basilisk](https://bioconductor.org/packages/basilisk)
discipline; the main python dependencies are declared in `R/basilisk.R`.
- We have collected a number of intermediate results so that the outputs of totalVI
(and other VI procedures)
can be explored without taking the time to fit the model. An example in the CITE-seq
domain is the anndata instance for the 5k-10k PBMC dataset with representations of the latent space, cluster
assignments, and UMAP projection:

```
> tot = getTotalVI5k10kAdata() # retrieved on first call from Open Storage Network, cached
> tot
AnnData object with n_obs × n_vars = 10849 × 4000
obs: 'n_genes', 'percent_mito', 'n_counts', 'batch', '_scvi_labels', '_scvi_batch', 'leiden_totalVI'
var: 'highly_variable', 'highly_variable_rank', 'means', 'variances', 'variances_norm', 'highly_variable_nbatches'
uns: '_scvi_manager_uuid', '_scvi_uuid', 'hvg', 'leiden', 'log1p', 'neighbors', 'umap'
obsm: 'X_totalVI', 'X_umap', 'denoised_protein', 'protein_expression', 'protein_foreground_prob'
layers: 'counts', 'denoised_rna'
obsp: 'connectivities', 'distances'
> table(tot$obs$batch)

 PBMC5k PBMC10k
   3994    6855
> dim(tot$obsm$get("X_totalVI")) # cell positions in 20 dimensional latent space
[1] 10849    20
```
Vignettes in the package show how to populate a Bioconductor SingleCellExperiment with
components of this structure to help compare methods employed in the two frameworks.
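As a rough illustration of that last step, the sketch below builds a SingleCellExperiment from small simulated matrices that stand in for the AnnData components. It is not the code used in the package vignettes: the sizes, the simulated values, and the variable names (`counts`, `latent`, `umap`, `obs`) are assumptions chosen for the example, and only the slot names `X_totalVI` and `X_umap` and the batch labels are borrowed from the output above.

```r
library(SingleCellExperiment)

set.seed(1)
n_cells <- 50   # toy size; the real object has 10849 cells
n_genes <- 30   # toy size; the real object has 4000 genes

cell_ids <- paste0("cell", seq_len(n_cells))
gene_ids <- paste0("gene", seq_len(n_genes))

# Simulated stand-ins (assumptions, not real totalVI output).
# SingleCellExperiment expects genes in rows and cells in columns, whereas
# AnnData stores cells in rows, so a matrix pulled from AnnData would
# typically be transposed before being used as an assay.
counts <- matrix(rpois(n_genes * n_cells, lambda = 2), nrow = n_genes,
                 dimnames = list(gene_ids, cell_ids))

# Per-cell embeddings keep cells in rows in both frameworks.
latent <- matrix(rnorm(n_cells * 20), nrow = n_cells,
                 dimnames = list(cell_ids, NULL))
umap <- matrix(rnorm(n_cells * 2), nrow = n_cells,
               dimnames = list(cell_ids, c("UMAP1", "UMAP2")))

# Per-cell annotations, analogous to the AnnData 'obs' table.
obs <- S4Vectors::DataFrame(
  batch = rep(c("PBMC5k", "PBMC10k"), length.out = n_cells),
  row.names = cell_ids
)

sce <- SingleCellExperiment(
  assays = list(counts = counts),
  colData = obs,
  reducedDims = list(X_totalVI = latent, X_umap = umap)
)

dim(reducedDim(sce, "X_totalVI"))
```

With the embeddings stored as `reducedDims`, standard Bioconductor tooling can address them by name, which is what makes side-by-side comparison with R-native methods convenient.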