https://github.com/lingfeiwang/normalisr

Causal inference, differential expression, and co-expression for scRNA-seq

=========
Normalisr
=========
.. image:: https://img.shields.io/pypi/v/normalisr?color=informational
   :target: https://pypi.python.org/pypi/normalisr

.. image:: https://zenodo.org/badge/242889849.svg
   :target: https://zenodo.org/badge/latestdoi/242889849

Normalisr is a parameter-free normalization and statistical association testing framework that unifies single-cell differential expression, co-expression, and pooled single-cell CRISPR screen analyses with linear models. By systematically detecting and removing nonlinear confounders arising from library size at mean and variance levels, Normalisr achieves high sensitivity, specificity, speed, and generalizability across multiple scRNA-seq protocols and experimental conditions with unbiased p-value estimation.

Normalisr first removes confounding technical noise from raw read counts to recover the biological variation. Linear association testing then provides a unified inferential framework with several advantages: (i) exact P-value estimation without permutation, (ii) native removal of covariates (*e.g.* batches, housekeeping programs, and untested gRNAs) as fixed effects, (iii) robustness against distortions of the read count distribution given enough (> 100) cells, and (iv) computational efficiency.

Normalisr is written in Python and provides both a command-line interface and a Python functional interface. Normalisr is published in `Nature Communications <https://doi.org/10.1038/s41467-021-26682-1>`_ (2021).
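The idea of linear association testing with covariates removed as fixed effects can be sketched in a few lines. This is a generic textbook illustration (residualize both variables on the covariates, then test the partial correlation with a Student's t distribution), not Normalisr's own implementation. The function name ``linear_association`` and the simulated data are hypothetical, and the sketch assumes NumPy and SciPy are installed.

```python
import numpy as np
from scipy import stats


def linear_association(x, y, covariates):
    """Test the linear association between x and y after removing covariates.

    x, y: 1-D arrays of length n_cells.
    covariates: 2-D array of shape (n_cells, n_cov); an intercept is added here.
    Returns (partial correlation, two-sided P-value).
    """
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    n = x.shape[0]
    # Nuisance design matrix: intercept plus covariates (fixed effects).
    c = np.column_stack([np.ones(n), np.asarray(covariates, dtype=float).reshape(n, -1)])
    # Residualize both variables on the covariates by least squares.
    rx = x - c @ np.linalg.lstsq(c, x, rcond=None)[0]
    ry = y - c @ np.linalg.lstsq(c, y, rcond=None)[0]
    # Partial correlation between the two residual vectors.
    r = float(rx @ ry / np.sqrt((rx @ rx) * (ry @ ry)))
    # Under the null with Gaussian noise, t follows a Student's t
    # distribution, so the P-value needs no permutation.
    dof = n - np.linalg.matrix_rank(c) - 1
    t = r * np.sqrt(dof / (1.0 - r * r))
    p = float(2.0 * stats.t.sf(abs(t), dof))
    return r, p


# Simulated example (hypothetical data): a batch effect confounds both variables.
rng = np.random.default_rng(0)
n_cells = 500
batch = rng.integers(0, 2, size=n_cells).astype(float)
signal = rng.normal(size=n_cells)
x = signal + 2.0 * batch
y = 0.5 * signal + 2.0 * batch + rng.normal(size=n_cells)
r, p = linear_association(x, y, batch[:, None])
print(f"partial correlation = {r:.3f}, P-value = {p:.3g}")
```

Passing the batch indicator as a covariate removes its contribution from both variables before testing, so the reported association reflects only the shared signal.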

Installation
=============
Normalisr is on `PyPI <https://pypi.python.org/pypi/normalisr>`_ and can be installed with pip: ``pip install normalisr``. You can also install Normalisr from GitHub: ``pip install git+https://github.com/lingfeiwang/normalisr.git``. Before using the command-line interface, make sure Normalisr's install path is in your PATH environment variable (see FAQ_). Installation should take less than a minute.

There are more advanced installation methods but if you want that, most likely you already know how to do it. If not, give me a shout (See Issues_).
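If you are unsure whether the ``normalisr`` command ended up on your PATH, a small check like the one below may help. It is only a sketch: the helper name ``normalisr_command`` is hypothetical, and the fallback relies on the ``python -m normalisr`` invocation described in the FAQ_.

```python
import shutil
import sys


def normalisr_command():
    """Return the argv prefix for invoking Normalisr's command-line interface."""
    exe = shutil.which("normalisr")
    if exe is not None:
        # The 'normalisr' executable was found on PATH.
        return [exe]
    # Otherwise fall back to running the module with the current interpreter.
    # This assumes Normalisr is installed for that interpreter.
    return [sys.executable, "-m", "normalisr"]


# Build (but do not run) the command that would show the general help.
cmd = normalisr_command() + ["-h"]
print(" ".join(cmd))
```

The resulting list can be passed to ``subprocess.run`` to launch Normalisr from a script.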

Usage
=====
Normalisr provides a command-line interface and a Python functional interface, both described below. The provided examples can guide you through Normalisr's use. Sphinx-based documentation is underway.

* Command-line interface

  You can run Normalisr by typing ``normalisr`` on the command line. Normalisr uses submodules for different analysis steps. Type ``normalisr`` or ``normalisr -h`` for general help, or for example ``normalisr de -h`` for help on the differential expression submodule 'de'.

  Normalisr uses the tsv (tab-separated values) file format for input and output matrices, and text files for row and column names, such as cells and genes, one per line. For initial input, Normalisr also accepts the sparse mtx format (Cell Ranger output) for the raw read count matrix. Gzipped input/output files are recognized automatically when the file name ends in '.gz'.

* Python functional interface

  Normalisr's Python functional interface is more flexible than the command-line interface, but requires knowledge of Python programming. Documentation for any function can be obtained with ``?`` in IPython or a Jupyter notebook, such as:

  .. code-block::

     import normalisr.normalisr as norm
     ?norm.de

The example Jupyter notebooks also illustrate the scope of functions Normalisr provides.
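The file layout described for the command-line interface (a tab-separated matrix, plus plain-text name files with one name per line, optionally gzipped) can be produced and read back with the Python standard library alone. The sketch below is illustrative only: the helper names, file names, and toy values are hypothetical, and it does not use Normalisr's own reading or writing code.

```python
import csv
import gzip
import os
import tempfile


def open_text(path, mode="r"):
    """Open a text file, using gzip when the file name ends in '.gz'."""
    if path.endswith(".gz"):
        return gzip.open(path, mode + "t", newline="")
    return open(path, mode, newline="")


def write_matrix(path, rows):
    """Write a matrix as tab-separated values, one matrix row per line."""
    with open_text(path, "w") as f:
        csv.writer(f, delimiter="\t", lineterminator="\n").writerows(rows)


def read_matrix(path):
    """Read a tab-separated matrix back as a list of lists of floats."""
    with open_text(path) as f:
        return [[float(v) for v in row] for row in csv.reader(f, delimiter="\t")]


def write_names(path, names):
    """Write row or column names, one per line."""
    with open_text(path, "w") as f:
        f.write("".join(name + "\n" for name in names))


def read_names(path):
    """Read row or column names, one per line, skipping empty lines."""
    with open_text(path) as f:
        return [line.rstrip("\n") for line in f if line.strip()]


def roundtrip_demo():
    """Write a toy read count matrix with its names, then read everything back."""
    counts = [[0, 3, 1], [2, 0, 5]]  # toy values: 2 rows x 3 columns
    rows, cols = ["geneA", "geneB"], ["cell1", "cell2", "cell3"]
    with tempfile.TemporaryDirectory() as d:
        write_matrix(os.path.join(d, "reads.tsv.gz"), counts)
        write_names(os.path.join(d, "rows.txt"), rows)
        write_names(os.path.join(d, "cols.txt"), cols)
        return (
            read_matrix(os.path.join(d, "reads.tsv.gz")),
            read_names(os.path.join(d, "rows.txt")),
            read_names(os.path.join(d, "cols.txt")),
        )


matrix, row_names, col_names = roundtrip_demo()
print(len(matrix), "rows x", len(matrix[0]), "columns")
```

Keeping the names in separate files means the matrix file itself contains only numbers, which is why the number of lines in each name file should match the corresponding matrix dimension.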

Documentation
=============
Documentation is available as `html `_ and `pdf `_.

Examples and pipelines
==========================
You can find several examples in the 'examples' folder, covering all functions Normalisr currently provides. The example datasets have been scaled down to run on a personal computer with 16GB of memory. Although they serve only as demonstrations here, the pipelines should be transferable to a different, full-scale dataset. Since Normalisr is non-parametric, the only adjustable parameters are those for quality control and the final cutoffs for differential expression or co-expression. You can change the down-sampling parameters in the examples to run the full datasets on a larger computer.

You can find more details in the respective examples.

Issues
==========================
Please raise an issue on `GitHub <https://github.com/lingfeiwang/normalisr>`_.

References
==========================
* Single-cell normalization and association testing unifying CRISPR screen and gene co-expression analyses with Normalisr, Lingfei Wang, Nature Communications 2021. https://doi.org/10.1038/s41467-021-26682-1

FAQ
==========================
* What does Normalisr stand for?

  **N**\ ormalisr **O**\ ffers **R**\ obust **M**\ odelling of **A**\ ssociations **L**\ inearly **I**\ n **S**\ ingle-cell **R**\ NA-seq. Yes, it's a recursive acronym. See `GNU `_ and `pip `_.

* I installed Normalisr but typing ``normalisr`` says 'command not found'.

  See the next question.

* How do I use a specific Python version for Normalisr's command-line interface?

  You can always run Normalisr through the python command, such as ``python3 -m normalisr`` in place of the ``normalisr`` command. You can also use a specific Python path or version, such as ``python3.7 -m normalisr`` or ``/usr/bin/python3.7 -m normalisr``. Make sure you have installed Normalisr for that Python version.

* Why don't the examples work?

  Please make sure you followed every step in the README.md of the respective example folder with an Internet connection. If the problem persists, submit an issue report detailing the executed line at which the error occurred, along with its input and output.

* Does Normalisr run on Windows?

  I have not tested Normalisr on Windows. However, it is written purely in Python and should function properly.