https://github.com/theislab/cellink

Scalable framework for integrating single-cell omics with genetic data using AnnData.

anndata genetics single-cell

[![Build](https://github.com/theislab/cellink/actions/workflows/build.yaml/badge.svg)](https://github.com/theislab/cellink/actions/workflows/build.yaml)
[![License](https://img.shields.io/github/license/theislab/cellink)](https://opensource.org/licenses/Apache-2.0)
[![Read the Docs](https://img.shields.io/readthedocs/cellink/latest.svg?label=Read%20the%20Docs)](https://cellink-docs.readthedocs.io/)
[![Test](https://github.com/theislab/cellink/actions/workflows/test.yaml/badge.svg)](https://github.com/theislab/cellink/actions/workflows/test.yaml)
[![pre-commit](https://img.shields.io/badge/pre--commit-enabled-brightgreen?logo=pre-commit&logoColor=white)](https://github.com/pre-commit/pre-commit)

# Single-cell Genetics Package (Cellink)

Welcome to the official documentation for **cellink**, the toolkit designed to bridge the gap between single-cell data and individual-level genetic analysis.

## Motivation

Integrating genetic data with cellular heterogeneity is crucial for advancing personalized medicine. **cellink** provides the missing framework for efficiently handling and analyzing genetic variation alongside complex single-cell omics data at scale.

## ✨ Key Features & Structure

**cellink** introduces the `DonorData` class, unifying individual-level and single-cell data. It extends standard formats (AnnData, MuData) with GenoAnnData for efficient genotype (via dask) and phenotype (via ehrapy) handling.

![Data structure schematic](docs/_static/img/schematic_figure.png)

````{only} html
```{image} _static/img/schematic_figure.png
:width: 750px
:alt: Data structure schematic
```
````

- **Donor-level Data (G):** `GenoAnnData` stores individual-level data such as genotypes.
- **Cell-level Data (C):** `AnnData`/`MuData` stores single-cell omics data such as gene expression.

Crucially, **`DonorData`** ensures that genetic data and single-cell modalities remain **synchronized**, preserving their donor-cell pairing even through complex filtering operations (e.g., selecting specific cell types or patient subsets).
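
The synchronization idea can be illustrated with a toy container in plain Python. This is a minimal sketch and not cellink's implementation or API: the class `ToyDonorData`, its methods, and the rule that donors without cells are dropped are all assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass
class ToyDonorData:
    """Hypothetical stand-in (NOT cellink's DonorData) that keeps donors and cells paired."""

    genotypes: dict  # donor id -> list of genotype dosages (donor level, "G")
    cells: list      # one dict per cell with a "donor" key (cell level, "C")

    def __post_init__(self):
        # Assumed rule: keep only donors that have cells and cells whose donor is known.
        shared = set(self.genotypes) & {c["donor"] for c in self.cells}
        self.genotypes = {d: g for d, g in self.genotypes.items() if d in shared}
        self.cells = [c for c in self.cells if c["donor"] in shared]

    def filter_cells(self, predicate):
        """Subset cells; donors left without cells are dropped on construction."""
        kept = [c for c in self.cells if predicate(c)]
        return ToyDonorData(dict(self.genotypes), kept)

    def filter_donors(self, donor_ids):
        """Subset donors; cells of removed donors are dropped on construction."""
        keep = set(donor_ids)
        kept = {d: g for d, g in self.genotypes.items() if d in keep}
        return ToyDonorData(kept, list(self.cells))


dd = ToyDonorData(
    genotypes={"D1": [0, 1, 2], "D2": [1, 1, 0], "D3": [2, 0, 0]},
    cells=[
        {"donor": "D1", "cell_type": "T cell"},
        {"donor": "D1", "cell_type": "B cell"},
        {"donor": "D2", "cell_type": "B cell"},
    ],
)

# Selecting a cell type also restricts the donor-level data to matching donors.
t_cells = dd.filter_cells(lambda c: c["cell_type"] == "T cell")
print(sorted(t_cells.genotypes))  # ['D1']
```

Re-synchronizing inside the constructor means every filtered object is consistent by construction, whichever side was filtered.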

### Comprehensive Toolkit

**cellink** offers a streamlined suite of tools for the entire analysis workflow:

- **[Variant Preprocessing & Annotation](https://cellink-docs.readthedocs.io/en/latest/tutorials/explore_annotations.html):** Tools for quality control, annotation (VCF export/import), and selection of genetic variants.
- **Specialized Downstream Analysis:** Easily perform complex genetic analyses on single-cell expression data, including:
  - [eQTL mapping](https://cellink-docs.readthedocs.io/en/latest/tutorials/pseudobulk_eqtl.html)
  - [Rare variant association studies](https://cellink-docs.readthedocs.io/en/latest/tutorials/burden_testing.html)
  - [Clumping & pruning](https://cellink-docs.readthedocs.io/en/latest/tutorials/clumping_pruning.html)
  - [Colocalization analysis](https://cellink-docs.readthedocs.io/en/latest/tutorials/colocalization.html)
- **Interoperability:** **cellink** exports data in formats compatible with common genetic analysis tools, e.g., for [eQTL analysis with jaxqtl or tensorqtl](https://cellink-docs.readthedocs.io/en/latest/tutorials/pseudobulk_eqtl_jaxqtl_tensorqtl.html), and includes built-in [dataloaders for deep learning](https://cellink-docs.readthedocs.io/en/latest/tutorials/run_dataloader.html).
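
To make the eQTL entry above more concrete, here is the idea behind a pseudobulk eQTL test in plain Python: expression of one gene is averaged per donor, then regressed on genotype dosage. This is a conceptual sketch and does not use cellink's API. The function names and the toy data are invented for illustration, and a real analysis would also handle normalization, covariates, and multiple testing.

```python
from collections import defaultdict


def pseudobulk_mean(cells):
    """Average one gene's expression over all cells of each donor."""
    totals, counts = defaultdict(float), defaultdict(int)
    for donor, expression in cells:
        totals[donor] += expression
        counts[donor] += 1
    return {donor: totals[donor] / counts[donor] for donor in totals}


def ols_slope(x, y):
    """Least-squares slope of y on x (expression change per alternate allele)."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    return sxy / sxx


# Toy data: genotype dosage per donor and (donor, expression) pairs per cell.
dosage = {"D1": 0, "D2": 1, "D3": 2}
cells = [("D1", 1.0), ("D1", 3.0), ("D2", 3.0), ("D2", 5.0), ("D3", 6.0), ("D3", 6.0)]

pb = pseudobulk_mean(cells)
donors = sorted(pb)
slope = ols_slope([dosage[d] for d in donors], [pb[d] for d in donors])
print(pb)     # {'D1': 2.0, 'D2': 4.0, 'D3': 6.0}
print(slope)  # 2.0
```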

## 🚀 Getting Started

- Check out the **[Tutorials](https://cellink-docs.readthedocs.io/en/latest/tutorials/index.html)** section for step-by-step guides on analysis workflows.
- Explore the **[API Reference](https://cellink-docs.readthedocs.io/en/latest/api/index.html)** for detailed documentation.

Install the latest development version directly from GitHub:

```bash
pip install git+https://github.com/theislab/cellink.git@main
```

## Contact

If you find a bug, please report it via the [issue tracker](https://github.com/theislab/cellink/issues).

## Release notes

t.b.a

## Citation

> t.b.a

[issue tracker]: https://github.com/theislab/cellink/issues