{"id":27452197,"url":"https://github.com/epigen/mixscape_seurat","last_synced_at":"2025-08-22T00:09:14.936Z","repository":{"id":112747696,"uuid":"481635018","full_name":"epigen/mixscape_seurat","owner":"epigen","description":"A Snakemake workflow and MrBiomics module for performing perturbation analyses of pooled (multimodal) CRISPR screens with sc/snRNA-seq read-out (scCRISPR-seq) powered by the R package Seurat's method Mixscape.","archived":false,"fork":false,"pushed_at":"2024-12-20T15:56:44.000Z","size":116,"stargazers_count":19,"open_issues_count":1,"forks_count":2,"subscribers_count":4,"default_branch":"main","last_synced_at":"2025-04-12T00:13:10.098Z","etag":null,"topics":["bioinformatics","biomedical-data-science","perturbation-analysis","sccrispr-seq","scrna-seq","single-cell","snakemake","snrna-seq","visualization","workflow"],"latest_commit_sha":null,"homepage":"https://epigen.github.io/mixscape_seurat/","language":"R","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"mit","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/epigen.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":"CITATION.cff","codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null,"zenodo":null}},"created_at":"2022-04-14T14:26:34.000Z","updated_at":"2025-04-09T07:04:57.000Z","dependencies_parsed_at":null,"dependency_job_id":"98d30a01-8b07-4a97-a485-d228ca1a5711","html_url":"https://github.com/epigen/mixscape_seurat","commit_stats":{"total_commits":30,"total_committers":2,"mean_commits":15.0,"dds":0.2666666666666667,"last_synced_commit":"f24212075e5c4e038028617bd42342e6f3f7518a"},"previous_names":[],"tags_count":4,"template":false,"template_full_name":null,"repository_url":"https://repos.
ecosyste.ms/api/v1/hosts/GitHub/repositories/epigen%2Fmixscape_seurat","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/epigen%2Fmixscape_seurat/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/epigen%2Fmixscape_seurat/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/epigen%2Fmixscape_seurat/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/epigen","download_url":"https://codeload.github.com/epigen/mixscape_seurat/tar.gz/refs/heads/main","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":249063511,"owners_count":21206920,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["bioinformatics","biomedical-data-science","perturbation-analysis","sccrispr-seq","scrna-seq","single-cell","snakemake","snrna-seq","visualization","workflow"],"created_at":"2025-04-15T11:41:49.045Z","updated_at":"2025-08-22T00:09:14.923Z","avatar_url":"https://github.com/epigen.png","language":"R","funding_links":[],"categories":[],"sub_categories":[],"readme":"[![MrBiomics](https://img.shields.io/badge/MrBiomics-red)](https://github.com/epigen/MrBiomics/)\n[![DOI](https://zenodo.org/badge/481635018.svg)](https://zenodo.org/badge/latestdoi/481635018)\n[![](https://tokei.rs/b1/github/epigen/mixscape_seurat?category=code)]() \n[![](https://tokei.rs/b1/github/epigen/mixscape_seurat?category=files)]()\n[![GitHub license](https://img.shields.io/github/license/epigen/mixscape_seurat)](https://github.com/epigen/mixscape_seurat/blob/master/LICENSE)\n![GitHub 
Release](https://img.shields.io/github/v/release/epigen/mixscape_seurat)\n[![Snakemake](https://img.shields.io/badge/Snakemake-\u003e=8.20.1-green)](https://snakemake.readthedocs.io/en/stable/)\n\n# scCRISPR-seq Perturbation Analysis Workflow using Seurat's Mixscape\nA [Snakemake 8](https://snakemake.readthedocs.io/en/stable/) workflow for performing perturbation analyses of pooled (multimodal) CRISPR screens with scRNA-seq read-out (scCRISPR-seq, CROP-seq, Perturb-seq) powered by the R package [Seurat's](https://satijalab.org/seurat/index.html) method [Mixscape](https://satijalab.org/seurat/articles/mixscape_vignette.html).\n\n\u003e [!NOTE]  \n\u003e This workflow adheres to the module specifications of [MrBiomics](https://github.com/epigen/MrBiomics), an effort to augment research by modularizing (biomedical) data science. For more details, instructions, and modules check out the project's repository.\n\u003e\n\u003e ⭐️ **Star and share modules you find valuable** 📤 - help others discover them, and guide our future work!\n\n\u003e [!IMPORTANT]  \n\u003e **If you use this workflow in a publication, please don't forget to give credit to the authors by citing it using this DOI [10.5281/zenodo.8424761](https://doi.org/10.5281/zenodo.8424761).**\n\n![Workflow Rulegraph](./workflow/dags/rulegraph.svg)\n\n# Authors\n- [Stephan Reichl](https://github.com/sreichl)\n- [Christoph Bock](https://github.com/chrbock)\n\n\n# 💿 Software\nThis project wouldn't be possible without the following software and its dependencies:\n\n| Software       | Reference (DOI)                                   |\n| :------------: | :-----------------------------------------------: |\n| data.table     | https://r-datatable.com                           |\n| ggplot2        | https://ggplot2.tidyverse.org/                    |\n| Mixscape       | https://doi.org/10.1038/s41588-021-00778-2        |\n| mixtools       | https://CRAN.R-project.org/package=mixtools       |\n| patchwork      | 
https://CRAN.R-project.org/package=patchwork      |\n| Seurat         | https://doi.org/10.1016/j.cell.2021.04.048        |\n| Snakemake      | https://doi.org/10.12688/f1000research.29032.2    |\n\n# 🔬 Methods\nThis is a template for the Methods section of a scientific publication and is intended to serve as a starting point. Only retain paragraphs relevant to your analysis. References [ref] to the respective publications are curated in the software table above. Versions (ver) have to be read out from the respective conda environment specifications (`workflow/envs/*.yaml` files) or post-execution in the result directory (`mixscape_seurat/envs/*.yaml`). Parameters that have to be adapted depending on the data or workflow configurations are denoted in square brackets, e.g., [X].\n\nThe outlined analyses were performed using the R package Seurat (ver) [ref] unless stated otherwise.\n\n**Mixscape**. We applied the Mixscape workflow [ref], implemented in Seurat, on each [sample] separately as well as all [samples] simultaneously to identify perturbed cells compared to non-targeting (NT) guide RNA (gRNA) assigned cells. Briefly, cells putatively assigned to a gRNA and respective knockout (KO) target gene in conjunction with NT cells were used to calculate cell-wise perturbation signatures by using Seurat::CalcPerturbSig to subtract the average expression profile of the [n_neighbors] closest NT cells in [ndims]-dimensional PCA space. Using Seurat::RunMixscape, with a log2(fold change) threshold of [lfc_th] and a minimum of [min_de_genes] differentially expressed genes, cells were classified as perturbed or non-perturbed using posterior probabilities of an expectation-maximization (EM) algorithm for mixtures of univariate normals, assuming each putatively annotated target gene group is a mixture of two Gaussian distributions (perturbed signal and non-perturbed background). \n\n**Visualizations**. 
Statistics of the Mixscape classification of perturbed cells versus cells with no detectable perturbation were visualized on a target gene and gRNA basis using bar plots. Perturbation scores of cells, split by their Mixscape classification, were visualized as density plots. Posterior probability values of non-perturbed and perturbed cells were visualized as violin plots using the Seurat function VlnPlot. Perturbation scores and posterior probabilities were additionally plotted split by replicates [split_by_col] and experiment conditions [split_by_col]. Surface protein expression measured by Antibody Capture technologies was visualized as violin plots, split by perturbation classification of cells, using the Seurat function VlnPlot.\n\n**Linear discriminant analysis (LDA)**. LDA was applied to the perturbation signatures of all perturbed and NT cells using Seurat::MixscapeLDA with [npcs] principal components per KO class to find the most discriminative subspace, given the KO/NT classes. The data were projected into this subspace and visualized in two dimensions using UMAP with Seurat::RunUMAP.\n\n**The analysis and visualizations described here were performed using a publicly available Snakemake (ver) [ref] workflow [10.5281/zenodo.8424761](https://doi.org/10.5281/zenodo.8424761).**\n\n# 🚀 Features\nThe workflow performs all steps of the [Mixscape Vignette](https://satijalab.org/seurat/articles/mixscape_vignette.html) on all samples in the annotation file according to the parametrization in the config file.\n- Calculation of local perturbation signatures (`{analysis}/`)\n  - all and filtered (i.e., only perturbed cells) perturbation signatures (`{ALL|FILTERED}_PRTB_data.csv`).\n- Mixscape classification of perturbed cells versus cells with no detectable perturbation (`{analysis}/{ALL|FILTERED}_*`)\n  - Mixscape classification statistics (`{analysis}/mixscape_stats.csv`).\n- Visualization of Mixscape results (`{analysis}/plots/`)\n  - Statistics of the Mixscape classification on a target gene and guide RNA basis 
as bar plots (`stats/{KO}.png`).\n  - Perturbation scores of cells split by their Mixscape classification as density plots (`PerturbScore/{KO}_{split}.png`).\n  - Posterior probability values in non-perturbed and perturbed cells as violin plots (`PosteriorProbability/{KO}_{split}.png`).\n  - (optional) if Antibody Capture was used: Surface protein expression measurements split by perturbation classification of cells as violin plots (`{Antibody_Capture_flag}_expression/{protein}.png`).\n- Analysis of perturbation responses with Linear Discriminant Analysis (LDA)\n  - LDA components (`LDA_data.csv`).\n  - 2D visualization using UMAP as scatter plot (`{analysis}/plots/LDA_UMAP`).\n\n# 🛠️ Usage\nRead the [Mixscape Vignette](https://satijalab.org/seurat/articles/mixscape_vignette.html).\n\n# ⚙️ Configuration\nDetailed specifications can be found in [./config/README.md](./config/README.md).\n\n# 📖 Example\nExplore a detailed example showcasing module usage and up-/downstream analysis in our comprehensive end-to-end [MrBiomics Recipe](https://github.com/epigen/MrBiomics?tab=readme-ov-file#-recipes) for [scCRISPR-seq Analysis](https://github.com/epigen/MrBiomics/wiki/scCRISPR%E2%80%90seq-Analysis-Recipe), including data, configuration, annotation and results.\n\n# 🔗 Links\n- [GitHub Repository](https://github.com/epigen/mixscape_seurat/)\n- [GitHub Page](https://epigen.github.io/mixscape_seurat/)\n- [Zenodo Repository](https://doi.org/10.5281/zenodo.8424761)\n- [Snakemake Workflow Catalog Entry](https://snakemake.github.io/snakemake-workflow-catalog?usage=epigen/mixscape_seurat)\n\n# 📚 Resources\n- Recommended compatible [MrBiomics](https://github.com/epigen/MrBiomics) modules for\n  - upstream processing:\n    - [scRNA-seq Data Processing \u0026 Visualization](https://github.com/epigen/scrnaseq_processing_seurat) for processing (multimodal) single-cell transcriptome data.\n  - downstream analyses:\n    - [Unsupervised 
Analysis](https://github.com/epigen/unsupervised_analysis) to understand and visualize similarities and variations between cells/samples, including dimensionality reduction and cluster analysis. Useful for all tabular data including single-cell and bulk sequencing data.\n    - [Differential Analysis using Seurat](https://github.com/epigen/dea_seurat) to identify and visualize statistically significantly different features (e.g., genes or proteins) between groups.\n    - [Enrichment Analysis](https://github.com/epigen/enrichment_analysis) for biomedical interpretation of (differential) analysis results using prior knowledge.\n- Mixscape publication: [Papalexi et al. (2021) Nature Genetics - \"Characterizing the molecular regulation of inhibitory immune checkpoints with multimodal single-cell screens.\"](https://doi.org/10.1038/s41588-021-00778-2).\n\n# 📑 Publications\nThe following publications successfully used this module for their analyses.\n- [FirstAuthors et al. (202X) Journal Name - Paper Title.](https://doi.org/10.XXX/XXXX)\n- ...\n\n# ⭐ Star History\n\n[![Star History Chart](https://api.star-history.com/svg?repos=epigen/mixscape_seurat\u0026type=Date)](https://star-history.com/#epigen/mixscape_seurat\u0026Date)\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fepigen%2Fmixscape_seurat","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fepigen%2Fmixscape_seurat","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fepigen%2Fmixscape_seurat/lists"}