{"id":21962527,"url":"https://github.com/SingleR-inc/singler-py","last_synced_at":"2025-07-22T13:32:15.084Z","repository":{"id":191741187,"uuid":"685298232","full_name":"BiocPy/singler","owner":"BiocPy","description":"Python bindings to the SingleR algorithm","archived":false,"fork":false,"pushed_at":"2024-06-27T16:28:21.000Z","size":361,"stargazers_count":6,"open_issues_count":0,"forks_count":0,"subscribers_count":2,"default_branch":"master","last_synced_at":"2024-11-05T02:37:26.452Z","etag":null,"topics":[],"latest_commit_sha":null,"homepage":"https://biocpy.github.io/singler/","language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"mit","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/BiocPy.png","metadata":{"files":{"readme":"README.md","changelog":"CHANGELOG.md","contributing":"CONTRIBUTING.md","funding":null,"license":"LICENSE.txt","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":"AUTHORS.md","dei":null,"publiccode":null,"codemeta":null}},"created_at":"2023-08-30T23:41:56.000Z","updated_at":"2024-09-24T07:42:08.000Z","dependencies_parsed_at":null,"dependency_job_id":"b76881af-135e-4c00-b04d-a53eaad27e0b","html_url":"https://github.com/BiocPy/singler","commit_stats":null,"previous_names":["biocpy/singler"],"tags_count":5,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/BiocPy%2Fsingler","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/BiocPy%2Fsingler/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/BiocPy%2Fsingler/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/BiocPy%2Fsingler/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/BiocPy","download_url":"https://cod
eload.github.com/BiocPy/singler/tar.gz/refs/heads/master","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":227101937,"owners_count":17731221,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":[],"created_at":"2024-11-29T10:43:15.641Z","updated_at":"2025-07-22T13:32:15.079Z","avatar_url":"https://github.com/BiocPy.png","language":"Python","funding_links":[],"categories":[],"sub_categories":[],"readme":"\u003c!-- These are examples of badges you might want to add to your README:\n     please update the URLs accordingly\n\n[![Built Status](https://api.cirrus-ci.com/github/\u003cUSER\u003e/singler.svg?branch=main)](https://cirrus-ci.com/github/\u003cUSER\u003e/singler)\n[![ReadTheDocs](https://readthedocs.org/projects/singler/badge/?version=latest)](https://singler.readthedocs.io/en/stable/)\n[![Coveralls](https://img.shields.io/coveralls/github/\u003cUSER\u003e/singler/main.svg)](https://coveralls.io/r/\u003cUSER\u003e/singler)\n[![Conda-Forge](https://img.shields.io/conda/vn/conda-forge/singler.svg)](https://anaconda.org/conda-forge/singler)\n[![Twitter](https://img.shields.io/twitter/url/http/shields.io.svg?style=social\u0026label=Twitter)](https://twitter.com/singler)\n--\u003e\n\n[![Project generated with PyScaffold](https://img.shields.io/badge/-PyScaffold-005CA0?logo=pyscaffold)](https://pyscaffold.org/)\n[![PyPI-Server](https://img.shields.io/pypi/v/singler.svg)](https://pypi.org/project/singler/)\n[![Monthly Downloads](https://static.pepy.tech/badge/singler/month)](https://pepy.tech/project/singler)\n![Unit 
tests](https://github.com/SingleR-inc/singler-py/actions/workflows/pypi-test.yml/badge.svg)\n\n# Tinder for single-cell data\n\n## Overview\n\nThis package provides Python bindings to the [C++ implementation](https://github.com/SingleR-inc/singlepp) of the [SingleR method](https://github.com/SingleR-inc/SingleR),\noriginally developed by [Aran et al. (2019)](https://www.nature.com/articles/s41590-018-0276-y).\nIt is designed to annotate cell types by matching cells to known references based on their expression profiles.\nSo kind of like Tinder, but for cells.\n\n## Quick start\n\nFirstly, let's load in the famous PBMC 4k dataset from 10X Genomics:\n\n```python\nimport singlecellexperiment as sce\ndata = sce.read_tenx_h5(\"pbmc4k-tenx.h5\", realize_assays=True)\nmat = data.assay(\"counts\")\nfeatures = [str(x) for x in data.row_data[\"name\"]]\n```\n\nOr, if you are coming from the scverse ecosystem (i.e., `AnnData`), simply read the object as a `SingleCellExperiment` and extract the matrix and the features.\nRead more on [SingleCellExperiment here](https://biocpy.github.io/tutorial/chapters/experiments/single_cell_experiment.html).\n\n\n```python\nimport singlecellexperiment as sce\n\nsce_adata = sce.SingleCellExperiment.from_anndata(adata)\n\n# or from an h5ad file\nsce_h5ad = sce.read_h5ad(\"tests/data/adata.h5ad\")\n```\n\nNow, we fetch the Blueprint/ENCODE reference:\n\n```python\nimport celldex\n\nref_data = celldex.fetch_reference(\"blueprint_encode\", \"2024-02-26\", realize_assays=True)\n```\n\nWe can annotate each cell in `mat` with the reference:\n\n```python\nimport singler\nresults = singler.annotate_single(\n    test_data = mat,\n    test_features = features,\n    ref_data = ref_data,\n    ref_labels = ref_data.get_column_data().column(\"label.main\"),\n)\n```\n\nThe `results` data frame contains all of the assignments and the scores for each label:\n\n```python\nresults.column(\"best\")\n## ['Monocytes',\n##  'Monocytes',\n##  'Monocytes',\n##  'CD8+ 
T-cells',\n##  'CD4+ T-cells',\n##  'CD8+ T-cells',\n##  'Monocytes',\n##  'Monocytes',\n##  'B-cells',\n##  ...\n## ]\n\nresults.column(\"scores\").column(\"Macrophages\")\n## array([0.35935275, 0.40833545, 0.37430726, ..., 0.32135929, 0.29728435,\n##        0.40208581])\n```\n\n## Calling low-level functions\n\nThe `annotate_single()` function is a convenient wrapper around a number of lower-level functions in **singler**.\nAdvanced users may prefer to build the reference and run the classification separately.\nThis allows us to re-use the same reference for multiple datasets without repeating the build step.\n\n```python\nbuilt = singler.train_single(\n    ref_data = ref_data.assay(\"logcounts\"),\n    ref_labels = ref_data.get_column_data().column(\"label.main\"),\n    ref_features = ref_data.get_row_names(),\n    test_features = features,\n)\n```\n\nAnd finally, we apply the pre-built reference to the test dataset to obtain our label assignments.\nThis can be repeated with different datasets that have the same features as `test_features=`.\n\n```python\noutput = singler.classify_single(mat, ref_prebuilt=built)\n```\n\n    ## output\n    BiocFrame with 4340 rows and 3 columns\n                best                                   scores                delta\n            \u003clist\u003e                              \u003cBiocFrame\u003e   \u003cndarray[float64]\u003e\n    [0] Monocytes 0.33265560369962943:0.407117403330602...  0.40706830113982534\n    [1] Monocytes 0.4078771641637374:0.4783396310685646...  0.07000418564184802\n    [2] Monocytes 0.3517036021728629:0.4076971245524348...  0.30997293412307647\n                ...                                      ...                  ...\n    [4337]  NK cells 0.3472631136865701:0.3937898240670208...  0.09640242155786138\n    [4338]   B-cells 0.26974632191999887:0.334862058137758... 0.061215905058676856\n    [4339] Monocytes 0.39390119034537324:0.468867490667427...  
0.06678168346812047\n\n## Integrating labels across references\n\nWe can use annotations from multiple references through the `annotate_integrated()` function:\n\n```python\nimport singler\nimport celldex\n\nblueprint_ref = celldex.fetch_reference(\"blueprint_encode\", \"2024-02-26\", realize_assays=True)\nimmune_cell_ref = celldex.fetch_reference(\"dice\", \"2024-02-26\", realize_assays=True)\n\nsingle_results, integrated = singler.annotate_integrated(\n    mat,\n    ref_data = [\n        blueprint_ref,\n        immune_cell_ref\n    ],\n    ref_labels = [\n        blueprint_ref.get_column_data().column(\"label.main\"),\n        immune_cell_ref.get_column_data().column(\"label.main\")\n    ],\n    test_features = features,\n    num_threads = 6\n)\n```\n\nThis annotates the test dataset against each reference individually to obtain the best per-reference label,\nand then it compares across references to find the best label from all references.\n\n```python\nintegrated.column(\"best_label\")\n## ['Monocytes', \n##  'Monocytes',\n##  'Monocytes',\n##  'CD8+ T-cells',\n##  'CD4+ T-cells',\n##  'CD8+ T-cells',\n##  'Monocytes',\n##  'Monocytes',\n##  ...\n## ]\n\nintegrated.column(\"best_reference\")\n## [0,\n##  0, \n##  0,\n##  0,\n##  0,\n##  0,\n##  0,\n##  0,\n##  ...\n## ]\n```\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2FSingleR-inc%2Fsingler-py","html_url":"https://awesome.ecosyste.ms/projects/github.com%2FSingleR-inc%2Fsingler-py","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2FSingleR-inc%2Fsingler-py/lists"}