{"id":13958686,"url":"https://github.com/immunogenomics/symphony","last_synced_at":"2025-06-28T22:42:49.114Z","repository":{"id":44762210,"uuid":"301844874","full_name":"immunogenomics/symphony","owner":"immunogenomics","description":"Efficient and precise single-cell reference atlas mapping with Symphony","archived":false,"fork":false,"pushed_at":"2023-02-05T20:27:08.000Z","size":129804,"stargazers_count":105,"open_issues_count":17,"forks_count":23,"subscribers_count":14,"default_branch":"main","last_synced_at":"2025-06-05T21:09:43.526Z","etag":null,"topics":["bioinfo","mapping","r","scrna-seq"],"latest_commit_sha":null,"homepage":"","language":"Jupyter Notebook","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"gpl-3.0","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/immunogenomics.png","metadata":{"files":{"readme":"README.Rmd","changelog":null,"contributing":null,"funding":null,"license":"LICENSE.md","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null}},"created_at":"2020-10-06T20:19:49.000Z","updated_at":"2025-06-05T03:06:52.000Z","dependencies_parsed_at":"2023-02-19T02:00:29.885Z","dependency_job_id":null,"html_url":"https://github.com/immunogenomics/symphony","commit_stats":null,"previous_names":[],"tags_count":0,"template":false,"template_full_name":null,"purl":"pkg:github/immunogenomics/symphony","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/immunogenomics%2Fsymphony","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/immunogenomics%2Fsymphony/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/immunogenomics%2Fsymphony/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/immunogenomics%2Fsymphony/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/immunogenomics","dow
nload_url":"https://codeload.github.com/immunogenomics/symphony/tar.gz/refs/heads/main","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/immunogenomics%2Fsymphony/sbom","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":262508266,"owners_count":23321978,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["bioinfo","mapping","r","scrna-seq"],"created_at":"2024-08-08T13:01:48.414Z","updated_at":"2025-06-28T22:42:49.107Z","avatar_url":"https://github.com/immunogenomics.png","language":"Jupyter Notebook","funding_links":[],"categories":["其他_生物医药"],"sub_categories":["网络服务_其他"],"readme":"---\noutput: github_document\n---\n\n\u003c!-- README.md is generated from README.Rmd. Please edit that file --\u003e\n\n```{r, include = FALSE}\nknitr::opts_chunk$set(\n  collapse = TRUE,\n  comment = \"#\u003e\",\n  fig.path = \"man/figures/README-\",\n  fig.align = 'right',\n  out.width = \"100%\"\n)\n```\n\n# Symphony \u003cimg src=\"man/figures/symphony_logo.png\" alt=\"logo\" width=\"181\" align=\"right\"/\u003e\n\n\u003c!-- badges: start --\u003e\n\u003c!-- badges: end --\u003e\n\nEfficient and precise single-cell reference atlas mapping with Symphony\n\n[Kang et al. 
(Nat Comm, 2021)](https://www.nature.com/articles/s41467-021-25957-x)\n\nFor Python users, check out the [symphonypy](https://github.com/potulabe/symphonypy) package by Kseniya Petrova and Sergey Isaev.\n\n\n# Installation\n\nSymphony is available on CRAN:\n``` r\ninstall.packages(\"symphony\")\n```\n\nTo install the development version of Symphony from [GitHub](https://github.com/), use:\n\n``` r\n# install.packages(\"devtools\")\ndevtools::install_github(\"immunogenomics/symphony\")\n```\nInstallation should take \u003c10 mins (barring major issues). See installation notes below.\n\n\n# Usage/Demos\n## Tutorials\n\n* Check out the [quick start (\u003c5 min) PBMCs tutorial](https://github.com/immunogenomics/symphony/blob/main/vignettes/pbmcs_tutorial.ipynb/) for an example of how to build a custom reference and map to it.\n\n* Check out the [pre-built references tutorial](https://github.com/immunogenomics/symphony/blob/main/vignettes/prebuilt_references_tutorial.ipynb) for examples of how to map to provided Symphony references pre-built from the datasets featured in the manuscript.\n\n## Downloading pre-built references: \n\n* You can download pre-built references from [Zenodo](https://zenodo.org/record/5090425).\n\n\n## Reference building\n\n### Option 1: Starting from existing Harmony object\n\nThis function compresses an existing Harmony object into a Symphony reference that enables query mapping. We recommend this option for most users since it allows your code to be more modular and flexible.\n\n```{r eval=FALSE}\n\n# Run Harmony to integrate the reference cells\nref_harmObj = harmony::HarmonyMatrix(\n        data_mat = t(Z_pca_ref),   # starting embedding (e.g. 
PCA, CCA) of cells\n        meta_data = ref_metadata,  # dataframe with cell metadata\n        theta = c(2),              # cluster diversity enforcement\n        vars_use = c('donor'),     # variable to integrate out\n        nclust = 100,              # number of clusters in Harmony model\n        max.iter.harmony = 10,     # max iterations of Harmony\n        return_object = TRUE,      # set to TRUE to return the full Harmony object\n        do_pca = FALSE             # do not recompute PCs\n)\n\n# Build Symphony reference\nreference = buildReferenceFromHarmonyObj(\n        ref_harmObj,            # output object from HarmonyMatrix()\n        ref_metadata,           # dataframe with cell metadata\n        vargenes_means_sds,     # gene names, means, and std devs for scaling\n        loadings,               # genes x PCs\n        verbose = TRUE,         # display output?\n        do_umap = TRUE,         # run UMAP and save UMAP model to file?\n        save_uwot_path = '/absolute/path/uwot_model_1' # filepath to save UMAP model\n)\n```\nNote that `vargenes_means_sds` requires column names `c('symbol', 'mean', 'stddev')` (see [tutorial example](https://github.com/immunogenomics/symphony/blob/main/vignettes/pbmcs_tutorial.ipynb/)). 
\n\n### Option 2: Starting from reference genes by cells matrix\n\nThis function performs all steps of the reference building pipeline including variable gene selection, scaling, PCA, Harmony, and Symphony compression.\n\n```{r eval=FALSE}\n# Build reference\nreference = symphony::buildReference(\n    ref_exp,                   # reference expression (genes by cells)\n    ref_metadata,              # reference metadata (cells x attributes)\n    vars = c('donor'),         # variable(s) to integrate over\n    K = 100,                   # number of Harmony soft clusters\n    verbose = TRUE,            # display verbose output\n    do_umap = TRUE,            # run UMAP and save UMAP model to file\n    do_normalize = FALSE,      # perform log(CP10k) normalization on reference expression\n    vargenes_method = 'vst',   # variable gene selection method: 'vst' or 'mvp'\n    vargenes_groups = 'donor', # metadata column specifying groups for variable gene selection within each group\n    topn = 2000,               # number of variable genes (per group)\n    theta = 2,                 # Harmony parameter(s) for diversity term\n    d = 20,                    # number of dimensions for PCA\n    save_uwot_path = 'path/to/uwot_model_1', # file path to save uwot UMAP model\n    additional_genes = NULL    # vector of any additional genes to force include\n)\n\n```\n\n## Query mapping\nOnce you have a prebuilt reference (e.g. 
loaded from a saved .rds R object), you can directly map cells from a new query dataset onto it starting from query gene expression.\n\n```{r eval=FALSE}\n# Map query\nquery = mapQuery(query_exp,             # query gene expression (genes x cells)\n                 query_metadata,        # query metadata (cells x attributes)\n                 reference,             # Symphony reference object\n                 vars = NULL,           # Query batch variables to harmonize over (NULL treats query as one batch)\n                 do_normalize = FALSE,  # perform log(CP10k) normalization on query (set to FALSE if already normalized)\n                 do_umap = TRUE)        # project query cells into reference UMAP\n```\n\n`query$Z` contains the harmonized query feature embedding.\n\nIf your query itself has multiple sources of batch variation you would like to integrate over (e.g. technology, donors, species), you can specify them in the `vars` parameter: e.g. `vars = c('donor', 'technology')`\n\n# Installation notes\n## System requirements:\n\nSymphony has been successfully installed on Linux and Mac OS X using the devtools package to install from GitHub. 
\n\nDependencies:\n\n* R\u003e=3.6.x\n* RANN\n* data.table\n* irlba\n* stats\n* tibble\n* utils\n* uwot\n* Matrix\n* Rcpp\n* magrittr\n* methods\n* rlang\n* ggplot2\n* RColorBrewer\n* ggrastr\n* ggrepel\n\n\n## Troubleshooting:\n\n* You may need to install the latest version of devtools (because of the recent GitHub change from \"master\" to \"main\" terminology, which can cause `install_github` to fail).\n* You may also need to install the latest version of Harmony:\n\n``` r\ndevtools::install_github(\"immunogenomics/harmony\")\n```\n\nWe have been notified of the following installation errors regarding `systemfonts`, `textshaping`, and `ragg` (which are all required by `ggrastr`):\n```\n# error when installing systemfonts\nft_cache.h:9:10: fatal error: ft2build.h: No such file or directory\n\n# error when installing textshaping\nConfiguration failed to find the harfbuzz freetype2 fribidi library\n\n# error when installing ragg\n\u003cstdin\u003e:1:10: fatal error: ft2build.h: No such file or directory\n```\n\nThese errors are not inherent to the Symphony package and we cannot fix them directly. However, as a workaround, you can install `systemfonts`, `textshaping`, and `ragg` separately using `install.packages()` and specify the path to the required files (replacing `/path/to` below with the path to the appropriate `include` directory containing the files).\n\n```\n# fix to install systemfonts\nwithr::with_makevars(c(CPPFLAGS=\"-I/path/to/include/freetype2/\"), install.packages(\"systemfonts\"))\n\n# fix to install textshaping\nwithr::with_makevars(c(CPPFLAGS=\"-I/path/to/include/harfbuzz/ -I/path/to/include/fribidi/ -I/path/to/include/freetype2/\"), install.packages(\"textshaping\"))\n\n# fix to install ragg\nwithr::with_makevars(c(CPPFLAGS=\"-I/path/to/include/freetype2/\"), install.packages(\"ragg\"))\n\n```\n\n# Reproducing results from manuscript\nCode to reproduce Symphony results from the Kang et al. 
manuscript is available on [github.com/immunogenomics/symphony_reproducibility](https://github.com/immunogenomics/symphony_reproducibility).\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fimmunogenomics%2Fsymphony","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fimmunogenomics%2Fsymphony","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fimmunogenomics%2Fsymphony/lists"}