{"id":17223363,"url":"https://github.com/slowkow/snpsea","last_synced_at":"2025-06-20T22:42:37.441Z","repository":{"id":12307320,"uuid":"14939642","full_name":"slowkow/snpsea","owner":"slowkow","description":":bar_chart: Identify cell types and pathways affected by genetic risk loci.","archived":false,"fork":false,"pushed_at":"2024-02-29T00:11:51.000Z","size":21513,"stargazers_count":37,"open_issues_count":4,"forks_count":9,"subscribers_count":2,"default_branch":"master","last_synced_at":"2025-03-27T14:21:52.947Z","etag":null,"topics":["algorithm","bioinformatics","enrichment","gene","gene-sets","gwas","risk-loci","tissue"],"latest_commit_sha":null,"homepage":"http://www.broadinstitute.org/mpg/snpsea/","language":"C++","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"other","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/slowkow.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":"CITATION","codeowners":null,"security":null,"support":null}},"created_at":"2013-12-05T00:13:14.000Z","updated_at":"2025-03-01T14:09:13.000Z","dependencies_parsed_at":"2022-07-12T15:04:45.463Z","dependency_job_id":null,"html_url":"https://github.com/slowkow/snpsea","commit_stats":null,"previous_names":[],"tags_count":3,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/slowkow%2Fsnpsea","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/slowkow%2Fsnpsea/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/slowkow%2Fsnpsea/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/slowkow%2Fsnpsea/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/slowkow","download_url":"https://codeload.github.com/slowkow/snpsea/tar.gz/refs/heads/master","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":248800048,"owners_count":21163404,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["algorithm","bioinformatics","enrichment","gene","gene-sets","gwas","risk-loci","tissue"],"created_at":"2024-10-15T04:08:07.808Z","updated_at":"2025-04-14T00:22:11.417Z","avatar_url":"https://github.com/slowkow.png","language":"C++","readme":"SNPsea: an algorithm to identify cell types, tissues, and pathways affected by risk loci\n========================================================================================\n\n**Home Page:** \u003chttp://www.broadinstitute.org/mpg/snpsea\u003e\n\n**Documentation:** [HTML] | [PDF] | [Epub]\n\n**Executable:** [snpsea-v1.0.3.tar.gz][exec]\n\n**Data:** [SNPsea_data_20140520.zip][data]\n\n**License:** [GNU GPLv3][license]\n\n\nCitation\n--------\n\nIf you benefit from this method, please cite:\n\n\u003e Slowikowski, K. et al. **SNPsea: an algorithm to identify cell types,\n\u003e tissues, and pathways affected by risk loci.** Bioinformatics (2014).\n\u003e doi:[10.1093/bioinformatics/btu326][Slowikowski2014]\n\nSee the first description of the algorithm and additional examples here:\n\n\u003e Hu, X. et al. *Integrating autoimmune risk loci with gene-expression data\n\u003e identifies specific pathogenic immune cell subsets.* The American Journal\n\u003e of Human Genetics 89, 496–506 (2011). [PubMed][Hu2011]\n\n[Hu2011]: http://www.ncbi.nlm.nih.gov/pubmed/21963258\n[Slowikowski2014]: http://bioinformatics.oxfordjournals.org/content/30/17/2496.full\n\n\nDescription\n-----------\n\nSNPsea is an algorithm to identify cell types and pathways likely to be\naffected by risk loci. It requires a list of SNP identifiers and a matrix of\ngenes and conditions.\n\nGenome-wide association studies (GWAS) have discovered multiple genomic loci\nassociated with risk for different types of disease. SNPsea provides a simple\nway to determine the types of cells influenced by genes in these risk loci.\n\nSuppose disease-associated alleles influence a small number of pathogenic cell\ntypes. We hypothesize that genes with critical functions in those cell types\nare likely to be within risk loci for that disease. We assume that a gene's\nspecificity to a cell type is a reasonable indicator of its importance to the\nunique function of that cell type.\n\nFirst, we identify the genes in linkage disequilibrium (LD) with the given\ntrait-associated SNPs and score the gene set for specificity to each cell\ntype. Next, we define a null distribution of scores for each cell type by\nsampling random SNP sets matched on the number of linked genes. Finally, we\nevaluate the significance of the original gene set's specificity by comparison\nto the null distributions: we calculate an exact permutation p-value.\n\nSNPsea is a general algorithm. You may provide your own:\n\n1. Continuous gene matrix with gene expression profiles (or other values).\n2. Binary gene annotation matrix with presence/absence 1/0 values.\n\nWe provide you with three expression matrices and one annotation matrix. See\nthe [Data][manualdata] section of the [Manual][HTML].\n\nThe columns of the matrix may be tissues, cell types, GO annotation codes, or\nother *conditions*. Continuous matrices *must* be normalized before running\nSNPsea: columns must be directly comparable to each other.\n\nExample\n-------\n\n![SNPsea results for RBC count-associated SNPs in the Gene Atlas.][example]\n\n[example]: https://github.com/slowkow/snpsea/blob/master/docs/figures/Red_blood_cell_count-Harst2012-45_SNPs-GeneAtlas2004-single-pvalues_barplot.png\n\nThe heatmap shows Pearson correlation coefficients between pairs of tissue\nexpression profiles. The blue bars show p-values. Statistically significant\np-values cross the Bonferroni multiple testing threshold (black line).\n\nWe identified *BM-CD71+Early Erythroid* as the cell type with most significant\nenrichment (P \u003c 2e-7) for cell type-specific gene expression relative to 78\nother tissues in the Gene Atlas ([Su *et al.* 2004][Su2004]).\n\nSNPsea tested the genes in linkage disequilibrium (LD) with 45 input SNPs\nassociated with count of red blood cells (P \u003c= 5e-8 in Europeans) ([Harst\n*et al.* 2012][Harst2012]). For each of the 79 cell types in the Gene Atlas,\nwe tested a maximum of 1e7 null SNP sets where each null SNP was matched to\nan input SNP on the number of genes in LD.\n\n[Harst2012]: http://www.ncbi.nlm.nih.gov/pubmed/23222517\n[Su2004]: http://www.ncbi.nlm.nih.gov/pubmed/15075390\n\nWe ran SNPsea like this:\n\n```bash\noptions=(\n    --snps              Red_blood_cell_count-Harst2012-45_SNPs.gwas\n    --gene-matrix       GeneAtlas2004.gct.gz\n    --gene-intervals    NCBIgenes2013.bed.gz\n    --snp-intervals     TGP2011.bed.gz\n    --null-snps         Lango2010.txt.gz\n    --out               out\n    --slop              10e3\n    --threads           8\n    --null-snpsets      0\n    --min-observations  100\n    --max-iterations    1e7\n)\nsnpsea ${options[*]}\n\n# Time elapsed: 2 minutes 36 seconds\n\n# Create the figure shown above:\nsnpsea-barplot out\n```\n\nContributing\n------------\n\nPlease [submit an issue][issues] to report bugs or ask questions.\n\nPlease contribute bug fixes or new features with a [pull request][pull] to this repository.\n\n[issues]: https://github.com/slowkow/snpsea/issues\n[pull]: https://help.github.com/articles/using-pull-requests/\n\n[license]: https://github.com/slowkow/snpsea/blob/master/LICENSE\n\n[exec]: https://github.com/slowkow/snpsea/archive/v1.0.3.tar.gz\n[data]: http://dx.doi.org/10.6084/m9.figshare.871430\n\n[HTML]: http://snpsea.readthedocs.org/en/latest/\n[manualdata]: http://snpsea.readthedocs.org/en/latest/data.html\n[PDF]: https://readthedocs.org/projects/snpsea/downloads/pdf/latest/\n[Epub]: https://readthedocs.org/projects/snpsea/downloads/epub/latest/\n\n","funding_links":[],"categories":[],"sub_categories":[],"project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fslowkow%2Fsnpsea","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fslowkow%2Fsnpsea","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fslowkow%2Fsnpsea/lists"}