Ecosyste.ms: Awesome

An open API service indexing awesome lists of open source software.

awesome-complex-trait-genetics

A list of awesome tools for complex trait genetics.
https://github.com/michelnivard/awesome-complex-trait-genetics

  • genetic architecture

      • LDSC
      • LDSR - memory summary statistics.
      • GCTB - Developed to simultaneously estimate the joint effects of all SNPs and the genetic architecture parameters for a complex trait. There are now extensions to estimate the same Bayesian linear model parameters based on summary data.
      • GCTA - GCTA (Genome-wide Complex Trait Analysis) is a software package initially developed to estimate the proportion of phenotypic variance explained by all genome-wide SNPs for a complex trait.
      • RHE-mc/GENIE - RHE-mc is a method to estimate the proportion of phenotypic variance explained by SNPs, and GENIE extends this to model GxE effects.
      • BOLT-LMM/BOLT-REML - The BOLT-LMM software package currently consists of two main algorithms: the BOLT-LMM algorithm for mixed model association testing, and the BOLT-REML algorithm for variance components analysis (i.e., partitioning of SNP-heritability and estimation of genetic correlations).
      • LDAK/SumHer/PCGC - PCGC (phenotype-correlation genotype-correlation) Regression is an alternative to REML when estimating heritability for binary traits (i.e., diseases).
    • Univariate models (heritability/polygenicity/stratified/geneset enrichment etc)

      • i-LDSC - Interaction-LD score (i-LDSC) regression: models an additional score that measures the amount of non-additive genetic variation that is tagged by each variant in the data.
      • ACLR
      • HAMSTA
      • MAGMA - Gene and gene-set analysis of GWAS data.
  • Genetic correlation (LD score derivatives/extensions)

      • HDL - High-Definition Likelihood (HDL) is a likelihood-based method for estimating genetic correlation using GWAS summary statistics. Compared to LD Score regression (LDSC), it reduces the variance of a genetic correlation estimate by about 60%.
    • Stratified/local genetic correlations

    • Ancestry aware Genetic correlations:

      • s-ldxr - S-LDXR is a method to stratify squared trans-ethnic genetic correlation by genomic annotations from GWAS summary statistics.
      • Popcorn
      • mama - Python-based command line tool that meta-analyzes GWAS summary statistics generated from distinct ancestry groups.
  • Model trait relationships beyond correlation

    • Genetic SEM/Factor models

    • Two sample Mendelian Randomisation

    • MR/Genetic architecture hybrid models

      • lhcMR - Bi-directional causal estimation between a pair of traits, while accounting for the presence of a potential heritable confounder acting on the pair.
      • CAUSE
      • LCV
      • MR-cML - A Mendelian randomization method (cML-MA) applicable to GWAS summary data.
    • Mendelian randomization in _cis_

  • Colocalisation/finemapping of causal variants

      • coloc
      • fastENLOC
      • FINEMAP - Fine-mapping for genome-wide association studies.
      • polyfun - **PolyFun** for functionally-informed fine-mapping, **PolyLoc** for polygenic localization of complex trait heritability.
      • OPERA - Analysis based on summary-level data.
      • SuSiEx - Cross-population finemapping using summary statistics and LD reference panels.
      • SharePro - An effect group-level approach to integrate LD modeling and colocalization assessment to account for multiple causal variants in colocalization analysis.
  • gene-level analysis (TWAS)

      • FUSION - Transcriptome-wide and regulome-wide association studies (TWAS and RWAS).
      • FOCUS - FOCUS (Fine-mapping Of CaUsal gene Sets) is software to fine-map transcriptome-wide association study statistics at genomic risk regions.
  • Simulation

  • Genomic data wrangling

      • HAIL - Open-source, general-purpose, Python-based data analysis tool with additional data types and methods for working with genomic data.
      • bigsnpr
      • ukbrapR - ukbrapR ('U-K-B-wrapper') is an R package for working in the UK Biobank Research Analysis Platform (RAP). The aim is to make it quicker, easier, and more reproducible.
      • MungeSumstats
      • tidyGWAS
      • gwasRtools
      • qgg - **qgg** handles large-scale genetic and phenotypic data, while **gact** is designed for establishing and populating a comprehensive database focused on genomic associations with complex traits. Provides R implementations of popular follow-up analyses (LD score regression, MAGMA, VEGAS, PoPS, etc.).
      • bcftools
      • GenomicRanges
      • bedtools - A Swiss-army knife of tools for a wide range of genomics analysis tasks. A very fast and easy way to intersect, merge, count, complement, and shuffle genomic intervals from multiple files in widely used genomic file formats such as BAM, BED, GFF/GTF, and VCF.
  • Polygenic scores

      • PRSice
      • LDpred2 - LDpred-2 is one of the dedicated PRS programs; it is an R package that uses a Bayesian approach to polygenic risk scoring.
      • GCTB - Developed to simultaneously estimate the joint effects of all SNPs and the genetic architecture parameters for a complex trait.
  • GWAS result repositories (preferably with an API)

      • ieugwasr
      • GWAScatalog API - As of …-11-20, the GWAS Catalog contains 7083 publications, 692444 top associations and 96947 full summary statistics.
      • GWAS atlas - Gene-based results, SNP heritability and genetic correlations with other GWAS in the database.
      • S4 programs
  • Online tools

      • gnomAD browser
      • Open Targets Platform
      • genebass - A resource of exome-based association statistics, made available to the public. The dataset encompasses 4,529 phenotypes with gene-based and single-variant testing across 394,841 individuals with exome sequence data from the UK Biobank.
      • All by All - Gene-based and single-variant associations across nearly 250,000 whole genome sequences.