https://github.com/biona001/ghostknockoffgwas
Knockoff-based analysis of GWAS summary statistics data
conditional-independence false-discovery-rate fdr genomics gwas knockoffs summary-statistics variable-selection
Last synced: 5 months ago
- Host: GitHub
- URL: https://github.com/biona001/ghostknockoffgwas
- Owner: biona001
- License: mit
- Created: 2023-05-20T21:43:57.000Z (over 2 years ago)
- Default Branch: main
- Last Pushed: 2025-06-05T03:24:05.000Z (6 months ago)
- Last Synced: 2025-06-05T06:14:47.439Z (6 months ago)
- Topics: conditional-independence, false-discovery-rate, fdr, genomics, gwas, knockoffs, summary-statistics, variable-selection
- Language: Julia
- Homepage:
- Size: 12.7 MB
- Stars: 8
- Watchers: 1
- Forks: 1
- Open Issues: 5
Metadata Files:
- Readme: README.md
- License: LICENSE
README
# GhostKnockoffGWAS
| **Documentation** | **Build Status** | **Code Coverage** |
|-------------------|------------------|--------------------|
| [Documentation (dev)](https://biona001.github.io/GhostKnockoffGWAS/dev/) | [CI](https://github.com/biona001/GhostKnockoffGWAS/actions) [Julia nightly](https://github.com/biona001/GhostKnockoffGWAS.jl/actions/workflows/JuliaNightly.yml) | [Codecov](https://codecov.io/gh/biona001/GhostKnockoffGWAS) |
This is a package for analyzing summary statistics data from [*genome-wide association studies (GWAS)*](https://en.wikipedia.org/wiki/Genome-wide_association_study) under the statistical *knockoff framework*. Compared to marginal association testing, which controls the family-wise error rate (FWER), the knockoff framework conducts conditional independence testing while controlling the false discovery rate (FDR). As a consequence, `GhostKnockoffGWAS` can be both more precise and more powerful than current state-of-the-art GWAS+fine-mapping methods. Detailed evaluations can be found in our [companion paper](https://www.biorxiv.org/content/10.1101/2024.02.28.582621v1).
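To give a feel for how FDR control works in a knockoff analysis, here is a minimal sketch of the generic knockoff+ filter of Barber and Candès (2015) for a single set of knockoffs. It is an illustration only: `knockoff_plus_threshold` is a hypothetical helper, not a function exported by `GhostKnockoffGWAS`, the statistics `W` are made-up numbers, and the package's own selection procedure may differ.

```julia
# Hypothetical helper (NOT part of GhostKnockoffGWAS): the knockoff+ threshold
# for feature statistics W, where a large positive W[j] suggests that
# variable j carries more signal than its knockoff copy.
function knockoff_plus_threshold(W::AbstractVector{<:Real}, q::Real)
    # Candidate thresholds are the nonzero magnitudes of W, smallest first.
    candidates = sort(unique(abs.(W[W .!= 0])))
    for t in candidates
        # Estimated false discovery proportion at threshold t.
        fdp_hat = (1 + count(w -> w <= -t, W)) / max(1, count(w -> w >= t, W))
        fdp_hat <= q && return t
    end
    return Inf  # no threshold meets the target FDR: select nothing
end

# Toy statistics (made up for illustration).
W = [5.0, 4.2, 3.9, 3.5, 3.1, 2.8, -0.4, 0.3, -0.2, 0.1]
tau = knockoff_plus_threshold(W, 0.2)      # target FDR of 20%
selected = findall(w -> w >= tau, W)       # indices of selected variables
```

The negative entries of `W` act as a built-in estimate of how many false positives sit above each threshold, which is what lets the filter target the FDR rather than the FWER.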
## Installation
GhostKnockoffGWAS is available:
+ as a **standalone binary**
+ as a regular Julia package
For more detailed installation instructions, please refer to the [documentation](https://biona001.github.io/GhostKnockoffGWAS/dev/man/examples/).
## New users
To get started, please refer to the [documentation](https://biona001.github.io/GhostKnockoffGWAS/dev).
In `GhostKnockoffGWAS`, the main working assumption is that we do not have access to individual-level genotype or phenotype data. Rather, for each SNP, we have its Z-score with respect to some phenotype from a GWAS, and access to LD (linkage disequilibrium) data. The user is expected to supply the Z-scores, while we supply pre-processed LD files freely downloadable from the cloud.
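If your GWAS summary statistics report effect sizes and standard errors rather than Z-scores, the Z-scores can be derived as the usual Wald statistic. The sketch below assumes hypothetical vectors `beta` and `se` with made-up values; consult the documentation for the input format the package actually expects.

```julia
# Hypothetical summary statistics for three SNPs (values are made up).
beta = [0.12, -0.05, 0.30]   # estimated effect sizes
se   = [0.04,  0.05, 0.06]   # standard errors of those estimates

# Wald Z-score for each SNP: effect size divided by its standard error.
zscores = beta ./ se
```

Make sure the sign of each Z-score refers to the same effect allele as the LD reference data, since a flipped allele flips the sign.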
## Advantages/disadvantages of GhostKnockoffGWAS
Compared to existing knockoff methods for GWAS, the main advantages of GhostKnockoffGWAS are (1) its ease of use and (2) its computational efficiency. The only user-provided input is marginal Z-scores. Computationally, running a knockoff-based GWAS pipeline took approximately 15 minutes on 650,000 SNPs. The main limitation of GhostKnockoffGWAS is that it relies on the availability of pre-processed LD files suitable for the user's target samples.
## Bug fixes and user support
If you encounter a bug or need user support, please open a new issue on GitHub. Please provide as much detail as possible in bug reports, ideally a reproducible sequence of code that leads to the error.
PRs and feature requests are welcome!
## Citation
If you use `GhostKnockoffGWAS` in your research, please cite the following references:
> He Z, Chu BB, Yang J, Gu J, Chen Z, Liu L, Morrison T, Bellow M, Qi X, Hejazi N, Mathur M, Le Guen Y, Tang H, Hastie T, Ionita-Laza I, Sabatti C, Candes C. "In silico identification of putative causal genetic variants", bioRxiv, 2024.02.28.582621; doi: https://doi.org/10.1101/2024.02.28.582621.