https://github.com/vccri/pgscatalogcalculator
- Host: GitHub
- URL: https://github.com/vccri/pgscatalogcalculator
- Owner: VCCRI
- Created: 2020-11-12T02:02:27.000Z (over 4 years ago)
- Default Branch: master
- Last Pushed: 2021-07-09T07:58:48.000Z (almost 4 years ago)
- Last Synced: 2025-02-02T16:52:50.333Z (4 months ago)
- Language: R
- Size: 338 KB
- Stars: 0
- Watchers: 6
- Forks: 0
- Open Issues: 0
A tool that downloads PGS Catalog data, automatically scores samples, and compares them against the EU control sample set or an input control sample VCF.
Required Software:
* R - 3.6.1
* Python - 2.7.16
* VT
* Bcftools
* PLINK 1.9

# Getting Started
## Installation
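Before installing, it can help to confirm that the external tools listed above are reachable. The following is a minimal sketch and not part of the package: `find_tools` is a hypothetical helper, and the executable names `vt`, `bcftools`, `plink`, and `python2` are assumptions. Installations differ, and the package itself locates its prerequisites through `sample.yaml`.

```r
# Hypothetical helper (not part of PGSCatalogDownloader): look up each
# executable on the PATH and report which ones were found.
find_tools <- function(tools) {
  paths <- Sys.which(tools)  # named character vector; "" when not found
  data.frame(
    tool  = tools,
    path  = unname(paths),
    found = unname(nzchar(paths)),
    stringsAsFactors = FALSE
  )
}

# Executable names are assumptions; adjust them to match the local install.
status <- find_tools(c("Rscript", "python2", "vt", "bcftools", "plink"))
print(status)
if (!all(status$found)) {
  message("Missing from PATH: ", paste(status$tool[!status$found], collapse = ", "))
}
```

A tool reported as missing here may still work if its full path is given in `sample.yaml`.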
```
# In a shell: install the Python dependencies
python2 -m pip install -r https://raw.githubusercontent.com/VCCRI/PGSCatalogDownloader/master/requirements.txt
# Start an R session; the remaining lines are R code
R
devtools::install_github("VCCRI/PGSCatalogDownloader")
# Install a local mirror of the metadata
require(PGSCatalogDownloader)
# If the argument to getLatestMeta is NULL, the mirror is saved in the temporary directory
inMeta <- getLatestMeta("meta.RDS")
```
`getLatestMeta` always returns the file path of the local mirror.

## Required Files
* Input VCF (case samples, required)
* Reference sequence FASTA file
* A filled-in `sample.yaml`, which points the package to the relevant prerequisites
* Control VCF file (optional)

## Sample Run
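Before starting a run, a quick check that the required files exist may save a failed job. This is a minimal sketch under stated assumptions: `missing_inputs` is a hypothetical helper, not a function of the package, and the file names in the usage lines are placeholders.

```r
# Hypothetical helper (not part of PGSCatalogDownloader): return the
# required paths that do not exist. The optional control VCF is checked
# only when one is supplied.
missing_inputs <- function(inFile, inRef, yaml = "sample.yaml", inControl = NULL) {
  paths <- c(inFile = inFile, inRef = inRef, yaml = yaml)
  if (!is.null(inControl)) {
    paths <- c(paths, inControl = inControl)
  }
  paths[!file.exists(paths)]
}

# Placeholder file names; replace them with real paths.
missing <- missing_inputs("sample.vcf.gz", "reference.fa")
if (length(missing) > 0) {
  message("Missing input files: ", paste(missing, collapse = ", "))
}
```

The default `yaml = "sample.yaml"` is resolved against the current working directory, which is where the package looks for it.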
```
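require(PGSCatalogDownloader)
# Start a 10-worker cluster and register it as the parallel backend
cl <- parallel::makeCluster(10)
doParallel::registerDoParallel(cl)
# Score the case VCF for one PGS ID against the optional control VCF;
# inMeta is the metadata mirror path returned by getLatestMeta()
grabScoreId(inFile='sample.vcf.gz', inRef='/g/data/jb96/References_and_Databases/hs37d5.fa/hs37d5x.fa', inPGSID='PGS000073', inCL=cl, inControl='sample_test.vcf.gz', inMeta=inMeta)
# Release the worker processes
parallel::stopCluster(cl)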
```
## Input Parameters for grabScoreId
* inFile = Input case VCF
* inRef = Reference sequence FASTA file
* inPGSID = PGS ID to calculate the score for
* inPGSIDS = File containing a newline-separated list of PGS IDs to calculate scores for
* inCL = Cluster used to run the package
* inControl = Control VCF (optional)
* inMeta = Local mirror of the metadata, which allows reuse (optional)

The package looks for a `sample.yaml` file in the current working directory to ensure that it references the correct packages and environment variables. An example file is available in this repo: https://github.com/VCCRI/PGSCatalogDownloader/blob/master/sample.yaml

Define the output directory in the YAML file under `outputDir`; otherwise the tool writes its output files to the current R working directory.

The tool creates intermediate files. Error messages relating to these intermediate files can be ignored.
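The file given as `inPGSIDS` is plain text with one PGS ID per line. A minimal sketch of creating one follows. The file name `pgs_ids.txt` and the second ID are illustrative assumptions, and the format check (`PGS` followed by six digits) reflects the PGS Catalog naming convention rather than anything the package is documented to enforce.

```r
# PGS000073 comes from the sample run; PGS000001 is only an illustrative second ID.
ids <- c("PGS000073", "PGS000001")

# Sanity-check the format before writing: "PGS" followed by six digits.
stopifnot(all(grepl("^PGS[0-9]{6}$", ids)))

# File name is an assumption; writeLines puts one ID per line.
id_file <- file.path(tempdir(), "pgs_ids.txt")
writeLines(ids, id_file)

# Read the file back to confirm its contents.
print(readLines(id_file))
```

The resulting path would presumably be passed as the `inPGSIDS` argument in place of a single `inPGSID`.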
## Output
The tool generates a boxplot, `boxplot.png`, and a CSV file, `sample_out.csv`, that lists the relative risk and raw score of each patient.
Both can be viewed together by opening `dashboard.Rmd`.
Respectively, these files show the stratified risk of the samples against the control samples and the risk of each sample for the last condition.