https://github.com/mskcc/vcf_accuracy
vcf accuracy evaluator using VT, BEDTOOLS, PyVCF, and TABIX
- Host: GitHub
- URL: https://github.com/mskcc/vcf_accuracy
- Owner: mskcc
- Created: 2015-08-12T15:15:00.000Z (over 9 years ago)
- Default Branch: master
- Last Pushed: 2016-05-24T15:28:24.000Z (over 8 years ago)
- Last Synced: 2023-08-06T05:35:24.592Z (over 1 year ago)
- Language: Python
- Size: 1000 KB
- Stars: 1
- Watchers: 80
- Forks: 1
- Open Issues: 0
Metadata Files:
- Readme: README.md
README
![VCFs are Great](http://i.imgur.com/E8gfNu0.gif "Aww Yeah Accuracy Time!")
Install some Libs
=================

* Use THIS python for the commands below, so we get argparse + all these libs:
    * /opt/common/CentOS_6-dev/python/python-2.7.10/bin/python
* download/install cmo package
    * git clone https://github.com/mskcc/cmo.git
    * python setup.py install --user
* download/install magic package
    * https://pypi.python.org/packages/source/f/filemagic/filemagic-1.6.tar.gz
    * tar -xvf and python setup.py install --user
* download/install pyvcf
    * https://pypi.python.org/packages/source/P/PyVCF/PyVCF-0.6.7.tar.gz
    * tar -xvf and python setup.py install --user

Expected formatting
===================

MAF input requirements
----------------------

* Requires standard TCGA MAF columns (https://wiki.nci.nih.gov/display/TCGA/Mutation+Annotation+Format+(MAF)+Specification):
    * Hugo_Symbol, Entrez_Gene_ID, Center, NCBI_Build, Chromosome, Start_Position, End_Position,
      Strand, Variant_Classification, Variant_Type, Reference_Allele, Tumor_Seq_Allele1, Tumor_Seq_Allele2,
      dbSNP_RS, dbSNP_Val_Status, Tumor_Sample_Barcode, Matched_Norm_Sample_Barcode, Match_Norm_Seq_Allele1,
      Match_Norm_Seq_Allele2, Tumor_Validation_Allele1, Tumor_Validation_Allele2, Match_Norm_Validation_Allele1,
      Match_Norm_Validation_Allele2, Verification_Status, Validation_Status, Mutation_Status, Sequencing_Phase,
      Sequence_Source, Validation_Method, Score, BAM_File, Sequencer
* Requires MSKCC specific columns:
    * n_ref_count, n_alt_count, n_depth, t_alt_count, t_ref_count, t_depth
* If subsetting, the BED file chromosome labelling must match the variant file style (chr1 vs. 1, MT vs. M, etc.).
* Pre-normalized datasets must be left-aligned and processed using VT (http://genome.sph.umich.edu/wiki/Vt).
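
The input requirements above can be checked before running an evaluation. The following is a minimal sketch, not part of this repository: the function names (`read_maf_header`, `missing_maf_columns`, `unmatched_bed_chroms`) are hypothetical, the column names are copied from the lists above, and matching is case-insensitive on the assumption that real MAF headers may differ in capitalization (for example `Entrez_Gene_Id`). It uses only the standard library and should run under Python 2.7 or 3.

```python
# Hypothetical pre-flight checks for the input requirements listed above.

# Standard TCGA MAF columns, as listed in this README.
TCGA_COLUMNS = [
    "Hugo_Symbol", "Entrez_Gene_ID", "Center", "NCBI_Build", "Chromosome",
    "Start_Position", "End_Position", "Strand", "Variant_Classification",
    "Variant_Type", "Reference_Allele", "Tumor_Seq_Allele1",
    "Tumor_Seq_Allele2", "dbSNP_RS", "dbSNP_Val_Status",
    "Tumor_Sample_Barcode", "Matched_Norm_Sample_Barcode",
    "Match_Norm_Seq_Allele1", "Match_Norm_Seq_Allele2",
    "Tumor_Validation_Allele1", "Tumor_Validation_Allele2",
    "Match_Norm_Validation_Allele1", "Match_Norm_Validation_Allele2",
    "Verification_Status", "Validation_Status", "Mutation_Status",
    "Sequencing_Phase", "Sequence_Source", "Validation_Method", "Score",
    "BAM_File", "Sequencer",
]

# MSKCC specific columns, as listed in this README.
MSKCC_COLUMNS = [
    "n_ref_count", "n_alt_count", "n_depth",
    "t_alt_count", "t_ref_count", "t_depth",
]


def read_maf_header(lines):
    """Return column names from the first non-comment, non-blank line."""
    for line in lines:
        if line.startswith("#") or not line.strip():
            continue
        return line.rstrip("\r\n").split("\t")
    return []


def missing_maf_columns(header):
    """Return required columns absent from header (case-insensitive)."""
    present = set(name.lower() for name in header)
    required = TCGA_COLUMNS + MSKCC_COLUMNS
    return [name for name in required if name.lower() not in present]


def unmatched_bed_chroms(bed_chroms, variant_chroms):
    """Return BED chromosome labels that never occur in the variant file.

    A non-empty result suggests a labelling mismatch such as chr1 vs. 1
    or MT vs. M.
    """
    return sorted(set(bed_chroms) - set(variant_chroms))
```

To use it, pass an open MAF file handle to `read_maf_header` and the result to `missing_maf_columns`; an empty list means every required column is present. For subsetting, collect the first column of the BED file and the chromosome column of the variant file, then pass both to `unmatched_bed_chroms`.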