https://github.com/multimeric/1000g-megaqc

1000 genomes data processed for use as a MegaQC test set

Host: GitHub
URL: https://github.com/multimeric/1000g-megaqc
Owner: multimeric
License: gpl-3.0
Created: 2020-12-03T08:35:28.000Z (over 5 years ago)
Default Branch: main
Last Pushed: 2020-12-03T09:28:06.000Z (over 5 years ago)
Last Synced: 2025-03-21T05:23:29.574Z (about 1 year ago)
Size: 1.07 MB
Stars: 1
Watchers: 2
Forks: 0
Open Issues: 0
Metadata Files:
- Readme: README.md
- License: LICENSE

README

# 1000g-megaqc
1000 genomes data processed for use as a MegaQC test set.

## Provenance

### Processing
The data was processed as follows:

* We chose samples from 1000 Genomes Phase 3 that had `*.mapped.ILLUMINA.bwa.*.low_coverage.*.bam` and `*.wgs.COMPLETE_GENOMICS.*.snps_indels_svs_meis.high_coverage.genotypes.vcf.gz` files associated with them
* We then ran these commands over the BAM:
    * `fastqc`
    * `samtools stats --remove-dups --required-flag 1`
    * `mosdepth --no-per-base --fast-mode --by exome.targets.bed`
* On the VCF we ran:
    * `bcftools stats`
* Finally, the data was compiled by running `multiqc stats --module samtools --module bcftools --module fastqc --module mosdepth`
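The steps above can be sketched as a small dry-run script that prints the commands for one sample instead of executing them. This is only an illustration, not the script used to build this repository: the `print_commands` helper, the sample ID `SAMPLE1`, the input file names, and the per-sample output names under `stats/` are all assumptions. Only the tool flags and the `multiqc stats` call come from the list above.

```shell
#!/bin/sh
# Hypothetical helper (not part of this repository): print the per-sample
# commands from the Processing list. All paths and names are placeholders.
print_commands() {
    sample="$1"   # assumed sample ID
    bam="$2"      # low-coverage BAM (mosdepth also expects an index next to it)
    vcf="$3"      # high-coverage genotypes VCF
    out="stats"   # assumed output directory, matching the `multiqc stats` call

    # BAM-level QC: FastQC report, samtools stats, mosdepth coverage by target
    echo "fastqc --outdir $out $bam"
    echo "samtools stats --remove-dups --required-flag 1 $bam > $out/$sample.samtools.txt"
    echo "mosdepth --no-per-base --fast-mode --by exome.targets.bed $out/$sample $bam"

    # VCF-level QC
    echo "bcftools stats $vcf > $out/$sample.bcftools.txt"
}

print_commands SAMPLE1 SAMPLE1.bam SAMPLE1.vcf.gz

# Aggregate everything under stats/ into one MultiQC report
echo "multiqc stats --module samtools --module bcftools --module fastqc --module mosdepth"
```

With the tools installed and real input paths substituted, piping the output to `sh` would run the commands. Printing them first makes it easy to check the flags before starting a long job.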

### Tool Versions

* bcftools=1.10.2
* fastqc=0.11.9
* samtools=1.7
* mosdepth=0.2.7
