https://github.com/christopher-hakkaart/testdata
Test data for ont
- Host: GitHub
- URL: https://github.com/christopher-hakkaart/testdata
- Owner: christopher-hakkaart
- Created: 2021-12-08T08:20:18.000Z (over 3 years ago)
- Default Branch: main
- Last Pushed: 2022-12-14T01:07:36.000Z (over 2 years ago)
# Test data origins
## Random gene of interest with known SV
```bash
EDIL3
chr5:83940554-84384880 (-)
id = NM_005711.5
```

## Make bed file
``` bash
touch GRCh38_EDIL3.bed
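# printf writes exactly one tab-separated BED6 line; echo -e with a trailing \n would also append a blank line
printf 'chr5\t83940554\t84384880\tEDIL3\t0\t-\n' >> GRCh38_EDIL3.bed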
```

## Find reads mapped to EDIL3, convert to fastq, and gzip
```bash
samtools view -b A04.bam "chr5:83940554-84384880" > EDIL3.bam
samtools index EDIL3.bam
samtools fastq EDIL3.bam > NA12878_DNA.fastq
gzip NA12878_DNA.fastq
```

## Make new reference genome
``` bash
bedtools getfasta -name -fi Homo_sapiens_assembly38.fasta -bed GRCh38_EDIL3.bed > GRCh38_EDIL3.fa
samtools faidx GRCh38_EDIL3.fa
```

## Test bench, truth and high confidence regions
Data copied from the [hap.py](https://github.com/Illumina/hap.py#happy) example:
- NA12878_chr21.vcf.gz
- NA12878_chr21.vcf.gz.tbi
- PG_Conf_chr21.bed.gz
- PG_Conf_chr21.bed.gz.tbi
- PG_NA12878_chr21.vcf.gz
- PG_NA12878_chr21.vcf.gz.tbi

## test_benchmark
The test_benchmark?.csv files in this folder are subject to change.
They are replicated dummy files used to test functionality.

## Notes for future development
- TODO: Add a second chromosome and reads to use as a base for developing variant calling for each chromosome separately.
- TODO: Fix the error that occurs with the default chromosome naming from bedtools. The temporary solution is to name the chromosome arbitrarily, removing the characters that were causing the problem.
- TODO: Add notes on how the SV data were derived!
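
The bedtools naming TODO above describes renaming the chromosome to drop problematic characters. The following is a minimal sketch of one way that workaround could look, not the method actually used for this repository. It assumes the header written by `bedtools getfasta -name` has the form `name::chrom:start-end` (this varies by bedtools version), and the `demo_` file names are hypothetical stand-ins for `GRCh38_EDIL3.fa`.

```bash
#!/usr/bin/env bash
set -euo pipefail

# Hypothetical demo input: a tiny FASTA whose header mimics the
# assumed "name::chrom:start-end" style.
printf '>EDIL3::chr5:83940554-84384880\nACGTACGT\n' > demo_EDIL3.fa

# On header lines only, drop everything from the first "::" onward;
# sequence lines are left untouched.
sed -E '/^>/ s/::.*$//' demo_EDIL3.fa > demo_EDIL3.renamed.fa

# Show the rewritten header.
head -n 1 demo_EDIL3.renamed.fa
```

After renaming, the index would need to be rebuilt with `samtools faidx` so that it matches the new sequence name. Recent bedtools releases also offer a `-nameOnly` option for `getfasta`, which may avoid the post-processing step altogether.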