https://github.com/cchd0001/genomesequencesimulator

A simulator to generate WGS data from a reference

# GenomeSequenceSimulator

GSS is a small tool for simulating sequence reads from a reference genome.

## INSTALL

Build with CMake.

Here is a build example. From the ```src``` directory, run:
```
mkdir build
cd build
cmake ..
make
```

The executable is written to ```bin/Bin/GSS```.

## EXAMPLE

After a successful build, enter the ```example``` directory and run the ```example.sh``` script.

## QUICK START

Run GSS with a ```config.json``` file. All settings are read from ```config.json```.

Here is the ```config.json``` from the example:
```
{ # The top level is an object.
  "filePath": "./sequence.fasta", # Reference genome file.
  "fileType": "fasta", # Genome file type. Only fasta is supported for now.
  "variable": 0.0001,
  "InDel": 0.015,
  "InDel_Extern": 0.3,
  "error": 0.0001,
  "output": [ # Array of output definitions, one per simulated data set.
    {
      "file_name": "f0", # Output file name prefix.
      "file_type": "fastq", # Output file type: fasta or fastq.
      "read_type": "single", # Read type; the example uses single and pair-end.
      "read_len": 100, # Read length.
      "insert_len": 100, # Insert length.
      "depth": 80 # Expected sequencing depth per nucleotide.
    },
    {
      "file_name": "f1",
      "file_type": "fastq",
      "read_type": "pair-end",
      "read_len": 100,
      "depth": 100,
      "insert_len": 250
    }
  ]
}
```
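
The ```#``` annotations in the listing are explanatory. Standard JSON has no comment syntax, and this README does not say whether the GSS parser accepts them. As a minimal sketch, assuming a comment-free file is wanted, the same settings can be built in Python and written with the standard ```json``` module. The helper ```estimated_read_count``` is hypothetical and not part of GSS. It applies the usual coverage relation (depth is roughly reads × read length / genome length), and GSS may count reads differently.

```python
import json
import math

# Same values as the example config, without the "#" annotations.
config = {
    "filePath": "./sequence.fasta",
    "fileType": "fasta",
    "variable": 0.0001,
    "InDel": 0.015,
    "InDel_Extern": 0.3,
    "error": 0.0001,
    "output": [
        {"file_name": "f0", "file_type": "fastq", "read_type": "single",
         "read_len": 100, "insert_len": 100, "depth": 80},
        {"file_name": "f1", "file_type": "fastq", "read_type": "pair-end",
         "read_len": 100, "depth": 100, "insert_len": 250},
    ],
}


def estimated_read_count(genome_len, read_len, depth):
    """Hypothetical helper: smallest read count with reads * read_len / genome_len >= depth."""
    return math.ceil(depth * genome_len / read_len)


def write_config(path="config.json"):
    """Write the settings as plain JSON (no comments)."""
    with open(path, "w") as handle:
        json.dump(config, handle, indent=2)
        handle.write("\n")


if __name__ == "__main__":
    write_config()
    # A 1,000,000-base genome at depth 80 with 100-base reads: prints 800000
    print(estimated_read_count(1_000_000, 100, 80))
```

The estimate is only a sanity check on output size before a run; the authoritative read count is whatever GSS writes.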
## OUTPUT

Here are the first 10 lines of each file generated by the ```config.json``` above:

```
==> f0.fastq <==
@NC_007205.1_503982_504082_0:0:0_0/2
CTGGAACGGAGTATTCGTTTTAAAAGAGGGTGTATAAGTTTTAAAAAAAAATATTTAAGAACTATAAATATTTAACAAAAAAAATAAAGAAATATAAAAA
+
iiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiii
@NC_007205.1_430241_430341_0:0:0_1/2
TTTTATCAATTTTTTTTATTTAAATAGAATCAACCAAGTTCATACCCTCGCACCGAGAGGAATTTAGTTAAATTTATAAAATTTCTAGTTTAATTTCCAA
+
iiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiii
@NC_007205.1_1139113_1139213_0:0:0_2/1
ATACATAATAAGCACTATCTCTAAAGGATTTTTTTATTTGTTTTTTAGCATTATAAATTTCAATAAATGCATTTCTCATACTATCTTCTAGACCTGAAAA

==> f1_0.fastq <==
@NC_007205.1_1137336_1137436_0:0:0_0/1
CTTACACAATTATTTTGGTAGGCAATTTTACCTCTAGCAATCATTAATTGAGAATCTCTTGTGATTGGATAATGATTATCAGAATAAGAAATGCTACTTA
+
iiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiii
@NC_007205.1_706428_706528_0:0:0_1/2
AAAATTCAGAGGAACTCCGTAACTTTTACCTAACACTGATTTAAGTTAAAATAATCAATGAAGTAAAAAAAGTTTGTAACCTATTAGTTGTTGGGTTGTT
+
iiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiii
@NC_007205.1_555447_555547_0:0:0_2/2
TTTATGCCTACCTACGGTGGAGTAGGATGATGTATAAAGAATTCTTTTAAATAAATATTTCAACCTTTCAAATTATGTTTAATATTTTATAGGAGACTGA

==> f1_1.fastq <==
@NC_007205.1_1138736_1138836_0:0:0_0/2
ATTCATCTCGGATATATGGTTTGAGATTAGATAGAAAATAATCAAAATAATTTCAATGGAAACTATTTCCCCATTACTACTTACGTTGTAATCGATTACC
+
iiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiii
@NC_007205.1_705028_705128_0:0:0_1/1
TATGGCAAAATATTTGAATTCATATATAAAATTGACAAGATTAGTTTAAAAAAAAAGAAATCTATCGATTTTAAAATTATTAATGAAGCTCTTGAGGAAT
+
iiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiii
@NC_007205.1_554047_554147_0:0:0_2/1
TTTGATAGATTTTTAAGTTTTAAAAAATCTTTAGACCTAATTTCAAAAACTGGAATATTTATTTTTGATAATTCTGAAGGATATAATGAACCAAGAGATA

```
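
The read names in the listing appear to follow a fixed pattern, and a short parser can help when checking simulated reads against the reference. The sketch below assumes the layout ```@<reference>_<start>_<end>_<a>:<b>:<c>_<index>/<mate>```. That layout is inferred from the sample output only: the two coordinates differ by 100 in every sample, which matches ```read_len```, but this README does not document the fields, and the meaning of the colon-separated triple is unknown. The names ```parse_header``` and ```read_fastq``` are illustrative and not part of GSS.

```python
import re
from itertools import islice

# Assumed layout: @<reference>_<start>_<end>_<a>:<b>:<c>_<index>/<mate>
HEADER = re.compile(
    r"^@(?P<ref>.+)_(?P<start>\d+)_(?P<end>\d+)"
    r"_(?P<counts>\d+:\d+:\d+)_(?P<index>\d+)/(?P<mate>[12])$"
)


def parse_header(line):
    """Split one read name into its fields; raise ValueError if it does not match."""
    match = HEADER.match(line.strip())
    if match is None:
        raise ValueError(f"unrecognized read name: {line!r}")
    fields = match.groupdict()
    for key in ("start", "end", "index", "mate"):
        fields[key] = int(fields[key])
    return fields


def read_fastq(path):
    """Yield (header, sequence, quality) from a four-line-per-record FASTQ file."""
    with open(path) as handle:
        while True:
            record = [line.rstrip("\n") for line in islice(handle, 4)]
            if len(record) < 4:
                break  # end of file or trailing partial record
            header, sequence, _plus, quality = record
            yield header, sequence, quality


if __name__ == "__main__":
    fields = parse_header("@NC_007205.1_503982_504082_0:0:0_0/2")
    # prints: NC_007205.1 100
    print(fields["ref"], fields["end"] - fields["start"])
```

For example, iterating ```read_fastq("f0.fastq")``` and passing each header to ```parse_header``` gives the claimed origin of every read, which can then be compared with the corresponding slice of ```sequence.fasta```.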