https://github.com/quantori/crisprseqsim
CRISPRSeqSim is a tool that simulates the results of targeted sequencing of CRISPR/Cas-edited fragments.
- Host: GitHub
- URL: https://github.com/quantori/crisprseqsim
- Owner: quantori
- License: apache-2.0
- Created: 2023-06-28T13:30:16.000Z (over 1 year ago)
- Default Branch: main
- Last Pushed: 2023-07-10T12:51:09.000Z (over 1 year ago)
- Last Synced: 2024-04-20T00:49:35.831Z (8 months ago)
- Topics: amplicon-sequencing, bioinformatics, crispr-analysis, illumina, ngs-analysis, python, read-simulation
- Language: Python
- Homepage:
- Size: 24.4 KB
- Stars: 0
- Watchers: 0
- Forks: 0
- Open Issues: 0
Metadata Files:
- Readme: README.md
- License: LICENSE
- Code of conduct: CODE_OF_CONDUCT.md
README
# CRISPRSeqSim #
CRISPRSeqSim is a tool that generates variations of the initial and desired sequences
(simulating CRISPR/Cas editing) and produces a report on the mutations and their positions.
The data produced by CrispTestDataGenerator can be used for testing purposes.

## How to use: ##
1. In the `Config` file, adjust the settings to generate a sequence of the desired length with the corresponding mutations:
```
- initLength:15, # the length of the desired sequence; must be divisible by 3 if the "isCoding" parameter is true
- isCoding:true, # a bool value; if "false", "initLength" can be of any size [default: true]
- mutationSite:15, # the sites in the "mutated sequence" (the sequence of interest after CRISPR/Cas editing) where mutations are to be made
- mutationLength:1, # the type of mutation at the site: insertion - positive integer; deletion - negative integer; substitution - 0
- direction:[true, true, true], # false if the mutation is directed to the left (applies to deletions)
- quality:50.0 # the percentage of "corrupted" mutated sequences in the reads, i.e. reads that do not correspond to the mutated sequence
```
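To make the `mutationLength` and `direction` settings concrete, here is a minimal Python sketch of how one mutation could be applied to a sequence. It is not the tool's own code: the function names `validate_config` and `apply_mutation`, the dict-shaped config, the 0-based site index, and the choice of random bases for insertions are all assumptions made for illustration.

```python
import random

BASES = "ACGT"


def validate_config(config):
    """Check the one constraint the README states for initLength."""
    if config.get("isCoding", True) and config["initLength"] % 3 != 0:
        raise ValueError("initLength must be divisible by 3 when isCoding is true")


def apply_mutation(sequence, site, length, to_right=True, rng=random):
    """Apply one mutation at a 0-based site (assumed interpretation).

    length > 0  : insert `length` random bases at `site`
    length < 0  : delete `-length` bases to the right or left of `site`
    length == 0 : substitute the base at `site` (site must be < len(sequence))
    """
    if length > 0:
        inserted = "".join(rng.choice(BASES) for _ in range(length))
        return sequence[:site] + inserted + sequence[site:]
    if length < 0:
        n = -length
        if to_right:
            return sequence[:site] + sequence[site + n:]
        # Deletion directed to the left removes the bases before the site.
        return sequence[:max(site - n, 0)] + sequence[site:]
    # Substitution: pick any base different from the current one.
    replacement = rng.choice([b for b in BASES if b != sequence[site]])
    return sequence[:site] + replacement + sequence[site + 1:]
```

For example, under these assumptions `apply_mutation("ACGTAC", 2, -2)` deletes two bases to the right of position 2, while passing `to_right=False` deletes the two bases before it.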
2. In the `sequences_generator.py` file (method `corrupt_generated`), adjust `count`, the total number of sequences
written to the file with reads. You can also adjust the number of sequences with no mutations at all (copies of the initial sequence) via the
`c_int` variable:
```python
def corrupt_generated(self, orig_sequence, sequence):
    count = 100  # can be adjusted
    c_int = random.randint(0, c_count // 2)  # can be adjusted
```
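As a rough illustration of how `count`, `c_int`, and the `quality` setting might combine, the sketch below splits the total number of reads into unchanged, corrupted, and correctly mutated reads. The function name `plan_read_counts` is hypothetical, and applying `quality` to the reads that remain after the unchanged ones are set aside is an assumption; the tool itself may compute the split differently.

```python
import random


def plan_read_counts(count=100, quality=50.0, rng=random):
    """Split `count` reads into three groups (assumed scheme, not the tool's code)."""
    # Up to half of the reads stay identical to the initial (reference) sequence.
    unchanged = rng.randint(0, count // 2)
    remaining = count - unchanged
    # Assumption: `quality` is the percentage of the remaining reads that are corrupted.
    corrupted = round(remaining * quality / 100.0)
    return {
        "unchanged": unchanged,
        "corrupted": corrupted,
        "mutated": remaining - corrupted,
    }
```

The three values always add up to `count`, which makes it easy to compare them with the numbers in the report file.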
## How it works: ##
### Output ###

The file with sequences: `output.fasta`
1. The initial (reference) sequence of the configured length is generated randomly;
2. The mutated sequence (with mutations at the mutation sites from the Config file) is generated based on the initial sequence.

The file with CRISPR/Cas reads: `output.fastq`

The file contains the configured number of reads (`count`). Based on the `quality` setting, a certain percentage of the sequences are corrupted
(they carry mutations other than those of the mutated sequence). In addition, a certain number of the sequences are left unchanged (identical to the initial, or
reference, sequence). The number of such sequences can be found in the report file.

- **sizeScale** - the field that specifies the range within which the length of the sequences in the FASTQ file should be modified;
100% - not modified;
- **scaleDirection**:
  - -1 - the length modification is performed on the left side;
  - 0 - on both sides;
  - 1 - on the right side;
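
The effect of **sizeScale** and **scaleDirection** can be sketched as follows. This is only one possible reading: the function name `scale_read` is hypothetical, and the sketch assumes that "modifying the length" means trimming a read down to `sizeScale` percent of its original length.

```python
def scale_read(read, size_scale=100.0, direction=0):
    """Trim `read` to `size_scale` percent of its length (assumed interpretation).

    direction = -1 : bases are removed from the left side
    direction =  0 : bases are removed from both sides
    direction =  1 : bases are removed from the right side
    """
    target_len = round(len(read) * size_scale / 100.0)
    cut = len(read) - target_len
    if direction == -1:
        return read[cut:]
    if direction == 1:
        return read[:len(read) - cut]
    if direction == 0:
        left_cut = cut // 2  # any odd leftover base is removed from the right
        return read[left_cut:len(read) - (cut - left_cut)]
    raise ValueError("direction must be -1, 0, or 1")
```

With `size_scale=100` the read is returned unchanged, matching the "100% - not modified" rule above.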