https://github.com/natir/rustyread
A long read simulator based on badread idea
bioinformatics long-reads
- Host: GitHub
- URL: https://github.com/natir/rustyread
- Owner: natir
- License: mit
- Created: 2021-03-19T18:32:00.000Z (almost 4 years ago)
- Default Branch: master
- Last Pushed: 2022-10-07T16:35:40.000Z (about 2 years ago)
- Last Synced: 2023-08-05T00:07:31.826Z (over 1 year ago)
- Topics: bioinformatics, long-reads
- Language: Jupyter Notebook
- Homepage:
- Size: 6.34 MB
- Stars: 21
- Watchers: 1
- Forks: 2
- Open Issues: 0
Metadata Files:
- Readme: Readme.md
- License: LICENSE
README
![Test](https://github.com/natir/rustyread/workflows/Test/badge.svg)
![Lints](https://github.com/natir/rustyread/workflows/Lints/badge.svg)
![MSRV](https://github.com/natir/rustyread/workflows/MSRV/badge.svg)
[![CodeCov](https://codecov.io/gh/natir/rustyread/branch/master/graph/badge.svg)](https://codecov.io/gh/natir/rustyread)
[![Documentation](https://github.com/natir/rustyread/workflows/Documentation/badge.svg)](https://natir.github.io/rustyread/rustyread)
[![License](https://img.shields.io/badge/license-MIT-green)](https://github.com/natir/rustyread/blob/master/LICENSE)

Rustyread is a drop-in replacement for `badread simulate`. It is heavily inspired by [badread](https://github.com/rrwick/Badread) and reuses the same error and quality model files, but Rustyread is multi-threaded and benefits from other optimizations.

- [Usage](#usage)
- [Installation](#installation)
- [Minimum supported Rust version](#minimum-supported-rust-version)
- [Difference with badread](#difference-with-badread)

**WARNING**:
- Rustyread has not yet been evaluated or even compared to *any* other long read generators
- Rustyread is tested only on Linux
- Rustyread is still in development; many things can change or break

## Usage
If you previously called badread like this:
```
badread simulate --reference {reference path} --quantity {quantity} > {reads}.fastq
```

you can now replace badread with rustyread:
```
rustyread simulate --reference {reference path} --quantity {quantity} > {reads}.fastq
```

By default rustyread uses all available cores; you can control this with the `threads` option:
```
rustyread --threads {number of thread} simulate --reference {reference path} --quantity {quantity} > {reads}.fastq
```

If you have `badread` installed in your Python `sys.path`, rustyread can find the error and quality models automatically, but you can still use the `--error_model` and `--qscore_model` options.
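
If you drive rustyread from a Python pipeline, the options above can be assembled as an argument list. This is only a sketch: `build_command` and the file names are hypothetical, and it uses only flags that appear in this README. `--threads` is placed before the `simulate` subcommand, matching the command shown above.

```python
import shlex


def build_command(reference, quantity, threads=None,
                  error_model=None, qscore_model=None):
    """Build an argv list for `rustyread simulate` (hypothetical helper)."""
    cmd = ["rustyread"]
    if threads is not None:
        # Top-level option: placed before the subcommand, as in the README.
        cmd += ["--threads", str(threads)]
    cmd += ["simulate", "--reference", reference, "--quantity", quantity]
    if error_model is not None:
        cmd += ["--error_model", error_model]
    if qscore_model is not None:
        cmd += ["--qscore_model", qscore_model]
    return cmd


if __name__ == "__main__":
    # File and model names below are placeholders, not real paths.
    cmd = build_command("ref.fasta", "25x", threads=4,
                        error_model="my_error_model",
                        qscore_model="my_qscore_model")
    print(shlex.join(cmd))
```

The resulting list can be passed to `subprocess.run` with `stdout` redirected to a FASTQ file.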
### Control memory usage
Rustyread's memory usage can be estimated with the formula `2 * reference base + 2 * targeted base + epsilon`. To limit memory usage, use the `number_base_store` parameter, which takes an absolute value or a relative depth. When this option is set, memory usage becomes `2 * reference base + 2 * number_base_store + epsilon`.
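
As a rough illustration of the formula above, the following Python sketch estimates memory use for a given reference size and `--quantity` value. It is not part of rustyread: the function names, the one-byte-per-base assumption, the `K`/`G` suffixes (only `M` and `x` appear in the help text), and treating `epsilon` as zero are all assumptions made for this example.

```python
# Hypothetical helper, not part of rustyread: a back-of-the-envelope estimate
# based on the formula in this README.

# Assumed decimal multipliers; the README only shows "M" (e.g. 250M).
SUFFIXES = {"K": 10**3, "M": 10**6, "G": 10**9}


def parse_bases(value: str, reference_bases: int) -> int:
    """Convert '250M' (absolute) or '25x' (relative depth) to a base count."""
    value = value.strip()
    if value[-1] in "xX":
        return int(float(value[:-1]) * reference_bases)
    if value[-1].upper() in SUFFIXES:
        return int(float(value[:-1]) * SUFFIXES[value[-1].upper()])
    return int(value)


def estimate_memory(reference_bases, quantity, number_base_store=None):
    """Return 2 * reference + 2 * stored bases, ignoring epsilon.

    Assumes one byte per base, so the result is roughly in bytes.
    """
    stored = number_base_store if number_base_store is not None else quantity
    return 2 * reference_bases + 2 * parse_bases(stored, reference_bases)


if __name__ == "__main__":
    ref = 5_000_000  # e.g. a 5 Mbp genome
    print(estimate_memory(ref, "50x"))                            # 510000000
    print(estimate_memory(ref, "50x", number_base_store="100M"))  # 210000000
```

Under these assumptions, a 50x simulation of a 5 Mbp reference needs about 510 MB, and setting `number_base_store` to `100M` brings the estimate down to about 210 MB.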
### Full usage
```
rustyread 0.4.1 Machamp
Pierre Marijon
A long read simulator based on badread idea and model

USAGE:
    rustyread [OPTIONS]

OPTIONS:
    -h, --help         Print help information
    -t, --threads      Number of thread use by rustyread, 0 use all avaible core, default
                       value 0
    -v, --verbosity    Verbosity level also control by environment variable RUSTYREAD_LOG if
                       flag is set RUSTYREAD_LOG value is ignored
    -V, --version      Print version information

SUBCOMMANDS:
    help        Print this message or the help of the given subcommand(s)
    simulate    Generate fake long read
```

```
rustyread-simulate
Generate fake long read

USAGE:
    rustyread simulate [FLAGS] [OPTIONS] --reference --quantity

FLAGS:
    -h, --help                  Prints help information
        --small_plasmid_bias    If set, then small circular plasmids are lost when the fragment
                                length is too high (default: small plasmids are included regardless
                                of fragment length)
    -V, --version               Prints version information

OPTIONS:
        --chimera
            Percentage at which separate fragments join together [default: 1]

        --end_adapter
            Adapter parameters for read ends (rate and amount) [default: 50,20]

        --end_adapter_seq
            Adapter parameters for read ends [default: GCAATACGTAACTGAACGAAGT]

        --error_model
            Path to an error model file [default: nanopore2020]

        --glitches
            Read glitch parameters (rate, size and skip) [default: 10000,25,25]

        --identity
            Sequencing identity distribution (mean, max and stdev) [default: 85,95,5]

        --junk_reads
            This percentage of reads wil be low complexity junk [default: 1]

        --length
            Fragment length distribution (mean and stdev) [default: 15000,13000]

        --number_base_store
            Number of base, rustyread can store in ram before write in output in absolute value
            (e.g. 250M) or a relative depth (e.g. 25x)

        --output
            Where read is write

        --qscore_model
            Path to an quality score model file [default: nanopore2020]

        --quantity
            Either an absolute value (e.g. 250M) or a relative depth (e.g. 25x)

        --random_reads
            This percentage of reads wil be random sequence [default: 1]

        --reference
            Reference fasta (can be gzipped, bzip2ped, xzped)

        --seed
            Random number generator seed for deterministic output (default: different output each
            time)

        --start_adapter
            Adapter parameters for read starts (rate and amount) [default: 90,60]

        --start_adapter_seq
            Adapter parameters for read starts [default: AATGTACTTCGTTCAGTTACGTATTGCT]
```

## Installation
### Bioconda
If you haven't set up bioconda yet, follow [these instructions](https://bioconda.github.io/user/install.html).
```
conda install rustyread   # or: mamba install rustyread
```

### With rust environment
If you don't have a Rust environment, you can install one with [rustup](https://rustup.rs/) or your package manager.
#### With cargo
```
cargo install --git https://github.com/natir/rustyread.git --tag 0.4.1
```

### From source
```
git clone https://github.com/natir/rustyread.git
cd rustyread
git checkout 0.4.1
cargo install --path .
```

## Minimum supported Rust version
Currently the minimum supported Rust version is 1.57.0.
## Difference with badread
- The `small_plasmid_bias` option is silently ignored, but small plasmids are still 'sequenced'