https://github.com/natir/rustyread

A long read simulator based on badread idea

![Test](https://github.com/natir/rustyread/workflows/Test/badge.svg)
![Lints](https://github.com/natir/rustyread/workflows/Lints/badge.svg)
![MSRV](https://github.com/natir/rustyread/workflows/MSRV/badge.svg)
[![CodeCov](https://codecov.io/gh/natir/rustyread/branch/master/graph/badge.svg)](https://codecov.io/gh/natir/rustyread)
[![Documentation](https://github.com/natir/rustyread/workflows/Documentation/badge.svg)](https://natir.github.io/rustyread/rustyread)
[![License](https://img.shields.io/badge/license-MIT-green)](https://github.com/natir/rustyread/blob/master/LICENSE)

# Rustyread

Rustyread is a drop-in replacement for `badread simulate`. It is heavily inspired by [badread](https://github.com/rrwick/Badread) and reuses the same error and quality model files, but Rustyread is multi-threaded and benefits from other optimizations.

- [Usage](#usage)
- [Installation](#installation)
- [Minimum supported Rust version](#minimum-supported-rust-version)
- [Difference with badread](#difference-with-badread)

**WARNING**:
- Rustyread has not yet been evaluated against, or even compared to, *any* other long read generator
- Rustyread is tested only on Linux
- Rustyread is still in development; many things may change or break

## Usage

If you previously called badread like this:

```
badread simulate --reference {reference path} --quantity {quantity} > {reads}.fastq
```

you can now replace badread with rustyread:

```
rustyread simulate --reference {reference path} --quantity {quantity} > {reads}.fastq
```

By default rustyread uses all available cores; you can control this with the `--threads` option, placed before the `simulate` subcommand:

```
rustyread --threads {number of thread} simulate --reference {reference path} --quantity {quantity} > {reads}.fastq
```
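
When rustyread is driven from a script, the argument order matters: `--threads` is a top-level option, so it goes before `simulate`. The sketch below assembles such a command line using only flags listed in this README. The helper name `build_rustyread_command` and the file names are hypothetical, chosen for illustration.

```python
import shlex


def build_rustyread_command(reference, quantity, threads=None, output=None):
    """Return the argv list for a `rustyread simulate` call."""
    argv = ["rustyread"]
    if threads is not None:
        # --threads is a top-level option, so it precedes the subcommand.
        argv += ["--threads", str(threads)]
    argv += ["simulate", "--reference", str(reference), "--quantity", str(quantity)]
    if output is not None:
        argv += ["--output", str(output)]
    return argv


# Hypothetical file names, for illustration only.
cmd = build_rustyread_command("ref.fasta", "25x", threads=4, output="reads.fastq")
print(shlex.join(cmd))
# rustyread --threads 4 simulate --reference ref.fasta --quantity 25x --output reads.fastq
```

Once rustyread is installed, the resulting list can be passed to `subprocess.run(cmd, check=True)`.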

If `badread` is installed in your Python `sys.path`, rustyread can find the error and quality models automatically, but you can still set them explicitly with the `--error_model` and `--qscore_model` options.
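
To pass the models explicitly, you first need their paths. The sketch below looks for model files inside an installed `badread` package. It assumes the models live in `error_models/` and `qscore_models/` subdirectories as `<name>.gz` files; that layout is an assumption about Badread's packaging, and this is not necessarily how rustyread performs its own lookup. The function names are hypothetical.

```python
import importlib.util
from pathlib import Path


def find_package_dir(package_name):
    """Return the install directory of an importable package, or None."""
    try:
        spec = importlib.util.find_spec(package_name)
    except (ImportError, ValueError):
        return None
    if spec is None or not spec.submodule_search_locations:
        return None
    return Path(next(iter(spec.submodule_search_locations)))


def find_badread_model(kind, name="nanopore2020"):
    """Look for <badread>/<kind>/<name>.gz and return its path, or None."""
    package_dir = find_package_dir("badread")
    if package_dir is None:
        return None
    # Assumed layout: e.g. badread/error_models/nanopore2020.gz
    candidate = package_dir / kind / f"{name}.gz"
    return candidate if candidate.is_file() else None


error_model = find_badread_model("error_models")
qscore_model = find_badread_model("qscore_models")
print("error model:", error_model)
print("qscore model:", qscore_model)
```

If either value is `None`, badread was not found on `sys.path` or the assumed layout does not match, and the paths have to be supplied by hand.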

### Control memory usage

Rustyread's memory usage can be estimated with the formula `2 * reference base + 2 * targeted base + epsilon`. To limit memory usage, use the `--number_base_store` parameter, which takes either an absolute value or a relative depth. When this option is set, memory usage becomes `2 * reference base + 2 * number_base_store + epsilon`.
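
As a rough worked example, the formula can be evaluated as follows. This sketch makes several assumptions: `K`, `M` and `G` suffixes are treated as decimal multipliers, `epsilon` defaults to 0 because the README does not quantify it, and the result is in bases, the unit the formula uses. The function names and the 5 Mb genome size are hypothetical.

```python
SUFFIXES = {"K": 10**3, "M": 10**6, "G": 10**9}  # assumed decimal multipliers


def parse_quantity(text, reference_bases):
    """Convert '250M' (absolute) or '25x' (relative depth) into a base count."""
    text = text.strip()
    if text[-1:].lower() == "x":
        return int(float(text[:-1]) * reference_bases)
    if text[-1:].upper() in SUFFIXES:
        return int(float(text[:-1]) * SUFFIXES[text[-1].upper()])
    return int(text)


def estimate_memory(reference_bases, quantity, number_base_store=None, epsilon=0):
    """Apply 2 * reference + 2 * (targeted or number_base_store) + epsilon."""
    limit = quantity if number_base_store is None else number_base_store
    buffered = parse_quantity(limit, reference_bases)
    return 2 * reference_bases + 2 * buffered + epsilon


genome = 5_000_000  # hypothetical 5 Mb reference
print(estimate_memory(genome, "25x"))                          # 260000000
print(estimate_memory(genome, "25x", number_base_store="5x"))  # 60000000
```

In this example, capping the buffer at 5x depth cuts the estimate from 260 million to 60 million bases.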

### Full usage

```
rustyread 0.4.1 Machamp
Pierre Marijon
A long read simulator based on badread idea and model

USAGE:
    rustyread [OPTIONS]

OPTIONS:
    -h, --help         Print help information
    -t, --threads      Number of thread use by rustyread, 0 use all avaible core, default
                       value 0
    -v, --verbosity    Verbosity level also control by environment variable RUSTYREAD_LOG if
                       flag is set RUSTYREAD_LOG value is ignored
    -V, --version      Print version information

SUBCOMMANDS:
    help        Print this message or the help of the given subcommand(s)
    simulate    Generate fake long read
```

```
rustyread-simulate
Generate fake long read

USAGE:
    rustyread simulate [FLAGS] [OPTIONS] --reference --quantity

FLAGS:
    -h, --help                  Prints help information
        --small_plasmid_bias    If set, then small circular plasmids are lost when the fragment
                                length is too high (default: small plasmids are included regardless
                                of fragment length)
    -V, --version               Prints version information

OPTIONS:
        --chimera
            Percentage at which separate fragments join together [default: 1]

        --end_adapter
            Adapter parameters for read ends (rate and amount) [default: 50,20]

        --end_adapter_seq
            Adapter parameters for read ends [default: GCAATACGTAACTGAACGAAGT]

        --error_model
            Path to an error model file [default: nanopore2020]

        --glitches
            Read glitch parameters (rate, size and skip) [default: 10000,25,25]

        --identity
            Sequencing identity distribution (mean, max and stdev) [default: 85,95,5]

        --junk_reads
            This percentage of reads wil be low complexity junk [default: 1]

        --length
            Fragment length distribution (mean and stdev) [default: 15000,13000]

        --number_base_store
            Number of base, rustyread can store in ram before write in output in absolute value
            (e.g. 250M) or a relative depth (e.g. 25x)

        --output    Where read is write
        --qscore_model
            Path to an quality score model file [default: nanopore2020]

        --quantity
            Either an absolute value (e.g. 250M) or a relative depth (e.g. 25x)

        --random_reads
            This percentage of reads wil be random sequence [default: 1]

        --reference    Reference fasta (can be gzipped, bzip2ped, xzped)
        --seed
            Random number generator seed for deterministic output (default: different output each
            time)

        --start_adapter
            Adapter parameters for read starts (rate and amount) [default: 90,60]

        --start_adapter_seq
            Adapter parameters for read starts [default: AATGTACTTCGTTCAGTTACGTATTGCT]
```

## Installation

### Bioconda

If you haven't set up Bioconda yet, follow [these instructions](https://bioconda.github.io/user/install.html).

```
conda install rustyread  # or: mamba install rustyread
```

### With rust environment

If you don't have a Rust environment, you can install one with [rustup](https://rustup.rs/) or your package manager.

#### With cargo

```
cargo install --git https://github.com/natir/rustyread.git --tag 0.4.1
```

#### From source

```
git clone https://github.com/natir/rustyread.git
cd rustyread
git checkout 0.4.1
cargo install --path .
```

## Minimum supported Rust version

Currently the minimum supported Rust version is 1.57.0.
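
Before building, you can check whether the local toolchain meets this requirement. The sketch below parses the string printed by `rustc --version` and compares it with 1.57.0. The function names are hypothetical, and any text after the version number is ignored.

```python
import re

MSRV = (1, 57, 0)  # minimum supported Rust version stated in this README


def parse_rustc_version(version_output):
    """Extract (major, minor, patch) from `rustc --version` output."""
    match = re.search(r"rustc (\d+)\.(\d+)\.(\d+)", version_output)
    if match is None:
        raise ValueError(f"unrecognised rustc version string: {version_output!r}")
    return tuple(int(part) for part in match.groups())


def meets_msrv(version_output, msrv=MSRV):
    """Return True when the reported rustc version is at least the MSRV."""
    return parse_rustc_version(version_output) >= msrv


print(meets_msrv("rustc 1.57.0"))  # True
print(meets_msrv("rustc 1.56.1"))  # False
```

To check the real toolchain, pass in `subprocess.run(["rustc", "--version"], capture_output=True, text=True).stdout`.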

## Difference with badread

- The `small_plasmid_bias` option is accepted but silently ignored: small plasmids are always 'sequenced', regardless of fragment length