An open API service indexing awesome lists of open source software.

https://github.com/jrhawley/bio-jtools-rs

A suite of bioinformatics tools for interacting with high throughput sequencing (HTS) data, written entirely in Rust
https://github.com/jrhawley/bio-jtools-rs

bioinformatics cli command-line genome-sequencing genomics hts rust-lang

Last synced: 3 months ago
JSON representation

A suite of bioinformatics tools for interacting with high throughput sequencing (HTS) data, written entirely in Rust

Awesome Lists containing this project

README

          

# bio-jtools-rs

A suite of bioinformatics tools for interacting with high throughput sequencing (HTS) data, written entirely in Rust

![Crates.io](https://img.shields.io/crates/v/bio-jtools)

## Suite

### info

Extract and print metadata about an HTS file.
For FASTQs, this includes number of bases, number of records, and all the instruments the records come from.

### filter

Filter an HTS file by its query names.
Currently only implemented for SAM/BAM files

### jaccard

Calculate the Jaccard index for each pair in a set of BED files.
Can save the results in a comma-separated file, if specified.

### org

Organize a batch of raw sequencing data.

This takes a folder directly from an Illumina sequencer with FASTQ files and organizes them as follows, ready for alginment and quality control:

```shell
YYMMDD_INSTID_RUN_FCID/
├── FASTQs/ # home for your raw data
├── Sample1_R1.fastq.gz
├── Sample1_R2.fastq.gz
└── ...
├── Aligned/ # a home for your aligned data
├── Reports/ # QC reports, etc files
├── config.tsv # a table of samples (rows) x features (cols)
├── cluster.yaml # a yaml file of cluster parameters for jobs in the Snakefile
├── README.md # description of the folder, data contents
├── setup.log # a log of what operations were performed with `bjt org`
└── Snakefile # Snakemake workflow file
```