https://github.com/jrhawley/bio-jtools-rs

A suite of bioinformatics tools for interacting with high throughput sequencing (HTS) data, written entirely in Rust

bioinformatics cli command-line genome-sequencing genomics hts rust-lang


Host: GitHub
URL: https://github.com/jrhawley/bio-jtools-rs
Owner: jrhawley
License: mit
Created: 2020-01-12T21:17:16.000Z (over 5 years ago)
Default Branch: master
Last Pushed: 2023-11-28T21:42:55.000Z (almost 2 years ago)
Last Synced: 2025-05-30T14:40:47.634Z (4 months ago)
Topics: bioinformatics, cli, command-line, genome-sequencing, genomics, hts, rust-lang
Language: Rust
Size: 70.9 MB
Stars: 3
Watchers: 2
Forks: 0
Open Issues: 3
Metadata Files:
- Readme: README.md
- License: LICENSE

README

# bio-jtools-rs

A suite of bioinformatics tools for interacting with high-throughput sequencing (HTS) data, written entirely in Rust.

![Crates.io](https://img.shields.io/crates/v/bio-jtools)

## Suite

### info

Extract and print metadata about an HTS file.
For FASTQs, this includes number of bases, number of records, and all the instruments the records come from.
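
As an illustration of the kind of tally described above, here is a minimal sketch in plain Rust. It is not the crate's actual implementation: the `FastqInfo` struct and `fastq_info` function are hypothetical names, the input is assumed to be uncompressed text with four lines per record, and the instrument ID is assumed to be the first colon-delimited field of an Illumina-style header.

```rust
use std::collections::BTreeSet;

/// Summary of a FASTQ file (hypothetical struct, not the crate's API).
#[derive(Debug, Default, PartialEq)]
struct FastqInfo {
    records: u64,
    bases: u64,
    instruments: BTreeSet<String>,
}

/// Tally records, bases, and instrument IDs from uncompressed,
/// four-line-per-record FASTQ text.
fn fastq_info(text: &str) -> FastqInfo {
    let mut info = FastqInfo::default();
    let mut lines = text.lines();
    // Each record is: @header, sequence, '+' separator, quality string.
    while let Some(header) = lines.next() {
        if header.is_empty() {
            continue; // tolerate blank lines between or after records
        }
        let seq = lines.next().unwrap_or("");
        lines.next(); // skip the '+' separator line
        lines.next(); // skip the quality line
        info.records += 1;
        info.bases += seq.trim_end().len() as u64;
        // Assumption: Illumina-style header "@<instrument>:<run>:<flowcell>:..."
        if let Some(id) = header.strip_prefix('@').and_then(|h| h.split(':').next()) {
            info.instruments.insert(id.to_string());
        }
    }
    info
}
```

Calling `fastq_info` on the contents of a FASTQ file returns the three counts in one pass. A real tool would also need to handle gzip-compressed input and malformed records.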

### filter

Filter an HTS file by its query names.
Currently only implemented for SAM/BAM files.
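
The idea of filtering by query name can be sketched on plain SAM text, where QNAME is the first tab-separated field of each alignment line. This is an illustrative sketch only: `filter_sam` and its `keep` switch are hypothetical, and the real tool may read BAM and expose different options.

```rust
use std::collections::HashSet;

/// Return the SAM lines that pass the filter (hypothetical helper).
/// Header lines (starting with '@') always pass. An alignment line passes
/// when its QNAME is in `names` (if `keep` is true) or is not in `names`
/// (if `keep` is false).
fn filter_sam<'a>(sam: &'a str, names: &HashSet<&str>, keep: bool) -> Vec<&'a str> {
    sam.lines()
        .filter(|line| {
            if line.starts_with('@') {
                return true; // header line, not an alignment record
            }
            // QNAME is the first tab-separated field of an alignment line.
            let qname = line.split('\t').next().unwrap_or("");
            names.contains(qname) == keep
        })
        .collect()
}
```

Using a `HashSet` for the query names keeps each lookup constant time on average, which matters when both the name list and the alignment file are large.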

### jaccard

Calculate the Jaccard index for each pair in a set of BED files.
Can save the results in a comma-separated file, if specified.
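
For reference, the sketch below shows one common way to define the Jaccard index for interval sets: the number of base pairs covered by both sets divided by the number covered by either. Whether `bjt jaccard` uses exactly this definition is an assumption. The function names are hypothetical, and the sketch handles a single chromosome with half-open `(start, end)` intervals.

```rust
/// Half-open interval [start, end) on one chromosome, as in BED coordinates.
type Interval = (u64, u64);

/// Sort intervals and merge any that overlap or touch.
fn merge(mut ivs: Vec<Interval>) -> Vec<Interval> {
    ivs.sort();
    let mut out: Vec<Interval> = Vec::new();
    for (s, e) in ivs {
        if let Some(last) = out.last_mut() {
            if s <= last.1 {
                last.1 = last.1.max(e);
                continue;
            }
        }
        out.push((s, e));
    }
    out
}

/// Total base pairs covered by a merged (non-overlapping) interval list.
fn covered_bp(ivs: &[Interval]) -> u64 {
    ivs.iter().map(|(s, e)| e - s).sum()
}

/// Base pairs shared by two merged, sorted interval lists (two-pointer sweep).
fn overlap_bp(a: &[Interval], b: &[Interval]) -> u64 {
    let (mut i, mut j, mut total) = (0, 0, 0u64);
    while i < a.len() && j < b.len() {
        let start = a[i].0.max(b[j].0);
        let end = a[i].1.min(b[j].1);
        if start < end {
            total += end - start;
        }
        // Advance whichever interval ends first.
        if a[i].1 < b[j].1 { i += 1 } else { j += 1 }
    }
    total
}

/// Jaccard index = intersection bp / union bp (0.0 when both sets are empty).
fn jaccard(a: Vec<Interval>, b: Vec<Interval>) -> f64 {
    let (a, b) = (merge(a), merge(b));
    let inter = overlap_bp(&a, &b);
    let union = covered_bp(&a) + covered_bp(&b) - inter;
    if union == 0 { 0.0 } else { inter as f64 / union as f64 }
}

/// Jaccard index for every unordered pair of interval sets.
fn pairwise_jaccard(sets: &[Vec<Interval>]) -> Vec<(usize, usize, f64)> {
    let mut out = Vec::new();
    for i in 0..sets.len() {
        for j in (i + 1)..sets.len() {
            out.push((i, j, jaccard(sets[i].clone(), sets[j].clone())));
        }
    }
    out
}
```

For example, `[(0, 10)]` and `[(5, 15)]` share 5 bp out of 15 bp covered in total, giving an index of 1/3. Each tuple returned by `pairwise_jaccard` maps naturally onto one row of a comma-separated output file.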

### org

Organize a batch of raw sequencing data.

This takes a folder directly from an Illumina sequencer with FASTQ files and organizes them as follows, ready for alignment and quality control:

```shell
YYMMDD_INSTID_RUN_FCID/
├── FASTQs/                     # home for your raw data
│   ├── Sample1_R1.fastq.gz
│   ├── Sample1_R2.fastq.gz
│   └── ...
├── Aligned/                    # a home for your aligned data
├── Reports/                    # QC reports, etc files
├── config.tsv                  # a table of samples (rows) x features (cols)
├── cluster.yaml                # a yaml file of cluster parameters for jobs in the Snakefile
├── README.md                   # description of the folder, data contents
├── setup.log                   # a log of what operations were performed with `bjt org`
└── Snakefile                   # Snakemake workflow file
```
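
The layout above could be produced with the standard library alone, roughly as sketched below. This is not how `bjt org` is implemented: the `organize` function is hypothetical, it assumes the FASTQ files sit at the top level of the run folder, it creates the template files empty, and it skips writing `setup.log`.

```rust
use std::fs;
use std::io;
use std::path::Path;

/// Arrange a run folder into the layout shown above (hypothetical helper).
fn organize(run_dir: &Path) -> io::Result<()> {
    // Snapshot the folder contents before creating anything new.
    let entries: Vec<fs::DirEntry> = fs::read_dir(run_dir)?.collect::<io::Result<Vec<_>>>()?;

    for sub in ["FASTQs", "Aligned", "Reports"].iter() {
        fs::create_dir_all(run_dir.join(sub))?;
    }

    // Move top-level FASTQ files into FASTQs/.
    let fastq_dir = run_dir.join("FASTQs");
    for entry in entries {
        let path = entry.path();
        let is_fastq = path
            .file_name()
            .and_then(|n| n.to_str())
            .map_or(false, |n| n.ends_with(".fastq.gz") || n.ends_with(".fastq"));
        if path.is_file() && is_fastq {
            // file_name() is Some here because is_fastq was derived from it.
            let dest = fastq_dir.join(path.file_name().unwrap());
            fs::rename(&path, dest)?;
        }
    }

    // Create empty placeholder files, leaving any existing ones untouched.
    for file in ["config.tsv", "cluster.yaml", "README.md", "Snakefile"].iter() {
        let p = run_dir.join(file);
        if !p.exists() {
            fs::File::create(p)?;
        }
    }
    Ok(())
}
```

Because `create_dir_all` succeeds on existing directories and existing template files are left alone, running the sketch twice on the same folder is harmless.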
