https://github.com/biosails/pheniqs
Fast and accurate sequence demultiplexing
- Host: GitHub
- URL: https://github.com/biosails/pheniqs
- Owner: biosails
- License: other
- Created: 2017-04-05T17:33:46.000Z (about 9 years ago)
- Default Branch: master
- Last Pushed: 2025-06-12T14:38:33.000Z (11 months ago)
- Last Synced: 2025-06-12T15:43:02.367Z (11 months ago)
- Topics: autocomplete, bam, barcode, classifiers, cram, demultiplexing, fastq, multiplexing, robust, sam, sequencing, tag, tagging
- Language: C++
- Homepage:
- Size: 72.2 MB
- Stars: 28
- Watchers: 1
- Forks: 4
- Open Issues: 7
Metadata Files:
- Readme: README.md
- Changelog: CHANGELOG
- License: LICENSE
# Pheniqs
Pheniqs is a flexible generic barcode classifier for high-throughput next-gen sequencing that caters to a wide variety of experimental designs and has been designed for efficient data processing.
Questions? Email *lior [dot] galanti [ at sign ] nyu.edu* or just open a ticket.
Citing Pheniqs: [Pheniqs 2.0: accurate, high-performance Bayesian decoding and confidence estimation for combinatorial barcode indexing](https://doi.org/10.1186/s12859-021-04267-5)
Please visit the [Pheniqs website](http://biosails.github.io/pheniqs) for more information.
You might also want to check out the [intro talk given by Lior Galanti on April 29, 2021](https://learn.gencore.bio.nyu.edu/pheniqs/) for the NYU gencore.
### Powerful and intuitive syntax
- Classifies standard barcode types: Sample, Cellular, and Molecular Index
- Directly writes barcodes to standard or custom BAM fields
- Addresses index tags in arbitrary locations along reads
- Easily accommodates custom barcode types, eliminating the need for pre- or post-processing
- Easily handles any number of combinatorial barcode tags
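
As a rough illustration of addressing a tag at an arbitrary location in a read, the sketch below slices a token out of a multi-segment read using a colon-separated `segment:start:end` pattern modeled on Python slicing. The function name `extract_token`, the example reads, and the pattern notation used here are assumptions made for illustration; consult the Pheniqs documentation for the actual configuration syntax.

```python
def extract_token(segments, pattern):
    """Return the bases selected by a 'segment:start:end' pattern.

    Hypothetical helper: start and end follow Python slice rules,
    may be negative, and may be left empty to mean "open ended".
    """
    seg_idx, start, end = pattern.split(":")
    segment = segments[int(seg_idx)]
    lo = int(start) if start else None
    hi = int(end) if end else None
    return segment[lo:hi]


# A made-up read with two segments: a biological read and an index read.
read = ["ACGTACGTAA", "GGCCTTAA"]

sample_barcode = extract_token(read, "1:0:4")   # first 4 bases of segment 1
trailing_umi = extract_token(read, "0:-2:")     # last 2 bases of segment 0
```

Because every tag is just a slice of some segment, the same mechanism covers sample, cellular, and molecular barcodes without a separate pre-processing step.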
### Noise and quality aware probabilistic classifier
- [Increased accuracy](https://biosails.github.io/pheniqs/pamld) over standard edit distance methods
- Reports classification error probabilities in SAM auxiliary tags
- Modular design allows addition of new classifiers
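
To illustrate why a quality-aware classifier can do better than edit distance alone, here is a minimal sketch of posterior decoding that weights each mismatch by its Phred error probability. It is a simplified stand-in, not the Pheniqs implementation: the function names, the uniform barcode prior, the `noise_prior` parameter, and the uniform random-sequence noise model are all assumptions made for this example.

```python
from math import prod


def phred_to_error(q):
    """Convert a Phred quality score to a base-call error probability."""
    return 10 ** (-q / 10)


def likelihood(observed, qualities, barcode):
    """P(observed | barcode), assuming independent per-base errors."""
    return prod(
        (1 - phred_to_error(q)) if o == b else phred_to_error(q) / 3
        for o, q, b in zip(observed, qualities, barcode)
    )


def classify(observed, qualities, barcodes, noise_prior=0.01):
    """Return (best barcode name, estimated probability of misclassification).

    Assumptions: equal priors over barcodes, and "noise" reads are
    uniformly random sequences with prior probability noise_prior.
    """
    prior = (1 - noise_prior) / len(barcodes)
    joint = {
        name: prior * likelihood(observed, qualities, seq)
        for name, seq in barcodes.items()
    }
    noise = noise_prior * 0.25 ** len(observed)
    total = sum(joint.values()) + noise
    best = max(joint, key=joint.get)
    return best, 1 - joint[best] / total


barcodes = {"A": "ACGT", "B": "TGCA"}  # made-up barcode set
_, err_high_q = classify("ACGA", [40, 40, 40, 40], barcodes)
_, err_low_q = classify("ACGA", [40, 40, 40, 5], barcodes)
```

Under this simplified model, a mismatch at a low-quality base costs far less than the same mismatch at a high-quality base, so `err_low_q` comes out much smaller than `err_high_q`. A plain edit distance would treat both reads identically.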
### Robust engineering
- Multithreaded C++ implementation optimized for speed
- POSIX standard stream integration
- Directly interfaces with the low-level HTSlib C API
- Performance scales linearly with the number of available processing cores
### Easy to install or build
- Stable releases available from Bioconda
- [Custom package manager](https://github.com/biosails/pheniqs-build-api) can build dependencies and binaries from scratch
- Easily installed on clusters or cloud without elevated permissions
- Portable compiled binaries available
- Available in a Docker container
### Easy to use
- Simple command line syntax with autocomplete
- Reusable, inheritance-enabled, [JSON](https://en.wikipedia.org/wiki/JSON) encoded configuration
- Preconfigured barcode [library sets](https://biosails.github.io/pheniqs/recipe)
- Reads and writes multiple file formats: FASTQ, SAM/BAM/CRAM
- Fast standalone file format interconversion
- Helper scripts to assist in configuration file bootstrapping
- Facilitates more robust and reproducible downstream analysis
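
One plausible way to picture configuration inheritance is a recursive merge in which a child document overrides only the keys it sets and inherits everything else. The sketch below assumes that model; the `merge` helper and every key name in it are hypothetical and are not taken from the Pheniqs configuration schema.

```python
import json


def merge(base, override):
    """Recursively overlay `override` on `base` without mutating either."""
    merged = dict(base)
    for key, value in override.items():
        if isinstance(value, dict) and isinstance(merged.get(key), dict):
            merged[key] = merge(merged[key], value)
        else:
            merged[key] = value
    return merged


# Hypothetical shared settings, reused across runs.
base = json.loads('{"threads": 8, "decoder": {"kind": "bayesian", "threshold": 0.95}}')

# Hypothetical per-run file that changes a single nested value.
child = json.loads('{"decoder": {"threshold": 0.99}, "input": ["run1.fastq.gz"]}')

effective = merge(base, child)
```

The child only states what differs, which keeps per-run files short and makes the shared defaults easy to audit.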
Pheniqs runs on all modern POSIX systems and provides an easy to learn command line interface with autocomplete and an extensible reusable configuration syntax. Pheniqs is an ideal utility to pre- and post-process sequence reads for other bioinformatics tools, and it may also be used simply to rapidly and efficiently interconvert a variety of standard sequence file formats without invoking any of its barcode processing features.
For more advanced users and sequencing core managers, we provide detailed [build instructions](https://biosails.github.io/pheniqs/install) and a [custom package manager](https://github.com/biosails/pheniqs-build-api) to easily build portable, statically linked Pheniqs binaries for deployment on computing clusters. Developers can find code examples and API documentation that enable them to extend Pheniqs with new classification algorithms and take advantage of the optimized multithreaded pipeline.
Pheniqs is open source and free for academic use under the terms of the [NYU license agreement](https://github.com/biosails/pheniqs/blob/master/LICENSE).