https://github.com/feldroop/floxer
FM-index longread PEX-based aligner
- Host: GitHub
- URL: https://github.com/feldroop/floxer
- Owner: feldroop
- License: mit
- Created: 2024-02-05T12:20:35.000Z (11 months ago)
- Default Branch: main
- Last Pushed: 2024-04-14T11:08:14.000Z (9 months ago)
- Last Synced: 2024-04-14T13:40:59.008Z (9 months ago)
- Language: C++
- Size: 166 KB
- Stars: 4
- Watchers: 2
- Forks: 0
- Open Issues: 0
Metadata Files:
- Readme: README.md
- License: LICENSE
# floxer: FM-index longread PEX-based aligner
An *exact*\* longread aligner that applies the following techniques so that it is not orders of magnitude slower than more approximate tools like [minimap2](https://github.com/lh3/minimap2):
* 2-3 error approximate FM-Index search using optimal search schemes
* heuristic anchor selection\*\*
* PEX hierarchical verification guided by a novel PEX tree generation strategy
* parallel and vectorized pairwise alignment implementation

\* *Exactness* here means adhering to a specific formal definition. It is of course impossible to exactly solve the biological read mapping problem. This tool is guaranteed to find a similar (representative) alignment for every linear alignment that matches the query with at most a given error ratio in edit distance\*\*. Consequently, resolving large indels and structural variants is currently out of scope of this project.

\*\* The exactness property does not hold in highly repetitive regions where seeds produce many anchors/hits/matches. Here the heuristic anchor selection is used to identify possibly non-repetitive anchors.
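The exactness definition above can be illustrated with a small C++ sketch. It is not taken from floxer's source: the function names are hypothetical, rounding the error budget down is an assumption, and the plain global edit distance stands in for floxer's vectorized alignment. The sketch shows how an error ratio turns into an error budget, how the pigeonhole argument behind PEX bounds the number of seeds needed when each seed may contain a few errors, and how a candidate alignment can be checked against the budget.

```cpp
#include <algorithm>
#include <cstddef>
#include <numeric>
#include <string_view>
#include <vector>

// Error budget of a query: the largest number of edits still within the
// error ratio. Rounding down is an assumption of this sketch.
constexpr std::size_t max_errors(std::size_t query_length, double error_ratio) {
    return static_cast<std::size_t>(static_cast<double>(query_length) * error_ratio);
}

// Pigeonhole bound used by PEX-style filtering: if a query matches with at
// most `query_errors` errors and is split into j pieces, at least one piece
// matches with at most floor(query_errors / j) errors. This returns the
// smallest j for which that bound is <= seed_errors.
constexpr std::size_t num_seeds(std::size_t query_errors, std::size_t seed_errors) {
    // Integer form of ceil((query_errors + 1) / (seed_errors + 1)).
    return (query_errors + seed_errors + 1) / (seed_errors + 1);
}

// Classic dynamic-programming edit distance (Levenshtein), one row at a time.
std::size_t edit_distance(std::string_view a, std::string_view b) {
    std::vector<std::size_t> row(b.size() + 1);
    std::iota(row.begin(), row.end(), std::size_t{0}); // distance from empty prefix of a
    for (std::size_t i = 1; i <= a.size(); ++i) {
        std::size_t diagonal = row[0]; // value of the cell up and to the left
        row[0] = i;
        for (std::size_t j = 1; j <= b.size(); ++j) {
            std::size_t const above = row[j];
            std::size_t const substitution = diagonal + (a[i - 1] != b[j - 1] ? 1 : 0);
            row[j] = std::min({substitution, above + 1, row[j - 1] + 1});
            diagonal = above;
        }
    }
    return row[b.size()];
}

// True if the reference segment matches the query within the error ratio.
bool within_error_ratio(std::string_view query, std::string_view reference_segment,
                        double error_ratio) {
    return edit_distance(query, reference_segment) <= max_errors(query.size(), error_ratio);
}
```

Under these assumptions, a query with an error budget of 70 searched with seeds that tolerate 2 errors each needs `num_seeds(70, 2)`, that is 24, seeds, whereas exact seeds (0 errors) would need 71.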
This is an experimental research prototype (for my master's thesis) and is currently not competitive with state-of-the-art tools like minimap2 in most regards. In addition to the limitations above, it is much slower and therefore not well suited to most applications with large amounts of data.
## Installation on Linux
Requires a C++20-capable compiler and CMake.
```
git clone --recurse-submodules https://github.com/feldroop/floxer
mkdir floxer/build && cd floxer/build
cmake .. -DCMAKE_BUILD_TYPE:STRING=Release
make
```
Execute the following command inside the build directory to run the tests:
```
make check
```
## Usage
Basic usage:
```
./floxer --reference hg38.fasta --query reads.fastq --error-probability 0.07 --output mapped_reads.bam
```
For a list and descriptions of the basic command line options, run:
```
./floxer --help
```
For all available options, including ones intended exclusively for research and evaluation, run:
```
./floxer --advanced-help
```