https://github.com/sky-alin/bioforma
🧬 Rust implementations of bioinformatics data structures and algorithms for Python
bioinformatics pyo3
Last synced: 6 months ago
- Host: GitHub
- URL: https://github.com/sky-alin/bioforma
- Owner: SKY-ALIN
- License: mit
- Created: 2023-03-12T01:39:49.000Z (almost 3 years ago)
- Default Branch: main
- Last Pushed: 2024-09-16T13:39:50.000Z (over 1 year ago)
- Last Synced: 2025-07-01T04:05:58.382Z (6 months ago)
- Topics: bioinformatics, pyo3
- Language: Rust
- Homepage:
- Size: 93.8 KB
- Stars: 5
- Watchers: 2
- Forks: 0
- Open Issues: 5
Metadata Files:
- Readme: README.md
- License: LICENSE
- Roadmap: ROADMAP.md
README
# 🧬 bioforma
Rust implementations of bioinformatics data structures and algorithms for Python.
A PyO3-based wrapper around the rust-bio and rust-bio-types crates.

## Installation
Install using `pip install bioforma==0.1.0a0`.
Install from source using `make build && make install`.
## Examples
### One-way Open Reading Frame (ORF) Finder
```python
from bioforma.seq_analysis.orf import Finder
f = Finder(start=[b'ATG'], stop=[b'TGA', b'TAG', b'TAA'], min_len=5)
for orf in f.find_all(b'ATGGGGATGGGGGGATGGAAAAATAAGTAG'):
    print(repr(orf))
# Output:
#
#
#
```
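To illustrate what a one-way ORF search does, here is a minimal pure-Python sketch. It is not bioforma's or rust-bio's implementation: the function name `find_orfs`, the `(start, end, frame)` tuple layout, and the rule that an ORF must span at least `min_len` nucleotides including the stop codon are all assumptions made for this example, so its results may differ from what `Finder.find_all` reports.

```python
def find_orfs(seq: bytes, starts, stops, min_len: int):
    """Scan the three forward reading frames of `seq` for ORFs.

    Hypothetical helper for illustration only. Returns a list of
    (start, end, frame) tuples, where `end` is exclusive and includes
    the stop codon.
    """
    starts, stops = set(starts), set(stops)
    orfs = []
    for frame in range(3):
        orf_start = None  # position of the first start codon seen in this frame
        for i in range(frame, len(seq) - 2, 3):
            codon = seq[i:i + 3]
            if orf_start is None:
                if codon in starts:
                    orf_start = i
            elif codon in stops:
                end = i + 3
                # Assumed length rule: keep ORFs of at least min_len nucleotides.
                if end - orf_start >= min_len:
                    orfs.append((orf_start, end, frame))
                orf_start = None  # look for the next ORF in this frame
    return orfs


result = find_orfs(
    b'ATGGGGATGGGGGGATGGAAAAATAAGTAG',
    starts=[b'ATG'],
    stops=[b'TGA', b'TAG', b'TAA'],
    min_len=5,
)
print(result)
```

Each frame is scanned once, so the sketch runs in time linear in the sequence length. Only the forward strand is searched, which is what "one-way" refers to here.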
### Pairwise Alignment
Calculate alignments with a generalized variant of the Smith-Waterman algorithm.
```python
from bioforma.alignment import Alignment, Scoring, PairwiseAligner
x = b"ACCGTTGACGC"
y = b"CCGGCA"
scoring = Scoring.from_scores(gap_open=-5, gap_extend=-1, match_score=1, mismatch_score=-1)
aligner = PairwiseAligner(scoring, m=len(x), n=len(y))
alignment: Alignment = aligner.calculate_semiglobal(x, y)
print(alignment.path())
# Output:
# [(1, 1, ), (2, 2, ), (3, 2, ), (4, 2, ),
# (5, 2, ), (6, 2, ), (7, 2, ), (8, 2, ),
# (9, 3, ), (10, 4, ), (11, 5, )]
print(alignment.pretty(x, y, ncol=100))
# Output:
# ACCGTTGACGC
# \|++++++\||
# CC------GGCA
print(alignment.cigar(hard_clip=False))
# Output: 1X1=6I1X2=
```
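The CIGAR string printed above is a run-length encoding of the alignment operations. The following sketch shows one way to decode it in plain Python; `parse_cigar` is a hypothetical helper written for this README, not part of bioforma's API.

```python
import re
from collections import Counter


def parse_cigar(cigar: str) -> list[tuple[int, str]]:
    """Split a CIGAR string into (run_length, operation) pairs.

    Hypothetical helper for illustration; accepts the standard
    operation letters M, I, D, N, S, H, P, = and X.
    """
    if not re.fullmatch(r"(\d+[MIDNSHP=X])+", cigar):
        raise ValueError(f"not a valid CIGAR string: {cigar!r}")
    return [(int(n), op) for n, op in re.findall(r"(\d+)([MIDNSHP=X])", cigar)]


ops = parse_cigar("1X1=6I1X2=")

# Total number of positions covered by each operation type.
totals = Counter()
for run_length, op in ops:
    totals[op] += run_length

print(ops)
print(dict(totals))
```

In the example, the runs add up to 11 columns, matching the 11 symbols in the middle line of the `pretty` output: `=` corresponds to `|`, `X` to `\`, and the six `I` positions to the six `+` columns.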
### Rank Transform
Tools based on transforming the alphabet symbols to their lexicographical ranks.
```python
from bioforma.alphabets import Alphabet, RankTransform
from bioforma.alphabets.dna import make_dna_alphabet
a: Alphabet = make_dna_alphabet()
rt = RankTransform(a)
print(repr(rt))
# Output:
print(rt.transform(b'aAcCgGtT'))
# Output: [4, 0, 5, 1, 6, 2, 7, 3]
print(rt.q_grams(2, b'ACGT'))
# Output: [1, 10, 19]
```
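The numbers in the outputs above can be reproduced with a short pure-Python sketch. It assumes the DNA alphabet is the eight symbols `ACGTacgt` (inferred from the `transform` output, not confirmed by the source), that ranks follow byte order, and that each q-gram packs its ranks into a fixed number of bits per symbol. The helper names `make_ranks`, `transform`, and `q_grams` are hypothetical and only mirror the method names for readability.

```python
def make_ranks(symbols: bytes) -> dict[int, int]:
    """Map each distinct byte to its lexicographical rank."""
    return {sym: rank for rank, sym in enumerate(sorted(set(symbols)))}


def transform(ranks: dict[int, int], text: bytes) -> list[int]:
    """Replace every symbol of `text` with its rank."""
    return [ranks[b] for b in text]


def q_grams(ranks: dict[int, int], q: int, text: bytes) -> list[int]:
    """Encode every window of q symbols as a single integer."""
    # Bits needed to store one rank (3 bits for an 8-symbol alphabet).
    bits = max(1, (len(ranks) - 1).bit_length())
    mask = (1 << (q * bits)) - 1
    codes = []
    code = 0
    for i, rank in enumerate(transform(ranks, text)):
        # Shift in the new rank and drop the symbol that left the window.
        code = ((code << bits) | rank) & mask
        if i >= q - 1:
            codes.append(code)
    return codes


ranks = make_ranks(b'ACGTacgt')  # assumed alphabet
print(transform(ranks, b'aAcCgGtT'))  # [4, 0, 5, 1, 6, 2, 7, 3]
print(q_grams(ranks, 2, b'ACGT'))     # [1, 10, 19]
```

With 3 bits per symbol, the 2-gram `CG` (ranks 1 and 2) becomes `1 * 8 + 2 = 10`, which is how the value 10 in the output arises.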
## Status
⚠️ In the active development phase. ⚠️
See [ROADMAP](https://github.com/SKY-ALIN/bioforma/blob/main/ROADMAP.md) for more.
---
Under [MIT Licence](https://github.com/SKY-ALIN/bioforma/blob/main/LICENSE).