https://github.com/althonos/pyaragorn
Cython bindings and Python interface to ARAGORN, a (t|mt|tm)RNA gene finder.
aragorn bioinformatics gene-finding genome mtrna python python-library rna tmrna trna
- Host: GitHub
- URL: https://github.com/althonos/pyaragorn
- Owner: althonos
- License: gpl-3.0
- Created: 2025-05-26T22:37:58.000Z (5 months ago)
- Default Branch: main
- Last Pushed: 2025-06-02T11:23:17.000Z (4 months ago)
- Last Synced: 2025-06-03T11:15:03.495Z (4 months ago)
- Topics: aragorn, bioinformatics, gene-finding, genome, mtrna, python, python-library, rna, tmrna, trna
- Language: Cython
- Homepage:
- Size: 505 KB
- Stars: 2
- Watchers: 0
- Forks: 0
- Open Issues: 0
Metadata Files:
- Readme: README.md
- Changelog: CHANGELOG.md
- Contributing: CONTRIBUTING.md
- License: COPYING
# 👑 PyARAGORN
*Cython bindings and Python interface to [ARAGORN](https://www.trna.se/), a (t|mt|tm)RNA gene finder*.
## 🗺️ Overview
[ARAGORN](https://trna.se) is a fast method developed
by Dean Laslett & Björn Canbäck[\[1\]](#ref1) to identify tRNA and tmRNA
genes in genomic sequences using heuristics to detect potential high-scoring
stem-loop structures. The complementary method ARWEN, developed by the same
authors[\[2\]](#ref2) to support the detection of metazoan mitochondrial
RNA (mtRNA) genes, was later integrated into ARAGORN.

`pyaragorn` is a Python module that provides bindings to ARAGORN and ARWEN
using [Cython](https://cython.org/). It directly interacts with the
ARAGORN internals, which has the following advantages:

- **single dependency**: PyARAGORN is distributed as a Python package, so you
can add it as a dependency to your project, and stop worrying about the
ARAGORN binary being present on the end-user machine.
- **no intermediate files**: Everything happens in memory, in a Python object
you fully control, so you don't have to invoke the ARAGORN CLI using a
sub-process and temporary files. Sequences can be passed directly as
strings, bytes, or any buffer objects, which avoids the overhead of
formatting your input to FASTA for ARAGORN.
- **no output parsing**: The detected RNA genes are returned as Python
objects with transparent attributes, which facilitate handling the output
of ARAGORN compared to parsing the output tables.
- **same results**: PyARAGORN is tested to ensure it produces the same results
as ARAGORN `v1.2.41`, the latest release.

### 📋 Features
PyARAGORN currently supports the following features from the ARAGORN
command line:

- [x] tRNA gene detection (`aragorn -t`).
- [x] tmRNA gene detection (`aragorn -m`).
- [ ] mtRNA gene detection (`aragorn -mt`).
- [x] Reporting of batch mode metadata (`aragorn -w`).
- [x] Alternative genetic code (`aragorn -gc`).
- [ ] Custom genetic code (`aragorn -gc,BBB=`).
- [x] Circular and linear topologies (`aragorn -c` | `aragorn -l`).
- [ ] Intron length configuration (`aragorn -i`).
- [ ] Scoring threshold configuration (`aragorn -ps`).
- [x] Sequence extraction from RNA gene (`aragorn -seq`).
- [ ] Secondary structure extraction from each gene (`aragorn -br`).

### 🧶 Thread-safety
`pyaragorn.RNAFinder` instances are thread-safe. In addition, the `find_rna`
method is re-entrant. This means you can parameterize an `RNAFinder` instance
once, and then use a pool to process sequences in parallel:

```python
import multiprocessing.pool
import pyaragorn

# `sequences` is an iterable of sequences (strings, bytes or buffer objects)
rna_finder = pyaragorn.RNAFinder()
with multiprocessing.pool.ThreadPool() as pool:
    predictions = pool.map(rna_finder.find_rna, sequences)
```

## 🔧 Installing
This project is supported on Python 3.7 and later.
PyARAGORN can be installed directly from [PyPI](https://pypi.org/project/pyaragorn/),
which hosts some pre-built wheels for the x86-64 architecture (Linux/macOS/Windows)
and the AArch64 architecture (Linux/macOS), as well as the code required to compile
from source with Cython:
```console
$ pip install pyaragorn
```

## 💡 Example
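As a warm-up that needs no input file, here is a minimal sketch of the calling pattern for sequences already held in memory. It relies only on what this README states: `find_rna` accepts a sequence as a string, bytes or buffer object, and each returned gene exposes a `type` attribute. The helper name `count_gene_types` is hypothetical and not part of PyARAGORN; the finder is passed in as an argument, so any object with a compatible `find_rna` method would work.

```python
from collections import Counter


def count_gene_types(rna_finder, sequences):
    """Tally detected genes by their `type` attribute across several sequences.

    Assumption: `rna_finder` behaves like `pyaragorn.RNAFinder`, i.e. its
    `find_rna` method takes one sequence (str, bytes or buffer object) and
    returns an iterable of gene objects with a `type` attribute.
    """
    counts = Counter()
    for sequence in sequences:
        # One call per sequence; nothing is written to disk.
        for gene in rna_finder.find_rna(sequence):
            counts[gene.type] += 1
    return counts
```

With PyARAGORN installed, passing `pyaragorn.RNAFinder()` as `rna_finder` should give a per-type tally of the genes found across all the given sequences.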
Let's load a sequence from a
[GenBank](http://www.insdc.org/files/feature_table.html) file,
use an `RNAFinder` to find all the tRNA genes it contains,
and print the anticodon and corresponding amino acid of each detected
tRNA.

### 🔬 [Biopython](https://github.com/biopython/biopython)
The following uses the `RNAFinder` in its default operation mode, which detects
both tRNA and tmRNA genes, but with the bacterial genetic code (translation table 11):

```python
import Bio.SeqIO
import pyaragorn

record = Bio.SeqIO.read("sequence.gbk", "genbank")
rna_finder = pyaragorn.RNAFinder(translation_table=11)
genes = rna_finder.find_rna(bytes(record.seq))

for gene in genes:
    if gene.type == "tRNA":
        print(
            gene.amino_acid,   # 3-letter code
            gene.begin,        # 1-based, inclusive
            gene.end,
            gene.strand,       # +1 or -1 for direct and reverse strand
            gene.energy,
            gene.anticodon
        )
```

*On older versions of Biopython (before 1.79) you will need to use
`record.seq.encode()` instead of `bytes(record.seq)`*.

## 💭 Feedback
### ⚠️ Issue Tracker
Found a bug? Have an enhancement request? Head over to the [GitHub issue tracker](https://github.com/althonos/pyaragorn/issues)
if you need to report or ask something. If you are filing a bug report,
please include as much information as you can about the issue, and try to
recreate the same bug in a simple, easily reproducible situation.

## 📋 Changelog
This project adheres to [Semantic Versioning](http://semver.org/spec/v2.0.0.html)
and provides a [changelog](https://github.com/althonos/pyaragorn/blob/main/CHANGELOG.md)
in the [Keep a Changelog](http://keepachangelog.com/en/1.0.0/) format.

## ⚖️ License
This library is provided under the [GNU General Public License v3.0 or later](https://choosealicense.com/licenses/gpl-3.0/).
ARAGORN and ARWEN were developed by Dean Laslett and are distributed under the
terms of the GPLv3 or later as well. See `vendor/aragorn` for more information.

*This project is in no way affiliated, sponsored, or otherwise endorsed
by the ARAGORN authors. It was developed by
[Martin Larralde](https://github.com/althonos/) during his PhD project
at the [Leiden University Medical Center](https://www.lumc.nl/en/) in
the [Zeller Lab](https://zellerlab.org).*

## 📚 References
- \[1\] Laslett, Dean, and Björn Canbäck. “ARAGORN, a program to detect tRNA genes and tmRNA genes in nucleotide sequences.” Nucleic acids research vol. 32,1 11-6. 2 Jan. 2004, [doi:10.1093/nar/gkh152](https://doi.org/10.1093/nar/gkh152)
- \[2\] Laslett, Dean, and Björn Canbäck. “ARWEN: a program to detect tRNA genes in metazoan mitochondrial nucleotide sequences.” Bioinformatics (Oxford, England) vol. 24,2 (2008): 172-5. [doi:10.1093/bioinformatics/btm573](https://doi.org/10.1093/bioinformatics/btm573)