# `eris` 🧬🐍✨

[![PyPI version](https://img.shields.io/pypi/v/pyeris.svg)](https://pypi.org/project/pyeris/)
[![Python versions](https://img.shields.io/pypi/pyversions/pyeris.svg)](https://pypi.org/project/pyeris/)
[![Documentation](https://img.shields.io/badge/docs-GitHub_Pages-blue.svg)](https://tomdstanton.github.io/eris/)
[![License](https://img.shields.io/github/license/tomdstanton/eris.svg)](LICENSE)

โ‹†โบโ€งโ‚Šโ˜ฝโ—ฏโ˜พโ‚Šโ€งโบโ‹† **Graph-aware contextual annotation of targeted genomic features.** โ‹†โบโ€งโ‚Šโ˜ฝโ—ฏโ˜พโ‚Šโ€งโบโ‹†

`eris` is a bioinformatics pipeline that resolves and annotates shattered genomic
features (such as mobile genetic elements or structural variants) directly from assembly graphs.
Using a dynamic "Anchor-and-Traverse" depth-first search through de Bruijn or string graphs (GFA),
`eris` stitches together split alignments across multiple micro-contigs, bridging sequence
bubbles and gaps.

It uses fractional read-depth flow to distinguish dominant wild-type structures from rare
sub-clonal insertions, and reports evolutionary context including upstream flanks, downstream flanks, and
trapped passenger genes.
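
To make the traversal idea concrete, here is a minimal sketch of a bounded depth-first search between two anchor segments in a GFA1 graph. It is not the `eris` implementation: the function names `parse_gfa_links` and `paths_between`, the `max_edges` limit, and the toy bubble graph are all assumptions made for illustration.

```python
from collections import defaultdict

FLIP = {"+": "-", "-": "+"}


def parse_gfa_links(gfa_text):
    """Build an oriented adjacency map from GFA1 L (link) lines."""
    graph = defaultdict(set)
    for line in gfa_text.splitlines():
        fields = line.split("\t")
        if fields[0] != "L" or len(fields) < 5:
            continue
        _, src, src_strand, dst, dst_strand = fields[:5]
        graph[(src, src_strand)].add((dst, dst_strand))
        # Each link also implies the reverse-complement link.
        graph[(dst, FLIP[dst_strand])].add((src, FLIP[src_strand]))
    return graph


def paths_between(graph, start, goal, max_edges=3):
    """Depth-first search for simple paths from start to goal
    that cross at most max_edges links (hypothetical limit)."""
    found = []
    stack = [(start, [start])]
    while stack:
        node, path = stack.pop()
        if node == goal:
            found.append(path)
            continue
        if len(path) - 1 >= max_edges:
            continue  # edge budget spent on this branch
        for nxt in sorted(graph.get(node, ())):
            if nxt not in path:  # keep paths simple (no revisits)
                stack.append((nxt, path + [nxt]))
    return found


# Toy bubble: a -> (b | c) -> d
records = [
    ("S", "a", "ACGT"),
    ("S", "b", "GG"),
    ("S", "c", "TT"),
    ("S", "d", "ACCA"),
    ("L", "a", "+", "b", "+", "0M"),
    ("L", "a", "+", "c", "+", "0M"),
    ("L", "b", "+", "d", "+", "0M"),
    ("L", "c", "+", "d", "+", "0M"),
]
gfa_text = "\n".join("\t".join(rec) for rec in records)
graph = parse_gfa_links(gfa_text)

for path in paths_between(graph, ("a", "+"), ("d", "+")):
    print(" -> ".join(name + strand for name, strand in path))
```

Both arms of the bubble are returned as separate candidate paths; choosing between them is where depth information would come in.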

---

## 🚀 Features

* **Graph-Aware Traversal:** Overcomes aligner limitations by traversing assembly graph topology to bridge unaligned micro-contigs and resolve shattered targets.
* **Sub-Clonal Resolution:** Integrates graph read-depth (`dp`/`rd` tags) to calculate fractional copy number, distinguishing low-frequency variant bubbles from dominant paths.
* **Contextual Annotation:** Sweeps across stitched paths to identify internal passenger genes (e.g., AMR genes trapped inside transposons) and flanking genomic context.
* **High Performance:** Built for speed with a hybrid architecture using `numpy`, `numba`, and `mappy` (Minimap2).
* **Standardized Outputs:** Generates detailed tabular reports (TSV), locus-specific FASTAs, and standard annotations (`pyfgs`-powered GFF3/FAA).
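
As a rough illustration of the sub-clonal resolution idea, the sketch below reads `dp`/`rd` tags from GFA segment lines and scales each depth by the median. How `eris` actually computes fractional copy number is not described here. The median baseline, the function names, and the toy depths are assumptions.

```python
from statistics import median


def segment_depths(gfa_text):
    """Return {segment_name: depth} from `dp` or `rd` tags on GFA S lines."""
    depths = {}
    for line in gfa_text.splitlines():
        fields = line.split("\t")
        if fields[0] != "S":
            continue
        for tag in fields[3:]:  # optional tags follow the sequence column
            parts = tag.split(":", 2)
            if len(parts) == 3 and parts[0] in ("dp", "rd"):
                depths[fields[1]] = float(parts[2])
    return depths


def fractional_depths(depths):
    """Scale each depth by the median depth.

    Assumption: the median approximates the single-copy backbone depth.
    """
    baseline = median(depths.values())
    return {seg: depth / baseline for seg, depth in depths.items()}


# Toy graph: segment b is a low-depth insertion, c the dominant alternative.
records = [
    ("S", "a", "ACGT", "dp:f:40.0"),
    ("S", "b", "GG", "dp:f:4.0"),
    ("S", "c", "TT", "dp:f:36.0"),
    ("S", "d", "ACCA", "rd:i:40"),
    ("S", "e", "TTGA", "dp:f:40.0"),
]
gfa_text = "\n".join("\t".join(rec) for rec in records)

fractions = fractional_depths(segment_depths(gfa_text))
for seg, frac in sorted(fractions.items()):
    print(f"{seg}\t{frac:.2f}")
```

In this toy case segment `b` carries about a tenth of the backbone depth, which is the kind of signal that separates a rare insertion from the dominant path.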

## 📦 Installation

`eris` requires **Python 3.11 or later**.

Install directly from PyPI:
```bash
pip install pyeris
```

### Dependencies

`eris` relies on the following core libraries:

* `numpy` (>=2.4)
* `numba` (>=0.65)
* `mappy` (>=2.3)
* `pyfgs` (>=0.0.1)
* `biopython` (>=1.87)

## ๐Ÿ› ๏ธ Usage

Installing the package provides the `eris` command-line interface:

```bash
eris -i assembly.gfa -d targets.fasta -o results/sample_A
```
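
For several assemblies, the same command can be driven from Python. The sketch below only builds argument lists from the `-i`, `-d`, and `-o` flags shown above. The helper names, the directory layout, and the choice of the file stem as output prefix are assumptions, not part of `eris`.

```python
import subprocess
from pathlib import Path


def eris_command(genome, targets, outprefix):
    """Build the argument list for one `eris` run using the documented flags."""
    return ["eris", "-i", str(genome), "-d", str(targets), "-o", str(outprefix)]


def batch_commands(assembly_dir, targets, results_dir):
    """Yield one command per *.gfa file, using the file stem as output prefix."""
    for gfa in sorted(Path(assembly_dir).glob("*.gfa")):
        yield eris_command(gfa, targets, Path(results_dir) / gfa.stem)


def run_batch(assembly_dir, targets, results_dir):
    """Run each command in turn; raises CalledProcessError on a failed run."""
    Path(results_dir).mkdir(parents=True, exist_ok=True)
    for cmd in batch_commands(assembly_dir, targets, results_dir):
        subprocess.run(cmd, check=True)


# Example (hypothetical paths):
# run_batch("assemblies", "targets.fasta", "results")
```

Passing an argument list to `subprocess.run` avoids shell quoting problems with unusual file names.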

### Basic Arguments
* `-i`, `--genome`: Path to a genome in FASTA or GFA format (can be compressed). GFA is strongly recommended so that topological stitching can be used.
* `-d`, `--targets`: Path to target nucleotide features (FASTA or pre-indexed `.mmi`).
* `-o`, `--outprefix`: Prefix for all generated output files.
* `-f`, `--feature-type`: The type of feature to annotate (default: `CDS`).
* `--hops`: Maximum number of contextual genes to sweep upstream/downstream (default: `3`).
* `--tolerance`: Distance tolerance in base pairs for merging clustered alignments (default: `0`).
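
To illustrate what a distance tolerance means when merging clustered alignments, here is a small sketch. The exact rule `eris` applies is not specified here. This version assumes half-open `(start, end)` coordinates on a single contig and strand, and the function name is hypothetical.

```python
def merge_alignments(intervals, tolerance=0):
    """Merge (start, end) intervals whose gap is at most `tolerance` bp."""
    merged = []
    for start, end in sorted(intervals):
        if merged and start - merged[-1][1] <= tolerance:
            # Close enough to the previous cluster: extend it.
            merged[-1] = (merged[-1][0], max(merged[-1][1], end))
        else:
            merged.append((start, end))
    return merged


hits = [(0, 100), (105, 250), (400, 500)]
print(merge_alignments(hits, tolerance=0))
print(merge_alignments(hits, tolerance=10))
```

With a tolerance of `0` the 5 bp gap keeps the first two hits apart; a tolerance of `10` joins them into one cluster.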

### Outputs
Depending on the arguments provided, `eris` writes:

1. **`{prefix}_report.tsv`**: A detailed report of every resolved locus, including targets, context (INSIDE, UPSTREAM, DOWNSTREAM), biological effects, topological hops, and fractional depths.
2. **`{prefix}_loci.fasta`**: The stitched nucleotide sequences for each assembled structural variant.
3. **`{prefix}_assembly.gff`**: GFF3 annotation of the global features.
4. **`{prefix}_proteins.faa`**: Amino acid FASTA of the global features.
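
The report can be loaded with the standard library for quick summaries. The sketch below is generic because the column names of `{prefix}_report.tsv` are not listed here. The `context` column used in the usage comment is hypothetical, so check the header of a real report first.

```python
import csv
from collections import Counter


def load_report(path):
    """Read a tab-separated report into a list of dicts keyed by header."""
    with open(path, newline="") as handle:
        return list(csv.DictReader(handle, delimiter="\t"))


def count_by(rows, column):
    """Count rows per distinct value of one column."""
    return Counter(row[column] for row in rows)


# Example (hypothetical file and column name):
# rows = load_report("results/sample_A_report.tsv")
# print(rows[0].keys())             # inspect the real column names
# print(count_by(rows, "context"))  # e.g. INSIDE / UPSTREAM / DOWNSTREAM
```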

## 📚 Documentation

For detailed guides, API reference, and advanced configuration, visit the [eris Documentation](https://tomdstanton.github.io/eris/).

## 💻 Development

`eris` uses [`hatch`](https://hatch.pypa.io/) for build and environment management.

To set up a local development environment ([uv](https://docs.astral.sh/uv/getting-started/installation/) needs to be installed):
```bash
# Clone the repository
git clone https://github.com/tomdstanton/eris.git
cd eris
make dev
```

### Running Tests & Linting

`eris` enforces rigorous type checking and linting.

```bash
# Run tests
pytest

# Run static type checking
mypy

# Run linting and formatting
ruff check
ruff format
```

## ๐Ÿ“ License

This project is licensed under the terms of the LICENSE file included in the repository.

## ๐Ÿค Authors

* **Tom Stanton** - [tomdstanton@gmail.com](mailto:tomdstanton@gmail.com)

---

![eris](https://static.wikia.nocookie.net/dreamworks/images/6/6f/Sinbad-disneyscreencaps.com-1100.jpg/revision/latest?cb=20240311205845)