https://github.com/mbhall88/pafpy
A lightweight library for working with PAF (Pairwise mApping Format) files
- Host: GitHub
- URL: https://github.com/mbhall88/pafpy
- Owner: mbhall88
- License: unlicense
- Created: 2020-05-13T08:22:25.000Z (about 6 years ago)
- Default Branch: master
- Last Pushed: 2022-05-03T00:07:33.000Z (about 4 years ago)
- Last Synced: 2025-06-19T18:10:23.676Z (about 1 year ago)
- Topics: alignment, bioinformatics, library, minimap2, paf, pairwise-mapping-format, python
- Language: Python
- Homepage: https://mbh.sh/pafpy/
- Size: 215 KB
- Stars: 31
- Watchers: 2
- Forks: 3
- Open Issues: 0
Metadata Files:
- Readme: README.md
- Changelog: CHANGELOG.md
- Contributing: CONTRIBUTING.md
- License: LICENSE
# pafpy
A lightweight library for working with [PAF][PAF] (Pairwise mApping Format) files.

**Documentation**: [https://mbh.sh/pafpy/][docs]
[TOC]: #
## Table of Contents
- [Install](#install)
  - [PyPi](#pypi)
  - [Conda](#conda)
  - [Locally](#locally)
- [Usage](#usage)
- [Contributing](#contributing)
## Install
### PyPi

```sh
pip install pafpy
```
### Conda

```sh
conda install -c bioconda pafpy
```
### Locally
If you would like to install locally, the recommended way is using [poetry][poetry].
```sh
git clone https://github.com/mbhall88/pafpy.git
cd pafpy
make install
# to check the library is installed run
poetry run python -c "from pafpy import PafRecord;print(str(PafRecord()))"
# you should see an (unmapped) PAF record printed to the terminal
# you can also run the tests if you like
make test-code
```
## Usage
For full usage, please refer to the [documentation][docs]. If there is any functionality
you feel is missing or would make `pafpy` more user-friendly, please raise an issue with
a feature request.
In the basic usage pattern below, we collect the [BLAST identity][blast] of all primary
alignments in our PAF file into a list.
```py
from typing import List
from pafpy import PafFile

path = "path/to/sample.paf"

identities: List[float] = []
with PafFile(path) as paf:
    for record in paf:
        if record.is_primary():
            identity = record.blast_identity()
            identities.append(identity)
```
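To make the quantity being collected more concrete, here is a minimal sketch that computes BLAST identity by hand from a single PAF line, without using `pafpy`. It assumes the definition in the linked post (number of matching bases divided by the number of alignment columns) and the standard PAF column order, where column 10 is the number of residue matches and column 11 is the alignment block length. The helper name `blast_identity_from_line` and the example line are made up for illustration and are not part of `pafpy`.

```python
from typing import List

# A made-up, tab-separated PAF line; the values are illustrative only.
line = "read1\t1000\t100\t900\t+\tchr1\t50000\t2000\t2800\t760\t800\t60\ttp:A:P"


def blast_identity_from_line(paf_line: str) -> float:
    """Hypothetical helper: matches / alignment block length for one PAF line."""
    fields: List[str] = paf_line.rstrip("\n").split("\t")
    matches = int(fields[9])     # column 10: number of residue matches
    block_len = int(fields[10])  # column 11: alignment block length
    if block_len == 0:
        # Avoid dividing by zero for records with no alignment block.
        return 0.0
    return matches / block_len


identity = blast_identity_from_line(line)
print(f"{identity:.2f}")
```

For real data, `record.blast_identity()` from the example above is the more convenient route; this sketch only shows which PAF columns are involved.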
Another use case might be that we want to get the identifiers of all records aligned to
a specific contig, but only keep the alignments where more than 50% of the query (read)
is aligned.
```py
from typing import List
from pafpy import PafFile

path = "path/to/sample.paf"

contig = "chr1"
min_covg = 0.5
identifiers: List[str] = []
with PafFile(path) as paf:
    for record in paf:
        if record.tname == contig and record.query_coverage > min_covg:
            identifiers.append(record.qname)
```
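As a rough illustration of what this filter does, the sketch below applies the same logic to a few made-up PAF lines in plain Python. It assumes that query coverage means the aligned part of the query divided by the query length, that is `(qend - qstart) / qlen` from PAF columns 2 to 4; check the documentation for the exact definition `pafpy` uses. The `query_coverage` function and the example lines are hypothetical.

```python
from typing import List

# Made-up PAF lines (tab-separated); only the first 12 mandatory columns are given.
lines: List[str] = [
    "read1\t1000\t100\t900\t+\tchr1\t50000\t2000\t2800\t760\t800\t60",
    "read2\t1000\t0\t300\t-\tchr1\t50000\t9000\t9300\t290\t300\t60",
    "read3\t2000\t0\t1900\t+\tchr2\t40000\t100\t2000\t1800\t1900\t60",
]


def query_coverage(paf_line: str) -> float:
    """Hypothetical helper: fraction of the query covered by the alignment."""
    fields = paf_line.rstrip("\n").split("\t")
    qlen, qstart, qend = int(fields[1]), int(fields[2]), int(fields[3])
    return (qend - qstart) / qlen if qlen else 0.0


contig = "chr1"
min_covg = 0.5
kept: List[str] = []
for paf_line in lines:
    fields = paf_line.split("\t")
    # Column 1 is the query name, column 6 is the target name.
    if fields[5] == contig and query_coverage(paf_line) > min_covg:
        kept.append(fields[0])

print(kept)
```

Here `read2` maps to `chr1` but covers only 30% of the query, and `read3` maps to a different contig, so only `read1` is kept.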
## Contributing
If you would like to contribute to `pafpy`, check out [`CONTRIBUTING.md`][contribute].
[PAF]: https://github.com/lh3/miniasm/blob/master/PAF.md
[blast]: https://lh3.github.io/2018/11/25/on-the-definition-of-sequence-identity#blast-identity
[contribute]: https://github.com/mbhall88/pafpy/blob/master/CONTRIBUTING.md
[docs]: https://mbh.sh/pafpy/
[poetry]: https://python-poetry.org/