{"id":16916934,"url":"https://github.com/althonos/pyopal","last_synced_at":"2025-09-07T17:31:31.253Z","repository":{"id":60938935,"uuid":"545519161","full_name":"althonos/pyopal","owner":"althonos","description":"Cython bindings and Python interface to Opal, a SIMD-accelerated database search aligner.","archived":false,"fork":false,"pushed_at":"2024-11-04T18:56:13.000Z","size":264,"stargazers_count":9,"open_issues_count":0,"forks_count":1,"subscribers_count":3,"default_branch":"main","last_synced_at":"2024-12-30T08:21:49.191Z","etag":null,"topics":["bioinformatics","cython-library","cython-wrapper","genomics","needleman-wunsch","python-bindings","python-library","sequence-alignment","simd","smith-waterman"],"latest_commit_sha":null,"homepage":"https://pyopal.readthedocs.io","language":"Cython","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"mit","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/althonos.png","metadata":{"files":{"readme":"README.md","changelog":"CHANGELOG.md","contributing":"CONTRIBUTING.md","funding":null,"license":"COPYING","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null}},"created_at":"2022-10-04T14:09:55.000Z","updated_at":"2024-11-29T15:11:16.000Z","dependencies_parsed_at":"2024-01-17T23:00:33.805Z","dependency_job_id":"8c8b8aaf-7fc7-4620-bf2d-e749f7a3f66e","html_url":"https://github.com/althonos/pyopal","commit_stats":{"total_commits":64,"total_committers":1,"mean_commits":64.0,"dds":0.0,"last_synced_commit":"256d118bc956e74cdfcfe9bf128dd5f27d37cfc1"},"previous_names":[],"tags_count":18,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/althonos%2Fpyopal","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/al
thonos%2Fpyopal/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/althonos%2Fpyopal/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/althonos%2Fpyopal/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/althonos","download_url":"https://codeload.github.com/althonos/pyopal/tar.gz/refs/heads/main","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":232232313,"owners_count":18492364,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["bioinformatics","cython-library","cython-wrapper","genomics","needleman-wunsch","python-bindings","python-library","sequence-alignment","simd","smith-waterman"],"created_at":"2024-10-13T19:31:30.093Z","updated_at":"2025-01-02T17:41:28.581Z","avatar_url":"https://github.com/althonos.png","language":"Cython","funding_links":[],"categories":[],"sub_categories":[],"readme":"# 🐍🌈🪨 PyOpal [![Stars](https://img.shields.io/github/stars/althonos/pyopal.svg?style=social\u0026maxAge=3600\u0026label=Star)](https://github.com/althonos/pyopal/stargazers)\n\n*[Cython](https://cython.org/) bindings and Python interface to [Opal](https://github.com/Martinsos/opal), a SIMD-accelerated database search 
aligner.*\n\n[![Actions](https://img.shields.io/github/actions/workflow/status/althonos/pyopal/test.yml?branch=main\u0026logo=github\u0026style=flat-square\u0026maxAge=300)](https://github.com/althonos/pyopal/actions)\n[![Coverage](https://img.shields.io/codecov/c/gh/althonos/pyopal?style=flat-square\u0026maxAge=3600\u0026logo=codecov)](https://codecov.io/gh/althonos/pyopal/)\n[![License](https://img.shields.io/badge/license-MIT-blue.svg?style=flat-square\u0026maxAge=2678400)](https://choosealicense.com/licenses/mit/)\n[![PyPI](https://img.shields.io/pypi/v/pyopal.svg?style=flat-square\u0026maxAge=3600\u0026logo=PyPI)](https://pypi.org/project/pyopal)\n[![Bioconda](https://img.shields.io/conda/vn/bioconda/pyopal?style=flat-square\u0026maxAge=3600\u0026logo=anaconda)](https://anaconda.org/bioconda/pyopal)\n[![AUR](https://img.shields.io/aur/version/python-pyopal?logo=archlinux\u0026style=flat-square\u0026maxAge=3600)](https://aur.archlinux.org/packages/python-pyopal)\n[![Wheel](https://img.shields.io/pypi/wheel/pyopal.svg?style=flat-square\u0026maxAge=3600)](https://pypi.org/project/pyopal/#files)\n[![Python Versions](https://img.shields.io/pypi/pyversions/pyopal.svg?style=flat-square\u0026maxAge=600\u0026logo=python)](https://pypi.org/project/pyopal/#files)\n[![Python 
Implementations](https://img.shields.io/pypi/implementation/pyopal.svg?style=flat-square\u0026maxAge=600\u0026label=impl)](https://pypi.org/project/pyopal/#files)\n[![Source](https://img.shields.io/badge/source-GitHub-303030.svg?maxAge=2678400\u0026style=flat-square)](https://github.com/althonos/pyopal/)\n[![Mirror](https://img.shields.io/badge/mirror-EMBL-009f4d?style=flat-square\u0026maxAge=2678400)](https://git.embl.de/larralde/pyopal/)\n[![Issues](https://img.shields.io/github/issues/althonos/pyopal.svg?style=flat-square\u0026maxAge=600)](https://github.com/althonos/pyopal/issues)\n[![Docs](https://img.shields.io/readthedocs/pyopal/latest?style=flat-square\u0026maxAge=600)](https://pyopal.readthedocs.io)\n[![Changelog](https://img.shields.io/badge/keep%20a-changelog-8A0707.svg?maxAge=2678400\u0026style=flat-square)](https://github.com/althonos/pyopal/blob/main/CHANGELOG.md)\n[![Downloads](https://img.shields.io/pypi/dm/pyopal?style=flat-square\u0026color=303f9f\u0026maxAge=86400\u0026label=downloads)](https://pepy.tech/project/pyopal)\n\n\n## 🗺️ Overview\n\n[Opal](https://github.com/Martinsos/opal) is a sequence aligner enabling fast\nsequence similarity search using any of the Smith-Waterman, semi-global or\nNeedleman-Wunsch algorithms. It is used as part of the SW#db method[\\[1\\]](#ref1)\nto align a query sequence to multiple database sequences on CPU, using\nthe multi-sequence vectorization method described in SWIPE[\\[2\\]](#ref2).\n\nPyOpal is a Python module that provides bindings to [Opal](https://github.com/Martinsos/opal)\nusing [Cython](https://cython.org/). It implements a user-friendly, Pythonic\ninterface to query a database of sequences and access the search results. 
It\ninteracts with the Opal interface rather than with the CLI, which has the\nfollowing advantages:\n\n- **no binary dependency**: PyOpal is distributed as a Python package, so\n  you can add it as a dependency to your project, and stop worrying about the\n  Opal binary being present on the end-user machine.\n- **no intermediate files**: Everything happens in memory, in a Python object\n  you control, so you don't have to invoke the Opal CLI using a sub-process\n  and temporary files.\n- **better portability**: Opal uses SIMD to accelerate alignment scoring, but\n  doesn't support dynamic dispatch, so it has to be compiled on the local\n  machine to be able to use the full capabilities of the local CPU. PyOpal\n  ships several versions of Opal instead, each compiled with different target\n  features, and selects the best one for the local platform at runtime.\n- **wider platform support**: The Opal code has been backported to work on SSE2\n  rather than SSE4.1, allowing PyOpal to run on older x86 CPUs (all x86 CPUs\n  have supported it since 2003). In addition, Armv7 and Aarch64 CPUs are also\n  supported if they implement NEON extensions. 
Finally, the C++ code of Opal\n  has been modified to compile on Windows.\n\n## 🔧 Installing\n\nPyOpal is available for all modern Python versions (3.6+), optionally depending on\nthe lightweight Python package [`archspec`](https://pypi.org/project/archspec)\nfor runtime CPU feature detection.\n\nIt can be installed directly from [PyPI](https://pypi.org/project/pyopal/),\nwhich hosts some pre-built x86-64 wheels for Linux, macOS, and Windows,\nAarch64 wheels for Linux and macOS, as well as the code required to\ncompile from source with Cython:\n```console\n$ pip install pyopal\n```\n\nAlternatively, PyOpal is available as a [Bioconda](https://bioconda.github.io/)\npackage:\n```console\n$ conda install -c bioconda pyopal\n```\n\nCheck the [*install* page](https://pyopal.readthedocs.io/en/stable/install.html)\nof the documentation for other ways to install PyOpal on your machine.\n\n## 💡 Example\n\nAll classes are imported in the main namespace `pyopal`:\n```python\nimport pyopal\n```\n\n`pyopal` can work with sequences passed as Python strings,\nas well as with ASCII strings in `bytes` objects:\n```python\nquery = \"MAGFLKVVQLLAKYGSKAVQWAWANKGKILDWLNAGQAIDWVVSKIKQILGIK\"\ndatabase = [\n    \"MESILDLQELETSEEESALMAASTVSNNC\",\n    \"MKKAVIVENKGCATCSIGAACLVDGPIPDFEIAGATGLFGLWG\",\n    \"MAGFLKVVQILAKYGSKAVQWAWANKGKILDWINAGQAIDWVVEKIKQILGIK\",\n    \"MTQIKVPTALIASVHGEGQHLFEPMAARCTCTTIISSSSTF\",\n]\n```\n\nIf you plan to reuse the database across several queries, you can store it in\na [`Database`](https://pyopal.readthedocs.io/en/stable/api/database.html#pyopal.Database),\nwhich will keep sequences encoded according to\nan [`Alphabet`](https://pyopal.readthedocs.io/en/stable/api/alphabet.html#pyopal.Alphabet):\n\n```python\ndatabase = pyopal.Database(database)\n```\n\nThe top-level function `pyopal.align` can be used to align a query\nsequence against a database, using multithreading to process chunks\nof the database in parallel:\n```python\nfor result in 
pyopal.align(query, database):\n    print(result.score, result.target_index, database[result.target_index])\n```\n\nSee the [API documentation](https://pyopal.readthedocs.io/en/stable/api/index.html)\nfor more examples, including how to use the internal API, and a detailed\nreference of the parameters and result types.\n\n## 🧶 Thread-safety\n\n`Database` objects are thread-safe through a\n[C++17 read/write lock](https://en.cppreference.com/w/cpp/thread/shared_mutex)\nthat prevents modification while the database is searched. In addition, the\n`Aligner.align` method is re-entrant and can be safely used to query the\nsame database in parallel with different queries across different threads:\n\n```python\nimport multiprocessing.pool\nimport pyopal\nimport Bio.SeqIO\n\nqueries = [\n    \"MEQQIELDVLEISDLIAGAGENDDLAQVMAASCTTSSVSTSSSSSSS\",\n    \"MTQIKVPTALIASVHGEGQHLFEPMAARCTCTTIISSSSTF\",\n    \"MGAIAKLVAKFGWPIVKKYYKQIMQFIGEGWAINKIIDWIKKHI\",\n    \"MGPVVVFDCMTADFLNDDPNNAELSALEMEELESWGAWDGEATS\",\n]\n\n# Load the target sequences from a FASTA file with Biopython\ndatabase = pyopal.Database([\n    str(record.seq)\n    for record in Bio.SeqIO.parse(\"vendor/opal/test_data/db/uniprot_sprot12071.fasta\", \"fasta\")\n])\n\n# Align every query against the shared database from a thread pool\naligner = pyopal.Aligner()\nwith multiprocessing.pool.ThreadPool() as pool:\n    hits = dict(pool.map(lambda q: (q, aligner.align(q, database)), queries))\n```\n\n\u003c!-- ## ⏱️ Benchmarks --\u003e\n\n\n## 💭 Feedback\n\n### ⚠️ Issue Tracker\n\nFound a bug? Have an enhancement request? Head over to the [GitHub issue tracker](https://github.com/althonos/pyopal/issues)\nif you need to report or ask something. If you are filing a bug report,\nplease include as much information as you can about the issue, and try to\nrecreate the same bug in a simple, easily reproducible situation.\n\n\n### 🏗️ Contributing\n\nContributions are more than welcome! 
See\n[`CONTRIBUTING.md`](https://github.com/althonos/pyopal/blob/main/CONTRIBUTING.md)\nfor more details.\n\n\n## 📋 Changelog\n\nThis project adheres to [Semantic Versioning](http://semver.org/spec/v2.0.0.html)\nand provides a [changelog](https://github.com/althonos/pyopal/blob/main/CHANGELOG.md)\nin the [Keep a Changelog](http://keepachangelog.com/en/1.0.0/) format.\n\n\n## ⚖️ License\n\nThis library is provided under the [MIT License](https://choosealicense.com/licenses/mit/).\nOpal is developed by [Martin Šošić](https://github.com/Martinsos) and is distributed under the\nterms of the MIT License as well. See `vendor/opal/LICENSE` for more information.\n\n*This project is in no way affiliated with, sponsored, or otherwise endorsed\nby the [Opal authors](https://github.com/Martinsos). It was developed\nby [Martin Larralde](https://github.com/althonos/) during his PhD project\nat the [European Molecular Biology Laboratory](https://www.embl.de/) in\nthe [Zeller team](https://github.com/zellerlab).*\n\n\n## 📚 References\n\n- \u003ca id=\"ref1\"\u003e\\[1\\]\u003c/a\u003e Korpar Matija, Martin Šošić, Dino Blažeka, Mile Šikić. SW#db: GPU-Accelerated Exact Sequence Similarity Database Search. PLoS One. 2015 Dec 31;10(12):e0145857. [doi:10.1371/journal.pone.0145857](https://doi.org/10.1371/journal.pone.0145857). [PMID:26719890](https://pubmed.ncbi.nlm.nih.gov/26719890). [PMC4699916](https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4699916/).\n- \u003ca id=\"ref2\"\u003e\\[2\\]\u003c/a\u003e Rognes Torbjørn. Faster Smith-Waterman database searches with inter-sequence SIMD parallelisation. BMC Bioinformatics. 2011 Jun 1;12:221. [doi:10.1186/1471-2105-12-221](https://doi.org/10.1186/1471-2105-12-221). 
[PMID:21631914](https://pubmed.ncbi.nlm.nih.gov/21631914/). [PMC3120707](http://www.ncbi.nlm.nih.gov/pmc/articles/pmc3120707/).\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Falthonos%2Fpyopal","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Falthonos%2Fpyopal","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Falthonos%2Fpyopal/lists"}