{"id":16916922,"url":"https://github.com/althonos/pytantan","last_synced_at":"2026-04-21T00:01:16.092Z","repository":{"id":235185825,"uuid":"790248066","full_name":"althonos/pytantan","owner":"althonos","description":"Cython bindings and Python interface to Tantan, a fast method for identifying repeats in DNA and protein sequences.","archived":false,"fork":false,"pushed_at":"2026-04-20T21:53:27.000Z","size":162,"stargazers_count":3,"open_issues_count":1,"forks_count":0,"subscribers_count":1,"default_branch":"main","last_synced_at":"2026-04-20T23:28:36.793Z","etag":null,"topics":["bioinformatics","cython-library","cython-wrapper","dna-repeats","genomics","python-bindings","python-library","simd"],"latest_commit_sha":null,"homepage":"https://pytantan.readthedocs.io","language":"Cython","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"gpl-3.0","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/althonos.png","metadata":{"files":{"readme":"README.md","changelog":"CHANGELOG.md","contributing":"CONTRIBUTING.md","funding":null,"license":"COPYING","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null,"zenodo":null,"notice":null,"maintainers":null,"copyright":null,"agents":null,"dco":null,"cla":null}},"created_at":"2024-04-22T14:33:53.000Z","updated_at":"2026-04-20T21:53:31.000Z","dependencies_parsed_at":null,"dependency_job_id":"05a14ffa-eb5e-46cb-af5e-ac406cdab63e","html_url":"https://github.com/althonos/pytantan","commit_stats":{"total_commits":61,"total_committers":1,"mean_commits":61.0,"dds":0.0,"last_synced_commit":"bd96db78024862244ff9982c48fad7a88af04c17"},"previous_names":["althonos/pytantan"],"tags_count":5,"template":false,"template_full_name":null,"purl":"pkg:github/althonos/pytantan","repository_url":"https://repos.ecosyst
e.ms/api/v1/hosts/GitHub/repositories/althonos%2Fpytantan","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/althonos%2Fpytantan/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/althonos%2Fpytantan/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/althonos%2Fpytantan/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/althonos","download_url":"https://codeload.github.com/althonos/pytantan/tar.gz/refs/heads/main","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/althonos%2Fpytantan/sbom","scorecard":null,"host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":286080680,"owners_count":32071013,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2026-04-20T21:26:33.338Z","status":"ssl_error","status_checked_at":"2026-04-20T21:26:22.081Z","response_time":94,"last_error":"SSL_connect returned=1 errno=0 peeraddr=140.82.121.6:443 state=error: unexpected eof while reading","robots_txt_status":"success","robots_txt_updated_at":"2025-07-24T06:49:26.215Z","robots_txt_url":"https://github.com/robots.txt","online":false,"can_crawl_api":true,"host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["bioinformatics","cython-library","cython-wrapper","dna-repeats","genomics","python-bindings","python-library","simd"],"created_at":"2024-10-13T19:31:23.967Z","updated_at":"2026-04-21T00:01:16.086Z","avatar_url":"https://github.com/althonos.png","language":"Cython","funding_links":[],"categories":[],"sub_categories":[],"readme":"# 🐍🔁 PyTantan 
[![Stars](https://img.shields.io/github/stars/althonos/pytantan.svg?style=social\u0026maxAge=3600\u0026label=Star)](https://github.com/althonos/pytantan/stargazers)\n\n*[Cython](https://cython.org/) bindings and Python interface to [Tantan](https://gitlab.com/mcfrith/tantan), a fast method for identifying repeats in DNA and protein sequences.*\n\n[![Actions](https://img.shields.io/github/actions/workflow/status/althonos/pytantan/test.yml?branch=main\u0026logo=github\u0026style=flat-square\u0026maxAge=300)](https://github.com/althonos/pytantan/actions)\n[![Coverage](https://img.shields.io/codecov/c/gh/althonos/pytantan?style=flat-square\u0026maxAge=3600\u0026logo=codecov)](https://codecov.io/gh/althonos/pytantan/)\n[![License](https://img.shields.io/badge/license-GPLv3+-blue.svg?style=flat-square\u0026maxAge=2678400)](https://choosealicense.com/licenses/gpl-3.0/)\n[![PyPI](https://img.shields.io/pypi/v/pytantan.svg?style=flat-square\u0026maxAge=3600\u0026logo=PyPI)](https://pypi.org/project/pytantan)\n[![Bioconda](https://img.shields.io/conda/vn/bioconda/pytantan?style=flat-square\u0026maxAge=3600\u0026logo=anaconda)](https://anaconda.org/bioconda/pytantan)\n[![AUR](https://img.shields.io/aur/version/python-pytantan?logo=archlinux\u0026style=flat-square\u0026maxAge=3600)](https://aur.archlinux.org/packages/python-pytantan)\n[![Wheel](https://img.shields.io/pypi/wheel/pytantan.svg?style=flat-square\u0026maxAge=3600)](https://pypi.org/project/pytantan/#files)\n[![Python Versions](https://img.shields.io/pypi/pyversions/pytantan.svg?style=flat-square\u0026maxAge=600\u0026logo=python)](https://pypi.org/project/pytantan/#files)\n[![Python 
Implementations](https://img.shields.io/pypi/implementation/pytantan.svg?style=flat-square\u0026maxAge=600\u0026label=impl)](https://pypi.org/project/pytantan/#files)\n[![Source](https://img.shields.io/badge/source-GitHub-303030.svg?maxAge=2678400\u0026style=flat-square)](https://github.com/althonos/pytantan/)\n[![Mirror](https://img.shields.io/badge/mirror-LUMC-003eaa?style=flat-square\u0026maxAge=2678400)](https://git.lumc.nl/mflarralde/pytantan/)\n[![Issues](https://img.shields.io/github/issues/althonos/pytantan.svg?style=flat-square\u0026maxAge=600)](https://github.com/althonos/pytantan/issues)\n[![Docs](https://img.shields.io/readthedocs/pytantan/latest?style=flat-square\u0026maxAge=600)](https://pytantan.readthedocs.io)\n[![Changelog](https://img.shields.io/badge/keep%20a-changelog-8A0707.svg?maxAge=2678400\u0026style=flat-square)](https://github.com/althonos/pytantan/blob/main/CHANGELOG.md)\n[![Downloads](https://img.shields.io/pypi/dm/pytantan?style=flat-square\u0026color=303f9f\u0026maxAge=86400\u0026label=downloads)](https://pepy.tech/project/pytantan)\n\n\n## 🗺️ Overview\n\n[Tantan](https://gitlab.com/mcfrith/tantan) is a fast method developed\nby Martin Frith[\\[1\\]](#ref1) to identify simple repeats in DNA or protein \nsequences. It can be used to mask repeat regions in reference sequences, and \navoid false homology predictions between repeated regions.\n\nPyTantan is a Python module that provides bindings to [Tantan](https://gitlab.com/mcfrith/tantan)\nusing [Cython](https://cython.org/). It implements a user-friendly, Pythonic\ninterface to mask a sequence with various parameters. 
It interacts with the \nTantan interface rather than with the CLI, which has the following advantages:\n\n- **no binary dependency**: PyTantan is distributed as a Python package, so\n  you can add it as a dependency to your project, and stop worrying about the\n  `tantan` binary being present on the end-user machine.\n- **no intermediate files**: Everything happens in memory, in a Python object\n  you control, so you don't have to invoke the Tantan CLI using a sub-process\n  and temporary files.\n- **better portability**: Tantan uses SIMD to accelerate alignment scoring, \n  but doesn't support dynamic dispatch, so it has to be compiled on the local\n  machine to be able to use the full capabilities of the local CPU. PyTantan\n  ships several versions of Tantan instead, each compiled with different \n  target features, and selects the best one for the local platform at runtime.\n\n\n## 🔧 Installing\n\nPyTantan is available for all modern Python versions (3.6+), depending only on the\n[`scoring-matrices`](https://pypi.org/project/scoring-matrices) package, and\noptionally on the lightweight [`archspec`](https://pypi.org/project/archspec)\npackage for runtime CPU feature detection.\n\nIt can be installed directly from [PyPI](https://pypi.org/project/pytantan/),\nwhich hosts some pre-built wheels for Linux and macOS, as well as the code \nrequired to compile from source with Cython:\n```console\n$ pip install pytantan\n```\n\nOtherwise, PyTantan is also available as a [Bioconda](https://bioconda.github.io/)\npackage:\n```console\n$ conda install -c bioconda pytantan\n```\n\nCheck the [*install* page](https://pytantan.readthedocs.io/en/stable/install.html)\nof the documentation for other ways to install PyTantan on your machine.\n\n## 💡 Example\n\nThe top-level function `pytantan.mask_repeats` can be used to mask a sequence\nwithout having to manage intermediate objects:\n\n```python\nimport pytantan\nmasked = pytantan.mask_repeats(\"ATTATTATTATTATT\")\nprint(masked)          
       # ATTattattattatt\n```\n\nThe mask symbol (and other parameters) can be given as keyword arguments:\n\n```python\nimport pytantan\nmasked = pytantan.mask_repeats(\"ATTATTATTATTATT\", mask='N')\nprint(masked)                 # ATTNNNNNNNNNNNN\n```\n\nTo mask several sequences iteratively with the same parameters, consider \ncreating a `RepeatFinder` once and calling the `mask_repeats` method for \neach sequence to avoid resource re-initialization.\n\n\u003c!-- See the [API documentation](https://pytantan.readthedocs.io/en/stable/api/index.html) \nfor more examples, including how to use the internal API, and detailed \nreference of the parameters and result types. --\u003e\n\n\u003c!-- ## 🧶 Thread-safety --\u003e\n\n\u003c!-- ## ⏱️ Benchmarks --\u003e\n\n\n## 💭 Feedback\n\n### ⚠️ Issue Tracker\n\nFound a bug? Have an enhancement request? Head over to the [GitHub issue tracker](https://github.com/althonos/pytantan/issues)\nif you need to report or ask something. If you are filing a bug report,\nplease include as much information as you can about the issue, and try to\nrecreate the same bug in a simple, easily reproducible situation.\n\n\n### 🏗️ Contributing\n\nContributions are more than welcome! See\n[`CONTRIBUTING.md`](https://github.com/althonos/pytantan/blob/main/CONTRIBUTING.md)\nfor more details.\n\n\n## 📋 Changelog\n\nThis project adheres to [Semantic Versioning](http://semver.org/spec/v2.0.0.html)\nand provides a [changelog](https://github.com/althonos/pytantan/blob/main/CHANGELOG.md)\nin the [Keep a Changelog](http://keepachangelog.com/en/1.0.0/) format.\n\n\n## ⚖️ License\n\nThis library is provided under the [GNU General Public License v3.0 or later](https://choosealicense.com/licenses/gpl-3.0/).\nTantan is developed by [Martin Frith](https://sites.google.com/site/mcfrith/martin-frith) and is distributed under the\nterms of the GPLv3 or later as well. 
See `vendor/tantan/COPYING.txt` for more information.\n\n*This project is in no way affiliated with, sponsored, or otherwise endorsed\nby the [Tantan authors](https://github.com/Martinsos). It was developed\nby [Martin Larralde](https://github.com/althonos/) during his PhD project\nat the [Leiden University Medical Center](https://www.lumc.nl/en/) in\nthe [Zeller team](https://github.com/zellerlab).*\n\n\n## 📚 References\n\n- \u003ca id=\"ref1\"\u003e\\[1\\]\u003c/a\u003e Frith, Martin C. “A new repeat-masking method enables specific detection of homologous sequences.” Nucleic acids research vol. 39,4 (2011): e23. [doi:10.1093/nar/gkq1212](https://doi.org/10.1093/nar/gkq1212)\n\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Falthonos%2Fpytantan","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Falthonos%2Fpytantan","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Falthonos%2Fpytantan/lists"}