{"id":16916915,"url":"https://github.com/althonos/pytrimal","last_synced_at":"2025-06-23T17:34:17.405Z","repository":{"id":37919967,"uuid":"498731215","full_name":"althonos/pytrimal","owner":"althonos","description":"Cython bindings and Python interface to trimAl, a tool for automated alignment trimming. Now with SIMD!","archived":false,"fork":false,"pushed_at":"2024-08-28T14:31:32.000Z","size":737,"stargazers_count":20,"open_issues_count":0,"forks_count":2,"subscribers_count":2,"default_branch":"main","last_synced_at":"2024-12-27T18:47:03.001Z","etag":null,"topics":["alignment-trimming","bioinformatics","cython-wrapper","genomics","multiple-sequence-alignment","python","python-interface","python-library","simd"],"latest_commit_sha":null,"homepage":"https://pytrimal.readthedocs.org","language":"Cython","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"gpl-3.0","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/althonos.png","metadata":{"files":{"readme":"README.md","changelog":"CHANGELOG.md","contributing":"CONTRIBUTING.md","funding":null,"license":"COPYING","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null}},"created_at":"2022-06-01T12:41:33.000Z","updated_at":"2024-08-28T14:31:36.000Z","dependencies_parsed_at":"2023-11-18T00:13:25.073Z","dependency_job_id":"418c0e65-5cc9-4f0a-958b-546b8eb1c68f","html_url":"https://github.com/althonos/pytrimal","commit_stats":{"total_commits":274,"total_committers":2,"mean_commits":137.0,"dds":"0.0036496350364964014","last_synced_commit":"66514be302f086534617b115d73c929342bc8bc5"},"previous_names":[],"tags_count":18,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/althonos%2Fpytrimal","tags_url":"https://repos.ecosyste
.ms/api/v1/hosts/GitHub/repositories/althonos%2Fpytrimal/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/althonos%2Fpytrimal/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/althonos%2Fpytrimal/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/althonos","download_url":"https://codeload.github.com/althonos/pytrimal/tar.gz/refs/heads/main","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":232235495,"owners_count":18492774,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["alignment-trimming","bioinformatics","cython-wrapper","genomics","multiple-sequence-alignment","python","python-interface","python-library","simd"],"created_at":"2024-10-13T19:31:20.582Z","updated_at":"2025-01-02T17:56:33.065Z","avatar_url":"https://github.com/althonos.png","language":"Cython","readme":"# 🐍✂️ PytrimAl [![Stars](https://img.shields.io/github/stars/althonos/pytrimal.svg?style=social\u0026maxAge=3600\u0026label=Star)](https://github.com/althonos/pytrimal/stargazers)\n\n*[Cython](https://cython.org/) bindings and Python interface to [trimAl](http://trimal.cgenomics.org/), a tool for automated alignment trimming. 
**Now with SIMD!***\n\n[![Actions](https://img.shields.io/github/actions/workflow/status/althonos/pytrimal/test.yml?branch=main\u0026logo=github\u0026style=flat-square\u0026maxAge=300)](https://github.com/althonos/pytrimal/actions)\n[![Coverage](https://img.shields.io/codecov/c/gh/althonos/pytrimal?style=flat-square\u0026maxAge=3600\u0026logo=codecov)](https://codecov.io/gh/althonos/pytrimal/)\n[![License](https://img.shields.io/badge/license-GPLv3-blue.svg?style=flat-square\u0026maxAge=2678400)](https://choosealicense.com/licenses/gpl-3.0/)\n[![PyPI](https://img.shields.io/pypi/v/pytrimal.svg?style=flat-square\u0026maxAge=3600\u0026logo=PyPI)](https://pypi.org/project/pytrimal)\n[![Bioconda](https://img.shields.io/conda/vn/bioconda/pytrimal?style=flat-square\u0026maxAge=3600\u0026logo=anaconda)](https://anaconda.org/bioconda/pytrimal)\n[![AUR](https://img.shields.io/aur/version/python-pytrimal?logo=archlinux\u0026style=flat-square\u0026maxAge=3600)](https://aur.archlinux.org/packages/python-pytrimal)\n[![Wheel](https://img.shields.io/pypi/wheel/pytrimal.svg?style=flat-square\u0026maxAge=3600)](https://pypi.org/project/pytrimal/#files)\n[![Python Versions](https://img.shields.io/pypi/pyversions/pytrimal.svg?style=flat-square\u0026maxAge=600\u0026logo=python)](https://pypi.org/project/pytrimal/#files)\n[![Python 
Implementations](https://img.shields.io/pypi/implementation/pytrimal.svg?style=flat-square\u0026maxAge=600\u0026label=impl)](https://pypi.org/project/pytrimal/#files)\n[![Source](https://img.shields.io/badge/source-GitHub-303030.svg?maxAge=2678400\u0026style=flat-square)](https://github.com/althonos/pytrimal/)\n[![Mirror](https://img.shields.io/badge/mirror-EMBL-009f4d?style=flat-square\u0026maxAge=2678400)](https://git.embl.de/larralde/pytrimal/)\n[![Issues](https://img.shields.io/github/issues/althonos/pytrimal.svg?style=flat-square\u0026maxAge=600)](https://github.com/althonos/pytrimal/issues)\n[![Docs](https://img.shields.io/readthedocs/pytrimal/latest?style=flat-square\u0026maxAge=600)](https://pytrimal.readthedocs.io)\n[![Changelog](https://img.shields.io/badge/keep%20a-changelog-8A0707.svg?maxAge=2678400\u0026style=flat-square)](https://github.com/althonos/pytrimal/blob/main/CHANGELOG.md)\n[![Downloads](https://img.shields.io/pypi/dm/pytrimal?style=flat-square\u0026color=303f9f\u0026maxAge=86400\u0026label=downloads)](https://pepy.tech/project/pytrimal)\n\n***⚠️ This package is based on the release candidate of trimAl 2.0, and results\nmay not be consistent across versions or with the trimAl 1.4 results.***\n\n## 🗺️ Overview\n\nPytrimAl is a Python module that provides bindings to [trimAl](http://trimal.cgenomics.org/)\nusing [Cython](https://cython.org/). It implements a user-friendly, Pythonic\ninterface to use one of the different trimming methods from trimAl and\naccess results directly. 
It interacts with the trimAl internals, which has\nthe following advantages:\n\n- **single dependency**: PytrimAl is distributed as a Python package, so you\n  can add it as a dependency to your project, and stop worrying about the\n  trimAl binary being present on the end-user machine.\n- **no intermediate files**: Everything happens in memory, in a Python object\n  you control, so you don't have to invoke the trimAl CLI using a\n  sub-process and temporary files.\n  [`Alignment`](https://pytrimal.readthedocs.io/en/latest/api/alignment.html#pytrimal.Alignment)\n  objects can be created directly from Python code.\n- **friendly interface**: The different trimming methods are implemented as\n  Python classes that can be configured independently.\n- **error management**: Errors occurring in trimAl are converted\n  transparently into Python exceptions, including an informative\n  error message.\n- **better performance**: PytrimAl uses *SIMD* instructions to compute\n  statistics like pairwise sequence similarity. 
This makes the whole\n  trimming process much faster for alignments with a large number of\n  sequences, at the expense of slightly higher memory consumption.\n\n## 📋 Roadmap\n\nThe following features are available or considered for implementation:\n\n- [x] **automatic trimming**: Support for trimming alignments using one of the\n  automatic heuristics implemented in trimAl.\n- [x] **manual trimming**: Support for trimming alignments using manually\n  defined conservation and gap thresholds for each residue position.\n- [x] **overlap trimming**: Trimming sequences using residue and sequence\n  overlaps to exclude regions with minimal conservation.\n- [x] **representative trimming**: Select only representative sequences\n  from the alignment, either using a fixed number, or a maximum identity\n  threshold.\n- [x] **alignment loading from disk**: Load an alignment from disk given\n  a filename.\n- [x] **alignment loading from a file-like object**: Load an alignment from\n  a Python [file object](https://docs.python.org/3/glossary.html#term-file-object)\n  instead of a file on the local filesystem.\n- [x] **alignment creation from Python**: Create an alignment from a collection\n  of sequences stored in Python strings.\n- [x] **alignment formatting to disk**: Write an alignment to a file given\n  a filename in one of the supported file formats.\n- [x] **alignment formatting to a file-like object**: Write an alignment to\n  a file-like object in one of the supported file formats.\n- [ ] **reverse-translation**: Back-translate a protein alignment to align\n  the sequences in genomic space.\n- [x] **alternative similarity matrix**: Specify an alternative similarity\n  matrix for the alignment (instead of BLOSUM62).\n- [x] **similarity matrix creation**: Create a similarity matrix from scratch\n  from Python code.\n- [x] **windows for manual methods**: Use a sliding window for computing\n  statistics in manual methods.\n\n## 🔧 Installing\n\nPytrimAl is available for all 
modern Python versions (3.6+), with no external dependencies.\n\nIt can be installed directly from [PyPI](https://pypi.org/project/pytrimal/),\nwhich hosts some pre-built wheels for the x86-64 architecture (Linux/OSX)\nand the Aarch64 architecture (Linux only), as well as the code required to compile\nfrom source with Cython:\n```console\n$ pip install pytrimal\n```\n\nOtherwise, pytrimal is also available as a [Bioconda](https://bioconda.github.io/)\npackage:\n```console\n$ conda install -c bioconda pytrimal\n```\n\n## 💡 Example\n\nLet's load an `Alignment` from a file on the disk, and use the *strictplus*\nmethod to trim it, before printing the `TrimmedAlignment` as a Clustal block:\n```python\nfrom pytrimal import Alignment, AutomaticTrimmer\n\nali = Alignment.load(\"pytrimal/tests/data/example.001.AA.clw\")\ntrimmer = AutomaticTrimmer(method=\"strictplus\")\n\ntrimmed = trimmer.trim(ali)\nfor name, seq in zip(trimmed.names, trimmed.sequences):\n    print(name.decode().rjust(6), seq)\n```\n\nThis should output the following:\n```\nSp8    GIVLVWLFPWNGLQIHMMGII\nSp10   VIMLEWFFAWLGLEINMMVII\nSp26   GLFLAAANAWLGLEINMMAQI\nSp6    GIYLSWYLAWLGLEINMMAII\nSp17   GFLLTWFQLWQGLDLNKMPVF\nSp33   GLHMAWFQAWGGLEINKQAIL\n```\n\nYou can then use the\n[`dump`](https://pytrimal.readthedocs.io/en/latest/api/alignment.html#pytrimal.Alignment.dump)\nmethod to write the trimmed alignment to a file or file-like\nobject. 
For instance, save the results in\n[PIR format](https://www.bioinformatics.nl/tools/crab_pir.html)\nto a file named `example.trimmed.pir`:\n```python\ntrimmed.dump(\"example.trimmed.pir\", format=\"pir\")\n```\n\n## 🧶 Thread-safety\n\nTrimmer objects are thread-safe, and the `trim` method is re-entrant.\nThis means you can batch-process alignments in parallel using a\n[`ThreadPool`](https://docs.python.org/3/library/multiprocessing.html#multiprocessing.pool.ThreadPool)\nwith a single trimmer object:\n```python\nimport glob\nimport multiprocessing.pool\nfrom pytrimal import Alignment, AutomaticTrimmer\n\ntrimmer = AutomaticTrimmer()\nalignments = map(Alignment.load, glob.iglob(\"pytrimal/tests/data/*.fasta\"))\n\nwith multiprocessing.pool.ThreadPool() as pool:\n    trimmed_alignments = pool.map(trimmer.trim, alignments)\n```\n\n## ⏱️ Benchmarks\n\nBenchmarks were run on an [i7-10710U CPU](https://ark.intel.com/content/www/us/en/ark/products/196448/intel-core-i710710u-processor-12m-cache-up-to-4-70-ghz.html)\n@ 1.10GHz, using a single core to time the computation of several statistics,\non a variable number of sequences from\n[`example.014.AA.EggNOG.COG0591.fasta`](https://github.com/inab/trimal/blob/trimAl/dataset/example.014.AA.EggNOG.COG0591.fasta),\nan alignment of 3583 sequences and 7287 columns.\n\n![Benchmarks](https://raw.githubusercontent.com/althonos/pytrimal/main/bench/v0.5.4.svg)\n\nEach graph measures the computation time of a single trimAl statistic\n(see the [Statistics page](https://pytrimal.readthedocs.io/en/stable/statistics.html)\nof the [online documentation](https://pytrimal.readthedocs.io/) for more\ninformation).\n\nThe `None` curve shows the time using the internal trimAl 2.0 code,\nthe `Generic` curve shows a generic C implementation with some more\noptimizations, and the `SSE` curve shows the time spent using a dedicated\nclass with [SIMD](https://en.wikipedia.org/wiki/Single_instruction,_multiple_data)\nimplementations of the statistic 
computation.\n\n## 💭 Feedback\n\n### ⚠️ Issue Tracker\n\nFound a bug? Have an enhancement request? Head over to the [GitHub issue tracker](https://github.com/althonos/pytrimal/issues)\nif you need to report or ask something. If you are filing a bug,\nplease include as much information as you can about the issue, and try to\nrecreate the same bug in a simple, easily reproducible situation.\n\n\n### 🏗️ Contributing\n\nContributions are more than welcome! See\n[`CONTRIBUTING.md`](https://github.com/althonos/pytrimal/blob/main/CONTRIBUTING.md)\nfor more details.\n\n\n## 📋 Changelog\n\nThis project adheres to [Semantic Versioning](http://semver.org/spec/v2.0.0.html)\nand provides a [changelog](https://github.com/althonos/pytrimal/blob/main/CHANGELOG.md)\nin the [Keep a Changelog](http://keepachangelog.com/en/1.0.0/) format.\n\n\n## ⚖️ License\n\nThis library is provided under the [GNU General Public License v3.0](https://choosealicense.com/licenses/gpl-3.0/).\ntrimAl is developed by the [trimAl team](http://trimal.cgenomics.org/trimal_team) and is distributed under the\nterms of the GPLv3 as well. See `vendor/trimal/LICENSE` for more information.\n\n*This project is in no way affiliated, sponsored, or otherwise endorsed\nby the [trimAl authors](http://trimal.cgenomics.org/trimal_team). It was developed\nby [Martin Larralde](https://github.com/althonos/) during his PhD project\nat the [European Molecular Biology Laboratory](https://www.embl.de/) in\nthe [Zeller team](https://github.com/zellerlab).*\n","funding_links":[],"categories":[],"sub_categories":[],"project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Falthonos%2Fpytrimal","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Falthonos%2Fpytrimal","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Falthonos%2Fpytrimal/lists"}