{"id":16265067,"url":"https://github.com/robaina/filtersam","last_synced_at":"2025-03-19T23:30:28.775Z","repository":{"id":57429528,"uuid":"400865776","full_name":"Robaina/filterSAM","owner":"Robaina","description":"Tools to filter SAM/BAM files by percent identity and percent of matched sequence","archived":false,"fork":false,"pushed_at":"2023-06-06T08:57:54.000Z","size":321,"stargazers_count":4,"open_issues_count":2,"forks_count":0,"subscribers_count":1,"default_branch":"main","last_synced_at":"2025-03-10T22:37:36.622Z","etag":null,"topics":["alignment","bioinformatics","computational-biology","genomics","python","samtools","sequence-alignment"],"latest_commit_sha":null,"homepage":"","language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"apache-2.0","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/Robaina.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":"CODE_OF_CONDUCT.md","threat_model":null,"audit":null,"citation":"CITATION.cff","codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null}},"created_at":"2021-08-28T18:41:26.000Z","updated_at":"2024-11-27T13:36:50.000Z","dependencies_parsed_at":"2024-10-10T17:06:05.087Z","dependency_job_id":"f8728f48-e623-4713-bf2d-80dd4baf446a","html_url":"https://github.com/Robaina/filterSAM","commit_stats":{"total_commits":35,"total_committers":2,"mean_commits":17.5,"dds":"0.11428571428571432","last_synced_commit":"ece4d4e5f8a6c311d56afd8185dc91d50b8ecc69"},"previous_names":[],"tags_count":2,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/Robaina%2FfilterSAM","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/Robaina%2FfilterSAM/tags","releases_url":"https://repos.ecosys
te.ms/api/v1/hosts/GitHub/repositories/Robaina%2FfilterSAM/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/Robaina%2FfilterSAM/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/Robaina","download_url":"https://codeload.github.com/Robaina/filterSAM/tar.gz/refs/heads/main","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":244524384,"owners_count":20466416,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["alignment","bioinformatics","computational-biology","genomics","python","samtools","sequence-alignment"],"created_at":"2024-10-10T17:05:57.794Z","updated_at":"2025-03-19T23:30:28.380Z","avatar_url":"https://github.com/Robaina.png","language":"Python","readme":"![logo](assets/logo.png)\n## A Python tool to filter sam/bam files by percent identity or percent of matched sequence\n\n![PyPI](https://img.shields.io/pypi/v/filtersam)\n![GitHub release (latest by date)](https://img.shields.io/github/v/release/Robaina/filterSAM)\n[![GitHub license](https://img.shields.io/github/license/Robaina/filterSAM)](https://github.com/Robaina/filterSAM/blob/master/LICENSE)\n![Contributor Covenant](https://img.shields.io/badge/Contributor%20Covenant-v2.0%20adopted-ff69b4)\n[![DOI](https://zenodo.org/badge/400865776.svg)](https://zenodo.org/badge/latestdoi/400865776)\n\n\u003cbr\u003e\n\nPercent identity is computed as:\n\n$$PI = 100 \\frac{N_m}{N_m + N_i}$$\n\nwhere $N_m$ is the number of matches and $N_i$ is the number of mismatches.\n\nPercent of matched sequences is computed as:\n\n$$PM = 100 
\\frac{N_m}{L}$$\n\nwhere $L$ is the query sequence length.\n\n## NOTES\n\n1. Percent of matched sequence is also used in some cases as an alternative definition of percent identity, for instance, in [BLAST](https://lh3.github.io/2018/11/25/on-the-definition-of-sequence-identity).\n\n2. BAM/SAM files must contain [MD tags](https://github.com/vsbuffalo/devnotes/wiki/The-MD-Tag-in-BAM-Files) to be filtered by percent identity. Aligners such as [BWA](https://www.ncbi.nlm.nih.gov/pmc/articles/PMC2705234/) add MD tags to each queried sequence in a BAM file. MD tags can also be generated with [samtools](http://www.htslib.org/doc/samtools-calmd.html).\n\n## Installation\n\n```pip install filtersam```\n\n## Usage\n\nYou can find a Jupyter notebook with usage examples [here](examples/examples.ipynb).\n\n## Citation\n\nIf you use this software, please cite it as below:\n\nRobaina-Estévez, S. (2022). filterSAM: filter sam/bam files by percent identity or percent of matched sequence (Version 0.0.11) [Computer software]. https://doi.org/10.5281/zenodo.7056278.\n","funding_links":[],"categories":[],"sub_categories":[],"project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Frobaina%2Ffiltersam","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Frobaina%2Ffiltersam","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Frobaina%2Ffiltersam/lists"}