{"id":20926893,"url":"https://github.com/mdshw5/simplesam","last_synced_at":"2025-05-13T17:34:00.495Z","repository":{"id":32342319,"uuid":"35917842","full_name":"mdshw5/simplesam","owner":"mdshw5","description":"Simple pure Python SAM parser and objects for working with SAM records","archived":false,"fork":false,"pushed_at":"2022-06-21T02:29:34.000Z","size":136,"stargazers_count":58,"open_issues_count":8,"forks_count":8,"subscribers_count":7,"default_branch":"master","last_synced_at":"2023-03-22T13:23:43.260Z","etag":null,"topics":["bam","bioinformatics","genomics","python","sam"],"latest_commit_sha":null,"homepage":null,"language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"mit","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/mdshw5.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":"CODE_OF_CONDUCT.md","threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null}},"created_at":"2015-05-20T01:15:54.000Z","updated_at":"2023-01-15T17:32:24.000Z","dependencies_parsed_at":"2022-09-08T17:21:48.490Z","dependency_job_id":null,"html_url":"https://github.com/mdshw5/simplesam","commit_stats":null,"previous_names":[],"tags_count":null,"template":null,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/mdshw5%2Fsimplesam","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/mdshw5%2Fsimplesam/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/mdshw5%2Fsimplesam/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/mdshw5%2Fsimplesam/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/mdshw5","download_url":"https://codeload.github.com/mdshw5/simplesam/tar.gz/refs/heads/master","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":225249664,"owners_count":17444476,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["bam","bioinformatics","genomics","python","sam"],"created_at":"2024-11-18T20:44:06.807Z","updated_at":"2024-11-18T20:44:07.450Z","avatar_url":"https://github.com/mdshw5.png","language":"Python","funding_links":[],"categories":[],"sub_categories":[],"readme":"[![PyPI](https://img.shields.io/pypi/v/simplesam.svg?)](https://pypi.python.org/pypi/simplesam)\n[![Tests](https://github.com/mdshw5/simplesam/actions/workflows/tests.yml/badge.svg)](https://github.com/mdshw5/simplesam/actions/workflows/tests.yml)\n[![Package Builds](https://github.com/mdshw5/simplesam/actions/workflows/deploy.yml/badge.svg)](https://github.com/mdshw5/simplesam/actions/workflows/deploy.yml)\n[![Documentation Status](https://readthedocs.org/projects/simplesam/badge/?version=latest)](http://simplesam.readthedocs.io/en/latest/?badge=latest)\n\n# Simple SAM parsing\nRequiring no external dependencies (except a samtools installation for BAM reading)\n\n# Installation\n`pip install simplesam`\n\n# Usage\nFor complete module documentation visit [ReadTheDocs](http://simplesam.readthedocs.io/en/latest/).\n\n## Quickstart\n\n```python\n\u003e\u003e\u003e from simplesam import Reader, Writer\n```\n\nRead from SAM/BAM files\n```python\n# can also read BAM\n\u003e\u003e\u003e in_file = open('data/NA18510.sam', 'r')\n\u003e\u003e\u003e in_sam = Reader(in_file)\n```\n\nAccess alignments using an iterator interface\n```python\n\u003e\u003e\u003e x = next(in_sam)\n\u003e\u003e\u003e type(x)\n\u003cclass 'simplesam.Sam'\u003e\n\u003e\u003e\u003e x\nSam(1:2:SRR011051.1022326)\n\u003e\u003e\u003e x.qname\n'SRR011051.1022326'\n\u003e\u003e\u003e x.rname\n'1'\n\u003e\u003e\u003e x.pos\n2\n\u003e\u003e\u003e x.seq\n'AACCCTAACCCCTAACCCTAACCCTAACCCTACCCCTAACCCTACCCCTCC'\n\u003e\u003e\u003e x.qual\n'?\u003c:;;=;\u003e;\u003c\u003c\u003c\u003e96;\u003c;;99;\u003c=3;4\u003c\u003c:(;,\u003c;;/;57\u003c;%6,=:,((3'\n\u003e\u003e\u003e x.cigar\n'8M1I42M'\n\u003e\u003e\u003e x.cigars\n((8, 'M'), (1, 'I'), (42, 'M'))\n\u003e\u003e\u003e x.gapped('seq')\n'AACCCTAACCCTAACCCTAACCCTAACCCTACCCCTAACCCTACCCCTCC'\n\u003e\u003e\u003e len(x)\n50\n\u003e\u003e\u003e x.flag\n35\n\u003e\u003e\u003e x.mapped\nTrue\n\u003e\u003e\u003e x.paired\nTrue\n\u003e\u003e\u003e x.duplicate\nFalse\n\u003e\u003e\u003e x.secondary\nFalse\n\u003e\u003e\u003e x.coords\n[2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, 40, 41, 42, 43, 44, 45, 46, 47, 48, 49, 50, 51]\n\u003e\u003e\u003e x.tags\n{'H1': 0, 'UQ': 33, 'RG': 'SRR011051', 'H0': 0, 'MF': 130, 'Aq': 25, 'NM': 2}\n\u003e\u003e\u003e str(x)\n'SRR011051.1022326\\t35\\t1\\t2\\t255\\t8M1I42M\\t*\\t0\\t0\\tAACCCTAACCCCTAACCCTAACCCTAACCCTACCCCTAACCCTACCCCTCC\\t?\u003c:;;=;\u003e;\u003c\u003c\u003c\u003e96;\u003c;;99;\u003c=3;4\u003c\u003c:(;,\u003c;;/;57\u003c;%6,=:,((3\\tAq:i:25\\tH0:i:0\\tH1:i:0\\tMF:i:130\\tNM:i:2\\tRG:Z:SRR011051\\tUQ:i:33\\n'\n```\n\nRead the SAM sequence header structure\n```python\n\u003e\u003e\u003e from pprint import pprint\n\u003e\u003e\u003e pprint(in_sam.header)\n{'@HD': OrderedDict([('VN:1.0', ['GO:none', 'SO:coordinate'])]),\n '@SQ': {'SN:1': ['LN:247249719'],\n         'SN:2': ['LN:242951149'],\n         'SN:3': ['LN:199501827'],\n         'SN:4': ['LN:191273063'],\n         'SN:5': ['LN:180857866'],\n         'SN:6': ['LN:170899992'],\n         'SN:7': ['LN:158821424'],\n         'SN:8': ['LN:146274826'],\n         ...\n '@RG': {'ID:SRR011049': ['PL:ILLUMINA',\n                           'PU:BI.PE.080626_SL-XAN_0002_FC304CDAAXX.080630_SL-XAN_0007_FC304CDAAXX.5',\n                           'LB:Solexa-5112',\n                           'PI:330',\n                           'SM:NA18510',\n                           'CN:BI'],\n          'ID:SRR011050': ['PL:ILLUMINA',\n                           'PU:BI.PE.080626_SL-XAN_0002_FC304CDAAXX.080630_SL-XAN_0007_FC304CDAAXX.6',\n                           'LB:Solexa-5112',\n                           'PI:330',\n                           'SM:NA18510',\n                           'CN:BI'],\n         ...}\n         }\n}\n```\n\nWrite SAM files from `Sam` objects\n```python\n# Reader and Writer can also use the context handler (with: statement)\n\u003e\u003e\u003e out_file = open('test.sam', 'w')\n\u003e\u003e\u003e out_sam = Writer(out_file, in_sam.header)\n\u003e\u003e\u003e out_sam.write(x)\n\u003e\u003e\u003e out_sam.close()\n```\n\nWrite SAM files from `Sam` objects to stdout (allows `samtools view` compression)\n```python\n\u003e\u003e\u003e from sys import stdout\n\u003e\u003e\u003e stdout_sam = Writer(stdout, in_sam.header)\n\u003e\u003e\u003e stdout_sam.write(x)\n\u003e\u003e\u003e stdout_sam.close()\n```\n```bash\n$ python my_script_that_uses_simplesam.py | samtools view -hbo test.bam\n```\n\n# Example scripts\nAn example script [`pileup.py`](https://github.com/mdshw5/simplesam/blob/master/scripts/pileup.py) is installed with this module.\nThis script will generate an output that is similar to `samtools pileup` with the addition of several optional columns that summarize\ncounts for individual nucleotides (ACTGN) and deletions with respect to the reference (-). This script leverages the `Sam.gapped()` and\n`Sam.parse_md()` methods to reconstruct position-specific counts from SAM alignment records.\n\n```bash\n$ pileup.py -h\nusage: pileup [-h] [--version] [-c] [-i STATS] bam pileup\n\ngenerate a simple pileup-like file from a sorted/indexed BAM file\n\npositional arguments:\n  bam                   sorted/indexed BAM file\n  pileup                pileup output file\n\noptional arguments:\n  -h, --help            show this help message and exit\n  --version             show program's version number and exit\n  -c, --counts          display counts for A/C/T/G/N/- separately (default: False)\n  -i STATS, --stats STATS\n                        tabulate mismatches to output file\n```","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fmdshw5%2Fsimplesam","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fmdshw5%2Fsimplesam","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fmdshw5%2Fsimplesam/lists"}