Ecosyste.ms: Awesome
An open API service indexing awesome lists of open source software.
https://github.com/ryan-williams/samtools-helpers
A few helper scripts for working with samtools
- Host: GitHub
- URL: https://github.com/ryan-williams/samtools-helpers
- Owner: ryan-williams
- Created: 2015-09-18T16:46:23.000Z (over 9 years ago)
- Default Branch: master
- Last Pushed: 2019-04-27T00:06:08.000Z (over 5 years ago)
- Last Synced: 2024-10-20T07:45:17.937Z (3 months ago)
- Language: Shell
- Size: 3.91 KB
- Stars: 0
- Watchers: 2
- Forks: 0
- Open Issues: 0
Metadata Files:
- Readme: README.md
# samtools-helpers
A few helper scripts for working with samtools.

## Installation
Add the path to this repo to your `$PATH`:

```sh
echo 'export PATH="$PATH:/path/to/samtools-helpers"' >> ~/.bashrc
```

For some handy aliases, `source` the `.samtools-rc` file in this repo:
```sh
echo 'source /path/to/samtools-helpers/.samtools-rc' >> ~/.bashrc
```

## Usage
The main useful scripts here are `samtools-view` (alias `sv`) and its variants: `samtools-view-with-header` (a.k.a. `svh`) and `samtools-view-less` (a.k.a. `svl`).

Each of them takes a `.sam` file, runs `samtools view` on it, and then makes the following improvements:
* converts the integer "bit flag" field (the second column) to a string of 12 `0`s and `1`s
* formats the output as a table, so that e.g. longer vs. shorter read names in the first column don't throw off the alignment of subsequent columns.

## Examples
#### First 5 non-header lines, using `samtools-view`:
```sh
$ sv 5 NA12878.sam
20FUKAAXX100202:3:6:15018:84106 000010100011 20 224759 60 101M = 225025 366 ACCCAAATCTAATCAAGGCTCCCACTCTAACTCCCAAGCTCTAGGATATACCAAGGACAAAGGAAGATCATGAAATACCACCATGGGGATTCAATCAGCAA ?@BBBCEEDFEFEEEFDEEFEEEEBFEDEFCFDDEEFEDFDFEEEFEEEECEEFEEFCEFDEEFFEFEDEEEFFFDECEDCEFEEDDFFBFEFGEAEDCCC MD:Z:101 PG:Z:BWA RG:Z:20FUK.3 AM:i:37 NM:i:0 SM:i:37 MQ:i:60 OQ:Z:HHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHGHHHHHHHHHHHHHHFHHHGHHHHHIIHHDHHHHHEHHHHH UQ:i:0
20GAVAAXX100126:8:62:5578:2527 001001010011 20 224759 60 101M = 224453 -406 ACCCAAAGCTAATCAAGGCTCCCACTCTAACTCCCAAGCTCTAGGATATACCAAGGACAAAGGAAGATCATGAAATACCACCATGGGGATTCAATCAGCAA 834:/,1(:8::8::<98;-(-;>5?08/:;/+7<;=>?@:9>;==<=:<8<>?4>B>AABAAB@@;;<<=>===9>9?=9>=?==;=:;>>@3@;1 MD:Z:7T93 PG:Z:BWA RG:Z:20GAV.8 AM:i:25 NM:i:1 SM:i:37 MQ:i:60 OQ:Z:C4541/1.55555555544008??9?1514401555?AAA;5554444555?A?7AFEFFFFFFDF55555444454445555444@5@==5555555555 UQ:i:7
20FUKAAXX100202:4:47:20584:49257 000010100011 20 224761 60 101M = 225058 387 CCAAATCTAATCAAGGCTCCCACTCTAACTCCCAAGCTCTAGGATATACCAAGGACAAAGGAAGATCATGAAATACCACCATGGGGATTCAATCAGCAAAT ?ACDBBCEDFEDEFEEEFEDBECFBFEFCFDEEEFEDFDFEEEFEEEECEEFEEFCEFFEEFFEFEDEAEFFFAECEFCDFEEFBFFDBEEC:@6A?C4>B MD:Z:101 PG:Z:BWA RG:Z:20FUK.4 AM:i:37 NM:i:0 SM:i:37 MQ:i:60 OQ:Z:HHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHEHHHHDHHHHHIHHHHFHGIGHFE;D9BBD7AH UQ:i:0
20GAVAAXX100126:7:47:4730:37293 000010100011 20 224761 60 101M = 225073 412 CCAAATCTAATCAAGGCTCCCACTCTAACTCCCAAGCTCTAGGATATACCAAGGACAAAGGAAGATCATGAAATACCACCATGGGGATTCAATCAGCAAAT ?BB@BCBFDDECC=E@@DB;BDCFDE<BADD>?C?EDEB>@AC==DAE?E=CAC?;:>4=B676<17@@<:AA<;6 MD:Z:101 PG:Z:BWA RG:Z:20GAV.7 AM:i:37 NM:i:0 SM:i:37 MQ:i:60 OQ:Z:BBA>AB@BB@BA?>B==??7>@BBA@:6@@@@@@A@BAA>A?B@BA?=?>9=????@?@>>>@?67@<;??@>?@????@9:96=>2236-39=73@:652 UQ:i:0
20GAVAAXX100126:5:46:21151:39489 000001010011 20 224761 60 101M = 224465 -396 CCAAATCTAATCAAGGCTCCCACTCTAACTCCCAAGCTCTAGGATATACCAAGGACAAAGGAAGATCATGAAATACCACCATGGGGATTCAATCAGCAAAT >9<=BBB>BB>EFFEEEFEEECEFEEFDEFEEEFFEEFEEFDDEEEEDEEFFDDDDFFFDDFFDEFDEEDFFEEEEEEEEEFEEEEEFFEFEFEF=DED=A MD:Z:101 PG:Z:BWA RG:Z:20GAV.5 AM:i:37 NM:i:0 SM:i:37 MQ:i:60 OQ:Z:DBGGFDFCFFBHHHHHHHHHHGHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHEHHHGH UQ:i:0
```
It's still on you to know [which of the 12 bits mean what](https://samtools.github.io/hts-specs/SAMv1.pdf), but it's a lot better than doing the binary conversion in your head!

#### First 5 non-header lines, using regular `samtools view`:
```sh
$ samtools view NA12878.sam | head -n 5
20FUKAAXX100202:3:6:15018:84106 163 20 224759 60 101M = 225025 366 ACCCAAATCTAATCAAGGCTCCCACTCTAACTCCCAAGCTCTAGGATATACCAAGGACAAAGGAAGATCATGAAATACCACCATGGGGATTCAATCAGCAA ?@BBBCEEDFEFEEEFDEEFEEEEBFEDEFCFDDEEFEDFDFEEEFEEEECEEFEEFCEFDEEFFEFEDEEEFFFDECEDCEFEEDDFFBFEFGEAEDCCC MD:Z:101 PG:Z:BWA RG:Z:20FUK.3 AM:i:37 NM:i:0 SM:i:37 MQ:i:60 OQ:Z:HHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHGHHHHHHHHHHHHHHFHHHGHHHHHIIHHDHHHHHEHHHHH UQ:i:0
20GAVAAXX100126:8:62:5578:2527 595 20 224759 60 101M = 224453 -406 ACCCAAAGCTAATCAAGGCTCCCACTCTAACTCCCAAGCTCTAGGATATACCAAGGACAAAGGAAGATCATGAAATACCACCATGGGGATTCAATCAGCAA 834:/,1(:8::8::<98;-(-;>5?08/:;/+7<;=>?@:9>;==<=:<8<>?4>B>AABAAB@@;;<<=>===9>9?=9>=?==;=:;>>@3@;1 MD:Z:7T93 PG:Z:BWA RG:Z:20GAV.8 AM:i:25 NM:i:1 SM:i:37 MQ:i:60 OQ:Z:C4541/1.55555555544008??9?1514401555?AAA;5554444555?A?7AFEFFFFFFDF55555444454445555444@5@==5555555555 UQ:i:7
20FUKAAXX100202:4:47:20584:49257 163 20 224761 60 101M = 225058 387 CCAAATCTAATCAAGGCTCCCACTCTAACTCCCAAGCTCTAGGATATACCAAGGACAAAGGAAGATCATGAAATACCACCATGGGGATTCAATCAGCAAAT ?ACDBBCEDFEDEFEEEFEDBECFBFEFCFDEEEFEDFDFEEEFEEEECEEFEEFCEFFEEFFEFEDEAEFFFAECEFCDFEEFBFFDBEEC:@6A?C4>B MD:Z:101 PG:Z:BWA RG:Z:20FUK.4 AM:i:37 NM:i:0 SM:i:37 MQ:i:60 OQ:Z:HHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHEHHHHDHHHHHIHHHHFHGIGHFE;D9BBD7AH UQ:i:0
20GAVAAXX100126:7:47:4730:37293 163 20 224761 60 101M = 225073 412 CCAAATCTAATCAAGGCTCCCACTCTAACTCCCAAGCTCTAGGATATACCAAGGACAAAGGAAGATCATGAAATACCACCATGGGGATTCAATCAGCAAAT ?BB@BCBFDDECC=E@@DB;BDCFDE<BADD>?C?EDEB>@AC==DAE?E=CAC?;:>4=B676<17@@<:AA<;6 MD:Z:101 PG:Z:BWA RG:Z:20GAV.7 AM:i:37 NM:i:0 SM:i:37 MQ:i:60 OQ:Z:BBA>AB@BB@BA?>B==??7>@BBA@:6@@@@@@A@BAA>A?B@BA?=?>9=????@?@>>>@?67@<;??@>?@????@9:96=>2236-39=73@:652 UQ:i:0
20GAVAAXX100126:5:46:21151:39489 83 20 224761 60 101M = 224465 -396 CCAAATCTAATCAAGGCTCCCACTCTAACTCCCAAGCTCTAGGATATACCAAGGACAAAGGAAGATCATGAAATACCACCATGGGGATTCAATCAGCAAAT >9<=BBB>BB>EFFEEEFEEECEFEEFDEFEEEFFEEFEEFDDEEEEDEEFFDDDDFFFDDFFDEFDEEDFFEEEEEEEEEFEEEEEFFEFEFEF=DED=A MD:Z:101 PG:Z:BWA RG:Z:20GAV.5 AM:i:37 NM:i:0 SM:i:37 MQ:i:60 OQ:Z:DBGGFDFCFFBHHHHHHHHHHGHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHEHHHGH UQ:i:0
```

Note the opaque integer bit flags in the second field, and the misalignment of some columns.
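To make the flag conversion concrete, here is a minimal sketch of how an integer flag such as `163` could be turned into the 12-character bit string `000010100011` seen in the `sv` output. The function name `sam_flag_bits` is hypothetical and this is not necessarily how the scripts in this repo do it; it only assumes tab-separated SAM records on stdin and a POSIX `awk`.

```sh
# Hypothetical helper (not taken from this repo): rewrite field 2 (FLAG) of
# tab-separated SAM records read from stdin as a 12-character bit string,
# most-significant bit first. Header lines (starting with "@") pass through.
sam_flag_bits() {
  awk '
    BEGIN { FS = OFS = "\t" }

    # Build the bit string by repeatedly taking the lowest bit of n
    # and prepending it, 12 times.
    function bits(n,    s, i) {
      s = ""
      for (i = 0; i < 12; i++) {
        s = (n % 2) s
        n = int(n / 2)
      }
      return s
    }

    /^@/ { print; next }          # leave header lines untouched
    { $2 = bits($2); print }      # replace the integer flag in column 2
  '
}
```

It could be used as `samtools view NA12878.sam | head -n 5 | sam_flag_bits`. For the column alignment, piping the result through `column -t -s "$(printf '\t')"` is one option, assuming the `column` utility is installed.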
#### Entire `.sam` file without header:
```sh
sv NA12878.sam
# or:
samtools-view NA12878.sam
```

#### Entire `.sam` file with header:
```sh
svh NA12878.sam
# or:
samtools-view-with-header NA12878.sam
```
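
Returning to the 12 flag bits shown in the first example: as a rough aid for reading them, the sketch below lists which bits are set in a flag integer. The function name `flag_names` and the short labels are my own shorthand for the descriptions in the SAM specification linked above, not something this repo provides.

```sh
# Hypothetical helper (not part of this repo): print one line per set bit of a
# SAM FLAG integer, lowest bit first, as "<hex mask><TAB><label>".
# The body runs in a subshell, so its variables do not leak into the caller.
flag_names() (
  flag=$1
  bit=1
  # Labels are in bit order: 0x1, 0x2, 0x4, ... 0x800.
  for name in paired proper_pair unmapped mate_unmapped \
              reverse mate_reverse first_in_pair second_in_pair \
              secondary qc_fail duplicate supplementary; do
    if [ $(( flag & bit )) -ne 0 ]; then
      printf '0x%x\t%s\n' "$bit" "$name"
    fi
    bit=$(( bit * 2 ))
  done
)
```

For example, `flag_names 163` lists `paired`, `proper_pair`, `mate_reverse` and `second_in_pair`. In the bit strings printed by `sv`, the rightmost character is bit `0x1` and the leftmost is `0x800`. Recent samtools releases also include a `samtools flags` subcommand that performs a similar translation.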