https://github.com/cmdcolin/clustal-js
- Host: GitHub
- URL: https://github.com/cmdcolin/clustal-js
- Owner: cmdcolin
- Created: 2019-04-03T16:19:24.000Z (over 5 years ago)
- Default Branch: master
- Last Pushed: 2024-09-24T04:22:44.000Z (about 2 months ago)
- Last Synced: 2024-11-03T02:06:59.407Z (16 days ago)
- Language: TypeScript
- Size: 543 KB
- Stars: 1
- Watchers: 3
- Forks: 0
- Open Issues: 2
Metadata Files:
- Readme: README.md
- Changelog: CHANGELOG.md
[![NPM version](https://img.shields.io/npm/v/clustal-js.svg?style=flat-square)](https://npmjs.org/package/clustal-js)
[![Build Status](https://img.shields.io/github/actions/workflow/status/cmdcolin/clustal-js/push.yml?branch=master)](https://github.com/cmdcolin/clustal-js/actions?query=branch%3Amaster+workflow%3APush+)

# clustal-js
This library parses output files from CLUSTAL (a multiple sequence aligner),
which are sometimes given the .aln extension.

## Usage
```typescript
import * as fs from 'fs'
import { parse } from 'clustal-js'

// Read the alignment as text, then parse it
const file = fs.readFileSync('test.aln', 'utf8')
const ret = parse(file)
```

## Example
Input
```
CLUSTAL O(1.2.4) multiple sequence alignment

sp|P69905|HBA_HUMAN MVLSPADKTNVKAAWGKVGAHAGEYGAEALERMFLSFPTTKTYFPHFDLSHGSAQVKGHG 60
sp|P01942|HBA_MOUSE MVLSGEDKSNIKAAWGKIGGHGAEYGAEALERMFASFPTTKTYFPHFDVSHGSAQVKGHG 60
sp|P13786|HBAZ_CAPHI MSLTRTERTIILSLWSKISTQADVIGTETLERLFSCYPQAKTYFPHFDLHSGSAQLRAHG 60
* *: ::: : : *.*:. :. *:*:***:* .:* :********: ****::.**

sp|P69905|HBA_HUMAN KKVADALTNAVAHVDDMPNALSALSDLHAHKLRVDPVNFKLLSHCLLVTLAAHLPAEFTP 120
sp|P01942|HBA_MOUSE KKVADALASAAGHLDDLPGALSALSDLHAHKLRVDPVNFKLLSHCLLVTLASHHPADFTP 120
sp|P13786|HBAZ_CAPHI SKVVAAVGDAVKSIDNVTSALSKLSELHAYVLRVDPVNFKFLSHCLLVTLASHFPADFTA 120
.**. *: .*. :*:: .*** **:***: *********:**********:* **:**

sp|P69905|HBA_HUMAN AVHASLDKFLASVSTVLTSKYR 142
sp|P01942|HBA_MOUSE AVHASLDKFLASVSTVLTSKYR 142
sp|P13786|HBAZ_CAPHI DAHAAWDKFLSIVSGVLTEKYR 142
.**: ****: ** ***.***
```

Output
```
{ consensus:
'* *: ::: : : *.*:. :. *:*:***:* .:* :********: ****::.**.**. *: .*. :*:: .*** **:***: *********:**********:* **:** .**: ****: ** ***.***',
alns:
[ { id: 'sp|P69905|HBA_HUMAN',
seq:
'MVLSPADKTNVKAAWGKVGAHAGEYGAEALERMFLSFPTTKTYFPHFDLSHGSAQVKGHGKKVADALTNAVAHVDDMPNALSALSDLHAHKLRVDPVNFKLLSHCLLVTLAAHLPAEFTPAVHASLDKFLASVSTVLTSKYR' },
{ id: 'sp|P01942|HBA_MOUSE',
seq:
'MVLSGEDKSNIKAAWGKIGGHGAEYGAEALERMFASFPTTKTYFPHFDVSHGSAQVKGHGKKVADALASAAGHLDDLPGALSALSDLHAHKLRVDPVNFKLLSHCLLVTLASHHPADFTPAVHASLDKFLASVSTVLTSKYR' },
{ id: 'sp|P13786|HBAZ_CAPHI',
seq:
'MSLTRTERTIILSLWSKISTQADVIGTETLERLFSCYPQAKTYFPHFDLHSGSAQLRAHGSKVVAAVGDAVKSIDNVTSALSKLSELHAYVLRVDPVNFKFLSHCLLVTLASHFPADFTADAHAAWDKFLSIVSGVLTEKYR' } ],
header:
{ info: 'CLUSTAL O(1.2.4) multiple sequence alignment',
version: '1.2.4' } }
```

### Parse pairwise outputs
```typescript
import * as fs from 'fs'
import { parsePairwise } from 'clustal-js'

// Pairwise alignments use parsePairwise rather than parse
const file = fs.readFileSync('test.aln', 'utf8')
const ret = parsePairwise(file)
```

Input test.aln (e.g. from EMBOSS needle)
```
########################################
# Program: needle
# Rundate: Mon 5 Feb 2024 17:52:19
# Commandline: needle
# -auto
# -stdout
# -asequence emboss_needle-R20240205-175207-0261-70863964-p1m.asequence
# -bsequence emboss_needle-R20240205-175207-0261-70863964-p1m.bsequence
# Align_format: srspair
# Report_file: stdout
########################################

#=======================================
#
# Aligned_sequences: 2
# 1: a
# 2: b
# Matrix: EBLOSUM62
# Gap_penalty: 10.0
# Extend_penalty: 0.5
#
# Length: 614
# Identity: 221/614 (36.0%)
# Similarity: 221/614 (36.0%)
# Gaps: 393/614 (64.0%)
# Score: 1114.0
#
#
#=======================================

a 1 MGQKGHKDSLYPCGGTPESSLHEALDQCMTALDLFLTNQFSEALSYLKPR 50
b 0 -------------------------------------------------- 0
a 51 TKESMYHSLTYATILEMQAMMTFDPQDILLAGNMMKEAQMLCQRHRRKSS 100
b 0 -------------------------------------------------- 0
a 101 VTDSFSSLVNRPTLGQFTEEEIHAEVCYAECLLQRAALTFLQGSSHGGAV 150
b 0 -------------------------------------------------- 0
a 151 RPRALHDPSHACSCPPGPGRQHLFLLQDENMVSFIKGGIKVRNSYQTYKE 200
b 0 -------------------------------------------------- 0
a 201 LDSLVQSSQYCKGENHPHFEGGVKLGVGAFNLTLSMLPTRILRLLEFVGF 250
b 0 -------------------------------------------------- 0
a 251 SGNKDYGLLQLEEGASGHSFRSVLCVMLLLCYHTFLTFVLGTGNVNIEEA 300
b 0 -------------------------------------------------- 0
a 301 EKLLKPYLNRYPKGAIFLFFAGRIEVIKGNIDAAIRRFEECCEAQQHWKQ 350
b 0 -------------------------------------------------- 0
a 351 FHHMCYWELMWCFTYKGQWKMSYFYADLLSKENCWSKATYIYMKAAYLSM 400
||||||||
b 1 ------------------------------------------MKAAYLSM 8
a 401 FGKEDHKPFGDDEVELFRAVPGLKLKIAGKSLPTEKFAIRKSRRYFSSNP 450
||||||||||||||||||||||||||||||||||||||||||||||||||
b 9 FGKEDHKPFGDDEVELFRAVPGLKLKIAGKSLPTEKFAIRKSRRYFSSNP 58
a 451 ISLPVPALEMMYIWNGYAVIGKQPKLTDGILEIITKAEEMLEKGPENEYS 500
||||||||||||||||||||||||||||||||||||||||||||||||||
b 59 ISLPVPALEMMYIWNGYAVIGKQPKLTDGILEIITKAEEMLEKGPENEYS 108
a 501 VDDECLVKLLKGLCLKYLGRVQEAEENFRSISANEKKIKYDHYLIPNALL 550
||||||||||||||||||||||||||||||||||||||||||||||||||
b 109 VDDECLVKLLKGLCLKYLGRVQEAEENFRSISANEKKIKYDHYLIPNALL 158
a 551 ELALLLMEQDRNEEAIKLLESAKQNYKNYSMESRTHFRIQAATLQAKSSL 600
||||||||||||||||||||||||||||||||||||||||||||||||||
b 159 ELALLLMEQDRNEEAIKLLESAKQNYKNYSMESRTHFRIQAATLQAKSSL 208
a 601 ENSSRSMVSSVSL- 613
|||||||||||||
b 209 ENSSRSMVSSVSL* 222

#---------------------------------------
#---------------------------------------
```

### Notes
See tests for example files
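
As a rough illustration of working with the object that `parse` returns, here is a minimal sketch that converts it to FASTA text. It assumes the result has the `{ consensus, alns, header }` shape printed in the Example output. The `ClustalResult` and `AlignmentRow` types and the `toFasta` helper are hypothetical names written for this sketch and are not part of clustal-js. The sample data is made up and deliberately short.

```typescript
// Hypothetical types mirroring the object shown in the Example output.
// They are assumptions for this sketch, not types exported by clustal-js.
interface AlignmentRow {
  id: string
  seq: string
}

interface ClustalResult {
  consensus: string
  alns: AlignmentRow[]
  header: { info: string; version: string }
}

// Hypothetical helper: emit one FASTA record per alignment row,
// wrapping each sequence at `width` characters per line.
function toFasta(result: ClustalResult, width = 60): string {
  return result.alns
    .map(({ id, seq }) => {
      const lines: string[] = []
      for (let i = 0; i < seq.length; i += width) {
        lines.push(seq.slice(i, i + width))
      }
      return [`>${id}`, ...lines].join('\n')
    })
    .join('\n')
}

// Made-up sample standing in for the return value of parse(file)
const sample: ClustalResult = {
  consensus: '**.*',
  alns: [
    { id: 'seqA', seq: 'MVLSPA' },
    { id: 'seqB', seq: 'MVLS-A' },
  ],
  header: { info: 'CLUSTAL O(1.2.4) multiple sequence alignment', version: '1.2.4' },
}

console.log(toFasta(sample, 4))
```

Gap characters are kept as they appear in `seq`, so the output is an aligned FASTA. Strip `-` from each sequence first if unaligned records are needed.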