https://github.com/flatironinstitute/deepblast
Neural Networks for Protein Sequence Alignment
- Host: GitHub
- URL: https://github.com/flatironinstitute/deepblast
- Owner: flatironinstitute
- License: bsd-3-clause
- Created: 2020-06-04T22:28:59.000Z (over 4 years ago)
- Default Branch: master
- Last Pushed: 2024-11-11T18:12:13.000Z (about 1 month ago)
- Last Synced: 2024-11-11T19:22:49.112Z (about 1 month ago)
- Topics: language-modeling, neural-networks, protein, protein-sequences, protein-structure, sequence-alignment, structural-alignments
- Language: Python
- Homepage:
- Size: 56.7 MB
- Stars: 114
- Watchers: 6
- Forks: 21
- Open Issues: 32
Metadata Files:
- Readme: README.md
- License: COPYING.txt
Awesome Lists containing this project
- awesome-protein-design-software - DeepBLAST - [paper](https://doi.org/10.1101/2020.11.03.365932) (Sequence similarity search and (structural) alignment)
README
[![DOI](https://zenodo.org/badge/269478463.svg)](https://zenodo.org/badge/latestdoi/269478463)
# DeepBLAST
Learning protein structural similarity from sequence alone. Our preprint can be found [here](https://www.biorxiv.org/content/10.1101/2020.11.03.365932v1).

DeepBLAST is a neural-network-based alignment algorithm that estimates structural alignments from sequence. The alignments it generates are nearly identical to those produced by state-of-the-art structural alignment algorithms.
![Malidup benchmark](/imgs/malidup.png)

# Installation
DeepBLAST can be installed from pip via
```
pip install deepblast
```

To install from the development branch, run
```
pip install git+https://github.com/flatironinstitute/deepblast.git
```

# Downloading pretrained models and data
- [DeepBLAST and TMvec checkpoints](https://figshare.com/articles/dataset/TMvec_DeepBLAST_models/25810099)
- [Training data and databases](https://zenodo.org/records/11199459)

See the [Malisam](http://prodata.swmed.edu/malisam/) and [Malidup](http://prodata.swmed.edu/malidup/) websites to download their datasets.
# Getting started
See the [wiki](https://github.com/flatironinstitute/deepblast/wiki) on how to use DeepBLAST and TM-vec for remote homology search and alignment.
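
Before following the wiki, it can help to confirm that the install step succeeded. The snippet below is a minimal sketch that uses only the Python standard library. It assumes the distribution name is `deepblast`, as in the `pip install` command. The helper `installed_version` is a hypothetical name introduced for illustration and is not part of DeepBLAST.

```python
from importlib.metadata import PackageNotFoundError, version
from typing import Optional


def installed_version(package: str) -> Optional[str]:
    """Return the installed version of `package`, or None if it is not installed.

    Hypothetical helper for illustration; not part of the DeepBLAST API.
    """
    try:
        return version(package)
    except PackageNotFoundError:
        return None


if __name__ == "__main__":
    # "deepblast" is the distribution name used in the pip command.
    found = installed_version("deepblast")
    if found is None:
        print("deepblast is not installed; run `pip install deepblast` first")
    else:
        print(f"deepblast {found} is installed")
```

This only checks the package metadata. It does not load any model checkpoints, which are downloaded separately.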
If you have questions on how to use DeepBLAST and TM-vec, feel free to raise them in the [discussions section](https://github.com/flatironinstitute/deepblast/discussions). If you identify any potential bugs, feel free to report them in the [issue tracker](https://github.com/flatironinstitute/deepblast/issues).

# Citation
If you find our work useful, please cite us at
```
@article{morton2020protein,
title={Protein Structural Alignments From Sequence},
author={Morton, Jamie and Strauss, Charlie and Blackwell, Robert and Berenberg, Daniel and Gligorijevic, Vladimir and Bonneau, Richard},
journal={bioRxiv},
year={2020},
publisher={Cold Spring Harbor Laboratory}
}

@article{hamamsy2022tm,
title={TM-Vec: template modeling vectors for fast homology detection and alignment},
author={Hamamsy, Tymor and Morton, James T and Berenberg, Daniel and Carriero, Nicholas and Gligorijevic, Vladimir and Blackwell, Robert and Strauss, Charlie EM and Leman, Julia Koehler and Cho, Kyunghyun and Bonneau, Richard},
journal={bioRxiv},
pages={2022--07},
year={2022},
publisher={Cold Spring Harbor Laboratory}
}
```