https://github.com/kuixu/cryofold
Protein complex structure determination by structure prediction with cryo-EM density map constraints
- Host: GitHub
- URL: https://github.com/kuixu/cryofold
- Owner: kuixu
- Created: 2024-12-26T12:31:25.000Z (4 months ago)
- Default Branch: main
- Last Pushed: 2025-02-01T01:48:55.000Z (3 months ago)
- Last Synced: 2025-02-09T20:45:32.641Z (3 months ago)
- Language: Python
- Size: 1.46 MB
- Stars: 1
- Watchers: 1
- Forks: 0
- Open Issues: 0
Protein complex structure determination by structure prediction with cryo-EM density map constraints.
## CryoFold Architecture
![]()
## Introduction
Cryo-electron microscopy (cryo-EM) has become a prominent approach for protein structure determination, especially for large protein complexes. However, obtaining high-resolution cryo-EM density maps remains challenging, particularly for the burgeoning discipline of cryo-electron tomography (cryo-ET). Here, we introduce CryoFold, a deep learning method for protein complex structure determination from cryo-EM density maps based on folding the input protein sequences within the map through multimodal data fusion. On benchmark datasets comprising hundreds of protein complexes with both intermediate- and low-resolution maps (i.e., 4~6 Å), CryoFold generated highly accurate atomic models, vastly outperforming the sequence-alone prediction tool AlphaFold-Multimer. CryoFold also performed well when using high-resolution maps (<4 Å), and notably can construct accurate models from in situ cryo-ET data of very large complexes consisting of hundreds of protein chains and low-resolution maps of 9.9 Å resolution. Finally, we used CryoFold to build models for the EMDB density maps lacking a PDB model, and established the CryoFoldDB database currently comprising 506 new models of protein complexes that are of higher average quality than deposited structures in PDB. Thus, CryoFold is a powerfully enabling technology for expanding the attainable scope of cryo-EM protein structure determination, especially for large protein complexes.
## Usage
We provide three ways to run CryoFold:
### 1. Web Server
https://cryonet.ai/cryofold
### 2. Using API
```commandline
python3 main.py \
    --sequence FASTA.fasta \
    --map MAP.mrc
```
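When many maps need to be processed, the command above can be driven from Python. The following is a minimal sketch, not part of CryoFold itself: the helper names `build_cryofold_command` and `run_cryofold` are hypothetical, and only the `--sequence` and `--map` flags shown in this README are used.

```python
import shlex
import subprocess
from pathlib import Path


def build_cryofold_command(fasta_path, map_path, script="main.py"):
    """Return the argument list for the CryoFold command shown in this README.

    Hypothetical helper: it only assembles the two documented flags.
    """
    return [
        "python3", script,
        "--sequence", str(fasta_path),
        "--map", str(map_path),
    ]


def run_cryofold(fasta_path, map_path, script="main.py"):
    """Check that both input files exist, then launch the run."""
    for path in (fasta_path, map_path):
        if not Path(path).is_file():
            raise FileNotFoundError(f"input file not found: {path}")
    command = build_cryofold_command(fasta_path, map_path, script)
    # check=True raises CalledProcessError if main.py exits with an error.
    return subprocess.run(command, check=True)


# Print the command instead of running it, so the sketch is safe to try.
print(shlex.join(build_cryofold_command("FASTA.fasta", "MAP.mrc")))
```

Passing the arguments as a list avoids shell quoting problems with file names that contain spaces.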
### 3. Standalone Installation
We will release the installation package upon publication of the paper. In the meantime, note the following prerequisites:
1. Up to 4 TB of disk space for the sequence and structure databases.
2. A CUDA-capable GPU with at least 40 GB of memory.
## Input files
- [FASTA.fasta]: path to the input sequence file in FASTA format (*.fasta).
- [MAP.mrc]: path to the input cryo-EM/ET density map.
- [RECYCLE_TIMES]: number of recycles; the default is 8.
- [CHECKPOINT_PATH]: path to the model checkpoint; the default is 'params/cryofold_v1'.
- [GPU_DEVICE]: ID of the GPU device to use.

Example of a FASTA.fasta file:
```
>A
MITDSLAVVLQRRDWENPGVTQLNRLAAHPPFASWRNSEEARTDRPSQQLRS
>B
MITDSLAVVLQRRDWENPGVTQLNRLAAHPPFASWRNSEEARTDRPSQQLRS
>C
MITDSLAVVLQRRDWENPGVTQLNRLAAHPPFASWRNSEEARTDRPSQQLRS
>D
MITDSLAVVLQRRDWENPGVTQLNRLAAHPPFASWRNSEEARTDRPSQQLRS
```
In the description line, you can give just the chain ID without any other information. For a complex with multiple chains, each polypeptide chain gets its own record, i.e., the number of records equals the number of chains.
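A quick check of the FASTA file before a long run can catch formatting slips. The sketch below parses the records and reports obvious problems; the function names are hypothetical, and the requirement that chain IDs be unique is an assumption of this sketch rather than a documented CryoFold rule.

```python
from collections import Counter

EXAMPLE_FASTA = """\
>A
MITDSLAVVLQRRDWENPGVTQLNRLAAHPPFASWRNSEEARTDRPSQQLRS
>B
MITDSLAVVLQRRDWENPGVTQLNRLAAHPPFASWRNSEEARTDRPSQQLRS
>C
MITDSLAVVLQRRDWENPGVTQLNRLAAHPPFASWRNSEEARTDRPSQQLRS
>D
MITDSLAVVLQRRDWENPGVTQLNRLAAHPPFASWRNSEEARTDRPSQQLRS
"""


def read_fasta(text):
    """Parse FASTA text into a list of (chain_id, sequence) pairs."""
    records = []
    chain_id, chunks = None, []
    for raw in text.splitlines():
        line = raw.strip()
        if not line:
            continue
        if line.startswith(">"):
            if chain_id is not None:
                records.append((chain_id, "".join(chunks)))
            # The first token of the description line is taken as the chain ID.
            tokens = line[1:].split()
            chain_id = tokens[0] if tokens else ""
            chunks = []
        elif chain_id is None:
            raise ValueError("sequence data found before the first '>' line")
        else:
            chunks.append(line)
    if chain_id is not None:
        records.append((chain_id, "".join(chunks)))
    return records


def check_records(records):
    """Return a list of human-readable problems (empty if none were found)."""
    problems = []
    counts = Counter(chain_id for chain_id, _ in records)
    for chain_id, count in counts.items():
        if count > 1:
            # Assumption of this sketch: chain IDs should not repeat.
            problems.append(f"chain ID {chain_id!r} appears {count} times")
    for chain_id, sequence in records:
        if not chain_id:
            problems.append("record with an empty chain ID")
        if not sequence:
            problems.append(f"chain {chain_id!r} has no sequence")
    return problems


records = read_fasta(EXAMPLE_FASTA)
print(len(records), "chains:", [chain_id for chain_id, _ in records])
print("problems:", check_records(records))
```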
In the FASTA example above, the four chains share an identical sequence.
## Output files
After running the script, the predicted model is written to the current directory as ``./[MAP]_cryofold.pdb``.
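To locate and sanity-check the result from a script, something like the following may help. It is a sketch under stated assumptions: `[MAP]` is taken to be the map file name without its extension, which the README does not spell out, and both helper names are hypothetical. Chain IDs are read from column 22 of the `ATOM`/`HETATM` records, as defined by the PDB format.

```python
from pathlib import Path


def expected_output_path(map_path, out_dir="."):
    """Guess the output file name for a given map.

    Assumption: [MAP] is the map file name without its extension.
    """
    return Path(out_dir) / f"{Path(map_path).stem}_cryofold.pdb"


def chain_ids_in_pdb(pdb_path):
    """Return the chain IDs found in a PDB file, in order of first appearance."""
    chains = []
    with open(pdb_path) as handle:
        for line in handle:
            # Column 22 (index 21) of ATOM/HETATM records holds the chain ID.
            if line.startswith(("ATOM", "HETATM")) and len(line) > 21:
                chain_id = line[21]
                if chain_id not in chains:
                    chains.append(chain_id)
    return chains


output = expected_output_path("MAP.mrc")
print(output)
if output.is_file():
    print("chains in model:", chain_ids_in_pdb(output))
```

Comparing the chains found in the model with the records in the input FASTA file is a cheap way to confirm that every chain was built.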
## CryoFoldDB
https://cryofolddb.ai
![]()
## Copyright (C)
Protein complex structure determination by structure prediction with cryo-EM density map constraints.
Copyright (C) 2024. Kui Xu, Zhuo-Er Dong, Xing Zhang, Xin You, Pan Li, Nan Liu, Muzhi Dai, Chuangye Yan, Nieng Yan, Hong-Wei Wang, Sen-Fang Sui, Qiangfeng Cliff Zhang.
License: MIT