# List of papers about Protein Design using Deep Learning

> This repository is inspired by the remarkable work of [Kevin Kaichuang Yang](https://github.com/yangkky) and their outstanding project [Machine-learning-for-proteins](https://github.com/yangkky/Machine-learning-for-proteins). We have established this repository to provide a specialized and focused platform for the field of **Deep Learning for Protein Design**, a rapidly advancing domain in computational biology.
>
> [Contributions](https://github.com/Peldom/papers_for_protein_design_using_DL/blob/main/CONTRIBUTING.md) and [suggestions](https://github.com/Peldom/papers_for_protein_design_using_DL/issues) are warmly welcome!
>
> Community Values, Guiding Principles, and Commitments for the Responsible Development of AI for Protein Design: [details](https://responsiblebiodesign.ai/)

*Papers last week, updated on 2025.03.22:*

+ From Atoms to Fragments: A Coarse Representation for Efficient and Functional Protein Design
  + [[bioRxiv 2025.03.19.644162](https://www.biorxiv.org/content/10.1101/2025.03.19.644162v2)] • [[Supplementary](https://www.biorxiv.org/content/biorxiv/early/2025/03/20/2025.03.19.644162/DC1/embed/media-1.pdf)] • RFdiffusion-based
+ Tuning ProteinMPNN to reduce protein visibility via MHC Class I through direct preference optimization
  + [[Protein Engineering, Design and Selection (2025)](https://academic.oup.com/peds/advance-article/doi/10.1093/protein/gzaf003/8082933)] • [[code](https://github.com/hcgasser/CAPE_MPNN)] • ProteinMPNN-based
+ Advanced Deep Learning Methods for Protein Structure Prediction and Design
  + [[arXiv:2503.13522](https://arxiv.org/abs/2503.13522v1)]
+ Inhibition of ice recrystallization with designed twistless helical repeat proteins
  + [[bioRxiv 2025.03.09.642278](https://www.biorxiv.org/content/10.1101/2025.03.09.642278v1)] • [[Supplementary](https://www.biorxiv.org/content/biorxiv/early/2025/03/13/2025.03.09.642278/DC1/embed/media-1.pdf)] • [[code](https://doi.org/10.5281/zenodo.13763849)] • RFdiffusion/ProteinMPNN-based
+ HighPlay: Cyclic Peptide Sequence Design Based on Reinforcement Learning and Protein Structure Prediction
  + [[bioRxiv 2025.03.17.643626](http://biorxiv.org/content/10.1101/2025.03.17.643626v1)]
+ Neo-1
  + paper not available • [[website](https://www.vant.ai/neo-1)] • commercial

---





+ 0) Benchmarks and datasets
  + Sequence datasets/benchmarks
  + Structure datasets/benchmarks
  + Public database
  + Similar list
  + Guides
+ 1) Reviews and surveys
  + De novo design
  + Antibody design
  + Peptide design
  + Binder design
  + Enzyme design
+ 2) Model-based design
  + trRosetta-based
  + AlphaFold2-based
  + DMPfold2-based
  + CM-Align
  + MSA transformer-based
  + DeepAb-based
  + TRFold2-based
  + GPT-based
  + ESM-based
  + Antiberta-based
  + Sampling-algorithms
+ 3) Function to Scaffold
  + GAN-based
  + AutoEncoder-based
  + MLP-based
  + Diffusion-based
  + RL-based
  + Flow-based
  + Score-based
+ 4) Scaffold to Sequence
  + Review
  + MLP-based
  + VAE-based
  + LSTM-based
  + CNN-based
  + GNN-based
  + GAN-based
  + Transformer-based
  + ResNet-based
  + Diffusion-based
  + Bayesian method
  + Flow-based
+ 5) Function to Sequence
  + CNN-based
  + VAE-based
  + GAN-based
  + Transformer-based
  + Bayesian method
  + Reinforcement Learning
  + Flow-based
  + RNN-based
  + LSTM-based
  + Autoregressive
  + Boltzmann machine
  + Diffusion-based
  + GNN-based
  + Score-based
+ 6) Function to Structure
  + Review
  + LSTM-based
  + Diffusion-based
  + RoseTTAFold-based
  + CNN-based
  + GNN-based
  + Transformer-based
  + MLP-based
  + Flow-based
  + AlphaFold-based
+ 7) Other
  + Effects of mutations & Fitness Landscape
  + Protein Language Model & Representation Learning
  + Molecular Design Model
  + Unclassified

---

## 0. Benchmarks and datasets

### 0.1 Sequence Datasets, Benchmarks

**FLIP: Benchmark tasks in fitness landscape inference for proteins**
Christian Dallago, Jody Mou, Kadina E Johnston, Bruce Wittmann, Nick Bhattacharya, Samuel Goldman, Ali Madani, Kevin K Yang
[NeurIPS 2021 Datasets and Benchmarks Track](https://openreview.net/forum?id=p2dMLEwL8tF)/[bioRxiv 2021](https://www.biorxiv.org/content/10.1101/2021.11.09.467890v2) • [website](https://benchmark.protein.properties/) • [code](https://github.com/J-SNACKKB/FLIP) • [Supplementary](https://www.biorxiv.org/content/biorxiv/early/2022/01/19/2021.11.09.467890/DC1/embed/media-1.pdf)

**A Benchmark Framework for Evaluating Structure-to-Sequence Models for Protein Design**
Jeffrey Chan, Seyone Chithrananda, David Brookes, Sam Sinai
Paper unavailable at [Machine Learning in Structural Biology Workshop 2022](https://nips.cc/Conferences/2022/ScheduleMultitrack?event=50005)

**PDBench: Evaluating Computational Methods for Protein-Sequence Design**
Leonardo V Castorina, Rokas Petrenas, Kartic Subr, Christopher W Wood
[Bioinformatics, 2023, btad027](https://academic.oup.com/bioinformatics/advance-article/doi/10.1093/bioinformatics/btad027/6986968) • [code](https://github.com/wells-wood-research/PDBench)

**Benchmarking deep generative models for diverse antibody sequence design**
Igor Melnyk, Payel Das, Vijil Chenthamarakshan, Aurelie Lozano
[arXiv:2111.06801](https://arxiv.org/abs/2111.06801)

**The Protein Engineering Tournament: An Open Science Benchmark for Protein Modeling and Design**
Chase Armer, Hassan Kane, Dana Cortade, Dave Estell, Adil Yusuf, Radhakrishna Sanka, Henning Redestig, TJ Brunette, Pete Kelly, Erika DeBenedictis
[arXiv:2309.09955](https://arxiv.org/abs/2309.09955v2)

**Computational Scoring and Experimental Evaluation of Enzymes Generated by Neural Networks**
Sean R. Johnson, Xiaozhi Fu, Sandra Viknander, Clara Goldin, Sarah Monaco, Aleksej Zelezniak, Kevin K. Yang
[bioRxiv (2023)](https://www.biorxiv.org/content/10.1101/2023.03.04.531015v2) • [code](https://github.com/seanrjohnson/protein_scoring)

**FLOP: Tasks for Fitness Landscapes Of Protein Wildtypes**
Peter Mørch Groth, Richard Michael, Jesper Salomon, Pengfei Tian, Wouter Boomsma
[bioRxiv 2023.06.21.545880](https://www.biorxiv.org/content/10.1101/2023.06.21.545880v2) • [code](https://github.com/petergroth/FLOP)

**ProteinGym: Large-Scale Benchmarks for Protein Design and Fitness Prediction**
Pascal Notin, Aaron W Kollasch, Daniel Ritter, Lood van Niekerk, Steffanie Paul, Hansen Spinner, Nathan Rollins, Ada Shaw, Ruben Weitzman, Jonathan Frazer, Mafalda Dias, Dinko Franceschi, Rose Orenbuch, Yarin Gal, Debora S Marks
[bioRxiv 2023.12.07.570727](https://biorxiv.org/content/10.1101/2023.12.07.570727v1) • [code](https://github.com/OATML-Markslab/ProteinGym)

**Results of the Protein Engineering Tournament: An Open Science Benchmark for Protein Modeling and Design**
Chase Armer, Hassan Kane, Dana L. Cortade, Henning Redestig, David A. Estell, Adil Yusuf, Nathan Rollins, Hansen Spinner, Debora Marks, TJ Brunette, Peter J. Kelly, Erika DeBenedictis
[bioRxiv 2024.08.12.606135](https://www.biorxiv.org/content/10.1101/2024.08.12.606135v1) • [code](https://github.com/the-protein-engineering-tournament/pet-pilot-2023) • [Supplementary](https://www.biorxiv.org/content/biorxiv/early/2024/08/12/2024.08.12.606135/DC1/embed/media-1.pdf)

**Generative AI Models for the Protein Scaffold Filling Problem**
Letu Qingge, Kushal Badal, Richard Annan, Jordan Sturtz, Xiaowen Liu, and Binhai Zhu
[Journal of Computational Biology](https://www.liebertpub.com/doi/10.1089/cmb.2024.0510)

**Benchmarking Inverse Folding Models for Antibody CDR Sequence Design**
Per Junior Greisen, Yifan Li, Yuxiang Lang, Chenrui Xu, Yi Zhou, Ziwei Pang
[bioRxiv 2024.12.16.628614](https://www.biorxiv.org/content/10.1101/2024.12.16.628614v1)

**Self-supervised machine learning methods for protein design improve sampling but not the identification of high-fitness variants**
Moritz Ertelt, Rocco Moretti, Jens Meiler, and Clara T. Schoeder
[Science Advances 11.7 (2025)](https://www.science.org/doi/10.1126/sciadv.adr7338) • [code](https://github.com/meilerlab/probabilities_design)

### 0.2 Structure Datasets, Benchmarks

**AlphaDesign: A graph protein design method and benchmark on AlphaFoldDB**
Zhangyang Gao, Cheng Tan, Stan Z. Li
[arXiv (2022)](https://arxiv.org/abs/2202.01079)

**SidechainNet: An All-Atom Protein Structure Dataset for Machine Learning**
Jonathan E. King, David Ryan Koes
[arXiv](https://arxiv.org/abs/2010.08162) • [github::sidechainnet](https://github.com/jonathanking/sidechainnet)

[TDC](https://tdcommons.ai/overview/) maintains a resource list that currently contains 22 tasks (and their datasets) related to small molecules and macromolecules, including protein-protein interaction (PPI) and drug-drug interaction (DDI) prediction. [MoleculeNet](https://github.com/GLambard/Molecules_Dataset_Collection) published a small-molecule benchmark four years ago.

> In terms of datasets and benchmarks, protein design is far less mature than drug discovery ([Papers with Code drug discovery benchmarks](https://paperswithcode.com/task/drug-discovery)). An evaluation of deep learning methods for protein design, especially deep generative models, would be a useful addition.
>
> Difficulties and opportunities always coexist. We are happy to see the work of [Christian Dallago, Jody Mou, Kadina E. Johnston, Bruce J. Wittmann, Nicholas Bhattacharya, Samuel Goldman, Ali Madani, Kevin K. Yang](https://www.biorxiv.org/content/10.1101/2021.11.09.467890v1) and [Zhangyang Gao, Cheng Tan, Stan Z. Li](https://arxiv.org/abs/2202.01079).

**Sampling of structure and sequence space of small protein folds**
Thomas W. Linsky, Kyle Noble, Autumn R. Tobin, Rachel Crow, Lauren Carter, Jeffrey L. Urbauer, David Baker & Eva-Maria Strauch
[Nat Commun 13, 7151 (2022)](https://www.nature.com/articles/s41467-022-34937-8) • [code](https://github.com/strauchlab/scaffold_design) • [Supplementary](https://static-content.springer.com/esm/art%3A10.1038%2Fs41467-022-34937-8/MediaObjects/41467_2022_34937_MOESM1_ESM.pdf)

**OpenProteinSet: Training data for structural biology at scale**
Gustaf Ahdritz, Nazim Bouatta, Sachin Kadyan, Lukas Jarosch, Daniel Berenberg, Ian Fisk, Andrew M. Watkins, Stephen Ra, Richard Bonneau, Mohammed AlQuraishi
[arXiv:2308.05326](https://arxiv.org/abs/2308.05326) • [OpenFold](https://github.com/aqlaboratory/openfold)

**ProteinInvBench: Benchmarking Protein Design on Diverse Tasks, Models, and Metrics**
Zhangyang Gao, Cheng Tan, Yijie Zhang, Xingran Chen, Stan Z. Li
[GitHub](https://github.com/A4Bio/ProteinInvBench)

**PDB-Struct: A Comprehensive Benchmark for Structure-based Protein Design**
Chuanrui Wang, Bozitao Zhong, Zuobai Zhang, Narendra Chaudhary, Sanchit Misra, Jian Tang
[arXiv preprint arXiv:2312.00080 (2023)](https://arxiv.org/abs/2312.00080) • [code](https://github.com/WANG-CR/PDB-Struct)

**Scaffold-Lab: Critical Evaluation and Ranking of Protein Backbone Generation Methods in A Unified Framework**
Zhuoqi Zheng, Bo Zhang, Bozitao Zhong, Kexin Liu, Jinyu Yu, Zhengxin Li, JunJie Zhu, Ting Wei, Hai-Feng Chen
[bioRxiv 2024.02.10.579743](https://www.biorxiv.org/content/10.1101/2024.02.10.579743v1) • [code](https://github.com/Immortals-33/Scaffold-Lab) • [Supplementary](https://www.biorxiv.org/content/biorxiv/early/2024/02/12/2024.02.10.579743/DC1/embed/media-1.pdf)

**Antibody DomainBed: Out-of-Distribution Generalization in Therapeutic Protein Design**
Nataša Tagasovska, Ji Won Park, Matthieu Kirchmeyer, Nathan C. Frey, Andrew Martin Watkins, Aya Abdelsalam Ismail, Arian Rokkum Jamasb, Edith Lee, Tyler Bryson, Stephen Ra, Kyunghyun Cho
[arXiv:2407.21028](https://arxiv.org/abs/2407.21028) • [code](https://github.com/prescient-design/antibody-domainbed) • [dataset](https://www.dropbox.com/scl/fo/e670i9adp29yv2knfu6wd/h?rlkey=uax6phjjfumkk8xoxrbwcit1h&e=1&dl=0)

**Large protein databases reveal structural complementarity and functional locality**
Paweł Szczerbiak, Lukasz Szydlowski, Witold Wydmański, P. Douglas Renfrew, Julia Koehler Leman, Tomasz Kosciolek
[bioRxiv 2024.08.14.607935](https://www.biorxiv.org/content/10.1101/2024.08.14.607935v1) • [code](https://github.com/Tomasz-Lab/protein-structure-landscape) • [Supplementary](https://www.biorxiv.org/content/biorxiv/early/2024/08/14/2024.08.14.607935/DC1/embed/media-1.pdf) • [website](https://protein-structure-landscape.sano.science/)

**The Protein Design Archive (PDA): insights from 40 years of protein design**
Marta Chronowska, Michael J. Stam, Derek N. Woolfson, Luigi F. Di Costanzo, Christopher W. Wood
[bioRxiv 2024.09.05.611465](https://www.biorxiv.org/content/10.1101/2024.09.05.611465v1)/[Nat Biotechnol (2025)](https://www.nature.com/articles/s41587-025-02607-x) • [code](https://github.com/wells-wood-research/chronowska-stam-wood-2024-protein-design-archive) • [Supplementary](https://www.biorxiv.org/content/biorxiv/early/2024/09/07/2024.09.05.611465/DC1/embed/media-1.docx) • [website](https://pragmaticproteindesign.bio.ed.ac.uk/pda/)

**ProteinBench: A Holistic Evaluation of Protein Foundation Models**
Fei Ye, Zaixiang Zheng, Dongyu Xue, Yuning Shen, Lihao Wang, Yiming Ma, Yan Wang, Xinyou Wang, Xiangxin Zhou, Quanquan Gu
[arXiv:2409.06744](https://arxiv.org/abs/2409.06744) • [code](https://proteinbench.github.io/)

**Benchmarking Generative Models for Antibody Design & Exploring Log-Likelihood for Sequence Ranking**
Talip Uçar, Cedric Malherbe, Ferran Gonzalez
[bioRxiv 2024.10.07.617023](https://www.biorxiv.org/content/10.1101/2024.10.07.617023v3) • [code](https://github.com/AstraZeneca/DiffAbXL)

**Towards Robust Evaluation of Protein Generative Models: A Systematic Analysis of Metrics**
Pavel Strashnov, Andrey Shevtsov, Viacheslav Meshchaninov, Maria Ivanova, Fedor Nikolaev, Olga Kardymon, Dmitry Vetrov
[bioRxiv 2024.10.25.620213](https://www.biorxiv.org/content/10.1101/2024.10.25.620213v1)

**MotifBench: A standardized protein design benchmark for motif-scaffolding problems**
Zhuoqi Zheng, Bo Zhang, Kieran Didi, Kevin K. Yang, Jason Yim, Joseph L. Watson, Hai-Feng Chen, Brian L. Trippe
[arXiv:2502.12479](https://arxiv.org/abs/2502.12479) • [code](https://github.com/blt2114/MotifBench)

### 0.3 Databases

> A list of suggested protein databases, more lists at [CNCB](https://ngdc.cncb.ac.cn/databasecommons/).

#### 0.3.1 Sequence Database

1. [UniProt](https://www.uniprot.org/downloads)
2. [DisProt](https://disprot.org)
3. [MobiDB](https://mobidb.bio.unipd.it/)
4. [Peptipedia](https://app.peptipedia.cl/)
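
For readers who want to pull sequences from these resources programmatically, the following is a minimal sketch rather than an official client. It assumes the UniProt REST endpoint pattern `https://rest.uniprot.org/uniprotkb/<accession>.fasta`; the helper names `uniprot_fasta_url`, `parse_fasta`, and `fetch_sequences` are hypothetical and chosen only for illustration. Please check the UniProt documentation before relying on the URL pattern.

```python
from urllib.request import urlopen

# Assumed URL pattern for single-entry FASTA downloads (verify against UniProt docs).
UNIPROT_FASTA_URL = "https://rest.uniprot.org/uniprotkb/{accession}.fasta"


def uniprot_fasta_url(accession: str) -> str:
    """Build the assumed download URL for one UniProt accession."""
    return UNIPROT_FASTA_URL.format(accession=accession.strip().upper())


def parse_fasta(text: str) -> dict[str, str]:
    """Parse FASTA-formatted text into a {header: sequence} mapping."""
    records: dict[str, str] = {}
    header = None
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue  # skip blank lines between records
        if line.startswith(">"):
            header = line[1:]
            records[header] = ""
        elif header is not None:
            records[header] += line  # sequences may wrap over several lines
    return records


def fetch_sequences(accession: str) -> dict[str, str]:
    """Download and parse one entry (requires network access)."""
    with urlopen(uniprot_fasta_url(accession)) as response:
        return parse_fasta(response.read().decode("utf-8"))
```

`parse_fasta` works on any FASTA text, so it can also be used on files downloaded in bulk from the databases listed above; only `fetch_sequences` touches the network.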

#### 0.3.2 Structure Database

| Database | Description |
| ----------------------------------------------------------- | --------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- |
| [PDB](https://www.rcsb.org/) | The Protein Data Bank (PDB) is a database of 3D structural data of large biological molecules, such as proteins and nucleic acids. These data are gathered using experimental methods such as X-ray crystallography, NMR spectroscopy, or cryo-electron microscopy |
| [AlphaFoldDB](https://alphafold.ebi.ac.uk/) | AlphaFoldDB is a database of protein structure predictions produced by DeepMind's AlphaFold system. It provides highly accurate predictions of protein 3D structures |
| [PDBbind](http://www.pdbbind.org.cn/download.php) | PDBbind is a comprehensive collection of the binding data of all types of biomolecular complexes in the PDB database. It is primarily used for the development and validation of computational methods for predicting molecular interactions |
| [AB-Bind](https://github.com/sarahsirin/AB-Bind-Database) | AB-Bind is a database for antibody binding affinity data. It offers a curated set of experimental binding data and corresponding antibody-protein complex structures |
| [AntigenDB](http://crdd.osdd.net/raghava/antigendb/) | AntigenDB is a manually curated database of experimentally verified antigens that includes detailed information about the antigen, the source organism, and the associated antibodies |
| [CAMEO](https://www.cameo3d.org/) | CAMEO (Continuous Automated Model EvaluatiOn) is a project for the automated evaluation of methods predicting macromolecular structure. It continuously assesses the performance of automated protein structure prediction servers |
| [CAPRI](https://www.ebi.ac.uk/msd-srv/capri/) | The Critical Assessment of PRediction of Interactions (CAPRI) is a community-wide experiment to evaluate protein-protein interaction prediction methods |
| [PIFACE](http://prism.ccbb.ku.edu.tr/piface) | PIFACE is a web server for the prediction of protein-protein interactions. It identifies potential interaction interfaces on protein surfaces |
| [SAbDab](http://opig.stats.ox.ac.uk/webapps/newsabdab/sabdab/) | The Structural Antibody Database (SAbDab) is an automatically updated resource for the structural information of antibodies from the PDB. It allows for easy access to curated, annotated, and classified antibody structures |
| [SKEMPI v2.0](https://life.bsc.es/pid/skempi2) | SKEMPI 2.0 is a database of experimental measurements of the change in binding free energy caused by mutations in protein-protein complexes |
| [ProtCAD](http://dunbrack2.fccc.edu/protcad/) | ProtCAD is a suite of tools for the design and engineering of novel protein structures, sequences, and functions. It allows users to build and manipulate complex protein structures, generate and evaluate sequence libraries, and simulate mutational effects. |
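
To make the "change in binding free energy" reported by mutation databases such as SKEMPI concrete, here is a small sketch of the standard thermodynamic relation ΔG = RT ln(Kd) and ΔΔG = ΔG(mutant) − ΔG(wild type). The function names are hypothetical, the default temperature of 298.15 K is an assumption, and the sign convention (positive ΔΔG means weaker binding) is one common choice; individual datasets may report values differently.

```python
import math

R_KCAL = 1.987204e-3  # gas constant in kcal/(mol*K)


def binding_free_energy(kd_molar: float, temperature_k: float = 298.15) -> float:
    """Return ΔG = RT ln(Kd) in kcal/mol, with Kd given in mol/L."""
    if kd_molar <= 0:
        raise ValueError("Kd must be positive")
    return R_KCAL * temperature_k * math.log(kd_molar)


def ddg_binding(kd_wt: float, kd_mut: float, temperature_k: float = 298.15) -> float:
    """Return ΔΔG = ΔG(mutant) - ΔG(wild type) in kcal/mol.

    Assumed convention: a positive value means the mutation weakens binding.
    """
    return (binding_free_energy(kd_mut, temperature_k)
            - binding_free_energy(kd_wt, temperature_k))


# A tenfold loss in affinity (1 nM -> 10 nM) costs about 1.36 kcal/mol at 298.15 K.
example = ddg_binding(kd_wt=1e-9, kd_mut=1e-8)
```

Because ΔΔG depends only on the ratio of the two dissociation constants, the units of Kd cancel as long as both values use the same unit.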

### 0.4 Similar List

> Some similar GitHub lists that include papers about protein design using deep learning:

1. [design_tools](https://github.com/hefeda/design_tools/blob/main/README.md)
2. [awesome-AI-based-protein-design](https://github.com/opendilab/awesome-AI-based-protein-design)
3. [ProteinStructureWithDL](https://github.com/Yang-J-LIN/ProteinStructureWithDL)
4. [List of available bioinformatic tools and services](https://neurosnap.ai/services)

### 0.5 Guides

Guides/Tutorials for beginners on GitHub:

1. [how_to_create_a_protein](https://github.com/universvm/how_to_create_a_protein)
2. [protein-design-tutorials](https://github.com/ProteinDesignLab/protein-design-tutorials)

Collection of Protein Design Labs:

- [ProteinDesignLabs](https://github.com/Zuricho/ProteinDesignLabs)

## 1. Reviews

### 1.1 De novo protein design

**Protein design: from computer models to artificial intelligence**
Antonella Paladino, Filippo Marchetti, Silvia Rinaldi, Giorgio Colombo
[Wiley Interdisciplinary Reviews: Computational Molecular Science 7.5 (2017): e1318](https://wires.onlinelibrary.wiley.com/doi/10.1002/wcms.1318)

**Advances in protein structure prediction and design**
Brian Kuhlman, Philip Bradley
[Nat Rev Mol Cell Biol 20, 681-697 (2019)](https://www.nature.com/articles/s41580-019-0163-x)

**Deep learning in protein structural modeling and design**
Wenhao Gao, Sai Pooja Mahajan, Jeremias Sulam, and Jeffrey J. Gray
[Patterns 1.9](https://www.sciencedirect.com/science/article/pii/S2666389920301902) • 2020

**100th anniversary of macromolecular science viewpoint: Data-driven protein design**
Ferguson, Andrew L., and Rama Ranganathan
[ACS Macro Letters 10.3 (2021)](https://pubs.acs.org/doi/abs/10.1021/acsmacrolett.0c00885)

**Artificial intelligence in early drug discovery enabling precision medicine**
Fabio Boniolo, Emilio Dorigatti, Alexander J. Ohnmacht, Dieter Saur, Benjamin Schubert, and Michael P. Menden
[Expert Opinion on Drug Discovery 16.9 (2021)](https://www.tandfonline.com/doi/full/10.1080/17460441.2021.1918096)

**Protein design with deep learning**
Defresne, Marianne, Sophie Barbe, and Thomas Schiex
[International Journal of Molecular Sciences 22.21 (2021)](https://www.mdpi.com/1422-0067/22/21/11741)

**Protein sequence design with deep generative models**
Zachary Wu, Kadina E. Johnston, Frances H. Arnold, Kevin K. Yang
[Current Opinion in Chemical Biology 65](https://www.sciencedirect.com/science/article/pii/S136759312100051X) • [note](https://zhuanlan.zhihu.com/p/466616309) • 2021

**Structure-based protein design with deep learning**
Ovchinnikov, Sergey, and Po-Ssu Huang
[Current Opinion in Chemical Biology 65](https://www.sciencedirect.com/science/article/pii/S1367593121001125) • [note](https://zhuanlan.zhihu.com/p/467001175) • 2021

**Deep learning techniques have significantly impacted protein structure prediction and protein design**
Pearce, Robin, and Yang Zhang
[Current Opinion in Structural Biology 68 (2021)](https://www.sciencedirect.com/science/article/pii/S0959440X21000142)

**Recent advances in de novo protein design: Principles, methods, and applications**
Pan, Xingjie, and Tanja Kortemme
[Journal of Biological Chemistry 296 (2021)](https://www.sciencedirect.com/science/article/pii/S0021925821003367)

**Protein design via deep learning**
Wenze Ding, Kenta Nakai, Haipeng Gong
[Briefings in Bioinformatics](https://academic.oup.com/bib/advance-article/doi/10.1093/bib/bbac102/6554124) • 25 March 2022

**Deep generative modeling for protein design**
Strokach, Alexey, and Philip M. Kim
[Current Opinion in Structural Biology](https://www.sciencedirect.com/science/article/pii/S0959440X21001573) • 2022

**Dawn of a new era for membrane protein design**
Sowlati-Hashjin, Shahin, Aanshi Gandhi, and Michael Garton
[BioDesign Research (2022)](https://spj.science.org/doi/10.34133/2022/9791435)

**Deep learning approaches for conformational flexibility and switching properties in protein design**
Rudden, Lucas SP, Mahdi Hijazi, and Patrick Barth
[Frontiers in Molecular Biosciences](https://www.frontiersin.org/articles/10.3389/fmolb.2022.928534/full)

**Computational protein design with evolutionary-based and physics-inspired modeling: current and future synergies**
Cyril Malbranke, David Bikard, Simona Cocco, Rémi Monasson, Jérôme Tubiana
[arXiv:2208.13616v2](https://arxiv.org/abs/2208.13616v2)

**From sequence to function through structure: deep learning for protein design**
Noelia Ferruz, Michael Heinzinger, Mehmet Akdel, Alexander Goncearenco, Luca Naef, Christian Dallago
[bioRxiv 2022.08.31.505981](https://www.biorxiv.org/content/10.1101/2022.08.31.505981v1)/[Computational and Structural Biotechnology Journal Volume 21, 2023](https://www.sciencedirect.com/science/article/pii/S2001037022005086) • [Supplementary](https://www.biorxiv.org/content/biorxiv/early/2022/09/03/2022.08.31.505981/DC1/embed/media-1.pdf) • [accompanying list](https://github.com/hefeda/design_tools/blob/main/README.md)

**Computational protein design with data-driven approaches: Recent developments and perspectives**
Haiyan Liu, Quan Chen
[WIREs Comput Mol Sci. 2022. e1646](https://wires.onlinelibrary.wiley.com/doi/10.1002/wcms.1646)

**Understanding by design: Implementing deep learning from protein structure prediction to protein design**
Gao, Yuanxu, Jiangshan Zhan, and Albert CH Yu
[MedComm-Future Medicine 1.2 (2022): e22](https://onlinelibrary.wiley.com/doi/full/10.1002/mef2.22)

**Diffusion Models in Bioinformatics: A New Wave of Deep Learning Revolution in Action**
Zhiye Guo, Jian Liu, Yanli Wang, Mengrui Chen, Duolin Wang, Dong Xu, Jianlin Cheng
[arXiv:2302.10907](https://arxiv.org/abs/2302.10907)

**Machine learning for evolutionary-based and physics-inspired protein design: Current and future synergies**
Cyril Malbranke, David Bikard, Simona Cocco, Rémi Monasson, Jérôme Tubiana
[Current Opinion in Structural Biology](https://www.sciencedirect.com/science/article/pii/S0959440X23000453)

**De novo design of polyhedral protein assemblies: before and after the AI revolution**
Bhoomika Basu Mallik, Jenna Stanislaw, Tharindu Madhusankha Alawathurage, and Alena Khmelinskaia
[ChemBioChem 2023, e202300117](http://dx.doi.org/10.1002/cbic.202300117)

**Research progress of artificial intelligence in protein design**
CHEN Zhihang, JI Menglin, QI Yifei
[Synthetic Biology Journal (2023)](https://synbioj.cip.com.cn/article/2023/2096-8280/2023-008.shtml)

**A Survey on Graph Diffusion Models: Generative AI in Science for Molecule, Protein and Material**
Mengchun Zhang, Maryam Qamar, Taegoo Kang, Yuna Jung, Chenshuang Zhang, Sung-Ho Bae, Chaoning Zhang
[arXiv:2304.01565](https://arxiv.org/pdf/2304.01565.pdf)

**Exploring the Protein Sequence Space with Global Generative Models**
Sergio Romero-Romero, Sebastian Lindner, Noelia Ferruz
[arXiv:2305.01941](https://arxiv.org/abs/2305.01941)

**The Era of Machine Learning for Protein Design, Summarized in Four Key Methods**
LucianoSphere
[Towards Data Science](https://towardsdatascience.com/the-era-of-machine-learning-for-protein-design-summarized-in-four-key-methods-d6f1dac5de96)

**Is novelty predictable?**
Clara Fannjiang, Jennifer Listgarten
[arXiv:2306.00872](https://arxiv.org/abs/2306.00872)

**Computational protein design - where it goes?**
Xu Binbin, Chen Yingjun and Xue Weiwei
[Current Medicinal Chemistry 2023](https://www.eurekaselect.com/article/132267)

**How can the protein design community best support biologists who want to harness AI tools for protein structure prediction and design?**
Birte Höcker, Peilong Lu, Anum Glasgow, Debora S. Marks, Pranam Chatterjee, Joanna S.G. Slusky, Ora Schueler-Furman, Possu Huang
[Cell Systems 14.8 (2023)](https://www.cell.com/cell-systems/fulltext/S2405-4712(23)00212-0)

**Creation of de novo designed nanopores (De novo 設計ナノポアの創製)**
Ai Niitsu (新津藍)
[Seibutsu-kogaku Kaishi (生物工学会誌) 101.8 (2023)](https://www.jstage.jst.go.jp/article/seibutsukogaku/101/8/101_101.8_431/_article/-char/ja/)

**Generative artificial intelligence for de novo protein design**
Adam Winnifrith, Carlos Outeiral, Brian Hie
[arXiv:2310.09685](https://arxiv.org/abs/2310.09685)

**Intelligent Protein Design and Molecular Characterization Techniques: A Comprehensive Review**
Jingjing Wang, Chang Chen, Ge Yao, Junjie Ding, Liangliang Wang and Hui Jiang
[Molecules 28.23 (2023)](https://www.mdpi.com/1420-3049/28/23/7865)

**Generative models for protein sequence modeling: recent advances and future directions**
Mehrsa Mardikoraem, Zirui Wang, Nathaniel Pascual, Daniel Woldring
[Briefings in Bioinformatics](https://academic.oup.com/bib/article/24/6/bbad358/7325909)

**A new age in protein design empowered by deep learning**
Hamed Khakzad, Ilia Igashov, Arne Schneuing, Casper Goverde, Michael Bronstein, Bruno Correia
[Cell Systems, Volume 14, Issue 11](https://www.cell.com/cell-systems/fulltext/S2405-4712(23)00298-3)

**Deep learning for protein structure prediction and design—progress and applications**
Jürgen Jänes and Pedro Beltrao
[Mol Syst Biol (2024)](https://www.embopress.org/doi/full/10.1038/s44320-024-00016-x)

**De novo protein design—From new structures to programmable functions**
Tanja Kortemme
[Cell 187.3 (2024)](https://www.cell.com/cell/fulltext/S0092-8674(23)01402-2)

**Generative models for protein structures and sequences**
Chloe Hsu, Clara Fannjiang & Jennifer Listgarten
[Nat Biotechnol 42, 196–199 (2024)](https://www.nature.com/articles/s41587-023-02115-w)

**What does it take for an ‘AlphaFold Moment’ in functional protein engineering and design?**
Roberto A. Chica & Noelia Ferruz
[Nat Biotechnol 42, 173–174 (2024)](https://www.nature.com/articles/s41587-023-02120-z)

**Protein design: the experts speak**
Anne Doerr
[Nat Biotechnol 42, 175–178 (2024)](https://www.nature.com/articles/s41587-023-02111-0)

**Machine learning for functional protein design**
Pascal Notin, Nathan Rollins, Yarin Gal, Chris Sander & Debora Marks
[Nat Biotechnol 42, 216–228 (2024)](https://www.nature.com/articles/s41587-024-02127-0)

**Sparks of function by de novo protein design**
Alexander E. Chu, Tianyu Lu & Po-Ssu Huang
[Nat Biotechnol 42, 203–215 (2024)](https://www.nature.com/articles/s41587-024-02133-2) • [poster](https://drive.google.com/file/d/1sG3OlEWvhHcWAdtf7RTcCawAapDmyeEx/view)

**A Survey of Generative AI for De Novo Drug Design: New Frontiers in Molecule and Protein Generation**
Xiangru Tang, Howard Dai, Elizabeth Knight, Fang Wu, Yunyang Li, Tianxiao Li, Mark Gerstein
[arXiv:2402.08703](https://arxiv.org/abs/2402.08703)

**Security challenges by AI-assisted protein design**
Philip Hunter
[EMBO Rep (2024)](https://www.embopress.org/doi/full/10.1038/s44319-024-00124-7)

**Opportunities and challenges in design and optimization of protein function**
Dina Listov, Casper A. Goverde, Bruno E. Correia & Sarel Jacob Fleishman
[Nat Rev Mol Cell Biol (2024)](https://www.nature.com/articles/s41580-024-00718-y)

**The State-of-the-Art Overview to Application of Deep Learning in Accurate Protein Design and Structure Prediction**
Saber Saharkhiz, Mehrnaz Mostafavi, Amin Birashk, Shiva Karimian, Shayan Khalilollah, Sohrab Jaferian, Yalda Yazdani, Iraj Alipourfard, Yun Suk Huh, Marzieh Ramezani Farani & Reza Akhavan-Sigari
[Top Curr Chem (Z) 382, 23 (2024)](https://link.springer.com/article/10.1007/s41061-024-00469-6)

**Computational methods for protein design**
Noelia Ferruz, Amelie Stein
[Protein Engineering, Design and Selection, Volume 37, 2024](https://academic.oup.com/peds/article/doi/10.1093/protein/gzae011/7710436)

**Structure-based protein and small molecule generation using EGNN and diffusion models: A comprehensive review**
Farzan Soleymani, Eric Paquet, Herna Lydia Viktor, Wojtek Michalowski
[Computational and Structural Biotechnology Journal (2024)](https://www.sciencedirect.com/science/article/pii/S2001037024002228)

**Machine learning in biological physics: From biomolecular prediction to design**
Jonathan Martin, Marcos Lequerica Mateos, José N. Onuchic, and Faruck Morcos
[Proceedings of the National Academy of Sciences 121.27 (2024)](https://www.pnas.org/doi/10.1073/pnas.2311807121)

**AI has dreamt up a blizzard of new proteins. Do any of them actually work?**
Ewen Callaway
[Nature 634.8034 (2024)](https://www.nature.com/articles/d41586-024-03335-z)

**Five protein-design questions that still challenge AI**
Sara Reardon
[Nature 635.8037 (2024)](https://www.nature.com/articles/d41586-024-03595-9)

**De novo protein design in the age of artificial intelligence**
Nan Liu, Xiaocheng Jin, Chongzhou Yang, Ziyang Wang, Xiaoping Min, Shengxiang Ge
[Sheng Wu Gong Cheng Xue Bao](https://doi.org/10.13345/j.cjb.240087)

**Generative Models in Protein Engineering: A Comprehensive Survey**
Chen Xinhui, Yiwen Yuan, Joseph Liu, Chak Tou Leong, Xiaoye Zhu, Jiaqi Chen
[NeurIPS 2024 Workshop](https://openreview.net/forum?id=Xc7l84S0Ao)

**A Survey of Deep Learning Methods in Protein Bioinformatics and its Impact on Protein Design**
Weihang Dai
[arXiv:2501.01477](https://arxiv.org/abs/2501.01477)

**The Promise of Protein Design: A Q&A with Nobel Laureate David Baker**
David Baker and Fay Lin
[GEN Biotechnology (2025)](https://www.liebertpub.com/doi/abs/10.1089/genbio.2025.0004?journalCode=genbio)

**Protein design and structure solution for drug discovery**
Petra Bombicz
[Crystallography Reviews (2024)](https://www.tandfonline.com/doi/full/10.1080/0889311X.2024.2461923)

**A Model-Centric Review of Deep Learning for Protein Design**
Gregory W. Kyro, Tianyin Qiu, Victor S. Batista
[arXiv:2502.19173](https://arxiv.org/abs/2502.19173)

**Computational protein design**
Katherine I. Albanese, Sophie Barbe, Shunsuke Tagami, Derek N. Woolfson & Thomas Schiex
[Nature Reviews Methods Primers 5.1 (2025)](https://www.nature.com/articles/s43586-025-00383-1)

**Exploring the Blueprint of Life: The Innovation in Antibody and Protein Design**
Yang, Zhiwei, and Gerald H. Lushington
[Combinatorial chemistry & high throughput screening](https://www.eurekaselect.com/article/146786)

**Advanced Deep Learning Methods for Protein Structure Prediction and Design**
Weikun Wu, Tianyang Wang, Yichao Zhang, Ningyuan Deng, Xinyuan Song, Ziqian Bi, Zheyu Yao, Keyu Chen, Ming Li, Qian Niu, Junyu Liu, Benji Peng, Sen Zhang, Ming Liu, Li Zhang, Xuanhe Pan, Jinlang Wang, Pohsun Feng, Yizhu Wen, Lawrence KQ Yan, Hongming Tseng, Yan Zhong, Yunze Wang, Ziyuan Qin, Bowen Jing, Junjie Yang, Jun Zhou, Chia Xin Liang, Junhao Song
[arXiv:2503.13522](https://arxiv.org/abs/2503.13522v1)

### 1.2 Antibody design

**A review of deep learning methods for antibodies**
Jordan Graves, Jacob Byerly, Eduardo Priego, Naren Makkapati, S. Vince Parish, Brenda Medellin and Monica Berrondo
[Antibodies 9.2 (2020)](https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7344881/pdf/antibodies-09-00012.pdf)

**Progress and challenges for the machine learning-based design of fit-for-purpose monoclonal antibodies**
Rahmad Akbar, Habib Bashour, Puneet Rawat, Philippe A. Robert, Eva Smorodina, Tudor-Stefan Cotet, Karine Flem-Karlsen, Robert Frank, Brij Bhushan Mehta, Mai Ha Vu, Talip Zengin, Jose Gutierrez-Marcos, Fridtjof Lund-Johansen, Jan Terje Andersen, and Victor Greiff
[Mabs. Vol. 14. No. 1. Taylor & Francis, 2022](https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8928824/)

**Advances in computational structure-based antibody design**
Hummer, Alissa M., Brennan Abanades, and Charlotte M. Deane
[Current Opinion in Structural Biology 74 (2022)](https://www.sciencedirect.com/science/article/pii/S0959440X22000586)

**Computational and artificial intelligence-based methods for antibody development**
Jisun Kim, Matthew McFee, Qiao Fang, Osama Abdin, Philip M. Kim
[Trends in Pharmacological Sciences (2023)](https://www.sciencedirect.com/science/article/pii/S0165614722002796)

**Leveraging deep learning to improve vaccine design**
Andrew P. Hederman, Margaret E. Ackerman
[Trends in immunology (2023)](https://www.cell.com/trends/immunology/fulltext/S1471-4906(23)00046-7)

**In Silico Approaches to Deliver Better Antibodies by Design: The Past, the Present and the Future**
Andreas Evers, Shipra Malhotra, Vanita D. Sood
[arXiv:2305.07488](https://arxiv.org/abs/2305.07488)

**AI Models for Protein Design are Driving Antibody Engineering**
Michael Chungyoun, Jeffrey J. Gray
[Current Opinion in Biomedical Engineering (2023): 100473](https://www.sciencedirect.com/science/article/abs/pii/S2468451123000296)

**Computational Methods in Immunology and Vaccinology: Design and Development of Antibodies and Immunogens**
Federica Guarra and Giorgio Colombo
[Journal of Chemical Theory and Computation (2023)](https://pubs.acs.org/doi/10.1021/acs.jctc.3c00513)

**Simplifying complex antibody engineering using machine learning**
Makowski, Emily K., Hsin-Ting Chen, and Peter M. Tessier
[Cell Systems 14.8 (2023)](https://www.cell.com/cell-systems/fulltext/S2405-4712(23)00118-7)/[2022 AIChE Annual Meeting. AIChE, 2022.](https://aiche.confex.com/aiche/2022/meetingapp.cgi/Paper/650993)

**AI driven B-cell Immunotherapy Design**
Bruna Moreira da Silva, David B. Ascher, Nicholas Geard, Douglas E. V. Pires
[arXiv:2309.01122](https://arxiv.org/abs/2309.01122)

**Best practices for machine learning in antibody discovery and development**
Leonard Wossnig, Norbert Furtmann, Andrew Buchanan, Sandeep Kumar, Victor Greiff
[arXiv:2312.08470](https://arxiv.org/abs/2312.08470)/[Drug Discovery Today (2024)](https://www.sciencedirect.com/science/article/pii/S1359644624001508)

**Next generation of multispecific antibody engineering**
Daniel Keri, Matt Walker, Isha Singh, Kyle Nishikawa, Fernando Garces
[Antibody Therapeutics (2023): tbad027](https://academic.oup.com/abt/article/7/1/37/7463325)

**A primer on ML in antibody engineering**
[Abhishaike Mahajan](https://substack.com/@abhishaikemahajan)
[Substack](https://www.abhishaike.com/p/a-primer-on-ai-in-antibody-engineering) • blog

**Antibody design using deep learning: from sequence and structure design to affinity maturation**
Sara Joubbi, Alessio Micheli, Paolo Milazzo, Giuseppe Maccari, Giorgio Ciano, Dario Cardamone, Duccio Medini
[Briefings in Bioinformatics, Volume 25, Issue 4, July 2024, bbae307](https://academic.oup.com/bib/article/25/4/bbae307/7705535)

**AI-accelerated therapeutic antibody development: practical insights**
Luca Santuari, Marianne Bachmann Salvy, Ioannis Xenarios, Bulak Arpat
[Frontiers in Drug Discovery 4 (2024)](https://www.frontiersin.org/journals/drug-discovery/articles/10.3389/fddsv.2024.1447867/full)

**AI-driven antibody design with generative diffusion models: current insights and future directions**
Xin-heng He, Jun-rui Li, James Xu, Hong Shan, Shi-yi Shen, Si-han Gao & H. Eric Xu
[Acta Pharmacologica Sinica (2024)](https://www.nature.com/articles/s41401-024-01380-y)

**Applying computational protein design to therapeutic antibody discovery -- current state and perspectives**
Weronika Bielska, Igor Jaszczyszyn, Pawel Dudzic, Bartosz Janusz, Dawid Chomicz, Sonia Wrobel, Victor Greiff, Ryan Feehan, Jared Adolf-Bryfogle, Konrad Krawczyk
[arXiv:2503.00913](https://arxiv.org/abs/2503.00913)

### 1.3 Peptide design

**Deep generative models for peptide design**
Wan, Fangping, Daphne Kontogiorgos-Heintz, and Cesar de la Fuente-Nunez
[Digital Discovery (2022)](https://pubs.rsc.org/en/content/articlehtml/2022/dd/d1dd00024a)

**Design of protein segments and peptides for binding to protein targets**
Gupta, Suchetana, Noora Azadvari, and Parisa Hosseinzadeh
[BioDesign Research 2022 (2022)](https://spj.science.org/doi/10.34133/2022/9783197)

**Revolutionizing peptide-based drug discovery: Advances in the post-AlphaFold era**
Liwei Chang, Arup Mondal, Bhumika Singh, Yisel Martínez-Noa, Alberto Perez
[Wiley Interdisciplinary Reviews: Computational Molecular Science](https://wires.onlinelibrary.wiley.com/doi/epdf/10.1002/wcms.1693)

**Peptide-based drug discovery through artificial intelligence: towards an autonomous design of therapeutic peptides**
Montserrat Goles, Anamaría Daza, Gabriel Cabas-Mora, Lindybeth Sarmiento-Varón, Julieta Sepúlveda-Yañez, Hoda Anvari-Kazemabad, Mehdi D Davari, Roberto Uribe-Paredes, Álvaro Olivera-Nappa, Marcelo A Navarrete, David Medina-Ortiz
[Briefings in Bioinformatics 25.4 (2024)](https://academic.oup.com/bib/article/25/4/bbae275/7690345)

**Accelerating antimicrobial peptide design: Leveraging deep learning for rapid discovery**
Ahmad M. Al-Omari, Yazan H. Akkam, Ala’a Zyout, Shayma’a Younis, Shefa M. Tawalbeh, Khaled Al-Sawalmeh, Amjed Al Fahoum, Jonathan Arnold
[PLOS ONE 19.12 (2024): e0315477](https://journals.plos.org/plosone/article?id=10.1371/journal.pone.0315477)

**Trends in the Research and Development of Peptide Drug Conjugates: Artificial Intelligence Aided Design**
Dong-E Zhang, Tong He, Tianyi Shi, Kun Huang, Anlin Peng
[Frontiers in Pharmacology 16](https://www.frontiersin.org/journals/pharmacology/articles/10.3389/fphar.2025.1553853/full)

### 1.4 Binder design

**Improving de novo Protein Binder Design with Deep Learning**
Nathaniel Bennett, Brian Coventry, Inna Goreshnik, Buwei Huang, Aza Allen, Dionne Vafeados, Ying Po Peng, Justas Dauparas, Minkyung Baek, Lance Stewart, Frank DiMaio, Steven De Munck, Savvas Savvides, David Baker
[bioRxiv 2022.06.15.495993](https://www.biorxiv.org/content/10.1101/2022.06.15.495993v1)/[Nat Commun 14, 2625 (2023)](https://www.nature.com/articles/s41467-023-38328-5) • [code](https://github.com/nrbennet/dl_binder_design) • [news](https://phys.org/news/2023-08-deep-protein.html)

**Data and AI-driven synthetic binding protein discovery**
Yanlin Li, Zixin Duan, Zhenwen Li, Weiwei Xue
[Trends in Pharmacological Sciences (2025)](https://www.cell.com/trends/pharmacological-sciences/abstract/S0165-6147(24)00268-2)

### 1.5 Enzyme design

**A review of enzyme design in catalytic stability by artificial intelligence**
Yongfan Ming, Wenkang Wang, Rui Yin, Min Zeng, Li Tang, Shizhe Tang, Min Li
[Briefings in Bioinformatics, 2023](https://academic.oup.com/bib/advance-article-abstract/doi/10.1093/bib/bbad065/7086816)

**Application of "foldability" in the intelligent of enzymes engineering and design: take AlphaFold2 for example**
MENG Qiaozhen, GUO Fei
[Synthetic Biology Journal (2023)](https://synbioj.cip.com.cn/article/2023/2096-8280/2023-011.shtml)

**AlphaFold2 and Deep Learning for Elucidating Enzyme Conformational Flexibility and Its Application for Design**
Casadevall, Guillem, Cristina Duran, and Sílvia Osuna
[JACS Au (2023)](https://pubs.acs.org/doi/10.1021/jacsau.3c00188)

**Accelerating Biocatalysis Discovery with Machine Learning: A Paradigm Shift in Enzyme Engineering, Discovery, and Design**
Braun Markus, Gruber Christian C, Krassnigg Andreas, Kummer Arkadij, Lutz Stefan, Oberdorfer Gustav, Siirola Elina, and Snajdrova Radka
[ACS Catal. 2023](https://pubs.acs.org/doi/10.1021/acscatal.3c03417)

**Building Enzymes through Design and Evolution**
Hossack, Euan J., Florence J. Hardy, and Anthony P. Green
[ACS Catalysis 13.19 (2023)](https://pubs.acs.org/doi/10.1021/acscatal.3c02746)

**Advances in generative modeling methods and datasets to design novel enzymes for renewable chemicals and fuels**
Rana A Barghout, Zhiqing Xu, Siddharth Betala, Radhakrishnan Mahadevan
[Current Opinion in Biotechnology, Volume 84, 2023](https://www.sciencedirect.com/science/article/abs/pii/S0958166923001179)

**Opportunities and Challenges for Machine Learning-Assisted Enzyme Engineering**
Jason Yang, Francesca-Zhoufan Li, Frances H. Arnold
[ACS Central Science (2024)](https://pubs.acs.org/doi/10.1021/acscentsci.3c01275)

**Navigating the landscape of enzyme design: from molecular simulations to machine learning**
Jiahui Zhou, Meilan Huang
[Chemical Society Reviews (2024)](https://pubs.rsc.org/en/Content/ArticleLanding/2024/CS/D4CS00196F)

**Structure Prediction and Computational Protein Design for Efficient Biocatalysts and Bioactive Proteins**
Rebecca Buller, Jiri Damborsky, Donald Hilvert, Uwe Bornscheuer
[Angewandte Chemie (International ed. in English)](https://onlinelibrary.wiley.com/doi/10.1002/anie.202421686)

## 2. Model-based design

> Invert a trained model for sequence design by running an optimization algorithm over the input sequence for many iterations. When the inverted model is a structure prediction model, the approach is known as **hallucination**.

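The loop described above can be sketched in a few lines. This is a minimal illustration, not the procedure of any listed paper: `toy_model_score` is a hypothetical placeholder for a trained predictor (the works below use networks such as trRosetta or AlphaFold2), and the simulated-annealing schedule and its parameters are assumptions chosen for readability.

```python
import math
import random

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def toy_model_score(seq: str) -> float:
    """Hypothetical stand-in for a trained model's score of a sequence.

    A real pipeline would call a structure predictor here and return, for
    example, a confidence-style objective. This toy version only rewards an
    alternating hydrophobic/polar pattern so the script runs without weights.
    """
    hydrophobic = set("AVILMFWC")
    hits = sum((aa in hydrophobic) == (i % 2 == 0) for i, aa in enumerate(seq))
    return hits / len(seq)


def hallucinate(length=30, steps=2000, t_start=0.1, t_end=0.001, seed=0):
    """Optimize a random sequence against the model score by annealed MCMC."""
    rng = random.Random(seed)
    seq = [rng.choice(AMINO_ACIDS) for _ in range(length)]
    score = toy_model_score("".join(seq))
    best_seq, best_score = "".join(seq), score

    for step in range(steps):
        # Exponentially decay the temperature from t_start to t_end.
        temperature = t_start * (t_end / t_start) ** (step / max(steps - 1, 1))

        # Propose a single-residue mutation.
        pos = rng.randrange(length)
        old_residue = seq[pos]
        seq[pos] = rng.choice(AMINO_ACIDS)
        new_score = toy_model_score("".join(seq))

        # Metropolis criterion: always accept improvements, sometimes accept
        # worse sequences so the search can escape local optima.
        delta = new_score - score
        if delta >= 0 or rng.random() < math.exp(delta / temperature):
            score = new_score
            if score > best_score:
                best_seq, best_score = "".join(seq), score
        else:
            seq[pos] = old_residue  # reject: revert the mutation

    return best_seq, best_score


best_seq, best_score = hallucinate()
print(best_seq, round(best_score, 3))
```

Swapping `toy_model_score` for a call into a real predictor is the only change the loop itself needs; gradient-based variants replace the random mutation step with backpropagation through the model.
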
### 2.1 trRosetta-based

**Design of proteins presenting discontinuous functional sites using deep learning**
Doug Tischer, Sidney Lisanza, Jue Wang, Runze Dong, Ivan Anishchenko, Lukas F. Milles, Sergey Ovchinnikov, David Baker
[bioRxiv (2020)](https://www.biorxiv.org/content/10.1101/2020.11.29.402743v1)

**Fast differentiable DNA and protein sequence optimization for molecular design**
Linder, Johannes, and Georg Seelig
[arXiv preprint arXiv:2005.11275 (2020)](https://arxiv.org/abs/2005.11275)

**De novo protein design by deep network hallucination**
Ivan Anishchenko, Samuel J. Pellock, Tamuka M. Chidyausiku, Theresa A. Ramelot, Sergey Ovchinnikov, Jingzhou Hao, Khushboo Bafna, Christoffer Norn, Alex Kang, Asim K. Bera, Frank DiMaio, Lauren Carter, Cameron M. Chow, Gaetano T. Montelione & David Baker
[Nature (2021)](https://doi.org/10.1038/s41586-021-04184-w) • [code](https://github.com/gjoni/trDesign) • [trRosetta](https://yanglab.nankai.edu.cn/trRosetta/download/)

**Protein sequence design by conformational landscape optimization**
Christoffer Norn, Basile I. M. Wicky, David Juergens, and Sergey Ovchinnikov
[Proceedings of the National Academy of Sciences 118.11 (2021)](https://www.pnas.org/content/118/11/e2017228118) • [code](https://github.com/gjoni/trDesign)

**De novo design of small beta barrel proteins**
David E. Kim, Davin R. Jensen, David Feldman, Doug Tischer, Ayesha Saleem, Cameron M. Chow, Xinting Li, Lauren Carter, Lukas Milles, Hannah Nguyen, Alex Kang, Asim K. Bera, Francis C. Peterson, Brian F. Volkman, Sergey Ovchinnikov, David Baker
[PNAS (2023), e2207974120](https://www.pnas.org/doi/10.1073/pnas.2207974120) • [code](https://github.com/sokrypton/TrDesign_partialhal)

**Exploring "dark matter" protein folds using deep learning**
Zander Harteveld, Alexandra Van Hall-Beauvais, Irina Morozova, Joshua Southern, Casper Alexander Goverde, Sandrine Georgeon, Stephane Rosset, Andreas Loukas, Pierre Vandergheynst, Michael Bronstein, Bruno Correia
[bioRxiv 2023.08.30.555621](https://www.biorxiv.org/content/10.1101/2023.08.30.555621v1)/[Cell Systems](https://www.cell.com/cell-systems/fulltext/S2405-4712(24)00270-9) • [Supplementary](https://www.biorxiv.org/content/biorxiv/early/2023/09/01/2023.08.30.555621/DC1/embed/media-1.pdf) • [code](https://github.com/zanderharteveld/genesis)

**Carving out a Glycoside Hydrolase Active Site for Incorporation into a New Protein Scaffold Using Deep Network Hallucination**
Anders Lønstrup Hansen, Frederik Friis Theisen, Ramon Crehuet, Enrique Marcos, Nushin Aghajari, and Martin Willemoës
[ACS Synth. Biol. 2024](https://pubs.acs.org/doi/10.1021/acssynbio.3c00674)

**Implicit modeling of the conformational landscape and sequence allows scoring and generation of stable proteins**
Yehlin Cho, Justas Dauparas, Kotaro Tsuboyama, Gabriel Rocklin, Sergey Ovchinnikov
[bioRxiv 2024.12.20.629706](https://www.biorxiv.org/content/10.1101/2024.12.20.629706v1) • [code](https://github.com/yehlincho/Joint_Model_Stability) • [Supplementary](https://www.biorxiv.org/content/biorxiv/early/2024/12/22/2024.12.20.629706/DC1/embed/media-1.pdf)

### 2.2 AlphaFold2-based

**End-to-end learning of multiple sequence alignments with differentiable Smith-Waterman**
Petti, Samantha, Bhattacharya, Nicholas, Rao, Roshan, Dauparas, Justas, Thomas, Neil, Zhou, Juannan, Rush, Alexander M, Koo, Peter K, Ovchinnikov, Sergey
[bioRxiv (2021)](http://repository.cshl.edu/id/eprint/40409/)/[Bioinformatics (2022), btac724](https://academic.oup.com/bioinformatics/advance-article/doi/10.1093/bioinformatics/btac724/6820925) • [ColabDesign](https://github.com/sokrypton/ColabDesign), [SMURF](https://github.com/spetti/SMURF), [AF2 back propagation](https://github.com/sokrypton/af_backprop) • [our notes1](https://zhuanlan.zhihu.com/p/468219547), [notes2](https://zhuanlan.zhihu.com/p/472037977) • [lecture1](https://www.youtube.com/watch?v=2HmXwlKWMVs), [lecture2](https://www.youtube.com/watch?v=BJdRvODiDnk) • [Discord](https://discord.com/invite/FpYPneYB)

**AlphaDesign: A de novo protein design framework based on AlphaFold**
Jendrusch, Michael, Jan O. Korbel, and S. Kashif Sadiq
[bioRxiv (2021)](https://www.biorxiv.org/content/10.1101/2021.10.11.463937v1)

**Using AlphaFold for Rapid and Accurate Fixed Backbone Protein Design**
Moffat, Lewis, Joe G. Greener, and David T. Jones
[bioRxiv (2021)](https://www.biorxiv.org/content/10.1101/2021.08.24.457549v1)

**State-of-the-art estimation of protein model accuracy using AlphaFold**
James P. Roney, Sergey Ovchinnikov
[bioRxiv 2022.03.11.484043](https://www.biorxiv.org/content/10.1101/2022.03.11.484043v3)/[Physical Review Letters 129.23 (2022)](https://journals.aps.org/prl/abstract/10.1103/PhysRevLett.129.238101) • [code](https://github.com/jproney/AF2Rank)

**Solubility-aware protein binding peptide design using AlphaFold**
Takatsugu Kosugi, Masahito Ohue
[bioRxiv 2022.05.14.491955](https://doi.org/10.1101/2022.05.14.491955)/[Biomedicines 10.7 (2022)](https://www.mdpi.com/2227-9059/10/7/1626) • [Supplemental Materials](https://www.biorxiv.org/content/biorxiv/early/2022/05/15/2022.05.14.491955/DC1/embed/media-1.pdf) • [code](https://github.com/ohuelab/Solubility_AfDesign)

**Hallucinating protein assemblies**
Basile I M Wicky, Lukas F Milles, Alexis Courbet, Robert J Ragotte, Justas Dauparas, Elias Kinfu, Sam Tipps, Ryan D Kibler, Minkyung Baek, Frank DiMaio, Xinting Li, Lauren Carter, Alex Kang, Hannah Nguyen, Asim K Bera, David Baker
[bioRxiv 2022.06.09.493773](https://www.biorxiv.org/content/10.1101/2022.06.09.493773v1)/[Science (2022)](https://www.science.org/doi/10.1126/science.add1964) • [related slides](https://docs.google.com/presentation/d/1_tvzLKks83sYOKemfFeImCPnWtCQ-CHqmKK_3IQI1so/) • [our notes](https://zhuanlan.zhihu.com/p/527152827) • [news](https://www.nature.com/articles/d41586-022-02947-7)

**EvoBind: in silico directed evolution of peptide binders with AlphaFold**
Patrick Bryant, Arne Elofsson
[bioRxiv 2022.07.23.501214](https://www.biorxiv.org/content/10.1101/2022.07.23.501214v1) • [code](https://github.com/patrickbryant1/EvoBind)

**Hallucination of closed repeat proteins containing central pockets**
Linna An, Derrick R Hicks, Dmitri Zorine, Justas Dauparas, Basile I. M. Wicky, Lukas F Milles, Alexis Courbet, Asim K. Bera, Hannah Nguyen, Alex Kang, Lauren Carter, David Baker
[bioRxiv 2022.09.01.506251](https://www.biorxiv.org/content/10.1101/2022.09.01.506251v1)/[Nat Struct Mol Biol 30, 1755-1760 (2023)](https://www.nature.com/articles/s41594-023-01112-6) • [Supplementary data](https://static-content.springer.com/esm/art%3A10.1038%2Fs41594-023-01112-6/MediaObjects/41594_2023_1112_MOESM1_ESM.pdf)

**Predicting the structure of large protein complexes using AlphaFold and Monte Carlo tree search**
Patrick Bryant, Gabriele Pozzati, Wensi Zhu, Aditi Shenoy, Petras Kundrotas & Arne Elofsson
[Nature communications 13.1 (2022)](https://www.nature.com/articles/s41467-022-33729-4) • [gitlab](https://gitlab.com/patrickbryant1/molpc), [github](https://github.com/patrickbryant1/MoLPC) • [Supplementary data1](https://doi.org/10.5281/zenodo.6367019), [Supplementary data2](https://doi.org/10.17044/scilifelab.19375172)

**De novo protein design by inversion of the AlphaFold structure prediction network**
Casper Goverde, Benedict Wolf, Hamed Khakzad, Stephane Rosset, Bruno E Correia
[bioRxiv 2022.12.13.520346](https://www.biorxiv.org/content/10.1101/2022.12.13.520346v1) • [code](https://github.com/bene837/af_gradmcmc) • [lecture1](https://www.youtube.com/watch?v=aUMGuogMZCA) • [lecture2](https://www.youtube.com/watch?v=4S4J7gbhAa0)

**Code of OpenComplex**
Jingcheng, Yu and Zhaoming, Chen and Zhaoqun, Li and Mingliang, Zeng and Wenjun, Lin and He, Huang and Qiwei, Ye
[code](https://github.com/baaihealth/OpenComplex)

**Efficient and scalable de novo protein design using a relaxed sequence space**
Christopher Josef Frank, Ali Khoshouei, Yosta de Stigter, Dominik Schiwietz, Shihao Feng, Sergey Ovchinnikov, Hendrik Dietz
[bioRxiv 2023.02.24.529906](https://www.biorxiv.org/content/10.1101/2023.02.24.529906v1) • [code](https://github.com/sokrypton/ColabDesign/blob/main/af/examples/af_relax_design.ipynb)

**Cyclic peptide structure prediction and design using AlphaFold**
Stephen A. Rettie, Katelyn V. Campbell, Asim K. Bera, Alex Kang, Simon Kozlov, Joshmyn De La Cruz, Victor Adebomi, Guangfeng Zhou, Frank DiMaio, Sergey Ovchinnikov, Gaurav Bhardwaj
[bioRxiv](https://www.biorxiv.org/content/10.1101/2023.02.25.529956v1.full.pdf) • [Code](https://github.com/sokrypton/ColabDesign/blob/main/af/examples/af_cyc_design.ipynb) • [Supplementary](https://www.biorxiv.org/content/biorxiv/early/2023/02/26/2023.02.25.529956/DC1/embed/media-1.xlsx)

**De novo design of luciferases using deep learning**
Andy Hsien-Wei Yeh, Christoffer Norn, Yakov Kipnis, Doug Tischer, Samuel J. Pellock, Declan Evans, Pengchen Ma, Gyu Rie Lee, Jason Z. Zhang, Ivan Anishchenko, Brian Coventry, Longxing Cao, Justas Dauparas, Samer Halabiya, Michelle DeWitt, Lauren Carter, K. N. Houk & David Baker
[Nature](https://www.nature.com/articles/s41586-023-05696-3) • [Code](https://files.ipd.uw.edu/pub/luxSit/scaffold_generation.tar.gz) • [Supplementary Materials](https://static-content.springer.com/esm/art%3A10.1038%2Fs41586-023-05696-3/MediaObjects/41586_2023_5696_MOESM1_ESM.pdf)

**In silico evolution of protein binders with deep learning models for structure prediction and sequence design**
Odessa J Goudy, Amrita Nallathambi, Tomoaki Kinjo, Nicholas Randolph, Brian Kuhlman
[bioRxiv 2023.05.03.539278](https://www.biorxiv.org/content/10.1101/2023.05.03.539278v1) • [Supplementary](https://www.biorxiv.org/content/biorxiv/early/2023/05/03/2023.05.03.539278/DC1/embed/media-1.pdf) • [code](https://github.com/KuhlmanLab/evopro)

**Computational design of soluble analogues of integral membrane protein structures**
Casper Alexander Goverde, Martin Pacesa, Lars Jeremy Dornfeld, Sandrine Georgeon, Stephane Rosset, Justas Dauparas, Christian Schellhaas, Simon Kozlov, David Baker, Sergey Ovchinnikov, Bruno Correia
[bioRxiv 2023.05.09.540044](https://www.biorxiv.org/content/10.1101/2023.05.09.540044v2)/[Nature (2024)](https://www.nature.com/articles/s41586-024-07601-y) • [code](https://github.com/bene837/af2seq) • [Supplementary](https://www.biorxiv.org/content/biorxiv/early/2023/05/09/2023.05.09.540044/DC1/embed/media-1.pdf)

**Antibody Complementarity-Determining Region Sequence Design using AlphaFold2 and Binding Affinity Prediction Model**
Takafumi Ueki, Masahito Ohue
[bioRxiv 2023.06.02.543382](https://www.biorxiv.org/content/10.1101/2023.06.02.543382v1)

**Context-Dependent Design of Induced-fit Enzymes using Deep Learning Generates Well Expressed, Thermally Stable and Active Enzymes**
Lior Zimmerman, Noga Alon, Itay Levin, Anna Koganitsky, Nufar Shpigel, Chen Brestel, Gideon David Lapidoth
[bioRxiv 2023.07.27.550799](https://www.biorxiv.org/content/10.1101/2023.07.27.550799v2) • [Supplementary](https://www.biorxiv.org/content/biorxiv/early/2023/07/31/2023.07.27.550799/DC1/embed/media-1.xlsx)

**Highly accurate and robust protein sequence design with CarbonDesign**/**Accurate and robust protein sequence design with CarbonDesign**
Milong Ren, Chungong Yu, Dongbo Bu, Haicang Zhang
[bioRxiv 2023.08.07.552204](https://www.biorxiv.org/content/10.1101/2023.08.07.552204v1)/[Nat Mach Intell 6, 536–547 (2024)](https://www.nature.com/articles/s42256-024-00838-2) • [code](https://github.com/zhanghaicang/carbonmatrix_public)

**Design of Cyclic Peptides Targeting Protein-Protein Interactions using AlphaFold**
Takatsugu Kosugi, Masahito Ohue
[bioRxiv 2023.08.20.554056](https://www.biorxiv.org/content/10.1101/2023.08.20.554056v1) • [Supplementary](https://www.biorxiv.org/content/biorxiv/early/2023/08/21/2023.08.20.554056/DC1/embed/media-1.pdf) • [code](https://github.com/YoshitakaMo/localcolabfold/)

**MetaPPI: In Silico Screen for Novel CRBN-based Substrates**
neoxbio
[website](https://www.neoxbio.com/platform-technology.html) • [news](https://mp.weixin.qq.com/s/Kb4EQ0YvYDvoLZ_cnAlUPw) • masif-based • commercial

**AlphaFold Distillation for Protein Design**
Anonymous
[ICLR 2024](https://openreview.net/forum?id=3pgJNIx3gc) • [code](https://anonymous.4open.science/r/AFDistill-28C3)

**High-throughput computational discovery of inhibitory protein fragments with AlphaFold**
Andrew Savinov, Sebastian Swanson, Amy E. Keating, Gene-Wei Li
[bioRxiv 2023.12.19.572389](https://www.biorxiv.org/content/10.1101/2023.12.19.572389v1) • [code](https://github.com/swanss/FragFold)

**An integrative approach to protein sequence design through multiobjective optimization**
Lu Hong, Tanja Kortemme
[bioRxiv 2024.03.01.582670](https://www.biorxiv.org/content/10.1101/2024.03.01.582670v1)/[PLOS Computational Biology 20(7)](https://journals.plos.org/ploscompbiol/article?id=10.1371/journal.pcbi.1011953) • [code](https://github.com/luhong88/int_seq_des) • [Supplementary](https://www.biorxiv.org/content/biorxiv/early/2024/03/04/2024.03.01.582670/DC1/embed/media-1.pdf)

**Protein Design Using Structure-Prediction Networks: AlphaFold and RoseTTAFold as Protein Structure Foundation Models**
Jue Wang, Joseph L. Watson and Sidney L. Lisanza
[Cold Spring Harbor Perspectives in Biology (2024)](https://cshperspectives.cshlp.org/content/early/2024/03/01/cshperspect.a041472.short)

**Context-dependent design of induced-fit enzymes using deep learning generates well-expressed, thermally stable and active enzymes**
Lior Zimmerman, Noga Alon, Itay Levin, and Gideon D. Lapidoth
[Proceedings of the National Academy of Sciences 121.11 (2024)](https://www.pnas.org/doi/10.1073/pnas.2313809121)

**Design of Repeat Alpha-Beta Proteins with Capping Helices**
Dmitri Zorine, David Baker
[bioRxiv 2024.06.15.590358](https://www.biorxiv.org/content/10.1101/2024.06.15.590358v1) • [code](https://github.com/dmitropher/af2_multistate_hallucination)

**Design of linear and cyclic peptide binders of different lengths only from a protein target sequence**
Qiuzhen Li, Efstathios Nikolaos Vlachos, Patrick Bryant
[bioRxiv 2024.06.20.599739](https://www.biorxiv.org/content/10.1101/2024.06.20.599739v1) • [code](https://zenodo.org/records/11543503) • [Supplementary](https://www.biorxiv.org/content/biorxiv/early/2024/06/22/2024.06.20.599739/DC1/embed/media-1.pdf)

**BindCraft: one-shot design of functional protein binders**
Martin Pacesa, Lennart Nickel, Joseph Schmidt, Ekaterina Pyatova, Christian Schellhaas, Lucas Kissling, Ana Alcaraz-Serna, Yehlin Cho, Kourosh H. Ghamary, Laura Vinue, Brahm J. Yachnin, Andrew M. Wollacott, Stephen Buckley, Sandrine Georgeon, Casper A. Goverde, Georgios N. Hatzopoulos, Pierre Gonczy, Yannick D. Muller, Gerald Schwank, Sergey Ovchinnikov, Bruno E. Correia
[bioRxiv 2024.09.30.615802](https://www.biorxiv.org/content/10.1101/2024.09.30.615802v1) • [code](https://github.com/martinpacesa/BindCraft)

**Design of linear and cyclic peptide binders of different lengths from protein sequence information**
Qiuzhen Li, Efstathios Nikolaos Vlachos, Patrick Bryant
[bioRxiv 2024.06.20.599739](https://www.biorxiv.org/content/10.1101/2024.06.20.599739v2) • [code](https://zenodo.org/records/13913345)

**Scalable protein design using optimization in a relaxed sequence space**
Christopher Frank, Ali Khoshouei, Lara Fuss, Dominik Schiwietz, Dominik Putz, Lara Weber, Zhixuan Zhao, Motoyuki Hattori, Shihao Feng, Yosta de Stigter, Sergey Ovchinnikov, Hendrik Dietz
[Science 386, 439-445 (2024)](https://www.science.org/doi/10.1126/science.adq1741) • [code](https://github.com/sokrypton/ColabDesign)

**Alphafold2 refinement improves designability of large de novo proteins**
Christopher Josef Frank, Dominik Schiwietz, Lara Fuss, Sergey Ovchinnikov, Hendrik Dietz
[bioRxiv 2024.11.21.624687](https://www.biorxiv.org/content/10.1101/2024.11.21.624687v1) • [colab](https://colab.research.google.com/drive/14ULdrjOmH-XMtGDrikzjDF1FLegZg3-a?usp=sharing)

**Low-N OpenFold fine-tuning improves peptide design without additional structures**
Theodore Sternlieb, Jakub Otwinowski, Sam Sinai, Jeffrey Chan
[Machine Learning for Structural Biology Workshop, NeurIPS 2024](https://www.mlsb.io/papers_2024/Low-N_OpenFold_fine-tuning_improves_peptide_design_without_additional_structures.pdf)

**HighPlay: Cyclic Peptide Sequence Design Based on Reinforcement Learning and Protein Structure Prediction**
Huitian Lin, Cheng Zhu, Tianfeng Shang, Ning Zhu, Kang Lin, Xiang Shao, Xudong Wang, Hongliang Duan
[bioRxiv 2025.03.17.643626](http://biorxiv.org/content/10.1101/2025.03.17.643626v1)

### 2.3 DMPfold2-based

**Design in the DARK: Learning Deep Generative Models for De Novo Protein Design**
Moffat, Lewis, Shaun M. Kandathil, and David T. Jones
[bioRxiv (2022)](https://www.biorxiv.org/content/10.1101/2022.01.27.478087v1) • [DMPfold2](https://github.com/psipred/DMPfold2)

### 2.4 CM-Align

**AutoFoldFinder: An Automated Adaptive Optimization Toolkit for De Novo Protein Fold Design**
Shuhao Zhang, Youjun Xu, Jianfeng Pei, Luhua Lai
[NeurIPS 2021](https://www.mlsb.io/papers_2021/MLSB2021_AutoFoldFinder.pdf)

### 2.5 MSA-transformer-based

**Protein language models trained on multiple sequence alignments learn phylogenetic relationships**
Damiano Sgarbossa, Umberto Lupo, Anne-Florence Bitbol
[arXiv preprint arXiv:2203.15465 (2022)](https://arxiv.org/abs/2203.15465)/[bioRxiv 2022.04.14.488405](https://www.biorxiv.org/content/10.1101/2022.04.14.488405v1)

**EvoOpt: an MSA-guided, fully unsupervised sequence optimization pipeline for protein design**
Hideki Yamaguchi, Yutaka Saito
[NeurIPS 2022](https://www.mlsb.io/papers_2022/EvoOpt_an_MSA_guided_fully_unsupervised_sequence_optimization_pipeline_for_protein_design.pdf)

**Generative power of a protein language model trained on multiple sequence alignments**
Sgarbossa, Damiano, Umberto Lupo, and Anne-Florence Bitbol
[eLife 12 (2023): e79854](https://elifesciences.org/articles/79854) • [code](https://github.com/Bitbol-Lab/Iterative_masking)

### 2.6 DeepAb-based

**Towards deep learning models for target-specific antibody design**
Sai Pooja Mahajan, Jeffrey Ruffolo, Rahel Frick, Jeffrey J. Gray
[Biophysical Journal 121.3 (2022)](https://www.cell.com/biophysj/pdf/S0006-3495(21)03758-9.pdf) • [DeepAb](https://github.com/RosettaCommons/DeepAb) • [lecture](https://www.youtube.com/watch?v=LIo-1jPfrns)

**Hallucinating structure-conditioned antibody libraries for target-specific binders**
Sai Pooja Mahajan, Jeffrey A Ruffolo, Rahel Frick, Jeffrey J. Gray
[bioRxiv 2022.06.06.494991](https://www.biorxiv.org/content/10.1101/2022.06.06.494991v1)/[Front. Immunol. 13:999034](https://www.frontiersin.org/articles/10.3389/fimmu.2022.999034/full) • [Supplementary](https://www.biorxiv.org/content/biorxiv/early/2022/06/06/2022.06.06.494991/DC1/embed/media-1.pdf) • [code](https://github.com/RosettaCommons/FvHallucinator)

### 2.7 TRFold2-based

[News of TRDesign](https://mp.weixin.qq.com/s/OQzKawtL9RdK9HzYsfu80g)
[TIANRANG XLab](https://xlab.tianrang.com/)
paper unavailable • [slides](https://pan.baidu.com/share/init?surl=4AOW_D9dwlvC7VGGZA2tmQ&pwd=ffui) • [website](https://xcreator.tianrang.com/auth/login) • commercial • [news](https://mp.weixin.qq.com/s/45Gz7GWOGxHl0i6LXxTUpw)

### 2.8 GPT-based

**Multi-segment preserving sampling for deep manifold sampler**
Daniel Berenberg, Jae Hyeon Lee, Simon Kelow, Ji Won Park, Andrew Watkins, Vladimir Gligorijević, Richard Bonneau, Stephen Ra, Kyunghyun Cho
[arXiv preprint arXiv:2205.04259 (2022)](https://arxiv.org/abs/2205.04259)

**Preference optimization of protein language models as a multi-objective binder design paradigm**
Pouria Mistani, Venkatesh Mysore
[arXiv:2403.04187](https://arxiv.org/abs/2403.04187)

**HMAMP: Hypervolume-Driven Multi-Objective Antimicrobial Peptides Design**
Li Wang, Yiping Li, Xiangzheng Fu, Xiucai Ye, Junfeng Shi, Gary G. Yen, Xiangxiang Zeng
[arXiv:2405.00753](https://arxiv.org/abs/2405.00753)

### 2.9 ESM-based

**Generating novel protein sequences using Gibbs sampling of masked language models**
Sean R. Johnson, Sarah Monaco, Kenneth Massie, Zaid Syed
[bioRxiv 2021.01.26.428322](https://www.biorxiv.org/content/10.1101/2021.01.26.428322v1) • [code](https://github.com/seanrjohnson/protein_gibbs_sampler)

**A high-level programming language for generative protein design**
Brian Hie, Salvatore Candido, Zeming Lin, Ori Kabeli, Roshan Rao, Nikita Smetanin, Tom Sercu, Alexander Rives
[bioRxiv 2022.12.21.521526](https://www.biorxiv.org/content/10.1101/2022.12.21.521526v1)

**Language models generalize beyond natural proteins**
Robert Verkuil, Ori Kabeli, Yilun Du, Basile IM Wicky, Lukas F Milles, Justas Dauparas, David Baker, Sergey Ovchinnikov, Tom Sercu, Alexander Rives
[bioRxiv 2022.12.21.521521](https://www.biorxiv.org/content/10.1101/2022.12.21.521521v1)

**ESMFold Hallucinates Native-Like Protein Sequences**
Jeliazko R Jeliazkov, Diego del Alamo, Joel D Karpiak
[bioRxiv 2023.05.23.541774](https://www.biorxiv.org/content/10.1101/2023.05.23.541774v1)

**Protein Language Model Supervised Precise and Efficient Protein Backbone Design Method**
Bo Zhang, Kexin Liu, Zhuoqi Zheng, Yunfeiyang Liu, Junxi Mu, Ting Wei, Hai-Feng Chen
[bioRxiv 2023.10.26.564121](https://www.biorxiv.org/content/10.1101/2023.10.26.564121v1)/[preprint](https://www.researchsquare.com/article/rs-5450034/v1) • [code](https://github.com/sirius777coder/GPDL) • [Supplementary](https://www.biorxiv.org/content/biorxiv/early/2023/10/30/2023.10.26.564121/DC1/embed/media-1.pdf)

**Unexplored regions of the protein sequence-structure map revealed at scale by a library of foldtuned language models**
Arjuna M. Subramanian, Matt Thomson
[bioRxiv 2023.12.22.573145](https://www.biorxiv.org/content/10.1101/2023.12.22.573145v1)

**Computational scoring and experimental evaluation of enzymes generated by neural networks**
Sean R. Johnson, Xiaozhi Fu, Sandra Viknander, Clara Goldin, Sarah Monaco, Aleksej Zelezniak & Kevin K. Yang
[Nature Biotechnology (2024)](https://www.nature.com/articles/s41587-024-02214-2) • [code](https://github.com/seanrjohnson/protein_scoring)

**Exploring Latent Space for Generating Peptide Analogs Using Protein Language Models**
Po-Yu Liang, Xueting Huang, Tibo Duran, Andrew J. Wiemer, Jun Bai
[arXiv:2408.08341](https://arxiv.org/abs/2408.08341) • [code](https://github.com/LabJunBMI/Latent-Space-Peptide-Analogues-Generation)

**Designing diverse and high-performance proteins with a large language model in the loop**
Carlos A. Gomez-Uribe, Japheth Gado, Meiirbek Islamov
[bioRxiv 2024.10.25.620340](https://www.biorxiv.org/content/10.1101/2024.10.25.620340v1)

**Key-cutting machine: A novel optimization framework for tailored protein and peptide design**
Yan C. Leyva, Marcelo D. T. Torres, Carlos A. Oliva, Cesar de la Fuente-Nunez, Carlos A. Brizuela
[bioRxiv 2025.01.05.631393](https://www.biorxiv.org/content/10.1101/2025.01.05.631393v1) • [code](https://github.com/cbrizuel/KCM)

**Improving functional protein generation via foundation model-derived latent space likelihood optimization**
Changge Guan, Fangping Wan, Marcelo D. T. Torres, Cesar de la Fuente-Nunez
[bioRxiv 2025.01.07.631724](https://www.biorxiv.org/content/10.1101/2025.01.07.631724v1) • [Supplementary](https://www.biorxiv.org/content/biorxiv/early/2025/01/08/2025.01.07.631724/DC1/embed/media-1.docx)

### 2.10 AntiBERTa-based

**DyAb: sequence-based antibody design and property prediction in a low-data regime**
Joshua Yao-Yu Lin, Jennifer L. Hofmann, Andrew Leaver-Fay, Wei-Ching Liang, Stefania Vasilaki, Edith Lee, Pedro O. Pinheiro, Natasa Tagasovska, James R. Kiefer, Yan Wu, Franziska Seeger, Richard Bonneau, Vladimir Gligorijevic, Andrew Watkins, Kyunghyun Cho, Nathan C. Frey
[bioRxiv 2025.01.28.635353](https://www.biorxiv.org/content/10.1101/2025.01.28.635353v1) • [code](https://github.com/prescient-design/lobster) • [Supplementary](https://www.biorxiv.org/content/biorxiv/early/2025/02/02/2025.01.28.635353/DC1/embed/media-1.pdf)

### 2.11 Sampling-algorithms

**AdaLead: A simple and robust adaptive greedy search algorithm for sequence design**
Sam Sinai, Richard Wang, Alexander Whatley, Stewart Slocum, Elina Locane, Eric D. Kelsic
[arXiv preprint arXiv:2010.02141 (2020)](https://arxiv.org/abs/2010.02141) • [code](https://github.com/samsinai/FLEXS)

**Autofocused oracles for model-based design**
Fannjiang, Clara, and Jennifer Listgarten
[Advances in Neural Information Processing Systems 33 (2020)](https://proceedings.neurips.cc/paper/2020/file/972cda1e62b72640cb7ac702714a115f-Paper.pdf)

**An Efficient MCMC Approach to Energy Function Optimization in Protein Structure Prediction**
Lakshmi A. Ghantasala, Risi Jaiswal, Supriyo Datta
[arXiv:2211.03193](https://arxiv.org/abs/2211.03193)

**Plug & Play Directed Evolution of Proteins with Gradient-based Discrete MCMC**
Patrick Emami, Aidan Perreault, Jeffrey Law, David Biagioni, Peter St. John
[NeurIPS 2022](https://www.mlsb.io/papers_2022/Plug_Play_Directed_Evolution_of_Proteins_with_Gradient_based_Discrete_MCMC.pdf)/[arXiv:2212.09925](https://arxiv.org/abs/2212.09925)

**Importance Weighted Expectation-Maximization for Protein Sequence Design**
Zhenqiao Song, Lei Li
[arXiv:2305.00386](https://arxiv.org/abs/2305.00386)

**Simultaneous enhancement of multiple functional properties using evolution-informed protein design**
Benjamin Fram, Ian Truebridge, Yang Su, Adam J. Riesselman, John B. Ingraham, Alessandro Passera, Eve Napier, Nicole N. Thadani, Samuel Lim, Kristen Roberts, Gurleen Kaur, Michael Stiffler, Debora S. Marks, Christopher D. Bahl, Amir R. Khan, Chris Sander, Nicholas P. Gauthier
[bioRxiv 2023.05.09.539914](https://www.biorxiv.org/content/10.1101/2023.05.09.539914v1) • [Supplementary](https://www.biorxiv.org/content/biorxiv/early/2023/05/09/2023.05.09.539914/DC1/embed/media-1.pdf)

**Optimizing protein fitness using Gibbs sampling with Graph-based Smoothing**
Andrew Kirjner, Jason Yim, Raman Samusevich, Tommi Jaakkola, Regina Barzilay, Ila Fiete
[arXiv:2307.00494](https://arxiv.org/abs/2307.00494) • [code](https://github.com/kirjner/GGS)

## 3. Function to Scaffold

> These models design backbones/scaffolds/templates represented as Cartesian coordinates, contact maps, distance maps, or φ & ψ angles. Both conditional and unconditional generative models are included.

### 3.1 GAN-based

**Generative modeling for protein structures**
Anand, Namrata, and Po-Ssu Huang
[NeurIPS 2018](https://proceedings.neurips.cc/paper/2018/file/afa299a4d1d8c52e75dd8a24c3ce534f-Paper.pdf)

**Fully differentiable full-atom protein backbone generation**
Namrata Anand, Raphael Eguchi, and Po-Ssu Huang
[OpenReview ICLR 2019 workshop DeepGenStruct](https://openreview.net/forum?id=SJxnVL8YOV) • without code

**RamaNet: Computational de novo helical protein backbone design using a long short-term memory generative neural network**
Sabban, Sari, and Mikhail Markovsky
[F1000Research 9 (2020)](http://f1000researchdata.s3.amazonaws.com/manuscripts/29106/f45e92eb-5d68-4da0-b918-91ded85d2e7d_22907_-_sari_sabban_v2.pdf) • [code](https://sarisabban.github.io/RamaNet/) • PyRosetta • TensorFlow • maximizing the fluorescence of a protein

**A Generative Model for Creating Path Delineated Helical Proteins**
Nicholas B. Woodall, Ryan Kibler, Basile Wicky, Brian Coventry
[bioRxiv 2023.05.24.542095](https://www.biorxiv.org/content/10.1101/2023.05.24.542095v1) • [code](https://github.com/NickWoodall/HelixGen)

### 3.2 AutoEncoder-based

**Conditioning by adaptive sampling for robust design**
Brookes, David, Hahnbeom Park, and Jennifer Listgarten
[International conference on machine learning. PMLR, 2019](http://proceedings.mlr.press/v97/brookes19a/brookes19a.pdf) • without code

**IG-VAE: generative modeling of immunoglobulin proteins by direct 3D coordinate generation**
Raphael R. Eguchi, Christian A. Choe, Po-Ssu Huang
[Biorxiv (2020)](https://www.biorxiv.org/content/10.1101/2020.08.07.242347v2) • without code

**Generating tertiary protein structures via an interpretative variational autoencoder**
Xiaojie Guo, Yuanqi Du, Sivani Tadepalli, Liang Zhao, Amarda Shehu
[arXiv preprint arXiv:2004.07119 (2020)](https://arxiv.org/abs/2004.07119) • code not available

**Function-guided protein design by deep manifold sampling**
Vladimir Gligorijevic, Stephen Ra, Daniel Berenberg, Richard Bonneau, Kyunghyun Cho
[NeurIPS 2021](https://www.mlsb.io/papers_2021/MLSB2021_Function-guided_protein_design_by.pdf) • without code

**Deep sharpening of topological features for de novo protein design**
Zander Harteveld, Joshua Southern, Michaël Defferrard, Andreas Loukas, Pierre Vandergheynst, Michael Bronstein, Bruno Correia
[ICLR2022 Machine Learning for Drug Discovery. 2022](https://openreview.net/forum?id=DwN81YIXGQP) • code not available

**End-to-End deep structure generative model for protein design**
Boqiao Lai, Matthew McPartlon, Jinbo Xu
[bioRxiv 2022.07.09.499440](https://www.biorxiv.org/content/10.1101/2022.07.09.499440v1)

**Deep Generative Design of Epitope-Specific Binding Proteins by Latent Conformation Optimization**
Raphael R Eguchi, Christian A Choe, Udit Parekh, Irene S Khalek, Michael D Ward, Neha Vithani, Gregory R Bowman, Joseph G Jardine, Po-Ssu Huang
[bioRxiv 2022.12.22.521698](https://www.biorxiv.org/content/10.1101/2022.12.22.521698v1)

**Leveraging Deep Generative Model For Computational Protein Design And Optimization**
Boqiao Lai
[arXiv:2408.17241](https://arxiv.org/abs/2408.17241) • PhD thesis

**CyclicCAE: A Conformational Autoencoder for Efficient Heterochiral Macrocyclic Backbone Sampling**
Andrew C. Powers, P. Douglas Renfrew, Parisa Hosseinzadeh, Vikram Khipple Mulligan
[bioRxiv 2025.02.21.639569](https://www.biorxiv.org/content/10.1101/2025.02.21.639569v1)

### 3.3 MLP-based

**A backbone-centred energy function of neural networks for protein design**
Bin Huang, Yang Xu, Xiuhong Hu, Yongrui Liu, Shanhui Liao, Jiahai Zhang, Chengdong Huang, Jingjun Hong, Quan Chen & Haiyan Liu
[Nature (2022)](https://doi.org/10.1038/s41586-021-04383-5) • [code](https://zenodo.org/record/4533424#.YwP3UPFBwqs)

**De novo Design of Cavity-Containing Proteins with a Backbone-Centered Neural Network Energy Function**
Yang Xu, Xiuhong Hu, Chenchen Wang, Yongrui Liu, Quan Chen, Haiyan Liu
[Structure (2024)](https://www.cell.com/structure/fulltext/S0969-2126(24)00007-8)

### 3.4 Diffusion-based

**Diffusion probabilistic modeling of protein backbones in 3D for the motif-scaffolding problem**
Brian L. Trippe, Jason Yim, Doug Tischer, Tamara Broderick, David Baker, Regina Barzilay, Tommi Jaakkola
[arXiv:2206.04119](https://arxiv.org/abs/2206.04119v2)/[NeurIPS 2022](https://www.mlsb.io/papers_2022/Diffusion_probabilistic_modeling_of_protein_backbones_in_3D_for_the_motif_scaffolding_problem.pdf)/[ICLR 2023](https://openreview.net/forum?id=6TxBxqNME1Y) • [poster](https://nips.cc/media/PosterPDFs/NeurIPS%202022/d3d9446802a44259755d38e6d163e820.png?t=1667835607.0141048) • [Supplementary](https://openreview.net/attachment?id=6TxBxqNME1Y&name=supplementary_material) • [code](https://github.com/blt2114/ProtDiff_SMCDiff)

**ProteinSGM: Score-based generative modeling for de novo protein design**
Jin Sub Lee, Philip M Kim
[bioRxiv 2022.07.13.499967](https://www.biorxiv.org/content/10.1101/2022.07.13.499967v2)/[Nat Comput Sci (2023)](https://www.nature.com/articles/s43588-023-00440-3) • [code](https://gitlab.com/mjslee0921/proteinsgm)

**Protein structure generation via folding diffusion**
Kevin E. Wu, Kevin K. Yang, Rianne van den Berg, James Y. Zou, Alex X. Lu, Ava P. Amini
[arXiv:2209.15611](https://arxiv.org/abs/2209.15611v2)/[Nat Commun 15, 1059 (2024)](https://www.nature.com/articles/s41467-024-45051-2) • [code](https://github.com/microsoft/foldingdiff)

**Generating Novel, Designable, and Diverse Protein Structures by Equivariantly Diffusing Oriented Residue Clouds**
Yeqing Lin, Mohammed AlQuraishi
[arXiv:2301.12485v3](https://arxiv.org/abs/2301.12485v3) • [code](https://github.com/aqlaboratory/genie) • [news](https://www.dw.com/en/generative-ai-inventing-proteins-is-changing-medicine/a-66356415)

**SE(3) diffusion model with application to protein backbone generation**
Jason Yim, Brian L. Trippe, Valentin De Bortoli, Emile Mathieu, Arnaud Doucet, Regina Barzilay, Tommi Jaakkola
[arXiv:2302.02277](https://arxiv.org/abs/2302.02277v2)/[ICLR 2023](https://openreview.net/forum?id=6TxBxqNME1Y) • [code](https://github.com/jasonkyuyim/se3_diffusion) • [Supplementary](https://openreview.net/attachment?id=6TxBxqNME1Y&name=supplementary_material)

**A Latent Diffusion Model for Protein Structure Generation**
Cong Fu, Keqiang Yan, Limei Wang, Wing Yee Au, Michael McThrow, Tao Komikado, Koji Maruhashi, Kanji Uchino, Xiaoning Qian, Shuiwang Ji
[arXiv:2305.04120](https://arxiv.org/abs/2305.04120)

**Practical and Asymptotically Exact Conditional Sampling in Diffusion Models**
Luhuan Wu, Brian L. Trippe, Christian A. Naesseth, David M. Blei, John P. Cunningham
[arXiv:2306.17775](https://arxiv.org/abs/2306.17775) • [code](https://github.com/blt2114/twisted_diffusion_sampler)

**Dynamics-Informed Protein Design with Structure Conditioning**
Simon V. Mathis, Urszula Julia Komorowska, Mateja Jamnik, Pietro Liò
[WCBICML2023](https://icml-compbio.github.io/2023/papers/WCBICML2023_paper121.pdf)/[ICLR 2024](https://openreview.net/forum?id=jZPqf2G9Sw)

**ForceGen: End-to-end de novo protein generation based on nonlinear mechanical unfolding responses using a protein language diffusion model**
Bo Ni, David L. Kaplan, Markus J. Buehler
[arXiv:2310.10605](https://arxiv.org/abs/2310.10605)/[Science Advances 10.6 (2024)](https://www.science.org/doi/10.1126/sciadv.adl4000) • [Supplementary](https://www.dropbox.com/scl/fi/33tnpd6u2xwermlvj22y9/SI_3_unfolding_movies_from_dataset.zip?rlkey=qno7rcitcdree8t9cj8wzg9sf&dl=0) • [code](https://github.com/lamm-mit/ProteinMechanicsDiffusionDesign)

**DiffSDS: A geometric sequence diffusion model for protein backbone inpainting**
Anonymous
[ICLR 2024](https://openreview.net/forum?id=2xYO9oxh0y)/[arXiv:2301.09642](https://arxiv.org/abs/2301.09642)

**A framework for conditional diffusion modelling with applications in motif scaffolding for protein design**
Kieran Didi, Francisco Vargas, Simon V Mathis, Vincent Dutordoir, Emile Mathieu, Urszula J Komorowska, Pietro Liò
[arXiv:2312.09236](https://arxiv.org/abs/2312.09236)

**TopoDiff: Improving Protein Backbone Generation with Topology-aware Latent Encoding**
Yuyang Zhang, Zihui (Zinnia) Ma, Haipeng Gong
[bioRxiv 2023.12.13.571602](https://www.biorxiv.org/content/10.1101/2023.12.13.571602v1)

**Improved motif-scaffolding with SE(3) flow matching**
Jason Yim, Andrew Campbell, Emile Mathieu, Andrew Y. K. Foong, Michael Gastegger, José Jiménez-Luna, Sarah Lewis, Victor Garcia Satorras, Bastiaan S. Veeling, Frank Noé, Regina Barzilay, Tommi S. Jaakkola
[arXiv:2401.04082](https://arxiv.org/abs/2401.04082)/[TMLR](https://openreview.net/forum?id=fa1ne8xDGn) • [code1](https://github.com/microsoft/frame-flow), [code2](https://github.com/microsoft/protein-frame-flow)

**DiffTopo: Fold exploration using coarse grained protein topology representations**
Yangyang Miao, Bruno Correia
[bioRxiv 2024.02.01.578456](https://www.biorxiv.org/content/10.1101/2024.02.01.578456v1)/ICLR 2024

**Diffusion models in protein structure and docking**
Jason Yim, Hannes Stärk, Gabriele Corso, Bowen Jing, Regina Barzilay, Tommi S. Jaakkola
[Wiley Interdisciplinary Reviews: Computational Molecular Science 14.2 (2024)](https://wires.onlinelibrary.wiley.com/doi/10.1002/wcms.1711) • review

**De novo antibody design with SE(3) diffusion**
Daniel Cutting, Frédéric A. Dreyer, David Errington, Constantin Schneider, Charlotte M. Deane
[arXiv:2405.07622](https://arxiv.org/abs/2405.07622)

**Out of Many, One: Designing and Scaffolding Proteins at the Scale of the Structural Universe with Genie 2**
Yeqing Lin, Minji Lee, Zhao Zhang, Mohammed AlQuraishi
[arXiv:2405.15489](https://arxiv.org/abs/2405.15489) • [code](https://github.com/aqlaboratory/genie2) • [news](https://www.marktechpost.com/2024/05/29/genie-2-transforming-protein-design-with-advanced-multi-motif-scaffolding-and-enhanced-structural-diversity/)

**Diffuse StructGen-1 (DSG-1)**
[the Diffuse team](https://www.linkedin.com/company/diffuse-bio/)
[technical appendix](https://diffuse.bio/updates.html#appendix) • commercial

**Floating Anchor Diffusion Model for Multi-motif Scaffolding**
Ke Liu, Weian Mao, Shuaike Shen, Xiaoran Jiao, Zheng Sun, Hao Chen, Chunhua Shen
[ICML 2024](https://proceedings.mlr.press/v235/liu24av.html)/[arXiv:2406.03141](https://arxiv.org/abs/2406.03141) • [code](https://github.com/aim-uofa/FADiff) • [poster](https://icml.cc/virtual/2024/poster/34654)

**De novo Design of A Fusion Protein Tool for GPCR Research**
Kaixuan Gao, Xin Zhang, Jia Nie, Hengyu Meng, Weishe Zhang, Boxue Tian, Xiangyu Liu
[bioRxiv 2024.09.14.613090](https://www.biorxiv.org/content/10.1101/2024.09.14.613090v1) • [Supplementary](https://www.biorxiv.org/content/biorxiv/early/2024/09/15/2024.09.14.613090/DC1/embed/media-1.pdf) • RFdiffusion-based

**Text2Protein: A Generative Model for Designated Protein Design on Given Description**
Ramtin Hosseini, Siyang Zhang, Pengtao Xie
[PREPRINT (Version 1) available at Research Square](https://doi.org/10.21203/rs.3.rs-4868665/v1) • [code](https://github.com/szhan227/text2protein)

**Improving diffusion-based protein backbone generation with global-geometry-aware latent encoding**
Yuyang Zhang, Yuhang Liu, Zinnia Ma, Min Li, Chunfu Xu, Haipeng Gong
[bioRxiv 2024.10.05.616664](https://www.biorxiv.org/content/10.1101/2024.10.05.616664v1) • [code](https://github.com/meneshail/TopoDiff)

**Diffusion Posterior Sampling via Sequential Monte Carlo for Zero-Shot Scaffolding of Protein Motifs**
Young, James Matthew Uygongco, and Omer Deniz Akyildiz
[Imperial College of Science, Technology and Medicine, 2024](https://matsagad.com/files/papers/MRes_Project.pdf) • [code](https://github.com/matsagad/mres-project) • Master's thesis • Genie-based

**Protein A-like Peptide Design Based on Diffusion and ESM2 Models**
Long Zhao, Qiang He, Huijia Song, Tianqian Zhou, An Luo, Zhenguo Wen, Teng Wang, and Xiaozhu Lin
[Molecules 29.20 (2024)](https://www.mdpi.com/1420-3049/29/20/4965) • [code](https://github.com/tomlongcool/diffusion4Protein)

**FoldMark: Protecting Protein Generative Models with Watermarking**
Zaixi Zhang, Ruofan Jin, Kaidi Fu, Le Cong, Marinka Zitnik, Mengdi Wang
[arXiv:2410.20354](https://arxiv.org/abs/2410.20354) • [code](https://github.com/zaixizhang/FoldMark)

**ProteinWeaver: A Divide-and-Assembly Approach for Protein Backbone Design**
Yiming Ma, Fei Ye, Yi Zhou, Zaixiang Zheng, Dongyu Xue, Quanquan Gu
[arXiv:2411.16686](https://arxiv.org/abs/2411.16686)

**On Diffusion Posterior Sampling via Sequential Monte Carlo for Zero-Shot Scaffolding of Protein Motifs**
James Matthew Young, O. Deniz Akyildiz
[arXiv:2412.05788](https://arxiv.org/abs/2412.05788) • [code](https://github.com/matsagad/mres-project)

**From thermodynamics to protein design: Diffusion models for biomolecule generation towards autonomous protein engineering**
Wen-ran Li, Xavier F. Cadet, David Medina-Ortiz, Mehdi D. Davari, Ramanathan Sowdhamini, Cedric Damour, Yu Li, Alain Miranville, Frederic Cadet
[arXiv:2501.02680](https://arxiv.org/abs/2501.02680) • review

**RFdiffusion Exhibits Low Success Rate in De Novo Design of Functional Protein Binders for Biochemical Detection**
Bruce Jiang, Xiaoxiao Li, Amber Guo, Moris Wei, Jonny Wu
[bioRxiv 2025.02.07.636769](https://www.biorxiv.org/content/10.1101/2025.02.07.636769v1)

**From Atoms to Fragments: A Coarse Representation for Efficient and Functional Protein Design**
Leonardo V Castorina, Christopher W Wood, Kartic Subr
[bioRxiv 2025.03.19.644162](https://www.biorxiv.org/content/10.1101/2025.03.19.644162v2) • [Supplementary](https://www.biorxiv.org/content/biorxiv/early/2025/03/20/2025.03.19.644162/DC1/embed/media-1.pdf) • RFdiffusion-based

### 3.5 RL-based

**Top-down design of protein nanomaterials with reinforcement learning**
Isaac D Lutz, Shunzhi Wang, Christoffer Norn, Andrew J Borst, Yan Ting Zhao, Annie Dosey, Longxing Cao, Zhe Li, Minkyung Baek, Neil P King, Hannele Ruohola-Baker, David Baker
[bioRxiv 2022.09.25.509419](https://www.biorxiv.org/content/10.1101/2022.09.25.509419v1)/[Science 380, 266-273 (2023)](https://www.science.org/doi/10.1126/science.adf6591) • [code](https://github.com/idlutz/protein-backbone-MCTS), [code2](https://files.ipd.uw.edu/pub/2023_RL_capsid_design/sequence_design_pipeline.tar)

**Model-based reinforcement learning for protein backbone design**
Frederic Renard, Cyprien Courtot, Alfredo Reichlin, Oliver Bent
[arXiv:2405.01983](https://arxiv.org/abs/2405.01983)

**Target-based de novo design of cyclic peptide binders**
Fanhao Wang, Tiantian Zhang, Jintao Zhu, Xiaoling Zhang, Changsheng Zhang, Luhua Lai
[bioRxiv 2025.01.18.633746](https://www.biorxiv.org/content/10.1101/2025.01.18.633746v1) • [Supplementary](https://www.biorxiv.org/content/biorxiv/early/2025/01/19/2025.01.18.633746/DC1/embed/media-1.pdf)

### 3.6 Flow-based

**SE(3)-Stochastic Flow Matching for Protein Backbone Generation**
Avishek Joey Bose, Tara Akhound-Sadegh, Kilian Fatras, Guillaume Huguet, Jarrid Rector-Brooks, Cheng-Hao Liu, Andrei Cristian Nica, Maksym Korablyov, Michael Bronstein, Alexander Tong
[arXiv:2310.02391](https://arxiv.org/abs/2310.02391)/[ICLR 2024](https://openreview.net/forum?id=kJFIH23hXb)

**Fast protein backbone generation with SE(3) flow matching**
Jason Yim, Andrew Campbell, Andrew Y. K. Foong, Michael Gastegger, José Jiménez-Luna, Sarah Lewis, Victor Garcia Satorras, Bastiaan S. Veeling, Regina Barzilay, Tommi Jaakkola, Frank Noé
[arXiv:2310.05297](https://arxiv.org/abs/2310.05297) • [code](https://github.com/microsoft/frame-flow)

**Sequence-Augmented SE(3)-Flow Matching For Conditional Protein Backbone Generation**
Guillaume Huguet, James Vuckovic, Kilian Fatras, Eric Thibodeau-Laufer, Pablo Lemos, Riashat Islam, Cheng-Hao Liu, Jarrid Rector-Brooks, Tara Akhound-Sadegh, Michael Bronstein, Alexander Tong, Avishek Joey Bose
[arXiv:2405.20313](https://arxiv.org/abs/2405.20313)/[NeurIPS 2024](https://openreview.net/forum?id=paYwtPBpyZ) • [website](https://www.dreamfold.ai/blog/foldflow-2) • [lecture](https://www.youtube.com/watch?v=xgA8T9h8mm0)

**Design of Ligand-Binding Proteins with Atomic Flow Matching**
Junqi Liu, Shaoning Li, Chence Shi, Zhi Yang, Jian Tang
[arXiv:2409.12080](https://arxiv.org/abs/2409.12080)

**Proteina: Scaling Flow-based Protein Structure Generative Models**
Tomas Geffner, Kieran Didi, Zuobai Zhang, Danny Reidenbach, Zhonglin Cao, Jason Yim, Mario Geiger, Christian Dallago, Emine Kucukbenli, Arash Vahdat, Karsten Kreis
[ICLR 2025 Oral](https://openreview.net/forum?id=TVQLu34bdw) • [code](https://github.com/NVIDIA-Digital-Bio/proteina/) • [website](https://research.nvidia.com/labs/genair/proteina/) • [lecture](https://www.youtube.com/watch?v=Y2dRj9_ZEHw)

**ProtComposer: Compositional Protein Structure Generation with 3D Ellipsoids**
Hannes Stärk, Bowen Jing, Tomas Geffner, Jason Yim, Tommi Jaakkola, Arash Vahdat, Karsten Kreis
[ICLR 2025 Oral](https://openreview.net/forum?id=0ctvBgKFgc) • [code](https://github.com/NVlabs/protcomposer) • [lecture](https://www.youtube.com/watch?v=2G0d-RePc7c)

### 3.7 Score-based

**Score-Based Generative Models for Designing Binding Peptide Backbones**
John D Boom, Matthew Greenig, Pietro Sormanni, Pietro Liò
[arXiv:2310.07051](https://arxiv.org/abs/2310.07051) • [code](https://github.com/mgreenig/loopgen)

**Building Confidence in Deep Generative Protein Design**
Tianyuan Zheng, Alessandro Rondina, Pietro Liò
[arXiv:2411.18568](https://arxiv.org/abs/2411.18568) • [code](https://github.com/ECburx/PROTEVAL)

## 4. Scaffold to Sequence

> Identify the amino acid sequence from given backbone/scaffold/template constraints: torsion angles (φ & ψ), backbone angles (θ & τ), backbone dihedrals (φ, ψ & ω), backbone atoms (Cα, N, C & O), Cα−Cα distances, unit direction vectors of Cα−Cα, Cα−N & Cα−C, etc. (a.k.a. inverse folding). Adapted from [here](https://arxiv.org/abs/2202.01079). Energy-based models are also included for the task of rotamer conformation (χ angles or atom coordinates) recovery.
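
For readers new to these inputs, here is a minimal sketch of how three of the features above (a signed dihedral angle, a Cα−Cα distance, and a unit direction vector) might be computed from atom coordinates. It is illustrative only and not taken from any listed paper; the function names (`dihedral`, `distance`, `unit`) and the toy coordinates are assumptions.

```python
import math

def sub(a, b):
    """Vector difference a - b."""
    return tuple(x - y for x, y in zip(a, b))

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def distance(a, b):
    """Euclidean distance, e.g. between two C-alpha atoms."""
    d = sub(a, b)
    return math.sqrt(dot(d, d))

def unit(v):
    """Unit direction vector, e.g. from C-alpha to N."""
    n = math.sqrt(dot(v, v))
    return tuple(x / n for x in v)

def dihedral(p0, p1, p2, p3):
    """Signed dihedral angle in degrees around the p1-p2 bond.

    For phi, the four points would be C(i-1), N(i), CA(i), C(i);
    for psi, N(i), CA(i), C(i), N(i+1).
    """
    b0, b1, b2 = sub(p1, p0), sub(p2, p1), sub(p3, p2)
    n1 = cross(b0, b1)           # normal of the plane (p0, p1, p2)
    n2 = cross(b1, b2)           # normal of the plane (p1, p2, p3)
    m1 = cross(unit(b1), n1)     # in-plane axis perpendicular to n1
    return math.degrees(math.atan2(dot(m1, n2), dot(n1, n2)))

# Toy coordinates (hypothetical, not from a real structure).
angle = dihedral((1, 0, 0), (0, 0, 0), (0, 0, 1), (0, 1, 1))
ca_ca = distance((0, 0, 0), (3, 4, 0))
direction = unit((0, 0, 2))
```

Inverse-folding models typically take per-residue and per-pair features of this kind as input, although the exact feature set differs from paper to paper.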

### 4.0 Review

**Protein sequence design on given backbones with deep learning**
Yufeng Liu, Haiyan Liu
[Protein Engineering, Design and Selection, 2023](https://academic.oup.com/peds/advance-article-abstract/doi/10.1093/protein/gzad024/7503843)

**Multi-indicator comparative evaluation for deep Learning-Based protein sequence design methods**
Jinyu Yu, Junxi Mu, Ting Wei, Hai-Feng Chen
[Bioinformatics, 2024, btae037](https://academic.oup.com/bioinformatics/advance-article/doi/10.1093/bioinformatics/btae037/7585533)

**Generative AI for Controllable Protein Sequence Design: A Survey**
Yiheng Zhu, Zitai Kong, Jialu Wu, Weize Liu, Yuqiang Han, Mingze Yin, Hongxia Xu, Chang-Yu Hsieh, Tingjun Hou
[arXiv:2402.10516](https://arxiv.org/abs/2402.10516)

### 4.1 MLP-based

**3D representations of amino acids-applications to protein sequence comparison and classification**
Li, Jie, and Patrice Koehl
[Computational and structural biotechnology journal 11.18 (2014)](https://www.sciencedirect.com/science/article/pii/S2001037014000270) • 2014

**Direct prediction of profiles of sequences compatible with a protein structure by neural networks with fragment-based local and energy-based nonlocal profiles**
Zhixiu Li, Yuedong Yang, Eshel Faraggi, Jian Zhan, Yaoqi Zhou
[Proteins: Structure, Function, and Bioinformatics 82.10 (2014)](https://onlinelibrary.wiley.com/doi/abs/10.1002/prot.24620) • code unavailable

**SPIN2: Predicting sequence profiles from protein structures using deep neural networks**
James O'Connell, Zhixiu Li, Jack Hanson, Rhys Heffernan, James Lyons, Kuldip Paliwal, Abdollah Dehzangi, Yuedong Yang, Yaoqi Zhou
[Proteins: Structure, Function, and Bioinformatics 86.6 (2018)](https://onlinelibrary.wiley.com/doi/abs/10.1002/prot.25489) • code unavailable

**Computational protein design with deep learning neural networks**
Jingxue Wang, Huali Cao, John Z. H. Zhang & Yifei Qi
[Scientific reports 8.1 (2018)](https://www.nature.com/articles/s41598-018-24760-x.pdf) • code unavailable

**Ligand-aware protein sequence design using protein self contacts**
Jody Mou, Benjamin Fry, Chun-Chen Yao, Nicholas Polizzi
[NeurIPS 2022](https://www.dropbox.com/s/98ri2f9gverljcw/Ligand-aware_protein_sequence_design_using_protein_self_contacts.pdf?dl=0)

**SeqPredNN: a neural network that generates protein sequences that fold into specified tertiary structures**
Lategan, F. Adriaan, Caroline Schreiber, and Hugh G. Patterton
[BMC bioinformatics 24.1 (2023)](https://bmcbioinformatics.biomedcentral.com/articles/10.1186/s12859-023-05498-4) • [code](https://github.com/falategan/SeqPredNN)

### 4.2 VAE-based

**Design of metalloproteins and novel protein folds using variational autoencoders**
Greener, Joe G., Lewis Moffat, and David T. Jones
[Scientific reports 8.1 (2018)](https://www.nature.com/articles/s41598-018-34533-1)

### 4.3 LSTM-based

**To improve protein sequence profile prediction through image captioning on pairwise residue distance map**
Sheng Chen, Zhe Sun, Lihua Lin, Zifeng Liu, Xun Liu, Yutian Chong, Yutong Lu, Huiying Zhao, and Yuedong Yang
[Journal of chemical information and modeling 60.1 (2019)](https://pubs.acs.org/doi/abs/10.1021/acs.jcim.9b00438) • [SPROF](https://github.com/biomed-AI/SPROF)

**Deep learning of Protein Sequence Design of Protein-protein Interactions**
Syrlybaeva, Raulia, and Eva-Maria Strauch
[bioRxiv (2022)](https://www.biorxiv.org/content/10.1101/2022.01.28.478262v1)/[Bioinformatics, 2022;, btac733](https://academic.oup.com/bioinformatics/advance-article/doi/10.1093/bioinformatics/btac733/6827796) • [Supplementary](https://www.biorxiv.org/content/10.1101/2022.01.28.478262v1.supplementary-material) • [cod