{"id":13685298,"url":"https://github.com/Peldom/papers_for_protein_design_using_DL","last_synced_at":"2025-05-01T04:30:33.789Z","repository":{"id":37658632,"uuid":"431884770","full_name":"Peldom/papers_for_protein_design_using_DL","owner":"Peldom","description":"List of papers about Proteins Design using Deep Learning","archived":false,"fork":false,"pushed_at":"2025-04-25T07:34:29.000Z","size":7035,"stargazers_count":1648,"open_issues_count":0,"forks_count":193,"subscribers_count":126,"default_branch":"main","last_synced_at":"2025-04-25T08:38:58.572Z","etag":null,"topics":["deep-learning","protein-design"],"latest_commit_sha":null,"homepage":"","language":null,"has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"gpl-3.0","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/Peldom.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":"CONTRIBUTING.md","funding":null,"license":"LICENSE","code_of_conduct":"CODE_OF_CONDUCT.md","threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null,"zenodo":null}},"created_at":"2021-11-25T15:01:27.000Z","updated_at":"2025-04-25T07:34:33.000Z","dependencies_parsed_at":"2023-01-31T20:46:24.686Z","dependency_job_id":"26c37b77-924a-454e-b5cd-7cde0fe2d374","html_url":"https://github.com/Peldom/papers_for_protein_design_using_DL","commit_stats":null,"previous_names":[],"tags_count":0,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/Peldom%2Fpapers_for_protein_design_using_DL","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/Peldom%2Fpapers_for_protein_design_using_DL/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/Peldom%2Fpapers_for_protein_design_using_DL/releases","manifests_url":"
https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/Peldom%2Fpapers_for_protein_design_using_DL/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/Peldom","download_url":"https://codeload.github.com/Peldom/papers_for_protein_design_using_DL/tar.gz/refs/heads/main","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":251824034,"owners_count":21649789,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["deep-learning","protein-design"],"created_at":"2024-08-02T14:00:48.411Z","updated_at":"2025-05-01T04:30:33.757Z","avatar_url":"https://github.com/Peldom.png","language":null,"funding_links":[],"categories":["Other lists"],"sub_categories":[],"readme":"# List of papers about Protein Design using Deep Learning\n\n\u003e This repository is inspired by the remarkable work of [Kevin Kaichuang Yang](https://github.com/yangkky) and their outstanding project [Machine-learning-for-proteins](https://github.com/yangkky/Machine-learning-for-proteins). 
We have established this repository to provide a specialized and focused platform for the field of **Deep Learning for Protein Design**, a rapidly advancing domain in computational biology.\n\u003e\n\u003e [Contributions](https://github.com/Peldom/papers_for_protein_design_using_DL/blob/main/CONTRIBUTING.md) and [suggestions](https://github.com/Peldom/papers_for_protein_design_using_DL/issues) are warmly welcome!\n\u003e Community Values, Guiding Principles, and Commitments for the Responsible Development of AI for Protein Design: [details](https://responsiblebiodesign.ai/)\n\n\u003c!-- \u003e\n\u003e1. Mini protein, binders, metalloprotein, antibody, peptide \u0026 molecule designs are included  \n\u003e2. More de novo protein design paper list at [Wangchentong](https://github.com/Wangchentong)'s GitHub repo: [paper_for_denovo_protein_design](https://github.com/Wangchentong/paper_for_denovo_protein_design)  \n\u003e3. Our notes of these papers are shared in a **[Zhihu Column](https://www.zhihu.com/column/c_1475864742820929537)** (simplified Chinese/English), more suggested notes at [RosettAI](https://www.zhihu.com/column/rosettastudy)   --\u003e\n\n*Papers last week, updated on 2025.04.25:*\n+   The Dance of Atoms-De Novo Protein Design with Diffusion Model\n    + [[arXiv:2504.16479](https://arxiv.org/abs/2504.16479)] • review\n+   A novel decoding strategy for ProteinMPNN to design with less MHC Class I immune-visibility\n    + [[bioRxiv 2025.04.14.648837](https://www.biorxiv.org/content/10.1101/2025.04.14.648837v1)] • ProteinMPNN-based\n+   An All-Atom Generative Model for Designing Protein Complexes\n    + [[arXiv:2504.13075](https://arxiv.org/abs/2504.13075)] • [[code](https://github.com/bytedance/apm)]\n+   Crowdsourced Protein Design: Lessons From the Adaptyv EGFR Binder Competition\n    + [[bioRxiv 2025.04.17.648362](https://www.biorxiv.org/content/10.1101/2025.04.17.648362v2)] • [[github](https://github.com/adaptyvbio/egfr2024_post_competition)]\n+   
Designing Novel Solenoid Proteins with In Silico Evolution\n    + [[bioRxiv 2025.04.23.646631](https://www.biorxiv.org/content/10.1101/2025.04.23.646631v1)] • [[Supplementary](https://www.biorxiv.org/content/biorxiv/early/2025/04/24/2025.04.23.646631/DC1/embed/media-1.pdf)]\n\n\n\n---\n\n\u003cp align=\"center\"\u003e\n  \u003cbr\u003e\n  \u003c!-- \u003cimg src=\"dl_pd.png\" alt=\"deep learning for protein design\" width=\"500\"\u003e --\u003e\n  \u003cimg src=\"cover.jpg\" alt=\"deep learning for protein design\"\u003e\n\u003c/p\u003e\n\u003c!-- ## Menu --\u003e\n\u003c!-- \u003e Heading [[2]](#2-model-based-design) follows a **\"generator-predictor-optimizer\" paradigm**, Heading [[3]](#3-function-to-scaffold), [[4]](#4scaffold-to-sequence)\u0026[[6]](#6-function-to-structure) follow [\"Inside-out\" paradigm](https://www.nature.com/articles/nature19946)(*function-scaffold-sequence*) from [RosettaCommons](https://www.rosettacommons.org/), Heading [[5]](#5function-to-sequence)\u0026[[7]](#7-other-tasks) follow other ML/DL strategies   --\u003e\n\u003cp align='center'\u003e\n  \u003cstrong\u003e\u003ca href='#0-benchmarks-and-datasets'\u003e0) Benchmarks and datasets \u003c/a\u003e\u003c/strong\u003e\n  \u003cbr\u003e\n  \u003ca href=\"#01-sequence-datasets-benchmarks\"\u003eSequence dataset/benchmarks\u003c/a\u003e •\n  \u003ca href=\"#02-structure-datasets-benchmarks\"\u003eStructure datasets/benchmarks\u003c/a\u003e •\n  \u003ca href=\"#03-databases\"\u003ePublic database\u003c/a\u003e •\n  \u003ca href=\"#04-similar-list\"\u003eSimilar list\u003c/a\u003e •\n  \u003ca href=\"#05-guides\"\u003eGuides\u003c/a\u003e\n  \u003cbr\u003e\n  \u003cstrong\u003e\u003ca href=\"#1-reviews\"\u003e1) Reviews and surveys\u003c/a\u003e\u003c/strong\u003e\n  \u003cbr\u003e\n  \u003ca href=\"#11-de-novo-protein-design\"\u003eDe novo design\u003c/a\u003e •\n  \u003ca href=\"#12-antibody-design\"\u003eAntibody design\u003c/a\u003e •\n  \u003ca 
href=\"#13-peptide-design\"\u003ePeptide design\u003c/a\u003e •\n  \u003ca href=\"#14-binder-design\"\u003eBinder design\u003c/a\u003e •\n  \u003ca href=\"#15-enzyme-design\"\u003eEnzyme design\u003c/a\u003e\n  \u003cbr\u003e\n  \u003cstrong\u003e\u003ca href=\"#2-model-based-design\"\u003e2) Model-based design\u003c/a\u003e\u003c/strong\u003e\n  \u003cbr\u003e\n  \u003ca href=\"#21-trrosetta-based\"\u003etrRosetta-based\u003c/a\u003e •\n  \u003ca href=\"#22-alphafold2-based\"\u003eAlphaFold2-based\u003c/a\u003e •\n  \u003ca href=\"#23-dmpfold2-based\"\u003eDMPfold2-based\u003c/a\u003e •\n  \u003ca href=\"#24-cm-align\"\u003eCM-Align\u003c/a\u003e •\n  \u003ca href=\"#25-msa-transformer-based\"\u003eMSA transformer-based\u003c/a\u003e •\n  \u003ca href=\"#26-deepab-based\"\u003eDeepAb-based\u003c/a\u003e •\n  \u003ca href=\"#27-trfold2-based\"\u003eTRFold2-based\u003c/a\u003e •\n  \u003ca href=\"#28-gpt-based\"\u003eGPT-based\u003c/a\u003e •\n  \u003ca href=\"#29-esm-based\"\u003eESM-based\u003c/a\u003e •\n  \u003ca href=\"#210-antiberta-based\"\u003eAntiberta-based\u003c/a\u003e •\n  \u003ca href=\"#211-boltz-based\"\u003eBoltz-based\u003c/a\u003e •\n  \u003ca href=\"#212-sampling-algorithms\"\u003eSampling-algorithms\u003c/a\u003e\n  \u003cbr\u003e\n  \u003cstrong\u003e\u003ca href=\"#3-function-to-scaffold\" class=\"large-link\"\u003e3) Function to Scaffold\u003c/a\u003e\u003c/strong\u003e\n  \u003cbr\u003e\n  \u003ca href=\"#31-gan-based\"\u003eGAN-based\u003c/a\u003e •\n  \u003ca href=\"#32-autoencoder-based\"\u003eAutoEncoder-based\u003c/a\u003e •\n  \u003ca href=\"#33-mlp-based\"\u003eMLP-based\u003c/a\u003e •\n  \u003ca href=\"#34-diffusion-based\"\u003eDiffusion-based\u003c/a\u003e •\n  \u003ca href=\"#35-rl-based\"\u003eRL-based\u003c/a\u003e •\n  \u003ca href=\"#36-flow-based\"\u003eFlow-based\u003c/a\u003e •\n  \u003ca href=\"#37-score-based\"\u003eScore-based\u003c/a\u003e\n  \u003cbr\u003e\n  \u003cstrong\u003e\u003ca 
href=\"#4scaffold-to-sequence\"\u003e4) Scaffold to Sequence\u003c/a\u003e\u003c/strong\u003e\n  \u003cbr\u003e\n  \u003ca href=\"#40-review\"\u003eReview\u003c/a\u003e •\n  \u003ca href=\"#41-mlp-based\"\u003eMLP-based\u003c/a\u003e •\n  \u003ca href=\"#42-vae-based\"\u003eVAE-based\u003c/a\u003e •\n  \u003ca href=\"#43-lstm-based\"\u003eLSTM-based\u003c/a\u003e •\n  \u003ca href=\"#44-cnn-based\"\u003eCNN-based\u003c/a\u003e •\n  \u003ca href=\"#45-gnn-based\"\u003eGNN-based\u003c/a\u003e •\n  \u003ca href=\"#46-gan-based\"\u003eGAN-based\u003c/a\u003e •\n  \u003ca href=\"#47-transformer-based\"\u003eTransformer-based\u003c/a\u003e •\n  \u003ca href=\"#48-resnet-based\"\u003eResNet-based\u003c/a\u003e •\n  \u003ca href=\"#49-diffusion-based\"\u003eDiffusion-based\u003c/a\u003e •\n  \u003ca href=\"#410-bayesian-based\"\u003eBayesian method\u003c/a\u003e •\n  \u003ca href=\"#411-flow-based\"\u003eFlow-based\u003c/a\u003e\n  \u003cbr\u003e\n  \u003cstrong\u003e\u003ca href=\"#5function-to-sequence\"\u003e5) Function to Sequence\u003c/a\u003e\u003c/strong\u003e\n  \u003cbr\u003e\n  \u003ca href=\"#51-cnn-based\"\u003eCNN-based\u003c/a\u003e •\n  \u003ca href=\"#52-vae-based\"\u003eVAE-based\u003c/a\u003e •\n  \u003ca href=\"#53-gan-based\"\u003eGAN-based\u003c/a\u003e •\n  \u003ca href=\"#54-transformer-based\"\u003eTransformer-based\u003c/a\u003e •\n  \u003ca href=\"#55-bayesian-based\"\u003eBayesian method\u003c/a\u003e •\n  \u003ca href=\"#56-rl-based\"\u003eReinforcement Learning\u003c/a\u003e •\n  \u003ca href=\"#57-flow-based\"\u003eFlow-based\u003c/a\u003e •\n  \u003ca href=\"#58-rnn-based\"\u003eRNN-based\u003c/a\u003e •\n  \u003ca href=\"#59-lstm-based\"\u003eLSTM-based\u003c/a\u003e •\n  \u003ca href=\"#510-autoregressive-models\"\u003eAutoregressive\u003c/a\u003e •\n  \u003ca href=\"#511-boltzmann-machine-based\"\u003eBoltzmann machine\u003c/a\u003e •\n  \u003ca href=\"#512-diffusion-based\"\u003eDiffusion-based\u003c/a\u003e •\n  \u003ca 
href=\"#513-gnn-based\"\u003eGNN-based\u003c/a\u003e •\n  \u003ca href=\"#514-score-based\"\u003eScore-based\u003c/a\u003e\n  \u003cbr\u003e\n  \u003cstrong\u003e\u003ca href=\"#6-function-to-structure\"\u003e6) Function to Structure\u003c/a\u003e\u003c/strong\u003e\n  \u003cbr\u003e\n  \u003ca href=\"#60-review\"\u003eReview\u003c/a\u003e •\n  \u003ca href=\"#61-lstm-based\"\u003eLSTM-based\u003c/a\u003e •\n  \u003ca href=\"#62-diffusion-based\"\u003eDiffusion-based\u003c/a\u003e •\n  \u003ca href=\"#63-rosettafold-based\"\u003eRoseTTAFold-based\u003c/a\u003e •\n  \u003ca href=\"#64-cnn-based\"\u003eCNN-based\u003c/a\u003e •\n  \u003ca href=\"#65-gnn-based\"\u003eGNN-based\u003c/a\u003e •\n  \u003ca href=\"#66-transformer-based\"\u003eTransformer-based\u003c/a\u003e •\n  \u003ca href=\"#67-mlp-based\"\u003eMLP-based\u003c/a\u003e •\n  \u003ca href=\"#68-flow-based\"\u003eFlow-based\u003c/a\u003e •\n  \u003ca href=\"#69-alphafold-based\"\u003eAlphaFold-based\u003c/a\u003e\n  \u003cbr\u003e\n  \u003cstrong\u003e\u003ca href=\"#7-other-tasks\"\u003e7) Other\u003c/a\u003e\u003c/strong\u003e\n  \u003cbr\u003e\n  \u003ca href=\"#71-effects-of-mutation--fitness-landscape\"\u003eEffects of mutations \u0026 Fitness Landscape\u003c/a\u003e  •\n  \u003ca href=\"#72-protein-language-models-plm-and-representation-learning\"\u003eProtein Language Model \u0026 Representation Learning\u003c/a\u003e  •\n  \u003ca href=\"#73-molecular-design-models\"\u003eMolecular Design Model\u003c/a\u003e •\n  \u003ca href=\"#74-unclassified\"\u003eUnclassified\u003c/a\u003e\n\u003c/p\u003e\n\n---\n\n\n## 0. 
Benchmarks and datasets\n\n### 0.1 Sequence Datasets, Benchmarks\n\n**FLIP: Benchmark tasks in fitness landscape inference for proteins**  \nChristian Dallago, Jody Mou, Kadina E Johnston, Bruce Wittmann, Nick Bhattacharya, Samuel Goldman, Ali Madani, Kevin K Yang  \n[NeurIPS 2021 Datasets and Benchmarks Track](https://openreview.net/forum?id=p2dMLEwL8tF)/[bioRxiv 2021](https://www.biorxiv.org/content/10.1101/2021.11.09.467890v2) • [website](https://benchmark.protein.properties/) • [code](https://github.com/J-SNACKKB/FLIP) • [Supplementary](https://www.biorxiv.org/content/biorxiv/early/2022/01/19/2021.11.09.467890/DC1/embed/media-1.pdf)\n\n**A Benchmark Framework for Evaluating Structure-to-Sequence Models for Protein Design**  \nJeffrey Chan, Seyone Chithrananda, David Brookes, Sam Sinai  \nPaper unavailable at [Machine Learning in Structural Biology Workshop 2022](https://nips.cc/Conferences/2022/ScheduleMultitrack?event=50005)\n\n**PDBench: Evaluating Computational Methods for Protein-Sequence Design**  \nLeonardo V Castorina, Rokas Petrenas, Kartic Subr, Christopher W Wood  \n[Bioinformatics, 2023, btad027](https://academic.oup.com/bioinformatics/advance-article/doi/10.1093/bioinformatics/btad027/6986968) • [code](https://github.com/wells-wood-research/PDBench)\n\n**Benchmarking deep generative models for diverse antibody sequence design**  \nIgor Melnyk, Payel Das, Vijil Chenthamarakshan, Aurelie Lozano  \n[arXiv:2111.06801](https://arxiv.org/abs/2111.06801)\n\n**The Protein Engineering Tournament: An Open Science Benchmark for Protein Modeling and Design**  \nChase Armer, Hassan Kane, Dana Cortade, Dave Estell, Adil Yusuf, Radhakrishna Sanka, Henning Redestig, TJ Brunette, Pete Kelly, Erika DeBenedictis  \n[arXiv:2309.09955](https://arxiv.org/abs/2309.09955v2)\n\n**Computational Scoring and Experimental Evaluation of Enzymes Generated by Neural Networks**  \nSean R. Johnson, Xiaozhi Fu, Sandra Viknander, Clara Goldin, Sarah Monaco, Aleksej Zelezniak, Kevin K. 
Yang  \n[bioRxiv (2023)](https://www.biorxiv.org/content/10.1101/2023.03.04.531015v2) • [code](https://github.com/seanrjohnson/protein_scoring)\n\n**FLOP: Tasks for Fitness Landscapes Of Protein Wildtypes**  \nPeter Mørch Groth, Richard Michael, Jesper Salomon, Pengfei Tian, Wouter Boomsma  \n[bioRxiv 2023.06.21.545880](https://www.biorxiv.org/content/10.1101/2023.06.21.545880v2) • [code](https://github.com/petergroth/FLOP)\n\n**ProteinGym: Large-Scale Benchmarks for Protein Design and Fitness Prediction**  \nPascal Notin, Aaron W Kollasch, Daniel Ritter, Lood van Niekerk, Steffanie Paul, Hansen Spinner, Nathan Rollins, Ada Shaw, Ruben Weitzman, Jonathan Frazer, Mafalda Dias, Dinko Franceschi, Rose Orenbuch, Yarin Gal, Debora S Marks  \n[bioRxiv 2023.12.07.570727](https://biorxiv.org/content/10.1101/2023.12.07.570727v1) • [code](https://github.com/OATML-Markslab/ProteinGym)\n\n**Results of the Protein Engineering Tournament: An Open Science Benchmark for Protein Modeling and Design**  \nChase Armer, Hassan Kane, Dana L. Cortade, Henning Redestig, David A. Estell, Adil Yusuf, Nathan Rollins, Hansen Spinner, Debora Marks, TJ Brunette, Peter J. 
Kelly, Erika DeBenedictis  \n[bioRxiv 2024.08.12.606135](https://www.biorxiv.org/content/10.1101/2024.08.12.606135v1) • [code](https://github.com/the-protein-engineering-tournament/pet-pilot-2023) • [Supplementary](https://www.biorxiv.org/content/biorxiv/early/2024/08/12/2024.08.12.606135/DC1/embed/media-1.pdf)\n\n**Generative AI Models for the Protein Scaffold Filling Problem**  \nLetu Qingge, Kushal Badal, Richard Annan, Jordan Sturtz, Xiaowen Liu, and Binhai Zhu  \n[Journal of Computational Biology](https://www.liebertpub.com/doi/10.1089/cmb.2024.0510)\n\n**Benchmarking Inverse Folding Models for Antibody CDR Sequence Design**  \nPer Junior Greisen, Yifan Li, Yuxiang Lang, Chenrui Xu, Yi Zhou, Ziwei Pang  \n[bioRxiv 2024.12.16.628614](https://www.biorxiv.org/content/10.1101/2024.12.16.628614v1)\n\n**Self-supervised machine learning methods for protein design improve sampling but not the identification of high-fitness variants**  \nMoritz Ertelt, Rocco Moretti, Jens Meiler, and Clara T. Schoeder  \n[Science Advances 11.7 (2025)](https://www.science.org/doi/10.1126/sciadv.adr7338) • [code](https://github.com/meilerlab/probabilities_design)\n\n**Crowdsourced Protein Design: Lessons From the Adaptyv EGFR Binder Competition**  \nTudor-Stefan Cotet, Igor Krawczuk, Filippo Stocco, Noelia Ferruz, Anthony Gitter, Yoichi Kurumida, Lucas de Almeida Machado, Francesco Paesani, Cianna N. Calia, Chance A. Challacombe, Nikhil Haas, Ahmad Qamar, Bruno E. Correia, Martin Pacesa, Lennart Nickel, Kartic Subr, Leonardo V. Castorina, Maxwell J. Campbell, Constance Ferragu, Patrick Kidger, Logan Hallee, Christopher W. Wood, Michael J. 
Stam, Tadas Kluonis, Suleyman Mert Unal, Elian Belot, Alexander Naka, Adaptyv Competition Organizers  \n[bioRxiv 2025.04.17.648362](https://www.biorxiv.org/content/10.1101/2025.04.17.648362v2) • [github](https://github.com/adaptyvbio/egfr2024_post_competition)\n\n### 0.2 Structure Datasets, Benchmarks\n\n**AlphaDesign: A graph protein design method and benchmark on AlphaFoldDB**  \nZhangyang Gao, Cheng Tan, Stan Z. Li  \n[arXiv (2022)](https://arxiv.org/abs/2202.01079)\n\n**SidechainNet: An All-Atom Protein Structure Dataset for Machine Learning**  \nJonathan E. King, David Ryan Koes  \n[arXiv](https://arxiv.org/abs/2010.08162) • [github::sidechainnet](https://github.com/jonathanking/sidechainnet)\n\n[TDC](https://tdcommons.ai/overview/) maintains a resource list that currently contains 22 tasks (with their datasets) related to small molecules and macromolecules, including protein-protein interaction (PPI) and drug-drug interaction (DDI) prediction. [MoleculeNet](https://github.com/GLambard/Molecules_Dataset_Collection) published a small-molecule benchmark four years ago.\n\n\u003e In terms of datasets and benchmarks, protein design is far less mature than drug discovery ([Papers with Code drug discovery benchmarks](https://paperswithcode.com/task/drug-discovery)). (An evaluation of deep learning methods for protein design, especially deep generative models, may still need to be added.)\n\u003e Difficulties and opportunities always coexist. We are happy to see the work of [Christian Dallago, Jody Mou, Kadina E. Johnston, Bruce J. Wittmann, Nicholas Bhattacharya, Samuel Goldman, Ali Madani, Kevin K. Yang](https://www.biorxiv.org/content/10.1101/2021.11.09.467890v1) and [Zhangyang Gao, Cheng Tan, Stan Z. Li](https://arxiv.org/abs/2202.01079).\n\n**Sampling of structure and sequence space of small protein folds**  \nThomas W. Linsky, Kyle Noble, Autumn R. Tobin, Rachel Crow, Lauren Carter, Jeffrey L. 
Urbauer, David Baker \u0026 Eva-Maria Strauch  \n[Nat Commun 13, 7151 (2022)](https://www.nature.com/articles/s41467-022-34937-8) • [code](https://github.com/strauchlab/scaffold_design) • [Supplementary](https://static-content.springer.com/esm/art%3A10.1038%2Fs41467-022-34937-8/MediaObjects/41467_2022_34937_MOESM1_ESM.pdf)\n\n**OpenProteinSet: Training data for structural biology at scale**  \nGustaf Ahdritz, Nazim Bouatta, Sachin Kadyan, Lukas Jarosch, Daniel Berenberg, Ian Fisk, Andrew M. Watkins, Stephen Ra, Richard Bonneau, Mohammed AlQuraishi  \n[arXiv:2308.05326](https://arxiv.org/abs/2308.05326) • [OpenFold](https://github.com/aqlaboratory/openfold)\n\n**ProteinInvBench: Benchmarking Protein Design on Diverse Tasks, Models, and Metrics**  \nZhangyang Gao, Cheng Tan, Yijie Zhang, Xingran Chen, Stan Z. Li  \n[GitHub](https://github.com/A4Bio/ProteinInvBench)\n\n**PDB-Struct: A Comprehensive Benchmark for Structure-based Protein Design**  \nChuanrui Wang, Bozitao Zhong, Zuobai Zhang, Narendra Chaudhary, Sanchit Misra, Jian Tang  \n[arXiv preprint arXiv:2312.00080 (2023)](https://arxiv.org/abs/2312.00080) • [code](https://github.com/WANG-CR/PDB-Struct)\n\n**Scaffold-Lab: Critical Evaluation and Ranking of Protein Backbone Generation Methods in A Unified Framework**  \nZhuoqi Zheng, Bo Zhang, Bozitao Zhong, Kexin Liu, Jinyu Yu, Zhengxin Li, JunJie Zhu, Ting Wei, Hai-Feng Chen  \n[bioRxiv 2024.02.10.579743](https://www.biorxiv.org/content/10.1101/2024.02.10.579743v1) • [code](https://github.com/Immortals-33/Scaffold-Lab) • [Supplementary](https://www.biorxiv.org/content/biorxiv/early/2024/02/12/2024.02.10.579743/DC1/embed/media-1.pdf)\n\n**Antibody DomainBed: Out-of-Distribution Generalization in Therapeutic Protein Design**  \nNataša Tagasovska, Ji Won Park, Matthieu Kirchmeyer, Nathan C. 
Frey, Andrew Martin Watkins, Aya Abdelsalam Ismail, Arian Rokkum Jamasb, Edith Lee, Tyler Bryson, Stephen Ra, Kyunghyun Cho  \n[arXiv:2407.21028](https://arxiv.org/abs/2407.21028) • [code](https://github.com/prescient-design/antibody-domainbed) • [dataset](https://www.dropbox.com/scl/fo/e670i9adp29yv2knfu6wd/h?rlkey=uax6phjjfumkk8xoxrbwcit1h\u0026e=1\u0026dl=0)\n\n**Large protein databases reveal structural complementarity and functional locality**  \nPaweł Szczerbiak, Lukasz Szydlowski, Witold Wydmański, P. Douglas Renfrew, Julia Koehler Leman, Tomasz Kosciolek  \n[bioRxiv 2024.08.14.607935](https://www.biorxiv.org/content/10.1101/2024.08.14.607935v1) • [code](https://github.com/Tomasz-Lab/protein-structure-landscape) • [Supplementary](https://www.biorxiv.org/content/biorxiv/early/2024/08/14/2024.08.14.607935/DC1/embed/media-1.pdf) • [website](https://protein-structure-landscape.sano.science/)\n\n**The Protein Design Archive (PDA): insights from 40 years of protein design**  \nMarta Chronowska, Michael J. Stam, Derek N. Woolfson, Luigi F. Di Costanzo, Christopher W. 
Wood  \n[bioRxiv 2024.09.05.611465](https://www.biorxiv.org/content/10.1101/2024.09.05.611465v1)/[Nat Biotechnol (2025)](https://www.nature.com/articles/s41587-025-02607-x) • [code](https://github.com/wells-wood-research/chronowska-stam-wood-2024-protein-design-archive) • [Supplementary](https://www.biorxiv.org/content/biorxiv/early/2024/09/07/2024.09.05.611465/DC1/embed/media-1.docx) • [website](https://pragmaticproteindesign.bio.ed.ac.uk/pda/)\n\n**ProteinBench: A Holistic Evaluation of Protein Foundation Models**  \nFei Ye, Zaixiang Zheng, Dongyu Xue, Yuning Shen, Lihao Wang, Yiming Ma, Yan Wang, Xinyou Wang, Xiangxin Zhou, Quanquan Gu  \n[arXiv:2409.06744](https://arxiv.org/abs/2409.06744) • [code](https://proteinbench.github.io/)\n\n**Benchmarking Generative Models for Antibody Design \u0026 Exploring Log-Likelihood for Sequence Ranking**  \nTalip Uçar, Cedric Malherbe, Ferran Gonzalez  \n[bioRxiv 2024.10.07.617023](https://www.biorxiv.org/content/10.1101/2024.10.07.617023v3) • [code](https://github.com/AstraZeneca/DiffAbXL)\n\n**Towards Robust Evaluation of Protein Generative Models: A Systematic Analysis of Metrics**  \nPavel Strashnov, Andrey Shevtsov, Viacheslav Meshchaninov, Maria Ivanova, Fedor Nikolaev, Olga Kardymon, Dmitry Vetrov  \n[bioRxiv 2024.10.25.620213](https://www.biorxiv.org/content/10.1101/2024.10.25.620213v1)\n\n**MotifBench: A standardized protein design benchmark for motif-scaffolding problems**  \nZhuoqi Zheng, Bo Zhang, Kieran Didi, Kevin K. Yang, Jason Yim, Joseph L. Watson, Hai-Feng Chen, Brian L. 
Trippe  \n[arXiv:2502.12479](https://arxiv.org/abs/2502.12479) • [code](https://github.com/blt2114/MotifBench)\n\n**Systematic comparison of Generative AI-Protein Models reveals fundamental differences between structural and sequence-based approaches**  \nAlexander J Barnett, KC Rajendra, Pratikshya Pandey, Pamodha Somasiri, Kirsten A Fairfax, Sandy Hung, Alex W Hewitt  \n[bioRxiv 2025.03.23.644844](https://www.biorxiv.org/content/10.1101/2025.03.23.644844v1) • [code](https://github.com/hewittlab/Systematic-comparison-of-Generative-AI-Protein-Models) • [Supplementary](https://www.biorxiv.org/content/biorxiv/early/2025/03/24/2025.03.23.644844/DC1/embed/media-1.docx)\n\n### 0.3 Databases\n\n\u003e A list of suggested protein databases, more lists at [CNCB](https://ngdc.cncb.ac.cn/databasecommons/).\n\n#### 0.3.1 Sequence Database\n\n1. [UniProt](https://www.uniprot.org/downloads)\n2. [DisProt](https://disprot.org)\n3. [MobiDB](https://mobidb.bio.unipd.it/)\n4. [Peptipedia](https://app.peptipedia.cl/)\n\n#### 0.3.2 Structure Database\n\n| Database                                                    | Description                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                       |\n| ----------------------------------------------------------- | 
--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- |\n| [PDB](https://www.rcsb.org/)                                   | The Protein Data Bank (PDB) is a database of 3D structural data of large biological molecules, such as proteins and nucleic acids. These data are gathered using experimental methods such as X-ray crystallography, NMR spectroscopy, or cryo-electron microscopy                                                                                                                                                                                                                                                               |\n| [AlphaFoldDB](https://alphafold.ebi.ac.uk/)                    | AlphaFoldDB is a database of protein structure predictions produced by DeepMind's AlphaFold system. It provides highly accurate predictions of protein 3D structures                                                                                                                                                                                                                                                                                                                                                             |\n| [PDBbind](http://www.pdbbind.org.cn/download.php)              | PDBbind is a comprehensive collection of the binding data of all types of biomolecular complexes in the PDB database. 
It is primarily used for the development and validation of computational methods for predicting molecular interactions                                                                                                                                                                                                                                                                                     |\n| [AB-Bind](https://github.com/sarahsirin/AB-Bind-Database)      | AB-Bind is a database for antibody binding affinity data. It offers a curated set of experimental binding data and corresponding antibody-protein complex structures                                                                                                                                                                                                                                                                                                                                                             |\n| [AntigenDB](http://crdd.osdd.net/raghava/antigendb/)           | AntigenDB is a manually curated database of experimentally verified antigens that includes detailed information about the antigen, the source organism, and the associated antibodies                                                                                                                                                                                                                                                                                                                                            |\n| [CAMEO](https://www.cameo3d.org/)                              | CAMEO (Continuous Automated Model EvaluatiOn) is a project for the automated evaluation of methods predicting macromolecular structure. 
It continuously assesses the performance of automated protein structure prediction servers                                                                                                                                                                                                                                                                                               |\n| [CAPRI](https://www.ebi.ac.uk/msd-srv/capri/)                  | The Critical Assessment of PRediction of Interactions (CAPRI) is a community-wide experiment to evaluate protein-protein interaction prediction methods                                                                                                                                                                                                                                                                                                                                                                          |\n| [PIFACE](http://prism.ccbb.ku.edu.tr/piface)                   | PIFACE is a web server for the prediction of protein-protein interactions. It identifies potential interaction interfaces on protein surfaces                                                                                                                                                                                                                                                                                                                                                                                    |\n| [SAbDab](http://opig.stats.ox.ac.uk/webapps/newsabdab/sabdab/) | The Structural Antibody Database (SAbDab) is an automatically updated resource for the structural information of antibodies from the PDB. 
It allows for easy access to curated, annotated, and classified antibody structures                                                                                                                                                                                                                                                                                                    |\n| [SKEMPI v2.0](https://life.bsc.es/pid/skempi2)                 | SKEMPI 2.0 is a database of experimental measurements of the change in binding free energy caused by mutations in protein-protein complexes                                                                                                                                                                                                                                                                                                                                                                                      |\n| [ProtCAD](http://dunbrack2.fccc.edu/protcad/)                  | ProtCAD is a suite of tools for the design and engineering of novel protein structures, sequences, and functions. It allows users to build and manipulate complex protein structures, generate and evaluate sequence libraries, and simulate mutational effects. ProtCAD is a suite of tools for the design and engineering of novel protein structures, sequences, and functions. It allows users to build and manipulate complex protein structures, generate and evaluate sequence libraries, and simulate mutational effects. |\n\n### 0.4 Similar List\n\n\u003e Some similar GitHub lists that include papers about protein design using deep learning:\n\n1. [design_tools](https://github.com/hefeda/design_tools/blob/main/README.md)\n2. [awesome-AI-based-protein-design](https://github.com/opendilab/awesome-AI-based-protein-design)\n3. [ProteinStructureWithDL](https://github.com/Yang-J-LIN/ProteinStructureWithDL)\n4. 
[List of available bioinformatic tools and services](https://neurosnap.ai/services)\n\n### 0.5 Guides\n\nGuides/Tutorials for beginners on GitHub:\n\n1. [how_to_create_a_protein](https://github.com/universvm/how_to_create_a_protein)\n2. [protein-design-tutorials](https://github.com/ProteinDesignLab/protein-design-tutorials)\n\nCollection of Protein Design Labs:\n\n- [ProteinDesignLabs](https://github.com/Zuricho/ProteinDesignLabs)\n\n## 1. Reviews\n\n### 1.1 De novo protein design\n\n**Protein design: from computer models to artificial intelligence**  \nAntonella Paladino, Filippo Marchetti, Silvia Rinaldi, Giorgio Colombo  \n[Wiley Interdisciplinary Reviews: Computational Molecular Science 7.5 (2017): e1318](https://wires.onlinelibrary.wiley.com/doi/10.1002/wcms.1318)\n\n**Advances in protein structure prediction and design**  \nBrian Kuhlman, Philip Bradley  \n[Nat Rev Mol Cell Biol 20, 681-697 (2019)](https://www.nature.com/articles/s41580-019-0163-x)\n\n**Deep learning in protein structural modeling and design**  \nWenhao Gao, Sai Pooja Mahajan, Jeremias Sulam, and Jeffrey J. Gray  \n[Patterns 1.9](https://www.sciencedirect.com/science/article/pii/S2666389920301902) • 2020\n\n**100th anniversary of macromolecular science viewpoint: Data-driven protein design**  \nFerguson, Andrew L., and Rama Ranganathan  \n[ACS Macro Letters 10.3 (2021)](https://pubs.acs.org/doi/abs/10.1021/acsmacrolett.0c00885)\n\n**Artificial intelligence in early drug discovery enabling precision medicine**  \nFabio Boniolo, Emilio Dorigatti, Alexander J. Ohnmacht, Dieter Saur, Benjamin Schubert, and Michael P. 
Menden  \n[Expert Opinion on Drug Discovery 16.9 (2021)](https://www.tandfonline.com/doi/full/10.1080/17460441.2021.1918096)\n\n**Protein design with deep learning**  \nDefresne, Marianne, Sophie Barbe, and Thomas Schiex  \n[International Journal of Molecular Sciences 22.21 (2021)](https://www.mdpi.com/1422-0067/22/21/11741)\n\n**Protein sequence design with deep generative models**  \nZachary Wu, Kadina E. Johnston, Frances H. Arnold, Kevin K. Yang  \n[Current Opinion in Chemical Biology 65](https://www.sciencedirect.com/science/article/pii/S136759312100051X) • [note](https://zhuanlan.zhihu.com/p/466616309) • 2021\n\n**Structure-based protein design with deep learning**  \nOvchinnikov, Sergey, and Po-Ssu Huang  \n[Current opinion in chemical biology 65](https://www.sciencedirect.com/science/article/pii/S1367593121001125) • [note](https://zhuanlan.zhihu.com/p/467001175) • 2021\n\n**Deep learning techniques have significantly impacted protein structure prediction and protein design**  \nPearce, Robin, and Yang Zhang  \n[Current opinion in structural biology 68 (2021)](https://www.sciencedirect.com/science/article/pii/S0959440X21000142)\n\n**Recent advances in de novo protein design: Principles, methods, and applications**  \nPan, Xingjie, and Tanja Kortemme  \n[Journal of Biological Chemistry 296 (2021)](https://www.sciencedirect.com/science/article/pii/S0021925821003367)\n\n**Protein design via deep learning**  \nWenze Ding, Kenta Nakai, Haipeng Gong  \n[Briefings in Bioinformatics](https://academic.oup.com/bib/advance-article/doi/10.1093/bib/bbac102/6554124) • 25 March 2022\n\n**Deep generative modeling for protein design**  \nStrokach, Alexey, and Philip M. 
Kim  \n[Current Opinion in Structural Biology](https://www.sciencedirect.com/science/article/pii/S0959440X21001573) • 2022\n\n**Dawn of a new era for membrane protein design**  \nSowlati-Hashjin, Shahin, Aanshi Gandhi, and Michael Garton  \n[BioDesign Research (2022)](https://spj.science.org/doi/10.34133/2022/9791435)\n\n**Deep learning approaches for conformational flexibility and switching properties in protein design**  \nRudden, Lucas SP, Mahdi Hijazi, and Patrick Barth  \n[Frontiers in Molecular Biosciences](https://www.frontiersin.org/articles/10.3389/fmolb.2022.928534/full)\n\n**Computational protein design with evolutionary-based and physics-inspired modeling: current and future synergies**  \nCyril Malbranke, David Bikard, Simona Cocco, Rémi Monasson, Jérôme Tubiana  \n[arXiv:2208.13616v2](https://arxiv.org/abs/2208.13616v2)\n\n**From sequence to function through structure: deep learning for protein design**  \nNoelia Ferruz, Michael Heinzinger, Mehmet Akdel, Alexander Goncearenco, Luca Naef, Christian Dallago  \n[bioRxiv 2022.08.31.505981](https://www.biorxiv.org/content/10.1101/2022.08.31.505981v1)/[Computational and Structural Biotechnology Journal Volume 21, 2023](https://www.sciencedirect.com/science/article/pii/S2001037022005086) • [Supplementary](https://www.biorxiv.org/content/biorxiv/early/2022/09/03/2022.08.31.505981/DC1/embed/media-1.pdf) • [accompanying list](https://github.com/hefeda/design_tools/blob/main/README.md)\n\n**Computational protein design with data-driven approaches: Recent developments and perspectives**  \nHaiyan Liu, Quan Chen  \n[WIREs Comput Mol Sci. 2022. 
e1646](https://wires.onlinelibrary.wiley.com/doi/10.1002/wcms.1646)\n\n**Understanding by design: Implementing deep learning from protein structure prediction to protein design**  \nGao, Yuanxu, Jiangshan Zhan, and Albert CH Yu  \n[MedComm-Future Medicine 1.2 (2022): e22](https://onlinelibrary.wiley.com/doi/full/10.1002/mef2.22)\n\n**Diffusion Models in Bioinformatics: A New Wave of Deep Learning Revolution in Action**  \nZhiye Guo, Jian Liu, Yanli Wang, Mengrui Chen, Duolin Wang, Dong Xu, Jianlin Cheng  \n[arXiv:2302.10907](https://arxiv.org/abs/2302.10907)\n\n**Machine learning for evolutionary-based and physics-inspired protein design: Current and future synergies**  \nCyril Malbranke, David Bikard, Simona Cocco, Rémi Monasson, Jérôme Tubiana  \n[Current Opinion in Structural Biology](https://www.sciencedirect.com/science/article/pii/S0959440X23000453)\n\n**De novo design of polyhedral protein assemblies: before and after the AI revolution**  \nBhoomika Basu Mallik, Jenna Stanislaw, Tharindu Madhusankha Alawathurage, and Alena Khmelinskaia  \n[ChemBioChem 2023, e202300117](http://dx.doi.org/10.1002/cbic.202300117)\n\n**Research progress of artificial intelligence in protein design**  \nCHEN Zhihang, JI Menglin, QI Yifei  \n[Synthetic Biology Journal (2023)](https://synbioj.cip.com.cn/article/2023/2096-8280/2023-008.shtml)\n\n**A Survey on Graph Diffusion Models: Generative AI in Science for Molecule, Protein and Material**  \nMengchun Zhang, Maryam Qamar, Taegoo Kang, Yuna Jung, Chenshuang Zhang, Sung-Ho Bae, Chaoning Zhang  \n[arXiv:2304.01565](https://arxiv.org/pdf/2304.01565.pdf)\n\n**Exploring the Protein Sequence Space with Global Generative Models**  \nSergio Romero-Romero, Sebastian Lindner, Noelia Ferruz  \n[arXiv:2305.01941](https://arxiv.org/abs/2305.01941)\n\n**The Era of Machine Learning for Protein Design, Summarized in Four Key Methods**  \nLucianoSphere  \n[Towards Data 
Science](https://towardsdatascience.com/the-era-of-machine-learning-for-protein-design-summarized-in-four-key-methods-d6f1dac5de96)\n\n**Is novelty predictable?**  \nClara Fannjiang, Jennifer Listgarten  \n[arXiv:2306.00872](https://arxiv.org/abs/2306.00872)\n\n**Computational protein design - where it goes?**  \nXu Binbin, Chen Yingjun and Xue Weiwei  \n[Current Medicinal Chemistry 2023](https://www.eurekaselect.com/article/132267)\n\n**How can the protein design community best support biologists who want to harness AI tools for protein structure prediction and design?**  \nBirte Höcker, Peilong Lu, Anum Glasgow, Debora S. Marks, Pranam Chatterjee, Joanna S.G. Slusky, Ora Schueler-Furman, Po-Ssu Huang  \n[Cell Systems 14.8 (2023)](https://www.cell.com/cell-systems/fulltext/S2405-4712(23)00212-0)\n\n**De novo 設計ナノポアの創製 (Creation of de novo designed nanopores)**  \n新津藍  \n[生物工学会誌 101.8 (2023)](https://www.jstage.jst.go.jp/article/seibutsukogaku/101/8/101_101.8_431/_article/-char/ja/)\n\n**Generative artificial intelligence for de novo protein design**  \nAdam Winnifrith, Carlos Outeiral, Brian Hie  \n[arXiv:2310.09685](https://arxiv.org/abs/2310.09685)\n\n**Intelligent Protein Design and Molecular Characterization Techniques: A Comprehensive Review**  \nJingjing Wang, Chang Chen, Ge Yao, Junjie Ding, Liangliang Wang and Hui Jiang  \n[Molecules 28.23 (2023)](https://www.mdpi.com/1420-3049/28/23/7865)\n\n**Generative models for protein sequence modeling: recent advances and future directions**  \nMehrsa Mardikoraem, Zirui Wang, Nathaniel Pascual, Daniel Woldring  \n[Briefings in Bioinformatics](https://academic.oup.com/bib/article/24/6/bbad358/7325909)\n\n**A new age in protein design empowered by deep learning**  \nHamed Khakzad, Ilia Igashov, Arne Schneuing, Casper Goverde, Michael Bronstein, Bruno Correia  \n[Cell Systems, Volume 14, Issue 11](https://www.cell.com/cell-systems/fulltext/S2405-4712(23)00298-3)\n\n**Deep learning for protein structure prediction and design—progress and applications**  
\nJürgen Jänes and Pedro Beltrao  \n[Mol Syst Biol(2024)](https://www.embopress.org/doi/full/10.1038/s44320-024-00016-x)\n\n**De novo protein design—From new structures to programmable functions**  \nTanja Kortemme  \n[Cell 187.3 (2024)](https://www.cell.com/cell/fulltext/S0092-8674(23)01402-2)\n\n**Generative models for protein structures and sequences**  \nChloe Hsu, Clara Fannjiang \u0026 Jennifer Listgarten  \n[Nat Biotechnol 42, 196–199 (2024)](https://www.nature.com/articles/s41587-023-02115-w)\n\n**What does it take for an ‘AlphaFold Moment’ in functional protein engineering and design?**  \nRoberto A. Chica \u0026 Noelia Ferruz  \n[Nat Biotechnol 42, 173–174 (2024)](https://www.nature.com/articles/s41587-023-02120-z)\n\n**Protein design: the experts speak**  \nAnne Doerr  \n[Nat Biotechnol 42, 175–178 (2024)](https://www.nature.com/articles/s41587-023-02111-0)\n\n**Machine learning for functional protein design**  \nPascal Notin, Nathan Rollins, Yarin Gal, Chris Sander \u0026 Debora Marks  \n[Nat Biotechnol 42, 216–228 (2024)](https://www.nature.com/articles/s41587-024-02127-0)\n\n**Sparks of function by de novo protein design**  \nAlexander E. Chu, Tianyu Lu \u0026 Po-Ssu Huang  \n[Nat Biotechnol 42, 203–215 (2024)](https://www.nature.com/articles/s41587-024-02133-2) • [poster](https://drive.google.com/file/d/1sG3OlEWvhHcWAdtf7RTcCawAapDmyeEx/view)\n\n**A Survey of Generative AI for De Novo Drug Design: New Frontiers in Molecule and Protein Generation**  \nXiangru Tang, Howard Dai, Elizabeth Knight, Fang Wu, Yunyang Li, Tianxiao Li, Mark Gerstein  \n[arXiv:2402.08703](https://arxiv.org/abs/2402.08703)\n\n**Security challenges by AI-assisted protein design**  \nPhilip Hunter  \n[EMBO Rep(2024)](https://www.embopress.org/doi/full/10.1038/s44319-024-00124-7)\n\n**Opportunities and challenges in design and optimization of protein function**  \nDina Listov, Casper A. Goverde, Bruno E. 
Correia \u0026 Sarel Jacob Fleishman  \n[Nat Rev Mol Cell Biol (2024)](https://www.nature.com/articles/s41580-024-00718-y)\n\n**The State-of-the-Art Overview to Application of Deep Learning in Accurate Protein Design and Structure Prediction**  \nSaber Saharkhiz, Mehrnaz Mostafavi, Amin Birashk, Shiva Karimian, Shayan Khalilollah, Sohrab Jaferian, Yalda Yazdani, Iraj Alipourfard, Yun Suk Huh, Marzieh Ramezani Farani \u0026 Reza Akhavan-Sigari  \n[Top Curr Chem (Z) 382, 23 (2024)](https://link.springer.com/article/10.1007/s41061-024-00469-6)\n\n**Computational methods for protein design**  \nNoelia Ferruz, Amelie Stein  \n[Protein Engineering, Design and Selection, Volume 37, 2024](https://academic.oup.com/peds/article/doi/10.1093/protein/gzae011/7710436)\n\n**Structure-based protein and small molecule generation using EGNN and diffusion models: A comprehensive review**  \nFarzan Soleymani, Eric Paquet, Herna Lydia Viktor, Wojtek Michalowski  \n[Computational and Structural Biotechnology Journal (2024)](https://www.sciencedirect.com/science/article/pii/S2001037024002228)\n\n**Machine learning in biological physics: From biomolecular prediction to design**  \nJonathan Martin, Marcos Lequerica Mateos, José N. Onuchic, and Faruck Morcos  \n[Proceedings of the National Academy of Sciences 121.27 (2024)](https://www.pnas.org/doi/10.1073/pnas.2311807121)\n\n**AI has dreamt up a blizzard of new proteins. 
Do any of them actually work?**  \nEwen Callaway  \n[Nature 634.8034 (2024)](https://www.nature.com/articles/d41586-024-03335-z)\n\n**Five protein-design questions that still challenge AI**  \nSara Reardon  \n[Nature 635.8037 (2024)](https://www.nature.com/articles/d41586-024-03595-9)\n\n**De novo protein design in the age of artificial intelligence**  \nNan Liu, Xiaocheng Jin, Chongzhou Yang, Ziyang Wang, Xiaoping Min, Shengxiang Ge  \n[Sheng Wu Gong Cheng Xue Bao](https://doi.org/10.13345/j.cjb.240087)\n\n**Generative Models in Protein Engineering: A Comprehensive Survey**  \nChen Xinhui, Yiwen Yuan, Joseph Liu, Chak Tou Leong, Xiaoye Zhu, Jiaqi Chen  \n[Neurips 2024 Workshop](https://openreview.net/forum?id=Xc7l84S0Ao)\n\n**A Survey of Deep Learning Methods in Protein Bioinformatics and its Impact on Protein Design**  \nWeihang Dai  \n[arXiv:2501.01477](https://arxiv.org/abs/2501.01477)\n\n**The Promise of Protein Design: A Q\u0026A with Nobel Laureate David Baker**  \nDavid Baker and Fay Lin  \n[GEN Biotechnology (2025)](https://www.liebertpub.com/doi/abs/10.1089/genbio.2025.0004?journalCode=genbio)\n\n**Protein design and structure solution for drug discovery**  \nPetra Bombicz  \n[Crystallography Reviews (2024)](https://www.tandfonline.com/doi/full/10.1080/0889311X.2024.2461923)\n\n**A Model-Centric Review of Deep Learning for Protein Design**  \nGregory W. Kyro, Tianyin Qiu, Victor S. Batista  \n[arXiv:2502.19173](https://arxiv.org/abs/2502.19173)\n\n**Computational protein design**  \nKatherine I. Albanese, Sophie Barbe, Shunsuke Tagami, Derek N. Woolfson \u0026 Thomas Schiex  \n[Nature Reviews Methods Primers 5.1 (2025)](https://www.nature.com/articles/s43586-025-00383-1)\n\n**Exploring the Blueprint of Life: The Innovation in Antibody and Protein Design**  \nYang, Zhiwei, and Gerald H. 
Lushington  \n[Combinatorial chemistry \u0026 high throughput screening](https://www.eurekaselect.com/article/146786)\n\n**Advanced Deep Learning Methods for Protein Structure Prediction and Design**  \nYichao Zhang, Ningyuan Deng, Xinyuan Song, Ziqian Bi, Tianyang Wang, Zheyu Yao, Keyu Chen, Ming Li, Qian Niu, Junyu Liu, Benji Peng, Sen Zhang, Ming Liu, Li Zhang, Xuanhe Pan, Jinlang Wang, Pohsun Feng, Yizhu Wen, Lawrence KQ Yan, Hongming Tseng, Yan Zhong, Yunze Wang, Ziyuan Qin, Bowen Jing, Junjie Yang, Jun Zhou, Chia Xin Liang, Junhao Song  \n[arXiv:2503.13522](https://arxiv.org/abs/2503.13522)\n\n**Deep Learning-Driven Protein Structure Prediction and Design: Key Model Developments by Nobel Laureates and Multi-Domain Applications**  \nWanqing Yang, Yanwei Wang, Yang Wang  \n[arXiv:2504.01490](https://arxiv.org/abs/2504.01490)\n\n**Intelligent mining, engineering, and de novo design of proteins**  \nLIU Cui, SHI Zhenkun, MA Hongwu, LIAO Xiaoping  \n[Sheng wu gong cheng xue bao= Chinese journal of biotechnology 41.3 (2025)](https://cjb.ijournals.cn/html/cjbcn/2025/3/07240629.htm)\n\n### 1.2 Antibody design\n\n**A review of deep learning methods for antibodies**  \nJordan Graves, Jacob Byerly, Eduardo Priego, Naren Makkapati, S. Vince Parish, Brenda Medellin and Monica Berrondo  \n[Antibodies 9.2 (2020)](https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7344881/pdf/antibodies-09-00012.pdf)\n\n**Progress and challenges for the machine learning-based design of fit-for-purpose monoclonal antibodies**  \nRahmad Akbar, Habib Bashour, Puneet Rawat, Philippe A. Robert, Eva Smorodina, Tudor-Stefan Cotet, Karine Flem-Karlsen, Robert Frank, Brij Bhushan Mehta, Mai Ha Vu, Talip Zengin, Jose Gutierrez-Marcos, Fridtjof Lund-Johansen, Jan Terje Andersen, and Victor Greiff  \n[Mabs. Vol. 14. No. 1. 
Taylor \u0026 Francis, 2022](https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8928824/)\n\n**Advances in computational structure-based antibody design**  \nHummer, Alissa M., Brennan Abanades, and Charlotte M. Deane  \n[Current Opinion in Structural Biology 74 (2022)](https://www.sciencedirect.com/science/article/pii/S0959440X22000586)\n\n**Computational and artificial intelligence-based methods for antibody development**  \nJisun Kim, Matthew McFee, Qiao Fang, Osama Abdin, Philip M. Kim  \n[Trends in Pharmacological Sciences (2023)](https://www.sciencedirect.com/science/article/pii/S0165614722002796)\n\n**Leveraging deep learning to improve vaccine design**  \nAndrew P. Hederman, Margaret E. Ackerman  \n[Trends in immunology (2023)](https://www.cell.com/trends/immunology/fulltext/S1471-4906(23)00046-7)\n\n**In Silico Approaches to Deliver Better Antibodies by Design: The Past, the Present and the Future**  \nAndreas Evers, Shipra Malhotra, Vanita D. Sood  \n[arXiv:2305.07488](https://arxiv.org/abs/2305.07488)\n\n**AI Models for Protein Design are Driving Antibody Engineering**  \nMichael Chungyoun, Jeffrey J. Gray  \n[Current Opinion in Biomedical Engineering (2023): 100473](https://www.sciencedirect.com/science/article/abs/pii/S2468451123000296)\n\n**Computational Methods in Immunology and Vaccinology: Design and Development of Antibodies and Immunogens**  \nFederica Guarra and Giorgio Colombo  \n[Journal of Chemical Theory and Computation (2023)](https://pubs.acs.org/doi/10.1021/acs.jctc.3c00513)\n\n**Simplifying complex antibody engineering using machine learning**  \nMakowski, Emily K., Hsin-Ting Chen, and Peter M. Tessier  \n[Cell Systems 14.8 (2023)](https://www.cell.com/cell-systems/fulltext/S2405-4712(23)00118-7)/[2022 AIChE Annual Meeting. AIChE, 2022.](https://aiche.confex.com/aiche/2022/meetingapp.cgi/Paper/650993)\n\n**AI driven B-cell Immunotherapy Design**  \nBruna Moreira da Silva, David B. Ascher, Nicholas Geard, Douglas E. V. 
Pires  \n[arXiv:2309.01122](https://arxiv.org/abs/2309.01122)\n\n**Best practices for machine learning in antibody discovery and development**  \nLeonard Wossnig, Norbert Furtmann, Andrew Buchanan, Sandeep Kumar, Victor Greiff  \n[arXiv:2312.08470](https://arxiv.org/abs/2312.08470)/[Drug Discovery Today (2024)](https://www.sciencedirect.com/science/article/pii/S1359644624001508)\n\n**Next generation of multispecific antibody engineering**  \nDaniel Keri, Matt Walker, Isha Singh, Kyle Nishikawa, Fernando Garces  \n[Antibody Therapeutics (2023): tbad027](https://academic.oup.com/abt/article/7/1/37/7463325)\n\n**A primer on ML in antibody engineering**  \n[ABHISHAIKE MAHAJAN](https://substack.com/@abhishaikemahajan)  \n[Substack](https://www.abhishaike.com/p/a-primer-on-ai-in-antibody-engineering) • blog\n\n**Antibody design using deep learning: from sequence and structure design to affinity maturation**  \nSara Joubbi, Alessio Micheli, Paolo Milazzo, Giuseppe Maccari, Giorgio Ciano, Dario Cardamone, Duccio Medini  \n[Briefings in Bioinformatics, Volume 25, Issue 4, July 2024, bbae307](https://academic.oup.com/bib/article/25/4/bbae307/7705535)\n\n**AI-accelerated therapeutic antibody development: practical insights**  \nLuca Santuari, Marianne Bachmann Salvy, Ioannis Xenarios, Bulak Arpat  \n[Frontiers in Drug Discovery 4 (2024)](https://www.frontiersin.org/journals/drug-discovery/articles/10.3389/fddsv.2024.1447867/full)\n\n**AI-driven antibody design with generative diffusion models: current insights and future directions**  \nXin-heng He, Jun-rui Li, James Xu, Hong Shan, Shi-yi Shen, Si-han Gao \u0026 H. 
Eric Xu  \n[Acta Pharmacologica Sinica (2024)](https://www.nature.com/articles/s41401-024-01380-y)\n\n**Applying computational protein design to therapeutic antibody discovery -- current state and perspectives**  \nWeronika Bielska, Igor Jaszczyszyn, Pawel Dudzic, Bartosz Janusz, Dawid Chomicz, Sonia Wrobel, Victor Greiff, Ryan Feehan, Jared Adolf-Bryfogle, Konrad Krawczyk  \n[arXiv:2503.00913](https://arxiv.org/abs/2503.00913)\n\n### 1.3 Peptide design\n\n**Deep generative models for peptide design**  \nWan, Fangping, Daphne Kontogiorgos-Heintz, and Cesar de la Fuente-Nunez  \n[Digital Discovery (2022)](https://pubs.rsc.org/en/content/articlehtml/2022/dd/d1dd00024a)\n\n**Design of protein segments and peptides for binding to protein targets**  \nGupta, Suchetana, Noora Azadvari, and Parisa Hosseinzadeh  \n[BioDesign Research 2022 (2022)](https://spj.science.org/doi/10.34133/2022/9783197)\n\n**Revolutionizing peptide-based drug discovery: Advances in the post-AlphaFold era**  \nLiwei Chang, Arup Mondal, Bhumika Singh, Yisel Martínez-Noa, Alberto Perez  \n[Wiley Interdisciplinary Reviews: Computational Molecular Science](https://wires.onlinelibrary.wiley.com/doi/epdf/10.1002/wcms.1693)\n\n**Peptide-based drug discovery through artificial intelligence: towards an autonomous design of therapeutic peptides**  \nMontserrat Goles, Anamaría Daza, Gabriel Cabas-Mora, Lindybeth Sarmiento-Varón, Julieta Sepúlveda-Yañez, Hoda Anvari-Kazemabad, Mehdi D Davari, Roberto Uribe-Paredes, Álvaro Olivera-Nappa, Marcelo A Navarrete, David Medina-Ortiz  \n[Briefings in Bioinformatics 25.4 (2024)](https://academic.oup.com/bib/article/25/4/bbae275/7690345)\n\n**Accelerating antimicrobial peptide design: Leveraging deep learning for rapid discovery**  \nAhmad M. Al-Omari, Yazan H. Akkam, Ala’a Zyout, Shayma’a Younis, Shefa M. 
Tawalbeh, Khaled Al-Sawalmeh, Amjed Al Fahoum, Jonathan Arnold  \n[PloS one 19.12 (2024): e0315477](https://journals.plos.org/plosone/article?id=10.1371/journal.pone.0315477)\n\n**Trends in the Research and Development of Peptide Drug Conjugates: Artificial Intelligence Aided Design**  \nDong-E Zhang, Tong He, Tianyi Shi, Kun Huang, Anlin Peng  \n[Frontiers in Pharmacology 16](https://www.frontiersin.org/journals/pharmacology/articles/10.3389/fphar.2025.1553853/full)\n\n### 1.4 Binder design\n\n**Improving de novo Protein Binder Design with Deep Learning**  \nNathaniel Bennett, Brian Coventry, Inna Goreshnik, Buwei Huang, Aza Allen, Dionne Vafeados, Ying Po Peng, Justas Dauparas, Minkyung Baek, Lance Stewart, Frank DiMaio, Steven De Munck, Savvas Savvides, David Baker  \n[bioRxiv 2022.06.15.495993](https://www.biorxiv.org/content/10.1101/2022.06.15.495993v1)/[Nat Commun 14, 2625 (2023)](https://www.nature.com/articles/s41467-023-38328-5) • [code](https://github.com/nrbennet/dl_binder_design) • [news](https://phys.org/news/2023-08-deep-protein.html)\n\n**Data and AI-driven synthetic binding protein discovery**  \nYanlin Li, Zixin Duan, Zhenwen Li, Weiwei Xue  \n[Trends in Pharmacological Sciences (2025)](https://www.cell.com/trends/pharmacological-sciences/abstract/S0165-6147(24)00268-2)\n\n### 1.5 Enzyme design\n\n**A review of enzyme design in catalytic stability by artificial intelligence**  \nYongfan Ming, Wenkang Wang, Rui Yin, Min Zeng, Li Tang, Shizhe Tang, Min Li  \n[Briefings in Bioinformatics, 2023](https://academic.oup.com/bib/advance-article-abstract/doi/10.1093/bib/bbad065/7086816)\n\n**Application of \"foldability\" in the intelligent of enzymes engineering and design: take AlphaFold2 for example**  \nMENG Qiaozhen, GUO Fei  \n[Synthetic Biology Journal (2023)](https://synbioj.cip.com.cn/article/2023/2096-8280/2023-011.shtml)\n\n**AlphaFold2 and Deep Learning for Elucidating Enzyme Conformational 
Flexibility and Its Application for Design**  \nCasadevall, Guillem, Cristina Duran, and Sílvia Osuna  \n[JACS Au (2023)](https://pubs.acs.org/doi/10.1021/jacsau.3c00188)\n\n**Accelerating Biocatalysis Discovery with Machine Learning: A Paradigm Shift in Enzyme Engineering, Discovery, and Design**  \nBraun Markus, Gruber Christian C, Krassnigg Andreas, Kummer Arkadij, Lutz Stefan, Oberdorfer Gustav, Siirola Elina, and Snajdrova Radka  \n[ACS Catal. 2023](https://pubs.acs.org/doi/10.1021/acscatal.3c03417)\n\n**Building Enzymes through Design and Evolution**  \nHossack, Euan J., Florence J. Hardy, and Anthony P. Green  \n[ACS Catalysis 13.19 (2023)](https://pubs.acs.org/doi/10.1021/acscatal.3c02746)\n\n**Advances in generative modeling methods and datasets to design novel enzymes for renewable chemicals and fuels**  \nRana A Barghout, Zhiqing Xu, Siddharth Betala, Radhakrishnan Mahadevan  \n[Current Opinion in Biotechnology, Volume 84, 2023](https://www.sciencedirect.com/science/article/abs/pii/S0958166923001179)\n\n**Opportunities and Challenges for Machine Learning-Assisted Enzyme Engineering**  \nJason Yang, Francesca-Zhoufan Li, Frances H. Arnold  \n[ACS Central Science (2024)](https://pubs.acs.org/doi/10.1021/acscentsci.3c01275)\n\n**Navigating the landscape of enzyme design: from molecular simulations to machine learning**  \nJiahui Zhou, Meilan Huang  \n[Chemical Society Reviews (2024)](https://pubs.rsc.org/en/Content/ArticleLanding/2024/CS/D4CS00196F)\n\n**Structure Prediction and Computational Protein Design for Efficient Biocatalysts and Bioactive Proteins**  \nRebecca Buller, Jiri Damborsky, Donald Hilvert, Uwe Bornscheuer  \n[Angewandte Chemie (International ed. in English)](https://onlinelibrary.wiley.com/doi/10.1002/anie.202421686)\n\n## 2. Model-based design\n\n\u003e Invert trained models with optimization algorithms, iterating over candidate sequences until the design objective is met. 
Inverting a structure prediction model in this way is known as **Hallucination**.\n\n### 2.1 trRosetta-based\n\n**Design of proteins presenting discontinuous functional sites using deep learning**  \nDoug Tischer, Sidney Lisanza, Jue Wang, Runze Dong, Ivan Anishchenko, Lukas F. Milles, Sergey Ovchinnikov, David Baker  \n[bioRxiv (2020)](https://www.biorxiv.org/content/10.1101/2020.11.29.402743v1)\n\n**Fast differentiable DNA and protein sequence optimization for molecular design**  \nLinder, Johannes, and Georg Seelig  \n[arXiv preprint arXiv:2005.11275 (2020)](https://arxiv.org/abs/2005.11275)\n\n**De novo protein design by deep network hallucination**  \nIvan Anishchenko, Samuel J. Pellock, Tamuka M. Chidyausiku, Theresa A. Ramelot, Sergey Ovchinnikov, Jingzhou Hao, Khushboo Bafna, Christoffer Norn, Alex Kang, Asim K. Bera, Frank DiMaio, Lauren Carter, Cameron M. Chow, Gaetano T. Montelione \u0026 David Baker  \n[Nature (2021)](https://doi.org/10.1038/s41586-021-04184-w) • [code](https://github.com/gjoni/trDesign) • [trRosetta](https://yanglab.nankai.edu.cn/trRosetta/download/)\n\n**Protein sequence design by conformational landscape optimization**  \nChristoffer Norn, Basile I. M. Wicky, David Juergens, and Sergey Ovchinnikov  \n[Proceedings of the National Academy of Sciences 118.11 (2021)](https://www.pnas.org/content/118/11/e2017228118) • [code](https://github.com/gjoni/trDesign)\n\n**De novo design of small beta barrel proteins**  \nDavid E. Kim, Davin R. Jensen, David Feldman, Doug Tischer, Ayesha Saleem, Cameron M. Chow, Xinting Li, Lauren Carter, Lukas Milles, Hannah Nguyen, Alex Kang, Asim K. Bera, Francis C. Peterson, Brian F. 
Volkman, Sergey Ovchinnikov, David Baker  \n[PNAS(2023),e2207974120](https://www.pnas.org/doi/10.1073/pnas.2207974120) • [code](https://github.com/sokrypton/TrDesign_partialhal)\n\n**Exploring \"dark matter\" protein folds using deep learning**  \nZander Harteveld, Alexandra Van Hall-Beauvais, Irina Morozova, Joshua Southern, Casper Alexander Goverde, Sandrine Georgeon, Stephane Rosset, Andreas Loukas, Pierre Vandergheynst, Michael Bronstein, Bruno Correia  \n[bioRxiv 2023.08.30.555621](https://www.biorxiv.org/content/10.1101/2023.08.30.555621v1)/[Cell Systems](https://www.cell.com/cell-systems/fulltext/S2405-4712(24)00270-9) • [Supplementary](https://www.biorxiv.org/content/biorxiv/early/2023/09/01/2023.08.30.555621/DC1/embed/media-1.pdf) • [code](https://github.com/zanderharteveld/genesis)\n\n**Carving out a Glycoside Hydrolase Active Site for Incorporation into a New Protein Scaffold Using Deep Network Hallucination**  \nAnders Lønstrup Hansen, Frederik Friis Theisen, Ramon Crehuet, Enrique Marcos, Nushin Aghajari, and Martin Willemoës  \n[ACS Synth. Biol. 
2024](https://pubs.acs.org/doi/10.1021/acssynbio.3c00674)\n\n**Implicit modeling of the conformational landscape and sequence allows scoring and generation of stable proteins**  \nYehlin Cho, Justas Dauparas, Kotaro Tsuboyama, Gabriel Rocklin, Sergey Ovchinnikov  \n[bioRxiv 2024.12.20.629706](https://www.biorxiv.org/content/10.1101/2024.12.20.629706v1) • [code](https://github.com/yehlincho/Joint_Model_Stability) • [Supplementary](https://www.biorxiv.org/content/biorxiv/early/2024/12/22/2024.12.20.629706/DC1/embed/media-1.pdf)\n\n### 2.2 AlphaFold2-based\n\n**End-to-end learning of multiple sequence alignments with differentiable Smith-Waterman**  \nPetti, Samantha, Bhattacharya, Nicholas, Rao, Roshan, Dauparas, Justas, Thomas, Neil, Zhou, Juannan, Rush, Alexander M, Koo, Peter K, Ovchinnikov, Sergey  \n[bioRxiv (2021)](http://repository.cshl.edu/id/eprint/40409/)/[Bioinformatics, 2022, btac724](https://academic.oup.com/bioinformatics/advance-article/doi/10.1093/bioinformatics/btac724/6820925) • [ColabDesign](https://github.com/sokrypton/ColabDesign), [SMURF](https://github.com/spetti/SMURF), [AF2 back propagation](https://github.com/sokrypton/af_backprop) • [our notes1](https://zhuanlan.zhihu.com/p/468219547), [notes2](https://zhuanlan.zhihu.com/p/472037977) • [lecture1](https://www.youtube.com/watch?v=2HmXwlKWMVs), [lecture2](https://www.youtube.com/watch?v=BJdRvODiDnk) • [Discord](https://discord.com/invite/FpYPneYB)\n\n**AlphaDesign: A de novo protein design framework based on AlphaFold**  \nJendrusch, Michael, Jan O. Korbel, and S. Kashif Sadiq  \n[bioRxiv (2021)](https://www.biorxiv.org/content/10.1101/2021.10.11.463937v1)\n\n**Using AlphaFold for Rapid and Accurate Fixed Backbone Protein Design**  \nMoffat, Lewis, Joe G. Greener, and David T. Jones  \n[bioRxiv (2021)](https://www.biorxiv.org/content/10.1101/2021.08.24.457549v1)\n\n**State-of-the-art estimation of protein model accuracy using AlphaFold**  \nJames P. 
Roney, Sergey Ovchinnikov  \n[bioRxiv 2022.03.11.484043](https://www.biorxiv.org/content/10.1101/2022.03.11.484043v3)/[Physical Review Letters 129.23 (2022)](https://journals.aps.org/prl/abstract/10.1103/PhysRevLett.129.238101) • [code](https://github.com/jproney/AF2Rank)\n\n**Solubility-aware protein binding peptide design using AlphaFold**  \nTakatsugu Kosugi, Masahito Ohue  \n[bioRxiv 2022.05.14.491955](https://doi.org/10.1101/2022.05.14.491955)/[Biomedicines 10.7 (2022)](https://www.mdpi.com/2227-9059/10/7/1626) • [Supplemental Materials](https://www.biorxiv.org/content/biorxiv/early/2022/05/15/2022.05.14.491955/DC1/embed/media-1.pdf) • [code](https://github.com/ohuelab/Solubility_AfDesign)\n\n**Hallucinating protein assemblies**  \nBasile I M Wicky, Lukas F Milles, Alexis Courbet, Robert J Ragotte, Justas Dauparas, Elias Kinfu, Sam Tipps, Ryan D Kibler, Minkyung Baek, Frank DiMaio, Xinting Li, Lauren Carter, Alex Kang, Hannah Nguyen, Asim K Bera, David Baker  \n[bioRxiv 2022.06.09.493773](https://www.biorxiv.org/content/10.1101/2022.06.09.493773v1)/[Science (2022)](https://www.science.org/doi/10.1126/science.add1964) • [related slides](https://docs.google.com/presentation/d/1_tvzLKks83sYOKemfFeImCPnWtCQ-CHqmKK_3IQI1so/) • [our notes](https://zhuanlan.zhihu.com/p/527152827) • [news](https://www.nature.com/articles/d41586-022-02947-7)\n\n**EvoBind: in silico directed evolution of peptide binders with AlphaFold**  \nPatrick Bryant, Arne Elofsson  \n[bioRxiv 2022.07.23.501214](https://www.biorxiv.org/content/10.1101/2022.07.23.501214v1) • [code](https://github.com/patrickbryant1/EvoBind)\n\n**Hallucination of closed repeat proteins containing central pockets**  \nLinna An, Derrick R Hicks, Dmitri Zorine, Justas Dauparas, Basile I. M. Wicky, Lukas F Milles, Alexis Courbet, Asim K. 
Bera, Hannah Nguyen, Alex Kang, Lauren Carter, David Baker  \n[bioRxiv 2022.09.01.506251](https://www.biorxiv.org/content/10.1101/2022.09.01.506251v1)/[Nat Struct Mol Biol 30, 1755-1760 (2023)](https://www.nature.com/articles/s41594-023-01112-6) • [Supplementary data](https://static-content.springer.com/esm/art%3A10.1038%2Fs41594-023-01112-6/MediaObjects/41594_2023_1112_MOESM1_ESM.pdf)\n\n**Predicting the structure of large protein complexes using AlphaFold and Monte Carlo tree search**  \nPatrick Bryant, Gabriele Pozzati, Wensi Zhu, Aditi Shenoy, Petras Kundrotas \u0026 Arne Elofsson  \n[Nature communications 13.1 (2022)](https://www.nature.com/articles/s41467-022-33729-4) • [gitlab](https://gitlab.com/patrickbryant1/molpc), [github](https://github.com/patrickbryant1/MoLPC) • [Supplementary data1](https://doi.org/10.5281/zenodo.6367019), [Supplementary data2](https://doi.org/10.17044/scilifelab.19375172)\n\n**De novo protein design by inversion of the AlphaFold structure prediction network**  \nCasper Goverde, Benedict Wolf, Hamed Khakzad, Stephane Rosset, Bruno E Correia  \n[bioRxiv 2022.12.13.520346](https://www.biorxiv.org/content/10.1101/2022.12.13.520346v1) • [code](https://github.com/bene837/af_gradmcmc) • [lecture1](https://www.youtube.com/watch?v=aUMGuogMZCA) • [lecture2](https://www.youtube.com/watch?v=4S4J7gbhAa0)\n\n**Code of OpenComplex**  \nJingcheng, Yu and Zhaoming, Chen and Zhaoqun, Li and Mingliang, Zeng and Wenjun, Lin and He, Huang and Qiwei, Ye  \n[code](https://github.com/baaihealth/OpenComplex)\n\n**Efficient and scalable de novo protein design using a relaxed sequence space**  \nChristopher Josef Frank, Ali Khoshouei, Yosta de Stigter, Dominik Schiewitz, Shihao Feng, Sergey Ovchinnikov, Hendrik Dietz  \n[bioRxiv 2023.02.24.529906](https://www.biorxiv.org/content/10.1101/2023.02.24.529906v1) • [code](https://github.com/sokrypton/ColabDesign/blob/main/af/examples/af_relax_design.ipynb)\n\n**Cyclic peptide structure prediction and design using 
AlphaFold**  \nStephen A. Rettie, Katelyn V. Campbell, Asim K. Bera, Alex Kang, Simon Kozlov, Joshmyn De La Cruz, Victor Adebomi, Guangfeng Zhou, Frank DiMaio, Sergey Ovchinnikov, Gaurav Bhardwaj  \n[bioRxiv](https://www.biorxiv.org/content/10.1101/2023.02.25.529956v1.full.pdf) • [Code](https://github.com/sokrypton/ColabDesign/blob/main/af/examples/af_cyc_design.ipynb) • [Supplementary](https://www.biorxiv.org/content/biorxiv/early/2023/02/26/2023.02.25.529956/DC1/embed/media-1.xlsx)\n\n**De novo design of luciferases using deep learning**  \nAndy Hsien-Wei Yeh, Christoffer Norn, Yakov Kipnis, Doug Tischer, Samuel J. Pellock, Declan Evans, Pengchen Ma, Gyu Rie Lee, Jason Z. Zhang, Ivan Anishchenko, Brian Coventry, Longxing Cao, Justas Dauparas, Samer Halabiya, Michelle DeWitt, Lauren Carter, K. N. Houk \u0026 David Baker  \n[Nature](https://www.nature.com/articles/s41586-023-05696-3) • [Code](https://files.ipd.uw.edu/pub/luxSit/scaffold_generation.tar.gz) • [Supplementary Materials](https://static-content.springer.com/esm/art%3A10.1038%2Fs41586-023-05696-3/MediaObjects/41586_2023_5696_MOESM1_ESM.pdf)\n\n**In silico evolution of protein binders with deep learning models for structure prediction and sequence design**  \nOdessa J Goudy, Amrita Nallathambi, Tomoaki Kinjo, Nicholas Randolph, Brian Kuhlman  \n[bioRxiv 2023.05.03.539278](https://www.biorxiv.org/content/10.1101/2023.05.03.539278v1) • [Supplementary](https://www.biorxiv.org/content/biorxiv/early/2023/05/03/2023.05.03.539278/DC1/embed/media-1.pdf) • [code](https://github.com/KuhlmanLab/evopro)\n\n**Computational design of soluble analogues of integral membrane protein structures**  \nCasper Alexander Goverde, Martin Pacesa, Lars Jeremy Dornfeld, Sandrine Georgeon, Stephane Rosset, Justas Dauparas, Christian Shellhaas, Simon Kozlov, David Baker, Sergey Ovchinnikov, Bruno Correia  \n[bioRxiv 2023.05.09.540044](https://www.biorxiv.org/content/10.1101/2023.05.09.540044v2)/[Nature 
(2024)](https://www.nature.com/articles/s41586-024-07601-y) • [code](https://github.com/bene837/af2seq) • [Supplementary](https://www.biorxiv.org/content/biorxiv/early/2023/05/09/2023.05.09.540044/DC1/embed/media-1.pdf)\n\n**Antibody Complementarity-Determining Region Sequence Design using AlphaFold2 and Binding Affinity Prediction Model**  \nTakafumi Ueki, Masahito Ohue  \n[bioRxiv 2023.06.02.543382](https://www.biorxiv.org/content/10.1101/2023.06.02.543382v1)\n\n**Context-Dependent Design of Induced-fit Enzymes using Deep Learning Generates Well Expressed, Thermally Stable and Active Enzymes**  \nLior Zimmerman, Noga Alon, Itay Levin, Anna Koganitsky, Nufar Shpigel, Chen Brestel, Gideon David Lapidoth  \n[bioRxiv 2023.07.27.550799](https://www.biorxiv.org/content/10.1101/2023.07.27.550799v2) • [Supplementary](https://www.biorxiv.org/content/biorxiv/early/2023/07/31/2023.07.27.550799/DC1/embed/media-1.xlsx)\n\n**Highly accurate and robust protein sequence design with CarbonDesign**/**Accurate and robust protein sequence design with CarbonDesign**  \nMilong Ren, Chungong Yu, Dongbo Bu, Haicang Zhang  \n[bioRxiv 2023.08.07.552204](https://www.biorxiv.org/content/10.1101/2023.08.07.552204v1)/[Nat Mach Intell 6, 536–547 (2024)](https://www.nature.com/articles/s42256-024-00838-2) • [code](https://github.com/zhanghaicang/carbonmatrix_public)\n\n**Design of Cyclic Peptides Targeting Protein-Protein Interactions using AlphaFold**  \nTakatsugu Kosugi, Masahito Ohue  \n[bioRxiv 2023.08.20.554056](https://www.biorxiv.org/content/10.1101/2023.08.20.554056v1) • [Supplementary](https://www.biorxiv.org/content/biorxiv/early/2023/08/21/2023.08.20.554056/DC1/embed/media-1.pdf) • [code](https://github.com/YoshitakaMo/localcolabfold/)\n\n**MetaPPI: In Silico Screen for Novel CRBN-based Substrates**  \nneoxbio  \n[website](https://www.neoxbio.com/platform-technology.html) • [news](https://mp.weixin.qq.com/s/Kb4EQ0YvYDvoLZ_cnAlUPw) • masif-based • commercial\n\n**AlphaFold 
Distillation for Protein Design**  \nAnonymous  \n[ICLR 2024](https://openreview.net/forum?id=3pgJNIx3gc) • [code](https://anonymous.4open.science/r/AFDistill-28C3)\n\n**High-throughput computational discovery of inhibitory protein fragments with AlphaFold**  \nAndrew Savinov, Sebastian Swanson, Amy E. Keating, Gene-Wei Li  \n[bioRxiv 2023.12.19.572389](https://www.biorxiv.org/content/10.1101/2023.12.19.572389v1) • [code](https://github.com/swanss/FragFold)\n\n**An integrative approach to protein sequence design through multiobjective optimization**  \nLu Hong, Tanja Kortemme  \n[bioRxiv 2024.03.01.582670](https://www.biorxiv.org/content/10.1101/2024.03.01.582670v1)/[PLOS Computational Biology 20(7)](https://journals.plos.org/ploscompbiol/article?id=10.1371/journal.pcbi.1011953) • [code](https://github.com/luhong88/int_seq_des) • [Supplementary](https://www.biorxiv.org/content/biorxiv/early/2024/03/04/2024.03.01.582670/DC1/embed/media-1.pdf)\n\n**Protein Design Using Structure-Prediction Networks: AlphaFold and RoseTTAFold as Protein Structure Foundation Models**  \nJue Wang, Joseph L. Watson and Sidney L. Lisanza  \n[Cold Spring Harbor Perspectives in Biology (2024)](https://cshperspectives.cshlp.org/content/early/2024/03/01/cshperspect.a041472.short)\n\n**Context-dependent design of induced-fit enzymes using deep learning generates well-expressed, thermally stable and active enzymes**  \nLior Zimmerman, Noga Alon, Itay Levin, and Gideon D. 
Lapidoth  \n[Proceedings of the National Academy of Sciences 121.11 (2024)](https://www.pnas.org/doi/10.1073/pnas.2313809121)\n\n**Design of Repeat Alpha-Beta Proteins with Capping Helices**  \nDmitri Zorine, David Baker  \n[bioRxiv 2024.06.15.590358](https://www.biorxiv.org/content/10.1101/2024.06.15.590358v1) • [code](https://github.com/dmitropher/af2_multistate_hallucination)\n\n**Design of linear and cyclic peptide binders of different lengths only from a protein target sequence**  \nQiuzhen Li, Efstathios Nikolaos Vlachos, Patrick Bryant  \n[bioRxiv 2024.06.20.599739](https://www.biorxiv.org/content/10.1101/2024.06.20.599739v1) • [code](https://zenodo.org/records/11543503) • [Supplementary](https://www.biorxiv.org/content/biorxiv/early/2024/06/22/2024.06.20.599739/DC1/embed/media-1.pdf)\n\n**BindCraft: one-shot design of functional protein binders**  \nMartin Pacesa, Lennart Nickel, Joseph Schmidt, Ekaterina Pyatova, Christian Schellhaas, Lucas Kissling, Ana Alcaraz-Serna, Yehlin Cho, Kourosh H. Ghamary, Laura Vinue, Brahm J. Yachnin, Andrew M. Wollacott, Stephen Buckley, Sandrine Georgeon, Casper A. Goverde, Georgios N. Hatzopoulos, Pierre Gonczy, Yannick D. Muller, Gerald Schwank, Sergey Ovchinnikov, Bruno E. 
Correia  \n[bioRxiv 2024.09.30.615802](https://www.biorxiv.org/content/10.1101/2024.09.30.615802v1) • [code](https://github.com/martinpacesa/BindCraftz)\n\n**Design of linear and cyclic peptide binders of different lengths from protein sequence information**  \nQiuzhen Li, Efstathios Nikolaos Vlachos, Patrick Bryant  \n[bioRxiv 2024.06.20.599739](https://www.biorxiv.org/content/10.1101/2024.06.20.599739v2) • [code](https://zenodo.org/records/13913345)\n\n**Scalable protein design using optimization in a relaxed sequence space**  \nChristopher Frank, Ali Khoshouei, Lara Fuss, Dominik Schiwietz, Dominik Putz, Lara Weber, Zhixuan Zhao, Motoyuki Hattori, Shihao Feng, Yosta de Stigter, Sergey Ovchinnikov, Hendrik Dietz  \n[Science 386, 439-445 (2024)](https://www.science.org/doi/10.1126/science.adq1741) • [code](https://github.com/sokrypton/ColabDesign)\n\n**Alphafold2 refinement improves designability of large de novo proteins**  \nChristopher Josef Frank, Dominik Schiwietz, Lara Fuss, Sergey Ovchinnikov, Hendrik Dietz  \n[bioRxiv 2024.11.21.624687](https://www.biorxiv.org/content/10.1101/2024.11.21.624687v1) • [colab](https://colab.research.google.com/drive/14ULdrjOmH-XMtGDrikzjDF1FLegZg3-a?usp=sharing)\n\n**Low-N OpenFold fine-tuning improves peptide design without additional structures**  \nTheodore Sternlieb, Jakub Otwinowski, Sam Sinai, Jeffrey Chan  \n[Machine Learning for Structural Biology Workshop, NeurIPS 2024](https://www.mlsb.io/papers_2024/Low-N_OpenFold_fine-tuning_improves_peptide_design_without_additional_structures.pdf)\n\n**HighPlay: Cyclic Peptide Sequence Design Based on Reinforcement Learning and Protein Structure Prediction**  \nHuitian Lin, Cheng Zhu, Tianfeng Shang, Ning Zhu, Kang Lin, Xiang Shao, Xudong Wang, Hongliang Duan  \n[bioRxiv 2025.03.17.643626](http://biorxiv.org/content/10.1101/2025.03.17.643626v1)\n\n**Designing Novel Solenoid Proteins with In Silico Evolution**  \nDaniella Pretorius, Georgi Ivanov Nikov, Kono Washio, Steve-William 
Florent, Henry Taunt, Sergey Ovchinnikov, James William Murray  \n[bioRxiv 2025.04.23.646631](https://www.biorxiv.org/content/10.1101/2025.04.23.646631v1) • [Supplementary](https://www.biorxiv.org/content/biorxiv/early/2025/04/24/2025.04.23.646631/DC1/embed/media-1.pdf)\n\n### 2.3 DMPfold2-based\n\n**Design in the DARK: Learning Deep Generative Models for De Novo Protein Design**  \nMoffat, Lewis, Shaun M. Kandathil, and David T. Jones  \n[bioRxiv (2022)](https://www.biorxiv.org/content/10.1101/2022.01.27.478087v1) • [DMPfold2](https://github.com/psipred/DMPfold2)\n\n### 2.4 CM-Align\n\n**AutoFoldFinder: An Automated Adaptive Optimization Toolkit for De Novo Protein Fold Design**  \nShuhao Zhang, Youjun Xu, Jianfeng Pei, Luhua Lai  \n[NeurIPS 2021](https://www.mlsb.io/papers_2021/MLSB2021_AutoFoldFinder.pdf)\n\n### 2.5 MSA-transformer-based\n\n**Protein language models trained on multiple sequence alignments learn phylogenetic relationships**  \nDamiano Sgarbossa, Umberto Lupo, Anne-Florence Bitbol  \n[arXiv preprint arXiv:2203.15465 (2022)](https://arxiv.org/abs/2203.15465)/[bioRxiv 2022.04.14.488405](https://www.biorxiv.org/content/10.1101/2022.04.14.488405v1)\n\n**EvoOpt: an MSA-guided, fully unsupervised sequence optimization pipeline for protein design**  \nHideki Yamaguchi, Yutaka Saito  \n[NeurIPS 2022](https://www.mlsb.io/papers_2022/EvoOpt_an_MSA_guided_fully_unsupervised_sequence_optimization_pipeline_for_protein_design.pdf)\n\n**Generative power of a protein language model trained on multiple sequence alignments**  \nSgarbossa, Damiano, Umberto Lupo, and Anne-Florence Bitbol  \n[Elife 12 (2023): e79854](https://elifesciences.org/articles/79854) • [code](https://github.com/Bitbol-Lab/Iterative_masking)\n\n### 2.6 DeepAb-based\n\n**Towards deep learning models for target-specific antibody design**  \nSai Pooja Mahajan, Jeffrey Ruffolo, Rahel Frick, Jeffrey J. 
Gray  \n[Biophysical Journal 121.3 (2022)](https://www.cell.com/biophysj/pdf/S0006-3495(21)03758-9.pdf) • [DeepAb](https://github.com/RosettaCommons/DeepAb) • [lecture](https://www.youtube.com/watch?v=LIo-1jPfrns)\n\n**Hallucinating structure-conditioned antibody libraries for target-specific binders**  \nSai Pooja Mahajan, Jeffrey A Ruffolo, Rahel Frick, Jeffrey J. Gray  \n[bioRxiv 2022.06.06.494991](https://www.biorxiv.org/content/10.1101/2022.06.06.494991v1)/[Front. Immunol. 13:999034](https://www.frontiersin.org/articles/10.3389/fimmu.2022.999034/full) • [Supplementary](https://www.biorxiv.org/content/biorxiv/early/2022/06/06/2022.06.06.494991/DC1/embed/media-1.pdf) • [code](https://github.com/RosettaCommons/FvHallucinator)\n\n### 2.7 TRFold2-based\n\n[News of TRDesign](https://mp.weixin.qq.com/s/OQzKawtL9RdK9HzYsfu80g)  \n[TIANRANG XLab](https://xlab.tianrang.com/)  \npaper unavailable • [slides](https://pan.baidu.com/share/init?surl=4AOW_D9dwlvC7VGGZA2tmQ\u0026pwd=ffui) • [website](https://xcreator.tianrang.com/auth/login) • commercial • [news](https://mp.weixin.qq.com/s/45Gz7GWOGxHl0i6LXxTUpw)\n\n### 2.8 GPT-based\n\n**Multi-segment preserving sampling for deep manifold sampler**  \nDaniel Berenberg, Jae Hyeon Lee, Simon Kelow, Ji Won Park, Andrew Watkins, Vladimir Gligorijević, Richard Bonneau, Stephen Ra, Kyunghyun Cho  \n[arXiv preprint arXiv:2205.04259 (2022)](https://arxiv.org/abs/2205.04259)\n\n**Preference optimization of protein language models as a multi-objective binder design paradigm**  \nPouria Mistani, Venkatesh Mysore  \n[arXiv:2403.04187](https://arxiv.org/abs/2403.04187)\n\n**HMAMP: Hypervolume-Driven Multi-Objective Antimicrobial Peptides Design**  \nLi Wang, Yiping Li, Xiangzheng Fu, Xiucai Ye, Junfeng Shi, Gary G. Yen, Xiangxiang Zeng  \n[arXiv:2405.00753](https://arxiv.org/abs/2405.00753)\n\n### 2.9 ESM-based\n\n**Generating novel protein sequences using Gibbs sampling of masked language models**  \nSean R. 
Johnson, Sarah Monaco, Kenneth Massie, Zaid Syed  \n[bioRxiv 2021.01.26.428322](https://www.biorxiv.org/content/10.1101/2021.01.26.428322v1) • [code](https://github.com/seanrjohnson/protein_gibbs_sampler)\n\n**A high-level programming language for generative protein design**  \nBrian Hie, Salvatore Candido, Zeming Lin, Ori Kabeli, Roshan Rao, Nikita Smetanin, Tom Sercu, Alexander Rives  \n[bioRxiv 2022.12.21.521526](https://www.biorxiv.org/content/10.1101/2022.12.21.521526v1)\n\n**Language models generalize beyond natural proteins**  \nRobert Verkuil, Ori Kabeli, Yilun Du, Basile IM Wicky, Lukas F Milles, Justas Dauparas, David Baker, Sergey Ovchinnikov, Tom Sercu, Alexander Rives  \n[bioRxiv 2022.12.21.521521](https://www.biorxiv.org/content/10.1101/2022.12.21.521521v1)\n\n**ESMFold Hallucinates Native-Like Protein Sequences**  \nJeliazko R Jeliazkov, Diego del Alamo, Joel D Karpiak  \n[bioRxiv 2023.05.23.541774](https://www.biorxiv.org/content/10.1101/2023.05.23.541774v1)\n\n**Protein Language Model Supervised Precise and Efficient Protein Backbone Design Method**  \nBo Zhang, Kexin Liu, Zhuoqi Zheng, Yunfeiyang Liu, Junxi Mu, Ting Wei, Hai-Feng Chen  \n[bioRxiv 2023.10.26.564121](https://www.biorxiv.org/content/10.1101/2023.10.26.564121v1)/[preprint](https://www.researchsquare.com/article/rs-5450034/v1) • [code](https://github.com/sirius777coder/GPDL) • [Supplementary](https://www.biorxiv.org/content/biorxiv/early/2023/10/30/2023.10.26.564121/DC1/embed/media-1.pdf)\n\n**Unexplored regions of the protein sequence-structure map revealed at scale by a library of foldtuned language models**  \nArjuna M. Subramanian, Matt Thomson  \n[bioRxiv 2023.12.22.573145](https://www.biorxiv.org/content/10.1101/2023.12.22.573145v1)\n\n**Computational scoring and experimental evaluation of enzymes generated by neural networks**  \nSean R. Johnson, Xiaozhi Fu, Sandra Viknander, Clara Goldin, Sarah Monaco, Aleksej Zelezniak \u0026 Kevin K. 
Yang  \n[Nature Biotechnology (2024)](https://www.nature.com/articles/s41587-024-02214-2) • [code](https://github.com/seanrjohnson/protein_scoring)\n\n**Exploring Latent Space for Generating Peptide Analogs Using Protein Language Models**  \nPo-Yu Liang, Xueting Huang, Tibo Duran, Andrew J. Wiemer, Jun Bai  \n[arXiv:2408.08341](https://arxiv.org/abs/2408.08341) • [code](https://github.com/LabJunBMI/Latent-Space-Peptide-Analogues-Generation)\n\n**Designing diverse and high-performance proteins with a large language model in the loop**  \nCarlos A. Gomez-Uribe, Japheth Gado, Meiirbek Islamov  \n[bioRxiv 2024.10.25.620340](https://www.biorxiv.org/content/10.1101/2024.10.25.620340v1)\n\n**Key-cutting machine: A novel optimization framework for tailored protein and peptide design**  \nYan C. Leyva, Marcelo D. T. Torres, Carlos A. Oliva, Cesar de la Fuente-Nunez, Carlos A. Brizuela  \n[bioRxiv 2025.01.05.631393](https://www.biorxiv.org/content/10.1101/2025.01.05.631393v1) • [code](https://github.com/cbrizuel/KCM)\n\n**Improving functional protein generation via foundation model-derived latent space likelihood optimization**  \nChangge Guan, Fangping Wan, Marcelo D. T. Torres, Cesar de la Fuente-Nunez  \n[bioRxiv 2025.01.07.631724](https://www.biorxiv.org/content/10.1101/2025.01.07.631724v1) • [Supplementary](https://www.biorxiv.org/content/biorxiv/early/2025/01/08/2025.01.07.631724/DC1/embed/media-1.docx)\n\n### 2.10 Antiberta-based\n\n**DyAb: sequence-based antibody design and property prediction in a low-data regime**  \nJoshua Yao-Yu Lin, Jennifer L. Hofmann, Andrew Leaver-Fay, Wei-Ching Liang, Stefania Vasilaki, Edith Lee, Pedro O. Pinheiro, Natasa Tagasovska, James R. Kiefer, Yan Wu, Franziska Seeger, Richard Bonneau, Vladimir Gligorijevic, Andrew Watkins, Kyunghyun Cho, Nathan C. 
Frey  \n[bioRxiv 2025.01.28.635353](https://www.biorxiv.org/content/10.1101/2025.01.28.635353v1) • [code](https://github.com/prescient-design/lobster) • [Supplementary](https://www.biorxiv.org/content/biorxiv/early/2025/02/02/2025.01.28.635353/DC1/embed/media-1.pdf)\n\n### 2.11 Boltz-based\n\n**Boltzdesign1: Inverting All-Atom Structure Prediction Model for Generalized Biomolecular Binder Design**  \nYehlin Cho, Martin Pacesa, Zhidian Zhang, Bruno E. Correia, Sergey Ovchinnikov  \n[bioRxiv 2025.04.06.647261](https://www.biorxiv.org/content/10.1101/2025.04.06.647261v1) • [code](https://github.com/yehlincho/BoltzDesign1)\n\n### 2.12 Sampling-algorithms\n\n**AdaLead: A simple and robust adaptive greedy search algorithm for sequence design**  \nSam Sinai, Richard Wang, Alexander Whatley, Stewart Slocum, Elina Locane, Eric D. Kelsic  \n[arXiv preprint arXiv:2010.02141 (2020)](https://arxiv.org/abs/2010.02141) • [code](https://github.com/samsinai/FLEXS)\n\n**Autofocused oracles for model-based design**  \nFannjiang, Clara, and Jennifer Listgarten  \n[Advances in Neural Information Processing Systems 33 (2020)](https://proceedings.neurips.cc/paper/2020/file/972cda1e62b72640cb7ac702714a115f-Paper.pdf)\n\n**An Efficient MCMC Approach to Energy Function Optimization in Protein Structure Prediction**  \nLakshmi A. Ghantasala, Risi Jaiswal, Supriyo Datta  \n[arXiv:2211.03193](https://arxiv.org/abs/2211.03193)\n\n**Plug \u0026 Play Directed Evolution of Proteins with Gradient-based Discrete MCMC**  \nPatrick Emami, Aidan Perreault, Jeffrey Law, David Biagioni, Peter St. 
John  \n[NeurIPS 2022](https://www.mlsb.io/papers_2022/Plug_Play_Directed_Evolution_of_Proteins_with_Gradient_based_Discrete_MCMC.pdf)/[arXiv:2212.09925](https://arxiv.org/abs/2212.09925)\n\n**Importance Weighted Expectation-Maximization for Protein Sequence Design**  \nZhenqiao Song, Lei Li  \n[arXiv:2305.00386](https://arxiv.org/abs/2305.00386)\n\n**Simultaneous enhancement of multiple functional properties using evolution-informed protein design**  \nBenjamin Fram, Ian Truebridge, Yang Su, Adam J. Riesselman, John B. Ingraham, Alessandro Passera, Eve Napier, Nicole N. Thadani, Samuel Lim, Kristen Roberts, Gurleen Kaur, Michael Stiffler, Debora S. Marks, Christopher D. Bahl, Amir R. Khan, Chris Sander, Nicholas P. Gauthier  \n[bioRxiv (2023): 2023-05](https://www.biorxiv.org/content/10.1101/2023.05.09.539914v1) • [Supplementary](https://www.biorxiv.org/content/biorxiv/early/2023/05/09/2023.05.09.539914/DC1/embed/media-1.pdf)\n\n**Optimizing protein fitness using Gibbs sampling with Graph-based Smoothing**  \nAndrew Kirjner, Jason Yim, Raman Samusevich, Tommi Jaakkola, Regina Barzilay, Ila Fiete  \n[arXiv:2307.00494](https://arxiv.org/abs/2307.00494) • [code](https://github.com/kirjner/GGS)\n\n**Reliable algorithm selection for machine learning-guided design**  \nClara Fannjiang, Ji Won Park  \n[arXiv:2503.20767](https://arxiv.org/abs/2503.20767)\n\n**Why risk matters for protein binder design**  \nTudor-Stefan Cotet, Igor Krawczuk  \n[arXiv:2504.00146](https://arxiv.org/abs/2504.00146)\n\n## 3. Function to Scaffold\n\n\u003e These models design backbones/scaffolds/templates represented as Cartesian coordinates, contact maps, distance maps, or φ \u0026 ψ angles. 
Both conditional and unconditional generative models are included.\n\n### 3.1 GAN-based\n\n**Generative modeling for protein structures**  \nAnand, Namrata, and Possu Huang  \n[NeurIPS 2018](https://proceedings.neurips.cc/paper/2018/file/afa299a4d1d8c52e75dd8a24c3ce534f-Paper.pdf)\n\n**Fully differentiable full-atom protein backbone generation**  \nAnand Namrata, Raphael Eguchi, and Po-Ssu Huang  \n[OpenReview ICLR 2019 workshop DeepGenStruct](https://openreview.net/forum?id=SJxnVL8YOV) • without code\n\n**RamaNet: Computational de novo helical protein backbone design using a long short-term memory generative neural network**  \nSabban, Sari, and Mikhail Markovsky  \n[F1000Research 9 (2020)](http://f1000researchdata.s3.amazonaws.com/manuscripts/29106/f45e92eb-5d68-4da0-b918-91ded85d2e7d_22907_-_sari_sabban_v2.pdf) • [code](https://sarisabban.github.io/RamaNet/) • pyRosetta • tensorflow • maximizing the fluorescence of a protein\n\n**A Generative Model for Creating Path Delineated Helical Proteins**  \nNicholas B. Woodall, Ryan Kibler, Basile Wicky, Brian Coventry  \n[bioRxiv 2023.05.24.542095](https://www.biorxiv.org/content/10.1101/2023.05.24.542095v1) • [code](https://github.com/NickWoodall/HelixGen)\n\n### 3.2 AutoEncoder-based\n\n**Conditioning by adaptive sampling for robust design**  \nBrookes, David, Hahnbeom Park, and Jennifer Listgarten  \n[International conference on machine learning. PMLR, 2019](http://proceedings.mlr.press/v97/brookes19a/brookes19a.pdf) • without code\n\n**IG-VAE: generative modeling of immunoglobulin proteins by direct 3D coordinate generation**  \nRaphael R. Eguchi, Christian A. 
Choe, Po-Ssu Huang  \n[bioRxiv (2020)](https://www.biorxiv.org/content/10.1101/2020.08.07.242347v2) • without code\n\n**Generating tertiary protein structures via an interpretative variational autoencoder**  \nXiaojie Guo, Yuanqi Du, Sivani Tadepalli, Liang Zhao, Amarda Shehu  \n[arXiv preprint arXiv:2004.07119 (2020)](https://arxiv.org/abs/2004.07119) • code not available\n\n**Function-guided protein design by deep manifold sampling**  \nVladimir Gligorijevic, Stephen Ra, Daniel Berenberg, Richard Bonneau, Kyunghyun Cho  \n[NeurIPS 2021](https://www.mlsb.io/papers_2021/MLSB2021_Function-guided_protein_design_by.pdf) • without code\n\n**Deep sharpening of topological features for de novo protein design**  \nZander Harteveld, Joshua Southern, Michaël Defferrard, Andreas Loukas, Pierre Vandergheynst, Michael Bronstein, Bruno Correia  \n[ICLR2022 Machine Learning for Drug Discovery. 2022](https://openreview.net/forum?id=DwN81YIXGQP) • code not available\n\n**End-to-End deep structure generative model for protein design**  \nBoqiao Lai, Matthew McPartlon, Jinbo Xu  \n[bioRxiv 2022.07.09.499440](https://www.biorxiv.org/content/10.1101/2022.07.09.499440v1)\n\n**Deep Generative Design of Epitope-Specific Binding Proteins by Latent Conformation Optimization**  \nRaphael R Eguchi, Christian A Choe, Udit Parekh, Irene S Khalek, Michael D Ward, Neha Vithani, Gregory R Bowman, Joseph G Jardine, Possu Huang  \n[bioRxiv 2022.12.22.521698](https://www.biorxiv.org/content/10.1101/2022.12.22.521698v1)\n\n**Leveraging Deep Generative Model For Computational Protein Design And Optimization**  \nBoqiao Lai  \n[arXiv:2408.17241](https://arxiv.org/abs/2408.17241) • PhD thesis\n\n**CyclicCAE: A Conformational Autoencoder for Efficient Heterochiral Macrocyclic Backbone Sampling**  \nAndrew C. Powers, P. 
Douglas Renfrew, Parisa Hosseinzadeh, Vikram Khipple Mulligan  \n[bioRxiv 2025.02.21.639569](https://www.biorxiv.org/content/10.1101/2025.02.21.639569v1)\n\n### 3.3 MLP-based\n\n**A backbone-centred energy function of neural networks for protein design**  \nBin Huang, Yang Xu, Xiuhong Hu, Yongrui Liu, Shanhui Liao, Jiahai Zhang, Chengdong Huang, Jingjun Hong, Quan Chen \u0026 Haiyan Liu  \n[Nature (2022)](https://doi.org/10.1038/s41586-021-04383-5) • [code](https://zenodo.org/record/4533424#.YwP3UPFBwqs)\n\n**De novo Design of Cavity-Containing Proteins with a Backbone-Centered Neural Network Energy Function**  \nYang Xu, Xiuhong Hu, Chenchen Wang, Yongrui Liu, Quan Chen, Haiyan Liu  \n[Structure (2024)](https://www.cell.com/structure/fulltext/S0969-2126(24)00007-8)\n\n### 3.4 Diffusion-based\n\n**Diffusion probabilistic modeling of protein backbones in 3D for the motif-scaffolding problem**  \nBrian L. Trippe, Jason Yim, Doug Tischer, Tamara Broderick, David Baker, Regina Barzilay, Tommi Jaakkola  \n[arXiv:2206.04119](https://arxiv.org/abs/2206.04119v2)/[NeurIPS 2022](https://www.mlsb.io/papers_2022/Diffusion_probabilistic_modeling_of_protein_backbones_in_3D_for_the_motif_scaffolding_problem.pdf)/[ICLR 2023](https://openreview.net/forum?id=6TxBxqNME1Y) • [poster](https://nips.cc/media/PosterPDFs/NeurIPS%202022/d3d9446802a44259755d38e6d163e820.png?t=1667835607.0141048) • [Supplementary](https://openreview.net/attachment?id=6TxBxqNME1Y\u0026name=supplementary_material) • [code](https://github.com/blt2114/ProtDiff_SMCDiff)\n\n**ProteinSGM: Score-based generative modeling for de novo protein design**  \nJin Sub Lee, Philip M Kim  \n[bioRxiv 2022.07.13.499967](https://www.biorxiv.org/content/10.1101/2022.07.13.499967v2)/[Nat Comput Sci (2023)](https://www.nature.com/articles/s43588-023-00440-3) • [code](https://gitlab.com/mjslee0921/proteinsgm)\n\n**Protein structure generation via folding diffusion**  \nKevin E. Wu, Kevin K. Yang, Rianne van den Berg, James Y. 
Zou, Alex X. Lu, Ava P. Amini  \n[arXiv:2209.15611](https://arxiv.org/abs/2209.15611v2)/[Nat Commun 15, 1059 (2024)](https://www.nature.com/articles/s41467-024-45051-2) • [code](https://github.com/microsoft/foldingdiff)\n\n**Generating Novel, Designable, and Diverse Protein Structures by Equivariantly Diffusing Oriented Residue Clouds**  \nYeqing Lin, Mohammed AlQuraishi  \n[arXiv:2301.12485v3](https://arxiv.org/abs/2301.12485v3) • [code](https://github.com/aqlaboratory/genie) • [news](https://www.dw.com/en/generative-ai-inventing-proteins-is-changing-medicine/a-66356415)\n\n**SE(3) diffusion model with application to protein backbone generation**  \nJason Yim, Brian L. Trippe, Valentin De Bortoli, Emile Mathieu, Arnaud Doucet, Regina Barzilay, Tommi Jaakkola  \n[arXiv:2302.02277](https://arxiv.org/abs/2302.02277v2)/[ICLR 2023](https://openreview.net/forum?id=6TxBxqNME1Y) • [code](https://github.com/jasonkyuyim/se3_diffusion) • [Supplementary](https://openreview.net/attachment?id=6TxBxqNME1Y\u0026name=supplementary_material)\n\n**A Latent Diffusion Model for Protein Structure Generation**  \nCong Fu, Keqiang Yan, Limei Wang, Wing Yee Au, Michael McThrow, Tao Komikado, Koji Maruhashi, Kanji Uchino, Xiaoning Qian, Shuiwang Ji  \n[arXiv:2305.04120](https://arxiv.org/abs/2305.04120)\n\n**Practical and Asymptotically Exact Conditional Sampling in Diffusion Models**  \nLuhuan Wu, Brian L. Trippe, Christian A. Naesseth, David M. Blei, John P. Cunningham  \n[arXiv:2306.17775](https://arxiv.org/abs/2306.17775) • [code](https://github.com/blt2114/twisted_diffusion_sampler)\n\n**Dynamics-Informed Protein Design with Structure Conditioning**  \nSimon V. 
Mathis, Urszula Julia Komorowska, Mateja Jamnik, Pietro Lió  \n[WCBICML2023](https://icml-compbio.github.io/2023/papers/WCBICML2023_paper121.pdf)/[ICLR 2024](https://openreview.net/forum?id=jZPqf2G9Sw)\n\n**ForceGen: End-to-end de novo protein generation based on nonlinear mechanical unfolding responses using a protein language diffusion model**  \nBo Ni and David L. Kaplan and M. Buehler  \n[arXiv:2310.10605](https://arxiv.org/abs/2310.10605)/[Science Advances 10.6 (2024)](https://www.science.org/doi/10.1126/sciadv.adl4000) • [Supplementary](https://www.dropbox.com/scl/fi/33tnpd6u2xwermlvj22y9/SI_3_unfolding_movies_from_dataset.zip?rlkey=qno7rcitcdree8t9cj8wzg9sf\u0026dl=0) • [code](https://github.com/lamm-mit/ProteinMechanicsDiffusionDesign)\n\n**DiffSDS: A geometric sequence diffusion model for protein backbone inpainting**  \nAnonymous  \n[ICLR 2024](https://openreview.net/forum?id=2xYO9oxh0y)/[arXiv:2301.09642](https://arxiv.org/abs/2301.09642)\n\n**A framework for conditional diffusion modelling with applications in motif scaffolding for protein design**  \nKieran Didi, Francisco Vargas, Simon V Mathis, Vincent Dutordoir, Emile Mathieu, Urszula J Komorowska, Pietro Lio  \n[arXiv:2312.09236](https://arxiv.org/abs/2312.09236)\n\n**TopoDiff: Improving Protein Backbone Generation with Topology-aware Latent Encoding**  \nYuyang Zhang, Zihui (Zinnia) Ma, Haipeng Gong  \n[bioRxiv 2023.12.13.571602](https://www.biorxiv.org/content/10.1101/2023.12.13.571602v1)\n\n**Improved motif-scaffolding with SE(3) flow matching**  \nJason Yim, Andrew Campbell, Emile Mathieu, Andrew Y. K. Foong, Michael Gastegger, José Jiménez-Luna, Sarah Lewis, Victor Garcia Satorras, Bastiaan S. Veeling, Frank Noé, Regina Barzilay, Tommi S. 
Jaakkola  \n[arXiv:2401.04082](https://arxiv.org/abs/2401.04082)/[TMLR](https://openreview.net/forum?id=fa1ne8xDGn) • [code1](https://github.com/microsoft/frame-flow), [code2](https://github.com/microsoft/protein-frame-flow)\n\n**DiffTopo: Fold exploration using coarse grained protein topology representations**  \nYangyang Miao, Bruno Correia  \n[bioRxiv 2024.02.01.578456](https://www.biorxiv.org/content/10.1101/2024.02.01.578456v1)/ICLR 2024\n\n**Diffusion models in protein structure and docking**  \nJason Yim, Hannes Stärk, Gabriele Corso, Bowen Jing, Regina Barzilay, Tommi S. Jaakkola  \n[Wiley Interdisciplinary Reviews: Computational Molecular Science 14.2 (2024)](https://wires.onlinelibrary.wiley.com/doi/10.1002/wcms.1711) • review\n\n**De novo antibody design with SE(3) diffusion**  \nDaniel Cutting, Frédéric A. Dreyer, David Errington, Constantin Schneider, Charlotte M. Deane  \n[arXiv:2405.07622](https://arxiv.org/abs/2405.07622)\n\n**Out of Many, One: Designing and Scaffolding Proteins at the Scale of the Structural Universe with Genie 2**  \nYeqing Lin, Minji Lee, Zhao Zhang, Mohammed AlQuraishi  \n[arXiv:2405.15489](https://arxiv.org/abs/2405.15489) • [code](https://github.com/aqlaboratory/genie2) • [news](https://www.marktechpost.com/2024/05/29/genie-2-transforming-protein-design-with-advanced-multi-motif-scaffolding-and-enhanced-structural-diversity/)\n\n**Diffuse StructGen-1 (DSG-1)**  \n[the Diffuse team](https://www.linkedin.com/company/diffuse-bio/)  \n[technical appendix](https://diffuse.bio/updates.html#appendix) • commercial\n\n**Floating Anchor Diffusion Model for Multi-motif Scaffolding**  \nKe Liu, Weian Mao, Shuaike Shen, Xiaoran Jiao, Zheng Sun, Hao Chen, Chunhua Shen  \n[ICML 2024](https://proceedings.mlr.press/v235/liu24av.html)/[arXiv:2406.03141](https://arxiv.org/abs/2406.03141) • [code](https://github.com/aim-uofa/FADiff) • [poster](https://icml.cc/virtual/2024/poster/34654)\n\n**De novo Design of A Fusion Protein Tool for GPCR 
Research**  \nKaixuan Gao, Xin Zhang, Jia Nie, Hengyu Meng, Weishe Zhang, Boxue Tian, Xiangyu Liu  \n[bioRxiv 2024.09.14.613090](https://www.biorxiv.org/content/10.1101/2024.09.14.613090v1) • [Supplementary](https://www.biorxiv.org/content/biorxiv/early/2024/09/15/2024.09.14.613090/DC1/embed/media-1.pdf) • RFdiffusion-based\n\n**Text2Protein: A Generative Model for Designated Protein Design on Given Description**  \nRamtin Hosseini, Siyang Zhang, Pengtao Xie  \n[PREPRINT (Version 1) available at Research Square](https://doi.org/10.21203/rs.3.rs-4868665/v1) • [code](https://github.com/szhan227/text2protein)\n\n**Improving diffusion-based protein backbone generation with global-geometry-aware latent encoding**  \nYuyang Zhang, Yuhang Liu, Zinnia Ma, Min Li, Chunfu Xu, Haipeng Gong  \n[bioRxiv 2024.10.05.616664](https://www.biorxiv.org/content/10.1101/2024.10.05.616664v1) • [code](https://github.com/meneshail/TopoDiff)\n\n**Diffusion Posterior Sampling via Sequential Monte Carlo for Zero-Shot Scaffolding of Protein Motifs**  \nYoung, James Matthew Uygongco, and Omer Deniz Akyildiz  \n[Imperial College of Science, Technology and Medicine, 2024](https://matsagad.com/files/papers/MRes_Project.pdf) • [code](https://github.com/matsagad/mres-project) • Master thesis • Genie-based\n\n**Protein A-like Peptide Design Based on Diffusion and ESM2 Models**  \nLong Zhao, Qiang He, Huijia Song, Tianqian Zhou, An Luo, Zhenguo Wen, Teng Wang, and Xiaozhu Lin  \n[Molecules 29.20 (2024)](https://www.mdpi.com/1420-3049/29/20/4965) • [code](https://github.com/tomlongcool/diffusion4Protein)\n\n**FoldMark: Protecting Protein Generative Models with Watermarking**  \nZaixi Zhang, Ruofan Jin, Kaidi Fu, Le Cong, Marinka Zitnik, Mengdi Wang  \n[arXiv:2410.20354](https://arxiv.org/abs/2410.20354) • [code](https://github.com/zaixizhang/FoldMark)\n\n**ProteinWeaver: A Divide-and-Assembly Approach for Protein Backbone Design**  \nYiming Ma, Fei Ye, Yi Zhou, Zaixiang Zheng, Dongyu Xue, 
Quanquan Gu  \n[arXiv:2411.16686](https://arxiv.org/abs/2411.16686)\n\n**On Diffusion Posterior Sampling via Sequential Monte Carlo for Zero-Shot Scaffolding of Protein Motifs**  \nJames Matthew Young, O. Deniz Akyildiz  \n[arXiv:2412.05788](https://arxiv.org/abs/2412.05788) • [code](https://github.com/matsagad/mres-project)\n\n**From thermodynamics to protein design: Diffusion models for biomolecule generation towards autonomous protein engineering**  \nWen-ran Li, Xavier F. Cadet, David Medina-Ortiz, Mehdi D. Davari, Ramanathan Sowdhamini, Cedric Damour, Yu Li, Alain Miranville, Frederic Cadet  \n[arXiv:2501.02680](https://arxiv.org/abs/2501.02680) • review\n\n**RFdiffusion Exhibits Low Success Rate in De Novo Design of Functional Protein Binders for Biochemical Detection**  \nBruce Jiang, Xiaoxiao Li, Amber Guo, Moris Wei, Jonny Wu  \n[bioRxiv 2025.02.07.636769](https://www.biorxiv.org/content/10.1101/2025.02.07.636769v1)\n\n**From Atoms to Fragments: A Coarse Representation for Efficient and Functional Protein Design**  \nLeonardo V Castorina, Christopher W Wood, Kartic Subr  \n[bioRxiv 2025.03.19.644162](https://www.biorxiv.org/content/10.1101/2025.03.19.644162v2) • [Supplementary](https://www.biorxiv.org/content/biorxiv/early/2025/03/20/2025.03.19.644162/DC1/embed/media-1.pdf) • RFdiffusion-based\n\n**Hierarchical Protein Backbone Generation with Latent and Structure Diffusion**  \nJason Yim, Marouane Jaakik, Ge Liu, Jacob Gershon, Karsten Kreis, David Baker, Regina Barzilay, Tommi Jaakkola  \n[ICLR 2025](https://openreview.net/forum?id=J19jKa3wFj)\n\n**The Dance of Atoms-De Novo Protein Design with Diffusion Model**  \nYujie Qin, Ming He, Changyong Yu, Ming Ni, Xian Liu, Xiaochen Bo  \n[arXiv:2504.16479](https://arxiv.org/abs/2504.16479) • review\n\n### 3.5 RL-based\n\n**Top-down design of protein nanomaterials with reinforcement learning**  \nIsaac D Lutz, Shunzhi Wang, Christoffer Norn, Andrew J Borst, Yan Ting Zhao, Annie Dosey, Longxing Cao, Zhe Li, 
Minkyung Baek, Neil P King, Hannele Ruohola-Baker, David Baker  \n[bioRxiv 2022.09.25.509419](https://www.biorxiv.org/content/10.1101/2022.09.25.509419v1)/[Science 380, 266-273 (2023)](https://www.science.org/doi/10.1126/science.adf6591) • [code](https://github.com/idlutz/protein-backbone-MCTS),[code2](https://files.ipd.uw.edu/pub/2023_RL_capsid_design/sequence_design_pipeline.tar)\n\n**Model-based reinforcement learning for protein backbone design**  \nFrederic Renard, Cyprien Courtot, Alfredo Reichlin, Oliver Bent  \n[arXiv:2405.01983](https://arxiv.org/abs/2405.01983)\n\n**Target-based de novo design of cyclic peptide binders**  \nFanhao Wang, Tiantian Zhang, Jintao Zhu, Xiaoling Zhang, Changsheng Zhang, Luhua Lai  \n[bioRxiv 2025.01.18.633746](https://www.biorxiv.org/content/10.1101/2025.01.18.633746v1) • [Supplementary](https://www.biorxiv.org/content/biorxiv/early/2025/01/19/2025.01.18.633746/DC1/embed/media-1.pdf)\n\n### 3.6 Flow-based\n\n**SE(3)-Stochastic Flow Matching for Protein Backbone Generation**  \nAvishek Joey Bose, Tara Akhound-Sadegh, Kilian Fatras, Guillaume Huguet, Jarrid Rector-Brooks, Cheng-Hao Liu, Andrei Cristian Nica, Maksym Korablyov, Michael Bronstein, Alexander Tong  \n[arXiv:2310.02391](https://arxiv.org/abs/2310.02391)/[ICLR 2024](https://openreview.net/forum?id=kJFIH23hXb)\n\n**Fast protein backbone generation with SE(3) flow matching**  \nJason Yim, Andrew Campbell, Andrew Y. K. Foong, Michael Gastegger, José Jiménez-Luna, Sarah Lewis, Victor Garcia Satorras, Bastiaan S. 
Veeling, Regina Barzilay, Tommi Jaakkola, Frank Noé  \n[arXiv:2310.05297](https://arxiv.org/abs/2310.05297) • [code](https://github.com/microsoft/frame-flow)\n\n**Sequence-Augmented SE(3)-Flow Matching For Conditional Protein Backbone Generation**  \nGuillaume Huguet, James Vuckovic, Kilian Fatras, Eric Thibodeau-Laufer, Pablo Lemos, Riashat Islam, Cheng-Hao Liu, Jarrid Rector-Brooks, Tara Akhound-Sadegh, Michael Bronstein, Alexander Tong, Avishek Joey Bose  \n[arXiv:2405.20313](https://arxiv.org/abs/2405.20313)/[NeurIPS 2024](https://openreview.net/forum?id=paYwtPBpyZ) • [website](https://www.dreamfold.ai/blog/foldflow-2) • [lecture](https://www.youtube.com/watch?v=xgA8T9h8mm0)\n\n**Design of Ligand-Binding Proteins with Atomic Flow Matching**  \nJunqi Liu, Shaoning Li, Chence Shi, Zhi Yang, Jian Tang  \n[arXiv:2409.12080](https://arxiv.org/abs/2409.12080)\n\n**Proteina: Scaling Flow-based Protein Structure Generative Models**  \nTomas Geffner, Kieran Didi, Zuobai Zhang, Danny Reidenbach, Zhonglin Cao, Jason Yim, Mario Geiger, Christian Dallago, Emine Kucukbenli, Arash Vahdat, Karsten Kreis  \n[ICLR 2025 Oral](https://openreview.net/forum?id=TVQLu34bdw) • [code](https://github.com/NVIDIA-Digital-Bio/proteina/) • [website](https://research.nvidia.com/labs/genair/proteina/) • [lecture](https://www.youtube.com/watch?v=Y2dRj9_ZEHw)\n\n**ProtComposer: Compositional Protein Structure Generation with 3D Ellipsoids**  \nHannes Stark, Bowen Jing, Tomas Geffner, Jason Yim, Tommi Jaakkola, Arash Vahdat, Karsten Kreis  \n[ICLR 2025 Oral](https://openreview.net/forum?id=0ctvBgKFgc) • [code](https://github.com/NVlabs/protcomposer) • [lecture](https://www.youtube.com/watch?v=2G0d-RePc7c)\n\n### 3.7 Score-based\n\n**Score-Based Generative Models for Designing Binding Peptide Backbones**  \nJohn D Boom, Matthew Greenig, Pietro Sormanni, Pietro Liò  \n[arXiv:2310.07051](https://arxiv.org/abs/2310.07051) • [code](https://github.com/mgreenig/loopgen)\n\n**Building Confidence in Deep 
Generative Protein Design**  \nTianyuan Zheng, Alessandro Rondina, Pietro Liò  \n[arXiv:2411.18568](https://arxiv.org/abs/2411.18568) • [code](https://github.com/ECburx/PROTEVAL)\n\n## 4.Scaffold to Sequence\n\n\u003e Identify the amino acid sequence from given backbone/scaffold/template constraints: torsion angles (φ \u0026 ψ), backbone angles (θ and τ), backbone dihedrals (φ, ψ \u0026 ω), backbone atoms (Cα, N, C, \u0026 O), Cα − Cα distance, unit direction vectors of Cα−Cα, Cα−N \u0026 Cα−C, etc. (a.k.a. inverse folding). Referred from [here](https://arxiv.org/abs/2202.01079). Energy-based models are also included for the task of rotamer conformation (χ angles or atom coordinates) recovery.\n\n### 4.0 Review\n\n**Protein sequence design on given backbones with deep learning**  \nYufeng Liu, Haiyan Liu  \n[Protein Engineering, Design and Selection, 2023](https://academic.oup.com/peds/advance-article-abstract/doi/10.1093/protein/gzad024/7503843)\n\n**Multi-indicator comparative evaluation for deep Learning-Based protein sequence design methods**  \nJinyu Yu, Junxi Mu, Ting Wei, Hai-Feng Chen  \n[Bioinformatics, 2024, btae037](https://academic.oup.com/bioinformatics/advance-article/doi/10.1093/bioinformatics/btae037/7585533)\n\n**Generative AI for Controllable Protein Sequence Design: A Survey**  \nYiheng Zhu, Zitai Kong, Jialu Wu, Weize Liu, Yuqiang Han, Mingze Yin, Hongxia Xu, Chang-Yu Hsieh, Tingjun Hou  \n[arXiv:2402.10516](https://arxiv.org/abs/2402.10516)\n\n### 4.1 MLP-based\n\n**3D representations of amino acids-applications to","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2FPeldom%2Fpapers_for_protein_design_using_DL","html_url":"https://awesome.ecosyste.ms/projects/github.com%2FPeldom%2Fpapers_for_protein_design_using_DL","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2FPeldom%2Fpapers_for_protein_design_using_DL/lists"}