{"id":19839843,"url":"https://github.com/qdata/deepvhppi","last_synced_at":"2025-05-01T19:30:26.434Z","repository":{"id":83652139,"uuid":"316011074","full_name":"QData/DeepVHPPI","owner":"QData","description":"Motif Transformers for Predicting Protein-Protein Interactions Between a Novel Virus and Humans","archived":false,"fork":false,"pushed_at":"2024-12-04T16:57:49.000Z","size":893,"stargazers_count":10,"open_issues_count":0,"forks_count":2,"subscribers_count":3,"default_branch":"main","last_synced_at":"2025-04-06T17:05:21.878Z","etag":null,"topics":[],"latest_commit_sha":null,"homepage":null,"language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":null,"status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/QData.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":null,"code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null}},"created_at":"2020-11-25T17:35:36.000Z","updated_at":"2025-03-26T10:41:18.000Z","dependencies_parsed_at":"2023-03-12T18:59:25.243Z","dependency_job_id":null,"html_url":"https://github.com/QData/DeepVHPPI","commit_stats":null,"previous_names":[],"tags_count":0,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/QData%2FDeepVHPPI","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/QData%2FDeepVHPPI/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/QData%2FDeepVHPPI/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/QData%2FDeepVHPPI/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/QData","download_url":"https://codeload.github.com/QData/DeepVHPPI/tar.gz/refs/heads/
main","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":251932525,"owners_count":21667159,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":[],"created_at":"2024-11-12T12:24:34.403Z","updated_at":"2025-05-01T19:30:26.429Z","avatar_url":"https://github.com/QData.png","language":"Python","funding_links":[],"categories":[],"sub_categories":[],"readme":"### Title: Transfer Learning for Predicting Virus-Host Protein Interactions for Novel Virus Sequences\n\n+ authors: Jack Lanchantin, Tom Weingarten, Arshdeep Sekhon, Clint Miller, Yanjun Qi\n+ 2021 ACM Conference on Bioinformatics, Computational Biology, and Health Informatics (ACM BCB)\n\n\n### PDF\n\n- @[BioArxiv](https://www.biorxiv.org/content/10.1101/2020.12.14.422772v2)\n- @[ACM](https://dl.acm.org/doi/abs/10.1145/3459930.3469527)\n- GitHub [https://github.com/QData/DeepVHPPI](https://github.com/QData/DeepVHPPI)\n\n\n### Talk: [Slide](https://docs.google.com/presentation/d/1LfSVsZ2hSy7F-AVXt1WIUfT2Qsw-db7VITZhmifAniQ/edit)\n\n\n### Abstract\n\nViruses such as SARS-CoV-2 infect the human body by forming interactions between virus proteins and human proteins. However, experimental methods to find protein interactions are inadequate: large scale experiments are noisy, and small scale experiments are slow and expensive. Inspired by the recent successes of deep neural networks, we hypothesize that deep learning methods are well-positioned to aid and augment biological experiments, hoping to help identify more accurate virus-host protein interaction maps. 
Moreover, computational methods can quickly adapt to predict how virus mutations change protein interactions with the host proteins.\n\nWe propose DeepVHPPI, a novel deep learning framework combining a self-attention-based transformer architecture and a transfer learning training strategy to predict interactions between human proteins and virus proteins that have novel sequence patterns. We show that our approach outperforms the state-of-the-art methods significantly in predicting Virus–Human protein interactions for SARS-CoV-2, H1N1, and Ebola. In addition, we demonstrate how our framework can be used to predict and interpret the interactions of mutated SARS-CoV-2 Spike protein sequences.\n\nWe make all of our data and code available on GitHub [https://github.com/QData/DeepVHPPI](https://github.com/QData/DeepVHPPI).\n\n\n![demo1](zmedia/deepVH2.png)\n![demo2](zmedia/deepVH3.png)\n![demo3](zmedia/deepVH4.png)\n\n\n### Citations\n\n```\n@article {Lanchantin2020.12.14.422772,\n\tauthor = {Lanchantin, Jack and Weingarten, Tom and Sekhon, Arshdeep and Miller, Clint and Qi, Yanjun},\n\ttitle = {Transfer Learning for Predicting Virus-Host Protein Interactions for Novel Virus Sequences},\n\telocation-id = {2020.12.14.422772},\n\tyear = {2021},\n\tdoi = {10.1101/2020.12.14.422772},\n\tpublisher = {Cold Spring Harbor Laboratory},\n\tabstract = {Viruses such as SARS-CoV-2 infect the human body by forming interactions between virus proteins and human proteins. However, experimental methods to find protein interactions are inadequate: large scale experiments are noisy, and small scale experiments are slow and expensive. Inspired by the recent successes of deep neural networks, we hypothesize that deep learning methods are well-positioned to aid and augment biological experiments, hoping to help identify more accurate virus-host protein interaction maps. 
Moreover, computational methods can quickly adapt to predict how virus mutations change protein interactions with the host proteins. We propose DeepVHPPI, a novel deep learning framework combining a self-attention-based transformer architecture and a transfer learning training strategy to predict interactions between human proteins and virus proteins that have novel sequence patterns. We show that our approach outperforms the state-of-the-art methods significantly in predicting Virus{\\textendash}Human protein interactions for SARS-CoV-2, H1N1, and Ebola. In addition, we demonstrate how our framework can be used to predict and interpret the interactions of mutated SARS-CoV-2 Spike protein sequences. Availability: We make all of our data and code available on GitHub https://github.com/QData/DeepVHPPI. ACM Reference Format: Jack Lanchantin, Tom Weingarten, Arshdeep Sekhon, Clint Miller, and Yanjun Qi. 2021. Transfer Learning for Predicting Virus-Host Protein Interactions for Novel Virus Sequences. In Proceedings of ACM Conference (ACM-BCB). ACM, New York, NY, USA, 10 pages. 
https://doi.org/?? Competing Interest Statement: The authors have declared no competing interest.},\n\tURL = {https://www.biorxiv.org/content/early/2021/06/08/2020.12.14.422772},\n\teprint = {https://www.biorxiv.org/content/early/2021/06/08/2020.12.14.422772.full.pdf},\n\tjournal = {bioRxiv}\n}\n\n```\n\n# How to get the data\n```\nwget https://www.cs.virginia.edu/~yq2h/jack/ppi/deepvhppi.tar.gz\ntar -xvf deepvhppi.tar.gz\n```\n\n# How to run the code \n\n**SARS-CoV-2 PPI**\n```\nCUDA_VISIBLE_DEVICES=0,1,2,3 python main.py --data_root ./data/ -tr yang/train.json -va yang/test.json -te  HVPPI/test.json -v vocab.data -s 1024 -hs 512 -l 12  -o results  --lr 0.00001 --dropout 0.1 --epochs 200 --attn_heads 8 --activation 'gelu' --task biogrid  --emb_type 'conv' --overwrite  --batch_size 4 --grad_ac_steps 4\n```\n\n**ZHOU PPI**\n```\nCUDA_VISIBLE_DEVICES=0,1,2,3 python main.py --data_root ./data/ -tr zhou/h1n1/human/train.json  -va zhou/h1n1/human/test.json -v vocab.data -s 1024 -hs 512 -l 12  -o results --lr 0.00001 --dropout 0.1 --epochs 20000 --attn_heads 8 --activation 'gelu' --task ppi --emb_type 'conv' --overwrite  --batch_size 8 --grad_ac_steps 2 --name '' \n```\n\n**BARMAN PPI**\n```\nCUDA_VISIBLE_DEVICES=0,1,2,3 python main.py --data_root ./data/ -tr barman/train1.json  -va barman/test1.json -v vocab.data -s 1600 -hs 512 -l 12  -o results  --lr 0.00001 --dropout 0.1 --epochs 200 --attn_heads 8 --activation 'gelu' --task ppi  --emb_type 'conv' --overwrite  --batch_size 4 --grad_ac_steps 4\n```\n\n**DeNovo SLIM PPI**\n```\nCUDA_VISIBLE_DEVICES=0,1,2,3 python main.py --data_root ./data/ -tr denovo/train.json  -va denovo/test.json -v vocab.data -s 1024 -hs 512 -l 12  -o results --lr 0.00001 --dropout 0.1 --epochs 20000 --attn_heads 8 --activation 'gelu' --task ppi --emb_type 'conv' --overwrite  --batch_size 8 --grad_ac_steps 2 --name '' --saved_bert ./results/multi.bert.bsz_16.layers_12.size_512.heads_8.drop_10.lr_1e-05.saved_bert.torch/best_model.pt\n```\n\n\n### 
Support or Contact\n\nHaving trouble with our tools? Please [contact Jack](mailto:jacklanchantin@gmail.com) and we’ll help you sort it out.\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fqdata%2Fdeepvhppi","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fqdata%2Fdeepvhppi","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fqdata%2Fdeepvhppi/lists"}