{"id":13752363,"url":"https://github.com/chao1224/ChatDrug","last_synced_at":"2025-05-09T19:32:02.736Z","repository":{"id":169509840,"uuid":"642452221","full_name":"chao1224/ChatDrug","owner":"chao1224","description":"LLM for Drug Editing, ICLR 2024","archived":false,"fork":false,"pushed_at":"2024-05-28T19:44:44.000Z","size":4701,"stargazers_count":141,"open_issues_count":2,"forks_count":8,"subscribers_count":3,"default_branch":"main","last_synced_at":"2024-11-12T13:23:56.892Z","etag":null,"topics":["chatgpt","chatgpt3","conversation","domain-feedback","drug","drug-discovery","drug-editing","editing","llm","molecule","motif","peptide","protein","retrieval","secondary-structure","small-molecule","structure"],"latest_commit_sha":null,"homepage":"https://chao1224.github.io/ChatDrug","language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":null,"status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/chao1224.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":null,"code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null}},"created_at":"2023-05-18T15:42:07.000Z","updated_at":"2024-11-09T07:34:17.000Z","dependencies_parsed_at":"2024-01-14T09:33:38.559Z","dependency_job_id":"aa5ae8f2-03a8-4783-82ab-66f94446990d","html_url":"https://github.com/chao1224/ChatDrug","commit_stats":{"total_commits":24,"total_committers":2,"mean_commits":12.0,"dds":"0.41666666666666663","last_synced_commit":"fb8470b81686c533fb3df1ad96cdb3405dae802d"},"previous_names":["chao1224/chatdrug"],"tags_count":0,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/chao1224%2FChatDrug","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHu
b/repositories/chao1224%2FChatDrug/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/chao1224%2FChatDrug/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/chao1224%2FChatDrug/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/chao1224","download_url":"https://codeload.github.com/chao1224/ChatDrug/tar.gz/refs/heads/main","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":224880777,"owners_count":17385367,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["chatgpt","chatgpt3","conversation","domain-feedback","drug","drug-discovery","drug-editing","editing","llm","molecule","motif","peptide","protein","retrieval","secondary-structure","small-molecule","structure"],"created_at":"2024-08-03T09:01:04.606Z","updated_at":"2024-11-16T05:30:27.231Z","avatar_url":"https://github.com/chao1224.png","language":"Python","funding_links":[],"categories":["Ranked by starred repositories","Machine Learning Tasks and Models"],"sub_categories":["LLM for Biology"],"readme":"# Conversational Drug Editing Using Retrieval and Domain Feedback\n\n**ICLR 2024**\n\nAuthors: Shengchao Liu\u003csup\u003e+\u003c/sup\u003e, Jiongxiao Wang\u003csup\u003e+\u003c/sup\u003e, Yijin Yang, Chengpeng Wang, Ling Liu, Hongyu Guo\u003csup\u003e\\*\u003c/sup\u003e, Chaowei Xiao\u003csup\u003e\\*\u003c/sup\u003e\n\n\u003csup\u003e+\u003c/sup\u003e Equal contribution\u003cbr\u003e\n\u003csup\u003e\\*\u003c/sup\u003e Equal 
advising\n\n[[Paper](https://openreview.net/forum?id=yRrPfKyJQ2)]\n[[Project Page](https://chao1224.github.io/ChatDrug)]\n[[ArXiv](https://arxiv.org/abs/2305.18090)]\n\n\u003cp align=\"center\"\u003e\n  \u003cimg src=\"figure/pipeline.png\" /\u003e \n\u003c/p\u003e\n\n\nChatDrug performs conversational drug editing on three types of drugs:\n- Small Molecules\n- Peptides\n- Proteins\n\u003cp align=\"left\"\u003e\n  \u003cimg src=\"figure/final_demo.gif\" width=\"100%\" /\u003e \n\u003c/p\u003e\n\n## Environment\n\nSet up Anaconda (skip this if you already have conda):\n```bash\nwget https://repo.continuum.io/archive/Anaconda3-2019.10-Linux-x86_64.sh\nbash Anaconda3-2019.10-Linux-x86_64.sh -b\nexport PATH=$PWD/anaconda3/bin:$PATH\n```\n\nThen install the required Python packages:\n```bash\nconda create -n ChatDrug python=3.8\nconda activate ChatDrug\npip install rdkit-pypi==2022.9.4\nconda install -y numpy networkx scikit-learn\nconda install -y -c conda-forge -c pytorch pytorch=1.9.1\n\npip install tensorflow\npip install mhcflurry\npip install levenshtein\n\npip install transformers\npip install lmdb\npip install seqeval\npip install openai\npip install fastchat\npip install psutil\npip install accelerate\n\npip install -e .\n```\n\n## Dataset\n\nWe provide the dataset at [this link](https://huggingface.co/datasets/chao1224/ChatDrug_data). You can download it manually and move it to the `data` folder, or use the following Python script:\n```\nfrom huggingface_hub import snapshot_download\n\nsnapshot_download(repo_id=\"chao1224/ChatDrug_data\", repo_type=\"dataset\", local_dir=\"data\", local_dir_use_symlinks=False, ignore_patterns=[\"README.md\"])\n```\nPlease give credit to the original papers. 
For more details on the dataset, please check the [data folder](./data).\n\n## Evaluation\n\nThe evaluation metrics for the three editing tasks are computed with the tools below:\n| Drug Type | Evaluation |\n| -- | -- |\n| Small Molecule | RDKit (`conda install -y -c rdkit rdkit`)|\n| Peptide | [MHCFlurry](https://github.com/openvax/mhcflurry)|\n| Protein | [ProteinDT paper](https://arxiv.org/abs/2302.04611), [checkpoints](https://huggingface.co/chao1224/ProteinCLAP_pretrain_EBM_NCE_downstream_property_prediction) |\n\nFor evaluation on peptides and proteins, please read the following instructions:\n- For peptides (MHCFlurry), please run the following bash commands:\n```\n\u003e pip install mhcflurry\n\u003e mhcflurry-downloads fetch models_class1_presentation\n\u003e mhcflurry-downloads path models_class1_presentation\n$PATH\n\u003e mv $PATH data/peptide/models_class1_presentation\n```\n- For proteins (ProteinDT / ProteinCLAP), please run the following Python script:\n```\nfrom huggingface_hub import hf_hub_download\n\nhf_hub_download(\n  repo_id=\"chao1224/ProteinCLAP_pretrain_EBM_NCE_downstream_property_prediction\",\n  repo_type=\"model\",\n  filename=\"pytorch_model_ss3.bin\",\n  cache_dir=\"data/protein\")\n```\nPlease give credit to the original papers. For more details on the evaluation, please check the [data folder](./data).\n\n## Prompt for Drug Editing\n\nAll the task prompts are defined in `ChatDrug/task_and_evaluation`. You can also find them on [the Hugging Face link](https://huggingface.co/datasets/chao1224/ChatDrug_prompt).\n\n## Usage\n\nPlease provide your OpenAI API key in `ChatDrug/task_and_evaluation/Conversational_LLMs_utils.py`.\n\nTo run ChatDrug, use the following command:\n```\npython main_ChatDrug.py --task task_id --log_file results/ChatDrug.log --record_file results/ChatDrug.json --C 2\n```\nResults will be saved in `results/`.\n\nFor protein editing tasks, the repeated evaluations in the retrieval process take a lot of time. 
Thus, we provide a fast version of the conversation setting. Run the following command to use the accelerated ChatDrug for protein editing tasks:\n```\npython main_ChatDrug.py --task task_id --log_file results/ChatDrug_fast_protein.log --record_file results/ChatDrug_fast_protein.json --C 2 --fast_protein\n```\n\nWe also provide code for the In-Context Learning setting:\n```\npython main_InContext.py --task task_id --log_file results/InContext.log --record_file results/InContext.json\n```\n\n\n## Cite Us\nFeel free to cite this work if you find it useful!\n\n```\n@inproceedings{liu2024chatdrug,\n    title={Conversational Drug Editing Using Retrieval and Domain Feedback},\n    author={Shengchao Liu and Jiongxiao Wang and Yijin Yang and Chengpeng Wang and Ling Liu and Hongyu Guo and Chaowei Xiao},\n    booktitle={The Twelfth International Conference on Learning Representations},\n    year={2024},\n    url={https://openreview.net/forum?id=yRrPfKyJQ2}\n}\n```\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fchao1224%2FChatDrug","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fchao1224%2FChatDrug","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fchao1224%2FChatDrug/lists"}