{"id":20995171,"url":"https://github.com/seonghwanseo/pharmaconet","last_synced_at":"2025-05-12T00:41:57.117Z","repository":{"id":197765366,"uuid":"699273873","full_name":"SeonghwanSeo/PharmacoNet","owner":"SeonghwanSeo","description":"Official Github for \"PharmacoNet: deep learning-guided pharmacophore modeling for ultra-large-scale virtual screening\" (Chemical Science)","archived":false,"fork":false,"pushed_at":"2025-02-07T07:50:31.000Z","size":15897,"stargazers_count":70,"open_issues_count":2,"forks_count":5,"subscribers_count":2,"default_branch":"main","last_synced_at":"2025-05-12T00:41:46.021Z","etag":null,"topics":["deep-learning","drug-discovery","instance-segmentation","machine-learning","pharmacophore","pharmacophore-modelling","virtual-screening"],"latest_commit_sha":null,"homepage":"https://doi.org/10.1039/D4SC04854G","language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"mit","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/SeonghwanSeo.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null}},"created_at":"2023-10-02T09:55:28.000Z","updated_at":"2025-04-10T03:49:12.000Z","dependencies_parsed_at":null,"dependency_job_id":"85d20ed9-ec84-4106-98ad-bade608d4b1f","html_url":"https://github.com/SeonghwanSeo/PharmacoNet","commit_stats":null,"previous_names":["seonghwanseo/pharmaconet"],"tags_count":7,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/SeonghwanSeo%2FPharmacoNet","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/SeonghwanSeo%2FPharmacoNet/tags","releases_url":"https://repos.ecosyste.m
s/api/v1/hosts/GitHub/repositories/SeonghwanSeo%2FPharmacoNet/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/SeonghwanSeo%2FPharmacoNet/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/SeonghwanSeo","download_url":"https://codeload.github.com/SeonghwanSeo/PharmacoNet/tar.gz/refs/heads/main","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":253655919,"owners_count":21943072,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["deep-learning","drug-discovery","instance-segmentation","machine-learning","pharmacophore","pharmacophore-modelling","virtual-screening"],"created_at":"2024-11-19T07:22:15.229Z","updated_at":"2025-05-12T00:41:57.069Z","avatar_url":"https://github.com/SeonghwanSeo.png","language":"Python","funding_links":[],"categories":[],"sub_categories":[],"readme":"# PharmacoNet: Open-source Protein-based Pharmacophore Modeling\n\n[![DOI](https://zenodo.org/badge/699273873.svg)](https://zenodo.org/doi/10.5281/zenodo.12168474)\n[![license: MIT](https://img.shields.io/badge/License-MIT-purple.svg)](LICENSE)\n\n**Before using PharmacoNet, consider using [OpenPharmaco](https://github.com/SeonghwanSeo/OpenPharmaco): GUI powered by PharmacoNet.**\n\n**Chemical Science (Open Access)** [[paper](https://doi.org/10.1039/D4SC04854G)]\n\nOfficial Github for **_PharmacoNet: deep learning-guided pharmacophore modeling for ultra-large-scale virtual screening_** by Seonghwan Seo\\* and Woo Youn Kim.\n\nPharmacoNet is an extremely rapid yet reasonably accurate ligand evaluation tool 
with high generalization ability:\n\n1. Fully automated protein-based pharmacophore modeling via image instance segmentation\n2. Coarse-grained graph matching at the pharmacophore level for high-throughput virtual screening\n3. Pharmacophore-aware scoring function with parameterized analytical function for robust generalization ability\n4. Better pocket representation for deep learning developers ([section](#pharmacophore-feature-extraction))\n\nIf you have any problems or need help with the code, please open a GitHub issue or contact [shwan0106@kaist.ac.kr](mailto:shwan0106@kaist.ac.kr).\n\n\\* You can read the previous NeurIPS 2023 Workshop version at [arXiv](https://arxiv.org/abs/2310.00681).\n\n## Table of Contents\n\n- [Quick Start](#quick-start)\n- [Installation](#installation)\n- [Pharmacophore Modeling](#pharmacophore-modeling)\n- [Virtual Screening](#virtual-screening)\n- [Pharmacophore Feature Extraction](#pharmacophore-feature-extraction)\n- [Pre-trained Docking Proxy](#pretrained-docking-proxy)\n- [Citation](#citation)\n\n## Quick Start\n\n```bash\n# Pharmacophore Modeling\npython modeling.py --pdb \u003cPDB ID\u003e   # RCSB PDB importing\npython modeling.py --protein \u003cPROTEIN_PATH\u003e --prefix \u003cEXP_NAME\u003e --cuda  # CUDA acceleration\npython modeling.py --protein \u003cPROTEIN_PATH\u003e --prefix \u003cEXP_NAME\u003e --ref_ligand \u003cREF_LIGAND_PATH\u003e\n\n# Virtual Screening\npython screening.py -p \u003cMODEL_PATH\u003e --library \u003cLIBRARY_DIR\u003e --out \u003cRESULT_PATH\u003e --cpus \u003cNCPU\u003e\n\n# Feature Extraction for Deep Learning Researchers\npython feature_extraction.py --protein \u003cPROTEIN_PATH\u003e --ref_ligand \u003cREF_LIGAND_PATH\u003e --out \u003cSAVE_PKL_PATH\u003e\npython feature_extraction.py --protein \u003cPROTEIN_PATH\u003e --center \u003cX\u003e \u003cY\u003e \u003cZ\u003e --out \u003cSAVE_PKL_PATH\u003e --cuda\n```\n\n## Installation\n\n- Using `environment.yml`\n  For various 
environments including Linux, macOS, and Windows, the script installs the **CPU-only version of PyTorch** by default. You can install a CUDA-enabled version by modifying `environment.yml` or installing PyTorch manually.\n\n  ```bash\n  conda env create -f environment.yml\n  conda activate pmnet\n  pip install .\n  ```\n\n- Manual Installation\n\n  ```bash\n  # Requires python\u003e=3.9; performance is best on newer versions. (3.9, 3.10, 3.11, 3.12(best))\n  conda create --name pmnet python=3.12 pymol-open-source=3.0.0 numpy=1.26.4\n  conda activate pmnet\n\n  pip install torch # 1.13\u003c=torch\u003c=2.5.1, CUDA acceleration is available. 1min for 1 cpu, 10s for 1 gpu\n  pip install rdkit biopython omegaconf tqdm numba # Numba is optional, but recommended.\n  pip install molvoxel # Molecular voxelization tools with minimal dependencies (https://github.com/SeonghwanSeo/molvoxel.git)\n  ```\n\n- Installation for Proxy Model (For DL developers)\n\n  ```bash\n  # in your project (quote the requirement so the shell passes it to pip as one argument)\n  pip install \"pharmaconet @ git+https://github.com/SeonghwanSeo/PharmacoNet.git\"\n  ```\n\n## Pharmacophore Modeling\n\nYou can run `modeling.py` for automated protein-based pharmacophore modeling with an RCSB PDB code (`--pdb`) or a custom protein path (`--protein`). 
With protein path, you should enter `--prefix`.\n\n#### Example with RCSB PDB Code\n\nThe pharmacophore model file is `result/6oim/6oim_D_MOV_model.pm` and the pymol session file is `result/6oim/6oim_D_MOV_model.pse`\n\n```bash\n# Pharmacophore Modeling for KRAS(G12C) - PDBID: 6OIM\n\u003e python modeling.py --pdb 6oim\nINFO:root:Load PharmacoNet finish\nINFO:root:Download 6oim to result/6oim/6oim.pdb\n==============================\n\nINFO:root:A total of 3 ligand(s) are detected!\nLigand 1\n- ID      : MG (Chain: B [auth A])\n- Center  : -2.512, 2.588, 0.220\n- Name    : MAGNESIUM ION\n\nLigand 2\n- ID      : GDP (Chain: C [auth A])\n- Center  : -6.125, 3.588, 7.310\n- Name    : GUANOSINE-5-DIPHOSPHATE\n\nLigand 3\n- ID      : MOV (Chain: D [auth A])\n- Center  : 1.872, -8.260, -1.361\n- Name    : AMG 510 (BOUND FORM)\n- Synonyms: 6-FLUORO-7-(2-FLUORO-6-HYDROXYPHENYL)-4-[(2S)-2-METHYL-4-PROPANOYLPIPERAZIN-1-YL]-1-[4-METHYL-2-(PROPAN-2-YL)PYRIDIN-3-YL]PYRIDO[2,3-D]PYRIMIDIN-2(1H)-ONE\n\nINFO:root:Select the ligand number(s) (ex. 
3 ; 1,3 ; manual ; all ; exit)\nligand number:3 # USER INPUT: Enter the ligand number for binding site detection\nINFO:root:Running 3th Ligand...\nLigand 3\n- ID      : MOV (Chain: D [auth A])\n- Center  : 1.872, -8.260, -1.361\n- Name    : AMG 510 (BOUND FORM)\n- Synonyms: 6-FLUORO-7-(2-FLUORO-6-HYDROXYPHENYL)-4-[(2S)-2-METHYL-4-PROPANOYLPIPERAZIN-1-YL]-1-[4-METHYL-2-(PROPAN-2-YL)PYRIDIN-3-YL]PYRIDO[2,3-D]PYRIMIDIN-2(1H)-ONE\nINFO:root:Save Pharmacophore Model to result/6oim/6oim_D_MOV_model.pm\nINFO:root:Save Pymol Visualization Session to result/6oim/6oim_D_MOV_model.pse\n```\n\n#### Example with custom protein\n\n```bash\n# With reference ligand.\n\u003e python modeling.py --protein ./examples/6OIM_protein.pdb --ref_ligand ./examples/6OIM_D_MOV.pdb --prefix 6oim\nINFO:root:Load PharmacoNet finish\nINFO:root:Load examples/6OIM_protein.pdb\nINFO:root:Using center of examples/6oim_D_MOV.pdb as center of box\nINFO:root:Save Pharmacophore Model to result/6oim/6oim_6oim_D_MOV_model.pm\nINFO:root:Save Pymol Visualization Session to result/6oim/6oim_6oim_D_MOV_model.pse\n\n# Without reference ligand -\u003e center is required.\n\u003e python modeling.py --protein ./examples/6OIM_protein.pdb --prefix 6oim\nINFO:root:Load PharmacoNet finish\nINFO:root:Load examples/6OIM_protein.pdb\nWARNING:root:No ligand is detected!\nINFO:root:Enter the center of binding site manually:\nx: 2 # USER INPUT: Enter x\ny: -8 # USER INPUT: Enter y\nz: -1 # USER INPUT: Enter z\nINFO:root:Using center (2.0, -8.0, -1.0)\nINFO:root:Save Pharmacophore Model to result/6OIM/6OIM_2.0_-8.0_-1.0_model.pm\nINFO:root:Save Pymol Visualization Session to result/6OIM/6OIM_2.0_-8.0_-1.0_model.pse\n```\n\n#### Example with custom model weight file (offline)\n\nPharmacoNet's weight file is automatically downloaded during `modeling.py`.\nIf your environment is offline, you can download the weight files from [Google Drive](https://drive.google.com/uc?id=1gzjdM7bD3jPm23LBcDXtkSk18nETL04p).\n\n```bash\n\u003e 
python modeling.py --pdb 6oim --weight_path \u003cWEIGHT_PATH\u003e\n```\n\n## Virtual Screening\n\nWe provide the simple script for screening.\n\n```bash\n# Default Parameter Setting (Cation/Anion: 8, Aromatic/Halogen/HBA/HBD: 4, Hydrophobic: 1)\npython screening.py -p \u003cMODEL_PATH\u003e --library \u003cLIBRARY_DIR\u003e --out \u003cRESULT_PATH\u003e --cpus \u003cNCPU\u003e\n\n# Custom Parameters Setting\npython screening.py -p \u003cMODEL_PATH\u003e --library \u003cLIBRARY_DIR\u003e --out \u003cRESULT_PATH\u003e --cpus \u003cNCPU\u003e \\\n  --anion \u003cANION\u003e --cation \u003cCATION\u003e --aromatic \u003cAROMATIC\u003e \\\n  --hbd \u003cHBD\u003e --hba \u003cHBA\u003e --halogen \u003cHALOGEN\u003e --hydrophobic \u003cHYDROPHOBIC\u003e\n\n# Example\npython screening.py -p ./result/6oim/6oim_D_MOV_model.pm --library examples/library --out result.csv --cpus 1\npython screening.py -p ./result/6oim/6oim_D_MOV_model.pm --library examples/library --out result.csv --cpus 2 --hbd 5 --hba 5 --aromatic 8\n```\n\n#### Example python code for ligand evaluation\n\nAlso, it can be easily included in your custom script via the python code below. 
(\\* Multiprocessing is allowed)\n\n```python\nfrom pmnet import PharmacophoreModel\nmodel = PharmacophoreModel.load(\u003cPHARMACOPHORE_MODEL_PATH\u003e)\n\n# NOTE: Scoring with ligand file with 1 or more conformers\nscore = model.scoring_file(\u003cLIGAND_PATH\u003e) # SDF, MOL2, PDB\n\n# NOTE: Scoring with RDKit ETKDG Conformers\nscore = model.scoring_smiles(\u003cSMILES\u003e, \u003cNUM_CONFORMERS\u003e)\n```\n\n## Pharmacophore Feature Extraction\n\n**_See: [`./developer/`](/developer/), [`./src/pmnet_appl/`](/src/pmnet_appl/)._**\n\nFor deep learning researchers who want to use PharmacoNet as a pre-trained model for feature extraction, we provide a Python API.\n\n```python\nfrom pmnet.api import PharmacoNet, get_pmnet_dev, ProteinParser\nmodule: PharmacoNet = get_pmnet_dev('cuda') # default: score_threshold=0.5 (lower threshold: more features)\n\n# End-to-End calculation\npmnet_attr = module.feature_extraction(\u003cPROTEIN_PATH\u003e, ref_ligand_path=\u003cREF_LIGAND_PATH\u003e)\npmnet_attr = module.feature_extraction(\u003cPROTEIN_PATH\u003e, center=(\u003cCENTER_X\u003e, \u003cCENTER_Y\u003e, \u003cCENTER_Z\u003e))\n\n# Step-wise calculation\n## In Dataset\nparser = ProteinParser(center_noise=\u003cCENTER_NOISE\u003e) # center_noise: for data augmentation\n## In Model (frozen, method is decorated by torch.no_grad())\npmnet_attr = module.run_extraction(protein_data)\n\n\"\"\"\npmnet_attr = (multi_scale_features, hotspot_infos)\n- multi_scale_features: tuple[Tensor, Tensor, Tensor, Tensor, Tensor]:\n    - [96, 4, 4, 4], [96, 8, 8, 8], [96, 16, 16, 16], [96, 32, 32, 32], [96, 64, 64, 64]\n- hotspot_infos: list[hotspot_info]\n    hotspot_info: dict[str, Any]\n      - hotspot_feature: Tensor [192,]\n      - hotspot_position: tuple[float, float, float] - (x, y, z)\n      - hotspot_score: float in [0, 1]\n      - nci_type: str (10 types)\n          'Hydrophobic': Hydrophobic interaction\n          'PiStacking_P': PiStacking (Parallel)\n          'PiStacking_T': 
PiStacking (T-shaped)\n          'PiCation_lring': Interaction btw Protein Cation \u0026 Ligand Aromatic Ring\n          'PiCation_pring': Interaction btw Protein Aromatic Ring \u0026 Ligand Cation\n          'SaltBridge_pneg': SaltBridge btw Protein Anion \u0026 Ligand Cation\n          'SaltBridge_lneg': SaltBridge btw Protein Cation \u0026 Ligand Anion\n          'XBond': Halogen Bond\n          'HBond_pdon': Hydrogen Bond btw Protein Donor \u0026 Ligand Acceptor\n          'HBond_ldon': Hydrogen Bond btw Protein Acceptor \u0026 Ligand Donor\n\n      # Features obtained from `nci_type`, i.e. `nci_type` is all you need.\n      - hotspot_type: str (7 types)\n          {'Hydrophobic', 'Aromatic', 'Cation', 'Anion',\n           'Halogen', 'HBond_donor', 'HBond_acceptor'}\n      - point_type: str (7 types)\n          {'Hydrophobic', 'Aromatic', 'Cation', 'Anion',\n           'Halogen', 'HBond_donor', 'HBond_acceptor'}\n\"\"\"\n```\n\n## Pretrained Docking Proxy\n\n**_See: [`./src/pmnet_appl/`](/src/pmnet_appl/)._**\n\nWe provide pre-trained docking proxy models which predict docking score against arbitrary protein using PharmacoNet.\nWe hope this implementation prompts the molecule optimization.\n\nIf you use this implementation, please cite PharmacoNet with original papers.\n\nImplementation List:\n\n- TacoGFN: Target-conditioned GFlowNet for Structure-based Drug Design [[paper](https://arxiv.org/abs/2310.03223)]\n\nRelated Works:\n\n- RxnFlow: Generative Flows on Synthetic Pathway for Drug Design [[paper](https://arxiv.org/abs/2410.04542)]\n\n## Citation\n\nPaper on [Chemical Science](https://doi.org/10.1039/D4SC04854G), [arXiv](https://arxiv.org/abs/2310.00681).\n\n```bibtex\n@article{seo2024pharmaconet,\n  title={PharmacoNet: deep learning-guided pharmacophore modeling for ultra-large-scale virtual screening},\n  author={Seo, Seonghwan and Kim, Woo Youn},\n  journal={Chemical Science},\n  year={2024},\n  publisher={Royal Society of 
Chemistry}\n}\n```\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fseonghwanseo%2Fpharmaconet","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fseonghwanseo%2Fpharmaconet","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fseonghwanseo%2Fpharmaconet/lists"}