{"id":11800693,"url":"https://github.com/durrantlab/deepfrag","last_synced_at":"2025-12-29T05:59:24.153Z","repository":{"id":213156541,"uuid":"657851609","full_name":"durrantlab/deepfrag","owner":"durrantlab","description":"DeepFrag is a deep convolutional neural network that guides ligand optimization by extending a ligand with a molecular fragment, such that the resulting extension is also highly complementary to the receptor. The DeepFrag web application is also a useful tool for teaching students about medicinal chemistry and lead optimization.","archived":false,"fork":false,"pushed_at":"2025-06-30T01:31:28.000Z","size":34913,"stargazers_count":25,"open_issues_count":2,"forks_count":5,"subscribers_count":0,"default_branch":"main","last_synced_at":"2025-06-30T02:34:57.597Z","etag":null,"topics":[],"latest_commit_sha":null,"homepage":"https://durrantlab.pitt.edu/deepfragmodel/","language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"apache-2.0","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/durrantlab.png","metadata":{"files":{"readme":"README.md","changelog":"CHANGES.md","contributing":"CONTRIBUTING.md","funding":null,"license":"LICENSE.md","code_of_conduct":"CODE_OF_CONDUCT.md","threat_model":null,"audit":null,"citation":null,"codeowners":"CODEOWNERS","security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null,"zenodo":null}},"created_at":"2023-06-24T02:45:02.000Z","updated_at":"2025-06-30T01:31:31.000Z","dependencies_parsed_at":"2023-12-22T20:54:10.988Z","dependency_job_id":"30ca84ee-5003-4dc0-93d3-4a5ca2c0fa11","html_url":"https://github.com/durrantlab/deepfrag","commit_stats":null,"previous_names":["durrantlab/deepfrag"],"tags_count":5,"template":false,"template_full_name":null,"purl":"pkg:github/durrantlab/deepfrag","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/dur
rantlab%2Fdeepfrag","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/durrantlab%2Fdeepfrag/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/durrantlab%2Fdeepfrag/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/durrantlab%2Fdeepfrag/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/durrantlab","download_url":"https://codeload.github.com/durrantlab/deepfrag/tar.gz/refs/heads/main","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/durrantlab%2Fdeepfrag/sbom","scorecard":null,"host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":279007834,"owners_count":26084368,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","status":"online","status_checked_at":"2025-10-11T02:00:06.511Z","response_time":55,"last_error":null,"robots_txt_status":"success","robots_txt_updated_at":"2025-07-24T06:49:26.215Z","robots_txt_url":"https://github.com/robots.txt","online":true,"can_crawl_api":true,"host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":[],"created_at":"2024-06-28T07:42:39.764Z","updated_at":"2025-10-11T16:30:44.850Z","avatar_url":"https://github.com/durrantlab.png","language":"Python","funding_links":[],"categories":["Drug discovery, drug design"],"sub_categories":["Web services_Other"],"readme":"# DeepFrag\n\nDeepFrag is a machine learning model for fragment-based lead optimization. In\nthis repository, you will find code to train the model and code to run\ninference using a pre-trained model.\n\n## Citation\n\nIf you use DeepFrag in your research, please cite as:\n\nGreen, H., Koes, D. 
R., \u0026 Durrant, J. D. (2021). DeepFrag: a deep\nconvolutional neural network for fragment-based lead optimization. Chemical\nScience.\n\n```tex\n@article{green2021deepfrag,\n  title={DeepFrag: a deep convolutional neural network for fragment-based lead optimization},\n  author={Green, Harrison and Koes, David Ryan and Durrant, Jacob D},\n  journal={Chemical Science},\n  year={2021},\n  publisher={Royal Society of Chemistry}\n}\n```\n\n## Usage\n\nThere are three ways to use DeepFrag:\n\n1. **DeepFrag Browser App**: We have released a free, open-source browser app\n   for DeepFrag that requires no setup and does not transmit any structures to\n   a remote server.\n    - View the online version at\n      [durrantlab.pitt.edu/deepfrag](https://durrantlab.pitt.edu/deepfrag/)\n    - See the code at\n      [git.durrantlab.pitt.edu/jdurrant/deepfrag-app](https://git.durrantlab.pitt.edu/jdurrant/deepfrag-app)\n2. **DeepFrag CLI**: In this repository we have included a `deepfrag.py`\n   script that can perform common prediction tasks using the API.\n    - See the `DeepFrag CLI` section below\n3. **DeepFrag API**: For custom tasks or fine-grained control over\n   predictions, you can invoke the DeepFrag API directly and interface with\n   the raw data structures and the PyTorch model. We have created an example\n   Google Colab (Jupyter notebook) that demonstrates how to perform manual\n   predictions.\n    - See the interactive\n      [Colab](https://colab.research.google.com/drive/1If8rWQ9aVKJyJwfaOql56mA2llqC0iur).\n\n## DeepFrag CLI\n\nThe DeepFrag CLI is invoked by running `python3 deepfrag.py` in this\nrepository. The CLI requires a pre-trained model and the fragment library to\nrun. 
You will be prompted to download both when you first run the CLI and\nthese will be saved in the `./.store` directory.\n\n### Structure (specify exactly one)\n\nSpecify the input structures either as separate receptor and ligand PDB\nfiles or as a PDB ID plus the ligand residue number.\n\n- `--receptor \u003crec.pdb\u003e --ligand \u003clig.pdb\u003e`\n- `--pdb \u003cpdbid\u003e --resnum \u003cresnum\u003e`\n\n### Connection Point (specify exactly one)\n\nDeepFrag will predict new fragments that connect to the _connection point_ via\na single bond. You must specify the connection point atom using one of the\nfollowing:\n\n- `--cname \u003cname\u003e`: Specify the connection point by atom name (e.g. `C3`,\n  `N5`, `O2`, ...).\n- `--cx \u003cx\u003e --cy \u003cy\u003e --cz \u003cz\u003e`: Specify the connection point by atomic\n  coordinate. DeepFrag will find the closest atom to this point.\n\n### Fragment Removal (optional) (specify exactly one)\n\nIf you are using DeepFrag for fragment _replacement_, you must first remove\nthe original fragment from the ligand structure. You can either do this by\nhand (e.g. by editing the PDB file) or let DeepFrag do it for you by specifying\n_which_ fragment should be removed.\n\n_Note: predicting fragments in place of hydrogen atoms (e.g. protons) does not\nrequire any fragment removal since hydrogen atoms are ignored by the model._\n\nTo remove a fragment, you specify a second atom that is contained in the\nfragment. Like the connection point, you can either use the atom name or the\natom coordinate.\n\n- `--rname \u003cname\u003e`: Specify the fragment to remove by the name of an atom it\n  contains (e.g. `C3`, `N5`, `O2`, ...).\n- `--rx \u003cx\u003e --ry \u003cy\u003e --rz \u003cz\u003e`: Specify the fragment to remove by atomic\n  coordinate. 
DeepFrag will find the closest atom to this point.\n\n### Output (optional)\n\nBy default, DeepFrag will print a list of fragment predictions to stdout\nsimilar to the [Browser App](https://durrantlab.pitt.edu/deepfrag/).\n\n- `--out \u003cout.csv\u003e`: Save predictions in CSV format to `out.csv`. Each line\n  contains the fragment rank, score and SMILES string.\n\n### Miscellaneous (optional)\n\n- `--full`: Generate SMILES strings with the full ligand structure instead of\n  just the fragment. (__IMPORTANT NOTE__: Bond orders are not assigned to the\n  parent portion of the full ligand structure. These must be added manually.)\n- `--cpu/--gpu`: DeepFrag will attempt to detect whether a CUDA GPU is available and\n  fall back to the CPU if it is not. You can set either the `--cpu` or `--gpu`\n  flag to explicitly specify the target device.\n- `--num_grids \u003cnum\u003e`: Number of grid rotations to use. Using more will take\n  longer but produce a more stable prediction. (Default: 4)\n- `--top_k \u003ck\u003e`: Number of predictions to print to stdout. Use -1 to display\n  all. (Default: 25)\n\n## Reproduce Results\n\nYou can use the DeepFrag CLI to reproduce the highlighted results from the\nmain manuscript:\n\n### 1. Fragment replacement\n\nTo replace fragments, specify the connection point (`cname` or `cx/cy/cz`) and\nspecify a second atom that is contained in the fragment (`rname` or\n`rx/ry/rz`).\n\n```bash\n# Fig. 3: (2XP9) H. sapiens peptidyl-prolyl cis-trans isomerase NIMA-interacting 1 (HsPin1p)\n\n# Carboxylate A\n$ python3 deepfrag.py --pdb 2xp9 --resnum 1165 --cname C10 --rname C12\n\n# Phenyl B\n$ python3 deepfrag.py --pdb 2xp9 --resnum 1165 --cname C1 --rname C2\n\n# Phenyl C\n$ python3 deepfrag.py --pdb 2xp9 --resnum 1165 --cname C18 --rname C19\n```\n\n```bash\n# Fig. 
4A: (6QZ8) Protein myeloid cell leukemia 1 (Mcl-1)\n\n# Carboxylate group interacting with R263\n$ python3 deepfrag.py --pdb 6qz8 --resnum 401 --cname C12 --rname C14\n\n# Ethyl group\n$ python3 deepfrag.py --pdb 6qz8 --resnum 401 --cname C6 --rname C10\n\n# Methyl group\n$ python3 deepfrag.py --pdb 6qz8 --resnum 401 --cname C25 --rname C30\n\n# Chlorine atom\n$ python3 deepfrag.py --pdb 6qz8 --resnum 401 --cname C28 --rname CL\n```\n\n```bash\n# Fig. 4B: (1X38) Family GH3 β-D-glucan glucohydrolase (barley)\n\n# Hydroxyl group interacting with R158 and D285\n$ python3 deepfrag.py --pdb 1x38 --resnum 1001 --cname C2B --rname O2B\n\n# Phenyl group interacting with W286 and W434\n$ python3 deepfrag.py --pdb 1x38 --resnum 1001 --cname C7B --rname C1\n```\n\n```bash\n# Fig. 4C: (4FOW) NanB sialidase (Streptococcus pneumoniae)\n\n# Amino group\n$ python3 deepfrag.py --pdb 4fow --resnum 701 --cname CAE --rname NAA\n```\n\n### 2. Fragment addition\n\nFor fragment addition, you only need to specify the atom connection point\n(`cname` or `cx/cy/cz`). In this case, DeepFrag will implicitly replace a\nvalent hydrogen.\n\n```bash\n# Fig. 
5: Ligands targeting the SARS-CoV-2 main protease (MPro)\n\n# 5A: (5RGH) Extension on Z1619978933\n$ python3 deepfrag.py --pdb 5rgh --resnum 404 --cname C09\n\n# 5B: (5R81) Extension on Z1367324110\n$ python3 deepfrag.py --pdb 5r81 --resnum 1001 --cname C07\n```\n\n## Overview\n\n- `config`: fixed configuration information (e.g., TRAIN/VAL/TEST partitions)\n- `configurations`: benchmark model configurations (see\n  [`configurations/README.md`](configurations/README.md))\n- `data`: training/inference data (see [`data/README.md`](data/README.md))\n- `leadopt`: main module code\n  - `models`: PyTorch architecture definitions\n  - `data_util.py`: utility code for reading packed fragment/fingerprint data\n      files\n  - `grid_util.py`: GPU-accelerated grid generation code\n  - `metrics.py`: PyTorch implementations of several metrics\n  - `model_conf.py`: contains code to configure and train models\n  - `util.py`: utility code for rdkit/openbabel processing\n- `scripts`: data processing scripts (see\n  [`scripts/README.md`](scripts/README.md))\n- `train.py`: command-line interface to launch training runs\n\n## Dependencies\n\nYou can build a virtualenv with the requirements:\n\n```sh\n$ python3 -m venv leadopt_env\n$ source ./leadopt_env/bin/activate\n$ pip install -r requirements.txt\n$ pip install prody\n$ pip install torch==2.1.2+cu118 torchvision==0.16.2+cu118 --index-url https://download.pytorch.org/whl/cu118\n$ sudo apt install nvidia-cuda-toolkit\n```\n\nRegarding the nvidia-cuda-toolkit, you may wish to ensure that the toolkit\nversion matches the CUDA version installed on your machine. You can check the\ninstalled CUDA version by running the following commands:\n\n```sh\n$ nvcc --version\n$ nvidia-smi\n```\n\nNote: We used `Cuda 10.1` for training.\n\n## Training\n\nTo train a model, you can use the `train.py` utility script. 
You can specify\nmodel parameters as command line arguments or load parameters from a\nconfiguration args.json file.\n\n```bash\npython train.py \\\n    --save_path=/path/to/model \\\n    --wandb_project=my_project \\\n    {model_type} \\\n    --model_arg1=x \\\n    --model_arg2=y \\\n    ...\n```\n\nor\n\n```bash\npython train.py \\\n    --save_path=/path/to/model \\\n    --wandb_project=my_project \\\n    --configuration=./configurations/args.json\n```\n\n`save_path` is a directory to save the best model. The directory will be\ncreated if it doesn't exist. If this is not provided, the model will not be\nsaved.\n\n`wandb_project` is an optional wandb project name. If provided, the run will\nbe logged to wandb.\n\nSee below for available models and model-specific parameters:\n\n## Leadopt Models\n\nIn this repository, trainable models are subclasses of\n`model_conf.LeadoptModel`. This class encapsulates model configuration\narguments and pytorch models and enables saving and loading multi-component\nmodels.\n\n```py\nfrom leadopt.model_conf import LeadoptModel, MODELS\n\nmodel = MODELS['voxel']({args...})\nmodel.train(save_path='./mymodel')\n\n...\n\nmodel2 = LeadoptModel.load('./mymodel')\n```\n\nInternally, model arguments are configured by setting up an `argparse` parser\nand passing around a `dict` of configuration parameters in `self._args`.\n\n### VoxelNet\n\n```text\n--no_partitions     If set, disable the use of TRAIN/VAL partitions during\n                    training.\n-f FRAGMENTS, --fragments FRAGMENTS\n                    Path to fragments file.\n-fp FINGERPRINTS, --fingerprints FINGERPRINTS\n                    Path to fingerprints file.\n-lr LEARNING_RATE, --learning_rate LEARNING_RATE\n--num_epochs NUM_EPOCHS\n                    Number of epochs to train for.\n--test_steps TEST_STEPS\n                    Number of evaluation steps per epoch.\n-b BATCH_SIZE, --batch_size BATCH_SIZE\n--grid_width GRID_WIDTH\n--grid_res GRID_RES\n--fdist_min 
FDIST_MIN\n                    Ignore fragments closer to the receptor than this\n                    distance (Angstroms).\n--fdist_max FDIST_MAX\n                    Ignore fragments further from the receptor than this\n                    distance (Angstroms).\n--fmass_min FMASS_MIN\n                    Ignore fragments smaller than this mass (Daltons).\n--fmass_max FMASS_MAX\n                    Ignore fragments larger than this mass (Daltons).\n--ignore_receptor\n--ignore_parent\n-rec_typer {single,single_h,simple,simple_h,desc,desc_h}\n-lig_typer {single,single_h,simple,simple_h,desc,desc_h}\n-rec_channels REC_CHANNELS\n-lig_channels LIG_CHANNELS\n--in_channels IN_CHANNELS\n--output_size OUTPUT_SIZE\n--pad\n--blocks BLOCKS [BLOCKS ...]\n--fc FC [FC ...]\n--use_all_labels\n--dist_fn {mse,bce,cos,tanimoto}\n--loss {direct,support_v1}\n```\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fdurrantlab%2Fdeepfrag","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fdurrantlab%2Fdeepfrag","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fdurrantlab%2Fdeepfrag/lists"}