https://github.com/cch1999/posecheck
Pose checks for 3D Structure-based Drug Design methods
- Host: GitHub
- URL: https://github.com/cch1999/posecheck
- Owner: cch1999
- License: mit
- Created: 2023-09-24T21:43:26.000Z (about 2 years ago)
- Default Branch: main
- Last Pushed: 2024-08-29T22:19:06.000Z (about 1 year ago)
- Last Synced: 2024-09-27T09:36:25.726Z (about 1 year ago)
- Topics: docking, drug-design, generative-model, molecule-generation
- Language: Python
- Homepage:
- Size: 35.4 MB
- Stars: 69
- Watchers: 1
- Forks: 6
- Open Issues: 3
Metadata Files:
- Readme: README.md
- Changelog: changelog.md
- License: LICENSE
# PoseCheck: Benchmarking Generated Poses
[Paper](https://arxiv.org/abs/2308.07413) | [Documentation](https://posecheck.readthedocs.io/en/latest/)
## What is PoseCheck?
PoseCheck is a package for analysing the quality of generated protein-ligand complexes from 3D target-conditioned generative models.
## Installation
```bash
pip install posecheck
```
## Example usage
We provide a simple top-level API for interacting with the whole benchmark. Define the `PoseCheck` object once at the top of your existing testing code, then test molecules by loading them in iteratively. You can also call the individual testing functions manually (see the docs for more information).
```python
from posecheck import PoseCheck

# Initialize the PoseCheck object
pc = PoseCheck()

# Load a protein from a PDB file (will run reduce in the background)
pc.load_protein_from_pdb("data/examples/1a2g.pdb")

# Load ligands from an SDF file
pc.load_ligands_from_sdf("data/examples/1a2g_ligand.sdf")
# Alternatively, load RDKit molecules directly
# pc.load_ligands_from_mols([rdmol])

# Check for clashes
clashes = pc.calculate_clashes()
print(f"Number of clashes in example molecule: {clashes[0]}")

# Check for strain
strain = pc.calculate_strain_energy()
print(f"Strain energy of example molecule: {strain[0]}")

# Check for interactions
interactions = pc.calculate_interactions()
print(f"Interactions of example molecule: {interactions}")
```
## Tips
- Reading and processing PDB files with `reduce` can take a while for a large test set. If you run `PoseCheck` frequently, it may be worth pre-processing all proteins yourself with `prot = posecheck.utils.loading.load_protein_from_pdb(pdb_path)` and setting the result directly on the `PoseCheck` object with `pc.protein = prot`.
## Data from the paper
The data for the paper can be found at the following Zenodo link and should be placed in the `data` directory.
[https://zenodo.org/records/10208912](https://zenodo.org/records/10208912)
## Cite
```bibtex
@article{harris2023benchmarking,
title={Benchmarking Generated Poses: How Rational is Structure-based Drug Design with Generative Models?},
author={Harris, Charles and Didi, Kieran and Jamasb, Arian R and Joshi, Chaitanya K and Mathis, Simon V and Lio, Pietro and Blundell, Tom},
journal={arXiv preprint arXiv:2308.07413},
year={2023}
}
```
## Acknowledgements
PoseCheck relies on several other codebases to function. Here are the links to them:
- [RDKit](https://github.com/rdkit/rdkit): A collection of cheminformatics and machine learning tools.
- [ProLIF](https://github.com/chemosim-lab/ProLIF): Protein-Ligand Interaction Fingerprints generator.
- [Seaborn](https://github.com/mwaskom/seaborn): Statistical data visualization library.
- [NumPy](https://github.com/numpy/numpy): The fundamental package for scientific computing with Python.
- [DataMol](https://github.com/datamol-org/datamol): A minimalist and practical chemoinformatics library for Python.
- [Pandas](https://github.com/pandas-dev/pandas): Powerful data structures for data analysis, time series, and statistics.
- [Reduce](https://github.com/rlabduke/reduce): A program for adding hydrogens to a Protein DataBank (PDB) molecular structure file.

There is also a similar package, [PoseBusters](https://github.com/maabuu/posebusters), which provides tests in addition to ours and is recommended if you are benchmarking protein-ligand docking models.