https://github.com/adam-maz/virtual_screening
Within this repository I present scripts that can be helpful during virtual screening in drug design & development.
clusterization jupyter-notebook k-means-clustering maestro-schrodinger medicinal-chemistry molecular-fingerprints pandas python rdkit scikit-learn scoring-functions virtual-screening
- Host: GitHub
- URL: https://github.com/adam-maz/virtual_screening
- Owner: Adam-maz
- Created: 2025-02-16T23:15:33.000Z (over 1 year ago)
- Default Branch: main
- Last Pushed: 2025-02-18T00:31:10.000Z (over 1 year ago)
- Last Synced: 2025-05-15T19:09:38.340Z (about 1 year ago)
- Topics: clusterization, jupyter-notebook, k-means-clustering, maestro-schrodinger, medicinal-chemistry, molecular-fingerprints, pandas, python, rdkit, scikit-learn, scoring-functions, virtual-screening
- Language: Jupyter Notebook
- Homepage:
- Size: 232 KB
- Stars: 0
- Watchers: 1
- Forks: 0
- Open Issues: 0
Metadata Files:
- Readme: README.md
README
# Description
Virtual screening (VS) in drug discovery and development is a process that lets researchers filter a collection of molecules and select the compounds best suited to a particular molecular target. Molecular docking is one of the most useful protocols in virtual screening. For large compound libraries, filters (such as ADME criteria) can be applied first to reduce the dataset size. Another approach is to filter out molecules with an unfavorable force-field-based energy (docking score).
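
The docking-score filter described above can be illustrated with a minimal sketch. The compound IDs, scores, and cutoff below are hypothetical and are not taken from this repository. The sketch assumes the convention (used by Glide, among others) that more negative scores are more favorable.

```python
# Hypothetical docking results: (compound ID, docking score in kcal/mol).
# IDs, scores, and the cutoff are illustrative assumptions.
docking_results = [
    ("cmpd_001", -9.4),
    ("cmpd_002", -5.1),
    ("cmpd_003", -7.8),
    ("cmpd_004", -3.2),
]

SCORE_CUTOFF = -7.0  # assumed threshold; more negative means more favorable


def filter_by_score(results, cutoff):
    """Keep compounds whose docking score is at or below the cutoff."""
    return [(name, score) for name, score in results if score <= cutoff]


kept = filter_by_score(docking_results, SCORE_CUTOFF)
for name, score in kept:
    print(f"{name}: {score}")
```

A sensible cutoff depends on the target and the docking protocol, so the value above should be treated as a placeholder.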

Virtual Screening. Image generated via Gemini 2.0 Flash
Here, I present an AI-driven protocol that allows researchers to cluster compounds based on their docking scores (Glide gscores) and molecular fingerprints (binary representations of a molecule's 2D structure). This approach represents a subtle interplay between Structure-Based and Ligand-Based Drug Design (SBDD/LBDD) and can lead to more efficient screening, especially in cases where the researcher knows a reference molecule with experimentally measured biological activity.
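
As a rough illustration of the idea, the sketch below clusters a handful of compounds on a combined feature matrix built from docking scores and fingerprint bits. It is not the notebook's code. The toy scores, the 8-bit fingerprints, the score standardization, the simple column-wise concatenation, and the choice of two clusters are all assumptions made for the example. k-means via scikit-learn is used because both appear among the repository topics.

```python
import numpy as np
from sklearn.cluster import KMeans

# Hypothetical docking scores (more negative = more favorable).
scores = np.array([-9.5, -9.1, -8.8, -4.2, -4.6, -3.9])

# Hypothetical 8-bit fingerprints; real Morgan fingerprints are much longer.
fingerprints = np.array([
    [1, 1, 1, 0, 0, 0, 0, 0],
    [1, 1, 0, 0, 0, 0, 0, 0],
    [1, 1, 1, 1, 0, 0, 0, 0],
    [0, 0, 0, 0, 1, 1, 1, 0],
    [0, 0, 0, 0, 1, 1, 0, 1],
    [0, 0, 0, 0, 0, 1, 1, 1],
])

# Standardize the scores so they are on a scale comparable to the 0/1 bits
# (an assumed preprocessing choice).
scaled_scores = (scores - scores.mean()) / scores.std()

# One row per compound: [scaled score, bit_0, ..., bit_7].
features = np.column_stack([scaled_scores, fingerprints])

# Cluster into two groups; n_clusters=2 is an illustrative choice.
model = KMeans(n_clusters=2, n_init=10, random_state=0)
labels = model.fit_predict(features)

for idx, (score, label) in enumerate(zip(scores, labels)):
    print(f"compound {idx}: score={score:.1f}, cluster={label}")
```

With thousands of fingerprint bits and a single score column, the fingerprints would dominate the distance calculation, so in practice the relative weighting of the two feature types is a design decision worth testing.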
---
# Content
- **`gscores_fps_clusterization_tool.ipynb`** – Jupyter Notebook containing the code for clustering.
---
# Dependencies
The script uses **RDKit** to generate Morgan fingerprints. Therefore, you need to have this library installed in your virtual environment. If you haven't installed RDKit yet, simply run:
```bash
pip install rdkit
```
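
To check that the installation works, Morgan fingerprints can be generated for a few molecules. This is a minimal sketch rather than the notebook's code. The example SMILES strings, the radius of 2, and the 2048-bit length are assumptions, and the `rdFingerprintGenerator` module requires a reasonably recent RDKit release.

```python
from rdkit import Chem
from rdkit.Chem import rdFingerprintGenerator

# Illustrative molecules given as SMILES strings (not from the repository).
smiles = {
    "ethanol": "CCO",
    "benzene": "c1ccccc1",
    "aspirin": "CC(=O)Oc1ccccc1C(=O)O",
}

# Morgan generator; radius=2 and fpSize=2048 are assumed, commonly used values.
generator = rdFingerprintGenerator.GetMorganGenerator(radius=2, fpSize=2048)

fingerprints = {}
for name, smi in smiles.items():
    mol = Chem.MolFromSmiles(smi)
    if mol is None:
        # MolFromSmiles returns None when the SMILES cannot be parsed.
        continue
    fingerprints[name] = generator.GetFingerprint(mol)

for name, fp in fingerprints.items():
    print(f"{name}: {fp.GetNumOnBits()} of {fp.GetNumBits()} bits set")
```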