https://github.com/dmis-lab/arkdta



# ArkDTA: Attention Regularization guided by non-Covalent Interactions for Explainable Drug-Target Binding Affinity Prediction

## Abstract

Protein-ligand binding affinity prediction is a central task in drug design and development. Cross-modal attention mechanisms have recently become a core component of many deep learning models because of their potential to improve model explainability. Non-covalent interactions, among the most critical domain knowledge for binding affinity prediction, should be incorporated into the protein-ligand attention mechanism to make deep drug-target interaction (DTI) models more explainable. We propose ArkDTA, a novel deep neural architecture for explainable binding affinity prediction guided by non-covalent interactions. Experimental results show that ArkDTA achieves predictive performance comparable to current state-of-the-art models while significantly improving model explainability. Qualitative investigation of our novel attention mechanism reveals that ArkDTA can identify potential regions for non-covalent interactions between candidate drug compounds and target proteins, and that it guides the internal operations of the model in a more interpretable and domain-aware manner. (*submitted to ISMB2023, under review*)

## Overview of ArkDTA

![img](./figures/0_arkdta.png)

## Attention Regularization guided by non-Covalent Interactions

![img](./figures/1_arkmab.png)

## Prerequisites for running ArkDTA

- Python 3.7.9
- CUDA: 11.X
- Download and extract data.tar.gz ([link](https://drive.google.com/file/d/1hmR5w47VUk6RW0br8BanJT94R2FPHgDL/view?usp=share_link), 45MB) into the current directory. These files are the preprocessed PDBBind (ver. 2020), Davis, and Metz datasets.
- Download and extract saved.tar.gz ([link](https://drive.google.com/file/d/1iVttdzlAMXYeJ11JKVe19Dkvgpb8PZSS/view?usp=share_link), 170MB) into the directory **./saved**. These files are the model checkpoints for each fold of the PDBBind dataset.
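
The two extraction steps above can also be scripted. The following is a minimal sketch, not part of the ArkDTA code base: `extract_archive` is a hypothetical helper, and it assumes both archives have already been downloaded manually from the links above into the repository root.

```python
import tarfile
from pathlib import Path


def extract_archive(archive_path, dest_dir):
    """Extract a .tar.gz archive into dest_dir, creating dest_dir if needed.

    Returns the sorted names of the entries found directly under dest_dir.
    """
    dest = Path(dest_dir)
    dest.mkdir(parents=True, exist_ok=True)
    with tarfile.open(archive_path, mode="r:gz") as tar:
        tar.extractall(path=dest)
    return sorted(p.name for p in dest.iterdir())


if __name__ == "__main__":
    # Assumption: the archive file names match the ones given in the list above.
    for archive, target in (("data.tar.gz", "."), ("saved.tar.gz", "./saved")):
        if Path(archive).exists():
            print(archive, "->", extract_archive(archive, target))
        else:
            print(archive, "not found; download it first")
```

Only extract archives obtained from the links above, since `extractall` trusts the paths stored inside the archive.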

## Installing the Python (3.8.12) Conda Environment

```
conda env create -f arkdta.yaml
conda activate arkdta
```

## How to use the ArkDTA source code

### Training ArkDTA on PDBBind Dataset

Run the following command:
```
python run.py -pn {wandb_project_name} -sn arkdta -mg {multiple gpu indices}
```

To train ArkDTA on the IC50 subset, edit **/sessions/arkdta.yaml** so that it contains the following:
```
ba_measure: IC50
```
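
If you switch between settings often, the config edit can be scripted. The sketch below is an illustration only: `set_yaml_scalar` is a hypothetical helper, it assumes the key sits on a single `key: value` line, and the `PLACEHOLDER` value in the example is made up. It works on plain text so that it needs no YAML library; any trailing comment on the rewritten line is dropped.

```python
def set_yaml_scalar(text, key, value):
    """Return text with the first `key: ...` line rewritten to `key: value`.

    Indentation is preserved. Raises KeyError if the key is not found.
    """
    lines = text.splitlines()
    for i, line in enumerate(lines):
        stripped = line.lstrip()
        if stripped.startswith(key + ":"):
            indent = line[: len(line) - len(stripped)]
            lines[i] = f"{indent}{key}: {value}"
            break
    else:
        raise KeyError(key)
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    # Hypothetical config fragment; the real file contains more keys.
    example = "ba_measure: PLACEHOLDER\n"
    print(set_yaml_scalar(example, "ba_measure", "IC50"), end="")
```

To apply it to the real session file, read the file contents into a string, pass them through the helper, and write the result back.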

### Evaluating ArkDTA on PDBBind Dataset (5CV)

Run the following command:
```
python run.py -pn {wandb_project_name} -sn arkdta -mg {multiple gpu indices} -tm
```

### Finetuning ArkDTA on other datasets (Davis, Metz)

Edit **/sessions/arkdta.yaml** so that it contains the following:
```
dataset_subsets: davis
dataset_partition: randomsingle
```

Then run the following command:
```
python run.py -pn {wandb_project_name} -sn arkdta -mg {multiple gpu indices} -ft {davis or metz}
```

### Evaluating ArkDTA on other datasets

Run the following command:
```
python run.py -pn {wandb_project_name} -sn arkdta -mg {multiple gpu indices} -tm -cn {your/saved/path_davis or _metz}
```
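
The four `run.py` invocations above differ only in a few flags, so a small wrapper can assemble them. This is a sketch under stated assumptions: `build_run_command` is a hypothetical helper that uses only the flags shown in this README (`-pn`, `-sn`, `-mg`, `-tm`, `-ft`, `-cn`). Because the README does not say how multiple GPU indices are delimited, `gpus` is a list of strings that is passed through verbatim.

```python
import shlex


def build_run_command(project, gpus, test_mode=False, finetune=None, checkpoint=None):
    """Assemble the argument list for run.py from the flags shown in this README.

    project:    wandb project name (-pn)
    gpus:       list of strings forwarded unchanged after -mg
    test_mode:  append -tm (evaluation)
    finetune:   "davis" or "metz" (-ft)
    checkpoint: saved path forwarded to -cn
    """
    cmd = ["python", "run.py", "-pn", project, "-sn", "arkdta", "-mg", *gpus]
    if test_mode:
        cmd.append("-tm")
    if finetune is not None:
        if finetune not in ("davis", "metz"):
            raise ValueError("finetune must be 'davis' or 'metz'")
        cmd += ["-ft", finetune]
    if checkpoint is not None:
        cmd += ["-cn", checkpoint]
    return cmd


if __name__ == "__main__":
    # "my_project" and GPU index "0" are placeholder values.
    cmd = build_run_command("my_project", ["0"], test_mode=True)
    print(" ".join(shlex.quote(part) for part in cmd))
    # prints: python run.py -pn my_project -sn arkdta -mg 0 -tm
```

The returned list can be handed to `subprocess.run(cmd, check=True)` from the repository root once the placeholders are replaced with real values.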

### Running model inference and extracting attention maps from ArkDTA

Run the following script:
```
./arkdta.sh
```

You can change the input SMILES (ligands) or FASTA sequence (proteins) by editing the **arkdta.sh** file.
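
Before pasting a new protein sequence into **arkdta.sh**, it can help to sanity-check it. The sketch below is not part of ArkDTA: `read_protein_sequence` and `unexpected_residues` are hypothetical helpers, and the check against the 20 standard amino acid letters is an assumption about what counts as clean input, not a documented requirement of the model.

```python
STANDARD_AMINO_ACIDS = set("ACDEFGHIKLMNPQRSTVWY")


def read_protein_sequence(fasta_text):
    """Return the first record's sequence from FASTA text (or from a bare sequence)."""
    residues = []
    seen_header = False
    for line in fasta_text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            if seen_header or residues:
                break  # a second record starts here; keep only the first
            seen_header = True
            continue
        residues.append(line.upper())
    return "".join(residues)


def unexpected_residues(sequence):
    """Return the sorted characters that are not standard amino acid codes."""
    return sorted(set(sequence) - STANDARD_AMINO_ACIDS)


if __name__ == "__main__":
    # Made-up toy record, used only to show the call pattern.
    seq = read_protein_sequence(">demo\nMKTAYIAK\nQRQISFVK\n")
    print(len(seq), unexpected_residues(seq))
```

An empty list from `unexpected_residues` means every character is one of the 20 standard codes; anything else is worth checking by hand before running inference.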

#### 4x6n, 3Y5

![img](./figures/2_4x6n_3y5.png)

#### 6n77, KEJ

![img](./figures/3_6n77_kej.png)

#### 8bq4, QZR

![img](./figures/4_8bq4_qzr.png)

## Contributors


| Name | Affiliation | Email |
|---|---|---|
| Mogan Gim | Data Mining and Information Systems Lab, Korea University, Seoul, South Korea | akim@korea.ac.kr |
| Junseok Choe | Data Mining and Information Systems Lab, Korea University, Seoul, South Korea | juns94@korea.ac.kr |
| Seungheun Baek | Data Mining and Information Systems Lab, Korea University, Seoul, South Korea | tmdgms9417@korea.ac.kr |
| Jueon Park | Data Mining and Information Systems Lab, Korea University, Seoul, South Korea | jueon_park@korea.ac.kr |
| Chaeeun Lee | Data Mining and Information Systems Lab, Korea University, Seoul, South Korea | chaeeunlee1997@korea.ac.kr |
| Minjae Ju† | LG CNS, AI Research Center, Seoul, South Korea | minjae.ju@lgcns.com |
| Sumin Lee† | LG AI Research, Seoul, South Korea | sumin.lee@lgresearch.ai |
| Jaewoo Kang* | Data Mining and Information Systems Lab, Korea University, Seoul, South Korea | kangj@korea.ac.kr |

- †: *This work was done while the author was a graduate student at Korea University Computer Science Department.*
- *: *Corresponding Author*