Interpretable Fragment-based Molecule Design with Self-learning Entropic Population Annealing
https://github.com/tsudalab/molslepa

Host: GitHub
URL: https://github.com/tsudalab/molslepa
Owner: tsudalab
License: mit
Created: 2023-04-10T15:04:55.000Z (over 1 year ago)
Default Branch: molslepa
Last Pushed: 2023-04-13T12:48:49.000Z (over 1 year ago)
Last Synced: 2024-11-19T16:42:50.297Z (about 1 month ago)
Language: Python
Size: 21.7 MB
Stars: 5
Watchers: 8
Forks: 1
Open Issues: 1
Metadata Files:
- Readme: README.md
- License: LICENSE

# MolSLEPA

Interpretable Fragment-based Molecule Design with Self-learning Entropic Population Annealing

## Environment Setup

An environment for MolSLEPA can be easily set up via Anaconda:

```
git clone https://github.com/tsudalab/MolSLEPA.git
cd MolSLEPA
conda env create -f environment.yml
conda activate molslepa
```

## Workflow

All command line executables are under the folder 'cli':

```
cd cli
```

Molecule generation in MolSLEPA is based on [MoLeR](https://github.com/microsoft/molecule-generation).

To run MolSLEPA, follow these four steps:

### Step 1: Preprocessing

- Preprocess data using the 'preprocess.py' script. This script takes a plain-text list of SMILES strings and turns it into '*.pkl' files containing descriptions of the molecular graphs and generation traces. You need to provide train, valid and test datasets, each as a file of SMILES strings, one per line. The folder and file names must match those set in 'preprocess.py'. To run it:

```
python preprocess.py
```
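
The input layout described above (three plain-text files, one SMILES string per line) can be produced from a single list with a short helper. The following is a minimal sketch and is not part of MolSLEPA: the function names, the split fractions, and the 'train.smiles', 'valid.smiles' and 'test.smiles' file names are assumptions for illustration only. The actual folder and file names must be whatever 'preprocess.py' expects.

```python
import random
from pathlib import Path


def split_smiles(smiles, valid_frac=0.1, test_frac=0.1, seed=0):
    """Shuffle SMILES strings and split them into train/valid/test lists."""
    items = [s.strip() for s in smiles if s.strip()]  # drop blank lines
    rng = random.Random(seed)  # fixed seed so the split is reproducible
    rng.shuffle(items)
    n_valid = int(len(items) * valid_frac)
    n_test = int(len(items) * test_frac)
    return {
        "valid": items[:n_valid],
        "test": items[n_valid:n_valid + n_test],
        "train": items[n_valid + n_test:],
    }


def write_splits(splits, out_dir):
    """Write each split as a plain-text file with one SMILES string per line."""
    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    for name, items in splits.items():
        # Hypothetical file names; rename to match what 'preprocess.py' reads.
        (out_dir / f"{name}.smiles").write_text("\n".join(items) + "\n")
```

For example, `write_splits(split_smiles(Path("all.smiles").read_text().splitlines()), "data")` would write the three files into a hypothetical 'data' folder.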

### Step 2: Training

- Train MoLeR on the preprocessed data using the 'train.py' script. This script trains MoLeR until convergence. To run it:

```
python train.py
```

### Step 3: Sampling

- Sample the fragment-based chemical space using the class 'Sample' in the 'molslepa.py' script. This script generates a set of weighted samples of molecules. To run it:

```
python molslepa.py
```
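
To give a feel for where the sample weights come from, here is a minimal sketch of one reweighting and resampling step of generic population annealing. It is a textbook-style illustration, not the 'Sample' class: the function name `anneal_step`, the use of a scalar "energy" per molecule, and the inverse temperatures `beta_old` and `beta_new` are all assumptions made for this example.

```python
import math
import random


def anneal_step(energies, beta_old, beta_new, rng):
    """One population annealing update from beta_old to beta_new.

    Returns the indices of the resampled population and an estimate of the
    partition function ratio Z(beta_new) / Z(beta_old).
    """
    d_beta = beta_new - beta_old
    e_min = min(energies)
    # Shift by the minimum energy so the exponentials stay well scaled.
    factors = [math.exp(-d_beta * (e - e_min)) for e in energies]
    mean_factor = sum(factors) / len(factors)
    # Resample replicas in proportion to their reweighting factors.
    indices = rng.choices(range(len(energies)), weights=factors, k=len(energies))
    # Undo the shift to recover the mean of exp(-d_beta * E).
    z_ratio = mean_factor * math.exp(-d_beta * e_min)
    return indices, z_ratio
```

Accumulating the returned ratios over successive temperature steps is one common way to obtain the per-temperature weights that a later density-of-states estimate needs.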

### Step 4: DoS Estimation

- Calculate the saliency of fragments using the class 'MultiHistogram' in the 'molslepa.py' script. This script approximates the density of states (DoS), which is determined by the weights obtained in the previous step.
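
As a rough illustration of how weighted samples turn into a DoS estimate, the sketch below builds a normalized weighted histogram over a property range. It is not the 'MultiHistogram' class: the function name `weighted_histogram`, the equal-width binning, and the assumption that each sample already carries a single combined weight are simplifications for this example.

```python
def weighted_histogram(values, weights, lo, hi, n_bins):
    """Normalized weighted histogram of values over [lo, hi) with equal bins."""
    width = (hi - lo) / n_bins
    hist = [0.0] * n_bins
    for v, w in zip(values, weights):
        if lo <= v < hi:
            # Clamp the index to guard against floating-point rounding at the edge.
            idx = min(int((v - lo) / width), n_bins - 1)
            hist[idx] += w
    total = sum(hist)
    # Leave the histogram as all zeros if no sample fell inside the range.
    return [h / total for h in hist] if total > 0 else hist
```

Each entry of the result is the estimated fraction of the sampled space whose property value falls in that bin, under the stated assumption about the weights.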

'ploy.ipynb' reproduces the resulting figure from the paper.