An open API service indexing awesome lists of open source software.

https://github.com/patonlab/molcomplex

Command line and webapp for retrosynthetic disconnections, molecular complexity and synthetic accessibility metrics
https://github.com/patonlab/molcomplex

complexity-score disconnections molecular-complexity molecular-descriptors retrosynthetic-analysis synthesizability

Last synced: 5 months ago
JSON representation

Command line and webapp for retrosynthetic disconnections, molecular complexity and synthetic accessibility metrics

Awesome Lists containing this project

README

          

![MolComplex](https://github.com/patonlab/molcomplex/blob/master/molcomplex_logo.png)

Implementing a variety of complementary metrics for molecular complexity and synthetic accessibility.

A collaboration with the Sarpong group to understand complexity of molecules

# Requirements
- numpy, pandas
- rdkit
- openbabel
- mordred
- SYBA (conda install -c lich syba)

## Set up conda environment directly using the yml file:
To install the required packages through Conda, use the env.yml file as follows and the activate the environment:
1. `conda env create -f env.yml`
2. `conda activate mc`
This will set up the environment with molcomplex installed.

## For installation by cloning the GitHub folder, perform the follwoing steps:
1. Download the zipped folder or clone using: `git clone https://github.com/patonlab/molcomplex.git`
2. Navigate to the installed folder and run: `python setup.py install`. This will install `molcomplex` in the environment you are present in.
3. Install necessary dependencies using the following: `conda install -c lich syba`, `conda install -c conda-forge rdkit`, and `conda install -c conda-forge openbabel`

## Recommended installation and update guide
In a nutshell, `molcomplex` and its dependencies are installed/updated as follows:

`pip install molcomplex`

`conda install -c conda-forge openbabel rdkit Mordred`

`conda install lich::syba`

`conda install numpy pandas`

## Usage
To display the options type:

``python -m molcomplex -h``

The `molcomplex` package can be utilised as follows to obtain a csv with complexity scores.

``python -m molcomplex -f examples/test.txt``

To write to CSV add in the following:

``python -m molcomplex -f examples/test.txt --csv``

To perform a retro analysis by breaking down bonds to get complexity scores for precursors of the input SMILES add the following option:

``python -m molcomplex -f examples/test.txt --csv --retro``

## Usage APP
To run the web app perform the following steps:
1. Navigate to the webapp folder: `cd mcwebapp`
2. Run the app as follows: `python molcomplexapp.py`
3. copy paste the `http://127.0.0.1:8050/` or similar into web browser to utilise as an app.

# Metrics implemented
- Bertz Complexity (CT) Score (JACS 1981, 103, 3241-3243)
- Balaban J Score (Chem. Phys. Lett. 1982, 89, 399-404)
- Coley SCScore (J. Chem. Inf. Model. 2018, 58, 2, 252)
- IPC: Bonchev & Trinajstic's information content of the coefficients of the characteristic polynomial of the adjacency matrix of a hydrogen-suppressed graph of a molecule (J. Chem. Phys. 1977, 67, 4517-4533)
- Ertl SA_Score (J. Cheminform. 2009, 1, 8)
- Boettcher Score (J. Chem. Inf. Model. 2016, 56, 3, 462–470)
- Rücker's total walk count (twc) index: Rücker, G.; Rücker, C. Counts of All Walks as Atomic and Molecular Descriptors. (J. Chem. Inf. Comput. Sci. 1993, 33, 683-695)
- Proudfoot's Cm index based on atom environments: Proudfoot, J. R. A path based approach to assessing molecular complexity. Bioorganic Med. Chem. Lett. 27, 2014–2017 (2017)
- Kappa Shape Indices 1, 2 & 3 (Quant. Struct. Act. Relat. 1986, 5, 1-7)
- McGowan Volume (Chromatographia, 1987, 23, 243-246)
- Labute Approximate Surface Area (Methods Mol Biol 2004, 275, 261-78)
- Van der Waals Volume Atomic and Bond Contributions (J. Org. Chem. 2003, 68, 7368-7373).
- Zagreb Index
- MOE Type Desciptors (Labute ASA, PEOE VSA, SMR VSA, SLogP VSA)
- SYBA Score (J. Cheminformatics 2020, 12, 35)
- Multiple additional 2D metrics.

# Metrics to implement:

- Bertz’s Ns and Nt index: Bertz, S. H. & Sommer, T. J. Rigorous mathematical approaches to strategic bonds and synthetic analysis based on conceptually simple new complexity indices. Chem. Commun. 16, 2409–2410 (1997).

- Randić's zeta index: Randić, M. & Plavšić, D. Characterization of molecular complexity. Int. J. Quantum Chem. 91, 20–31 (2002).

- https://www.nature.com/articles/s41598-018-37253-8

Two noteworthy substructure-based methods are:
- Barone, R. & Chanon, M. A new and simple approach to chemical complexity. Application to the synthesis of natural products. J. Chem. Inf. Comput. Sci. 41, 269–272 (2001).

- Whitlock, H. W. On the structure of total synthesis of complex natural products. J. Org. Chem. 63, 7982–7989 (1998).

# Citation:

Molecular Complexity-Inspired Synthetic Strategies Toward the Calyciphylline A-type Daphniphyllum Alkaloids Himalensine A and Daphenylline. Wright, B. A.; Okada, T.; Regni, A.; Luchini, G.; Sowndarya, S. V. S.; Chaisan, N.; Kölbl, S.; Kim, S. F.; Paton, R. S.; Sarpong, R. S. _submmitted_ **2024**

# Acknowledgment:

This material is based upon work supported by the U.S. National Science Foundation under the NSF Center for Computer Assisted Synthesis (C-CAS), grant number CHE–2202693.