An open API service indexing awesome lists of open source software.

https://github.com/informaticsmatters/standardize-molecule

Utilities for SMILES analysis
https://github.com/informaticsmatters/standardize-molecule

Last synced: 10 months ago
JSON representation

Utilities for SMILES analysis

Awesome Lists containing this project

README

          

# Informatics Matters Standardize Molecule Utility

[![build](https://github.com/InformaticsMatters/standardize-molecule/actions/workflows/build.yaml/badge.svg)](https://github.com/InformaticsMatters/standardize-molecule/actions/workflows/build.yaml)
[![publish](https://github.com/InformaticsMatters/standardize-molecule/actions/workflows/publish.yaml/badge.svg)](https://github.com/InformaticsMatters/standardize-molecule/actions/workflows/publish.yaml)

[![PyPI version](https://badge.fury.io/py/im-standardize-molecule.svg)](https://badge.fury.io/py/im-standardize-molecule)
![PyPI - License](https://img.shields.io/pypi/l/im-standardize-molecule)

The *im-standardize-molecule* module is a utility provided by Informatics Matters

It contains a Python module to:
1. Take an original SMILES (*Simple Molecule Input Line rESpresentation*) and
convert it to an Isomeric or Nonisomeric canonical SMILES representation in a consistent way.
2. Convert an *rdkit.Chem.rdchem.Mol* object to a standard format.

For installation and testing, see the associated README.rst file.

## Usage

**standardize_to_noniso_smiles** *(osmiles:str)*

**standardize_to_iso_smiles** *(osmiles:str)*

Takes a smiles and returns the converted smiles and standardized molecule as an *rdkit.Chem.rdchem.Mol*

Example::

from standardize_molecule import standardize_to_noniso_smiles
noniso = standardize_to_noniso_smiles('N[C@@H](C[C@@H]1CCNC1O)[C@@H](O)CO')
[17:30:44] Initializing MetalDisconnector
[17:30:44] Running MetalDisconnector
[17:30:44] Initializing Normalizer
[17:30:44] Running Normalizer
[17:30:44] Running Uncharger
print(noniso[0])
NC(CC1CCNC1O)C(O)CO

**standardize_molecule** *(molecule: object, osmiles:str = 'NotProvided')*

Takes an rdkit.Chem.rdchem.Mol object and (optionally) a smiles string for logging and returns
a standardized molecule.

Example::

from rdkit import Chem
from standardize_molecule import standardize_molecule
mol = Chem.MolFromSmiles('C1=CC=CC=C1')
std = standardize_molecule(mol, 'C1=CC=CC=C1')
[17:37:40] Initializing MetalDisconnector
[17:37:40] Running MetalDisconnector
[17:37:40] Initializing Normalizer
[17:37:40] Running Normalizer
[17:37:40] Running Uncharger
print(std)

print(noniso[1].GetNumHeavyAtoms())
13