https://github.com/rxn4chemistry/rxnmapper
RXNMapper: Unsupervised attention-guided atom-mapping. Code complementing our Science Advances publication on "Extraction of organic chemistry grammar from unsupervised learning of chemical reactions" (https://advances.sciencemag.org/content/7/15/eabe4166).
- Host: GitHub
- URL: https://github.com/rxn4chemistry/rxnmapper
- Owner: rxn4chemistry
- License: mit
- Created: 2020-05-13T19:01:05.000Z (almost 6 years ago)
- Default Branch: main
- Last Pushed: 2024-09-19T20:40:41.000Z (over 1 year ago)
- Last Synced: 2025-03-07T00:48:25.788Z (12 months ago)
- Topics: atom-mapping, chemistry, reactions, rxn, smiles, transformer
- Language: Python
- Homepage: http://rxnmapper.ai
- Size: 8.43 MB
- Stars: 301
- Watchers: 9
- Forks: 72
- Open Issues: 4
Metadata Files:
- Readme: README.md
- License: LICENSE
Awesome Lists containing this project
- awesome-biochem-ai - RXNMapper (2021) - based encoder-decoder transformers to perform atom-atom mapping between two SMILES strings. (Atom Mapping)
# Extraction of organic chemistry grammar from unsupervised learning of chemical reactions
RXNMapper enables robust atom mapping on valid reaction SMILES. The atom-mapping information was learned by an ALBERT model trained in an unsupervised fashion on a large dataset of chemical reactions.
- [Extraction of organic chemistry grammar from unsupervised learning of chemical reactions](https://advances.sciencemag.org/content/7/15/eabe4166): peer-reviewed Science Advances publication (open access).
- [Demo](http://rxnmapper.ai/demo.html): give RXNMapper a try!
- [Unsupervised attention-guided atom-mapping preprint](http://dx.doi.org/10.26434/chemrxiv.12298559): presented at the ML Interpretability for Scientific Discovery ICML workshop, 2020.
## Installation
### Create virtual environment (optional)
```bash
python3 -m venv .venv
source .venv/bin/activate
```
### Install from pip
```bash
pip install "rxnmapper[rdkit]"
```
You can leave out `[rdkit]` if RDKit is already available in your Python environment.
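If you are unsure whether RDKit is already importable in the active environment, a quick check along these lines can help decide which install command to use. This is a minimal sketch using only the standard library; the helper name `rdkit_available` is hypothetical and not part of RXNMapper.

```python
import importlib.util


def rdkit_available() -> bool:
    """Return True if the top-level `rdkit` package can be found (hypothetical helper)."""
    return importlib.util.find_spec("rdkit") is not None


# Suggest the install command that matches the current environment.
extra = "" if rdkit_available() else "[rdkit]"
print(f'pip install "rxnmapper{extra}"')
```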
### From source
```bash
git clone https://github.com/rxn4chemistry/rxnmapper.git
cd rxnmapper
pip install -e ".[rdkit]"
```
## Usage
### Basic usage
```python
from rxnmapper import RXNMapper
rxn_mapper = RXNMapper()
# Each entry is an unmapped reaction SMILES of the form "precursors>>products".
rxns = [
    'CC(C)S.CN(C)C=O.Fc1cccnc1F.O=C([O-])[O-].[K+].[K+]>>CC(C)Sc1ncccc1F',
    'C1COCCO1.CC(C)(C)OC(=O)CONC(=O)NCc1cccc2ccccc12.Cl>>O=C(O)CONC(=O)NCc1cccc2ccccc12',
]
results = rxn_mapper.get_attention_guided_atom_maps(rxns)
```
The results contain the mapped reactions and confidence scores:
```python
[{'mapped_rxn': 'CN(C)C=O.F[c:5]1[n:6][cH:7][cH:8][cH:9][c:10]1[F:11].O=C([O-])[O-].[CH3:1][CH:2]([CH3:3])[SH:4].[K+].[K+]>>[CH3:1][CH:2]([CH3:3])[S:4][c:5]1[n:6][cH:7][cH:8][cH:9][c:10]1[F:11]',
'confidence': 0.9565619900376546},
{'mapped_rxn': 'C1COCCO1.CC(C)(C)[O:3][C:2](=[O:1])[CH2:4][O:5][NH:6][C:7](=[O:8])[NH:9][CH2:10][c:11]1[cH:12][cH:13][cH:14][c:15]2[cH:16][cH:17][cH:18][cH:19][c:20]12.Cl>>[O:1]=[C:2]([OH:3])[CH2:4][O:5][NH:6][C:7](=[O:8])[NH:9][CH2:10][c:11]1[cH:12][cH:13][cH:14][c:15]2[cH:16][cH:17][cH:18][cH:19][c:20]12',
'confidence': 0.9704424331552834}]
```
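Each result is a plain dictionary, so it can be post-processed without any RXNMapper-specific tooling. The sketch below reuses the first result shown above to read the atom-map numbers on each side of the reaction and to filter results by confidence. The helpers `atom_map_numbers` and `filter_confident`, and the threshold of `0.9`, are assumptions made for illustration, not part of the RXNMapper API.

```python
import re
from typing import Dict, List, Set

# Matches the map number inside a bracket atom such as "[CH3:1]" or "[c:10]".
ATOM_MAP_PATTERN = re.compile(r":(\d+)\]")


def atom_map_numbers(smiles: str) -> Set[int]:
    """Collect the atom-map numbers found in a (reaction) SMILES fragment."""
    return {int(number) for number in ATOM_MAP_PATTERN.findall(smiles)}


def filter_confident(results: List[Dict], threshold: float = 0.9) -> List[Dict]:
    """Keep results whose confidence is at least `threshold` (hypothetical cutoff)."""
    return [r for r in results if r.get("confidence", 0.0) >= threshold]


# First result copied from the output above.
example = {
    "mapped_rxn": "CN(C)C=O.F[c:5]1[n:6][cH:7][cH:8][cH:9][c:10]1[F:11].O=C([O-])[O-].[CH3:1][CH:2]([CH3:3])[SH:4].[K+].[K+]>>[CH3:1][CH:2]([CH3:3])[S:4][c:5]1[n:6][cH:7][cH:8][cH:9][c:10]1[F:11]",
    "confidence": 0.9565619900376546,
}

precursors, products = example["mapped_rxn"].split(">>")
precursor_maps = atom_map_numbers(precursors)
product_maps = atom_map_numbers(products)

# Every mapped product atom should have a counterpart among the precursors.
print(product_maps <= precursor_maps)
print(len(filter_confident([example])))
```

Unmapped precursor atoms (for example the solvent and base here) carry no map number, which is why only bracket atoms with a `:n` suffix are counted.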
To handle batching and errors automatically, use `BatchedMapper` instead:
```python
from rxnmapper import BatchedMapper
rxn_mapper = BatchedMapper(batch_size=32)
rxns = ['CC[O-]~[Na+].BrCC>>CCOCC', 'invalid>>reaction']
# The following calls work with input of arbitrary size. Also, they do not raise
# any exceptions but will return ">>" or an empty dictionary for the second reaction.
results = list(rxn_mapper.map_reactions(rxns)) # results as strings directly
results = list(rxn_mapper.map_reactions_with_info(rxns)) # results as dictionaries (as above)
```
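Because failed reactions come back as `">>"` or an empty dictionary rather than as exceptions, it is usually worth pairing each output with its input and separating the failures. A minimal sketch follows, assuming the outputs arrive in the same order as the inputs; `is_failed` and `split_mapped_and_failed` are hypothetical helpers, and the outputs here are stand-ins (the mapped reaction from the basic usage example plus the documented failure marker) rather than a real `BatchedMapper` run.

```python
from typing import Dict, List, Tuple, Union

Result = Union[str, Dict]


def is_failed(result: Result) -> bool:
    """True for the failure markers described above: ">>" or an empty value."""
    return result == ">>" or not result


def split_mapped_and_failed(
    rxns: List[str], results: List[Result]
) -> Tuple[List[Tuple[str, Result]], List[str]]:
    """Pair inputs with outputs and separate the reactions that could not be mapped."""
    mapped, failed = [], []
    for rxn, result in zip(rxns, results):
        if is_failed(result):
            failed.append(rxn)
        else:
            mapped.append((rxn, result))
    return mapped, failed


rxns = [
    "CC(C)S.CN(C)C=O.Fc1cccnc1F.O=C([O-])[O-].[K+].[K+]>>CC(C)Sc1ncccc1F",
    "invalid>>reaction",
]
# Stand-in outputs; in real use these would come from
# list(rxn_mapper.map_reactions(rxns)).
stand_in_outputs = [
    "CN(C)C=O.F[c:5]1[n:6][cH:7][cH:8][cH:9][c:10]1[F:11].O=C([O-])[O-].[CH3:1][CH:2]([CH3:3])[SH:4].[K+].[K+]>>[CH3:1][CH:2]([CH3:3])[S:4][c:5]1[n:6][cH:7][cH:8][cH:9][c:10]1[F:11]",
    ">>",
]

mapped, failed = split_mapped_and_failed(rxns, stand_in_outputs)
print(failed)  # ['invalid>>reaction']
```

The same helper works for the dictionaries returned by `map_reactions_with_info`, since an empty dictionary is also treated as a failure.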
### Testing
You can run the test suite with:
```bash
pip install -e ".[dev,rdkit]"
pytest tests
```
## Examples
To learn more, see the [examples](./examples).
## Data
Data can be found at: https://ibm.box.com/v/RXNMapperData
## Citation
```bibtex
@article{schwaller2021extraction,
title={Extraction of organic chemistry grammar from unsupervised learning of chemical reactions},
author={Schwaller, Philippe and Hoover, Benjamin and Reymond, Jean-Louis and Strobelt, Hendrik and Laino, Teodoro},
journal={Science Advances},
volume={7},
number={15},
pages={eabe4166},
year={2021},
publisher={American Association for the Advancement of Science}
}
```