https://github.com/kohulan/decimer-image_transformer

DECIMER Image Transformer is a deep-learning-based tool designed for automated recognition of chemical structure images. Leveraging transformer architectures, the model converts chemical images into SMILES strings, enabling the digitization of chemical data from scanned documents, literature, and patents.

# 🧪 DECIMER Image Transformer 🖼️

### Deep Learning for Chemical Image Recognition using EfficientNet-V2 + Transformer


[![License](https://img.shields.io/badge/License-MIT%202.0-blue.svg?style=for-the-badge)](https://opensource.org/licenses/MIT)
[![Maintenance](https://img.shields.io/badge/Maintained%3F-yes-green.svg?style=for-the-badge)](https://GitHub.com/Kohulan/DECIMER-Image_Transformer/graphs/commit-activity)
[![GitHub issues](https://img.shields.io/github/issues/Kohulan/DECIMER-Image_Transformer.svg?style=for-the-badge)](https://GitHub.com/Kohulan/DECIMER-Image_Transformer/issues/)
[![GitHub contributors](https://img.shields.io/github/contributors/Kohulan/DECIMER-Image_Transformer.svg?style=for-the-badge)](https://GitHub.com/Kohulan/DECIMER-Image_Transformer/graphs/contributors/)
[![tensorflow](https://img.shields.io/badge/TensorFlow-2.10.1-FF6F00.svg?style=for-the-badge&logo=tensorflow)](https://www.tensorflow.org)
[![DOI](https://zenodo.org/badge/293572361.svg)](https://zenodo.org/badge/latestdoi/293572361)
[![Documentation Status](https://readthedocs.org/projects/decimer-image-transformer/badge/?version=latest&style=for-the-badge)](https://decimer-image-transformer.readthedocs.io/en/latest/?badge=latest)
[![GitHub release](https://img.shields.io/github/release/Kohulan/DECIMER-Image_Transformer.svg?style=for-the-badge)](https://GitHub.com/Kohulan/DECIMER-Image_Transformer/releases/)
[![PyPI version fury.io](https://badge.fury.io/py/decimer.svg?style=for-the-badge)](https://pypi.python.org/pypi/decimer/)

---

## 📚 Table of Contents

- [Abstract](#-abstract)
- [Method and Model Changes](#-method-and-model-changes)
- [Installation](#-installation)
- [Usage](#-usage)
- [Hand-drawn Model](#-decimer---hand-drawn-model)
- [Citation](#-citation)
- [Acknowledgements](#-acknowledgements)
- [Author](#-author-kohulan)
- [Project Website](#-project-website)
- [Research Group](#-research-group)

---

## 🔬 Abstract

The DECIMER 2.2 project tackles the Optical Chemical Structure Recognition (OCSR) challenge with deep learning. Our goal? To provide an automated, open-source software solution for chemical image recognition.

DECIMER is trained on Google's TPUs (Tensor Processing Units), which makes it practical to work with datasets of over 1 million images!

---

## 🧠 Method and Model Changes



- 🖼️ **Image Feature Extraction**: Now utilizing EfficientNet-V2 for superior image analysis
- 🔮 **SMILES Prediction**: Employing a state-of-the-art transformer model



### 🚀 Training Enhancements

1. **TFRecord Files**: Lightning-fast data reading
2. **Google Cloud Buckets**: Efficient cloud storage solution
3. **TensorFlow Data Pipeline**: Optimized data loading
4. **TPU Strategy**: Harnessing the power of Google's TPUs
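
The four items above can be combined roughly as follows. This is a minimal sketch, not DECIMER's actual training code: the feature names `image_raw` and `smiles`, the `IMAGE_SIZE` value, and all helper names are assumptions made for illustration. A `gs://` bucket path can be passed as `file_pattern` in the same way as a local path.

```python
import tensorflow as tf

IMAGE_SIZE = 512  # assumed input resolution, for illustration only


def serialize_example(png_bytes: bytes, smiles: str) -> bytes:
    """Pack one PNG-encoded image and its SMILES label into a serialized Example."""
    feature = {
        # Hypothetical feature names; the real TFRecord schema may differ.
        "image_raw": tf.train.Feature(bytes_list=tf.train.BytesList(value=[png_bytes])),
        "smiles": tf.train.Feature(
            bytes_list=tf.train.BytesList(value=[smiles.encode("utf-8")])
        ),
    }
    return tf.train.Example(features=tf.train.Features(feature=feature)).SerializeToString()


def parse_example(serialized):
    """Decode one serialized Example into a (float image, SMILES bytes) pair."""
    spec = {
        "image_raw": tf.io.FixedLenFeature([], tf.string),
        "smiles": tf.io.FixedLenFeature([], tf.string),
    }
    parsed = tf.io.parse_single_example(serialized, spec)
    image = tf.io.decode_png(parsed["image_raw"], channels=3)
    image = tf.image.resize(image, (IMAGE_SIZE, IMAGE_SIZE)) / 255.0
    return image, parsed["smiles"]


def make_dataset(file_pattern: str, batch_size: int) -> tf.data.Dataset:
    """Build a tf.data pipeline that reads TFRecord files in parallel."""
    files = tf.data.Dataset.list_files(file_pattern, shuffle=True)
    dataset = tf.data.TFRecordDataset(files, num_parallel_reads=tf.data.AUTOTUNE)
    dataset = dataset.map(parse_example, num_parallel_calls=tf.data.AUTOTUNE)
    # TPUs need static shapes, so incomplete final batches are dropped.
    return dataset.batch(batch_size, drop_remainder=True).prefetch(tf.data.AUTOTUNE)


def get_strategy() -> tf.distribute.Strategy:
    """Return a TPUStrategy when a TPU is found, else the default strategy."""
    try:
        resolver = tf.distribute.cluster_resolver.TPUClusterResolver()
    except ValueError:
        return tf.distribute.get_strategy()
    tf.config.experimental_connect_to_cluster(resolver)
    tf.tpu.experimental.initialize_tpu_system(resolver)
    return tf.distribute.TPUStrategy(resolver)
```

Model creation would then happen inside `with get_strategy().scope():`, so the same script runs on a TPU or falls back to CPU/GPU.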

---

## 💻 Installation

```bash
# Create a conda wonderland
conda create --name DECIMER python=3.10.0 -y
conda activate DECIMER

# Equip yourself with DECIMER
pip install decimer
```
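
To confirm that the install landed in the active environment, a quick check like the one below may help. It is a small sketch that uses only the Python standard library; `installed_version` is a hypothetical helper name, not part of DECIMER.

```python
from importlib.metadata import PackageNotFoundError, version
from typing import Optional


def installed_version(package: str) -> Optional[str]:
    """Return the installed version of `package`, or None if it is missing."""
    try:
        return version(package)
    except PackageNotFoundError:
        return None


# "decimer" is the PyPI distribution name used in the pip command above.
print(installed_version("decimer") or "decimer is not installed in this environment")
```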

---

## 🎮 Usage

```python
from DECIMER import predict_SMILES

# Unleash the power of DECIMER
image_path = "path/to/your/chemical/masterpiece.jpg"
SMILES = predict_SMILES(image_path)
print(f"🎉 Decoded SMILES: {SMILES}")
```
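
For a folder of images, the single call above can be wrapped in a loop. The sketch below is one possible approach: `predict_directory` and `IMAGE_SUFFIXES` are hypothetical names introduced here, and the predictor is passed in as an argument so the helper can be tried without loading the model.

```python
from pathlib import Path
from typing import Callable, Dict

# Assumed set of image extensions; adjust to match your files.
IMAGE_SUFFIXES = {".png", ".jpg", ".jpeg"}


def predict_directory(image_dir, predictor: Callable[[str], str]) -> Dict[str, str]:
    """Run `predictor` on every image in `image_dir` and map file name to result."""
    results = {}
    for path in sorted(Path(image_dir).iterdir()):
        if path.suffix.lower() in IMAGE_SUFFIXES:
            results[path.name] = predictor(str(path))
    return results


# Usage with DECIMER (requires `pip install decimer`):
#   from DECIMER import predict_SMILES
#   results = predict_directory("path/to/your/images", predict_SMILES)
#   for name, smiles in results.items():
#       print(f"{name}\t{smiles}")
```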

---

## ✍️ DECIMER - Hand-drawn Model

🌟 **New Feature Alert!** 🌟

Our latest model also recognizes hand-drawn chemical structures!

[![DOI](https://zenodo.org/badge/DOI/10.5281/zenodo.10781330.svg)](https://doi.org/10.5281/zenodo.10781330)

---

## 📜 Citation

If DECIMER helps your research, please cite:

1. Rajan, K., et al. "DECIMER.ai - An open platform for automated optical chemical structure identification, segmentation and recognition in scientific publications." *Nat. Commun.* 14, 5045 (2023).
2. Rajan, K., et al. "DECIMER 1.0: deep learning for chemical image recognition using transformers." *J Cheminform* 13, 61 (2021).
3. Rajan, K., et al. "Advancements in hand-drawn chemical structure recognition through an enhanced DECIMER architecture." *J Cheminform* 16, 78 (2024).

---

## 🙏 Acknowledgements

- A big thank you to [Charles Tapley Hoyt](https://github.com/cthoyt) for his invaluable contributions!
- Powered by Google's TPU Research Cloud (TRC)



---

## 👨‍🔬 Author: [Kohulan](https://kohulanr.com)



---

## 🌐 Project Website

Experience DECIMER in action at [decimer.ai](https://decimer.ai), brilliantly implemented by [Otto Brinkhaus](https://github.com/OBrink)!

---

## 🏫 Research Group





---

### 📊 Project Analytics

![Repobeats](https://repobeats.axiom.co/api/embed/bf532b7ac0d34137bdea8fbb82986828f86de065.svg "Repobeats analytics image")