Ecosyste.ms: Awesome
An open API service indexing awesome lists of open source software.
https://github.com/kohulan/decimer-image_transformer
DECIMER Image Transformer is a deep-learning-based tool designed for automated recognition of chemical structure images. Leveraging transformer architectures, the model converts chemical images into SMILES strings, enabling the digitization of chemical data from scanned documents, literature, and patents.
chemical-image-recognition decimer deep-learning image-data-mining python tensorflow tpu transformers
- Host: GitHub
- URL: https://github.com/kohulan/decimer-image_transformer
- Owner: Kohulan
- License: mit
- Created: 2020-09-07T16:00:46.000Z (over 4 years ago)
- Default Branch: master
- Last Pushed: 2024-11-04T08:30:04.000Z (2 months ago)
- Last Synced: 2025-01-11T11:04:21.820Z (4 days ago)
- Topics: chemical-image-recognition, decimer, deep-learning, image-data-mining, python, tensorflow, tpu, transformers
- Language: Python
- Homepage:
- Size: 24.5 MB
- Stars: 227
- Watchers: 8
- Forks: 56
- Open Issues: 2
Metadata Files:
- Readme: README.md
- Changelog: CHANGELOG.md
- License: LICENSE
- Code of conduct: CODE_OF_CONDUCT.md
- Citation: CITATION.cff
README
# DECIMER Image Transformer
### Deep Learning for Chemical Image Recognition using Efficient-Net V2 + Transformer
[![License](https://img.shields.io/badge/License-MIT-blue.svg?style=for-the-badge)](https://opensource.org/licenses/MIT)
[![Maintenance](https://img.shields.io/badge/Maintained%3F-yes-green.svg?style=for-the-badge)](https://GitHub.com/Kohulan/DECIMER-Image_Transformer/graphs/commit-activity)
[![GitHub issues](https://img.shields.io/github/issues/Kohulan/DECIMER-Image_Transformer.svg?style=for-the-badge)](https://GitHub.com/Kohulan/DECIMER-Image_Transformer/issues/)
[![GitHub contributors](https://img.shields.io/github/contributors/Kohulan/DECIMER-Image_Transformer.svg?style=for-the-badge)](https://GitHub.com/Kohulan/DECIMER-Image_Transformer/graphs/contributors/)
[![tensorflow](https://img.shields.io/badge/TensorFlow-2.10.1-FF6F00.svg?style=for-the-badge&logo=tensorflow)](https://www.tensorflow.org)
[![DOI](https://zenodo.org/badge/293572361.svg)](https://zenodo.org/badge/latestdoi/293572361)
[![Documentation Status](https://readthedocs.org/projects/decimer-image-transformer/badge/?version=latest&style=for-the-badge)](https://decimer-image-transformer.readthedocs.io/en/latest/?badge=latest)
[![GitHub release](https://img.shields.io/github/release/Kohulan/DECIMER-Image_Transformer.svg?style=for-the-badge)](https://GitHub.com/Kohulan/DECIMER-Image_Transformer/releases/)
[![PyPI version fury.io](https://badge.fury.io/py/decimer.svg?style=for-the-badge)](https://pypi.python.org/pypi/decimer/)

---
## Table of Contents
- [Abstract](#abstract)
- [Method and Model Changes](#method-and-model-changes)
- [Installation](#installation)
- [Usage](#usage)
- [Hand-drawn Model](#decimer---hand-drawn-model)
- [Citation](#citation)
- [Acknowledgements](#acknowledgements)
- [Author](#author-kohulan)
- [Project Website](#project-website)
- [Research Group](#research-group)
---
## Abstract
The DECIMER 2.2 project tackles the OCSR (Optical Chemical Structure Recognition) challenge using cutting-edge computational intelligence methods. Our goal? To provide an automated, open-source software solution for chemical image recognition.
Training runs on Google's TPUs (Tensor Processing Units), which lets DECIMER work through datasets of more than 1 million images quickly.
---
## Method and Model Changes
- **Image Feature Extraction**: Now utilizing EfficientNet-V2 for superior image analysis
- **SMILES Prediction**: Employing a state-of-the-art transformer model
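To make the two stages concrete, here is a minimal sketch of how an encoder-decoder model can turn image features into a SMILES string one token at a time. It is an illustration only: `greedy_decode`, `toy_scorer`, the token markers, and the feature vector are hypothetical names invented for this sketch, and this README does not say whether DECIMER decodes greedily or with another strategy.

```python
from typing import Callable, Dict, List, Sequence

START, END = "<start>", "<end>"  # hypothetical marker tokens


def greedy_decode(
    features: Sequence[float],
    score_next: Callable[[Sequence[float], List[str]], Dict[str, float]],
    max_length: int = 50,
) -> str:
    """Build a SMILES string token by token, always taking the top-scoring token."""
    tokens: List[str] = [START]
    for _ in range(max_length):
        scores = score_next(features, tokens)
        best = max(scores, key=scores.get)
        if best == END:
            break
        tokens.append(best)
    return "".join(tokens[1:])  # drop the start marker


def toy_scorer(features: Sequence[float], tokens: List[str]) -> Dict[str, float]:
    """Hypothetical stand-in for a transformer decoder: always spells out ethanol."""
    target = ["C", "C", "O", END]
    wanted = target[len(tokens) - 1]
    return {tok: (1.0 if tok == wanted else 0.0) for tok in ["C", "O", END]}


# A real encoder would produce the features from an image; here they are a placeholder.
smiles = greedy_decode(features=[0.0], score_next=toy_scorer)
print(smiles)
```

In a real model, the scorer would be the transformer decoder attending over the image features produced by the encoder.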
### Training Enhancements
1. **TFRecord Files**: Lightning-fast data reading
2. **Google Cloud Buckets**: Efficient cloud storage solution
3. **TensorFlow Data Pipeline**: Optimized data loading
4. **TPU Strategy**: Harnessing the power of Google's TPUs
---
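The first three training enhancements can be sketched with standard TensorFlow calls. This is not DECIMER's training code: the feature names (`image`, `smiles`), the 64x64 input size, and the toy data are assumptions made for illustration, and the TPU strategy step is left out because it needs TPU hardware.

```python
import os
import tempfile

import tensorflow as tf


def serialize_example(image_png: bytes, smiles: str) -> bytes:
    """Pack one (image, SMILES) pair into a serialized tf.train.Example."""
    feature = {
        "image": tf.train.Feature(bytes_list=tf.train.BytesList(value=[image_png])),
        "smiles": tf.train.Feature(
            bytes_list=tf.train.BytesList(value=[smiles.encode("utf-8")])
        ),
    }
    return tf.train.Example(features=tf.train.Features(feature=feature)).SerializeToString()


def parse_example(record):
    """Decode one serialized record into an image tensor and a SMILES string."""
    spec = {
        "image": tf.io.FixedLenFeature([], tf.string),
        "smiles": tf.io.FixedLenFeature([], tf.string),
    }
    parsed = tf.io.parse_single_example(record, spec)
    image = tf.io.decode_png(parsed["image"], channels=3)
    image = tf.image.resize(image, (64, 64)) / 255.0  # hypothetical input size
    return image, parsed["smiles"]


def make_dataset(path: str, batch_size: int = 2) -> tf.data.Dataset:
    """Read TFRecords through a parallel, prefetching tf.data pipeline."""
    return (
        tf.data.TFRecordDataset(path)  # a gs://... bucket path can be passed the same way
        .map(parse_example, num_parallel_calls=tf.data.AUTOTUNE)
        .batch(batch_size)
        .prefetch(tf.data.AUTOTUNE)
    )


# Write two blank toy images with SMILES labels, then read them back.
record_path = os.path.join(tempfile.mkdtemp(), "toy.tfrecord")
blank_png = tf.io.encode_png(tf.zeros((32, 32, 3), dtype=tf.uint8)).numpy()
with tf.io.TFRecordWriter(record_path) as writer:
    for smiles in ["CCO", "c1ccccc1"]:
        writer.write(serialize_example(blank_png, smiles))

images, labels = next(iter(make_dataset(record_path)))
print(images.shape, [s.decode() for s in labels.numpy()])
```

Storing image bytes and labels together in TFRecords lets the input pipeline stream large datasets sequentially instead of opening one file per image.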
## Installation
```bash
# Create a conda wonderland
conda create --name DECIMER python=3.10.0 -y
conda activate DECIMER

# Equip yourself with DECIMER
pip install decimer
```
---
## Usage
```python
from DECIMER import predict_SMILES

# Unleash the power of DECIMER
image_path = "path/to/your/chemical/masterpiece.jpg"
SMILES = predict_SMILES(image_path)
print(f"Decoded SMILES: {SMILES}")
```
---
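When there are many images to process, a small wrapper keeps one unreadable file from stopping the whole run. The sketch below is a suggestion, not part of the DECIMER package: `predict_many` and `fake_predictor` are hypothetical helpers, and the stand-in predictor exists only so the example runs without downloading model weights.

```python
from typing import Callable, Dict, Iterable, Optional


def predict_many(
    image_paths: Iterable[str],
    predictor: Callable[[str], str],
) -> Dict[str, Optional[str]]:
    """Run `predictor` on each path; record None for failures instead of aborting."""
    results: Dict[str, Optional[str]] = {}
    for path in image_paths:
        try:
            results[path] = predictor(path)
        except Exception as error:  # broad on purpose: one bad image should not stop the batch
            print(f"Skipping {path}: {error}")
            results[path] = None
    return results


def fake_predictor(image_path: str) -> str:
    """Hypothetical stand-in for predict_SMILES; accepts only .png paths."""
    if not image_path.endswith(".png"):
        raise ValueError("unsupported file type")
    return "CCO"


results = predict_many(["ethanol.png", "notes.txt"], fake_predictor)
print(results)
```

With DECIMER installed, the real function can be passed in place of the stand-in: `predict_many(paths, predict_SMILES)`.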
## DECIMER - Hand-drawn Model
**New Feature Alert!**
Our latest model brings the magic of AI to hand-drawn chemical structures!
[![DOI](https://zenodo.org/badge/DOI/10.5281/zenodo.10781330.svg)](https://doi.org/10.5281/zenodo.10781330)
---
## Citation
If DECIMER helps your research, please cite:
1. Rajan, K., et al. "DECIMER.ai - An open platform for automated optical chemical structure identification, segmentation and recognition in scientific publications." *Nat. Commun.* 14, 5045 (2023).
2. Rajan, K., et al. "DECIMER 1.0: deep learning for chemical image recognition using transformers." *J Cheminform* 13, 61 (2021).
3. Rajan, K., et al. "Advancements in hand-drawn chemical structure recognition through an enhanced DECIMER architecture." *J Cheminform* 16, 78 (2024).
---
## Acknowledgements
- A big thank you to [Charles Tapley Hoyt](https://github.com/cthoyt) for his invaluable contributions!
- Powered by Google's TPU Research Cloud (TRC)
---
## Author: [Kohulan](https://kohulanr.com)
---
## Project Website
Experience DECIMER in action at [decimer.ai](https://decimer.ai), brilliantly implemented by [Otto Brinkhaus](https://github.com/OBrink)!
---
## Research Group
---
### Project Analytics
![Repobeats](https://repobeats.axiom.co/api/embed/bf532b7ac0d34137bdea8fbb82986828f86de065.svg "Repobeats analytics image")