https://github.com/kasim95/ocr_math_expressions

A Deep Learning pipeline to recognize mathematical expressions from images

keras mathematics optical-character-recognition python python3 tensorflow tensorflow-models


# Optical Character Recognition for Handwritten Mathematical Expressions

[![license](https://img.shields.io/github/license/mashape/apistatus.svg)](LICENSE)

## Introduction
The aim of this repo is to recognize handwritten mathematical expressions in images using Deep Learning.

The Project is divided into three tasks:
* Character Localization

Localizes each character present in the image with a bounding box.

* Character Classification

Identifies the class of the character inside each bounding box.

* Syntactic Analysis

Verifies that the collection of symbols predicted by the previous two tasks represents a valid mathematical expression and generates its MathML representation.
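
The way the three tasks feed into each other can be sketched as below. This is only an illustration of the data flow, not the code in this repo: the `Box` type, the `recognize` function, and the three stage callables are hypothetical names, and reading the boxes left to right is an assumption about how a single-line expression would be ordered.

```python
from typing import Callable, List, NamedTuple


class Box(NamedTuple):
    """Axis-aligned bounding box in pixels (hypothetical type, not from the repo)."""
    x: int
    y: int
    w: int
    h: int


def recognize(image, localize: Callable, classify: Callable, parse: Callable) -> str:
    """Chain the three tasks: localization, classification, syntactic analysis."""
    # Task 1: find one bounding box per character, then order them left to right
    # (assumed reading order for a single-line expression).
    boxes: List[Box] = sorted(localize(image), key=lambda b: b.x)
    # Task 2: assign a class label to the character inside each box.
    symbols = [classify(image, box) for box in boxes]
    # Task 3: validate the symbol sequence and render it (MathML in this project).
    return parse(symbols)


# Stand-in stages so the sketch runs without any trained model.
fake_boxes = [Box(40, 5, 10, 12), Box(5, 4, 10, 12), Box(22, 8, 10, 4)]
labels = {5: "Z", 22: "=", 40: "X"}
result = recognize(
    image=None,
    localize=lambda img: fake_boxes,
    classify=lambda img, box: labels[box.x],
    parse=lambda syms: " ".join(syms),
)
print(result)  # Z = X
```

Passing the stages in as callables keeps the sketch independent of whichever detector (contour search, Tiny YOLOv3, Faster-RCNN) or classifier is plugged in.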

---
## Quick Start
1. Clone the repo and download the saved weights
2. Run this command to identify the mathematical expression in the image *exp0030.png*
```shell script
python3 evaluate.py -m "trained_models/model3.h5" -i "datasets/object_detection/evaluate/exp0030.png"
```

Example image showing the characters of the expression located with bounding boxes:
![Object Detection](plots/Object_Detection_bboxes.png)

MathML Output:
```xml
<math>
<mrow>
<mi>Z</mi>
<mo>=</mo>
<mi>X</mi>
<mo>+</mo>
<mi>Y</mi>
</mrow>
</math>
```
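
As a rough illustration of how a recognized symbol sequence such as Z = X + Y could be turned into MathML, here is a minimal sketch using Python's standard `xml.etree.ElementTree`. The function name `to_mathml` is hypothetical. The choice of the standard presentation elements `mrow`, `mi`, `mo` and `mn` is an assumption, as is the flat single-row layout, so the exact markup produced by `evaluate.py` may differ.

```python
import xml.etree.ElementTree as ET

# Operators listed as supported in this README.
OPERATORS = {"=", "+", "−", "/", "÷", "*", "×", "%"}


def to_mathml(symbols):
    """Render a flat list of symbols as a single MathML row (hypothetical helper)."""
    math = ET.Element("math")
    row = ET.SubElement(math, "mrow")
    for symbol in symbols:
        if symbol in OPERATORS:
            tag = "mo"   # operator
        elif symbol.isdigit():
            tag = "mn"   # number
        else:
            tag = "mi"   # identifier
        ET.SubElement(row, tag).text = symbol
    return ET.tostring(math, encoding="unicode")


mathml = to_mathml(["Z", "=", "X", "+", "Y"])
print(mathml)
```

Building the tree with `ElementTree` rather than by string concatenation means the special characters in the symbols are escaped automatically.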

---

## Training

Convolutional Neural Networks and Dense Networks are trained in *Classification_task.ipynb*

Object Detection Models are trained in their respective submodules

---
## Submodules
* [keras-yolo3](https://github.com/kasim95/keras-yolo3/tree/ocr_math)
(*ocr_math*)

> YOLOv3 implementation in Keras.
> Used to train the Tiny-YOLOv3 model on the object_detection dataset.

* [models](https://github.com/kasim95/models/tree/tf112)
(*tf112*)

> TensorFlow Object Detection API.
> Used to train the Faster-RCNN with Resnet-50 model on the object_detection dataset.

---

## Project Structure:

Directories
* **datasets/** : Contains datasets for Object Detection and Character Classification
* **plots/** : Contains plots generated by notebooks and scripts
* **Report/** : Contains Project Report
* **processed_data/** : Contains labels and other processed data generated by Dataset_Preprocessing.ipynb
* **syntactical_analysis** :
* **trained_models/** : Contains saved models weights for CNNs and ANNs

Notebooks
* **Dataset_Preprocessing** : Notebook containing code to combine the screen dataset with custom images and generate train-test splits
* **OD_Character_Segmentation.ipynb** : Notebook demonstrating Character Localization using Contour Search
* **OD_Faster-RCNN.ipynb** : Notebook demonstrating Character Localization using Faster-RCNN with Resnet50 model
* **OD_yolov3.ipynb** : Notebook demonstrating Character Localization using Tiny YOLOv3 model
* **Optical_Character_Recognition** : Notebook demonstrating the complete Project Pipeline
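
To give a feel for what localization without a neural network involves, the sketch below finds one bounding box per connected blob of ink in a binarized image. It is not the notebook's code: contour search is typically done with an image library, and this example swaps in a plain breadth-first flood fill over a nested list so that it runs with the standard library alone. The function name `bounding_boxes` and the 0/1 input format are assumptions.

```python
from collections import deque


def bounding_boxes(binary):
    """Return (x, y, w, h) for each 8-connected blob of 1s, sorted left to right."""
    rows, cols = len(binary), len(binary[0])
    seen = [[False] * cols for _ in range(rows)]
    boxes = []
    for r in range(rows):
        for c in range(cols):
            if not binary[r][c] or seen[r][c]:
                continue
            # Flood-fill this blob while tracking its extreme coordinates.
            seen[r][c] = True
            queue = deque([(r, c)])
            min_r = max_r = r
            min_c = max_c = c
            while queue:
                cr, cc = queue.popleft()
                min_r, max_r = min(min_r, cr), max(max_r, cr)
                min_c, max_c = min(min_c, cc), max(max_c, cc)
                for dr in (-1, 0, 1):
                    for dc in (-1, 0, 1):
                        nr, nc = cr + dr, cc + dc
                        if (0 <= nr < rows and 0 <= nc < cols
                                and binary[nr][nc] and not seen[nr][nc]):
                            seen[nr][nc] = True
                            queue.append((nr, nc))
            boxes.append((min_c, min_r, max_c - min_c + 1, max_r - min_r + 1))
    return sorted(boxes)


# Tiny binarized "image": a plus-shaped blob and a 2x2 square.
image = [
    [0, 1, 0, 0, 0, 0, 1, 1],
    [1, 1, 1, 0, 0, 0, 1, 1],
    [0, 1, 0, 0, 0, 0, 0, 0],
]
boxes = bounding_boxes(image)
print(boxes)  # [(0, 0, 3, 3), (6, 0, 2, 2)]
```

A blob-based approach like this splits symbols made of disconnected strokes, such as = or ÷, into several boxes, so a real pipeline would need a merging step or a learned detector.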

Python scripts
* **evaluate.py** : Python script that evaluates the math expressions in a single image or in multiple images using the Project Pipeline
* **utils.py** : Python file containing Helper functions

---
## Issues
At the moment, the parser rules cover binary operators only. This limits the scope of the Project to expressions built from the supported operators.

Supported operators in mathematical expression:
* =
* +
* −
* /
* ÷
* \*
* ×
* %
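
To make the binary-operator restriction concrete, here is a minimal sketch of the kind of check such a rule set implies: a symbol sequence is accepted only if operands and operators strictly alternate, starting and ending with an operand. The function `is_valid_expression` is hypothetical and treats every non-operator symbol as a single operand. The parser in this repo may be structured quite differently.

```python
# Operators listed as supported in this README.
OPERATORS = {"=", "+", "−", "/", "÷", "*", "×", "%"}


def is_valid_expression(symbols):
    """True if symbols alternate operand, operator, operand, ... (hypothetical rule)."""
    # A well-formed chain of binary operators always has an odd number of symbols.
    if len(symbols) % 2 == 0:
        return False
    for index, symbol in enumerate(symbols):
        expects_operator = index % 2 == 1
        if (symbol in OPERATORS) != expects_operator:
            return False
    return True


print(is_valid_expression(["Z", "=", "X", "+", "Y"]))  # True
print(is_valid_expression(["Z", "=", "+", "Y"]))       # False
```

Under this rule a unary minus, parentheses, or a trailing operator would all be rejected, which matches the limitation described above.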

---

## Note
This Project has been tested in the following environment:
* Python 3.6.9
* TensorFlow 1.12.3
* Keras 2.2.4
* OpenCV 3.4.2.16
* NumPy 1.17.4
* pandas 0.25.3