Ecosyste.ms: Awesome
An open API service indexing awesome lists of open source software.
https://github.com/the-black-knight-01/Tabulo
Table Detection and Extraction Using Deep Learning ( It is built in Python, using Luminoth, TensorFlow<2.0 and Sonnet.)
deep-learning detection faster-r-cnn luminoth ocr pdf-table-extraction python sonnet ssd table-data-extraction table-detection table-detection-using-deep-learning table-recognition tabulo tensorflow tesseract
Last synced: 3 months ago
- Host: GitHub
- URL: https://github.com/the-black-knight-01/Tabulo
- Owner: the-black-knight-01
- License: bsd-3-clause
- Created: 2019-08-29T13:29:43.000Z (about 5 years ago)
- Default Branch: master
- Last Pushed: 2022-11-24T18:54:27.000Z (almost 2 years ago)
- Last Synced: 2024-05-19T19:21:38.299Z (6 months ago)
- Topics: deep-learning, detection, faster-r-cnn, luminoth, ocr, pdf-table-extraction, python, sonnet, ssd, table-data-extraction, table-detection, table-detection-using-deep-learning, table-recognition, tabulo, tensorflow, tesseract
- Language: Python
- Homepage: https://interviewbubble.com
- Size: 10.6 MB
- Stars: 196
- Watchers: 11
- Forks: 40
- Open Issues: 15
Metadata Files:
- Readme: README.md
- License: LICENSE
Awesome Lists containing this project
- awesome-document-understanding - Tabulo - Table extraction from images (Resources)
README
[![Tabulo](https://github.com/interviewBubble/Tabulo/raw/master/docs/images/Tabulo_logo.png)](https://github.com/interviewBubble/Tabulo)
---
Tabulo is an open source toolkit for **computer vision**. Currently, we support table detection, but we are aiming for much more. It is built in Python, using [Luminoth](https://github.com/tryolabs/luminoth), [TensorFlow](https://www.tensorflow.org/) and [Sonnet](https://github.com/deepmind/sonnet).
### Table of Contents
1. **[Installation Instructions](#1-installation-instructions)**
2. **[Available APIs](#2-available-apis)**
3. **[Working with pretrained Models](#3-working-with-pretrained-models)**
4. **[Running Tabulo](#4-running-tabulo)**
5. **[Running Tabulo As Service](#5-running-tabulo-as-service)**
6. **[Supported models](#6-supported-models)**
7. **[Usage](#7-usage)**
8. **[Working with datasets](#8-working-with-datasets)**
9. **[Training](#9-training)**
10. **[LICENSE](#10-license)**
## 1. Installation Instructions
Tabulo currently supports Python 2.7 and 3.4–3.6.
### 1.1 Pre-requisites
To use Tabulo, [TensorFlow](https://www.tensorflow.org/install/) must be installed beforehand. For **GPU support**, install the GPU build with `pip install tensorflow-gpu`; otherwise install the CPU build with `pip install tensorflow`. Tabulo targets TensorFlow releases below 2.0, so pin the version if `pip` would otherwise install a newer release.
Tabulo uses Tesseract to extract data from the detected tables, so Tesseract must be installed as well. [Follow this link to install Tesseract](https://interviewbubble.com/install-tesseract-on-mac-linux-windows-centos/).
### 1.2 Installing Tabulo
First, clone the repo on your machine and then install with `pip`:
```bash
git clone https://github.com/interviewBubble/Tabulo.git
cd Tabulo
pip install -e .
```
### 1.3 Check that the installation worked
Simply run `tabulo --help`.
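If `tabulo --help` fails, it can help to confirm that both command-line tools from the steps above are on the `PATH` and that the interpreter is one of the listed versions. The following is a minimal sketch, not part of Tabulo: `check_environment` and `python_supported` are hypothetical helper names, and the executable names `tabulo` and `tesseract` are assumed to match a default installation. It needs Python 3.3 or newer for `shutil.which`.

```python
import shutil
import sys


def check_environment(tools=("tabulo", "tesseract")):
    """Map each tool name to its resolved path, or None if it is not on PATH."""
    return {tool: shutil.which(tool) for tool in tools}


def python_supported(version=sys.version_info):
    """True for the interpreter versions listed above: 2.7 and 3.4 to 3.6."""
    major, minor = version[0], version[1]
    return (major, minor) == (2, 7) or (major == 3 and 4 <= minor <= 6)


def report():
    """Print one line per check; purely informational."""
    for tool, path in check_environment().items():
        print("{}: {}".format(tool, path or "NOT FOUND on PATH"))
    print("Python version supported: {}".format(python_supported()))
```

Calling `report()` inside the environment where Tabulo was installed prints the resolved path of each tool, or a "NOT FOUND" marker when one is missing.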
## 2. Available APIs
* `localhost:5000/api/fasterrcnn/predict/` - Detect tables in an image
* `localhost:5000/api/fasterrcnn/extract/` - Extract the content of the detected tables
## 3. Working with pretrained Models:
* Download the pretrained model from [Google Drive](https://drive.google.com/drive/folders/1aUh9RfGn2XGgG2EtpKFh7P6PmcC3Q48z?usp=sharing)
* Unzip the archive and copy the downloaded `luminoth` folder into the `luminoth/utils/pretrained_models` folder
* List all checkpoints with this command: `tabulo checkpoint list`
* You will get output like this:
![Checkpoints](https://github.com/interviewBubble/Tabulo/raw/master/docs/images/Checkpoints.png)
* Now start the server with this command: `tabulo server web --checkpoint 6aac7a1e8a8e`
## 4. Running Tabulo
### 4.1 Running Tabulo as Web Server:
![Running Tabulo](https://github.com/interviewBubble/Tabulo/blob/master/docs/images/tabulo_server.png)
### 4.2 Example of Table Detection with Faster R-CNN By Tabulo:
![Example of Table Detection with Faster R-CNN By Tabulo](https://github.com/interviewBubble/Tabulo/blob/master/docs/images/table_detect.png)
### 4.3 Example of Table Data Extraction with tesseract By Tabulo:
![Example of Table Data Extraction with tesseract By Tabulo](https://github.com/interviewBubble/Tabulo/blob/master/docs/images/table_data_extract.png)
## 5. Running Tabulo As Service:
### 5.1 Using Curl command
Curl command to detect tables:
```bash
# -F sends the file as multipart/form-data; curl then sets the
# Content-Type header (including the boundary) on its own, so no
# manual Content-Type or Postman headers are needed.
curl -X POST \
  http://localhost:5000/api/fasterrcnn/predict/ \
  -F image=@/path/to/image/page_8-min.jpg
```
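The same request can be sent from Python. The sketch below is a minimal illustration and not part of Tabulo: it builds the multipart body by hand with the standard library so that no extra packages are required (it uses f-strings, so Python 3.6 or newer). `build_multipart` and `post_image` are hypothetical helper names. The URL and the `image` form field come from the curl command above, and the response is returned as raw text because its format is not documented here.

```python
import mimetypes
import urllib.request
import uuid
from pathlib import Path

# Endpoint from section 2; host and port assume a server running locally.
PREDICT_URL = "http://localhost:5000/api/fasterrcnn/predict/"


def build_multipart(field, filename, payload, boundary=None):
    """Encode one file as a multipart/form-data body.

    Returns (body_bytes, content_type_header_value).
    """
    boundary = boundary or uuid.uuid4().hex
    mime = mimetypes.guess_type(filename)[0] or "application/octet-stream"
    head = (
        f"--{boundary}\r\n"
        f'Content-Disposition: form-data; name="{field}"; filename="{filename}"\r\n'
        f"Content-Type: {mime}\r\n\r\n"
    ).encode("utf-8")
    tail = f"\r\n--{boundary}--\r\n".encode("utf-8")
    return head + payload + tail, f"multipart/form-data; boundary={boundary}"


def post_image(image_path, url=PREDICT_URL, timeout=60):
    """POST an image in the `image` form field and return the raw response text."""
    path = Path(image_path)
    body, content_type = build_multipart("image", path.name, path.read_bytes())
    request = urllib.request.Request(
        url, data=body, headers={"Content-Type": content_type}, method="POST"
    )
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return response.read().decode("utf-8", errors="replace")
```

With a Tabulo server running, `post_image("/path/to/image/page_8-min.jpg")` sends the page image to the detection endpoint. Passing `url="http://localhost:5000/api/fasterrcnn/extract/"` targets the extraction endpoint instead, assuming it accepts the same form field.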
### 5.2 With Postman
#### Header Section:
![Table Detection using Postman](https://github.com/interviewBubble/Tabulo/blob/master/docs/images/tabulo_resquest_header.png)
#### Data Section:
![Table Detection using Postman](https://github.com/interviewBubble/Tabulo/raw/master/docs/images/table_detect_API.png)
## 6. Supported models
Currently, we support the following models:
* **Object Detection**
  * [Faster R-CNN](https://arxiv.org/abs/1506.01497)
  * [SSD](https://arxiv.org/abs/1512.02325)

We also provide **pre-trained checkpoints** for the above models trained on popular datasets such as [COCO](http://cocodataset.org/) and [Pascal](http://host.robots.ox.ac.uk/pascal/VOC/).
## 7. Usage
There is one main command-line interface, which you use through the `tabulo` command. Whenever you are unsure how to do something, just type:
`tabulo --help`
and a list of available options with descriptions will show up.
## 8. Working with datasets
[Dataset for training your custom model](https://github.com/interviewBubble/Table-Detection-using-Deep-Learning/tree/master/data).
## 9. Training
See [Training your own model](https://github.com/interviewBubble/Table-Detection-using-Deep-Learning) to learn how to train locally or in Google Cloud.
## 10. LICENSE
Released under the [BSD 3-Clause](LICENSE) license.

---
# References
* https://github.com/Sargunan/Table-Detection-using-Deep-learning
* https://github.com/tryolabs/luminoth