https://github.com/vsymbol/CUTIE
CUTIE (TensorFlow implementation of Convolutional Universal Text Information Extractor)
computer-vision deep-learning text-extraction
- Host: GitHub
- URL: https://github.com/vsymbol/CUTIE
- Owner: vsymbol
- Created: 2019-01-15T06:18:27.000Z (over 7 years ago)
- Default Branch: master
- Last Pushed: 2022-12-08T05:25:04.000Z (over 3 years ago)
- Last Synced: 2024-11-03T09:33:38.102Z (over 1 year ago)
- Topics: computer-vision, deep-learning, text-extraction
- Language: Python
- Homepage:
- Size: 2.87 MB
- Stars: 154
- Watchers: 16
- Forks: 78
- Open Issues: 18
Metadata Files:
- Readme: README.md
# CUTIE
TensorFlow implementation of the paper "CUTIE: Learning to Understand Documents with Convolutional Universal Text Information Extractor."
Xiaohui Zhao [Paper Link](https://arxiv.org/abs/1903.12363v4)
----
CUTIE is an algorithm for 2D key information extraction from receipt-style documents. It can equally be viewed as 2D NER (Named Entity Recognition) or 2D slot filling.
Before training or running inference with CUTIE, use any OCR algorithm to detect and recognize the text in your scanned document images, then feed the resulting structured text into the CUTIE network. Refer to the CUTIE paper for details about the procedure.
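
To make the OCR-to-network step more concrete, here is a minimal sketch of placing OCR words onto a fixed-size grid by the center of each bounding box. It is an illustration only, not the repository's preprocessing code: the `Word` tuple layout, the `words_to_grid` helper, and the collision rule (joining words that land in the same cell) are all assumptions.

```python
from typing import List, Tuple

# Hypothetical OCR output: (text, x_min, y_min, x_max, y_max) in pixels.
Word = Tuple[str, float, float, float, float]


def words_to_grid(words: List[Word], img_w: float, img_h: float,
                  rows: int, cols: int) -> List[List[str]]:
    """Place each word in the grid cell that contains its box center."""
    grid = [["" for _ in range(cols)] for _ in range(rows)]
    for text, x0, y0, x1, y1 in words:
        cx, cy = (x0 + x1) / 2.0, (y0 + y1) / 2.0
        # Scale the center to grid coordinates and clamp to the last cell.
        r = min(int(cy / img_h * rows), rows - 1)
        c = min(int(cx / img_w * cols), cols - 1)
        # Assumed collision rule: join words that fall into the same cell.
        grid[r][c] = (grid[r][c] + " " + text).strip()
    return grid


if __name__ == "__main__":
    ocr_words = [
        ("TAXI", 10, 10, 60, 30),
        ("Total", 10, 80, 50, 100),
        ("12.50", 150, 80, 190, 100),
    ]
    for row in words_to_grid(ocr_words, img_w=200, img_h=100, rows=4, cols=4):
        print(row)
```

A coarser grid makes more words share a cell, while a finer grid leaves most cells empty, so the grid size is a trade-off to tune per document type.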
### Results
Results were evaluated on 4,484 receipt documents, including taxi receipts, meals entertainment receipts, and hotel receipts, with 9 different key information classes. Scores are reported as AP / softAP.
|Method | #Params | Taxi | Hotel |
| ----------|:---------:| :-----: | :-----: |
| CloudScan | - | 82.0 / - | 60.0 / - |
| BERT | 110M | 88.1 / - | 71.7 / - |
| CUTIE |**14M** |**94.0 / 97.3**|**74.6 / 87.0**|


### Installation & Usage
```
pip install -r requirements.txt
```
1. Generate your own dictionary with main_build_dict.py / main_data_tokenizer.py
2. Train your model with main_train_json.py
CUTIE achieves its best performance when rows/cols are well configured. For more insight, refer to the statistics in others/TrainingStatistic.xlsx.
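
As a rough picture of what step 1 produces, the sketch below builds a token-to-index dictionary from OCR text and encodes a string with it. It does not reproduce the logic of main_build_dict.py or main_data_tokenizer.py: the whitespace tokenization, the `[PAD]` and `[UNK]` entries, and the `build_dictionary` and `encode` helpers are assumptions made for illustration.

```python
from collections import Counter
from typing import Dict, Iterable, List


def build_dictionary(texts: Iterable[str], min_count: int = 1) -> Dict[str, int]:
    """Map each sufficiently frequent lowercase token to an integer id."""
    counts = Counter(tok for t in texts for tok in t.lower().split())
    # Assumed special entries: padding and unknown-token ids.
    vocab = {"[PAD]": 0, "[UNK]": 1}
    # Most frequent tokens first; ties broken alphabetically for determinism.
    for tok, n in sorted(counts.items(), key=lambda kv: (-kv[1], kv[0])):
        if n >= min_count:
            vocab[tok] = len(vocab)
    return vocab


def encode(text: str, vocab: Dict[str, int]) -> List[int]:
    """Convert a string to ids, falling back to [UNK] for unseen tokens."""
    return [vocab.get(tok, vocab["[UNK]"]) for tok in text.lower().split()]


if __name__ == "__main__":
    vocab = build_dictionary(["Total 12.50", "Total due"])
    print(vocab)
    print(encode("total tip", vocab))
```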

### Others
For information about the input example, refer to [issue discussion](https://github.com/vsymbol/CUTIE/issues/7).
- Apply any OCR tool that helps you detect and recognize words in the scanned document image.
- Label the OCR results of each image with key information classes, following the .json file in the invoice_data folder. (thanks to @4kssoft)
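
The labeling step above could be scripted along the following lines. This is a sketch under stated assumptions: the field names (`text_boxes`, `fields`, `field_name`, `value_id`, `value_text`) and the `make_label_record` helper are hypothetical, so check the sample .json in the invoice_data folder and the linked issue for the actual schema before relying on it.

```python
import json
from typing import Dict, List


def make_label_record(words: List[Dict], labels: Dict[str, List[int]]) -> Dict:
    """Attach key information classes to OCR words by word id.

    words:  OCR results, each with an "id", a "bbox", and a "text".
    labels: class name -> ids of the words that carry that class's value.
    """
    by_id = {w["id"]: w for w in words}
    fields = []
    for name, ids in labels.items():
        fields.append({
            "field_name": name,            # assumed key name
            "value_id": list(ids),         # assumed key name
            "value_text": [by_id[i]["text"] for i in ids],
        })
    return {"text_boxes": words, "fields": fields}


if __name__ == "__main__":
    ocr_words = [
        {"id": 1, "bbox": [10, 80, 50, 100], "text": "Total"},
        {"id": 2, "bbox": [150, 80, 190, 100], "text": "12.50"},
    ]
    record = make_label_record(ocr_words, {"TOTAL": [2]})
    print(json.dumps(record, indent=2))
```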