https://github.com/vishnunkumar/doc_transformers
Document processing using transformers
- Host: GitHub
- URL: https://github.com/vishnunkumar/doc_transformers
- Owner: Vishnunkumar
- License: mit
- Created: 2021-09-02T05:49:06.000Z (about 3 years ago)
- Default Branch: main
- Last Pushed: 2023-04-22T15:09:23.000Z (over 1 year ago)
- Last Synced: 2024-09-16T19:42:29.030Z (2 months ago)
- Topics: ai, ml, nlp, ocr
- Language: Python
- Homepage: https://vishnunkumar.github.io/doc_transformers/
- Size: 1.55 MB
- Stars: 19
- Watchers: 3
- Forks: 6
- Open Issues: 0
Metadata Files:
- Readme: README.md
- License: LICENSE
# Doc Transformers
Document processing using transformers. This project is still under development and currently supports only the extraction of form data, i.e. key-value pairs.

```bash
pip install -q doc-transformers
```

## Pre-requisites
Please install the following separately:

```bash
pip install pip --upgrade
pip install -q git+https://github.com/huggingface/transformers.git
pip install pyyaml==5.1
# workaround: install old version of pytorch since detectron2 hasn't released packages for pytorch 1.9 (issue: https://github.com/facebookresearch/detectron2/issues/3158)
pip install torch==1.8.0+cu101 torchvision==0.9.0+cu101 -f https://download.pytorch.org/whl/torch_stable.html
# install detectron2 that matches pytorch 1.8
# See https://detectron2.readthedocs.io/tutorials/install.html for instructions
pip install -q detectron2 -f https://dl.fbaipublicfiles.com/detectron2/wheels/cu101/torch1.8/index.html
```

## Implementation
```python
# loads the pretrained dataset also
from doc_transformers import parser

# loads the image and labels
image = parser.load_image(input_path_image)  # input_path_image: path to the input image
labels = parser.load_tags()

# loads the model
feature_extractor, processor, model = parser.load_models()

# gets the bounding boxes, predictions, extracted words and image processed
kp = parser.process_image(image, feature_extractor, processor, model, labels)
```

## Results
**Input & Output**
**Table**
- After saving to CSV, the result looks like the following:

| LABEL | TEXT |
| ----- | ---------------------------------- |
| title | CREDIT CARD VOUCHER ANY RESTAURANT |
| title | ANYWHERE |
| key | DATE: |
| value | 02/02/2014 |
| key | TIME: |
| value | 11:11 |
| key | CARD |
| key | TYPE: |
| value | MC |
| key | ACCT: |
| value | XXXX XXXX XXXX |
| value | 1111 |
| key | TRANS |
| key | KEY: |
| value | HYU8789798234 |
| key | AUTH |
| key | CODE: |
| value | 12345 |
| key | EXP |
| key | DATE: |
| value | XX/XX |
| key | CHECK: |
| value | 1111 |
| key | TABLE: |
| value | 11/11 |
| key | SERVER: |
| value | 34 |
| value | MONIKA |
| key | Subtotal: |
| value | $1969 |
| value | .69 |
| key | Gratuity: Total: |

## Code credits
[@HuggingFace](https://huggingface.co/)
- Please note that this project is still in the development phase and will be improved in the near future.
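
The table in the Results section splits some fields across several rows (for example `CARD` and `TYPE:` are two `key` rows, followed by the `value` row `MC`). As a rough illustration of how such rows could be grouped into key-value pairs, here is a minimal sketch. It is not part of `doc_transformers`: the function name `pair_keys_and_values` is hypothetical, and the input is assumed to be plain `(label, text)` tuples such as those read back from the saved CSV, since the structure of `kp` is not described here.

```python
from typing import Iterable, List, Tuple


def pair_keys_and_values(rows: Iterable[Tuple[str, str]]) -> List[Tuple[str, str]]:
    """Group consecutive 'key' rows with the 'value' rows that follow them.

    Hypothetical helper for illustration only; not part of doc_transformers.
    """
    pairs: List[Tuple[str, str]] = []
    key_parts: List[str] = []
    value_parts: List[str] = []
    for label, text in rows:
        if label == "key":
            # A key that arrives after values have been collected starts a new pair.
            if value_parts:
                pairs.append((" ".join(key_parts), " ".join(value_parts)))
                key_parts, value_parts = [], []
            key_parts.append(text)
        elif label == "value" and key_parts:
            value_parts.append(text)
        # Other labels (e.g. "title") and values with no preceding key are skipped.
    if key_parts:
        # Flush the last pair; the value is empty if no value rows followed the key.
        pairs.append((" ".join(key_parts), " ".join(value_parts)))
    return pairs


# A few rows copied from the Results table above.
rows = [
    ("title", "ANYWHERE"),
    ("key", "DATE:"),
    ("value", "02/02/2014"),
    ("key", "CARD"),
    ("key", "TYPE:"),
    ("value", "MC"),
    ("key", "ACCT:"),
    ("value", "XXXX XXXX XXXX"),
    ("value", "1111"),
]
print(pair_keys_and_values(rows))
```

The grouping relies only on row order, so it assumes the rows keep the reading order shown in the table. A key with no following value, such as the last row of the table, is returned with an empty value.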