https://github.com/jyothish-ram/invoice_ocr_api
Invoice OCR Extraction Flask API
flask-api gemma-2b nlp ocr ollama tensorflow tesseract
Last synced: 8 months ago
- Host: GitHub
- URL: https://github.com/jyothish-ram/invoice_ocr_api
- Owner: jyothish-ram
- Created: 2024-09-09T12:52:03.000Z (about 1 year ago)
- Default Branch: main
- Last Pushed: 2024-09-10T12:35:55.000Z (about 1 year ago)
- Last Synced: 2025-02-02T18:26:28.390Z (8 months ago)
- Topics: flask-api, gemma-2b, nlp, ocr, ollama, tensorflow, tesseract
- Language: Python
- Homepage:
- Size: 337 KB
- Stars: 2
- Watchers: 1
- Forks: 0
- Open Issues: 1
Metadata Files:
- Readme: README.md
README
# Invoice OCR API
This project extracts structured data from invoices or bills using OCR. It uses TensorFlow, Pytesseract, Ollama, and Gemma_2b.
## Identifiable parameters
`Company Name`,
`Company Address`,
`Customer Name`,
`Customer Address`,
`Invoice Number`,
`Invoice Date`,
`Due Date`,
`Description`,
`Quantity`,
`Unit Price`,
`Taxes`,
`Amount`,
`Total`

## Working
This project consists of three main parts:
1. TensorFlow model: finds the ROI (region of interest) in the invoice image. The ROI is passed to Tesseract.
2. Tesseract: Pytesseract (the OCR engine) extracts text from the ROI.
3. NLP: Gemma_2b is used with Ollama to correct the text extracted by Tesseract.

> [!NOTE]
> - The TensorFlow model is stored in the `models/saved_model` folder.

## Installation
> [!NOTE]
> - Run `python3 -m venv venv` to create a Python virtual environment.
> - Run `.\venv\Scripts\activate` to activate the venv on Windows, or `source venv/bin/activate` on Linux.
> - Run `pip install -r requirements.txt` to install the required packages.
> - Install Ollama as described in the Ollama documentation; on Linux, run `curl -fsSL https://ollama.com/install.sh | sh`.
> - Run `ollama run gemma2:2b` to download the Gemma_2b NLP model.
> - Run `sudo apt install tesseract-ocr` to install Tesseract on Linux; for Windows, see [Tesseract for Windows](https://tesseract-ocr.github.io/tessdoc/Compiling.html#windows).
> - Run `python app.py` to start the program.

### API Request Model
Sample API request (POST):
> - headers:

```
Content-Type: application/json
```

> - body:

```
{
  "image": "{image in base64 format}"
}
```

### API Response Model
Sample API Response:
```
{
"Company Name": "TEMPUSTIC CONSULTORIA TECNOLOGICA SL",
"Company Address": "C/ PIE DE ALTAR N° 7\n28229 VILLANUEVA DEL PARRDILLO\nMADRID",
"Customer Name": "SM TECNOLOGIA, S.L.U.",
"Customer Address": "Poligono Industrial Os Airios, Sector 2 - Parcela 4\n15320 As Pontes\nA Corufia",
"Invoice Number": "2023.11",
"Invoice Date": "31/05/2023",
"Due Date": null,
"Description": "Hora Programador Java Junior",
"Quantity": 30,
"Unit Price": 176.00,
"Taxes": 1108.80,
"Amount": 5280.00,
"Total": 6388.80
}
```
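
Putting the request and response models together, a client call might look like the sketch below. It uses only the Python standard library. The README does not state the route or port, so `API_URL` is an assumption (Flask's default development port, root path) and should be adjusted to match `app.py`. The sketch also assumes the `image` field carries plain base64 text without a `data:` URI prefix. The helper names `build_payload` and `extract_invoice` are hypothetical and are not part of the project.

```python
import base64
import json
import urllib.request

# Assumption: route and port are not documented in the README; adjust to match app.py.
API_URL = "http://localhost:5000/"


def build_payload(image_bytes: bytes) -> bytes:
    """Encode raw image bytes as the JSON body described in the request model."""
    encoded = base64.b64encode(image_bytes).decode("ascii")
    return json.dumps({"image": encoded}).encode("utf-8")


def extract_invoice(image_path: str, url: str = API_URL) -> dict:
    """POST an invoice image to the API and return the parsed JSON response."""
    with open(image_path, "rb") as fh:
        body = build_payload(fh.read())
    request = urllib.request.Request(
        url,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(request) as response:
        return json.loads(response.read().decode("utf-8"))
```

With the server running, `extract_invoice("invoice.png")` would return a dictionary keyed by the parameter names listed above, for example `result["Total"]`. Fields the API could not identify, such as `"Due Date"` in the sample response, arrive as JSON `null` and become `None` in Python, so callers should check for that before using a value.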