https://github.com/sushant1827/mistral-ocr-pdf-image
This workflow automates the extraction of structured information from PDFs and image files using the Mistral OCR API.
google-sheets http-requests mistral-ai n8n n8n-nodes n8n-workflow ocr-service ocr-text-reader openai-api text-classification
Last synced: 10 days ago
- Host: GitHub
- URL: https://github.com/sushant1827/mistral-ocr-pdf-image
- Owner: sushant1827
- Created: 2025-04-17T15:28:54.000Z (6 months ago)
- Default Branch: main
- Last Pushed: 2025-04-17T15:36:54.000Z (6 months ago)
- Last Synced: 2025-06-30T11:07:44.569Z (3 months ago)
- Topics: google-sheets, http-requests, mistral-ai, n8n, n8n-nodes, n8n-workflow, ocr-service, ocr-text-reader, openai-api, text-classification
- Homepage:
- Size: 357 KB
- Stars: 2
- Watchers: 1
- Forks: 0
- Open Issues: 0
Metadata Files:
- Readme: README.md
README
# Mistral OCR with n8n
This n8n workflow automates the extraction of structured information from PDFs and image files using the [Mistral OCR API](https://mistral.ai/). It's designed for use cases like extracting invoice details or restaurant receipts and storing them in Google Sheets.
---
## Features
- Upload **PDF invoices** through an n8n form
- Provide **image URLs** (e.g., receipts) for OCR
- Uses **Mistral OCR** for high-quality document text extraction
- Extracts specific fields using LangChain Information Extractor:
  - For PDFs:
    - `Invoice Number`
    - `Date`
    - `Gross Amount`
    - `Customer ID`
  - For images:
    - `Restaurant Name`
    - `Date`
    - `Total Bill Amount`
- Appends extracted data to a **Google Sheets document**
- Integrated OpenAI Chat node (optional) for further enrichment or validation
---
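The invoice and receipt fields listed under Features can be read as two small record schemas. As a rough illustration only (the workflow's real configuration lives in its n8n nodes and is not shown in this README), the Python sketch below models those field lists and flattens an extracted record into a spreadsheet row. The names `INVOICE_FIELDS`, `RECEIPT_FIELDS`, and `to_sheet_row` are hypothetical, and the column order is an assumption.

```python
# Hypothetical sketch: the field names come from the README; the column
# order and the helper name are assumptions, not part of the workflow.
INVOICE_FIELDS = ["Invoice Number", "Date", "Gross Amount", "Customer ID"]
RECEIPT_FIELDS = ["Restaurant Name", "Date", "Total Bill Amount"]


def to_sheet_row(extracted: dict, fields: list[str]) -> list[str]:
    """Order extracted values by column, leaving missing fields blank."""
    row = []
    for name in fields:
        value = extracted.get(name)
        row.append("" if value is None else str(value))
    return row


# Sample record with one field the extractor failed to find.
invoice = {"Invoice Number": "INV-001", "Date": "2025-04-17", "Gross Amount": 119.0}
row = to_sheet_row(invoice, INVOICE_FIELDS)
print(row)
```

Leaving a missing field as an empty cell keeps every appended row aligned with the sheet's header, which is one plausible way to handle partial extractions.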
## Workflow Nodes Overview
- **Form Trigger** - Collects PDF invoices from users
- **Set Node** - Allows testing with static image URLs
- **Mistral API Integration** - Handles:
  - File upload
  - Signed URL generation
  - OCR processing
- **LangChain Extractors** - Convert OCR'd text into structured fields
- **Google Sheets Node** - Writes the extracted information to a live spreadsheet
- **OpenAI Chat Node** - Optionally reviews or interprets the data
---
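The Mistral steps named in the node overview (file upload, signed URL generation, OCR processing) end in a single OCR request. The sketch below shows how that final request body might be built for a PDF versus an image, and how per-page text could be joined before field extraction. It assumes the request and response shapes described in Mistral's public API documentation (a POST to `/v1/ocr` with model `mistral-ocr-latest`, a `document_url` or `image_url` document type, and a `pages[].markdown` response); the helper names `build_ocr_payload` and `join_pages` are hypothetical, and the workflow's own node settings are not shown in this README.

```python
import json


def build_ocr_payload(url: str, kind: str) -> dict:
    """Build an OCR request body for a PDF ("document") or an image ("image")."""
    if kind not in ("document", "image"):
        raise ValueError(f"unsupported kind: {kind!r}")
    key = f"{kind}_url"  # "document_url" or "image_url" (assumed from Mistral's docs)
    return {
        "model": "mistral-ocr-latest",  # assumed model name
        "document": {"type": key, key: url},
    }


def join_pages(ocr_response: dict) -> str:
    """Concatenate the markdown of every page in an OCR response."""
    pages = ocr_response.get("pages", [])
    return "\n\n".join(page.get("markdown", "") for page in pages)


# Placeholder signed URL and a hand-written sample response (not real API output).
payload = build_ocr_payload("https://example.com/signed/invoice.pdf", "document")
print(json.dumps(payload, indent=2))

sample_response = {
    "pages": [
        {"index": 0, "markdown": "# Invoice"},
        {"index": 1, "markdown": "Total: 119.00"},
    ]
}
print(join_pages(sample_response))
```

Sending the payload would also need an `Authorization: Bearer <API key>` header; the network call is left out so the example runs offline.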
## PDF Workflow

---
## Image Workflow

---
## Requirements
- n8n setup (local or cloud)
- Mistral API key with OCR access
- Google Sheets API credentials
- OpenAI API key
---
## Example Use Cases
- Automating expense reports
- Extracting invoice metadata for accounting
- Digitizing restaurant receipts for tax documentation
---
> Built using [n8n](https://n8n.io) and [Mistral OCR](https://mistral.ai/).