https://github.com/karthik-saiharsh/distributed-ocr
DOCR: A local-first, distributed Optical Character Recognition (OCR) platform built with Go, React, and Wails.
- Host: GitHub
- URL: https://github.com/karthik-saiharsh/distributed-ocr
- Owner: karthik-saiharsh
- Created: 2026-02-13T04:11:47.000Z (4 months ago)
- Default Branch: main
- Last Pushed: 2026-03-06T08:49:52.000Z (3 months ago)
- Last Synced: 2026-03-06T12:44:21.064Z (3 months ago)
- Topics: distributed-systems, go, ocr, queuing, tesseract, tesseract-ocr, text-recognition, wails, work-stealing
- Language: Go
- Homepage:
- Size: 1.17 MB
- Stars: 0
- Watchers: 0
- Forks: 1
- Open Issues: 0
Metadata Files:
- Readme: README.md
# DOCR: Privacy-First Distributed OCR Grid
A local-first, distributed Optical Character Recognition (OCR) platform built with Go, React, and Wails.
## Overview
Organizations that digitize physical archives face a major bottleneck: processing tens of thousands of high-resolution pages is computationally expensive and slow on a single machine. Cloud services such as AWS Textract raise data privacy concerns (for example under HIPAA or GDPR) and require constant internet connectivity.
**DOCR** addresses this by building a **decentralized compute grid** from ad-hoc local devices (laptops, desktops) already present in your office. It uses a **Master-Worker topology** over a Local Area Network (LAN) to distribute OCR tasks securely and privately, combining work stealing, gossip-based peer discovery, and redundant result verification.
---
## Key Features
* **Privacy-First & Local:** Zero cloud dependency. Sensitive documents (medical records, legal contracts) never leave your local network. Air-gap friendly.
* **Dynamic Load Balancing (Work Stealing):** Idle worker nodes proactively "steal" tasks from busy nodes via direct P2P RPC, ensuring maximum CPU utilization across the cluster.
* **Autonomic Peer Discovery (SWIM Gossip):** Nodes dynamically form a mesh network via UDP multicast gossip. If a laptop is closed or disconnects, the cluster self-heals without task loss.
* **Result Verification (Consensus):** Implements redundant execution. Multiple workers process the same chunk and the Master verifies consensus to defend against malicious nodes or corrupted processing.
* **Cross-Platform GUI:** A sleek interface built with React, Vite, and Wails, giving a native desktop feel on Mac, Windows, and Linux.
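The result-verification feature above can be sketched as a strict-majority vote over the texts returned for one chunk. This is a minimal illustration, not the project's actual code: the function name `majorityResult` and the use of exact string equality are assumptions, and a real implementation might compare hashes or normalize whitespace first.

```go
package main

import "fmt"

// majorityResult is a hypothetical helper: it returns the OCR text that a
// strict majority (more than half) of the redundant workers agree on.
// ok is false when no single text reaches a strict majority.
func majorityResult(results []string) (text string, ok bool) {
	counts := make(map[string]int)
	for _, r := range results {
		counts[r]++
		if counts[r]*2 > len(results) {
			return r, true
		}
	}
	return "", false
}

func main() {
	// Three workers processed the same chunk; one returned corrupted text.
	text, ok := majorityResult([]string{"Invoice 42", "Invoice 42", "Inv0ice 4Z"})
	fmt.Println(text, ok) // prints "Invoice 42 true"
}
```

With only two disagreeing results there is no majority, so a Master following this scheme would have to schedule the chunk on an additional worker before accepting it.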
---
## System Architecture
*Figure: system architecture diagram.*
The architecture relies on high-performance concurrent processing in Go and robust networking protocols:
1. **Master Node (Orchestrator):** Runs the Wails GUI, manages the Global Job Queue, parses multi-page PDFs locally, and validates the integrity of returned OCR data.
2. **Worker Nodes:** Stateless compute units running the Tesseract CGO wrapper. Each worker keeps a local double-ended queue (deque): the owner pops tasks in LIFO order for cache locality, while idle peers steal from the opposite end in FIFO order.
*See [`explanation.md`](explanation.md) for a deep dive into the network topologies and data flow.*
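The worker deque described above can be illustrated with a small mutex-guarded slice. This is a sketch under assumptions: the `Task` and `Deque` types and their method names are hypothetical, and the real `/worker` package may use a different (for example lock-free) structure.

```go
package main

import (
	"fmt"
	"sync"
)

// Task is a hypothetical unit of work: one page of one job.
type Task struct {
	JobID string
	Page  int
}

// Deque is a mutex-guarded double-ended queue of tasks.
type Deque struct {
	mu    sync.Mutex
	tasks []Task
}

// Push adds a task at the tail, which is the owner's end.
func (d *Deque) Push(t Task) {
	d.mu.Lock()
	defer d.mu.Unlock()
	d.tasks = append(d.tasks, t)
}

// PopLocal removes the newest task (LIFO) for the owning worker.
func (d *Deque) PopLocal() (Task, bool) {
	d.mu.Lock()
	defer d.mu.Unlock()
	n := len(d.tasks)
	if n == 0 {
		return Task{}, false
	}
	t := d.tasks[n-1]
	d.tasks = d.tasks[:n-1]
	return t, true
}

// Steal removes the oldest task (FIFO) on behalf of an idle peer.
func (d *Deque) Steal() (Task, bool) {
	d.mu.Lock()
	defer d.mu.Unlock()
	if len(d.tasks) == 0 {
		return Task{}, false
	}
	t := d.tasks[0]
	d.tasks = d.tasks[1:]
	return t, true
}

func main() {
	d := &Deque{}
	for p := 1; p <= 3; p++ {
		d.Push(Task{JobID: "scan", Page: p})
	}
	local, _ := d.PopLocal()
	stolen, _ := d.Steal()
	fmt.Println(local.Page, stolen.Page) // prints "3 1"
}
```

Taking from opposite ends means the owner and a thief only compete for the same task when a single task is left, which is the usual motivation for this layout.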
---
## Prerequisites
To run or develop DOCR, ensure you have the following installed:
1. **[Go](https://go.dev/doc/install)** (1.20+)
2. **[Node.js & npm](https://nodejs.org/en/)**
3. **[Wails Setup](https://wails.io/docs/gettingstarted/installation)**
4. **Tesseract OCR:** Required on each machine for the core engine:
* **macOS:** `brew install tesseract`
* **Linux (Ubuntu):** `sudo apt-get install tesseract-ocr libtesseract-dev`
* **Windows:** UB-Mannheim Tesseract installer
*(Note: We use `github.com/gen2brain/go-fitz` for cross-platform PDF handling).*
---
## Development Setup
1. **Clone the Repository:**
```bash
git clone https://github.com/karthik-saiharsh/distributed-ocr.git
cd distributed-ocr
```
2. **Frontend Setup:**
The project uses a React/Vite frontend located in `/frontend`.
```bash
cd frontend
npm install
npm run dev
```
3. **Backend/App Setup:**
The main Wails application is bound in `app.go`. To run the application in development mode with hot-reloading:
```bash
# From the project root
wails dev
```
*Linux users:* On distributions that ship WebKitGTK 4.1 instead of 4.0, run `wails dev -tags webkit2_41`.
---
## Running the Distributed Cluster
To see the distributed work stealing and gossip protocols in action across physical machines:
1. **LAN Connection:** Connect multiple computers (e.g., Laptop A and Laptop B) to the same local network (the same Wi-Fi access point or router).
2. **Build the Release:**
Compile the app for production on both machines:
```bash
wails build
```
3. **Launch Nodes:** Open the compiled app executable (found in `build/bin/`) on both computers.
4. **Discover Peers:** On Laptop A (your designated Master), click **Scan For Nodes**. The UDP gossip protocol will automatically discover Laptop B's IP address.
5. **Distribute Work:** Click **Upload Document** on Laptop A and select a large PDF.
6. **Watch the Magic:** Laptop A chunks the PDF and distributes the chunks via RPC. Laptop B runs OCR with its local Tesseract instance and returns the recognized text to Laptop A, which verifies it and stitches it back together!
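The chunk-and-stitch flow in the last step can be sketched as follows. The names `chunkPages`, `stitch`, and `PageResult`, the fixed chunk size, and the newline page separator are all assumptions made for illustration; the sketch leaves out PDF parsing and the RPC transport entirely.

```go
package main

import (
	"fmt"
	"sort"
	"strings"
)

// PageResult is a hypothetical worker reply: the text recognized on one page.
type PageResult struct {
	Page int
	Text string
}

// chunkPages splits pages 1..total into contiguous chunks of at most size pages.
func chunkPages(total, size int) [][]int {
	if size <= 0 {
		return nil
	}
	var chunks [][]int
	for start := 1; start <= total; start += size {
		end := start + size - 1
		if end > total {
			end = total
		}
		chunk := make([]int, 0, end-start+1)
		for p := start; p <= end; p++ {
			chunk = append(chunk, p)
		}
		chunks = append(chunks, chunk)
	}
	return chunks
}

// stitch orders results by page number, since workers may reply in any
// order, and joins the page texts with newlines.
func stitch(results []PageResult) string {
	sorted := append([]PageResult(nil), results...) // leave the input untouched
	sort.Slice(sorted, func(i, j int) bool { return sorted[i].Page < sorted[j].Page })
	parts := make([]string, len(sorted))
	for i, r := range sorted {
		parts[i] = r.Text
	}
	return strings.Join(parts, "\n")
}

func main() {
	fmt.Println(chunkPages(5, 2)) // prints "[[1 2] [3 4] [5]]"
	fmt.Println(stitch([]PageResult{{Page: 2, Text: "world"}, {Page: 1, Text: "hello"}}))
}
```

Carrying the page number in every reply lets the Master reassemble the document regardless of which node finished first or whether a chunk was stolen along the way.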
---
## Repository Structure
* `/frontend` - React, TypeScript, Vite frontend source.
* `/master` - Orchestrator logic, consensus verification, and job queuing.
* `/worker` - Node executor, Task Deque, and Tesseract C/Go bindings.
* `/swim` - Custom UDP Gossip and discovery protocol implementation.
* `/rpc` - Protobuf/TCP communication interfaces for task assignments and work stealing.
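To give a feel for what a package like `/swim` has to track, here is a heavily simplified failure-detector sketch. It is not the project's protocol: it counts silent protocol periods per peer and omits SWIM's indirect probes, gossip dissemination, and all UDP networking. The type and method names, the thresholds, and the example address are assumptions, and the struct is not safe for concurrent use.

```go
package main

import "fmt"

// State is a peer's liveness as seen by the local node.
type State int

const (
	Alive State = iota
	Suspect
	Dead
)

// Membership records, per peer address, the last protocol period in which
// the peer acknowledged a probe. All names here are illustrative.
type Membership struct {
	lastAck      map[string]int
	suspectAfter int // silent periods before a peer becomes Suspect
	deadAfter    int // silent periods before a peer becomes Dead
}

// NewMembership creates an empty membership table with the given thresholds.
func NewMembership(suspectAfter, deadAfter int) *Membership {
	return &Membership{
		lastAck:      make(map[string]int),
		suspectAfter: suspectAfter,
		deadAfter:    deadAfter,
	}
}

// Ack records that addr answered a probe during the given period.
func (m *Membership) Ack(addr string, period int) {
	m.lastAck[addr] = period
}

// StateOf classifies a peer by how many periods it has been silent.
// Peers that never acknowledged a probe are reported as Dead.
func (m *Membership) StateOf(addr string, period int) State {
	last, known := m.lastAck[addr]
	if !known {
		return Dead
	}
	silent := period - last
	switch {
	case silent >= m.deadAfter:
		return Dead
	case silent >= m.suspectAfter:
		return Suspect
	default:
		return Alive
	}
}

func main() {
	m := NewMembership(2, 5)
	peer := "192.168.1.20:7946" // hypothetical worker address
	m.Ack(peer, 0)
	for _, p := range []int{1, 3, 6} {
		fmt.Println(p, m.StateOf(peer, p))
	}
}
```

The intermediate Suspect state gives a briefly unreachable laptop a chance to answer before its tasks are reassigned, at the cost of slower detection of real failures.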
---
## Contributing
We welcome pull requests!
1. Create a new branch for your feature (`git checkout -b feature/nice-feature`).
2. Make your backend changes in Go or frontend changes inside `/frontend`.
3. Please do **not** commit to `main` directly.
4. Submit a PR!
## License
GNU GPL v3
## Created By
[@karthik-saiharsh](https://www.github.com/karthik-saiharsh), [@Adith1207](https://www.github.com/Adith1207), [@Dharsh045](https://www.github.com/Dharsh045), [@RoshJ-17](https://www.github.com/RoshJ-17)