https://github.com/n-jangra/img-ocr
This repository is a Go wrapper for Tesseract OCR using the gosseract package. It enables developers to easily integrate Optical Character Recognition (OCR) functionality into their Go applications, leveraging the power of Tesseract for text extraction from images.
golang gotemplate html-css img ocr-text-reader
Last synced: 8 months ago
- Host: GitHub
- URL: https://github.com/n-jangra/img-ocr
- Owner: N-Jangra
- License: gpl-3.0
- Created: 2025-05-11T17:33:04.000Z (10 months ago)
- Default Branch: main
- Last Pushed: 2025-05-11T17:39:51.000Z (10 months ago)
- Last Synced: 2025-05-13T00:55:39.104Z (10 months ago)
- Topics: golang, gotemplate, html-css, img, ocr-text-reader
- Language: Go
- Homepage:
- Size: 27.3 KB
- Stars: 0
- Watchers: 1
- Forks: 0
- Open Issues: 0
Metadata Files:
- Readme: readme.md
- License: LICENSE
README
## Go OCR Web App
A simple web application built with **Golang** that allows users to:
* Upload or drag-and-drop an image
* Extract text from the image using **OCR (Optical Character Recognition)**
* View the uploaded image and extracted text in the browser
* Copy the text manually (no JavaScript required)
---
## What It Does
This application provides a web interface to upload image files (JPG, PNG, etc.). It uses the **Tesseract OCR engine** via the `gosseract` Go library to process the image and extract any readable text.
After submission:
1. The image is uploaded and stored in the `uploads/` directory.
2. OCR is performed server-side.
3. The webpage reloads and displays:
   * The uploaded image
   * The extracted text, which can be selected and copied
No client-side scripting (JavaScript) is required for image processing or rendering.
---
## Features
* Upload image through form
* Displays uploaded image
* Extracts text using Tesseract
* Displays extracted text
* Copy/paste manually
---
## Requirements
### 1. System Dependencies
You must have **Tesseract OCR** and its dependencies (Leptonica) installed:
#### Linux (Ubuntu/Debian)
```bash
sudo apt update
sudo apt install -y tesseract-ocr libtesseract-dev libleptonica-dev
```
#### macOS (with Homebrew)
```bash
brew install tesseract
```
#### Windows
* Download the installer from: [https://github.com/UB-Mannheim/tesseract/wiki](https://github.com/UB-Mannheim/tesseract/wiki)
* Add the installation path (e.g., `C:\Program Files\Tesseract-OCR`) to your system `PATH`
* Set `TESSDATA_PREFIX` to `C:\Program Files\Tesseract-OCR\tessdata`
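
On any of these platforms, a quick way to confirm the install is to check that the `tesseract` binary is reachable. The sketch below does this from Go; `findBinary` is a hypothetical helper name. Finding the command-line tool is only a rough signal, since `gosseract` links against the Tesseract library through cgo and does not call the binary.

```go
package main

import (
	"fmt"
	"os"
	"os/exec"
)

// findBinary reports where an executable lives on PATH,
// or returns an error if it cannot be found.
func findBinary(name string) (string, error) {
	return exec.LookPath(name)
}

func main() {
	path, err := findBinary("tesseract")
	if err != nil {
		fmt.Println("tesseract not found on PATH:", err)
		return
	}
	fmt.Println("tesseract found at", path)

	// TESSDATA_PREFIX is not required on every install; it is reported for information only.
	if dir, ok := os.LookupEnv("TESSDATA_PREFIX"); ok {
		fmt.Println("TESSDATA_PREFIX =", dir)
	} else {
		fmt.Println("TESSDATA_PREFIX is not set")
	}
}
```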
---
### 2. Go Dependencies
```bash
# Fetch the Go binding for Tesseract (uses cgo, so the system packages above must be installed first)
go get github.com/otiai10/gosseract/v2
```
---
## 🗂 Project Structure
```
go-ocr-app/
├── cmd/
│   └── server/
│       └── main.go            # Entry point
├── internal/
│   ├── handlers/
│   │   └── upload.go          # Form and OCR logic
│   └── templates/
│       └── index.gohtml       # HTML rendering
├── static/
│   └── style.css              # CSS styles
├── uploads/                   # Uploaded image files
├── go.mod
└── go.sum
```
---
## How to Run
### 1. Clone the Repo (or create the structure)
```bash
git clone https://github.com/N-Jangra/img-ocr
cd img-ocr
```
### 2. Initialize Go Module
```bash
go mod tidy
```
### 3. Run the Server
```bash
go run main.go
# If main.go sits under cmd/server/ as in the project structure above, run: go run ./cmd/server
```
### 4. Open in Your Browser
Go to: [http://localhost:8080](http://localhost:8080)
---
## How It Works
* The HTML form uploads the image via `POST /upload`
* Go handles the request:
  * Saves the image in `uploads/`
  * Uses `gosseract` to extract text
  * Renders the same HTML template with image and text data
* The `uploads/` folder is served statically, so the image preview is just an `<img>` tag pointing to the file under `/uploads/`
---
## License
This project is open-source and free to use under the [GPL-3.0 License](LICENSE).