{"id":26754940,"url":"https://github.com/pcisha/ocr-opencv-tesseract","last_synced_at":"2025-03-28T14:17:19.346Z","repository":{"id":275918092,"uuid":"927596792","full_name":"pcisha/ocr-opencv-tesseract","owner":"pcisha","description":"OCR Image to Text Converter using OpenCV and Tesseract","archived":false,"fork":false,"pushed_at":"2025-02-05T08:53:39.000Z","size":6910,"stargazers_count":0,"open_issues_count":0,"forks_count":0,"subscribers_count":1,"default_branch":"main","last_synced_at":"2025-02-05T09:32:46.284Z","etag":null,"topics":[],"latest_commit_sha":null,"homepage":null,"language":"Java","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":null,"status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/pcisha.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":null,"code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null}},"created_at":"2025-02-05T08:14:05.000Z","updated_at":"2025-02-05T08:53:42.000Z","dependencies_parsed_at":"2025-02-05T09:32:52.967Z","dependency_job_id":"915d78bc-4db8-4fc1-a26b-243131ccc051","html_url":"https://github.com/pcisha/ocr-opencv-tesseract","commit_stats":null,"previous_names":["pcisha/ocr-opencv-tesseract"],"tags_count":0,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/pcisha%2Focr-opencv-tesseract","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/pcisha%2Focr-opencv-tesseract/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/pcisha%2Focr-opencv-tesseract/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/pcisha%2Focr-opencv-tesseract/manifests","owner_url":"https://repos.ecosyste.ms/api/v
1/hosts/GitHub/owners/pcisha","download_url":"https://codeload.github.com/pcisha/ocr-opencv-tesseract/tar.gz/refs/heads/main","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":246042016,"owners_count":20714148,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":[],"created_at":"2025-03-28T14:17:18.792Z","updated_at":"2025-03-28T14:17:19.338Z","avatar_url":"https://github.com/pcisha.png","language":"Java","funding_links":[],"categories":[],"sub_categories":[],"readme":"# 🖼️ OCR Image Processing Web Application\n\nA Java Spring Boot web application that allows users to upload images (screenshots, scanned documents, etc.) 
and extract text using Tesseract OCR with advanced image preprocessing powered by OpenCV.\n\n### 🚀 Features\n\n- 📤 Upload Images: Supports multiple image formats (JPG, PNG, BMP).\n\n- 🔍 Accurate OCR: Preprocesses images for high-accuracy text extraction.\n\n- 📏 File Size Validation: Accepts uploads up to 5 MB by default; the limit is configurable.\n\n- 🧪 Error Handling: Graceful error handling for large files, invalid images, and processing failures.\n\n- ⚙️ Dynamic Configurations: Easily configure library paths, file size limits, and other properties.\n\n### ⚙️ Technologies Used\n\n- Java 17 (or higher)\n- Spring Boot (REST API)\n- OpenCV (Image Preprocessing)\n- Tesseract OCR (Text Extraction)\n- Maven (Dependency Management)\n\n### 🚀 Getting Started\n\n##### 1️⃣ Prerequisites\n\n- Java 17+\n\n- Maven\n\n- OpenCV with Java Bindings\n\n- Tesseract OCR\n\n##### 2️⃣ Clone the Repository\n\n`git clone https://github.com/pcisha/ocr-opencv-tesseract.git`\n\n`cd ocr-opencv-tesseract`\n\n##### 3️⃣ Install Dependencies\n\n`mvn clean install`\n\n##### 4️⃣ Run the Application\n\n`mvn spring-boot:run`\n\n##### 5️⃣ API Endpoint\n\n`POST /api/ocr/upload`\n\n`Content-Type: multipart/form-data`\n\n### 📄 Configuration\n\nEdit `src/main/resources/application.properties`:\n\n#### File Upload Settings\n\n`max.file.size.mb=5`\n\n#### OpenCV and Tesseract Paths\n\nThe values below are macOS (Homebrew) examples; adjust them to match your installation.\n\nOpenCV: `opencv.library.path=/usr/local/share/java/opencv4/libopencv_java4120.dylib`\n\nTesseract: `jna.library.path=/opt/homebrew/Cellar/tesseract/5.5.0/lib`\n\n### 🚩 Error Handling\n\n- 413 Payload Too Large: Returned when a file exceeds the configured size limit (5 MB by default).\n\n- 400 Bad Request (Invalid Image): Returned when the uploaded file is not a valid image.\n\n- 500 Internal Server Error: Returned for unexpected failures during processing.\n\n#\nDate: February 5, 2025\n\nAuthor: Prachi Shah @ https://pcisha.my.canva.site/\n\nP.S. The default copyright laws apply.","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fpcisha%2Focr-opencv-tesseract","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fpcisha%2Focr-opencv-tesseract","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fpcisha%2Focr-opencv-tesseract/lists"}