https://github.com/ferbcn/pytextractor

Extract text with OCR from images and pdf-image files

pyqt5 python tesseract tesseract-ocr

Host: GitHub
URL: https://github.com/ferbcn/pytextractor
Owner: ferbcn
Created: 2019-10-31T10:00:15.000Z (over 6 years ago)
Default Branch: master
Last Pushed: 2019-10-31T13:05:07.000Z (over 6 years ago)
Last Synced: 2025-06-09T20:38:06.461Z (about 1 year ago)
Topics: pyqt5, python, tesseract, tesseract-ocr
Language: Python
Homepage:
Size: 18.6 KB
Stars: 0
Watchers: 1
Forks: 0
Open Issues: 0
Metadata Files:
- Readme: Readme.MD

README

# Installation

pip install -r requirements.txt

Install the tesseract-ocr binaries (and, if needed, add them to PATH).

Install the Tesseract language packages (and, if needed, add the tessdata parent directory to PATH).

Windows requires Poppler for Windows (Poppler is included in most Linux distributions). You'll need to add its binaries folder to PATH.
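
Since several of the steps above depend on PATH, a quick check that the external tools can be found may help. This is a minimal sketch, not part of the project: the helper `find_missing_tools` is hypothetical, and the executable names are assumptions (`tesseract` is the usual name of the Tesseract command-line program, and `pdftoppm` is one of the Poppler utilities).

```python
import shutil

# Assumed executable names: "tesseract" for the Tesseract CLI and
# "pdftoppm" for Poppler. Adjust them if your installation differs.
REQUIRED_TOOLS = ("tesseract", "pdftoppm")


def find_missing_tools(tools=REQUIRED_TOOLS, which=shutil.which):
    """Return the names in `tools` that cannot be located on PATH."""
    return [name for name in tools if which(name) is None]


if __name__ == "__main__":
    missing = find_missing_tools()
    if missing:
        print("Not found on PATH:", ", ".join(missing))
    else:
        print("All external tools were found on PATH.")
```

If a tool is reported as missing, add its binaries folder to PATH and run the check again in a new terminal session.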