https://github.com/ferbcn/pytextractor
Extract text with OCR from images and pdf-image files
pyqt5 python tesseract tesseract-ocr
- Host: GitHub
- URL: https://github.com/ferbcn/pytextractor
- Owner: ferbcn
- Created: 2019-10-31T10:00:15.000Z (over 6 years ago)
- Default Branch: master
- Last Pushed: 2019-10-31T13:05:07.000Z (over 6 years ago)
- Last Synced: 2025-06-09T20:38:06.461Z (12 months ago)
- Topics: pyqt5, python, tesseract, tesseract-ocr
- Language: Python
- Homepage:
- Size: 18.6 KB
- Stars: 0
- Watchers: 1
- Forks: 0
- Open Issues: 0
Metadata Files:
- Readme: Readme.MD
README
# Installation
- Install the Python dependencies: `pip install -r requirements.txt`
- Install the tesseract-ocr binaries. You may also need to add them to PATH.
- Install the Tesseract language packages. You may also need to add the path to the parent directory of `tessdata`.
- On Windows, install Poppler for Windows and add its binaries folder to PATH. Most Linux distributions already include Poppler.
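
The PATH requirements above can be checked with a short script before running the application. This is a minimal sketch, not part of the project. The executable names are assumptions: `tesseract` is the usual name of the Tesseract command-line binary, and `pdftoppm` is one of the tools shipped with Poppler. The helper `find_missing_tools` is a hypothetical name used for illustration.

```python
import shutil

# Assumed executable names: "tesseract" is the usual Tesseract CLI binary,
# "pdftoppm" is one of the command-line tools shipped with Poppler.
REQUIRED_TOOLS = ("tesseract", "pdftoppm")


def find_missing_tools(tools=REQUIRED_TOOLS):
    """Return the names of tools that cannot be located on PATH."""
    return [name for name in tools if shutil.which(name) is None]


if __name__ == "__main__":
    missing = find_missing_tools()
    if missing:
        print("Not found on PATH:", ", ".join(missing))
    else:
        print("All required tools were found on PATH.")
```

If a tool is reported as missing, add the folder containing its binaries to PATH and run the check again.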