Ecosyste.ms: Awesome
An open API service indexing awesome lists of open source software.
https://github.com/itext/itext-pdfocr-java
pdfOCR is an iText 7 add-on to recognize and extract text in scanned documents and images. It can also convert them into fully ISO-compliant PDF or PDF/A-3u files that are accessible, searchable, and suitable for archiving.
Last synced: 3 months ago
- Host: GitHub
- URL: https://github.com/itext/itext-pdfocr-java
- Owner: itext
- License: other
- Created: 2020-06-16T10:16:55.000Z (over 4 years ago)
- Default Branch: develop
- Last Pushed: 2024-10-31T05:40:14.000Z (3 months ago)
- Last Synced: 2024-10-31T06:23:26.018Z (3 months ago)
- Topics: archival, character, data, diacritic, extractable, glyphs, hindi, image, iso-compliant, ligatures, mandarin, ocr, optical, pdf, portuguese, recognition, scan, searchable, spanish, tesseract
- Language: Java
- Homepage: https://itextpdf.com/en/products/itext-7/pdfocr
- Size: 266 MB
- Stars: 30
- Watchers: 11
- Forks: 8
- Open Issues: 3
Metadata Files:
- Contributing: CONTRIBUTING.md
- License: LICENSE.md
- Security: SECURITY.md