https://github.com/ryanfb/ancientgreekocr-grctestfodder
'grctestfodder' repository from http://ancientgreekocr.org/. Ancient Greek page scans and ground truth text for testing OCR accuracy.
- Host: GitHub
- URL: https://github.com/ryanfb/ancientgreekocr-grctestfodder
- Owner: ryanfb
- Created: 2014-11-19T16:37:53.000Z (over 11 years ago)
- Default Branch: master
- Last Pushed: 2014-11-25T23:37:23.000Z (over 11 years ago)
- Last Synced: 2023-04-11T17:41:03.396Z (about 3 years ago)
- Size: 9.66 MB
- Stars: 1
- Watchers: 1
- Forks: 1
- Open Issues: 0
Metadata Files:
- Readme: README
README
A collection of page scans and corresponding text files of Ancient Greek.
These files are designed for use in testing OCR quality, using the tools from https://gitorious.org/ancient-greek-training-for-tesseract/ocr-evaluation-tools, in particular the tessaccsummary script.
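To give a rough idea of what "testing OCR quality" can mean in practice, here is a minimal sketch of a character-level accuracy measure based on Levenshtein edit distance. This is not the tessaccsummary script and makes no claim about how that script computes its figures; the function names `edit_distance` and `character_accuracy` are hypothetical, and normalising both strings to NFC first is an assumption (it keeps precomposed and decomposed Greek accents from counting as errors).

```python
import unicodedata


def edit_distance(a, b):
    """Levenshtein distance between two strings (insert, delete, substitute)."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def character_accuracy(ground_truth, ocr_output):
    """Return 1 - (edit distance / ground-truth length), clamped at 0.

    Assumption: both strings are NFC-normalised before comparison.
    """
    gt = unicodedata.normalize("NFC", ground_truth)
    out = unicodedata.normalize("NFC", ocr_output)
    if not gt:
        return 1.0 if not out else 0.0
    return max(0.0, 1.0 - edit_distance(gt, out) / len(gt))
```

In use, `ground_truth` would be the contents of a .txt file from this collection and `ocr_output` the text an OCR engine produced for the matching .png.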
The naming of the files is quite straightforward:
.png - the page scan
.txt - the correct UTF-8 encoded text corresponding to the page scan
.src - a text file describing the provenance of the page scan
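The naming scheme above suggests that the three files for one page can be grouped by their shared base name. The following is a minimal sketch under that assumption (that a scan, its text, and its provenance file differ only in extension); `collect_pages` and `read_ground_truth` are hypothetical helper names, not part of the repository or of the evaluation tools.

```python
from pathlib import Path


def collect_pages(root):
    """Group .png/.txt/.src files in `root` by base name.

    Returns {stem: {".png": Path, ".txt": Path, ".src": Path}} and keeps
    only stems that have both a page scan and a ground-truth text file.
    Assumption: files for one page share the same base name.
    """
    pages = {}
    for path in sorted(Path(root).iterdir()):
        if path.suffix in {".png", ".txt", ".src"}:
            pages.setdefault(path.stem, {})[path.suffix] = path
    return {stem: files for stem, files in pages.items()
            if ".png" in files and ".txt" in files}


def read_ground_truth(files):
    """Read the UTF-8 ground-truth text for one page entry."""
    return files[".txt"].read_text(encoding="utf-8")
```

Each entry returned by `collect_pages` gives the scan to feed to an OCR engine and the text to compare its output against; the .src entry, when present, records where the scan came from.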