https://github.com/wollmers/ocr-gt-austriannewspapers-scripts


# Scripts for AustrianNewspapers

## Purpose

Improve the transcriptions in the AustrianNewspapers data set.

## Location of the data set

Unpacked original data (PAGE XML, TIFF) from ÖNB, with some transcription texts enhanced and fixed:

- [TrainingSet_ONB_Newseye_GT_M1+](https://github.com/UB-Mannheim/AustrianNewspapers/tree/master/TrainingSet_ONB_Newseye_GT_M1%2B)
- [ValidationSet_ONB_Newseye_GT_M1+](https://github.com/UB-Mannheim/AustrianNewspapers/tree/master/ValidationSet_ONB_Newseye_GT_M1%2B)

Extracted line pairs (line image and ground-truth text):

- [gt](https://github.com/UB-Mannheim/AustrianNewspapers/tree/master/gt)
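
As a rough illustration of how such line pairs could be collected from a local checkout, here is a minimal Python sketch. The function name `find_line_pairs` and the suffixes `.png` and `.gt.txt` are assumptions made for this example and are not confirmed by the data set, so adjust them to the files actually present.

```python
from pathlib import Path


def find_line_pairs(gt_dir, image_suffix=".png", text_suffix=".gt.txt"):
    """Return sorted (image_path, text_path) pairs that share a base name.

    Hypothetical helper: both suffixes are assumptions for illustration,
    not a documented convention of the data set.
    """
    root = Path(gt_dir)
    pairs = []
    # Walk the directory tree and look for ground-truth text files.
    for text_path in sorted(root.rglob("*" + text_suffix)):
        # Strip the text suffix to get the shared base name of the line.
        base = text_path.name[: -len(text_suffix)]
        image_path = text_path.with_name(base + image_suffix)
        # Keep only lines that have both an image and a transcription.
        if image_path.is_file():
            pairs.append((image_path, text_path))
    return pairs
```

Calling `find_line_pairs("gt")` on a checkout would then return only those lines for which both files exist, which makes it easy to spot transcriptions without a matching image.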