https://github.com/ub-mannheim/digitue-gt

Ground truth for digitized publications of UB Tübingen

escriptorium fraktur ground-truth ocr


## DigiTue Ground Truth

This repository contains transcriptions for digitized books and journals of
the University Library of Tübingen (https://opendigi.ub.uni-tuebingen.de/digitue/).

The transcriptions were done with eScriptorium, a transcription platform
developed as part of the Scripta and RESILIENCE projects
(https://gitlab.com/scripta/escriptorium/).

Get the related images in JPEG format using this script:
```
# Run from the repository root; fetches each page image that is not yet present.
find DigiRegio Theo Tue VD18 -name "*.xml" | while IFS= read -r xml; do
  dir=$(dirname "$xml")
  page=$(basename "$xml" .xml)
  base=$(echo "$page" | sed 's/_[0-9]*$//')  # strip the trailing _<number>
  test -f "$dir/$page.jpg" && continue
  echo "$page"
  curl --silent -Lo "$dir/$page.jpg" "https://opendigi.ub.uni-tuebingen.de/opendigi/image/$base/$page.jp2/full/full/0/default.jpg"
done
```
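
To check the URL pattern for a single page before downloading anything, a small helper along these lines may be useful. It is only a sketch: the function name `image_url` and the page name `example_0001` are hypothetical, and the URL layout is copied from the script above rather than from any official API description.

```shell
#!/bin/sh
# Hypothetical helper: print the image URL that the download script
# would request for one page name (no network access involved).
image_url() {
  page=$1
  # The book identifier is the page name without its trailing "_<number>".
  base=$(printf '%s\n' "$page" | sed 's/_[0-9]*$//')
  printf 'https://opendigi.ub.uni-tuebingen.de/opendigi/image/%s/%s.jp2/full/full/0/default.jpg\n' "$base" "$page"
}

# "example_0001" is a made-up page name, not a real identifier.
image_url example_0001
```

Replacing `example_0001` with the base name of one of the XML files (without the `.xml` extension) should give the address from which the matching JPEG is fetched.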