An open API service indexing awesome lists of open source software.

Projects in Awesome Lists tagged with ocr-d

A curated list of projects in awesome lists tagged with ocr-d.

https://github.com/ub-mannheim/tesseract

Tesseract Open Source OCR Engine (main repository)

lstm ocr ocr-d ocr-d-mp tesseract-ocr windows-build

Last synced: 06 Oct 2025

https://github.com/UB-Mannheim/ocr-fileformat

Validate and transform various OCR file formats (hOCR, ALTO, PAGE, FineReader)

alto finereader hocr ocr ocr-d page-xml transformation validation

Last synced: 14 Mar 2025

https://github.com/ocr-d/core

Collection of OCR-related python tools and wrappers from @OCR-D

ocr-d

Last synced: 08 Apr 2025

https://github.com/ocr-d/ocrd_all

Master repository which includes most other OCR-D repositories as submodules

ocr-d

Last synced: 08 Apr 2025

https://github.com/ocr-d/ocrd_segment

OCR-D-compliant page segmentation

ocr-d

Last synced: 30 Apr 2025

https://github.com/ocr-d/ocrd_anybaseocr

DFKI Layout Detection for OCR-D

ocr ocr-d ocr-d-mp

Last synced: 24 Feb 2026

https://github.com/OCR-D/ocrd_tesserocr

Run tesseract with the tesserocr bindings with @OCR-D's interfaces

ocr-d

Last synced: 02 Apr 2025

https://github.com/hnesk/browse-ocrd

An extensible viewer for OCR-D mets.xml files

ocr-d

Last synced: 22 Jan 2026

https://github.com/ocr-d/spec

Specification of the @OCR-D technical architecture, interface definitions and data exchange format(s)

ocr-d

Last synced: 10 Apr 2025

https://github.com/bertsky/ocrd_detectron2

OCR-D wrapper for detectron2 based segmentation models

dla ocr-d olr

Last synced: 13 Apr 2025

https://github.com/ocr-d/ocrd_calamari

Recognize text using Calamari OCR and the OCR-D framework

calamari-ocr ocr ocr-d

Last synced: 10 Apr 2025

https://github.com/ocr-d/page-to-alto

Convert PAGE (v. 2019) to ALTO (v. 2.0 - 4.2)

ocr-d

Last synced: 12 Jan 2026

https://github.com/ocr-d/ocrd_kraken

Wrapper for the kraken OCR engine

ocr-d

Last synced: 10 Apr 2025

https://github.com/ocr-d/format-converters

Converters for various file formats used for representing OCR

ocr-d

Last synced: 15 Apr 2025

https://github.com/bertsky/workflow-configuration

a makefilization for OCR-D workflows, with configuration examples

ocr-d

Last synced: 31 Aug 2025

https://github.com/slub/ocrd_manager

frontend for ocrd_controller and adapter towards ocrd_kitodo

ocr-d

Last synced: 24 Oct 2025

https://github.com/bertsky/nmalign

forced alignment of lists of strings by fuzzy string matching

alignment ocr-d

Last synced: 29 Jul 2025

https://github.com/bertsky/ocrd_publaynet

convert PubLayNet data into METS/PAGE-XML

ocr-d

Last synced: 13 Apr 2025

https://github.com/slub/ocrd_kitodo

Docker integration of Kitodo.Production and OCR-D

docker ocr ocr-d

Last synced: 11 Apr 2025

https://github.com/ocr-d/ocrd_pagetopdf

OCR-D wrapper for prima-pagetopdf

ocr ocr-d prima-pagetopdf

Last synced: 05 Sep 2025

https://github.com/ocr-d/gt-repo-template

A template for creating a ground truth repo with various functions and features, such as metadata creation, data analysis, and presentation.

ground-truth ocr-d pagexml repository template

Last synced: 15 Apr 2025

https://github.com/ub-mannheim/ocrd_pagetopdf

OCR-D wrapper for prima-pagetopdf

ocr ocr-d prima-pagetopdf

Last synced: 07 Aug 2025

https://github.com/ocr-d/ocrd_keraslm

Simple character-based language model using keras

ocr-d

Last synced: 10 Apr 2025

https://github.com/qurator-spk/ocrd-galley

A Dockerized test environment for OCR-D processors 🚢

ocr ocr-d qurator

Last synced: 16 Jan 2026

https://github.com/ocr-d/ocrd_olena

Binarize with Olena/scribo

ocr-d

Last synced: 20 Jul 2025

https://github.com/slub/ocrd_controller

Path to network implementation of OCR-D

ocr-d

Last synced: 27 Feb 2026

https://github.com/bertsky/docstruct

Document structure detection from PAGE-XML to METS-XML

ocr-d

Last synced: 15 Jul 2025

https://github.com/ocr-d/gt-guidelines

OCR-D guidelines for Ground Truth production

ocr-d

Last synced: 06 Jan 2026

https://github.com/ocr-d/ocrd_im6convert

Run ImageMagick with an OCR-D CLI

ocr-d

Last synced: 10 Apr 2025

https://github.com/ocr-d/ocr-d.github.io

Website for OCR-D specs, formats, requirements

ocr-d

Last synced: 30 Jan 2026

https://github.com/ocr-d/ocrd_vandalize

Demo processor to illustrate OCR-D Python API

ocr-d

Last synced: 10 Apr 2025

https://github.com/ocr-d/assets

Test data for testing specs and software in @OCR-D

ocr-d

Last synced: 06 Jan 2026

https://github.com/ocr-d/ocrmultieval

Extensible evaluation of (intermediate) results of an OCR workflow

ocr ocr-d ocr-evaluation

Last synced: 13 Jul 2025

https://github.com/qurator-spk/page2tsv

PAGE-XML to TSV

ocr-d qurator

Last synced: 16 Jan 2026

https://github.com/ocr-d/ocrd_fileformat

OCR-D wrapper for ocr-fileformat

ocr-d

Last synced: 10 Apr 2025

https://github.com/bertsky/ocrd_wrap

OCR-D wrapper for arbitrary coords-preserving image operations

ocr-d

Last synced: 13 Apr 2025

https://github.com/ub-mannheim/ocrd_contrib_ubma

Helper scripts for OCR-D

ocr-d

Last synced: 29 Mar 2025

https://github.com/bertsky/ocrd_page2tei

OCR-D wrapper for page2tei

ocr-d

Last synced: 04 Jan 2026

https://github.com/ocr-d/gt-mufilevelrules

OCR-D-Level-Rules can be created automatically with gt-MufiLevelRules from the encodings published by MUFI: The Medieval Unicode Font Initiative.

ground-truth guidelines ocr ocr-d transcription

Last synced: 07 Jan 2026

https://github.com/qurator-spk/ocrd_trocr

OCR-D processor for TrOCR

ocr ocr-d trocr

Last synced: 16 Jan 2026

https://github.com/tboenig/gt_corpus_benchmark

This repo provides a collection of ground truth data. The collection was compiled under different aspects (complexity of the layouts and use of the fonts). The individual data are also characterized by metadata. The metadata is based on the labeling scheme of OCR-D/PrimaLab.

corp ground-truth ocr-d pagexml

Last synced: 02 Feb 2026

https://github.com/ocr-d/ocrd_framework

Docker installation for the OCR-D framework containing all available processors, Taverna workflow and local repository.

ocr-d

Last synced: 28 Mar 2025

https://github.com/stweil/tensorflow_gpu_to_tensorflow

Dummy Python package for tensorflow-gpu on hosts without GPU

ocr-d python tensorflow tensorflow-gpu

Last synced: 15 Apr 2026

https://github.com/tboenig/ocrd_bbaw_pilotbibliothek

Report on the OCR-D test installation at the Berlin-Brandenburgische Akademie der Wissenschaften (BBAW)

ground-truth ocr ocr-d

Last synced: 20 Feb 2026

https://github.com/ocr-d/gt_structure_1_4

The repo gt_structure_1_4 is part of the OCR-D Ground Truth Structure corpus. Only the structure of the printed page is annotated. The corpus was created as a result of the DFG project OCR-D.

ground-truth ocr-d page-xml repository segmentation

Last synced: 07 Jan 2026

https://github.com/ub-mannheim/hkb-gt

Ground truth for a political newspaper of the Mannheim region (1931–1945)

ground-truth newspaper ocr ocr-d

Last synced: 29 Mar 2025

https://github.com/bertsky/ocrd_doxa

OCR-D wrapper for DoxaPy image binarization via locally adaptive thresholding

ocr-d

Last synced: 13 Apr 2025

https://github.com/ocr-d/policy

OCR-D recommendations for full-text digitisation

digitisation ground-truth guidelines ocr-d

Last synced: 06 Jan 2026

https://github.com/ocr-d/repository_metastore

Microservice to manage the data and metadata of OCR-D. It supports reading, writing, and updating metadata (XML), registering XSD schemas, validating XML, and indexing metadata.

ocr-d

Last synced: 28 Mar 2025

https://github.com/tboenig/17_frak_simple

This repository provides the Ground Truth data for the OCR-D Quiver back end. This data serves as a basis for benchmarking the performance and accuracy of different OCR-D workflows for different types of input data.

ground-truth ocr-d

Last synced: 22 Jan 2026

https://github.com/tboenig/16_ant_simple

This repository provides the Ground Truth data for the OCR-D Quiver back end. This data serves as a basis for benchmarking the performance and accuracy of different OCR-D workflows for different types of input data.

ground-truth ocr-d

Last synced: 22 Jan 2026

https://github.com/tboenig/19_frak_simple

This repository provides the Ground Truth data for the OCR-D Quiver back end. This data serves as a basis for benchmarking the performance and accuracy of different OCR-D workflows for different types of input data.

ground-truth ocr-d

Last synced: 22 Jan 2026

https://github.com/ocr-d/gt_structure_1_1

The repo gt_structure_1_1 is part of the OCR-D Ground Truth Structure corpus. Only the structure of the printed page is annotated. The corpus was created as a result of the DFG project OCR-D.

ground-truth ocr-d page-xml repository segmentation

Last synced: 27 Jan 2026

https://github.com/bertsky/ocrd_jdeskew

OCR-D wrapper for Document Image Skew Estimation using Adaptive Radial Projection

ocr-d

Last synced: 19 Jun 2025

https://github.com/bertsky/ocrd_origami

OCR-D wrapper for poke1024/origami OLR+OCR

ocr-d

Last synced: 25 Mar 2025

https://github.com/tboenig/17_frak_complex

This repository provides the Ground Truth data for the OCR-D Quiver back end. This data serves as a basis for benchmarking the performance and accuracy of different OCR-D workflows for different types of input data.

ground-truth ocr-d

Last synced: 23 Jan 2026

https://github.com/tboenig/17_fontmix_simple

This repository provides the Ground Truth data for the OCR-D Quiver back end. This data serves as a basis for benchmarking the performance and accuracy of different OCR-D workflows for different types of input data.

ground-truth ocr-d

Last synced: 23 Jan 2026

https://github.com/ocr-d/gt_structure_1_2

The repo gt_structure_1_2 is part of the OCR-D Ground Truth Structure corpus. Only the structure of the printed page is annotated. The corpus was created as a result of the DFG project OCR-D.

ground-truth ocr-d page-xml repository segmentation

Last synced: 29 Jan 2026

https://github.com/tboenig/16_ant_complex

This repository provides the Ground Truth data for the OCR-D Quiver back end. This data serves as a basis for benchmarking the performance and accuracy of different OCR-D workflows for different types of input data.

ground-truth ocr-d

Last synced: 06 Feb 2026

https://github.com/tboenig/18_ant_simple

This repository provides the Ground Truth data for the OCR-D Quiver back end. This data serves as a basis for benchmarking the performance and accuracy of different OCR-D workflows for different types of input data.

ground-truth ocr-d

Last synced: 26 Feb 2026

https://github.com/tboenig/18_fontmix_complex

This repository provides the Ground Truth data for the OCR-D Quiver back end. This data serves as a basis for benchmarking the performance and accuracy of different OCR-D workflows for different types of input data.

ground-truth ocr-d

Last synced: 09 Feb 2026

https://github.com/ocr-d/gt_structure_1_3

The repo gt_structure_1_3 is part of the OCR-D Ground Truth Structure corpus. Only the structure of the printed page is annotated. The corpus was created as a result of the DFG project OCR-D.

ground-truth ocr-d page-xml repository segmentation

Last synced: 07 Jan 2026

https://github.com/ocr-d/gt-repo-scripts

XSLT and shell scripts for analyzing and creating GitHub pages of a ground truth repository. These are centrally managed and can be used by all repositories created with gt-repo-template (https://github.com/OCR-D/gt-repo-template).

ground-truth ocr-d page-xml repository template

Last synced: 06 Jan 2026

https://github.com/ocr-d/bibliothecabaltica2018

Slides for the OCR-D talk at the Bibliotheca Baltica 2018 symposium in Rostock

ocr-d

Last synced: 07 Jan 2026

https://github.com/tboenig/19_ant_simple

This repository provides the Ground Truth data for the OCR-D Quiver back end. This data serves as a basis for benchmarking the performance and accuracy of different OCR-D workflows for different types of input data.

ground-truth ocr-d

Last synced: 01 Feb 2026

https://github.com/tboenig/18_frak_complex

This repository provides the Ground Truth data for the OCR-D Quiver back end. This data serves as a basis for benchmarking the performance and accuracy of different OCR-D workflows for different types of input data.

ground-truth ocr-d

Last synced: 24 Jan 2026

https://github.com/tboenig/17_fontmix_complex

This repository provides the Ground Truth data for the OCR-D Quiver back end. This data serves as a basis for benchmarking the performance and accuracy of different OCR-D workflows for different types of input data.

ground-truth ocr-d

Last synced: 24 Jan 2026

https://github.com/tboenig/16_frak_complex

This repository provides the Ground Truth data for the OCR-D Quiver back end. This data serves as a basis for benchmarking the performance and accuracy of different OCR-D workflows for different types of input data.

ground-truth ocr-d

Last synced: 03 Feb 2026

https://github.com/tboenig/18_frak_simple

This repository provides the Ground Truth data for the OCR-D Quiver back end. This data serves as a basis for benchmarking the performance and accuracy of different OCR-D workflows for different types of input data.

ground-truth ocr-d

Last synced: 05 Feb 2026

https://github.com/tboenig/16_frak_simple

This repository provides the Ground Truth data for the OCR-D Quiver back end. This data serves as a basis for benchmarking the performance and accuracy of different OCR-D workflows for different types of input data.

ground-truth ocr-d

Last synced: 24 Jan 2026