An open API service indexing awesome lists of open source software.

Projects in Awesome Lists tagged with pagexml

A curated list of projects in awesome lists tagged with pagexml.

https://github.com/mauvilsa/tesseract-recognize

Tool that does layout analysis and/or text recognition using Tesseract and outputs the result in Page XML format

cli docker-image document-recognition ocr optical-character-recognition pagexml tesseract text-detection

Last synced: 05 May 2025

https://github.com/mauvilsa/nw-page-editor

Simple app for visual editing of Page XML files

annotation-tool desktop-app docker-image editor pagexml server-app

Last synced: 07 Jul 2025

https://github.com/andbue/nashi

Some bits of JavaScript to transcribe scanned pages using PageXML

ocr pagexml transcription

Last synced: 11 Apr 2025

https://github.com/omni-us/pagexml

C++ library and Python wrapper for working with Page XML files

annotation-processing docker-image document-representation pagexml python

Last synced: 20 Mar 2025

https://github.com/ocr-d/gt-repo-template

A template for creating a ground truth repo with various functions and features, such as metadata creation, data analysis, and presentation.

ground-truth ocr-d pagexml repository template

Last synced: 15 Apr 2025

https://github.com/cconzen/readingorderrecalculation

Post-process PageXMLs to improve their region reading order

pagexml reading-order transkribus

Last synced: 28 Jun 2025

https://github.com/tboenig/gt_corpus_benchmark

This repo provides a collection of ground truth data. The collection was compiled according to different criteria (layout complexity and font usage). Each item is also described by metadata, which follows the labeling scheme of OCR-D/PrimaLab.

corp ground-truth ocr-d pagexml

Last synced: 02 Feb 2026

https://github.com/bobld/publaynetsharp

Extract and convert PubLayNet data to PageXml format

csharp pagexml publaynet pubmed

Last synced: 12 Oct 2025

https://github.com/scdh/x2tei-transformations

Transformations from various formats to TEI

converters docx pagexml tei tei-xml usx

Last synced: 06 Jan 2026