Ecosyste.ms: Awesome
An open API service indexing awesome lists of open source software.
https://github.com/omni-us/pagexml
Library in C++ and a python wrapper for dealing with Page XML files
- Host: GitHub
- URL: https://github.com/omni-us/pagexml
- Owner: omni-us
- License: MIT
- Created: 2018-08-21T07:16:55.000Z (over 6 years ago)
- Default Branch: master
- Last Pushed: 2024-01-19T13:44:10.000Z (12 months ago)
- Last Synced: 2024-04-27T05:21:45.586Z (8 months ago)
- Topics: annotation-processing, docker-image, document-representation, pagexml, python
- Language: C++
- Homepage:
- Size: 6.63 MB
- Stars: 13
- Watchers: 6
- Forks: 2
- Open Issues: 0
Metadata Files:
- Readme: README.md
- License: LICENSE.md
Awesome Lists containing this project
- awesome-ocr - py-pagexml - Python library for handling PAGE XML and OPF files. (Software / OCR file formats)
README
# Introduction
A C++ library and a Python wrapper for working with Page XML files.
[![CircleCI](https://circleci.com/gh/omni-us/pagexml.svg?style=svg)](https://circleci.com/gh/omni-us/pagexml)
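For readers unfamiliar with the format, the following is a minimal sketch of what a Page XML document looks like and how its text lines can be read. It deliberately uses only the Python standard library and does not show the py-pagexml API; the sample document, the helper names `parse_points` and `read_text_lines`, and the choice of the 2013-07-15 schema namespace are assumptions made for illustration.

```python
import xml.etree.ElementTree as ET

# Assumption: the document uses the 2013-07-15 PAGE schema namespace.
# Other schema versions carry a different date in the namespace URI.
NS = {"pc": "http://schema.primaresearch.org/PAGE/gts/pagecontent/2013-07-15"}

# Hypothetical sample document, written by hand for this sketch.
SAMPLE = """<PcGts xmlns="http://schema.primaresearch.org/PAGE/gts/pagecontent/2013-07-15">
  <Page imageFilename="example.png" imageWidth="1000" imageHeight="800">
    <TextRegion id="r1">
      <Coords points="10,10 500,10 500,200 10,200"/>
      <TextLine id="r1_l1">
        <Coords points="10,10 500,10 500,60 10,60"/>
        <TextEquiv><Unicode>Hello world</Unicode></TextEquiv>
      </TextLine>
      <TextLine id="r1_l2">
        <Coords points="10,70 500,70 500,120 10,120"/>
        <TextEquiv><Unicode>Second line</Unicode></TextEquiv>
      </TextLine>
    </TextRegion>
  </Page>
</PcGts>
"""


def parse_points(points):
    """Turn a 'x1,y1 x2,y2 ...' string into a list of (x, y) integer tuples."""
    return [tuple(int(v) for v in pair.split(",")) for pair in points.split()]


def read_text_lines(xml_text):
    """Return id, polygon and text for every TextLine element in the document."""
    root = ET.fromstring(xml_text)
    lines = []
    for line in root.iterfind(".//pc:TextLine", NS):
        coords = line.find("pc:Coords", NS)
        text = line.find("pc:TextEquiv/pc:Unicode", NS)
        lines.append({
            "id": line.get("id"),
            "points": parse_points(coords.get("points")) if coords is not None else [],
            "text": text.text if text is not None else None,
        })
    return lines


if __name__ == "__main__":
    for entry in read_text_lines(SAMPLE):
        print(entry["id"], entry["points"], entry["text"])
```

A generic XML parser is enough to read such a file; the linked py-pagexml documentation describes the interface the library itself provides for this kind of work.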
# Requirements
See [py-pagexml/README.rst](py-pagexml/README.rst) for the requirements, or consult the Dockerfiles [docker/Dockerfile_build](docker/Dockerfile_build) and [docker/Dockerfile_runtime](docker/Dockerfile_runtime).
# Contents
- [lib](lib): Directory containing the C++ PageXML and TextFeatExtractor libraries.
- [py-pagexml](py-pagexml): SWIG-based Python wrapper for the PageXML library.
- [py-textfeat](py-textfeat): SWIG-based Python wrapper for the TextFeatExtractor library.
# Documentation
- [https://omni-us.github.io/pagexml/py-pagexml](https://omni-us.github.io/pagexml/py-pagexml): Online documentation for py-pagexml.
- [https://omni-us.github.io/pagexml/py-textfeat](https://omni-us.github.io/pagexml/py-textfeat): Online documentation for py-textfeat.