{"id":24951933,"url":"https://github.com/ocr-d/ocrd_anybaseocr","last_synced_at":"2026-02-24T09:32:07.739Z","repository":{"id":42624949,"uuid":"187826565","full_name":"OCR-D/ocrd_anybaseocr","owner":"OCR-D","description":"DFKI Layout Detection for OCR-D","archived":false,"fork":false,"pushed_at":"2025-03-28T14:37:24.000Z","size":127933,"stargazers_count":47,"open_issues_count":19,"forks_count":11,"subscribers_count":8,"default_branch":"master","last_synced_at":"2025-03-30T02:04:13.552Z","etag":null,"topics":["ocr","ocr-d","ocr-d-mp"],"latest_commit_sha":null,"homepage":null,"language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"apache-2.0","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/OCR-D.png","metadata":{"files":{"readme":"README.md","changelog":"CHANGELOG.md","contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null}},"created_at":"2019-05-21T11:47:54.000Z","updated_at":"2025-03-25T14:10:56.000Z","dependencies_parsed_at":"2024-02-19T18:48:24.297Z","dependency_job_id":"a9cdb0cc-1eac-4254-b753-46ec4769b04e","html_url":"https://github.com/OCR-D/ocrd_anybaseocr","commit_stats":{"total_commits":360,"total_committers":15,"mean_commits":24.0,"dds":0.6611111111111111,"last_synced_commit":"5978a1fef1b5b863f71e0a9abd1ff8668876c661"},"previous_names":[],"tags_count":20,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/OCR-D%2Focrd_anybaseocr","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/OCR-D%2Focrd_anybaseocr/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/OCR-D%2Focrd_anybaseocr/releases","manifests_url":"https://repos.ecosyste.ms/api/
v1/hosts/GitHub/repositories/OCR-D%2Focrd_anybaseocr/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/OCR-D","download_url":"https://codeload.github.com/OCR-D/ocrd_anybaseocr/tar.gz/refs/heads/master","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":247427005,"owners_count":20937200,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["ocr","ocr-d","ocr-d-mp"],"created_at":"2025-02-03T01:32:33.063Z","updated_at":"2026-02-24T09:32:07.680Z","avatar_url":"https://github.com/OCR-D.png","language":"Python","funding_links":[],"categories":[],"sub_categories":[],"readme":"# Document Preprocessing and Segmentation\n\n[![CircleCI](https://circleci.com/gh/OCR-D/ocrd_anybaseocr.svg?style=svg)](https://circleci.com/gh/OCR-D/ocrd_anybaseocr)\n[![PyPI](https://img.shields.io/pypi/v/ocrd_anybaseocr.svg)](https://pypi.org/project/ocrd_anybaseocr/)\n\n\n\u003e Tools to preprocess and segment scanned images for OCR-D\n\n   * [Installing](#installing)\n   * [Tools](#tools)\n      * [Binarizer](#binarizer)\n      * [Deskewer](#deskewer)\n      * [Cropper](#cropper)\n      * [Dewarper](#dewarper)\n      * [Text/Non-Text Segmenter](#textnon-text-segmenter)\n      * [Block Segmenter](#block-segmenter)\n      * [Textline Segmenter](#textline-segmenter)\n      * [Document Analyser](#document-analyser)\n   * [Testing](#testing)\n   * [License](#license)\n\n# Installing\n\nRequires Python \u003e= 3.6.\n\n1. Create a new `venv` unless you already have one\n\n        python3 -m venv venv\n\n2. 
Activate the `venv`\n\n        source venv/bin/activate\n\n3. To install from source, install GNU make and run:\n\n        make install\n\n   Prebuilt packages are also available on PyPI:\n\n        pip install ocrd_anybaseocr\n\n(This will install both PyTorch and TensorFlow, along with their dependencies.)\n\n# Tools\n\nAll tools, also called _processors_, follow the [CLI specification](https://ocr-d.de/en/spec/cli) for [OCR-D](https://ocr-d.de). An invocation looks roughly like this:\n\n    ocrd-\u003cprocessor-name\u003e [-m \u003cpath to METS input file\u003e] -I \u003cinput group\u003e -O \u003coutput group\u003e [-p \u003cpath to parameter file\u003e]* [-P \u003cparam name\u003e \u003cparam value\u003e]*\n\n## Binarizer\n\n### Method Behaviour \nFor each page (or sub-segment), this processor takes a scanned color or grayscale document image as input and computes a binarized (black and white) image.\n\nImplemented via rule-based methods (percentile-based adaptive background estimation in Ocrolib).\n \n### Example\n\n    ocrd-anybaseocr-binarize -I OCR-D-IMG -O OCR-D-BIN -P operation_level line -P threshold 0.3\n\n\n## Deskewer\n\n### Method Behaviour \nFor each page (or sub-segment), this processor takes a document image as input and computes its skew angle. It also annotates a deskewed image. \n\nThe input images have to be binarized for this module to work.\n\nImplemented via rule-based methods (binary projection profile entropy maximization in Ocrolib).\n \n### Example\n\n    ocrd-anybaseocr-deskew -I OCR-D-BIN -O OCR-D-DESKEW -P maxskew 5.0 -P skewsteps 20 -P operation_level page\n\n## Cropper\n\n### Method Behaviour \nFor each page, this processor takes a document image as input and computes the border around the page content area (i.e. removes textual noise as well as any other noise around the page frame). 
It also annotates a cropped image.\n\nThe input image does not need to be binarized, but should be deskewed for the module to work optimally.\n\nImplemented via rule-based methods (gradient-based line segment detection and morphology-based textline detection).\n \n### Example\n\n    ocrd-anybaseocr-crop -I OCR-D-DESKEW -O OCR-D-CROP -P rulerAreaMax 0 -P marginLeft 0.1\n\n## Dewarper\n\n### Method Behaviour \nFor each page, this processor takes a document image as input and computes a morphed image in which curved text lines are straightened.\n\nThe input image has to be binarized for the module to work, and should be cropped and deskewed for optimal quality.\n\nImplemented via data-driven methods (neural GAN conditional image model trained with pix2pixHD/PyTorch).\n \n### Models\n\n    ocrd resmgr download ocrd-anybaseocr-dewarp '*'\n\n### Example\n\n    ocrd-anybaseocr-dewarp -I OCR-D-CROP -O OCR-D-DEWARP -P resize_mode none -P gpu_id -1\n\n## Text/Non-Text Segmenter\n\n### Method Behaviour \nFor each page, this processor takes a document image as input and computes two images, separating the text and non-text parts.\n\nThe input image has to be binarized for the module to work, and should be cropped and deskewed for optimal quality.\n\nImplemented via data-driven methods (neural pixel classifier model trained with TensorFlow/Keras).\n \n### Models\n\n    ocrd resmgr download ocrd-anybaseocr-tiseg '*'\n\n### Example\n\n    ocrd-anybaseocr-tiseg -I OCR-D-DEWARP -O OCR-D-TISEG -P use_deeplr true\n\n## Block Segmenter\n\n### Method Behaviour \nFor each page, this processor takes the raw document image as input and computes a text region segmentation for it (distinguishing various types of text blocks).\n\nThe input image need not be binarized, but should be deskewed for the module to work optimally.\n\nImplemented via data-driven methods (neural Mask-RCNN instance segmentation model trained with TensorFlow/Keras).\n \n### Models\n\n    ocrd 
resmgr download ocrd-anybaseocr-block-segmentation '*'\n\n### Example\n\n    ocrd-anybaseocr-block-segmentation -I OCR-D-TISEG -O OCR-D-BLOCK -P active_classes '[\"page-number\", \"paragraph\", \"heading\", \"drop-capital\", \"marginalia\", \"caption\"]' -P min_confidence 0.8 -P post_process true\n\n## Textline Segmenter\n\n### Method Behaviour \nFor each page (or region), this processor takes a cropped document image as input and computes a textline segmentation for it.\n\nThe input image should be binarized and deskewed for the module to work. \n\nImplemented via rule-based methods (gradient- and morphology-based line estimation in Ocrolib).\n \n### Example\n\n    ocrd-anybaseocr-textline -I OCR-D-BLOCK -O OCR-D-LINE -P operation_level region\n\n## Document Analyser\n\n### Method Behaviour \nFor the whole document, this processor takes all the cropped page images and their corresponding text regions as input and computes the logical structure (page types and sections).\n\nThe input images should be binarized and segmented for this module to work.\n\nImplemented via data-driven methods (neural Inception-V3 image classification model trained with TensorFlow/Keras).\n\n### Models\n\n    ocrd resmgr download ocrd-anybaseocr-layout-analysis '*'\n\n### Example\n\n    ocrd-anybaseocr-layout-analysis -I OCR-D-LINE -O OCR-D-STRUCT\n\n# Testing\n\nTo test the tools under realistic conditions (on OCR-D workspaces),\ndownload [OCR-D/assets](https://github.com/OCR-D/assets). 
In particular,\nthe code is tested with the [dfki-testdata](https://github.com/OCR-D/assets/tree/master/data/dfki-testdata)\ndataset.\n\nTo download the data:\n\n    make assets\n\nTo run module tests:\n\n    make test\n\nTo run processor/workflow tests:\n\n    make cli-test\n\n# License\n\n\n```\n Licensed under the Apache License, Version 2.0 (the \"License\");\n you may not use this file except in compliance with the License.\n You may obtain a copy of the License at\n\n     http://www.apache.org/licenses/LICENSE-2.0\n\n Unless required by applicable law or agreed to in writing, software\n distributed under the License is distributed on an \"AS IS\" BASIS,\n WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.\n See the License for the specific language governing permissions and\n limitations under the License.\n```\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Focr-d%2Focrd_anybaseocr","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Focr-d%2Focrd_anybaseocr","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Focr-d%2Focrd_anybaseocr/lists"}