https://github.com/OCR-D/ocrd_anybaseocr
DFKI Layout Detection for OCR-D
- Host: GitHub
- URL: https://github.com/OCR-D/ocrd_anybaseocr
- Owner: OCR-D
- License: apache-2.0
- Created: 2019-05-21T11:47:54.000Z (about 7 years ago)
- Default Branch: master
- Last Pushed: 2025-05-01T22:11:45.000Z (about 1 year ago)
- Last Synced: 2025-11-15T04:20:06.367Z (7 months ago)
- Topics: ocr, ocr-d, ocr-d-mp
- Language: PureBasic
- Size: 176 MB
- Stars: 47
- Watchers: 6
- Forks: 11
- Open Issues: 19
Metadata Files:
- Readme: README.md
- Changelog: CHANGELOG.md
- License: LICENSE
# Document Cropping
[CircleCI build status](https://circleci.com/gh/OCR-D/ocrd_anybaseocr)
[PyPI package](https://pypi.org/project/ocrd_anybaseocr/)
> Tools to crop scanned images for OCR-D
* [Installing](#installing)
* [Tools](#tools)
  * [Cropper](#cropper)
  * [Document Analyser](#document-analyser)
* [Testing](#testing)
* [License](#license)
# Installing
Requires Python >= 3.6.
1. Create a new `venv` unless you already have one:

       python3 -m venv venv

2. Activate the `venv`:

       source venv/bin/activate

3. To install from source, get GNU make and run:

       make install

Prebuilt packages are also available on PyPI:

    pip install ocrd_anybaseocr

# Tools
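All tools, also called _processors_, abide by the [CLI specifications](https://ocr-d.de/en/spec/cli) for [OCR-D](https://ocr-d.de). A call roughly looks like this:

    ocrd-<processor-name> [-m <path to METS file>] -I <input group> -O <output group> [-p <path to parameter file>]* [-P <param name> <param value>]*
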
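To make the calling convention concrete, here is a minimal sketch that assembles such a command line in Python. The helper `build_processor_command` is hypothetical and not part of this package or of OCR-D core; it only arranges the options named above (`-m`, `-I`, `-O`, `-P`) in the order shown.

```python
import shlex
from typing import List, Mapping, Optional


def build_processor_command(
    processor: str,
    input_group: str,
    output_group: str,
    mets: Optional[str] = None,
    params: Optional[Mapping[str, object]] = None,
) -> List[str]:
    """Assemble an argv list for an OCR-D processor call (hypothetical helper)."""
    argv = [processor]
    if mets is not None:
        # -m is optional in the synopsis above
        argv += ["-m", mets]
    argv += ["-I", input_group, "-O", output_group]
    # Each parameter becomes one "-P <name> <value>" triple
    for name, value in (params or {}).items():
        argv += ["-P", name, str(value)]
    return argv


cmd = build_processor_command(
    "ocrd-anybaseocr-crop",
    "OCR-D-DESKEW",
    "OCR-D-CROP",
    params={"rulerAreaMax": 0, "marginLeft": 0.1},
)
# Quote each argument so the line can be pasted into a shell
print(" ".join(shlex.quote(arg) for arg in cmd))
```

The resulting list can be passed to `subprocess.run` unchanged, which avoids shell quoting issues altogether.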
## Cropper
### Method Behaviour
For each page, this processor takes a document image as input and computes the border around the page content area, so that textual noise and any other noise around the page frame is excluded. It also annotates the resulting cropped image.
The input image does not need to be binarized, but should be deskewed for the module to work optimally.
Implemented via rule-based methods (gradient-based line segment detection and morphology-based textline detection).
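To illustrate what "computing a border and cropping to it" means, here is a deliberately simplified sketch. It does not reproduce the processor's method: instead of line segment detection and morphology it uses plain row and column ink counts on a tiny grayscale grid. The function names `content_box` and `crop` and the `background` value are assumptions made for this example only.

```python
from typing import List, Tuple

Image = List[List[int]]
Box = Tuple[int, int, int, int]  # (top, left, bottom, right), end-exclusive


def content_box(image: Image, background: int = 255) -> Box:
    """Bounding box of all rows and columns that contain non-background pixels."""
    row_ink = [sum(1 for px in row if px != background) for row in image]
    col_ink = [sum(1 for px in col if px != background) for col in zip(*image)]
    rows = [i for i, n in enumerate(row_ink) if n > 0]
    cols = [j for j, n in enumerate(col_ink) if n > 0]
    if not rows or not cols:
        # Nothing but background: keep the full image
        return (0, 0, len(image), len(image[0]) if image else 0)
    return (rows[0], cols[0], rows[-1] + 1, cols[-1] + 1)


def crop(image: Image, box: Box) -> Image:
    """Cut the image down to the given box."""
    top, left, bottom, right = box
    return [row[left:right] for row in image[top:bottom]]


page = [
    [255, 255, 255, 255, 255],
    [255,   0,   0, 255, 255],
    [255,   0, 255, 255, 255],
    [255, 255, 255, 255, 255],
]
box = content_box(page)
print(box)              # (1, 1, 3, 3)
print(crop(page, box))  # [[0, 0], [0, 255]]
```

A real scan has noise in the margins, which is why a naive bounding box like this one is not sufficient and the processor relies on the rule-based detection described above.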
### Example

    ocrd-anybaseocr-crop -I OCR-D-DESKEW -O OCR-D-CROP -P rulerAreaMax 0 -P marginLeft 0.1

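The two `-P` overrides in the example call can also be bundled into a single JSON object. The sketch below assumes that `-p` accepts a verbatim JSON string in addition to a file path, as the OCR-D CLI specification describes; check the `--help` output of your installed version before relying on it. Parameter names and values are taken from the example call; nothing else about the processor's parameters is implied.

```python
import json
import shlex

# Same parameters as in the example call
params = {"rulerAreaMax": 0, "marginLeft": 0.1}

# Serialize once; the string is passed as the single argument to -p
# (assumption: -p takes verbatim JSON as well as a path to a JSON file)
param_json = json.dumps(params)

argv = [
    "ocrd-anybaseocr-crop",
    "-I", "OCR-D-DESKEW",
    "-O", "OCR-D-CROP",
    "-p", param_json,
]

# shlex.quote wraps the JSON in single quotes so the shell leaves it intact
print(" ".join(shlex.quote(arg) for arg in argv))
```

Keeping the parameters in one JSON object makes it easy to store them next to a workflow definition and reuse them across runs.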
## Document Analyser
### Method Behaviour
For the whole document, this processor takes all the cropped page images and their corresponding text regions as input and computes the logical structure (page types and sections).
The input image should be binarized and segmented for this module to work.
Implemented via data-driven methods (neural Inception-V3 image classification model trained with TensorFlow/Keras).
### Example

    ocrd-anybaseocr-layout-analysis -I OCR-D-LINE -O OCR-D-STRUCT

## Testing
To test the tools under realistic conditions (on OCR-D workspaces),
download [OCR-D/assets](https://github.com/OCR-D/assets). In particular,
the code is tested with the [dfki-testdata](https://github.com/OCR-D/assets/tree/master/data/dfki-testdata)
dataset.
To download the data:

    make assets

To run module tests:

    make test

To run processor/workflow tests:

    make cli-test

## License
```
Licensed under the Apache License, Version 2.0 (the "License");
you may not use this file except in compliance with the License.
You may obtain a copy of the License at
http://www.apache.org/licenses/LICENSE-2.0
Unless required by applicable law or agreed to in writing, software
distributed under the License is distributed on an "AS IS" BASIS,
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
See the License for the specific language governing permissions and
limitations under the License.
```