{"id":41234347,"url":"https://github.com/OCR-D/ocrd_anybaseocr","last_synced_at":"2026-02-01T10:00:45.310Z","repository":{"id":42624949,"uuid":"187826565","full_name":"OCR-D/ocrd_anybaseocr","owner":"OCR-D","description":"DFKI Layout Detection for OCR-D","archived":false,"fork":false,"pushed_at":"2025-05-01T22:11:45.000Z","size":184506,"stargazers_count":47,"open_issues_count":19,"forks_count":11,"subscribers_count":6,"default_branch":"master","last_synced_at":"2025-11-15T04:20:06.367Z","etag":null,"topics":["ocr","ocr-d","ocr-d-mp"],"latest_commit_sha":null,"homepage":null,"language":"PureBasic","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"apache-2.0","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/OCR-D.png","metadata":{"files":{"readme":"README.md","changelog":"CHANGELOG.md","contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null,"zenodo":null}},"created_at":"2019-05-21T11:47:54.000Z","updated_at":"2025-05-01T22:11:48.000Z","dependencies_parsed_at":"2024-02-19T18:48:24.297Z","dependency_job_id":"a9cdb0cc-1eac-4254-b753-46ec4769b04e","html_url":"https://github.com/OCR-D/ocrd_anybaseocr","commit_stats":{"total_commits":360,"total_committers":15,"mean_commits":24.0,"dds":0.6611111111111111,"last_synced_commit":"5978a1fef1b5b863f71e0a9abd1ff8668876c661"},"previous_names":[],"tags_count":21,"template":false,"template_full_name":null,"purl":"pkg:github/OCR-D/ocrd_anybaseocr","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/OCR-D%2Focrd_anybaseocr","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/OCR-D%2Focrd_anybaseocr/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/OCR-D%2Focrd_anybaseoc
r/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/OCR-D%2Focrd_anybaseocr/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/OCR-D","download_url":"https://codeload.github.com/OCR-D/ocrd_anybaseocr/tar.gz/refs/heads/master","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/OCR-D%2Focrd_anybaseocr/sbom","scorecard":{"id":103521,"data":{"date":"2025-08-11","repo":{"name":"github.com/OCR-D/ocrd_anybaseocr","commit":"8a786639bdc7f9ce85a4fc8b2b79e0dc456007d9"},"scorecard":{"version":"v5.2.1-40-gf6ed084d","commit":"f6ed084d17c9236477efd66e5b258b9d4cc7b389"},"score":2.6,"checks":[{"name":"Code-Review","score":3,"reason":"Found 2/6 approved changesets -- score normalized to 3","details":null,"documentation":{"short":"Determines if the project requires human code review before pull requests (aka merge requests) are merged.","url":"https://github.com/ossf/scorecard/blob/f6ed084d17c9236477efd66e5b258b9d4cc7b389/docs/checks.md#code-review"}},{"name":"Packaging","score":-1,"reason":"packaging workflow not detected","details":["Warn: no GitHub/GitLab publishing workflow detected."],"documentation":{"short":"Determines if the project is published as a package that others can easily download, install, easily update, and uninstall.","url":"https://github.com/ossf/scorecard/blob/f6ed084d17c9236477efd66e5b258b9d4cc7b389/docs/checks.md#packaging"}},{"name":"Maintained","score":0,"reason":"0 commit(s) and 0 issue activity found in the last 90 days -- score normalized to 0","details":null,"documentation":{"short":"Determines if the project is \"actively maintained\".","url":"https://github.com/ossf/scorecard/blob/f6ed084d17c9236477efd66e5b258b9d4cc7b389/docs/checks.md#maintained"}},{"name":"Dangerous-Workflow","score":-1,"reason":"no workflows found","details":null,"documentation":{"short":"Determines if the project's GitHub Action workflows avoid dangerous 
patterns.","url":"https://github.com/ossf/scorecard/blob/f6ed084d17c9236477efd66e5b258b9d4cc7b389/docs/checks.md#dangerous-workflow"}},{"name":"Token-Permissions","score":-1,"reason":"No tokens found","details":null,"documentation":{"short":"Determines if the project's workflows follow the principle of least privilege.","url":"https://github.com/ossf/scorecard/blob/f6ed084d17c9236477efd66e5b258b9d4cc7b389/docs/checks.md#token-permissions"}},{"name":"Binary-Artifacts","score":10,"reason":"no binaries found in the repo","details":null,"documentation":{"short":"Determines if the project has generated executable (binary) artifacts in the source repository.","url":"https://github.com/ossf/scorecard/blob/f6ed084d17c9236477efd66e5b258b9d4cc7b389/docs/checks.md#binary-artifacts"}},{"name":"CII-Best-Practices","score":0,"reason":"no effort to earn an OpenSSF best practices badge detected","details":null,"documentation":{"short":"Determines if the project has an OpenSSF (formerly CII) Best Practices Badge.","url":"https://github.com/ossf/scorecard/blob/f6ed084d17c9236477efd66e5b258b9d4cc7b389/docs/checks.md#cii-best-practices"}},{"name":"Security-Policy","score":0,"reason":"security policy file not detected","details":["Warn: no security policy file detected","Warn: no security file to analyze","Warn: no security file to analyze","Warn: no security file to analyze"],"documentation":{"short":"Determines if the project has published a security policy.","url":"https://github.com/ossf/scorecard/blob/f6ed084d17c9236477efd66e5b258b9d4cc7b389/docs/checks.md#security-policy"}},{"name":"Fuzzing","score":0,"reason":"project is not fuzzed","details":["Warn: no fuzzer integrations found"],"documentation":{"short":"Determines if the project uses fuzzing.","url":"https://github.com/ossf/scorecard/blob/f6ed084d17c9236477efd66e5b258b9d4cc7b389/docs/checks.md#fuzzing"}},{"name":"License","score":10,"reason":"license file detected","details":["Info: project has a license file: 
LICENSE:0","Info: FSF or OSI recognized license: Apache License 2.0: LICENSE:0"],"documentation":{"short":"Determines if the project has defined a license.","url":"https://github.com/ossf/scorecard/blob/f6ed084d17c9236477efd66e5b258b9d4cc7b389/docs/checks.md#license"}},{"name":"Signed-Releases","score":0,"reason":"Project has not signed or included provenance with any releases.","details":["Warn: release artifact v1.8.2 not signed: https://api.github.com/repos/OCR-D/ocrd_anybaseocr/releases/63167286","Warn: release artifact v1.8.2 does not have provenance: https://api.github.com/repos/OCR-D/ocrd_anybaseocr/releases/63167286"],"documentation":{"short":"Determines if the project cryptographically signs release artifacts.","url":"https://github.com/ossf/scorecard/blob/f6ed084d17c9236477efd66e5b258b9d4cc7b389/docs/checks.md#signed-releases"}},{"name":"Pinned-Dependencies","score":0,"reason":"dependency not pinned by hash detected -- score normalized to 0","details":["Warn: containerImage not pinned by hash: Dockerfile:2","Warn: pipCommand not pinned by hash: Dockerfile:41","Info:   0 out of   1 containerImage dependencies pinned","Info:   0 out of   1 pipCommand dependencies pinned"],"documentation":{"short":"Determines if the project has declared and pinned the dependencies of its build process.","url":"https://github.com/ossf/scorecard/blob/f6ed084d17c9236477efd66e5b258b9d4cc7b389/docs/checks.md#pinned-dependencies"}},{"name":"Branch-Protection","score":0,"reason":"branch protection not enabled on development/release branches","details":["Warn: branch protection not enabled for branch 'master'"],"documentation":{"short":"Determines if the default and release branches are protected with GitHub's branch protection settings.","url":"https://github.com/ossf/scorecard/blob/f6ed084d17c9236477efd66e5b258b9d4cc7b389/docs/checks.md#branch-protection"}},{"name":"Vulnerabilities","score":8,"reason":"2 existing vulnerabilities detected","details":["Warn: Project is vulnerable 
to: PYSEC-2023-102","Warn: Project is vulnerable to: PYSEC-2023-114"],"documentation":{"short":"Determines if the project has open, known unfixed vulnerabilities.","url":"https://github.com/ossf/scorecard/blob/f6ed084d17c9236477efd66e5b258b9d4cc7b389/docs/checks.md#vulnerabilities"}},{"name":"SAST","score":0,"reason":"SAST tool is not run on all commits -- score normalized to 0","details":["Warn: 0 commits out of 26 are checked with a SAST tool"],"documentation":{"short":"Determines if the project uses static code analysis.","url":"https://github.com/ossf/scorecard/blob/f6ed084d17c9236477efd66e5b258b9d4cc7b389/docs/checks.md#sast"}}]},"last_synced_at":"2025-08-15T10:39:26.429Z","repository_id":42624949,"created_at":"2025-08-15T10:39:26.429Z","updated_at":"2025-08-15T10:39:26.429Z"},"host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":286080680,"owners_count":28975278,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2026-02-01T09:57:52.632Z","status":"ssl_error","status_checked_at":"2026-02-01T09:57:49.143Z","response_time":56,"last_error":"SSL_connect returned=1 errno=0 peeraddr=140.82.121.6:443 state=error: unexpected eof while reading","robots_txt_status":"success","robots_txt_updated_at":"2025-07-24T06:49:26.215Z","robots_txt_url":"https://github.com/robots.txt","online":false,"can_crawl_api":true,"host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["ocr","ocr-d","ocr-d-mp"],"created_at":"2026-01-23T01:00:32.313Z","updated_at":"2026-02-01T10:00:45.295Z","avatar_url":"https://github.com/OCR-D.png","language":"PureBasic","funding_links":[],"categories":["Public Groups and Projects on 
GitHub.com"],"sub_categories":[],"readme":"# Document Cropping\n\n[![CircleCI](https://circleci.com/gh/OCR-D/ocrd_anybaseocr.svg?style=svg)](https://circleci.com/gh/OCR-D/ocrd_anybaseocr)\n[![PyPI](https://img.shields.io/pypi/v/ocrd_anybaseocr.svg)](https://pypi.org/project/ocrd_anybaseocr/)\n\n\n\u003e Tools to crop scanned images for OCR-D\n\n   * [Installing](#installing)\n   * [Tools](#tools)\n      * [Cropper](#cropper)\n      * [Document Analyser](#document-analyser)\n   * [Testing](#testing)\n   * [License](#license)\n\n# Installing\n\nRequires Python \u003e= 3.6.\n\n1. Create a new `venv` unless you already have one\n\n        python3 -m venv venv\n\n2. Activate the `venv`\n\n        source venv/bin/activate\n\n3. To install from source, install GNU make and run:\n\n        make install\n\n   Prebuilt packages are also available on PyPI:\n\n        pip install ocrd_anybaseocr\n\n# Tools\n\nAll tools, also called _processors_, abide by the [CLI specifications](https://ocr-d.de/en/spec/cli) for [OCR-D](https://ocr-d.de), which roughly look like:\n\n    ocrd-\u003cprocessor-name\u003e [-m \u003cpath to METS input file\u003e] -I \u003cinput group\u003e -O \u003coutput group\u003e [-p \u003cpath to parameter file\u003e]* [-P \u003cparam name\u003e \u003cparam value\u003e]*\n\n## Cropper\n\n### Method Behaviour\nFor each page, this processor takes a document image as input and computes the border around the page content area (i.e. removes textual noise as well as any other noise around the page frame). 
It also annotates a cropped image.\n\nThe input image does not need to be binarized, but should be deskewed for the module to work optimally.\n\nImplemented via rule-based methods (gradient-based line segment detection and morphology-based textline detection).\n\n### Example\n\n    ocrd-anybaseocr-crop -I OCR-D-DESKEW -O OCR-D-CROP -P rulerAreaMax 0 -P marginLeft 0.1\n\n## Document Analyser\n\n### Method Behaviour\nFor the whole document, this processor takes all the cropped page images and their corresponding text regions as input and computes the logical structure (page types and sections).\n\nThe input image should be binarized and segmented for this module to work.\n\nImplemented via data-driven methods (neural Inception-V3 image classification model trained with TensorFlow/Keras).\n\n\n### Example\n\n    ocrd-anybaseocr-layout-analysis -I OCR-D-LINE -O OCR-D-STRUCT\n\n## Testing\n\nTo test the tools under realistic conditions (on OCR-D workspaces),\ndownload [OCR-D/assets](https://github.com/OCR-D/assets). 
In particular,\nthe code is tested with the [dfki-testdata](https://github.com/OCR-D/assets/tree/master/data/dfki-testdata)\ndataset.\n\nTo download the data:\n\n    make assets\n\nTo run module tests:\n\n    make test\n\nTo run processor/workflow tests:\n\n    make cli-test\n\n## License\n\n\n```\n Licensed under the Apache License, Version 2.0 (the \"License\");\n you may not use this file except in compliance with the License.\n You may obtain a copy of the License at\n\n     http://www.apache.org/licenses/LICENSE-2.0\n\n Unless required by applicable law or agreed to in writing, software\n distributed under the License is distributed on an \"AS IS\" BASIS,\n WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.\n See the License for the specific language governing permissions and\n limitations under the License.\n```\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2FOCR-D%2Focrd_anybaseocr","html_url":"https://awesome.ecosyste.ms/projects/github.com%2FOCR-D%2Focrd_anybaseocr","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2FOCR-D%2Focrd_anybaseocr/lists"}