{"id":27695045,"url":"https://github.com/tboenig/gt_corpus_benchmark","last_synced_at":"2026-02-02T21:08:45.799Z","repository":{"id":119330356,"uuid":"570050686","full_name":"tboenig/gt_corpus_benchmark","owner":"tboenig","description":"This repo provides a collection of ground truth data. The collection was compiled under different aspects (complexity of the layouts and use of the fonts). The individual data are also characterized by metadata. The metadata is based on the labeling scheme of OCR-D/PrimaLab.","archived":false,"fork":false,"pushed_at":"2023-04-05T07:57:09.000Z","size":26,"stargazers_count":2,"open_issues_count":0,"forks_count":0,"subscribers_count":2,"default_branch":"main","last_synced_at":"2025-04-25T14:15:01.167Z","etag":null,"topics":["corp","ground-truth","ocr-d","pagexml"],"latest_commit_sha":null,"homepage":"https://tboenig.github.io/gt_corpus_benchmark/","language":null,"has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":null,"status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/tboenig.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":null,"code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null,"zenodo":null}},"created_at":"2022-11-24T08:19:39.000Z","updated_at":"2023-04-11T08:15:29.000Z","dependencies_parsed_at":null,"dependency_job_id":"deafa43d-0e35-4590-82e2-b3c713f33bbc","html_url":"https://github.com/tboenig/gt_corpus_benchmark","commit_stats":null,"previous_names":[],"tags_count":14,"template":false,"template_full_name":null,"purl":"pkg:github/tboenig/gt_corpus_benchmark","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/tboenig%2Fgt_corpus_benchmark","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositorie
s/tboenig%2Fgt_corpus_benchmark/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/tboenig%2Fgt_corpus_benchmark/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/tboenig%2Fgt_corpus_benchmark/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/tboenig","download_url":"https://codeload.github.com/tboenig/gt_corpus_benchmark/tar.gz/refs/heads/main","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/tboenig%2Fgt_corpus_benchmark/sbom","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":263577268,"owners_count":23483132,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["corp","ground-truth","ocr-d","pagexml"],"created_at":"2025-04-25T14:15:00.497Z","updated_at":"2026-02-02T21:08:40.774Z","avatar_url":"https://github.com/tboenig.png","language":null,"funding_links":[],"categories":[],"sub_categories":[],"readme":"\u003cdiv\u003e\n   \u003cdiv id=\"main\"\u003e\n      \u003ch1\u003e📚 Corpus\u003c/h1\u003e\n      \u003cp\u003eThis corpus includes Ground Truth (GT) data compiled considering the following features:\u003c/p\u003e\n      \u003col\u003e\n         \u003cli\u003eClassification into font groups: Gothic/Blackletter, Antiqua and FontMix (Antiqua and Blackletter)\u003cbr/\u003e\n         distinction of the selected print type or combinations\u003c/li\u003e\n         \u003cli\u003eClassification into simple and complex\u003cbr/\u003e\n         complexity of the layout (columns, footnotes,...)\u003c/li\u003e\n      \u003c/ol\u003e\n  
    \u003cp\u003eThe data are also divided according to the time of creation or production.\u003c/p\u003e\n      \u003ch2\u003e🖉 Creation\u003c/h2\u003e\n      \u003cp\u003eThe data were created according to the OCR-D Ground Truth Guideline (https://ocr-d.de/en/gt-guidelines/trans/).\u003c/p\u003e\n      \u003ch2\u003e💻 Repositories\u003c/h2\u003e\n      \u003cdiv id=\"data\"\u003e\n         \u003cdetails\u003e\n            \u003csummary\u003eGothic/Blackletter\u003c/summary\u003e\n            \u003cdetails\u003e\n               \u003csummary\u003e\n                  simple\n               \u003c/summary\u003e\n               \u003cul\u003e\n                  \u003cli\u003e\u003ca href=\"https://github.com/tboenig/16_frak_simple\"\u003ehttps://github.com/tboenig/16_frak_simple\u003c/a\u003e\u003c/li\u003e\n                  \u003cli\u003e\u003ca href=\"https://github.com/tboenig/17_frak_simple\"\u003ehttps://github.com/tboenig/17_frak_simple\u003c/a\u003e\u003c/li\u003e\n                  \u003cli\u003e\u003ca href=\"https://github.com/tboenig/18_frak_simple\"\u003ehttps://github.com/tboenig/18_frak_simple\u003c/a\u003e\u003c/li\u003e\n                  \u003cli\u003e\u003ca href=\"https://github.com/tboenig/19_frak_simple\"\u003ehttps://github.com/tboenig/19_frak_simple\u003c/a\u003e\u003c/li\u003e\n               \u003c/ul\u003e\n            \u003c/details\u003e\n            \u003cdetails\u003e\n               \u003csummary\u003e\n                  complex\n               \u003c/summary\u003e\n               \u003cul\u003e\n                  \u003cli\u003e\u003ca href=\"https://github.com/tboenig/16_frak_complex\"\u003ehttps://github.com/tboenig/16_frak_complex\u003c/a\u003e\u003c/li\u003e\n                  \u003cli\u003e\u003ca href=\"https://github.com/tboenig/17_frak_complex\"\u003ehttps://github.com/tboenig/17_frak_complex\u003c/a\u003e\u003c/li\u003e\n                  \u003cli\u003e\u003ca 
href=\"https://github.com/tboenig/18_frak_complex\"\u003ehttps://github.com/tboenig/18_frak_complex\u003c/a\u003e\u003c/li\u003e\n               \u003c/ul\u003e\n            \u003c/details\u003e\n         \u003c/details\u003e\n         \u003cdetails\u003e\n            \u003csummary\u003eAntiqua\u003c/summary\u003e\n            \u003cdetails\u003e\n               \u003csummary\u003e\n                  simple\n               \u003c/summary\u003e\n               \u003cul\u003e\n                  \u003cli\u003e\u003ca href=\"https://github.com/tboenig/16_ant_simple\"\u003ehttps://github.com/tboenig/16_ant_simple\u003c/a\u003e\u003c/li\u003e\n                  \u003cli\u003e\u003ca href=\"https://github.com/tboenig/18_ant_simple\"\u003ehttps://github.com/tboenig/18_ant_simple\u003c/a\u003e\u003c/li\u003e\n               \u003c/ul\u003e\n            \u003c/details\u003e\n            \u003cdetails\u003e\n               \u003csummary\u003e\n                  complex\n               \u003c/summary\u003e\n               \u003cul\u003e\n                  \u003cli\u003e\u003ca href=\"https://github.com/tboenig/16_ant_complex\"\u003ehttps://github.com/tboenig/16_ant_complex\u003c/a\u003e\u003c/li\u003e\n                  \u003cli\u003e\u003ca href=\"https://github.com/tboenig/19_ant_simple\"\u003ehttps://github.com/tboenig/19_ant_simple\u003c/a\u003e\u003c/li\u003e\n               \u003c/ul\u003e\n            \u003c/details\u003e\n         \u003c/details\u003e\n         \u003cdetails\u003e\n            \u003csummary\u003eFontMix (Antiqua and Blackletter)\u003c/summary\u003e\n            \u003cdetails\u003e\n               \u003csummary\u003e\n                  fontmix\n               \u003c/summary\u003e\n               \u003cul\u003e\n                  \u003cli\u003e\u003ca href=\"https://github.com/tboenig/17_fontmix_simple\"\u003ehttps://github.com/tboenig/17_fontmix_simple\u003c/a\u003e\u003c/li\u003e\n                  \u003cli\u003e\u003ca 
href=\"https://github.com/tboenig/18_fontmix_complex\"\u003ehttps://github.com/tboenig/18_fontmix_complex\u003c/a\u003e\u003c/li\u003e\n               \u003c/ul\u003e\n            \u003c/details\u003e\n         \u003c/details\u003e\n      \u003c/div\u003e\n   \u003c/div\u003e\n   \u003cdiv\u003e\n      \u003ch1\u003eAnalyzed collection\u003c/h1\u003e\n      \u003cp\u003eThe GT data has been labeled. The labeling is based on an ontology defined by the Pattern Recognition \n                    and Image Analysis Research Lab (PRImA-Research-Lab) at the University of Salford. The labeling metadata \n                    is created for each available page. The following labeling metadata is available for the different collections.\u003c/p\u003e\n      \u003cp\u003esee: gt-labelling : semantic-labelling OCR ground truth data (https://github.com/OCR-D/gt-labelling)\u003c/p\u003e\n      \u003cdiv\u003e\n         \u003ch2\u003eFontMix (Antiqua and Blackletter)\u003c/h2\u003e\n         \u003cdiv\u003e\n            \u003cdetails\u003e\n               \u003csummary\u003esimple\u003c/summary\u003e\n               \u003cul\u003e\n                  \u003cli\u003e\n                     \u003cdetails\u003e\n                        \u003csummary\u003eactivityDomain/computing/visual/analysisRecognition/layoutAnalysis\u003c/summary\u003e\n                        \u003cp\u003eIn computer vision, document layout analysis is the process of identifying and categorizing the regions of interest in the scanned image of a text document. 
A reading system requires the segmentation of text zones from non-textual ones and the arrangement in their correct reading order.\n\nExamples:\nPage layout analysis (segmentation into regions, classification into text, graphic, table etc.)\n\nRelated:\n\"OCR\": Often used as a synonym for layout analysis and text recognition, but strictly only the text recognition component.\u003c/p\u003e\n                     \u003c/details\u003e\n                  \u003c/li\u003e\n                  \u003cli\u003e\n                     \u003cdetails\u003e\n                        \u003csummary\u003eactivityDomain/computing/visual/analysisRecognition/ocr\u003c/summary\u003e\n                        \u003cp/\u003e\n                     \u003c/details\u003e\n                  \u003c/li\u003e\n                  \u003cli\u003e\n                     \u003cdetails\u003e\n                        \u003csummary\u003eactivityDomain/computing/visual/analysisRecognition/text\u003c/summary\u003e\n                        \u003cp\u003eTranslation of any kind of depicted symbols to machine readable format\n\nExamples:\nOCR\nMathematical equation recognition\n\nRelated:\nText processing (separate category)\nTable recognition\nMap reading\u003c/p\u003e\n                     \u003c/details\u003e\n                  \u003c/li\u003e\n                  \u003cli\u003e\n                     \u003cdetails\u003e\n                        \u003csummary\u003econdition/acquisition/method-flaws/imaging/uneven-illumination\u003c/summary\u003e\n                        \u003cp\u003eUneven illumination leading to brightness or contrast variations\u003c/p\u003e\n                     \u003c/details\u003e\n                  \u003c/li\u003e\n                  \u003cli\u003e\n                     \u003cdetails\u003e\n                        \u003csummary\u003econdition/production-related/document-characteristics/low-contrast\u003c/summary\u003e\n                        \u003cp\u003eThe contrast between the paper and the 
page content is very low\u003c/p\u003e\n                     \u003c/details\u003e\n                  \u003c/li\u003e\n                  \u003cli\u003e\n                     \u003cdetails\u003e\n                        \u003csummary\u003econdition/production-related/document-faults/ink-from-facing\u003c/summary\u003e\n                        \u003cp\u003eInk from facing page was transferred to this page\u003c/p\u003e\n                     \u003c/details\u003e\n                  \u003c/li\u003e\n                  \u003cli\u003e\n                     \u003cdetails\u003e\n                        \u003csummary\u003econdition/wear/additions/informative/annotations\u003c/summary\u003e\n                        \u003cp\u003eAnnotations regarding the content\u003c/p\u003e\n                     \u003c/details\u003e\n                  \u003c/li\u003e\n                  \u003cli\u003e\n                     \u003cdetails\u003e\n                        \u003csummary\u003econtent-encoding/structured\u003c/summary\u003e\n                        \u003cp\u003eE.g. 
XML\u003c/p\u003e\n                     \u003c/details\u003e\n                  \u003c/li\u003e\n                  \u003cli\u003e\n                     \u003cdetails\u003e\n                        \u003csummary\u003econtent-type/corpus\u003c/summary\u003e\n                        \u003cp\u003e\nCorpus: a collection of written texts, especially the entire works of a particular author or a body of writing on a particular subject.\n\nExamples:\nA text corpus,\nAn image database\u003c/p\u003e\n                     \u003c/details\u003e\n                  \u003c/li\u003e\n                  \u003cli\u003e\n                     \u003cdetails\u003e\n                        \u003csummary\u003econtentOfInterest/visual/graphical\u003c/summary\u003e\n                        \u003cp\u003e\n                        Description coming soon.\n                    \u003c/p\u003e\n                     \u003c/details\u003e\n                  \u003c/li\u003e\n                  \u003cli\u003e\n                     \u003cdetails\u003e\n                        \u003csummary\u003econtentOfInterest/visual/graphical/separator\u003c/summary\u003e\n                        \u003cp\u003e\n                        Description coming soon.\n                    \u003c/p\u003e\n                     \u003c/details\u003e\n                  \u003c/li\u003e\n                  \u003cli\u003e\n                     \u003cdetails\u003e\n                        \u003csummary\u003econtentOfInterest/visual/text\u003c/summary\u003e\n                        \u003cp\u003e\n                        Description coming soon.\n                    \u003c/p\u003e\n                     \u003c/details\u003e\n                  \u003c/li\u003e\n                  \u003cli\u003e\n                     \u003cdetails\u003e\n                        \u003csummary\u003edata-attributes/document-related/structural/running-titles\u003c/summary\u003e\n                        \u003cp\u003eTitles repeated each page\u003c/p\u003e\n           
          \u003c/details\u003e\n                  \u003c/li\u003e\n                  \u003cli\u003e\n                     \u003cdetails\u003e\n                        \u003csummary\u003edata-attributes/document-related/visual/text/drop-caps\u003c/summary\u003e\n                        \u003cp\u003eDrop capitals (large capitals at beginning of paragraph)\u003c/p\u003e\n                     \u003c/details\u003e\n                  \u003c/li\u003e\n                  \u003cli\u003e\n                     \u003cdetails\u003e\n                        \u003csummary\u003edata-attributes/document-related/visual/text/font/multi-font/font-sizes\u003c/summary\u003e\n                        \u003cp\u003eMore than one font size used\u003c/p\u003e\n                     \u003c/details\u003e\n                  \u003c/li\u003e\n                  \u003cli\u003e\n                     \u003cdetails\u003e\n                        \u003csummary\u003edata-attributes/document-related/visual/text/font/multi-font/typefaces\u003c/summary\u003e\n                        \u003cp\u003eMore than one typeface used\u003c/p\u003e\n                     \u003c/details\u003e\n                  \u003c/li\u003e\n                  \u003cli\u003e\n                     \u003cdetails\u003e\n                        \u003csummary\u003edata-attributes/document-related/visual/text/font/typeface/antiqua\u003c/summary\u003e\n                        \u003cp\u003eAntiqua font (more modern)\u003c/p\u003e\n                     \u003c/details\u003e\n                  \u003c/li\u003e\n                  \u003cli\u003e\n                     \u003cdetails\u003e\n                        \u003csummary\u003edata-attributes/document-related/visual/text/font/typeface/blackletter\u003c/summary\u003e\n                        \u003cp\u003eBlackletter, gothic, Fraktur\u003c/p\u003e\n                     \u003c/details\u003e\n                  \u003c/li\u003e\n                  \u003cli\u003e\n                     \u003cdetails\u003e\n 
                       \u003csummary\u003edata-attributes/language/mixed\u003c/summary\u003e\n                        \u003cp\u003eMore than one language used\u003c/p\u003e\n                     \u003c/details\u003e\n                  \u003c/li\u003e\n                  \u003cli\u003e\n                     \u003cdetails\u003e\n                        \u003csummary\u003egranularity/logical/document-related/paragraph\u003c/summary\u003e\n                        \u003cp\u003e\n                        Description coming soon.\n                    \u003c/p\u003e\n                     \u003c/details\u003e\n                  \u003c/li\u003e\n                  \u003cli\u003e\n                     \u003cdetails\u003e\n                        \u003csummary\u003egranularity/physical/document-related/page\u003c/summary\u003e\n                        \u003cp\u003e\n                        Description coming soon.\n                    \u003c/p\u003e\n                     \u003c/details\u003e\n                  \u003c/li\u003e\n                  \u003cli\u003e\n                     \u003cdetails\u003e\n                        \u003csummary\u003egranularity/physical/document-related/region\u003c/summary\u003e\n                        \u003cp\u003eRegion, zone, block\u003c/p\u003e\n                     \u003c/details\u003e\n                  \u003c/li\u003e\n                  \u003cli\u003e\n                     \u003cdetails\u003e\n                        \u003csummary\u003egranularity/physical/document-related/text-line\u003c/summary\u003e\n                        \u003cp\u003e\n                        Description coming soon.\n                    \u003c/p\u003e\n                     \u003c/details\u003e\n                  \u003c/li\u003e\n                  \u003cli\u003e\n                     \u003cdetails\u003e\n                        \u003csummary\u003egranularity/physical/document-related/word\u003c/summary\u003e\n                        \u003cp\u003eWord or partial word, if 
separated by line break, for example\u003c/p\u003e\n                     \u003c/details\u003e\n                  \u003c/li\u003e\n                  \u003cli\u003e\n                     \u003cdetails\u003e\n                        \u003csummary\u003eplatform/platform-independent\u003c/summary\u003e\n                        \u003cp\u003e\n                        Description coming soon.\n                    \u003c/p\u003e\n                     \u003c/details\u003e\n                  \u003c/li\u003e\n               \u003c/ul\u003e\n            \u003c/details\u003e\n         \u003c/div\u003e\n         \u003cdiv\u003e\n            \u003cdetails\u003e\n               \u003csummary\u003ecomplex\u003c/summary\u003e\n               \u003cul\u003e\n                  \u003cli\u003e\n                     \u003cdetails\u003e\n                        \u003csummary\u003eactivityDomain/computing/visual/analysisRecognition/layoutAnalysis\u003c/summary\u003e\n                        \u003cp\u003eIn computer vision, document layout analysis is the process of identifying and categorizing the regions of interest in the scanned image of a text document. 
A reading system requires the segmentation of text zones from non-textual ones and the arrangement in their correct reading order.\n\nExamples:\nPage layout analysis (segmentation into regions, classification into text, graphic, table etc.)\n\nRelated:\n\"OCR\": Often used as a synonym for layout analysis and text recognition, but strictly only the text recognition component.\u003c/p\u003e\n                     \u003c/details\u003e\n                  \u003c/li\u003e\n                  \u003cli\u003e\n                     \u003cdetails\u003e\n                        \u003csummary\u003eactivityDomain/computing/visual/analysisRecognition/ocr\u003c/summary\u003e\n                        \u003cp/\u003e\n                     \u003c/details\u003e\n                  \u003c/li\u003e\n                  \u003cli\u003e\n                     \u003cdetails\u003e\n                        \u003csummary\u003eactivityDomain/computing/visual/analysisRecognition/text\u003c/summary\u003e\n                        \u003cp\u003eTranslation of any kind of depicted symbols to machine readable format\n\nExamples:\nOCR\nMathematical equation recognition\n\nRelated:\nText processing (separate category)\nTable recognition\nMap reading\u003c/p\u003e\n                     \u003c/details\u003e\n                  \u003c/li\u003e\n                  \u003cli\u003e\n                     \u003cdetails\u003e\n                        \u003csummary\u003econdition/acquisition/content-or-background/included-objects/preceeding-or-proceeding\u003c/summary\u003e\n                        \u003cp\u003ePart of preceding or succeeding object included (e.g. other page)\u003c/p\u003e\n                     \u003c/details\u003e\n                  \u003c/li\u003e\n                  \u003cli\u003e\n                     \u003cdetails\u003e\n                        \u003csummary\u003econdition/acquisition/geometric/page-curl\u003c/summary\u003e\n                        \u003cp\u003eVisible page curl (e.g. 
book scanning)\u003c/p\u003e\n                     \u003c/details\u003e\n                  \u003c/li\u003e\n                  \u003cli\u003e\n                     \u003cdetails\u003e\n                        \u003csummary\u003econdition/acquisition/geometric/perspective-distortions\u003c/summary\u003e\n                        \u003cp\u003ePerspective distortions (e.g. due to camera-based acquisition)\u003c/p\u003e\n                     \u003c/details\u003e\n                  \u003c/li\u003e\n                  \u003cli\u003e\n                     \u003cdetails\u003e\n                        \u003csummary\u003econdition/acquisition/method-flaws/imaging/uneven-illumination\u003c/summary\u003e\n                        \u003cp\u003eUneven illumination leading to brightness or contrast variations\u003c/p\u003e\n                     \u003c/details\u003e\n                  \u003c/li\u003e\n                  \u003cli\u003e\n                     \u003cdetails\u003e\n                        \u003csummary\u003econdition/production-related/document-characteristics/low-contrast\u003c/summary\u003e\n                        \u003cp\u003eThe contrast between the paper and the page content is very low\u003c/p\u003e\n                     \u003c/details\u003e\n                  \u003c/li\u003e\n                  \u003cli\u003e\n                     \u003cdetails\u003e\n                        \u003csummary\u003econdition/production-related/document-faults/ink-from-facing\u003c/summary\u003e\n                        \u003cp\u003eInk from facing page was transferred to this page\u003c/p\u003e\n                     \u003c/details\u003e\n                  \u003c/li\u003e\n                  \u003cli\u003e\n                     \u003cdetails\u003e\n                        \u003csummary\u003econtent-encoding/structured\u003c/summary\u003e\n                        \u003cp\u003eE.g. 
XML\u003c/p\u003e\n                     \u003c/details\u003e\n                  \u003c/li\u003e\n                  \u003cli\u003e\n                     \u003cdetails\u003e\n                        \u003csummary\u003econtent-type/corpus\u003c/summary\u003e\n                        \u003cp\u003e\nCorpus: a collection of written texts, especially the entire works of a particular author or a body of writing on a particular subject.\n\nExamples:\nA text corpus,\nAn image database\u003c/p\u003e\n                     \u003c/details\u003e\n                  \u003c/li\u003e\n                  \u003cli\u003e\n                     \u003cdetails\u003e\n                        \u003csummary\u003econtentOfInterest/visual/graphical/separator\u003c/summary\u003e\n                        \u003cp\u003e\n                        Description coming soon.\n                    \u003c/p\u003e\n                     \u003c/details\u003e\n                  \u003c/li\u003e\n                  \u003cli\u003e\n                     \u003cdetails\u003e\n                        \u003csummary\u003econtentOfInterest/visual/text\u003c/summary\u003e\n                        \u003cp\u003e\n                        Description coming soon.\n                    \u003c/p\u003e\n                     \u003c/details\u003e\n                  \u003c/li\u003e\n                  \u003cli\u003e\n                     \u003cdetails\u003e\n                        \u003csummary\u003edata-attributes/document-related/structural/footnote-continued\u003c/summary\u003e\n                        \u003cp/\u003e\n                     \u003c/details\u003e\n                  \u003c/li\u003e\n                  \u003cli\u003e\n                     \u003cdetails\u003e\n                        \u003csummary\u003edata-attributes/document-related/structural/footnotes\u003c/summary\u003e\n                        \u003cp\u003eFootnotes at bottom of page\u003c/p\u003e\n                     \u003c/details\u003e\n                  
\u003c/li\u003e\n                  \u003cli\u003e\n                     \u003cdetails\u003e\n                        \u003csummary\u003edata-attributes/document-related/structural/running-titles\u003c/summary\u003e\n                        \u003cp\u003eTitles repeated each page\u003c/p\u003e\n                     \u003c/details\u003e\n                  \u003c/li\u003e\n                  \u003cli\u003e\n                     \u003cdetails\u003e\n                        \u003csummary\u003edata-attributes/document-related/visual/text/drop-caps\u003c/summary\u003e\n                        \u003cp\u003eDrop capitals (large capitals at beginning of paragraph)\u003c/p\u003e\n                     \u003c/details\u003e\n                  \u003c/li\u003e\n                  \u003cli\u003e\n                     \u003cdetails\u003e\n                        \u003csummary\u003edata-attributes/document-related/visual/text/font/multi-font/font-sizes\u003c/summary\u003e\n                        \u003cp\u003eMore than one font size used\u003c/p\u003e\n                     \u003c/details\u003e\n                  \u003c/li\u003e\n                  \u003cli\u003e\n                     \u003cdetails\u003e\n                        \u003csummary\u003edata-attributes/document-related/visual/text/font/multi-font/typefaces\u003c/summary\u003e\n                        \u003cp\u003eMore than one typeface used\u003c/p\u003e\n                     \u003c/details\u003e\n                  \u003c/li\u003e\n                  \u003cli\u003e\n                     \u003cdetails\u003e\n                        \u003csummary\u003edata-attributes/document-related/visual/text/font/typeface/antiqua\u003c/summary\u003e\n                        \u003cp\u003eAntiqua font (more modern)\u003c/p\u003e\n                     \u003c/details\u003e\n                  \u003c/li\u003e\n                  \u003cli\u003e\n                     \u003cdetails\u003e\n                        
\u003csummary\u003edata-attributes/document-related/visual/text/font/typeface/blackletter\u003c/summary\u003e\n                        \u003cp\u003eBlackletter, gothic, Fraktur\u003c/p\u003e\n                     \u003c/details\u003e\n                  \u003c/li\u003e\n                  \u003cli\u003e\n                     \u003cdetails\u003e\n                        \u003csummary\u003edata-attributes/language/mixed\u003c/summary\u003e\n                        \u003cp\u003eMore than one language used\u003c/p\u003e\n                     \u003c/details\u003e\n                  \u003c/li\u003e\n                  \u003cli\u003e\n                     \u003cdetails\u003e\n                        \u003csummary\u003egranularity/logical/document-related/paragraph\u003c/summary\u003e\n                        \u003cp\u003e\n                        Description coming soon.\n                    \u003c/p\u003e\n                     \u003c/details\u003e\n                  \u003c/li\u003e\n                  \u003cli\u003e\n                     \u003cdetails\u003e\n                        \u003csummary\u003egranularity/physical/document-related/page\u003c/summary\u003e\n                        \u003cp\u003e\n                        Description coming soon.\n                    \u003c/p\u003e\n                     \u003c/details\u003e\n                  \u003c/li\u003e\n                  \u003cli\u003e\n                     \u003cdetails\u003e\n                        \u003csummary\u003egranularity/physical/document-related/region\u003c/summary\u003e\n                        \u003cp\u003eRegion, zone, block\u003c/p\u003e\n                     \u003c/details\u003e\n                  \u003c/li\u003e\n                  \u003cli\u003e\n                     \u003cdetails\u003e\n                        \u003csummary\u003egranularity/physical/document-related/text-line\u003c/summary\u003e\n                        \u003cp\u003e\n                        Description coming soon.\n             
       \u003c/p\u003e\n                     \u003c/details\u003e\n                  \u003c/li\u003e\n                  \u003cli\u003e\n                     \u003cdetails\u003e\n                        \u003csummary\u003egranularity/physical/document-related/word\u003c/summary\u003e\n                        \u003cp\u003eWord or partial word, if separated by line break, for example\u003c/p\u003e\n                     \u003c/details\u003e\n                  \u003c/li\u003e\n                  \u003cli\u003e\n                     \u003cdetails\u003e\n                        \u003csummary\u003eplatform/platform-independent\u003c/summary\u003e\n                        \u003cp\u003e\n                        Description coming soon.\n                    \u003c/p\u003e\n                     \u003c/details\u003e\n                  \u003c/li\u003e\n               \u003c/ul\u003e\n            \u003c/details\u003e\n         \u003c/div\u003e\n      \u003c/div\u003e\n      \u003cdiv\u003e\n         \u003ch2\u003eGothic/Blackletter\u003c/h2\u003e\n         \u003cdiv\u003e\n            \u003cdetails\u003e\n               \u003csummary\u003esimple\u003c/summary\u003e\n               \u003cul\u003e\n                  \u003cli\u003e\n                     \u003cdetails\u003e\n                        \u003csummary\u003eactivityDomain/computing/visual/analysisRecognition/layoutAnalysis\u003c/summary\u003e\n                        \u003cp\u003eIn computer vision, document layout analysis is the process of identifying and categorizing the regions of interest in the scanned image of a text document. 
A reading system requires the segmentation of text zones from non-textual ones and the arrangement in their correct reading order.\n\nExamples:\nPage layout analysis (segmentation into regions, classification into text, graphic, table etc.)\n\nRelated:\n\"OCR\": Often used as a synonym for layout analysis and text recognition, but strictly only the text recognition component.\u003c/p\u003e\n                     \u003c/details\u003e\n                  \u003c/li\u003e\n                  \u003cli\u003e\n                     \u003cdetails\u003e\n                        \u003csummary\u003eactivityDomain/computing/visual/analysisRecognition/ocr\u003c/summary\u003e\n                        \u003cp/\u003e\n                     \u003c/details\u003e\n                  \u003c/li\u003e\n                  \u003cli\u003e\n                     \u003cdetails\u003e\n                        \u003csummary\u003eactivityDomain/computing/visual/analysisRecognition/text\u003c/summary\u003e\n                        \u003cp\u003eTranslation of any kind of depicted symbols to machine readable format\n\nExamples:\nOCR\nMathematical equation recognition\n\nRelated:\nText processing (separate category)\nTable recognition\nMap reading\u003c/p\u003e\n                     \u003c/details\u003e\n                  \u003c/li\u003e\n                  \u003cli\u003e\n                     \u003cdetails\u003e\n                        \u003csummary\u003econdition/acquisition/geometric/page-curl\u003c/summary\u003e\n                        \u003cp\u003eVisible page curl (e.g. book scanning)\u003c/p\u003e\n                     \u003c/details\u003e\n                  \u003c/li\u003e\n                  \u003cli\u003e\n                     \u003cdetails\u003e\n                        \u003csummary\u003econdition/acquisition/geometric/perspective-distortions\u003c/summary\u003e\n                        \u003cp\u003ePerspective distortions (e.g. 
due to camera-based acquisition)\u003c/p\u003e\n                     \u003c/details\u003e\n                  \u003c/li\u003e\n                  \u003cli\u003e\n                     \u003cdetails\u003e\n                        \u003csummary\u003econdition/ageing/warping\u003c/summary\u003e\n                        \u003cp\u003eArbitrary warping (e.g. due to moisture)\u003c/p\u003e\n                     \u003c/details\u003e\n                  \u003c/li\u003e\n                  \u003cli\u003e\n                     \u003cdetails\u003e\n                        \u003csummary\u003econdition/production-related/document-faults/ink-from-facing\u003c/summary\u003e\n                        \u003cp\u003eInk from facing page was transferred to this page\u003c/p\u003e\n                     \u003c/details\u003e\n                  \u003c/li\u003e\n                  \u003cli\u003e\n                     \u003cdetails\u003e\n                        \u003csummary\u003econdition/wear/additions/informative/annotations\u003c/summary\u003e\n                        \u003cp\u003eAnnotations regarding the content\u003c/p\u003e\n                     \u003c/details\u003e\n                  \u003c/li\u003e\n                  \u003cli\u003e\n                     \u003cdetails\u003e\n                        \u003csummary\u003econdition/wear/medium-damage/stains\u003c/summary\u003e\n                        \u003cp\u003eNoticeable stains on medium\u003c/p\u003e\n                     \u003c/details\u003e\n                  \u003c/li\u003e\n                  \u003cli\u003e\n                     \u003cdetails\u003e\n                        \u003csummary\u003econtent-encoding/structured\u003c/summary\u003e\n                        \u003cp\u003eE.g. 
XML\u003c/p\u003e\n                     \u003c/details\u003e\n                  \u003c/li\u003e\n                  \u003cli\u003e\n                     \u003cdetails\u003e\n                        \u003csummary\u003econtent-type/corpus\u003c/summary\u003e\n                        \u003cp\u003e\nCorpus: a collection of written texts, especially the entire works of a particular author or a body of writing on a particular subject.\n\nExamples:\nA text corpus,\nAn image database\u003c/p\u003e\n                     \u003c/details\u003e\n                  \u003c/li\u003e\n                  \u003cli\u003e\n                     \u003cdetails\u003e\n                        \u003csummary\u003econtentOfInterest/visual/graphical\u003c/summary\u003e\n                        \u003cp\u003e\n                        Description coming soon.\n                    \u003c/p\u003e\n                     \u003c/details\u003e\n                  \u003c/li\u003e\n                  \u003cli\u003e\n                     \u003cdetails\u003e\n                        \u003csummary\u003econtentOfInterest/visual/graphical/separator\u003c/summary\u003e\n                        \u003cp\u003e\n                        Description coming soon.\n                    \u003c/p\u003e\n                     \u003c/details\u003e\n                  \u003c/li\u003e\n                  \u003cli\u003e\n                     \u003cdetails\u003e\n                        \u003csummary\u003econtentOfInterest/visual/text\u003c/summary\u003e\n                        \u003cp\u003e\n                        Description coming soon.\n                    \u003c/p\u003e\n                     \u003c/details\u003e\n                  \u003c/li\u003e\n                  \u003cli\u003e\n                     \u003cdetails\u003e\n                        \u003csummary\u003edata-attributes/document-related/structural/running-titles\u003c/summary\u003e\n                        \u003cp\u003eTitles repeated on each page\u003c/p\u003e\n           
          \u003c/details\u003e\n                  \u003c/li\u003e\n                  \u003cli\u003e\n                     \u003cdetails\u003e\n                        \u003csummary\u003edata-attributes/document-related/visual/text/drop-caps\u003c/summary\u003e\n                        \u003cp\u003eDrop capitals (large capitals at beginning of paragraph)\u003c/p\u003e\n                     \u003c/details\u003e\n                  \u003c/li\u003e\n                  \u003cli\u003e\n                     \u003cdetails\u003e\n                        \u003csummary\u003edata-attributes/document-related/visual/text/font/multi-font/font-sizes\u003c/summary\u003e\n                        \u003cp\u003eMore than one font size used\u003c/p\u003e\n                     \u003c/details\u003e\n                  \u003c/li\u003e\n                  \u003cli\u003e\n                     \u003cdetails\u003e\n                        \u003csummary\u003edata-attributes/document-related/visual/text/font/multi-font/typefaces\u003c/summary\u003e\n                        \u003cp\u003eMore than one typeface used\u003c/p\u003e\n                     \u003c/details\u003e\n                  \u003c/li\u003e\n                  \u003cli\u003e\n                     \u003cdetails\u003e\n                        \u003csummary\u003edata-attributes/document-related/visual/text/font/typeface/antiqua\u003c/summary\u003e\n                        \u003cp\u003eAntiqua font (more modern)\u003c/p\u003e\n                     \u003c/details\u003e\n                  \u003c/li\u003e\n                  \u003cli\u003e\n                     \u003cdetails\u003e\n                        \u003csummary\u003edata-attributes/document-related/visual/text/font/typeface/blackletter\u003c/summary\u003e\n                        \u003cp\u003eBlackletter, gothic, Fraktur\u003c/p\u003e\n                     \u003c/details\u003e\n                  \u003c/li\u003e\n                  \u003cli\u003e\n                     \u003cdetails\u003e\n 
                       \u003csummary\u003egranularity/logical/document-related/paragraph\u003c/summary\u003e\n                        \u003cp\u003e\n                        Description coming soon.\n                    \u003c/p\u003e\n                     \u003c/details\u003e\n                  \u003c/li\u003e\n                  \u003cli\u003e\n                     \u003cdetails\u003e\n                        \u003csummary\u003egranularity/physical/document-related/page\u003c/summary\u003e\n                        \u003cp\u003e\n                        Description coming soon.\n                    \u003c/p\u003e\n                     \u003c/details\u003e\n                  \u003c/li\u003e\n                  \u003cli\u003e\n                     \u003cdetails\u003e\n                        \u003csummary\u003egranularity/physical/document-related/region\u003c/summary\u003e\n                        \u003cp\u003eRegion, zone, block\u003c/p\u003e\n                     \u003c/details\u003e\n                  \u003c/li\u003e\n                  \u003cli\u003e\n                     \u003cdetails\u003e\n                        \u003csummary\u003egranularity/physical/document-related/text-line\u003c/summary\u003e\n                        \u003cp\u003e\n                        Description coming soon.\n                    \u003c/p\u003e\n                     \u003c/details\u003e\n                  \u003c/li\u003e\n                  \u003cli\u003e\n                     \u003cdetails\u003e\n                        \u003csummary\u003egranularity/physical/document-related/word\u003c/summary\u003e\n                        \u003cp\u003eWord or partial word, if separated by line break, for example\u003c/p\u003e\n                     \u003c/details\u003e\n                  \u003c/li\u003e\n                  \u003cli\u003e\n                     \u003cdetails\u003e\n                        \u003csummary\u003eplatform/platform-independent\u003c/summary\u003e\n                        
\u003cp\u003e\n                        Description coming soon.\n                    \u003c/p\u003e\n                     \u003c/details\u003e\n                  \u003c/li\u003e\n               \u003c/ul\u003e\n            \u003c/details\u003e\n         \u003c/div\u003e\n         \u003cdiv\u003e\n            \u003cdetails\u003e\n               \u003csummary\u003ecomplex\u003c/summary\u003e\n               \u003cul\u003e\n                  \u003cli\u003e\n                     \u003cdetails\u003e\n                        \u003csummary\u003eactivityDomain/computing/visual/analysisRecognition/layoutAnalysis\u003c/summary\u003e\n                        \u003cp\u003eIn computer vision, document layout analysis is the process of identifying and categorizing the regions of interest in the scanned image of a text document. A reading system requires the segmentation of text zones from non-textual ones and the arrangement in their correct reading order.\n\nExamples:\nPage layout analysis (segmentation into regions, classification into text, graphic, table etc.)\n\nRelated:\n\"OCR\": Often used as a synonym for layout analysis and text recognition, but strictly only the text recognition component.\u003c/p\u003e\n                     \u003c/details\u003e\n                  \u003c/li\u003e\n                  \u003cli\u003e\n                     \u003cdetails\u003e\n                        \u003csummary\u003eactivityDomain/computing/visual/analysisRecognition/ocr\u003c/summary\u003e\n                        \u003cp/\u003e\n                     \u003c/details\u003e\n                  \u003c/li\u003e\n                  \u003cli\u003e\n                     \u003cdetails\u003e\n                        \u003csummary\u003eactivityDomain/computing/visual/analysisRecognition/text\u003c/summary\u003e\n                        \u003cp\u003eTranslation of any kind of depicted symbols to machine readable format\n\nExamples:\nOCR\nMathematical equation recognition\n\nRelated:\nText processing 
(separate category)\nTable recognition\nMap reading\u003c/p\u003e\n                     \u003c/details\u003e\n                  \u003c/li\u003e\n                  \u003cli\u003e\n                     \u003cdetails\u003e\n                        \u003csummary\u003econdition/acquisition/content-or-background/included-objects/preceeding-or-proceeding\u003c/summary\u003e\n                        \u003cp\u003ePart of preceding or succeeding object included (e.g. other page)\u003c/p\u003e\n                     \u003c/details\u003e\n                  \u003c/li\u003e\n                  \u003cli\u003e\n                     \u003cdetails\u003e\n                        \u003csummary\u003econdition/acquisition/geometric/page-curl\u003c/summary\u003e\n                        \u003cp\u003eVisible page curl (e.g. book scanning)\u003c/p\u003e\n                     \u003c/details\u003e\n                  \u003c/li\u003e\n                  \u003cli\u003e\n                     \u003cdetails\u003e\n                        \u003csummary\u003econdition/acquisition/geometric/perspective-distortions\u003c/summary\u003e\n                        \u003cp\u003ePerspective distortions (e.g. due to camera-based acquisition)\u003c/p\u003e\n                     \u003c/details\u003e\n                  \u003c/li\u003e\n                  \u003cli\u003e\n                     \u003cdetails\u003e\n                        \u003csummary\u003econdition/acquisition/method-flaws/imaging/uneven-illumination\u003c/summary\u003e\n                        \u003cp\u003eUneven illumination leading to brightness or contrast variations\u003c/p\u003e\n                     \u003c/details\u003e\n                  \u003c/li\u003e\n                  \u003cli\u003e\n                     \u003cdetails\u003e\n                        \u003csummary\u003econdition/ageing/warping\u003c/summary\u003e\n                        \u003cp\u003eArbitrary warping (e.g. 
due to moisture)\u003c/p\u003e\n                     \u003c/details\u003e\n                  \u003c/li\u003e\n                  \u003cli\u003e\n                     \u003cdetails\u003e\n                        \u003csummary\u003econdition/production-related/document-characteristics/low-contrast\u003c/summary\u003e\n                        \u003cp\u003eThe contrast between the paper and the page content is very low\u003c/p\u003e\n                     \u003c/details\u003e\n                  \u003c/li\u003e\n                  \u003cli\u003e\n                     \u003cdetails\u003e\n                        \u003csummary\u003econdition/production-related/document-faults/ink-from-facing\u003c/summary\u003e\n                        \u003cp\u003eInk from facing page was transferred to this page\u003c/p\u003e\n                     \u003c/details\u003e\n                  \u003c/li\u003e\n                  \u003cli\u003e\n                     \u003cdetails\u003e\n                        \u003csummary\u003econdition/wear/additions/informative/annotations\u003c/summary\u003e\n                        \u003cp\u003eAnnotations regarding the content\u003c/p\u003e\n                     \u003c/details\u003e\n                  \u003c/li\u003e\n                  \u003cli\u003e\n                     \u003cdetails\u003e\n                        \u003csummary\u003econdition/wear/additions/informative/stamps\u003c/summary\u003e\n                        \u003cp\u003eThe medium was stamped\u003c/p\u003e\n                     \u003c/details\u003e\n                  \u003c/li\u003e\n                  \u003cli\u003e\n                     \u003cdetails\u003e\n                        \u003csummary\u003econdition/wear/medium-damage/stains\u003c/summary\u003e\n                        \u003cp\u003eNoticeable stains on medium\u003c/p\u003e\n                     \u003c/details\u003e\n                  \u003c/li\u003e\n                  \u003cli\u003e\n                     \u003cdetails\u003e\n        
                \u003csummary\u003econtent-encoding/structured\u003c/summary\u003e\n                        \u003cp\u003eE.g. XML\u003c/p\u003e\n                     \u003c/details\u003e\n                  \u003c/li\u003e\n                  \u003cli\u003e\n                     \u003cdetails\u003e\n                        \u003csummary\u003econtent-type/corpus\u003c/summary\u003e\n                        \u003cp\u003e\nCorpus: a collection of written texts, especially the entire works of a particular author or a body of writing on a particular subject.\n\nExamples:\nA text corpus,\nAn image database\u003c/p\u003e\n                     \u003c/details\u003e\n                  \u003c/li\u003e\n                  \u003cli\u003e\n                     \u003cdetails\u003e\n                        \u003csummary\u003econtentOfInterest/visual/composite/music\u003c/summary\u003e\n                        \u003cp\u003e\n                        Description coming soon.\n                    \u003c/p\u003e\n                     \u003c/details\u003e\n                  \u003c/li\u003e\n                  \u003cli\u003e\n                     \u003cdetails\u003e\n                        \u003csummary\u003econtentOfInterest/visual/graphical\u003c/summary\u003e\n                        \u003cp\u003e\n                        Description coming soon.\n                    \u003c/p\u003e\n                     \u003c/details\u003e\n                  \u003c/li\u003e\n                  \u003cli\u003e\n                     \u003cdetails\u003e\n                        \u003csummary\u003econtentOfInterest/visual/graphical/separator\u003c/summary\u003e\n                        \u003cp\u003e\n                        Description coming soon.\n                    \u003c/p\u003e\n                     \u003c/details\u003e\n                  \u003c/li\u003e\n                  \u003cli\u003e\n                     \u003cdetails\u003e\n                        
\u003csummary\u003econtentOfInterest/visual/text\u003c/summary\u003e\n                        \u003cp\u003e\n                        Description coming soon.\n                    \u003c/p\u003e\n                     \u003c/details\u003e\n                  \u003c/li\u003e\n                  \u003cli\u003e\n                     \u003cdetails\u003e\n                        \u003csummary\u003edata-attributes/document-related/structural/footnotes\u003c/summary\u003e\n                        \u003cp\u003eFootnotes at bottom of page\u003c/p\u003e\n                     \u003c/details\u003e\n                  \u003c/li\u003e\n                  \u003cli\u003e\n                     \u003cdetails\u003e\n                        \u003csummary\u003edata-attributes/document-related/structural/running-titles\u003c/summary\u003e\n                        \u003cp\u003eTitles repeated on each page\u003c/p\u003e\n                     \u003c/details\u003e\n                  \u003c/li\u003e\n                  \u003cli\u003e\n                     \u003cdetails\u003e\n                        \u003csummary\u003edata-attributes/document-related/visual/decorations\u003c/summary\u003e\n                        \u003cp\u003eDecorations of some kind\u003c/p\u003e\n                     \u003c/details\u003e\n                  \u003c/li\u003e\n                  \u003cli\u003e\n                     \u003cdetails\u003e\n                        \u003csummary\u003edata-attributes/document-related/visual/illustrations\u003c/summary\u003e\n                        \u003cp\u003eIllustrations in content\u003c/p\u003e\n                     \u003c/details\u003e\n                  \u003c/li\u003e\n                  \u003cli\u003e\n                     \u003cdetails\u003e\n                        \u003csummary\u003edata-attributes/document-related/visual/illustrations/multi-colour\u003c/summary\u003e\n                        \u003cp\u003eMulti-colour illustrations in content\u003c/p\u003e\n                     
\u003c/details\u003e\n                  \u003c/li\u003e\n                  \u003cli\u003e\n                     \u003cdetails\u003e\n                        \u003csummary\u003edata-attributes/document-related/visual/text/drop-caps\u003c/summary\u003e\n                        \u003cp\u003eDrop capitals (large capitals at beginning of paragraph)\u003c/p\u003e\n                     \u003c/details\u003e\n                  \u003c/li\u003e\n                  \u003cli\u003e\n                     \u003cdetails\u003e\n                        \u003csummary\u003edata-attributes/document-related/visual/text/font/multi-font/font-sizes\u003c/summary\u003e\n                        \u003cp\u003eMore than one font size used\u003c/p\u003e\n                     \u003c/details\u003e\n                  \u003c/li\u003e\n                  \u003cli\u003e\n                     \u003cdetails\u003e\n                        \u003csummary\u003edata-attributes/document-related/visual/text/font/multi-font/typefaces\u003c/summary\u003e\n                        \u003cp\u003eMore than one typeface used\u003c/p\u003e\n                     \u003c/details\u003e\n                  \u003c/li\u003e\n                  \u003cli\u003e\n                     \u003cdetails\u003e\n                        \u003csummary\u003edata-attributes/document-related/visual/text/font/typeface/antiqua\u003c/summary\u003e\n                        \u003cp\u003eAntiqua font (more modern)\u003c/p\u003e\n                     \u003c/details\u003e\n                  \u003c/li\u003e\n                  \u003cli\u003e\n                     \u003cdetails\u003e\n                        \u003csummary\u003edata-attributes/document-related/visual/text/font/typeface/blackletter\u003c/summary\u003e\n                        \u003cp\u003eBlackletter, gothic, Fraktur\u003c/p\u003e\n                     \u003c/details\u003e\n                  \u003c/li\u003e\n                  \u003cli\u003e\n                     \u003cdetails\u003e\n           
             \u003csummary\u003edata-attributes/language/mixed\u003c/summary\u003e\n                        \u003cp\u003eMore than one language used\u003c/p\u003e\n                     \u003c/details\u003e\n                  \u003c/li\u003e\n                  \u003cli\u003e\n                     \u003cdetails\u003e\n                        \u003csummary\u003egranularity/logical/document-related/paragraph\u003c/summary\u003e\n                        \u003cp\u003e\n                        Description coming soon.\n                    \u003c/p\u003e\n                     \u003c/details\u003e\n                  \u003c/li\u003e\n                  \u003cli\u003e\n                     \u003cdetails\u003e\n                        \u003csummary\u003egranularity/physical/document-related/page\u003c/summary\u003e\n                        \u003cp\u003e\n                        Description coming soon.\n                    \u003c/p\u003e\n                     \u003c/details\u003e\n                  \u003c/li\u003e\n                  \u003cli\u003e\n                     \u003cdetails\u003e\n                        \u003csummary\u003egranularity/physical/document-related/region\u003c/summary\u003e\n                        \u003cp\u003eRegion, zone, block\u003c/p\u003e\n                     \u003c/details\u003e\n                  \u003c/li\u003e\n                  \u003cli\u003e\n                     \u003cdetails\u003e\n                        \u003csummary\u003egranularity/physical/document-related/text-line\u003c/summary\u003e\n                        \u003cp\u003e\n                        Description coming soon.\n                    \u003c/p\u003e\n                     \u003c/details\u003e\n                  \u003c/li\u003e\n                  \u003cli\u003e\n                     \u003cdetails\u003e\n                        \u003csummary\u003egranularity/physical/document-related/word\u003c/summary\u003e\n                        \u003cp\u003eWord or partial word, if separated 
by line break, for example\u003c/p\u003e\n                     \u003c/details\u003e\n                  \u003c/li\u003e\n                  \u003cli\u003e\n                     \u003cdetails\u003e\n                        \u003csummary\u003eplatform/platform-independent\u003c/summary\u003e\n                        \u003cp\u003e\n                        Description coming soon.\n                    \u003c/p\u003e\n                     \u003c/details\u003e\n                  \u003c/li\u003e\n               \u003c/ul\u003e\n            \u003c/details\u003e\n         \u003c/div\u003e\n      \u003c/div\u003e\n      \u003cdiv\u003e\n         \u003ch2\u003eAntiqua\u003c/h2\u003e\n         \u003cdiv\u003e\n            \u003cdetails\u003e\n               \u003csummary\u003esimple\u003c/summary\u003e\n               \u003cul\u003e\n                  \u003cli\u003e\n                     \u003cdetails\u003e\n                        \u003csummary\u003eactivityDomain/computing/visual/analysisRecognition/layoutAnalysis\u003c/summary\u003e\n                        \u003cp\u003eIn computer vision, document layout analysis is the process of identifying and categorizing the regions of interest in the scanned image of a text document. 
A reading system requires the segmentation of text zones from non-textual ones and the arrangement in their correct reading order.\n\nExamples:\nPage layout analysis (segmentation into regions, classification into text, graphic, table etc.)\n\nRelated:\n\"OCR\": Often used as a synonym for layout analysis and text recognition, but strictly only the text recognition component.\u003c/p\u003e\n                     \u003c/details\u003e\n                  \u003c/li\u003e\n                  \u003cli\u003e\n                     \u003cdetails\u003e\n                        \u003csummary\u003eactivityDomain/computing/visual/analysisRecognition/ocr\u003c/summary\u003e\n                        \u003cp/\u003e\n                     \u003c/details\u003e\n                  \u003c/li\u003e\n                  \u003cli\u003e\n                     \u003cdetails\u003e\n                        \u003csummary\u003eactivityDomain/computing/visual/analysisRecognition/text\u003c/summary\u003e\n                        \u003cp\u003eTranslation of any kind of depicted symbols to machine readable format\n\nExamples:\nOCR\nMathematical equation recognition\n\nRelated:\nText processing (separate category)\nTable recognition\nMap reading\u003c/p\u003e\n                     \u003c/details\u003e\n                  \u003c/li\u003e\n                  \u003cli\u003e\n                     \u003cdetails\u003e\n                        \u003csummary\u003econdition/production-related/document-faults/ink-from-facing\u003c/summary\u003e\n                        \u003cp\u003eInk from facing page was transferred to this page\u003c/p\u003e\n                     \u003c/details\u003e\n                  \u003c/li\u003e\n                  \u003cli\u003e\n                     \u003cdetails\u003e\n                        \u003csummary\u003econdition/wear/medium-damage/stains\u003c/summary\u003e\n                        \u003cp\u003eNoticeable stains on medium\u003c/p\u003e\n                     \u003c/details\u003e\n  
                \u003c/li\u003e\n                  \u003cli\u003e\n                     \u003cdetails\u003e\n                        \u003csummary\u003econtent-encoding/structured\u003c/summary\u003e\n                        \u003cp\u003eE.g. XML\u003c/p\u003e\n                     \u003c/details\u003e\n                  \u003c/li\u003e\n                  \u003cli\u003e\n                     \u003cdetails\u003e\n                        \u003csummary\u003econtent-type/corpus\u003c/summary\u003e\n                        \u003cp\u003e\nCorpus: a collection of written texts, especially the entire works of a particular author or a body of writing on a particular subject.\n\nExamples:\nA text corpus,\nAn image database\u003c/p\u003e\n                     \u003c/details\u003e\n                  \u003c/li\u003e\n                  \u003cli\u003e\n                     \u003cdetails\u003e\n                        \u003csummary\u003econtentOfInterest/visual/graphical/separator\u003c/summary\u003e\n                        \u003cp\u003e\n                        Description coming soon.\n                    \u003c/p\u003e\n                     \u003c/details\u003e\n                  \u003c/li\u003e\n                  \u003cli\u003e\n                     \u003cdetails\u003e\n                        \u003csummary\u003econtentOfInterest/visual/text\u003c/summary\u003e\n                        \u003cp\u003e\n                        Description coming soon.\n                    \u003c/p\u003e\n                     \u003c/details\u003e\n                  \u003c/li\u003e\n                  \u003cli\u003e\n                     \u003cdetails\u003e\n                        \u003csummary\u003edata-attributes/document-related/visual/text/drop-caps\u003c/summary\u003e\n                        \u003cp\u003eDrop capitals (large capitals at beginning of paragraph)\u003c/p\u003e\n                     \u003c/details\u003e\n                  \u003c/li\u003e\n                  \u003cli\u003e\n       
              \u003cdetails\u003e\n                        \u003csummary\u003edata-attributes/document-related/visual/text/font/multi-font/font-sizes\u003c/summary\u003e\n                        \u003cp\u003eMore than one font size used\u003c/p\u003e\n                     \u003c/details\u003e\n                  \u003c/li\u003e\n                  \u003cli\u003e\n                     \u003cdetails\u003e\n                        \u003csummary\u003edata-attributes/document-related/visual/text/font/typeface/antiqua\u003c/summary\u003e\n                        \u003cp\u003eAntiqua font (more modern)\u003c/p\u003e\n                     \u003c/details\u003e\n                  \u003c/li\u003e\n                  \u003cli\u003e\n                     \u003cdetails\u003e\n                        \u003csummary\u003edata-attributes/document-related/visual/text/font/typeface/blackletter\u003c/summary\u003e\n                        \u003cp\u003eBlackletter, gothic, Fraktur\u003c/p\u003e\n                     \u003c/details\u003e\n                  \u003c/li\u003e\n                  \u003cli\u003e\n                     \u003cdetails\u003e\n                        \u003csummary\u003egranularity/logical/document-related/paragraph\u003c/summary\u003e\n                        \u003cp\u003e\n                        Description coming soon.\n                    \u003c/p\u003e\n                     \u003c/details\u003e\n                  \u003c/li\u003e\n                  \u003cli\u003e\n                     \u003cdetails\u003e\n                        \u003csummary\u003egranularity/physical/document-related/page\u003c/summary\u003e\n                        \u003cp\u003e\n                        Description coming soon.\n                    \u003c/p\u003e\n                     \u003c/details\u003e\n                  \u003c/li\u003e\n                  \u003cli\u003e\n                     \u003cdetails\u003e\n                        
\u003csummary\u003egranularity/physical/document-related/region\u003c/summary\u003e\n                        \u003cp\u003eRegion, zone, block\u003c/p\u003e\n                     \u003c/details\u003e\n                  \u003c/li\u003e\n                  \u003cli\u003e\n                     \u003cdetails\u003e\n                        \u003csummary\u003egranularity/physical/document-related/text-line\u003c/summary\u003e\n                        \u003cp\u003e\n                        Description coming soon.\n                    \u003c/p\u003e\n                     \u003c/details\u003e\n                  \u003c/li\u003e\n                  \u003cli\u003e\n                     \u003cdetails\u003e\n                        \u003csummary\u003egranularity/physical/document-related/word\u003c/summary\u003e\n                        \u003cp\u003eWord or partial word, if separated by line break, for example\u003c/p\u003e\n                     \u003c/details\u003e\n                  \u003c/li\u003e\n                  \u003cli\u003e\n                     \u003cdetails\u003e\n                        \u003csummary\u003eplatform/platform-independent\u003c/summary\u003e\n                        \u003cp\u003e\n                        Description coming soon.\n                    \u003c/p\u003e\n                     \u003c/details\u003e\n                  \u003c/li\u003e\n               \u003c/ul\u003e\n            \u003c/details\u003e\n         \u003c/div\u003e\n         \u003cdiv\u003e\n            \u003cdetails\u003e\n               \u003csummary\u003ecomplex\u003c/summary\u003e\n               \u003cul\u003e\n                  \u003cli\u003e\n                     \u003cdetails\u003e\n                        \u003csummary\u003eactivityDomain/computing/visual/analysisRecognition/layoutAnalysis\u003c/summary\u003e\n                        \u003cp\u003eIn computer vision, document layout analysis is the process of identifying and categorizing the regions of interest in the scanned 
image of a text document. A reading system requires the segmentation of text zones from non-textual ones and the arrangement in their correct reading order.\n\nExamples:\nPage layout analysis (segmentation into regions, classification into text, graphic, table etc.)\n\nRelated:\n\"OCR\": Often used as a synonym for layout analysis and text recognition, but strictly only the text recognition component.\u003c/p\u003e\n                     \u003c/details\u003e\n                  \u003c/li\u003e\n                  \u003cli\u003e\n                     \u003cdetails\u003e\n                        \u003csummary\u003eactivityDomain/computing/visual/analysisRecognition/ocr\u003c/summary\u003e\n                        \u003cp/\u003e\n                     \u003c/details\u003e\n                  \u003c/li\u003e\n                  \u003cli\u003e\n                     \u003cdetails\u003e\n                        \u003csummary\u003eactivityDomain/computing/visual/analysisRecognition/text\u003c/summary\u003e\n                        \u003cp\u003eTranslation of any kind of depicted symbols to machine readable format\n\nExamples:\nOCR\nMathematical equation recognition\n\nRelated:\nText processing (separate category)\nTable recognition\nMap reading\u003c/p\u003e\n                     \u003c/details\u003e\n                  \u003c/li\u003e\n                  \u003cli\u003e\n                     \u003cdetails\u003e\n                        \u003csummary\u003econdition/production-related/document-faults/ink-from-facing\u003c/summary\u003e\n                        \u003cp\u003eInk from facing page was transferred to this page\u003c/p\u003e\n                     \u003c/details\u003e\n                  \u003c/li\u003e\n                  \u003cli\u003e\n                     \u003cdetails\u003e\n                        \u003csummary\u003econdition/wear/additions/informative/annotations\u003c/summary\u003e\n                        \u003cp\u003eAnnotations regarding the 
content\u003c/p\u003e\n                     \u003c/details\u003e\n                  \u003c/li\u003e\n                  \u003cli\u003e\n                     \u003cdetails\u003e\n                        \u003csummary\u003econdition/wear/medium-damage/stains\u003c/summary\u003e\n                        \u003cp\u003eNoticeable stains on medium\u003c/p\u003e\n                     \u003c/details\u003e\n                  \u003c/li\u003e\n                  \u003cli\u003e\n                     \u003cdetails\u003e\n                        \u003csummary\u003econtent-encoding/structured\u003c/summary\u003e\n                        \u003cp\u003eE.g. XML\u003c/p\u003e\n                     \u003c/details\u003e\n                  \u003c/li\u003e\n                  \u003cli\u003e\n                     \u003cdetails\u003e\n                        \u003csummary\u003econtent-type/corpus\u003c/summary\u003e\n                        \u003cp\u003e\nCorpus: a collection of written texts, especially the entire works of a particular author or a body of writing on a particular subject.\n\nExamples:\nA text corpus,\nAn image database\u003c/p\u003e\n                     \u003c/details\u003e\n                  \u003c/li\u003e\n                  \u003cli\u003e\n                     \u003cdetails\u003e\n                        \u003csummary\u003econtentOfInterest/visual/text\u003c/summary\u003e\n                        \u003cp\u003e\n                        Description coming soon.\n                    \u003c/p\u003e\n                     \u003c/details\u003e\n                  \u003c/li\u003e\n                  \u003cli\u003e\n                     \u003cdetails\u003e\n                        \u003csummary\u003edata-attributes/document-related/structural/footnote-continued\u003c/summary\u003e\n                        \u003cp/\u003e\n                     \u003c/details\u003e\n                  \u003c/li\u003e\n                  \u003cli\u003e\n                     \u003cdetails\u003e\n            
            \u003csummary\u003edata-attributes/document-related/structural/footnotes\u003c/summary\u003e\n                        \u003cp\u003eFootnotes at bottom of page\u003c/p\u003e\n                     \u003c/details\u003e\n                  \u003c/li\u003e\n                  \u003cli\u003e\n                     \u003cdetails\u003e\n                        \u003csummary\u003edata-attributes/document-related/structural/running-titles\u003c/summary\u003e\n                        \u003cp\u003eTitles repeated on each page\u003c/p\u003e\n                     \u003c/details\u003e\n                  \u003c/li\u003e\n                  \u003cli\u003e\n                     \u003cdetails\u003e\n                        \u003csummary\u003edata-attributes/document-related/visual/text/drop-caps\u003c/summary\u003e\n                        \u003cp\u003eDrop capitals (large capitals at the beginning of a paragraph)\u003c/p\u003e\n                     \u003c/details\u003e\n                  \u003c/li\u003e\n                  \u003cli\u003e\n                     \u003cdetails\u003e\n                        \u003csummary\u003edata-attributes/document-related/visual/text/font/multi-font/font-sizes\u003c/summary\u003e\n                        \u003cp\u003eMore than one font size used\u003c/p\u003e\n                     \u003c/details\u003e\n                  \u003c/li\u003e\n                  \u003cli\u003e\n                     \u003cdetails\u003e\n                        \u003csummary\u003edata-attributes/document-related/visual/text/font/multi-font/typefaces\u003c/summary\u003e\n                        \u003cp\u003eMore than one typeface used\u003c/p\u003e\n                     \u003c/details\u003e\n                  \u003c/li\u003e\n                  \u003cli\u003e\n                     \u003cdetails\u003e\n                        \u003csummary\u003edata-attributes/document-related/visual/text/font/typeface/antiqua\u003c/summary\u003e\n                        \u003cp\u003eAntiqua font 
(more modern)\u003c/p\u003e\n                     \u003c/details\u003e\n                  \u003c/li\u003e\n                  \u003cli\u003e\n                     \u003cdetails\u003e\n                        \u003csummary\u003edata-attributes/document-related/visual/text/font/typeface/blackletter\u003c/summary\u003e\n                        \u003cp\u003eBlackletter, gothic, Fraktur\u003c/p\u003e\n                     \u003c/details\u003e\n                  \u003c/li\u003e\n                  \u003cli\u003e\n                     \u003cdetails\u003e\n                        \u003csummary\u003edata-attributes/language/mixed\u003c/summary\u003e\n                        \u003cp\u003eMore than one language used\u003c/p\u003e\n                     \u003c/details\u003e\n                  \u003c/li\u003e\n                  \u003cli\u003e\n                     \u003cdetails\u003e\n                        \u003csummary\u003egranularity/logical/document-related/paragraph\u003c/summary\u003e\n                        \u003cp\u003e\n                        Description coming soon.\n                    \u003c/p\u003e\n                     \u003c/details\u003e\n                  \u003c/li\u003e\n                  \u003cli\u003e\n                     \u003cdetails\u003e\n                        \u003csummary\u003egranularity/physical/document-related/page\u003c/summary\u003e\n                        \u003cp\u003e\n                        Description coming soon.\n                    \u003c/p\u003e\n                     \u003c/details\u003e\n                  \u003c/li\u003e\n                  \u003cli\u003e\n                     \u003cdetails\u003e\n                        \u003csummary\u003egranularity/physical/document-related/region\u003c/summary\u003e\n                        \u003cp\u003eRegion, zone, block\u003c/p\u003e\n                     \u003c/details\u003e\n                  \u003c/li\u003e\n                  \u003cli\u003e\n                     \u003cdetails\u003e\n       
                 \u003csummary\u003egranularity/physical/document-related/text-line\u003c/summary\u003e\n                        \u003cp\u003e\n                        Description coming soon.\n                    \u003c/p\u003e\n                     \u003c/details\u003e\n                  \u003c/li\u003e\n                  \u003cli\u003e\n                     \u003cdetails\u003e\n                        \u003csummary\u003egranularity/physical/document-related/word\u003c/summary\u003e\n                        \u003cp\u003eWord or partial word, if separated by line break, for example\u003c/p\u003e\n                     \u003c/details\u003e\n                  \u003c/li\u003e\n                  \u003cli\u003e\n                     \u003cdetails\u003e\n                        \u003csummary\u003eplatform/platform-independent\u003c/summary\u003e\n                        \u003cp\u003e\n                        Description coming soon.\n                    \u003c/p\u003e\n                     \u003c/details\u003e\n                  \u003c/li\u003e\n               \u003c/ul\u003e\n            \u003c/details\u003e\n         \u003c/div\u003e\n      \u003c/div\u003e\n   \u003c/div\u003e\n\u003c/div\u003e\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Ftboenig%2Fgt_corpus_benchmark","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Ftboenig%2Fgt_corpus_benchmark","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Ftboenig%2Fgt_corpus_benchmark/lists"}