https://github.com/DIVA-DIA/DIVA_Layout_Analysis_Evaluator
Layout Analysis Evaluator for the ICDAR 2017 competition on Layout Analysis for Challenging Medieval Manuscripts
- Host: GitHub
- URL: https://github.com/DIVA-DIA/DIVA_Layout_Analysis_Evaluator
- Owner: DIVA-DIA
- License: lgpl-3.0
- Created: 2017-06-28T11:53:56.000Z (about 8 years ago)
- Default Branch: master
- Last Pushed: 2019-05-17T09:11:42.000Z (about 6 years ago)
- Last Synced: 2024-11-03T09:33:35.685Z (8 months ago)
- Language: Java
- Size: 12.4 MB
- Stars: 22
- Watchers: 15
- Forks: 8
- Open Issues: 0
Metadata Files:
- Readme: README.md
- License: LICENSE.md
# LayoutAnalysisEvaluator
Layout Analysis Evaluator for:
* [ICDAR 2019 Historical Document Reading Challenge on Large Structured Chinese Family Records](https://www.ltu.se/research/subjects/Maskininlarning/ICDAR-2019-HDRC-Chinese?l=en "ICDAR 2019 Historical Document Reading Challenge on Large Structured Chinese Family Records Homepage")
* [ICDAR 2017 competition on Layout Analysis for Challenging Medieval Manuscripts](https://diuf.unifr.ch/main/hisdoc/icdar2017-hisdoc-layout-comp "ICDAR 2017 competition on Layout Analysis for Challenging Medieval Manuscripts Homepage")

Minimal usage: `java -jar LayoutAnalysisEvaluator.jar -gt gt_image.png -p prediction_image.png`

Parameters list:
```
-gt,--groundTruth Ground Truth image
-p,--prediction Prediction image
-o,--original (Optional) Original image, to be overlapped with the results visualization
-j,--json (Optional) Json Path, for the DIVAServices JSON output
-out,--outputPath (Optional) Output path (relative to prediction input path)
-dv,--disableVisualization (Optional)(Flag) Visualizing the evaluation as image is NOT desired
```
**Note:** this also outputs a human-friendly visualization of the results next to the
`prediction_image.png`, which can be overlapped with the original image if one is provided
with the parameter `-o` (`--original`) to enable deeper analysis.

## Visualization of the results
Along with the numerical results (such as the Intersection over Union (IU), precision, recall, and F1),
the tool provides a human-friendly visualization of the results.
Additionally, when desired one can provide the original image and it will be overlapped with
the visualization of the results.
This is particularly helpful to understand why certain artifacts are created.
The three images below represent the three steps: the original image, the visualization of the result
and the two overlapped.
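
For orientation, the metrics named at the start of this section have standard definitions in terms of true-positive, false-positive and false-negative pixel counts. The following is a minimal sketch of those definitions, not the evaluator's own code: the class `MetricsSketch`, its method names and the zero-denominator convention are assumptions made for illustration, and the tool may count or aggregate multi-label pixels differently. A pixel is treated as belonging to a class when that class's bit is set in its blue-channel label (see "Ground Truth Format").

```java
// Illustrative sketch only; this is not the evaluator's source code.
final class MetricsSketch {

    /** Counts {TP, FP, FN} pixels for one class bit over two label arrays. */
    static long[] counts(int[] gt, int[] pred, int classBit) {
        if (gt.length != pred.length) {
            throw new IllegalArgumentException("size mismatch");
        }
        long tp = 0, fp = 0, fn = 0;
        for (int i = 0; i < gt.length; i++) {
            boolean inGt = (gt[i] & classBit) != 0;
            boolean inPred = (pred[i] & classBit) != 0;
            if (inGt && inPred) tp++;
            else if (inPred) fp++;
            else if (inGt) fn++;
        }
        return new long[] {tp, fp, fn};
    }

    // Assumption: a metric with a zero denominator is reported as 0.
    static double ratio(long num, long den) {
        return den == 0 ? 0.0 : (double) num / den;
    }

    static double precision(long tp, long fp) { return ratio(tp, tp + fp); }

    static double recall(long tp, long fn) { return ratio(tp, tp + fn); }

    static double f1(long tp, long fp, long fn) { return ratio(2 * tp, 2 * tp + fp + fn); }

    static double iu(long tp, long fp, long fn) { return ratio(tp, tp + fp + fn); }

    public static void main(String[] args) {
        // Hypothetical blue-channel labels for five pixels.
        int[] gt = {0x8, 0x8, 0xA, 0x1, 0x2};
        int[] pred = {0x8, 0x2, 0x8, 0x8, 0x2};
        long[] c = counts(gt, pred, 0x8); // evaluate the "main text body" bit
        System.out.printf("P=%.3f R=%.3f F1=%.3f IU=%.3f%n",
                precision(c[0], c[1]), recall(c[0], c[2]),
                f1(c[0], c[1], c[2]), iu(c[0], c[1], c[2]));
    }
}
```

Evaluating one class bit at a time keeps the sketch compatible with multi-label pixels, since a pixel labeled `0x0A` counts for both main text body and comment.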

### Interpreting the colors
Pixel colors are assigned as follows:
- GREEN: Foreground predicted correctly
- YELLOW: Foreground predicted - but the wrong class (e.g. Text instead of Comment)
- BLACK: Background predicted correctly
- RED: Background mis-predicted as Foreground
- BLUE: Foreground mis-predicted as Background

### Example of problem hunting
Below is an example showing why it is useful to overlap the prediction quality visualization with the original image.
Focus on the red pixels pointed at by the white arrow: they are background pixels misclassified as foreground.
In the normal visualization (left) it is not possible to tell why an algorithm would decide that there is
foreground at that spot, as it is clearly far from regular text.
However, when overlapped with the original image (right), one can clearly see that this area contains an
ink stain, which could explain why the classification algorithm is deceived into thinking these pixels are
foreground. This kind of interpretation is not possible without the information provided by the
original image, as in (right).
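
The color scheme under "Interpreting the colors" can be sketched in a few lines. This is one plausible reading of that list, not the evaluator's actual code: the class `PixelColorSketch` and its methods are hypothetical. The bit values are the DIVA-HisDB ones listed under "Ground Truth Format". The rule that GREEN requires the foreground bits to match exactly is an assumption, and the tool may treat multi-label pixels differently.

```java
import java.util.ArrayList;
import java.util.List;

// Illustrative sketch only; this is not the evaluator's source code.
final class PixelColorSketch {
    // Blue-channel bits as listed under "Ground Truth Format" (DIVA-HisDB).
    static final int BACKGROUND = 0x1;
    static final int COMMENT = 0x2;
    static final int DECORATION = 0x4;
    static final int MAIN_TEXT = 0x8;
    static final int FOREGROUND_MASK = COMMENT | DECORATION | MAIN_TEXT;

    /** Extracts the blue channel from a packed ARGB pixel (e.g. BufferedImage.getRGB). */
    static int blueOf(int argb) {
        return argb & 0xFF;
    }

    /** Lists the class names encoded in one blue-channel value. */
    static List<String> classNames(int blue) {
        List<String> names = new ArrayList<>();
        if ((blue & MAIN_TEXT) != 0) names.add("main text body");
        if ((blue & DECORATION) != 0) names.add("decoration");
        if ((blue & COMMENT) != 0) names.add("comment");
        if ((blue & BACKGROUND) != 0) names.add("background");
        return names;
    }

    /** Maps a ground-truth/prediction label pair to a visualization color name. */
    static String color(int gtBlue, int predBlue) {
        int gtFg = gtBlue & FOREGROUND_MASK;
        int predFg = predBlue & FOREGROUND_MASK;
        if (gtFg == 0 && predFg == 0) return "BLACK"; // background predicted correctly
        if (gtFg == 0) return "RED";                  // background predicted as foreground
        if (predFg == 0) return "BLUE";               // foreground predicted as background
        // Assumption: "correct" means the foreground bits match exactly.
        return gtFg == predFg ? "GREEN" : "YELLOW";
    }

    public static void main(String[] args) {
        System.out.println(classNames(0xA));  // multi-label pixel
        System.out.println(color(0x8, 0x2));  // text in GT, comment predicted
        System.out.println(color(0x1, 0x8));  // background in GT, text predicted
    }
}
```

In the ink-stain example above, the stained pixels would fall into the RED branch: the ground truth marks them as background while the prediction sets a foreground bit.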
## Ground Truth Format
The ground truth information needs to be a pixel-label image where the class information is encoded in the blue
channel.
Red and green channels should be set to 0, with the exception of the boundary pixels used in the two competitions mentioned above.

For example, in the DIVA-HisDB dataset there are four different annotated classes which might overlap:
main text body, decorations, comments and background.

In the pixel-label images the classes are encoded by RGB values as follows:
- Red = 0 everywhere (except boundaries)
- Green = 0 everywhere
- Blue = 0b00...1000 = 0x000008: main text body
- Blue = 0b00...0100 = 0x000004: decoration
- Blue = 0b00...0010 = 0x000002: comment
- Blue = 0b00...0001 = 0x000001: background (out of page)

Note that the GT might contain multi-class labeled pixels, for all classes except for the background.
For example:

- Blue = 0b...1000 | 0b...0010 = 0b...1010 = 0x00000A: main text body + comment
- Blue = 0b...1000 | 0b...0100 = 0b...1100 = 0x00000C: main text body + decoration
- Blue = 0b...0010 | 0b...0100 = 0b...0110 = 0x000006: comment + decoration

## Citing us
If you use our software, please cite our paper as:
``` latex
@inproceedings{alberti2017evaluation,
address = {Kyoto, Japan},
archivePrefix = {arXiv},
arxivId = {1712.01656},
author = {Alberti, Michele and Bouillon, Manuel and Ingold, Rolf and Liwicki, Marcus},
booktitle = {2017 14th IAPR International Conference on Document Analysis and Recognition (ICDAR)},
doi = {10.1109/ICDAR.2017.311},
eprint = {1712.01656},
isbn = {978-1-5386-3586-5},
month = {nov},
pages = {43--47},
title = {{Open Evaluation Tool for Layout Analysis of Document Images}},
year = {2017}
}
```