https://github.com/docling-project/docling-eval
Evaluation framework for document processing models and services.
- Host: GitHub
- URL: https://github.com/docling-project/docling-eval
- Owner: docling-project
- License: mit
- Created: 2024-12-13T13:23:19.000Z (over 1 year ago)
- Default Branch: main
- Last Pushed: 2025-06-11T15:48:27.000Z (about 1 year ago)
- Last Synced: 2025-06-11T16:11:35.331Z (about 1 year ago)
- Language: Python
- Homepage:
- Size: 21.6 MB
- Stars: 19
- Watchers: 1
- Forks: 6
- Open Issues: 13
Metadata Files:
- Readme: README.md
- Changelog: CHANGELOG.md
- Contributing: CONTRIBUTING.md
- License: LICENSE
- Code of conduct: CODE_OF_CONDUCT.md
README
# Docling-eval
Evaluate [Docling](https://github.com/docling-project/docling) on various datasets.
## Features
Evaluate Docling on various datasets using the `docling-eval` CLI:
```shell
terminal %> docling-eval --help
Usage: docling_eval [OPTIONS] COMMAND [ARGS]...

Docling Evaluation CLI for benchmarking document processing tasks.

╭─ Options ──────────────────────────────────────────────────────────────────╮
│ --help       Show this message and exit.                                   │
╰────────────────────────────────────────────────────────────────────────────╯
╭─ Commands ─────────────────────────────────────────────────────────────────╮
│ create       Create both ground truth and evaluation datasets in one step. │
│ create-eval  Create evaluation dataset from existing ground truth.         │
│ create-gt    Create ground truth dataset only.                             │
│ evaluate     Evaluate predictions against ground truth.                    │
│ visualize    Visualize evaluation results.                                 │
╰────────────────────────────────────────────────────────────────────────────╯
```
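The options of each subcommand are not listed in the help output above. As a minimal sketch, the Python snippet below asks each subcommand for its own help text. It assumes that every subcommand accepts `--help` (usual for CLIs of this style, but not shown above) and that `docling-eval` is on your `PATH`. The helper names `help_command` and `show_help` are hypothetical and not part of docling-eval.

```python
import shutil
import subprocess
from typing import List, Optional

# Subcommand names copied from the `docling-eval --help` output.
SUBCOMMANDS = ["create", "create-eval", "create-gt", "evaluate", "visualize"]


def help_command(subcommand: str) -> List[str]:
    """Build the argument list for a subcommand's help page.

    Assumption: each subcommand accepts `--help`.
    """
    return ["docling-eval", subcommand, "--help"]


def show_help(subcommand: str) -> Optional[str]:
    """Return the help text, or None if docling-eval is not installed."""
    if shutil.which("docling-eval") is None:
        return None
    result = subprocess.run(
        help_command(subcommand),
        capture_output=True,
        text=True,
        check=False,  # do not raise if the CLI exits with a non-zero status
    )
    return result.stdout


if __name__ == "__main__":
    for name in SUBCOMMANDS:
        text = show_help(name)
        if text is None:
            print("docling-eval not found on PATH; install it first.")
            break
        print(text)
```

Running `docling-eval <command> --help` directly in a terminal gives the same information. The script only collects it for all five commands at once.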
## Benchmarks
- General
  - [DP-Bench benchmarks](docs/DP-Bench_benchmarks.md): Text, layout, reading order and table structure evaluation on the DP-Bench dataset.
  - [OmniDocBench benchmarks](docs/OmniDocBench_benchmarks.md): Text, layout, reading order and table structure evaluation on the OmniDocBench dataset.
- Layout
  - [DocLayNetV1 benchmarks](docs/DocLayNetv1_benchmarks.md): Text and layout evaluation on the DocLayNet v1.2 dataset.
- Table-Structure
  - [FinTabNet benchmarks](docs/FinTabNet_benchmarks.md): Table structure evaluation on the FinTabNet dataset.
  - [PubTabNet benchmarks](docs/PubTabNet_benchmarks.md): Table structure evaluation on the PubTabNet dataset.
  - [Pub1M benchmarks](docs/P1M_benchmarks.md): Table structure evaluation on the Pub1M dataset.
Benchmarks planned next:
- [OmniOCR](getomni-ai/ocr-benchmark)
- Hyperscalers
- [CoMix](https://github.com/emanuelevivoli/CoMix/tree/main/docs/datasets)
- [DocVQA](https://huggingface.co/datasets/lmms-lab/DocVQA)
- [rd-tablebench](https://huggingface.co/datasets/reducto/rd-tablebench)
- [BigDocs-Bench](https://huggingface.co/datasets/ServiceNow/BigDocs-Bench)
## Contributing
Please read [Contributing to Docling](https://github.com/docling-project/docling/blob/main/CONTRIBUTING.md) for details.
## License
The Docling codebase is under the MIT license.
For individual model usage, please refer to the model licenses found in the original packages.
## IBM ❤️ Open Source AI
Docling-eval has been brought to you by IBM.