{"id":37072661,"url":"https://github.com/docling-project/docling-cvat-tools","last_synced_at":"2026-01-14T08:32:02.680Z","repository":{"id":332085047,"uuid":"1128302183","full_name":"docling-project/docling-cvat-tools","owner":"docling-project","description":"Collection of CVAT parsing and campaign utilities for Docling","archived":false,"fork":false,"pushed_at":"2026-01-12T16:18:03.000Z","size":54694,"stargazers_count":1,"open_issues_count":1,"forks_count":0,"subscribers_count":0,"default_branch":"main","last_synced_at":"2026-01-12T18:56:06.826Z","etag":null,"topics":[],"latest_commit_sha":null,"homepage":null,"language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"mit","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/docling-project.png","metadata":{"files":{"readme":"README.md","changelog":"CHANGELOG.md","contributing":"CONTRIBUTING.md","funding":null,"license":"LICENSE","code_of_conduct":"CODE_OF_CONDUCT.md","threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":".github/SECURITY.md","support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null,"zenodo":null,"notice":null,"maintainers":"MAINTAINERS.md","copyright":null,"agents":null,"dco":null,"cla":null}},"created_at":"2026-01-05T12:42:47.000Z","updated_at":"2026-01-10T05:20:43.000Z","dependencies_parsed_at":null,"dependency_job_id":null,"html_url":"https://github.com/docling-project/docling-cvat-tools","commit_stats":null,"previous_names":["docling-project/docling-cvat-tools"],"tags_count":1,"template":false,"template_full_name":null,"purl":"pkg:github/docling-project/docling-cvat-tools","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/docling-project%2Fdocling-cvat-tools","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/docling-project%2Fdocling-cvat-tools/tags","releases_url":"https://repos.ec
osyste.ms/api/v1/hosts/GitHub/repositories/docling-project%2Fdocling-cvat-tools/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/docling-project%2Fdocling-cvat-tools/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/docling-project","download_url":"https://codeload.github.com/docling-project/docling-cvat-tools/tar.gz/refs/heads/main","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/docling-project%2Fdocling-cvat-tools/sbom","scorecard":null,"host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":286080680,"owners_count":28414194,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2026-01-14T08:31:27.429Z","status":"ssl_error","status_checked_at":"2026-01-14T08:31:19.098Z","response_time":107,"last_error":"SSL_read: unexpected eof while reading","robots_txt_status":"success","robots_txt_updated_at":"2025-07-24T06:49:26.215Z","robots_txt_url":"https://github.com/robots.txt","online":false,"can_crawl_api":true,"host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":[],"created_at":"2026-01-14T08:32:01.998Z","updated_at":"2026-01-14T08:32:02.668Z","avatar_url":"https://github.com/docling-project.png","language":"Python","funding_links":[],"categories":[],"sub_categories":[],"readme":"# docling-cvat-tools\n\nCVAT annotation tools for Docling document processing and evaluation.\n\nThis package provides comprehensive tools for working with CVAT (Computer Vision Annotation Tool) annotations in the context of Docling document processing and evaluation workflows.\n\n## Features\n\n- **CVAT XML Parsing**: Parse and validate CVAT XML annotation files\n- 
**Document Conversion**: Convert CVAT annotations to `DoclingDocument` format\n- **Validation**: Validate CVAT annotations for correctness and completeness\n- **Visualization**: Generate HTML visualizations of annotated documents\n- **CLI Tools**: Command-line utilities for common CVAT workflows\n\n## Installation\n\n```bash\npip install docling-cvat-tools\n```\n\nOr install as an optional dependency of `docling-eval`:\n\n```bash\npip install \"docling-eval[campaign-tools]\"\n```\n\n\n## Requirements\n\n- Python \u003e=3.10,\u003c4.0\n- docling-core (document types)\n- docling (for document processing)\n\n## Usage\n\n### CLI Tools\n\n#### Validate CVAT annotations\n\n```bash\ndocling-cvat-validator path/to/annotations.xml\n```\n\n#### Convert CVAT to DoclingDocument\n\n```bash\ndocling-cvat-to-docling --input_path path/to/cvat_folder --output-dir output/\n```\n\n### Python API\n\n```python\nfrom pathlib import Path\n\nfrom docling_cvat_tools.cvat_tools.parser import parse_cvat_file\nfrom docling_cvat_tools.cvat_tools.cvat_to_docling import convert_cvat_to_docling\nfrom docling_cvat_tools.cvat_tools.validator import validate_cvat_sample\n\n# Parse CVAT XML file\nparsed = parse_cvat_file(Path(\"annotations.xml\"))\n\n# Validate annotations\nvalidation_result = validate_cvat_sample(\n    xml_path=Path(\"annotations.xml\"),\n    image_filename=\"page_000001.png\"\n)\n\n# Convert CVAT folder to DoclingDocuments\nresults = convert_cvat_to_docling(\n    xml_path=Path(\"annotations.xml\"),\n    input_path=Path(\"document.pdf\"),\n    image_identifier=\"page_000001.png\",\n    output_dir=Path(\"output\")\n)\n```\n\n### Integration with docling-eval\n\nThis package is designed to work seamlessly with `docling-eval`. 
When installed as an optional dependency, it enables CVAT-specific features in the evaluation framework:\n\n- CVAT dataset builders (`CvatDatasetBuilder`, `CvatPreannotationBuilder`)\n- CVAT evaluation pipelines\n\n## Package Structure\n\n- `docling_cvat_tools.cvat_tools`: Core CVAT parsing, conversion, and validation\n- `docling_cvat_tools.datamodels`: CVAT-specific data models\n- `docling_cvat_tools.visualisation`: HTML visualization utilities\n- `docling_cvat_tools.cli`: Command-line interface tools\n- `docling_cvat_tools.utils`: Utility functions\n\n## Development\n\n```bash\n# Install in development mode\nuv sync\n\n# Run tests\nuv run pytest\n```\n\n## License\n\nMIT\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fdocling-project%2Fdocling-cvat-tools","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fdocling-project%2Fdocling-cvat-tools","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fdocling-project%2Fdocling-cvat-tools/lists"}