{"id":46886436,"url":"https://github.com/bramalkema/openxml-audit","last_synced_at":"2026-05-03T22:03:01.191Z","repository":{"id":332660004,"uuid":"1134547411","full_name":"BramAlkema/openxml-audit","owner":"BramAlkema","description":"Validate Office files in pure Python with Open XML SDK parity, pytest fixtures, and CI hooks.","archived":false,"fork":false,"pushed_at":"2026-04-28T00:39:15.000Z","size":42373,"stargazers_count":1,"open_issues_count":1,"forks_count":0,"subscribers_count":0,"default_branch":"main","last_synced_at":"2026-04-28T02:31:01.853Z","etag":null,"topics":["docx","odf","openxml","pptx","pytest","python","validation","xlsx"],"latest_commit_sha":null,"homepage":"https://bramalkema.github.io/openxml-audit/","language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"mit","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/BramAlkema.png","metadata":{"files":{"readme":"README.md","changelog":"CHANGELOG.md","contributing":"CONTRIBUTING.md","funding":".github/FUNDING.yml","license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null,"zenodo":null,"notice":null,"maintainers":null,"copyright":null,"agents":null,"dco":null,"cla":null},"funding":{"github":"BramAlkema"}},"created_at":"2026-01-14T21:40:37.000Z","updated_at":"2026-04-28T00:39:19.000Z","dependencies_parsed_at":null,"dependency_job_id":null,"html_url":"https://github.com/BramAlkema/openxml-audit","commit_stats":null,"previous_names":["bramalkema/openxml-audit"],"tags_count":25,"template":false,"template_full_name":null,"purl":"pkg:github/BramAlkema/openxml-audit","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/BramAlkema%2Fopenxml-audit","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositorie
s/BramAlkema%2Fopenxml-audit/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/BramAlkema%2Fopenxml-audit/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/BramAlkema%2Fopenxml-audit/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/BramAlkema","download_url":"https://codeload.github.com/BramAlkema/openxml-audit/tar.gz/refs/heads/main","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/BramAlkema%2Fopenxml-audit/sbom","scorecard":null,"host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":286080680,"owners_count":32586189,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2026-05-03T06:36:36.687Z","status":"ssl_error","status_checked_at":"2026-05-03T06:36:09.306Z","response_time":103,"last_error":"SSL_read: unexpected eof while reading","robots_txt_status":"success","robots_txt_updated_at":"2025-07-24T06:49:26.215Z","robots_txt_url":"https://github.com/robots.txt","online":false,"can_crawl_api":true,"host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["docx","odf","openxml","pptx","pytest","python","validation","xlsx"],"created_at":"2026-03-10T22:10:37.036Z","updated_at":"2026-05-03T22:03:01.184Z","avatar_url":"https://github.com/BramAlkema.png","language":"Python","funding_links":["https://github.com/sponsors/BramAlkema"],"categories":[],"sub_categories":[],"readme":"\u003cp align=\"center\"\u003e\n  \u003cimg src=\"https://raw.githubusercontent.com/BramAlkema/openxml-audit/main/docs/logo.png\" alt=\"OpenXML Audit\" width=\"96\"\u003e\n\u003c/p\u003e\n\n\u003ch1 align=\"center\"\u003eOpenXML 
Audit\u003c/h1\u003e\n\n[![PyPI](https://img.shields.io/pypi/v/openxml-audit)](https://pypi.org/project/openxml-audit/)\n[![Downloads](https://img.shields.io/pypi/dm/openxml-audit)](https://pypi.org/project/openxml-audit/)\n[![Python](https://img.shields.io/pypi/pyversions/openxml-audit)](https://pypi.org/project/openxml-audit/)\n[![License: MIT](https://img.shields.io/badge/license-MIT-blue.svg)](LICENSE)\n[![CI](https://github.com/BramAlkema/openxml-audit/actions/workflows/parity-gate.yml/badge.svg)](https://github.com/BramAlkema/openxml-audit/actions/workflows/parity-gate.yml)\n[![SDK Parity](https://img.shields.io/badge/SDK%20parity-100%25-brightgreen)](docs/parity_contract.md)\n[![ODF Parity](https://img.shields.io/badge/ODF%20parity-100%25-brightgreen)](docs/odf_validation_contract.md)\n[![pytest](https://img.shields.io/badge/pytest-plugin-orange)](https://pypi.org/project/openxml-audit/)\n\nValidate OOXML (PPTX/DOCX/XLSX) and ODF files in pure Python — no .NET required.\n\nA Python port of Microsoft's [Open XML SDK](https://github.com/OfficeDev/Open-XML-SDK) validation logic. Check whether generated or modified Office files will open cleanly, directly from Python scripts, CI pipelines, or anywhere .NET isn't practical.\n\nAlso supports OASIS OpenDocument Format (ODT/ODS/ODP) with staged conformance levels.\n\n## Evidence ladder\n\nValidation is the floor tier. Whether a file *survives* depends on more than ECMA legality — it also has to load in the target app, survive a save, behave correctly at runtime, and ideally match what the app itself would author. `openxml-audit` organizes this as an evidence ladder (`openxml_audit.EvidenceTier`):\n\n1. **`schema-valid`** — parses against ECMA/OASIS schemas *(this is what `openxml-audit validate` checks)*\n2. **`loadable`** — the target app opens without repair\n3. **`roundtrip-preserved`** — the app's save does not rewrite the intent\n4. **`slideshow-verified`** — runtime behavior matches intent\n5. 
**`ui-authored`** — the app itself produced this structure\n\nTiers 2–5 are backed by curated corpora of target-app-authored XML. The first corpus lives at [`docs/pptx_oracle/`](docs/pptx_oracle/README.md) — PowerPoint animation/timing, where \"schema-valid but silently rewritten\" is the dominant failure mode. DOCX and XLSX corpora can follow the same layout when the research starts.\n\n```python\nfrom openxml_audit import EvidenceTier\nfrom openxml_audit.pptx import check_capability\n\ncheck_capability(\"pptx.anim.effect.entr.fade\", minimum_tier=EvidenceTier.LOADABLE)\n```\n\n## Features\n\n- **OOXML Validation**: Package structure, schema, semantic, properties, and format-specific checks for PPTX/DOCX/XLSX — 100% parity with Open XML SDK v3.4.1 without the .NET dependency\n- **ODF Validation**: Staged conformance levels — foundation, schema-core (Relax NG), semantic-core, and security-core for ODT/ODS/ODP\n- **Evidence ladder**: Validation is the floor tier. Curated PPTX corpora (`docs/pptx_oracle/`) verify loadability, roundtrip preservation, and runtime behavior above it — for features like animation/timing where \"schema-valid\" isn't enough\n- **Fast**: about 1.2x the .NET SDK's runtime on a cold start and 2.2x warm — validates a 798K DOCX in 101ms\n- **pytest Plugin**: `assert_valid_pptx`, `assert_valid_docx`, `assert_valid_xlsx`, `assert_valid_odf` — zero config\n- **CI Ready**: GitHub Action, pre-commit hook, and parallel batch validation\n- **Multiple Output Formats**: Text, JSON, and XML output\n\n## Why validate?\n\nLibraries that generate Office files routinely produce corrupt output — python-pptx has 12+ open corruption issues, docxtpl has 7, XlsxWriter 25+. These surface as \"PowerPoint found a problem\" dialogs for end users or silent failures in CI. 
With AI agents now generating slides and reports, the problem is getting worse.\n\nopenxml-audit catches these before your users do — same checks Microsoft's SDK runs, in pure Python.\n\n| Ecosystem | Examples | How openxml-audit helps |\n|-----------|----------|------------------------|\n| File generators | python-pptx, python-docx, openpyxl, XlsxWriter | Validate output in tests and CI — catch corruption before release |\n| Template engines | docxtpl, pptx-template | Jinja2 rendering can break XML structure — validate after render |\n| Data pipelines | pandas `to_excel`, tablib, django-import-export | Assert valid exports in pipeline tests |\n| AI/LLM agents | Auto-PPT, GenFilesMCP, Docling | AI-generated Office files are unreliable — validate and retry |\n| Government / ODF | Suite Numerique, odfpy | ODF conformance for EU regulatory requirements |\n\n## Performance\n\nPure Python, but close to .NET — lxml does the heavy XML lifting in C.\n\n| Benchmark | .NET SDK | openxml-audit | Ratio |\n|-----------|----------|---------------|-------|\n| Cold start (6 files, mixed formats) | 994ms | 1,175ms | 1.2x |\n| Warm (798K DOCX) | 46ms | 101ms | 2.2x |\n| Warm (1.4MB PPTX) | — | 83ms | — |\n| Warm (114K XLSX) | — | 29ms | — |\n\nBatch validation supports `--parallel N` for multiprocess speedup. 
The pytest plugin uses session-scoped fixtures so schema loading happens once per test run.\n\n## Installation\n\n```bash\npip install openxml-audit\n```\n\nOr install from source:\n\n```bash\ngit clone https://github.com/BramAlkema/openxml-audit.git\ncd openxml-audit\npip install -e .\n```\n\n## Quick Start\n\n### Command Line\n\n```bash\n# Validate a single file\nopenxml-audit presentation.pptx\n\n# Validate an OASIS OpenDocument file\nopenxml-audit document.odt\n\n# Validate with JSON output\nopenxml-audit presentation.pptx --output json\n\n# Validate with XML output\nopenxml-audit presentation.pptx --output xml\n\n# Validate all matching files in a directory\nopenxml-audit ./presentations/ --recursive\n\n# Validate against a specific Office version\nopenxml-audit presentation.pptx --format Office2007\n\n# Limit maximum errors reported\nopenxml-audit presentation.pptx --max-errors 10\n```\n\n### Python API\n\n```python\nfrom openxml_audit import validate_pptx, is_valid_pptx, OpenXmlValidator\n\n# Quick check\nif is_valid_pptx(\"presentation.pptx\"):\n    print(\"File is valid!\")\n\n# Detailed validation\nresult = validate_pptx(\"presentation.pptx\")\nif not result.is_valid:\n    print(f\"Found {result.error_count} errors, {result.warning_count} warnings\")\n    for error in result.errors:\n        print(f\"  [{error.severity.value}] {error.description}\")\n\n# With custom options\nfrom openxml_audit import FileFormat\n\nvalidator = OpenXmlValidator(\n    file_format=FileFormat.OFFICE_2019,\n    max_errors=100,\n    schema_validation=True,\n    semantic_validation=True,\n)\nresult = validator.validate(\"presentation.pptx\")\n```\n\n## Documentation\n\n- [ADRs](docs/adr/README.md) — evidence-ladder mission and PPTX evidence ownership\n- [PPTX oracle corpus](docs/pptx_oracle/README.md) — curated PowerPoint timing\n  fixtures and XML-first methodology\n- [Parity contract](docs/parity_contract.md) — SDK calibration and drift rules\n\n## ODF Validation Depth\n\nODF 
validation is staged by explicit conformance level.\n\n| Level | Includes | Does not include |\n|---|---|---|\n| `foundation` | package/manifest integrity + XML parse sweep | Relax NG schema-core routing, semantic-core rules, security-core checks |\n| `schema-core` | foundation + Relax NG validation for routed XML members | semantic-core and security-core checks |\n| `semantic-core` | foundation + semantic-core rule families (`ODFSEM*`) | Relax NG schema-core routing, security-core checks |\n| `security-core` | semantic-core + signature/encryption structural checks (`ODFSEC*`) | full cryptographic trust guarantees unless crypto verification backend is configured |\n\nRule registry and policy references:\n\n- semantic rule IDs: `openxml_audit.odf.get_odf_semantic_rules()`\n- security policy: `docs/odf_security_policy.md`\n- reference calibration/drift contract: `docs/odf_validation_contract.md`\n\n### CLI Conformance Selection\n\nUse `--odf-level` when validating ODF files:\n\n```bash\n# foundation\nopenxml-audit file.odt --validator odf --odf-level foundation\n\n# semantic-core (default)\nopenxml-audit file.odt --validator odf --odf-level semantic-core\n\n# security-core\nopenxml-audit file.odt --validator odf --odf-level security-core\n```\n\nSchema-core uses bundled OASIS Relax NG schemas by default:\n\n```bash\nopenxml-audit file.odt \\\n  --validator odf \\\n  --odf-level schema-core\n```\n\nPass `--odf-schema-routes` only when you want to override or extend routing. 
It accepts either\nshape:\n\n- versioned mapping:\n  - `{\"1.3\": {\"content.xml\": \"schemas/odf/1.3/content.rng\"}}`\n- flat legacy mapping:\n  - `{\"content.xml\": \"schemas/odf/content.rng\"}`\n\nSecurity-core crypto verification hook:\n\n```bash\nopenxml-audit file.odt \\\n  --validator odf \\\n  --odf-level security-core \\\n  --odf-verify-cryptography\n```\n\n### API Conformance Selection\n\n```python\nfrom openxml_audit import FileFormat\nfrom openxml_audit.odf import OdfValidator\n\n# foundation\nfoundation = OdfValidator(\n    file_format=FileFormat.ODF_1_3,\n    schema_validation=False,\n    semantic_validation=False,\n    security_validation=False,\n)\n\n# schema-core (bundled schemas by default)\nschema_core = OdfValidator(\n    file_format=FileFormat.ODF_1_3,\n    schema_validation=True,\n    semantic_validation=False,\n    security_validation=False,\n    relaxng_validation=True,\n)\n\n# schema-core with custom routes\nschema_core_custom = OdfValidator(\n    file_format=FileFormat.ODF_1_3,\n    schema_validation=True,\n    semantic_validation=False,\n    security_validation=False,\n    relaxng_validation=True,\n    schema_routes={\"1.3\": {\"content.xml\": \"schemas/odf/1.3/content.rng\"}},\n)\n\n# semantic-core\nsemantic_core = OdfValidator(\n    file_format=FileFormat.ODF_1_3,\n    schema_validation=True,\n    semantic_validation=True,\n    security_validation=False,\n)\n\n# security-core\nsecurity_core = OdfValidator(\n    file_format=FileFormat.ODF_1_3,\n    schema_validation=True,\n    semantic_validation=True,\n    security_validation=True,\n    verify_cryptography=False,  # set True when crypto backend is available\n)\n```\n\n### ODF Benchmarking\n\n```bash\n# Benchmark an ODF file (5 iterations by default)\npython scripts/odf/benchmark_validation.py document.odt\n\n# More iterations, with security checks\npython scripts/odf/benchmark_validation.py document.odt --iterations 20 --security\n\n# Foundation-only (skip schema/semantic)\npython 
scripts/odf/benchmark_validation.py document.odt --no-schema --no-semantic\n```\n\nReports avg/min/max/P95 with per-phase breakdown (package_structure, xml_parse, schema, semantic, security).\n\nOOXML benchmark: `python scripts/benchmark_validation.py presentation.pptx`\n\n### Known ODF Limitations\n\n- Schema-core validates bundled routed members by default; use `schema_routes` to extend or\n  override routing for additional XML parts.\n- Security-core validates structure/policy, not full cryptographic trust by default.\n- CLI `--odf-level` only applies when the selected/auto-detected validator is ODF.\n\n### ODF Reference Calibration\n\nCompare Python results against external validators (ODF Toolkit, OPF) using the scripts in `scripts/odf/`:\n\n| Script | Purpose |\n|--------|---------|\n| `run_reference_validators.py` | Run Python + external validators on pinned corpus |\n| `compare_reference_results.py` | Diff results into mismatch families |\n| `check_reference_drift.py` | Enforce drift policy against baseline |\n| `bootstrap_reference_validators.py` | Auto-build external validator commands |\n\nCI workflow: `.github/workflows/odf-reference-calibration.yml` — builds ODF Toolkit and OPF at runtime via Maven/Docker.\n\nSet command templates via `--odf-toolkit-cmd` / `--opf-cmd` or env vars `ODF_TOOLKIT_CMD` / `OPF_ODF_VALIDATOR_CMD`. Placeholders: `{file}`, `{file_dir}`, `{file_name}`, `{file_stem}`, `{file_suffix}`.\n\n## Google Workspace Roundtrip Oracle\n\nThe `gsuite` engine in the oracle dispatcher rounds OOXML files\nthrough Google's import/export pipeline (upload → convert to native\nGoogle Slides → export back to .pptx → diff) and classifies what\nGSuite drops, transforms, or normalizes. 
See\n[`specs/031-gsuite-roundtrip-oracle.md`](specs/031-gsuite-roundtrip-oracle.md)\nfor the full design.\n\n### One-time setup\n\nGSuite uploads require **domain-wide delegation** because service\naccounts have zero storage quota since Google's 2024 policy change.\nThe setup is a one-time per-Workspace ceremony:\n\n1. **Create a GCP project** at \u003chttps://console.cloud.google.com\u003e\n   (e.g., `openxml-audit-oracle`).\n2. **APIs \u0026 Services → Library**, enable **Google Drive API**.\n3. **IAM \u0026 Admin → Service Accounts**, create one (e.g.,\n   `oracle-roundtrip`); skip the project IAM role grant.\n4. On the new SA → **Keys → Add key → JSON**. Save to\n   `~/.config/openxml-audit/google_service_account.json` and\n   `chmod 600` it.\n5. Note the SA's OAuth client ID (in **Show domain-wide delegation**\n   on the SA page).\n6. In Google Workspace Admin Console\n   (\u003chttps://admin.google.com\u003e) → **Security → Access and data\n   control → API controls → Domain-wide Delegation → Add new**.\n   Paste the OAuth client ID; scope:\n   `https://www.googleapis.com/auth/drive`. Requires Workspace\n   super-admin rights — one-time per Workspace.\n7. In Drive, create a folder owned by the impersonation subject\n   (e.g., `openxml-audit-oracle-staging`) to hold in-flight oracle\n   uploads. Copy its folder ID from the URL.\n\nInstall the optional dependency group:\n\n```bash\npip install -e \".[gsuite]\"\n```\n\n### Running\n\nThree env vars wire it up:\n\n```bash\nexport GSUITE_ORACLE_CREDS=~/.config/openxml-audit/google_service_account.json  # default; override only if elsewhere\nexport GSUITE_ORACLE_SUBJECT=info@yourdomain.example                             # the user the SA impersonates\nexport GSUITE_ORACLE_FOLDER_ID=1abcDEFghijKLM...                                 
# the staging folder ID\n```\n\nThen:\n\n```bash\npython -m openxml_audit.oracle gsuite presentation.pptx\npython -m openxml_audit.oracle gsuite ./corpus/ --output gsuite-report.json\n```\n\nThe report classifies each roundtrip across a `LossClass` taxonomy:\n`theme_loss`, `master_loss`, `style_loss`, `font_loss`,\n`media_re_encoded`, `metadata_churn`, `structural_normalization`\n(parts GSuite *added*), `content_preserved_lossy`,\n`content_changed`, `unmapped`. Multiple classes may fire per file.\n\nDrive uploads are deleted in `finally` after each roundtrip — the\noracle never leaves files in your account.\n\n## Open XML SDK (Standalone)\n\nRun the .NET SDK validator separately (requires .NET SDK 8.x or Docker):\n\n```bash\ndotnet run --project scripts/sdk_check/sdk_check.csproj -- /path/to/file.pptx\ndotnet run --project scripts/sdk_compare/OpenXmlSdkValidator.csproj -- /path/to/file.pptx  # JSON\n\n# Via Docker\ndocker run --rm -v \"$PWD:/work\" -w /work mcr.microsoft.com/dotnet/sdk:8.0 \\\n  dotnet run --project scripts/sdk_check/sdk_check.csproj -- /work/path/to/file.pptx\n```\n\nSupports PPTX/DOCX/XLSX and variants. 
Configured for Office 2019.\n\n## GitHub Action\n\nValidate Office files in your PRs automatically:\n\n```yaml\n# .github/workflows/validate-office-files.yml\nname: Validate Office Files\non: [pull_request]\n\njobs:\n  validate:\n    runs-on: ubuntu-latest\n    steps:\n      - uses: actions/checkout@v4\n      - uses: actions/setup-python@v5\n        with:\n          python-version: \"3.12\"\n      - uses: BramAlkema/openxml-audit@main\n        with:\n          changed-only: \"true\"  # only validate files changed in the PR\n```\n\nOptions:\n\n| Input | Default | Description |\n|-------|---------|-------------|\n| `path` | `.` | Directory or file to validate |\n| `format` | `Office2019` | Office version to validate against |\n| `changed-only` | `false` | Only validate files changed in the PR |\n| `recursive` | `true` | Search subdirectories |\n| `max-errors` | `100` | Maximum errors per file |\n\n## Pre-commit Hook\n\n```yaml\n# .pre-commit-config.yaml\nrepos:\n  - repo: https://github.com/BramAlkema/openxml-audit\n    rev: v0.5.0\n    hooks:\n      - id: openxml-audit\n```\n\nValidates any `.pptx`, `.docx`, `.xlsx`, `.odt`, `.ods`, or `.odp` file before commit.\n\n## Examples\n\nReady-to-run scripts in [`examples/`](examples/):\n\n| Script | Description |\n|--------|-------------|\n| [`validate_python_pptx.py`](examples/validate_python_pptx.py) | Generate a PPTX with python-pptx and validate it |\n| [`validate_openpyxl.py`](examples/validate_openpyxl.py) | Generate an XLSX with openpyxl and validate it |\n| [`validate_odf.py`](examples/validate_odf.py) | Validate an ODF file (ODT/ODS/ODP) |\n| [`ci_validation.py`](examples/ci_validation.py) | Validate all Office files in a directory (CI-ready, OOXML + ODF) |\n\n## CI Workflows\n\n| Workflow | Trigger | Purpose |\n|----------|---------|---------|\n| `parity-gate.yml` | PR / push | Enforce OOXML parity + perf budget against SDK baseline |\n| `calibrate-parity.yml` | Weekly / dispatch | Calibrate against Open XML SDK 
upstream |\n| `sdk-update.yml` | Quarterly / dispatch | Track upstream SDK version changes |\n| `odf-reference-calibration.yml` | Dispatch | Run ODF reference validators and drift checks |\n| `validate-inputs.yml` | Push to `inputs/` | Validate dropped files with both Python and .NET SDK |\n| `release.yml` | Tag push (`v*`) | Build and publish to PyPI |\n| `pages.yml` | Push to `main` | Deploy documentation site |\n\nOOXML parity details: `docs/parity_contract.md`. ODF reference contract: `docs/odf_validation_contract.md`.\n\n## pytest Plugin\n\nFixtures are registered automatically — just `pip install openxml-audit` and use them:\n\n```python\ndef test_my_presentation(assert_valid_pptx, tmp_path):\n    output = tmp_path / \"output.pptx\"\n    generate_pptx(output)\n    assert_valid_pptx(output)  # fails with detailed errors if invalid\n\ndef test_my_document(assert_valid_docx, tmp_path):\n    output = tmp_path / \"output.docx\"\n    generate_docx(output)\n    assert_valid_docx(output)\n\ndef test_my_spreadsheet(assert_valid_xlsx, tmp_path):\n    output = tmp_path / \"output.xlsx\"\n    generate_xlsx(output)\n    assert_valid_xlsx(output)\n\ndef test_odf_file(assert_valid_odf, tmp_path):\n    output = tmp_path / \"output.odt\"\n    generate_odt(output)\n    assert_valid_odf(output)\n```\n\nCLI options:\n\n```bash\n# Validate against a specific Office version\npytest --openxml-format Office2007\n\n# Limit errors collected per file\npytest --openxml-max-errors 50\n```\n\nAvailable fixtures: `openxml_validator`, `assert_valid_pptx`, `assert_valid_docx`, `assert_valid_xlsx`, `assert_valid_odf`.\n\n## Integration Helpers\n\n```python\n# Context manager\nfrom openxml_audit import validation_context\n\nwith validation_context(raise_on_invalid=True) as validator:\n    result = validator.validate(\"presentation.pptx\")\n\n# Decorator — validate after save\nfrom openxml_audit import validate_on_save\n\n@validate_on_save(raise_on_invalid=True)\ndef 
create_presentation(output_path: str) -\u003e None:\n    Presentation().save(output_path)\n\n# Decorator — require valid input\nfrom openxml_audit import require_valid_pptx\n\n@require_valid_pptx()\ndef process(input_path: str) -\u003e dict: ...\n```\n\n## API Reference\n\n### `OpenXmlValidator` / `OdfValidator`\n\n```python\nOpenXmlValidator(file_format=FileFormat.OFFICE_2019, max_errors=1000,\n                 schema_validation=True, semantic_validation=True)\n\nOdfValidator(file_format=FileFormat.ODF_1_3, max_errors=1000,\n             schema_validation=True, semantic_validation=True,\n             security_validation=False, strict=True)\n```\n\nBoth expose:\n- `validate(path) -\u003e ValidationResult`\n- `validate_with_timings(path) -\u003e (ValidationResult, dict[str, float])`\n- `is_valid(path) -\u003e bool`\n\n### `ValidationResult`\n\n| Property | Type | Description |\n|----------|------|-------------|\n| `is_valid` | `bool` | No ERROR-severity issues |\n| `errors` | `list[ValidationError]` | All errors and warnings |\n| `error_count` / `warning_count` | `int` | Counts by severity |\n| `file_path` | `str` | Validated file path |\n| `file_format` | `FileFormat` | Version validated against |\n\n### `ValidationError`\n\n| Property | Type | Description |\n|----------|------|-------------|\n| `error_type` | `ValidationErrorType` | `PACKAGE`, `BINARY`, `SCHEMA`, `SEMANTIC`, `RELATIONSHIP`, `MARKUP_COMPATIBILITY` |\n| `severity` | `ValidationSeverity` | `ERROR`, `WARNING`, `INFO` |\n| `description` | `str` | Human-readable message |\n| `part_uri` | `str \\| None` | Affected part URI |\n| `path` | `str \\| None` | XPath to affected element |\n\n### Supported Formats\n\n| OOXML | ODF |\n|-------|-----|\n| `OFFICE_2007` through `MICROSOFT_365` (default: `OFFICE_2019`) | `ODF_1_2`, `ODF_1_3` (default: `ODF_1_3`) |\n\n### Convenience Functions\n\n- `validate_pptx(path) -\u003e ValidationResult`\n- `is_valid_pptx(path) -\u003e bool`\n\n## Works Well With\n\nThese 
libraries create Office files — openxml-audit checks them:\n\n| Library | Format | Description |\n|---------|--------|-------------|\n| [python-pptx](https://github.com/scanny/python-pptx) | PPTX | Create and update PowerPoint files |\n| [python-docx](https://github.com/python-openxml/python-docx) | DOCX | Create and update Word files |\n| [openpyxl](https://openpyxl.readthedocs.io/) | XLSX | Create and update Excel files |\n\n```python\nfrom pptx import Presentation\nfrom openxml_audit import validate_pptx\n\nPresentation().save(\"output.pptx\")\n\nresult = validate_pptx(\"output.pptx\")\nif not result.is_valid:\n    print(f\"{result.error_count} issues found\")\n```\n\n## Contributing\n\nContributions are welcome! See [CONTRIBUTING.md](CONTRIBUTING.md) for dev setup and guidelines.\n\n## Looking for Maintainers\n\nThis project is actively looking for co-maintainers — especially people working with:\n\n- Office file generation pipelines (python-pptx, python-docx, openpyxl)\n- ODF tooling and OASIS conformance\n- Open XML SDK internals\n\nIf you're interested, open an issue or reach out.\n\n## Funding\n\nIf this project saves you time, consider sponsoring its development:\n\n[![GitHub Sponsors](https://img.shields.io/badge/sponsor-GitHub%20Sponsors-ea4aaa)](https://github.com/sponsors/BramAlkema)\n\n## Changelog\n\nSee [CHANGELOG.md](CHANGELOG.md) for a full list of changes by version.\n\n## License\n\n[MIT](LICENSE)\n\n## Acknowledgments\n\nBased on the validation logic from Microsoft's [Open XML SDK](https://github.com/OfficeDev/Open-XML-SDK) for .NET.\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fbramalkema%2Fopenxml-audit","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fbramalkema%2Fopenxml-audit","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fbramalkema%2Fopenxml-audit/lists"}