{"id":47617136,"url":"https://github.com/opendqv/opendqv","last_synced_at":"2026-04-25T15:03:13.508Z","repository":{"id":345707914,"uuid":"1180353452","full_name":"OpenDQV/OpenDQV","owner":"OpenDQV","description":"Open-source, contract-driven data quality validation. Shift-left enforcement at the point of write — before data enters your pipeline.","archived":false,"fork":false,"pushed_at":"2026-04-11T09:17:12.000Z","size":50107,"stargazers_count":10,"open_issues_count":1,"forks_count":2,"subscribers_count":0,"default_branch":"main","last_synced_at":"2026-04-11T10:27:17.019Z","etag":null,"topics":["data-contracts","data-governance","data-quality","data-validation","fastapi","mcp","open-source","python","shift-left"],"latest_commit_sha":null,"homepage":"https://github.com/OpenDQV/OpenDQV","language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"mit","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/OpenDQV.png","metadata":{"files":{"readme":"README.md","changelog":"CHANGELOG.md","contributing":"CONTRIBUTING.md","funding":null,"license":"LICENSE","code_of_conduct":"CODE_OF_CONDUCT.md","threat_model":null,"audit":null,"citation":null,"codeowners":".github/CODEOWNERS","security":"SECURITY.md","support":"SUPPORT.md","governance":"GOVERNANCE.md","roadmap":"docs/roadmap.md","authors":null,"dei":null,"publiccode":null,"codemeta":null,"zenodo":null,"notice":null,"maintainers":null,"copyright":null,"agents":null,"dco":null,"cla":"CLA.md"}},"created_at":"2026-03-13T00:39:41.000Z","updated_at":"2026-04-11T09:17:20.000Z","dependencies_parsed_at":null,"dependency_job_id":null,"html_url":"https://github.com/OpenDQV/OpenDQV","commit_stats":null,"previous_names":["opendqv/opendqv"],"tags_count":46,"template":false,"template_full_name":null,"purl":"pkg:github/OpenDQV/OpenDQV","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/OpenDQV%2FOpenDQV","tags_url":"https:
//repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/OpenDQV%2FOpenDQV/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/OpenDQV%2FOpenDQV/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/OpenDQV%2FOpenDQV/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/OpenDQV","download_url":"https://codeload.github.com/OpenDQV/OpenDQV/tar.gz/refs/heads/main","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/OpenDQV%2FOpenDQV/sbom","scorecard":{"id":1245328,"data":{"date":"2026-03-28T05:55:06Z","repo":{"name":"github.com/OpenDQV/OpenDQV","commit":"135cfbb0ac2895627ef2b095b4a5776db8af9966"},"scorecard":{"version":"v5.0.0","commit":"ea7e27ed41b76ab879c862fa0ca4cc9c61764ee4"},"score":5.6,"checks":[{"name":"Binary-Artifacts","score":10,"reason":"no binaries found in the repo","details":null,"documentation":{"short":"Determines if the project has generated executable (binary) artifacts in the source repository.","url":"https://github.com/ossf/scorecard/blob/ea7e27ed41b76ab879c862fa0ca4cc9c61764ee4/docs/checks.md#binary-artifacts"}},{"name":"Branch-Protection","score":-1,"reason":"internal error: error during branchesHandler.setup: internal error: githubv4.Query: Resource not accessible by integration","details":null,"documentation":{"short":"Determines if the default and release branches are protected with GitHub's branch protection settings.","url":"https://github.com/ossf/scorecard/blob/ea7e27ed41b76ab879c862fa0ca4cc9c61764ee4/docs/checks.md#branch-protection"}},{"name":"CI-Tests","score":10,"reason":"1 out of 1 merged PRs checked by a CI test -- score normalized to 10","details":null,"documentation":{"short":"Determines if the project runs tests before pull requests are merged.","url":"https://github.com/ossf/scorecard/blob/ea7e27ed41b76ab879c862fa0ca4cc9c61764ee4/docs/checks.md#ci-tests"}},{"name":"CII-Best-Practices","score":5,"reason":"badge detected: 
Passing","details":null,"documentation":{"short":"Determines if the project has an OpenSSF (formerly CII) Best Practices Badge.","url":"https://github.com/ossf/scorecard/blob/ea7e27ed41b76ab879c862fa0ca4cc9c61764ee4/docs/checks.md#cii-best-practices"}},{"name":"Code-Review","score":0,"reason":"Found 0/28 approved changesets -- score normalized to 0","details":null,"documentation":{"short":"Determines if the project requires human code review before pull requests (aka merge requests) are merged.","url":"https://github.com/ossf/scorecard/blob/ea7e27ed41b76ab879c862fa0ca4cc9c61764ee4/docs/checks.md#code-review"}},{"name":"Contributors","score":3,"reason":"project has 1 contributing companies or organizations -- score normalized to 3","details":["Info: anthropics contributor org/company found, "],"documentation":{"short":"Determines if the project has a set of contributors from multiple organizations (e.g., companies).","url":"https://github.com/ossf/scorecard/blob/ea7e27ed41b76ab879c862fa0ca4cc9c61764ee4/docs/checks.md#contributors"}},{"name":"Dangerous-Workflow","score":10,"reason":"no dangerous workflow patterns detected","details":null,"documentation":{"short":"Determines if the project's GitHub Action workflows avoid dangerous patterns.","url":"https://github.com/ossf/scorecard/blob/ea7e27ed41b76ab879c862fa0ca4cc9c61764ee4/docs/checks.md#dangerous-workflow"}},{"name":"Dependency-Update-Tool","score":10,"reason":"update tool detected","details":["Info: detected update tool: Dependabot: .github/dependabot.yml:1"],"documentation":{"short":"Determines if the project uses a dependency update tool.","url":"https://github.com/ossf/scorecard/blob/ea7e27ed41b76ab879c862fa0ca4cc9c61764ee4/docs/checks.md#dependency-update-tool"}},{"name":"Fuzzing","score":0,"reason":"project is not fuzzed","details":["Warn: no fuzzer integrations found"],"documentation":{"short":"Determines if the project uses 
fuzzing.","url":"https://github.com/ossf/scorecard/blob/ea7e27ed41b76ab879c862fa0ca4cc9c61764ee4/docs/checks.md#fuzzing"}},{"name":"License","score":10,"reason":"license file detected","details":["Info: project has a license file: LICENSE:0","Info: FSF or OSI recognized license: MIT License: LICENSE:0"],"documentation":{"short":"Determines if the project has defined a license.","url":"https://github.com/ossf/scorecard/blob/ea7e27ed41b76ab879c862fa0ca4cc9c61764ee4/docs/checks.md#license"}},{"name":"Maintained","score":0,"reason":"project was created in last 90 days. please review its contents carefully","details":["Warn: Repository was created in last 90 days."],"documentation":{"short":"Determines if the project is \"actively maintained\".","url":"https://github.com/ossf/scorecard/blob/ea7e27ed41b76ab879c862fa0ca4cc9c61764ee4/docs/checks.md#maintained"}},{"name":"Packaging","score":10,"reason":"packaging workflow detected","details":["Info: Project packages its releases by way of GitHub Actions.: .github/workflows/ci.yml:111"],"documentation":{"short":"Determines if the project is published as a package that others can easily download, install, easily update, and uninstall.","url":"https://github.com/ossf/scorecard/blob/ea7e27ed41b76ab879c862fa0ca4cc9c61764ee4/docs/checks.md#packaging"}},{"name":"Pinned-Dependencies","score":3,"reason":"dependency not pinned by hash detected -- score normalized to 3","details":["Info: Possibly incomplete results: error parsing shell code: \"foo(\" must be followed by ): scripts/clean_room_test.sh:385","Warn: third-party GitHubAction not pinned by hash: .github/workflows/ci.yml:55: update your workflow using https://app.stepsecurity.io/secureworkflow/OpenDQV/OpenDQV/ci.yml/main?enable=pin","Warn: GitHub-owned GitHubAction not pinned by hash: .github/workflows/link-check.yml:21: update your workflow using https://app.stepsecurity.io/secureworkflow/OpenDQV/OpenDQV/link-check.yml/main?enable=pin","Warn: third-party GitHubAction not 
pinned by hash: .github/workflows/link-check.yml:24: update your workflow using https://app.stepsecurity.io/secureworkflow/OpenDQV/OpenDQV/link-check.yml/main?enable=pin","Warn: third-party GitHubAction not pinned by hash: .github/workflows/link-check.yml:55: update your workflow using https://app.stepsecurity.io/secureworkflow/OpenDQV/OpenDQV/link-check.yml/main?enable=pin","Warn: containerImage not pinned by hash: Dockerfile:3","Warn: containerImage not pinned by hash: Dockerfile.smoketest:1: pin your Docker image by updating python:3.11-slim to python:3.11-slim@sha256:9358444059ed78e2975ada2c189f1c1a3144a5dab6f35bff8c981afb38946634","Warn: pipCommand not pinned by hash: Dockerfile:16-17","Warn: pipCommand not pinned by hash: Dockerfile:16-17","Warn: pipCommand not pinned by hash: Dockerfile.smoketest:7-8","Warn: pipCommand not pinned by hash: Dockerfile.smoketest:7-8","Warn: pipCommand not pinned by hash: install.sh:28","Warn: pipCommand not pinned by hash: scripts/clean_room_test.sh:84","Warn: pipCommand not pinned by hash: scripts/clean_room_test.sh:404","Warn: pipCommand not pinned by hash: scripts/clean_room_test.sh:416","Warn: downloadThenRun not pinned by hash: scripts/clean_room_test.sh:599","Warn: downloadThenRun not pinned by hash: scripts/clean_room_test.sh:613","Warn: downloadThenRun not pinned by hash: scripts/clean_room_test.sh:614","Warn: pipCommand not pinned by hash: scripts/run_smoke_tests.sh:132","Warn: pipCommand not pinned by hash: .github/workflows/ci.yml:35","Warn: pipCommand not pinned by hash: .github/workflows/ci.yml:76","Warn: pipCommand not pinned by hash: .github/workflows/ci.yml:98","Warn: pipCommand not pinned by hash: .github/workflows/ci.yml:110","Warn: downloadThenRun not pinned by hash: .github/workflows/ci.yml:137","Warn: pipCommand not pinned by hash: .github/workflows/ci.yml:181","Warn: pipCommand not pinned by hash: .github/workflows/ci.yml:204","Warn: pipCommand not pinned by hash: .github/workflows/publish.yml:34","Warn: 
pipCommand not pinned by hash: .github/workflows/release.yml:52","Warn: pipCommand not pinned by hash: .github/workflows/release.yml:119","Info:  26 out of  27 GitHub-owned GitHubAction dependencies pinned","Info:  10 out of  13 third-party GitHubAction dependencies pinned","Info:   0 out of   2 containerImage dependencies pinned","Info:   0 out of  18 pipCommand dependencies pinned","Info:   0 out of   4 downloadThenRun dependencies pinned"],"documentation":{"short":"Determines if the project has declared and pinned the dependencies of its build process.","url":"https://github.com/ossf/scorecard/blob/ea7e27ed41b76ab879c862fa0ca4cc9c61764ee4/docs/checks.md#pinned-dependencies"}},{"name":"SAST","score":10,"reason":"SAST tool is run on all commits","details":["Info: SAST configuration detected: CodeQL","Info: all commits (2) are checked with a SAST tool"],"documentation":{"short":"Determines if the project uses static code analysis.","url":"https://github.com/ossf/scorecard/blob/ea7e27ed41b76ab879c862fa0ca4cc9c61764ee4/docs/checks.md#sast"}},{"name":"Security-Policy","score":10,"reason":"security policy file detected","details":["Info: security policy file detected: SECURITY.md:1","Info: Found linked content: SECURITY.md:1","Info: Found disclosure, vulnerability, and/or timelines in security policy: SECURITY.md:1","Info: Found text in security policy: SECURITY.md:1"],"documentation":{"short":"Determines if the project has published a security policy.","url":"https://github.com/ossf/scorecard/blob/ea7e27ed41b76ab879c862fa0ca4cc9c61764ee4/docs/checks.md#security-policy"}},{"name":"Signed-Releases","score":-1,"reason":"no releases found","details":null,"documentation":{"short":"Determines if the project cryptographically signs release artifacts.","url":"https://github.com/ossf/scorecard/blob/ea7e27ed41b76ab879c862fa0ca4cc9c61764ee4/docs/checks.md#signed-releases"}},{"name":"Token-Permissions","score":0,"reason":"detected GitHub workflow tokens with excessive 
permissions","details":["Info: jobLevel 'contents' permission set to 'read': .github/workflows/ci.yml:164","Info: jobLevel 'contents' permission set to 'read': .github/workflows/ci.yml:195","Info: jobLevel 'contents' permission set to 'read': .github/workflows/ci.yml:24","Info: jobLevel 'contents' permission set to 'read': .github/workflows/ci.yml:66","Info: jobLevel 'contents' permission set to 'read': .github/workflows/ci.yml:88","Info: jobLevel 'contents' permission set to 'read': .github/workflows/ci.yml:115","Info: jobLevel 'contents' permission set to 'read': .github/workflows/codeql.yml:19","Info: jobLevel 'contents' permission set to 'read': .github/workflows/docker-publish.yml:18","Info: jobLevel 'contents' permission set to 'read': .github/workflows/release.yml:105","Info: jobLevel 'contents' permission set to 'read': .github/workflows/scorecard.yml:23","Info: jobLevel 'actions' permission set to 'read': .github/workflows/scorecard.yml:24","Info: topLevel permissions set to 'read-all': .github/workflows/ci.yml:9","Info: topLevel permissions set to 'read-all': .github/workflows/codeql.yml:12","Info: topLevel 'contents' permission set to 'read': .github/workflows/docker-publish.yml:11","Warn: no topLevel permission defined: .github/workflows/link-check.yml:1","Info: topLevel 'contents' permission set to 'read': .github/workflows/publish.yml:14","Warn: topLevel 'contents' permission set to 'write': .github/workflows/release.yml:32","Info: topLevel permissions set to 'read-all': .github/workflows/scorecard.yml:12","Info: no jobLevel write permissions found"],"documentation":{"short":"Determines if the project's workflows follow the principle of least privilege.","url":"https://github.com/ossf/scorecard/blob/ea7e27ed41b76ab879c862fa0ca4cc9c61764ee4/docs/checks.md#token-permissions"}},{"name":"Vulnerabilities","score":3,"reason":"7 existing vulnerabilities detected","details":["Warn: Project is vulnerable to: GHSA-5239-wwwm-4pmq","Warn: Project is vulnerable 
to: GHSA-2c2j-9gv5-cj73","Warn: Project is vulnerable to: GHSA-7f5h-v6xp-fcq8","Warn: Project is vulnerable to: GHSA-w2gf-jxc9-pf2q / PYSEC-2024-203","Warn: Project is vulnerable to: GHSA-5xh2-23cc-5jc6","Warn: Project is vulnerable to: GHSA-79gp-q4wv-33fr / PYSEC-2024-171","Warn: Project is vulnerable to: GHSA-7p48-42j8-8846"],"documentation":{"short":"Determines if the project has open, known unfixed vulnerabilities.","url":"https://github.com/ossf/scorecard/blob/ea7e27ed41b76ab879c862fa0ca4cc9c61764ee4/docs/checks.md#vulnerabilities"}}]},"last_synced_at":"2026-03-28T07:25:06.788Z","repository_id":345707914,"created_at":"2026-03-28T07:25:06.788Z","updated_at":"2026-03-28T07:25:06.788Z"},"host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":286080680,"owners_count":31701642,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2026-04-11T21:17:31.016Z","status":"online","status_checked_at":"2026-04-12T02:00:06.763Z","response_time":58,"last_error":null,"robots_txt_status":"success","robots_txt_updated_at":"2025-07-24T06:49:26.215Z","robots_txt_url":"https://github.com/robots.txt","online":true,"can_crawl_api":true,"host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["data-contracts","data-governance","data-quality","data-validation","fastapi","mcp","open-source","python","shift-left"],"created_at":"2026-04-01T21:36:17.271Z","updated_at":"2026-04-25T15:03:13.497Z","avatar_url":"https://github.com/OpenDQV.png","language":"Python","funding_links":[],"categories":[],"sub_categories":[],"readme":"\u003cp align=\"center\"\u003e\n  \u003cimg src=\"docs/assets/OpenDQV-Logo-Hires.png\" alt=\"OpenDQV — Open Data Quality Validation\" 
width=\"480\"\u003e\n\u003c/p\u003e\n\n[![CI](https://github.com/OpenDQV/OpenDQV/actions/workflows/ci.yml/badge.svg)](https://github.com/OpenDQV/OpenDQV/actions/workflows/ci.yml)\n[![License: MIT](https://img.shields.io/badge/License-MIT-blue.svg)](https://github.com/OpenDQV/OpenDQV/blob/main/LICENSE)\n[![Python 3.11+](https://img.shields.io/badge/python-3.11%2B-blue)](https://pypi.org/project/opendqv/)\n[![PyPI](https://img.shields.io/pypi/v/opendqv?style=flat)](https://pypi.org/project/opendqv/)\n[![Docker](https://img.shields.io/badge/docker-ghcr.io%2Fopendqv-blue?logo=docker)](https://github.com/orgs/OpenDQV/packages/container/package/opendqv%2Fopendqv)\n[![Platforms](https://img.shields.io/badge/platforms-Linux%20%7C%20macOS%20%7C%20Windows%20%7C%20ARM64-lightgrey)](#)\n[![OpenSSF Scorecard](https://api.securityscorecards.dev/projects/github.com/OpenDQV/OpenDQV/badge)](https://securityscorecards.dev/#/projects/github.com/OpenDQV/OpenDQV)\n[![Coverage](https://codecov.io/gh/OpenDQV/OpenDQV/branch/main/graph/badge.svg)](https://codecov.io/gh/OpenDQV/OpenDQV)\n[![Ruff](https://img.shields.io/endpoint?url=https://raw.githubusercontent.com/astral-sh/ruff/main/assets/badge/v2.json)](https://github.com/astral-sh/ruff)\n[![OpenSSF Best Practices](https://www.bestpractices.dev/projects/12229/badge)](https://www.bestpractices.dev/projects/12229)\n\n| [Quickstart](docs/quickstart.md) | [Rules](docs/rules/) | [Contracts](docs/compliance-contracts.md) | [MCP](docs/mcp.md) | [API](docs/index.md) | [Security](SECURITY.md) | [FAQ](docs/faq.md) |\n|---|---|---|---|---|---|---|\n\n\u003e **\"Trust is easier to build than to repair.\"**\n\u003e That is why OpenDQV exists. A `422` at the point of write is cheaper than a data incident three weeks later.\n\n\u003e **Beta (v2.x).** Public API surface (REST, contract YAML, MCP tools, Python SDK) is stable. Breaking changes follow a one-release deprecation cycle. Security fixes backported to the latest 2.x line. 
See [API Stability](#api-stability) for commitments.\n\n**OpenDQV is a write-time data validation service.** Source systems call it before writing data. Bad records return a `422` with per-field errors. Good records pass through. No payload is stored.\n\n![OpenDQV demo — define a contract, send a bad record (get a 422), fix it (get a 200)](docs/demo_wizard.gif)\n\n```mermaid\nflowchart LR\n    subgraph Callers\n        direction TB\n        SF[Salesforce]\n        SAP[SAP]\n        DYN[Dynamics]\n        ORA[Oracle]\n        WEB[Web forms]\n        ETL1[ETL pipelines]\n\n        DJ[Django clean]\n        PY[Python scripts]\n        PD[Pandas / ETL]\n\n        CD[Claude Desktop]\n        CUR[Cursor]\n        LLM[LLM agents]\n    end\n\n    subgraph OpenDQV\n        direction TB\n        API[Validation API\\nREST / batch]\n        SDK[LocalValidator\\nin-process SDK]\n        MCP[MCP Server\\nAI-native]\n        API \u0026 SDK \u0026 MCP --\u003e CON[Contracts · YAML\\nGovernance · RBAC\\nAudit trail]\n        API \u0026 SDK \u0026 MCP --\u003e GEN[Code Generator\\nApex · JS · SQL]\n    end\n\n    subgraph Results\n        direction TB\n        R1[valid: true / false]\n        R2[per-field errors]\n        R3[severity levels]\n        R4[webhooks on events]\n    end\n\n    SF \u0026 SAP \u0026 DYN \u0026 ORA \u0026 WEB \u0026 ETL1 --\u003e API\n    DJ \u0026 PY \u0026 PD --\u003e SDK\n    CD \u0026 CUR \u0026 LLM --\u003e MCP\n\n    API \u0026 SDK \u0026 MCP --\u003e R1\n\n    subgraph Importers\n        IMP[dbt schema · GX suites\\nSoda checks · ODCS · CSV]\n    end\n    IMP --\u003e CON\n\n    style API fill:#0d3b5e,stroke:#092a44,color:#fff\n    style SDK fill:#0d3b5e,stroke:#092a44,color:#fff\n    style MCP fill:#0d3b5e,stroke:#092a44,color:#fff\n    style CON fill:#1a8aad,stroke:#14708d,color:#fff\n    style GEN fill:#1a8aad,stroke:#14708d,color:#fff\n    style R1 fill:#2ec4e6,stroke:#1a8aad,color:#0d3b5e\n    style R2 fill:#2ec4e6,stroke:#1a8aad,color:#0d3b5e\n 
   style R3 fill:#2ec4e6,stroke:#1a8aad,color:#0d3b5e\n    style R4 fill:#2ec4e6,stroke:#1a8aad,color:#0d3b5e\n    style IMP fill:#1a8aad,stroke:#14708d,color:#fff\n```\n\nA `422` at the point of write closes the feedback loop — producers see failures immediately and fix them at source. Rejection rates drop over time because the tool changes the incentive, not just the outcome.\n\nFor post-landing monitoring use [Great Expectations](https://greatexpectations.io), [Soda](https://www.soda.io), or [dbt tests](https://docs.getdbt.com/docs/build/tests) — they're complementary, not competing. OpenDQV owns layer one (write-time enforcement); those tools own layer three (post-ingestion observability).\n\n---\n\n## AI Agents — first-class via MCP\n\nOpenDQV ships a built-in [Model Context Protocol](https://modelcontextprotocol.io) server, so [Claude Desktop](https://claude.ai/download), [Cursor](https://www.cursor.com), and any other MCP-compatible agent can discover contracts, validate records, and explain failures through tool calls the agent **explicitly declares** — no hallucinated compliance, no invented rules.\n\n[![Watch the 4-minute MCP demo](docs/demo_mcp_poster.png)](https://github.com/user-attachments/assets/4d414ff1-b08c-4ff1-91e4-e421f0d5391d)\n\n*4-minute demo: Claude Desktop uses two MCP servers — OpenDQV for validation, Marmot for catalog lineage — to check a menu item against `ppds_menu_item` for Natasha's Law allergen compliance, stating which tool calls it makes and why. ([Backup: download the MP4 from the repo](https://github.com/OpenDQV/OpenDQV/raw/main/docs/demo_mcp.mp4))*\n\nFor tool reference, write guardrails, remote/enterprise mode, and the Marmot composition pattern, see **[docs/mcp.md](docs/mcp.md)**.\n\n---\n\n## Install\n\n| I have... 
| Command |\n|-----------|---------|\n| Python 3.11+ | `git clone https://github.com/OpenDQV/OpenDQV.git \u0026\u0026 cd OpenDQV \u0026\u0026 bash install.sh` |\n| Docker | `git clone https://github.com/OpenDQV/OpenDQV.git \u0026\u0026 cd OpenDQV \u0026\u0026 cp .env.example .env \u0026\u0026 docker compose up -d` |\n| Just the SDK/CLI | `pip install opendqv` then `opendqv init` to bootstrap contracts |\n| None of the above | [Beginner setup guide →](docs/beginner-quickstart.md) |\n\n`install.sh` creates a virtual environment, installs dependencies, and launches the onboarding wizard. Docker pulls `ghcr.io/opendqv/opendqv:latest` — no build step required.\n\n\u003e ⚠️ `AUTH_MODE=open` (the default) has **no authentication**. Set `AUTH_MODE=token` and a strong `SECRET_KEY` in `.env` before any non-local deployment. See [SECURITY.md](SECURITY.md).\n\n---\n\n## Your First Validation\n\n**1. Write a contract** — drop a YAML file in your contracts directory (run `opendqv init --all` to copy the 43 bundled contracts, or `opendqv init` for a single starter):\n\n```yaml\ncontract:\n  name: order\n  version: \"1.0\"\n  owner: \"Data Governance\"\n  status: active\n  rules:\n    - name: valid_email\n      type: regex\n      field: email\n      pattern: \"^[^@\\\\s]+@[^@\\\\s]+\\\\.[^@\\\\s]+$\"\n      severity: error\n      error_message: \"Invalid email format\"\n    - name: amount_positive\n      type: min\n      field: amount\n      min: 0.01\n      severity: error\n      error_message: \"Order amount must be positive\"\n    - name: status_valid\n      type: allowed_values\n      field: status\n      allowed_values: [pending, confirmed, shipped, cancelled]\n      severity: error\n      error_message: \"Invalid order status\"\n```\n\n**2. Reload contracts:**\n\n```bash\ncurl -X POST http://localhost:8000/api/v1/contracts/reload\n```\n\n**3. 
Send a bad record — OpenDQV rejects it:**\n\n```bash\ncurl -s -X POST http://localhost:8000/api/v1/validate \\\n  -H \"Content-Type: application/json\" \\\n  -d '{\"contract\": \"order\", \"record\": {\"email\": \"not-an-email\", \"amount\": -5, \"status\": \"unknown\"}}'\n```\n\n```json\n{\n  \"valid\": false,\n  \"errors\": [\n    {\"field\": \"email\",  \"rule\": \"valid_email\",    \"message\": \"Invalid email format\",        \"severity\": \"error\"},\n    {\"field\": \"amount\", \"rule\": \"amount_positive\", \"message\": \"Order amount must be positive\", \"severity\": \"error\"},\n    {\"field\": \"status\", \"rule\": \"status_valid\",    \"message\": \"Invalid order status\",        \"severity\": \"error\"}\n  ],\n  \"contract\": \"order\",\n  \"version\": \"1.0\"\n}\n```\n\n**4. Fix the record — it passes:**\n\n```bash\ncurl -s -X POST http://localhost:8000/api/v1/validate \\\n  -H \"Content-Type: application/json\" \\\n  -d '{\"contract\": \"order\", \"record\": {\"email\": \"alice@example.com\", \"amount\": 49.99, \"status\": \"pending\"}}'\n```\n\n```json\n{\"valid\": true, \"errors\": [], \"warnings\": [], \"contract\": \"order\", \"version\": \"1.0\"}\n```\n\nThe `customer` contract ships pre-seeded if you want to skip step 1. The [quickstart guide](docs/quickstart.md) walks through authoring, lifecycle, and batch validation.\n\n---\n\n## Rules\n\n| Type | What it checks |\n|------|----------------|\n| `not_empty` | Field is present and non-empty |\n| `regex` | Field matches (or does not match) a pattern. Built-ins: `builtin:email`, `builtin:uuid`, `builtin:ipv4`, `builtin:url` |\n| `min` / `max` / `range` | Numeric bounds |\n| `min_length` / `max_length` | String length |\n| `date_format` | Parseable date/datetime. 
Falls back through common formats if no explicit format is set |\n| `allowed_values` | Value must be in a fixed list |\n| `lookup` | Value must appear in a local file or HTTP endpoint (with TTL cache) |\n| `compare` | Cross-field: `field` op `compare_to` — supports `gt`, `lt`, `gte`, `lte`, `eq`, `neq`, and `today`/`now` sentinels |\n| `required_if` / `forbidden_if` | Conditional: required or forbidden when another field equals a value |\n| `checksum` | Check-digit integrity: IBAN, GTIN/GS1, NHS, ISIN, LEI, VIN, CPF, ISRC |\n| `unique` | No duplicates within a batch (batch mode only) |\n| `cross_field_range` | Value must be between two other fields in the same record |\n| `field_sum` | Sum of named fields must equal a target (within optional tolerance) |\n| `geospatial_bounds` | Lat/lon pair within a bounding box |\n| `date_diff` | Difference between two date fields within a range |\n| `age_match` | Declared age consistent with date-of-birth field |\n\nRules have `severity: error` (blocks the record) or `severity: warning` (flags but allows).\nAny rule can include a `condition` block to apply it only when another field equals a given value.\n\nFull reference: [docs/rules/](docs/rules/)\n\n---\n\n## How it compares\n\nA mature data governance programme operates across three layers, each with a distinct job:\n\n| Layer | Purpose | Tools |\n|---|---|---|\n| **1. Write-time enforcement** | Prevent bad data from entering any system | **OpenDQV** |\n| **2. Catalog / governance / stewardship** | Ownership, glossary, lineage, policy, stewardship workflows | Alation, Atlan, Collibra, Purview, DataHub, Marmot |\n| **3. Pipeline testing / observability** | Detect drift, freshness issues, residual quality after ingestion | Great Expectations, Soda Core, dbt tests, Monte Carlo |\n\nOpenDQV Core owns layer one. 
Your catalog handles layer two; your pipeline tools handle layer three.\n\n| | Great Expectations / Soda / dbt | OpenDQV |\n|---|---|---|\n| **When** | After data lands (in warehouse/lake) | Before data is written (at the door) |\n| **Where** | Data pipelines, batch jobs | Source system integration points |\n| **Model** | Scan data at rest | Validate data in flight |\n| **Latency** | Minutes to hours (batch) | Milliseconds (API call) |\n| **Who calls it** | Data engineers | Data engineers, developers, CRM admins |\n\n**They're complementary.** Use Great Expectations to monitor your warehouse. Use OpenDQV to stop bad data from getting there in the first place.\n\n---\n\n## Contracts\n\n43 production-ready contracts ship inside the `opendqv` package covering GDPR, HIPAA, SOX, MiFID II,\nUK Building Safety Act, Martyn's Law, Natasha's Law, Ofcom Online Safety Act, EU DORA,\nand 20+ other regulatory frameworks across UK, EU, and US. `pip install opendqv` gives you all of them\n— `opendqv list` works with zero configuration.\n\nSee [docs/compliance-contracts.md](docs/compliance-contracts.md) for the full list with\nregulatory context, or browse [opendqv/contracts/](opendqv/contracts/) directly.\n17 minimal starter templates are in [examples/starter_contracts/](examples/starter_contracts/).\n\n---\n\n## Performance\n\nEC2 c6i.large, 2 workers, 12-rule contract, mixed 50/50 workload:\n**~482 req/s, p99 ~182 ms.** Sizing rule: `WEB_CONCURRENCY = number of vCPUs`.\n\nSee [docs/benchmark_throughput.md](docs/benchmark_throughput.md) for full platform comparison,\nmethodology, and monthly volume extrapolation.\n\n---\n\n## Documentation\n\n| | |\n|---|---|\n| [Quickstart](docs/quickstart.md) | Build your first contract in 15 minutes |\n| [Rules Reference](docs/rules/) | All rule types with parameters and examples |\n| [Compliance Contracts](docs/compliance-contracts.md) | 43 contracts with regulatory context |\n| [API Reference](docs/index.md) | REST endpoints, SDK, GraphQL, 
webhooks |\n| [Security](SECURITY.md) | Deployment checklist, threat model, RBAC |\n| [Production Deployment](docs/production_deployment.md) | Token auth, TLS, Docker Compose, hardening |\n| [Integrations](docs/index.md) | Salesforce, Kafka, Snowflake, dbt, Databricks, MCP, and more |\n| [All docs →](docs/) | 76 documentation files |\n\n---\n\n## API Stability\n\nOpenDQV is in Beta as of 2.0.0. The following stability commitments apply to the v2.x series:\n\n- **REST API endpoints** — paths, request bodies, and response shapes are stable within `v2.x`. Backwards-incompatible changes require a major version bump and follow a deprecation cycle (one minor release of warnings before removal).\n- **YAML contract format** — the contract schema (rules, fields, types) is stable within `v2.x`. New rule types may be added; existing rules will not change semantics without a deprecation cycle.\n- **Python SDK** — `OpenDQVClient`, `AsyncOpenDQVClient`, and `LocalValidator` public method signatures are stable within `v2.x`. Internal helpers (prefixed `_`) are not covered.\n- **MCP tools** — tool names and parameters are stable within `v2.x`.\n- **Security fixes** — backported to the latest 2.x line on a best-effort basis.\n\n### Known limitations in v2.2.x\n\n- **Rule null handling is inconsistent.** Most format rules fail when the target\n  field is missing; a few (`max_length`, `allowed_values`) pass silently;\n  `field_sum` and `ratio_check` coerce missing operands to `0`. Single-record\n  and batch paths disagree in a few cases. See\n  [`docs/rules/core_rules.md`](docs/rules/core_rules.md#null-handling-current-v22x-behaviour)\n  for the full matrix and the safe pattern to use today. v2.3.0 will make this\n  consistent (loud-by-default with an `optional: true` opt-out).\n- **Unknown rule types pass silently at runtime.** A typo in `type:` (e.g.\n  `min_lenght`) is caught by `opendqv lint` but not by the engine — a typo'd\n  rule is a disabled rule. Always lint before deploy. 
v2.3.0 will reject\n  unknown types at contract load.\n\n---\n\n## Contributing\n\nSee [CONTRIBUTING.md](CONTRIBUTING.md) for setup instructions, coding guidelines, and how to submit changes.\n\n## License\n\nMIT — see [LICENSE](LICENSE).\n\n## Acknowledgements\n\n**Led by [Sunny Sharma](https://uk.linkedin.com/in/sunny-sharma-3927632), [BGMS Consultants Ltd](https://www.bgmsconsultants.com).** The vision, the architecture, every contract, and every design decision in this repository are directed by a human who believes data quality is a write-time responsibility.\n\nOpenDQV is built with a hybrid team. Sunny leads — carbon and silicon. Three AI collaborators execute: Claude Sonnet 4.6 (primary developer), Claude Opus 4.6 (strategic auditor), and Grok (market intelligence). All answer to the same ethos: *trust is easier to build than to repair.*\n\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fopendqv%2Fopendqv","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fopendqv%2Fopendqv","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fopendqv%2Fopendqv/lists"}