https://github.com/databrickslabs/dqx
Databricks framework to validate Data Quality of PySpark DataFrames and Tables
data-profiling data-quality data-quality-monitoring databricks lakeflow spark spark-streaming unity-catalog
- Host: GitHub
- URL: https://github.com/databrickslabs/dqx
- Owner: databrickslabs
- License: other
- Created: 2024-04-23T18:28:43.000Z (almost 2 years ago)
- Default Branch: main
- Last Pushed: 2026-02-05T22:03:18.000Z (about 2 months ago)
- Last Synced: 2026-02-05T22:51:35.025Z (about 2 months ago)
- Topics: data-profiling, data-quality, data-quality-monitoring, databricks, lakeflow, spark, spark-streaming, unity-catalog
- Language: Python
- Homepage: https://databrickslabs.github.io/dqx
- Size: 6.99 MB
- Stars: 372
- Watchers: 6
- Forks: 85
- Open Issues: 77
Metadata Files:
- Readme: README.md
- Changelog: CHANGELOG.md
- License: LICENSE
- Codeowners: CODEOWNERS
- Notice: NOTICE
README
DQX by Databricks Labs
===
Simplified Data Quality checking at Scale for PySpark Workloads on streaming and standard DataFrames.
[Build](https://github.com/databrickslabs/dqx/actions/workflows/push.yml)
[Coverage](https://codecov.io/github/databrickslabs/dqx)
[PyPI](https://pypi.org/project/databricks-labs-dqx/)

# Documentation
The complete documentation is available at: [https://databrickslabs.github.io/dqx/](https://databrickslabs.github.io/dqx/)
# Contribution
Please see the contribution guidance [here](https://databrickslabs.github.io/dqx/docs/dev/contributing/) on how to contribute to the project (build, test, and submit a PR).
# Project Support
Please note that this project is provided for your exploration only and is not
formally supported by Databricks with Service Level Agreements (SLAs). It is
provided AS-IS, and we do not make any guarantees. Please do not
submit a support ticket relating to any issues arising from the use of this project.
Any issues discovered through the use of this project should be filed as GitHub
[Issues on this repository](https://github.com/databrickslabs/dqx/issues).
They will be reviewed as time permits, but no formal SLAs for support exist.