https://github.com/llucmh/dbt-checks
Reusable, declarative data quality checks for dbt projects.
- Host: GitHub
- URL: https://github.com/llucmh/dbt-checks
- Owner: LlucMH
- License: mit
- Created: 2025-11-18T23:02:38.000Z (7 months ago)
- Default Branch: main
- Last Pushed: 2026-04-03T23:11:28.000Z (2 months ago)
- Last Synced: 2026-04-03T23:26:18.003Z (2 months ago)
- Topics: analytics, analytics-engineering, bigquery, data-quality, data-quality-checks, data-validation, data-validation-library, databricks, date-engineering, dbt, duckdb, postgres, snowflake, sql, testing, testing-framework, testing-tool
- Homepage: https://github.com/LlucMH/dbt-checks
- Size: 1.91 MB
- Stars: 2
- Watchers: 0
- Forks: 0
- Open Issues: 0
Metadata Files:
- Readme: README.md
- Changelog: CHANGELOG.md
- Contributing: CONTRIBUTING.md
- License: LICENSE
---
**`dbt-checks`** is a lightweight library of reusable data quality checks for dbt projects.
It provides simple, expressive tests to validate business rules and data integrity directly in your models — without writing custom SQL every time.
> ⚠️ Early-stage project — feedback and contributions are welcome.
---
# Installation
Add the package to your `packages.yml`:
```yaml
packages:
  - git: https://github.com/LlucMH/dbt-checks.git
    revision: v0.3.2
```
Then install dependencies:
``` bash
dbt deps
```
💡 Always pin a version in production projects.
# Usage
Checks can be added directly to models or columns in your schema files.
Example:
``` yaml
models:
  - name: orders
    columns:
      - name: value
        data_tests:
          - dbt_checks.non_negative
          - dbt_checks.between_values:
              arguments:
                min_value: 0
                max_value: 10000
```
Run tests as usual:
``` bash
dbt test
```
# Scoped Checks with `where`
All checks support an optional `where` argument to apply validations only to a subset of rows.
This is useful when you want to validate specific business segments, statuses, partitions, or recent data.
Example:
```yaml
models:
  - name: orders
    columns:
      - name: value
        data_tests:
          - dbt_checks.greater_than:
              arguments:
                value: 0
                where: "status = 'active'"
```
The `where` expression filters the rows before the check runs, so rows that do not match it are never evaluated.
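Because `where` is available on every check, the same pattern should also apply to model-level checks. A minimal sketch, assuming `where` is passed under `arguments` in the same way as for column-level checks, and reusing the `status` column from the example above:

```yaml
models:
  - name: orders
    data_tests:
      # Count only active orders; rows with any other status are filtered out first
      - dbt_checks.row_count_greater_than:
          arguments:
            value: 100
            where: "status = 'active'"
```

In this sketch the threshold of 100 applies to the filtered subset, not to the whole model.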
# Standardized Failure Output
dbt-checks provides standardized and human-readable failure outputs designed for easier debugging and CI visibility.
Instead of generic outputs like:
```text
Got 1 result, configured to fail if != 0
```
checks now expose contextual failure information.
## Row-level checks
Example output:
| failing_value | expected_min_value | failed_check | failure_reason |
| --- | --- | --- | --- |
| -5 | 0 | non_negative | Value must be greater than or equal to 0 |
Used by:
- numeric checks
- string checks
- most temporal checks
---
## Aggregation checks
Example output:
| actual_value | expected_min_value | expected_max_value |
| --- | --- | --- |
| 1500 | 0 | 1000 |
Used by:
- avg_between
- sum_between
- min_between
- max_between
- row_count_between
---
## Ratio checks
Example output:
| actual_ratio | expected_min_ratio | expected_max_ratio |
| --- | --- | --- |
| 0.92 | 0.0 | 0.80 |
Used by:
- null_ratio_between
- positive_ratio_between
- negative_ratio_between
- value_ratio_between
---
## Additional Context
Checks may also expose:
- `failed_check`
- `failure_reason`
- `applied_condition`
- `actual_length`
- `actual_diff_days`
- `actual_day_of_week`
This makes dbt-checks outputs easier to:
- debug in CI
- inspect in stored failures
- integrate with observability tooling
- consume programmatically
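To inspect these columns after a run, failing rows can be persisted with dbt's standard `store_failures` test config. The sketch below assumes dbt-checks tests accept the usual dbt test configs like any other generic test; `store_failures` and `severity` are dbt features, not dbt-checks arguments.

```yaml
models:
  - name: orders
    columns:
      - name: value
        data_tests:
          - dbt_checks.non_negative:
              config:
                store_failures: true   # persist failing rows in the warehouse
                severity: warn         # report failures without failing the run
```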
# NULL Handling
dbt-checks follows a consistent and explicit null-handling strategy.
Most checks ignore null values by default.
Use dedicated checks to validate null presence.
## Summary
- Numeric → ignored
- String → ignored
- Temporal → ignored
- Aggregation → ignored (SQL behavior)
- Row count → includes nulls
- Ratio checks → explicit handling
The dedicated null checks are:
- `null_ratio_below`
- `null_ratio_between`
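Since value checks skip nulls, a column with many nulls could still pass them. A minimal sketch of pairing a value check with an explicit null check, using only arguments that appear elsewhere in this README (`threshold` for `null_ratio_below`):

```yaml
models:
  - name: orders
    columns:
      - name: value
        data_tests:
          # Ignores nulls: only non-null values are compared against 0
          - dbt_checks.non_negative
          # Explicit null check: keep the null ratio below 5%
          - dbt_checks.null_ratio_below:
              arguments:
                threshold: 0.05
```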
# Available Checks
dbt-checks provides reusable data validation tests grouped by category.
## Numeric
Numeric checks validate numeric ranges and thresholds.
Check | Description
----- | ----------
`non_negative` | Ensures values are ≥ 0
`non_positive` | Ensures values are ≤ 0
`greater_than` | Ensures values are greater than a threshold
`greater_or_equal_than` | Ensures values are ≥ a threshold
`less_than` | Ensures values are less than a threshold
`less_or_equal_than` | Ensures values are ≤ a threshold
`between_values` | Ensures values fall within a numeric range
Example
``` yaml
columns:
  - name: value
    data_tests:
      - dbt_checks.between_values:
          arguments:
            min_value: 0
            max_value: 100
```
## String
String checks validate textual fields such as identifiers or formatted values.
Check | Description
----- | ----------
`not_blank` | Ensures strings are not empty or whitespace
`length_between` | Validates string length range
`matches_regex` | Validates a regex pattern
`starts_with` | Ensures string starts with prefix
`ends_with` | Ensures string ends with suffix
`contains` | Ensures string contains substring
Example
``` yaml
columns:
  - name: email
    data_tests:
      - dbt_checks.matches_regex:
          arguments:
            pattern: "^[^@]+@[^@]+\\.[^@]+$"
```
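Several string checks can be stacked on one column. The sketch below assumes `not_blank` takes no arguments (like `non_negative` in the usage example); the `order_code` column and its `ORD-` format are hypothetical.

```yaml
columns:
  - name: order_code   # hypothetical column
    data_tests:
      # Assumed to take no arguments
      - dbt_checks.not_blank
      - dbt_checks.matches_regex:
          arguments:
            # Single quotes avoid YAML backslash escaping inside the pattern
            pattern: '^ORD-[0-9]+$'
```

Regex syntax varies between warehouses, so a pattern may need adjusting per adapter.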
## Temporal
Temporal checks validate date and timestamp fields.
Check | Description
----- | ----------
`not_future_date` | Ensures date is not in the future
`not_before_date` | Ensures date is not earlier than a minimum date
`between_dates` | Ensures date is within a range
`recent_date` | Ensures date is no older than N days
`date_diff_less_than` | Ensures difference between two dates is within threshold
`no_weekend_dates` | Ensures dates do not fall on weekends
Example
``` yaml
columns:
  - name: date
    data_tests:
      - dbt_checks.recent_date:
          arguments:
            max_age_days: 7
```
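Temporal checks can be combined with a `where` filter to focus on the rows that matter. A minimal sketch, assuming `not_future_date` takes no arguments and that `where` sits under `arguments` as in the scoped-checks example; the `created_at` and `status` columns are illustrative:

```yaml
columns:
  - name: created_at   # hypothetical column
    data_tests:
      # Assumed to take no arguments
      - dbt_checks.not_future_date
      # Only active rows must have been created within the last 7 days
      - dbt_checks.recent_date:
          arguments:
            max_age_days: 7
            where: "status = 'active'"
```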
## Aggregation
Aggregation checks validate dataset-level metrics.
Nulls follow SQL behavior (ignored in aggregation).
Check | Description
----- | ----------
`row_count_greater_than` | Ensures model has at least N rows
`row_count_less_than` | Ensures model has at most N rows
`row_count_between` | Ensures row count falls within range
`sum_between` | Ensures column sum falls within range
`avg_between` | Ensures column average falls within range
`max_between` | Ensures column maximum falls within range
`min_between` | Ensures column minimum falls within range
**If all values in the column are null, the test fails.**
Example
``` yaml
models:
  - name: orders
    data_tests:
      - dbt_checks.row_count_greater_than:
          arguments:
            value: 100
```
## Ratio
Ratio checks validate proportions of rows matching a condition.
Check | Description
----- | ----------
`null_ratio_below` | Ensures null ratio is below threshold
`null_ratio_between` | Ensures null ratio is within range
`positive_ratio_between` | Ensures positive value ratio within range
`negative_ratio_between` | Ensures negative value ratio within range
`value_ratio_between` | Ensures specific value ratio within range
**Null handling:**
- null_ratio_* explicitly evaluates nulls
- others use total row count as denominator
Example
``` yaml
columns:
  - name: email
    data_tests:
      - dbt_checks.null_ratio_below:
          arguments:
            threshold: 0.05
```
# Supported Warehouses
`dbt-checks` is designed to work across common dbt adapters:
- Snowflake
- BigQuery
- Databricks
- Spark
- Redshift
- Postgres
Adapter-specific behavior is handled through dbt's `dispatch` mechanism.
**Tested on DuckDB in CI.**
**Additional adapters are supported through dbt dispatch (best-effort compatibility).**
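dbt's `dispatch` config also lets a project override a package macro for a specific adapter. A sketch for `dbt_project.yml`, assuming the package's macro namespace is `dbt_checks` (inferred from the `dbt_checks.` test prefix) and using `my_project` as a placeholder project name:

```yaml
# dbt_project.yml
dispatch:
  # Assumed namespace, inferred from the `dbt_checks.` test prefix
  - macro_namespace: dbt_checks
    # 'my_project' is a placeholder for your own project name
    search_order: ['my_project', 'dbt_checks']
```

With this search order, dbt looks for adapter-specific implementations in `my_project` before falling back to the package.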
# Why dbt-checks?
Many dbt projects repeatedly implement the same validation logic.
`dbt-checks` provides:
- reusable checks
- simple configuration
- scoped checks with optional `where` filters
- standardized failure outputs
- CI-friendly debugging context
- predictable null handling
- consistent validation patterns
- cross-warehouse compatibility
- reusable internal helper architecture
- consistent SQL generation across checks
- centralized casting, predicates, ratios, and filtering logic
# Internal Architecture
`dbt-checks` uses reusable internal helper macros to standardize SQL generation across all checks.
Internal helpers include:
- casting helpers
- reusable predicates
- ratio utilities
- filter application helpers
- date utilities
- validation helpers
This improves:
- maintainability
- adapter compatibility
- consistency
- future extensibility
# Contributing
Contributions are welcome.
To add a new check:
1. Implement it in `macros/tests`
2. Reuse helper macros when possible
3. Add documentation
4. Add integration tests (including null behavior)
# License
This project is licensed under the MIT License — see the [LICENSE](LICENSE) file for details.