{"id":46077862,"url":"https://github.com/infinitelambda/dbt-audit-helper-ext","last_synced_at":"2026-05-06T09:04:46.816Z","repository":{"id":264192162,"uuid":"892639565","full_name":"infinitelambda/dbt-audit-helper-ext","owner":"infinitelambda","description":"Extended Audit Helper solution 💪","archived":false,"fork":false,"pushed_at":"2026-04-15T05:07:09.000Z","size":7907,"stargazers_count":6,"open_issues_count":8,"forks_count":0,"subscribers_count":1,"default_branch":"main","last_synced_at":"2026-04-15T07:10:46.128Z","etag":null,"topics":["audit","dbt","extended","informatica","migration","validation"],"latest_commit_sha":null,"homepage":"","language":"Shell","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"apache-2.0","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/infinitelambda.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":"CONTRIBUTING.md","funding":null,"license":"LICENSE","code_of_conduct":"CODE_OF_CONDUCT.md","threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":"SECURITY.md","support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null,"zenodo":null,"notice":null,"maintainers":null,"copyright":null,"agents":null,"dco":null,"cla":null}},"created_at":"2024-11-22T13:49:42.000Z","updated_at":"2026-04-15T05:06:28.000Z","dependencies_parsed_at":"2024-11-22T15:22:09.669Z","dependency_job_id":"24883ea4-c89e-4a2b-ba94-79bdd6d1fcbc","html_url":"https://github.com/infinitelambda/dbt-audit-helper-ext","commit_stats":null,"previous_names":["infinitelambda/dbt-audit-helper-ext"],"tags_count":28,"template":false,"template_full_name":null,"purl":"pkg:github/infinitelambda/dbt-audit-helper-ext","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/infinitelambda%2Fdbt-audit-helper-ext","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/infinite
lambda%2Fdbt-audit-helper-ext/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/infinitelambda%2Fdbt-audit-helper-ext/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/infinitelambda%2Fdbt-audit-helper-ext/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/infinitelambda","download_url":"https://codeload.github.com/infinitelambda/dbt-audit-helper-ext/tar.gz/refs/heads/main","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/infinitelambda%2Fdbt-audit-helper-ext/sbom","scorecard":null,"host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":286080680,"owners_count":32686264,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2026-05-06T08:33:17.875Z","status":"ssl_error","status_checked_at":"2026-05-06T08:33:17.221Z","response_time":117,"last_error":"SSL_connect returned=1 errno=0 peeraddr=140.82.121.6:443 state=error: unexpected eof while reading","robots_txt_status":"success","robots_txt_updated_at":"2025-07-24T06:49:26.215Z","robots_txt_url":"https://github.com/robots.txt","online":false,"can_crawl_api":true,"host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["audit","dbt","extended","informatica","migration","validation"],"created_at":"2026-03-01T15:01:59.902Z","updated_at":"2026-05-06T09:04:46.809Z","avatar_url":"https://github.com/infinitelambda.png","language":"Shell","funding_links":[],"categories":[],"sub_categories":[],"readme":"\u003c!-- markdownlint-disable no-inline-html no-alt-text --\u003e\n# dbt-audit-helper-ext\n\n\u003cimg align=\"right\" width=\"150\" height=\"150\" 
src=\"https://raw.githubusercontent.com/infinitelambda/dbt-audit-helper-ext/main/docs/assets/img/il-logo.png\"\u003e\n\n**Extended Audit Helper solution 💪**\n\n[![docs](https://img.shields.io/badge/docs-visit%20folder-blue?style=flat\u0026logo=gitbook\u0026logoColor=white)](https://github.com/infinitelambda/dbt-audit-helper-ext/tree/main/docs)\n\n[![dbt-hub](https://img.shields.io/badge/Visit-dbt--hub%20↗️-FF694B?logo=dbt\u0026logoColor=FF694B)](https://hub.getdbt.com/infinitelambda/audit_helper_ext)\n[![support-snowflake](https://img.shields.io/badge/support-Snowflake-7faecd?logo=snowflake\u0026logoColor=7faecd)](https://docs.snowflake.com?ref=infinitelambda)\n[![support-bigquery](https://img.shields.io/badge/support-BigQuery-4285F4?logo=google-cloud\u0026logoColor=white)](https://cloud.google.com/bigquery/docs?ref=infinitelambda)\n[![support-databricks](https://img.shields.io/badge/support-Databricks-FF3621?logo=databricks\u0026logoColor=white)](https://docs.databricks.com?ref=infinitelambda)\n[![support-sqlserver](https://img.shields.io/badge/support-SQL%20Server-CC2927?logo=microsoft%20sql%20server\u0026logoColor=white)](https://docs.microsoft.com/en-us/sql/sql-server/?ref=infinitelambda)\n[![support-postgres](https://img.shields.io/badge/support-PostgreSQL-4169E1?logo=postgresql\u0026logoColor=white)](https://www.postgresql.org/docs/?ref=infinitelambda)\n[![support-dbt](https://img.shields.io/badge/support-dbt%20v1.7+-FF694B?logo=dbt\u0026logoColor=FF694B)](https://docs.getdbt.com?ref=infinitelambda)\n\nThis repository provides a collection of powerful macros designed to enhance data validation workflows that support:\n\n- _Historical Logging_: Automatically saving detailed validation results into a designated DWH table for comprehensive audit tracking\n- _Row-Level Detail Persistence_: Optionally persisting per-mart row-level comparison data for deep-dive investigation of mismatches\n- _Latest Summary Reporting_: Maintaining a concise, up-to-date summary 
table for quick insights into the current state of validations\n- _Codegen and Scripts_: Simplifying workflows, particularly valuable for migration projects by automating repetitive tasks\n\n**Data Warehouses**:\n\n- ❄️ Snowflake (default)\n- ☁️ BigQuery\n- 🧱 Databricks\n- ⛱️ SQL Server\n- 🐘 PostgreSQL\n\n\u003e **Upgrading to v0.9?** Check the [breaking changes](./docs/breaking-changes-v0.9.md) before you upgrade.\n\n## Installation\n\n- **Add to `packages.yml` file**:\n\n  ```yml\n  packages:\n    - package: infinitelambda/audit_helper_ext\n      version: [\"\u003e=0.1.0\", \"\u003c1.0.0\"]\n      # keep an eye on the latest version, and change it accordingly\n  ```\n\n  Or use the latest version from git:\n\n  ```yml\n  packages:\n    - git: \"https://github.com/infinitelambda/dbt-audit-helper-ext.git\"\n      revision: \u003crelease version or tag\u003e # 0.1.0\n  ```\n\n  And run `dbt deps` to install the package!\n\n- **Configure dispatch `search_order` in `dbt_project.yml` file** (only needed for SQL Server):\n\n  ```yml\n  dispatch:\n    - macro_namespace: audit_helper\n      search_order: ['audit_helper_ext', 'audit_helper']\n    - macro_namespace: dbt\n      search_order: ['audit_helper_ext', 'dbt']\n  ```\n\n- **Initialize the resources**:\n\n  ```bash\n  dbt deps\n  dbt run -s audit_helper_ext\n  ```\n\n  This step will create the log table (`validation_log`) and the summary view on top (`validation_log_report`).\n  When row-level detail persistence is enabled, per-mart detail tables (`validation_log_detail__\u003cmart_table\u003e`) are created automatically during validation runs.\n\n- **Generate the validation macros**:\n\n  \u003e Check the [`/scripts`](https://github.com/infinitelambda/dbt-audit-helper-ext/tree/main/scripts) directory for all the codegen utilities\n\n  Firstly, we need to determine the location (database and schema) of the source tables:\n\n  ** _If all source tables are in the same location_, we can use environment variables to set 
these values:\n\n  ```bash\n  export SOURCE_SCHEMA=MY_SOURCE_SCHEMA\n  export SOURCE_DATABASE=MY_SOURCE_DATABASE\n  ```\n\n  ** _If there are multiple locations_, we can configure the location inside each dbt model's `config` block:\n\n  ```sql\n  {{\n    config(\n      ...\n      audit_helper__source_database = 'MY_SOURCE_DATABASE',\n      audit_helper__source_schema = 'MY_SOURCE_SCHEMA'\n    )\n  }}\n  ...\n  ```\n\n  Then we can generate the validation macro files.\n  Let's say we need to validate all models in the `03_mart` directory:\n\n  ```bash\n  python dbt_packages/audit_helper_ext/scripts/create_validation_macros.py models/03_mart\n  ```\n\n  Or validate just one specific model, for example `03_mart/dim_sales`:\n\n  ```bash\n  python dbt_packages/audit_helper_ext/scripts/create_validation_macros.py \\\n    models/03_mart \\\n    dim_sales\n  ```\n\n  Finally, check the `macros/validation` directory of your dbt project!\n\n## Configuration\n\n### Query Pre-Hook\n\nFor adapter-specific query configurations (e.g., disabling parallel execution in PostgreSQL), you can use the `audit_helper__audit_query_pre_hooks` variable to execute SQL statements before each audit query:\n\n```yaml\nvars:\n  # PostgreSQL: Disable parallel execution to improve match rate\n  # (helps with window functions and double precision data type consistency)\n  audit_helper__audit_query_pre_hooks:\n    - 'SET max_parallel_workers_per_gather = 0'\n```\n\nYou can specify multiple pre-hook queries as a list. 
Each query will be executed sequentially before the audit query runs.\n\n**Example: Multiple pre-hooks**\n```yaml\nvars:\n  audit_helper__audit_query_pre_hooks:\n    - 'SET max_parallel_workers_per_gather = 0'\n    - 'SET work_mem = \"256MB\"'\n```\n\n**Use cases:**\n- **PostgreSQL**: Disable parallel execution to avoid discrepancies with window functions or double precision types\n- **Other adapters**: Set session-level configurations for performance tuning or behavior consistency\n\n## Validation Strategy\n\nThis repo contains **macros** that save the historical validation results into a DWH table ([`validation_log`](./models/validation_log.sql)), together with the latest summary table ([`validation_log_report`](./models/validation_log_report.sql)).\n\nThere are 3 main types of validation:\n\n- Count (`count`, [source](./macros/validation/get_validation_count.sql))\n- Schema (`schema`, [source](./macros/validation/get_validation_schema.sql))\n- Row by Row (`full`, [source](./macros/validation/get_validation_full.sql))\n\nAdditionally, there is a 4th type, `upstream_row_count` ([source](./macros/validation/get_upstream_row_count.sql)), which helps to understand the validation context. For example, _the result might show a 100% matched rate while there are 0 updates in the upstream models and hence no updates in the final table, so we cannot say for sure that it was a perfect match_.\n\n### Row-Level Detail Persistence\n\nWhen the `full` validation type runs, you can optionally persist the row-level comparison data into a dedicated detail table per mart model. 
This is incredibly useful for investigating mismatches without having to re-run the comparison query manually.\n\nEnable it by setting these variables in your `dbt_project.yml`:\n\n```yaml\nvars:\n  audit_helper__store_comparison_data: true         # enable row-level detail persistence\n  # audit_helper__store_matched_rows: false          # also persist identical rows (default: false)\n  # audit_helper__store_comparison_data_limit: none   # cap the number of sampled PKs (default: none = no limit)\n```\n\nEach detail table is created as `validation_log_detail__\u003cmart_table\u003e` in the same database/schema as `validation_log`, and includes:\n- All intersecting data columns from both relations (excluded columns are omitted)\n- Audit columns from `compare_and_classify_relation_rows` (e.g. `dbt_audit_row_status`, `dbt_audit_in_a`, `dbt_audit_in_b`)\n- Metadata columns: `dbt_audit_ext_mart_table`, `dbt_audit_ext_job_run_url`, `dbt_audit_ext_date_of_process`\n\nSee the [dbt Variables Reference](./docs/dbt-variables-reference.md) for full details on these variables.\n\nFor DX, we also have several other types:\n- Column by Column (`all_col`, [source](./macros/validation/get_validation_all_col.sql))\n- Count by Group (not available in `sh` script, [source](./macros/validation/get_validation_count_by_group.sql))\n- Show Column Conflicts (not available in `sh` script, [source](./macros/validation/show_validation_columns_conflicts.sql))\n\nThe validation strategy may vary from project to project. Therefore, in this package, we suggest one approach that we have used successfully in a real-life migration project (Informatica to dbt).\n\n**Context**: Our dbt project has 3 layers (staging, intermediate, and mart). Each mart model has an independent set of upstream models, i.e. an isolated pipeline per mart model. 
We want to validate mart models only.\n\n**Goal**: 100% matched rate ✅, \u003e=99% is still good 🟡, and below 99% is unacceptable ❌\n\n**Pre-requisites**: 2 consecutive snapshots (e.g. Day1, Day2) of both source data and mart tables\n\n**Flow**:\n\n- _Freeze the source data_, so we have `source__YYYYMMD1` and `source__YYYYMMD2`, `mart__YYYYMMD1` and `mart__YYYYMMD2`\n- _Scenario 1: Validate the fresh run against D1_\n  - Configure source yml to use `source__YYYYMMD1`\n  - Run dbt to build mart tables, called `mart_dbt`\n  - Run validation macros to compare `mart_dbt` vs `mart__YYYYMMD1` 👍\n- _Scenario 2: Validate the incremental run against D2 based on D1_\n  - Configure source yml to use `source__YYYYMMD2`\n  - Clone `mart__YYYYMMD1` to `mart_dbt` to mimic that dbt already has the D1 data (e.g. [clone_relation](./macros/dwh/clone_relation.sql) for a single table, or [clone_relation_extended](./macros/dwh/clone_relation_extended.sql) to also clone dependent JOIN/LOOKUP tables not maintained by the pipeline)\n  - Run dbt incrementally to build mart tables\n  - Run validation macros to compare `mart_dbt` vs `mart__YYYYMMD2` 👍👍\n\nFinally, check the validation log report and decide on the next steps:\n\n🛩️ Sample report table on Snowflake:\n\n![alt text](./docs/assets/img/snowflake-report-table.png)\n\n💡 Optionally, build a [Sheet](https://docs.google.com/spreadsheets/d/1473_-s3R9D1Sx117fzqhY8SqjnqtfDmni6qKw_9tLXE/edit?usp=sharing) to communicate the outcome to the client; here is the BigQuery+GGSheet sample:\n\n![alt text](./docs/assets/img/google-sheet-validation_resul.png)\n\n## Demo\n\n\u003cdiv\u003e\n  \u003ca href=\"https://www.loom.com/share/bb20f033d92544bab2009984d661176a\"\u003e\n    \u003cp\u003edbt-audit-helper Extension - First Version - Watch Video\u003c/p\u003e\n  \u003c/a\u003e\n  \u003ca href=\"https://www.loom.com/share/bb20f033d92544bab2009984d661176a\"\u003e\n    \u003cimg style=\"max-width:500px;\" 
src=\"https://cdn.loom.com/sessions/thumbnails/bb20f033d92544bab2009984d661176a-7f1a1827496781a6-full-play.gif\"\u003e\n  \u003c/a\u003e\n\u003c/div\u003e\n\n## How to Contribute\n\n`dbt-audit-helper-ext` is an open-source dbt package. Whether you are a seasoned open-source contributor or a first-time committer, we welcome and encourage you to contribute code, documentation, ideas, or problem statements to this project.\n\n👉 See the [CONTRIBUTING guideline](./CONTRIBUTING.md)\n\n🌟 And finally, kudos to **our beloved OG Contributors** who originally developed the macros and scripts in this package: [@William](https://www.linkedin.com/in/william-horel), [@Duc](https://www.linkedin.com/in/ducche), [@Csabi](https://www.linkedin.com/in/csaba-elekes-data), [@Adrien](https://www.linkedin.com/in/adrien-boutreau) \u0026 [@Dat](https://www.linkedin.com/in/datnguye)\n\n## About Infinite Lambda\n\nInfinite Lambda is a cloud and data consultancy. We build strategies, help organizations implement them, and pass on the expertise to look after the infrastructure.\n\nWe are an Elite Snowflake Partner, a Platinum dbt Partner, and a two-time Fivetran Innovation Partner of the Year for EMEA.\n\nNaturally, we love exploring innovative solutions and sharing knowledge, so go ahead and:\n\n🔧 Take a look around our [Git](https://github.com/infinitelambda)\n\n✏️ Browse our [tech blog](https://infinitelambda.com/category/tech-blog/)\n\nWe are also chatty, so:\n\n👀 Follow us on [LinkedIn](https://www.linkedin.com/company/infinite-lambda/)\n\n👋🏼 Or just [get in touch](https://infinitelambda.com/contacts/)\n\n[\u003cimg src=\"https://raw.githubusercontent.com/infinitelambda/cdn/1.0.0/general/images/GitHub-About-Section-1080x1080.png\" alt=\"About IL\" 
width=\"500\"\u003e](https://infinitelambda.com/)\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Finfinitelambda%2Fdbt-audit-helper-ext","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Finfinitelambda%2Fdbt-audit-helper-ext","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Finfinitelambda%2Fdbt-audit-helper-ext/lists"}