{"id":50335597,"url":"https://github.com/databricks-solutions/databricks-dbt-training-labs","last_synced_at":"2026-05-29T13:30:24.947Z","repository":{"id":357320314,"uuid":"1218465407","full_name":"databricks-solutions/databricks-dbt-training-labs","owner":"databricks-solutions","description":null,"archived":false,"fork":false,"pushed_at":"2026-05-12T08:37:12.000Z","size":6304,"stargazers_count":0,"open_issues_count":0,"forks_count":0,"subscribers_count":0,"default_branch":"main","last_synced_at":"2026-05-12T10:14:22.520Z","etag":null,"topics":[],"latest_commit_sha":null,"homepage":null,"language":null,"has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"other","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/databricks-solutions.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE.md","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":"CODEOWNERS","security":"SECURITY.md","support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null,"zenodo":null,"notice":"NOTICE.md","maintainers":null,"copyright":null,"agents":null,"dco":null,"cla":null}},"created_at":"2026-04-22T22:53:15.000Z","updated_at":"2026-05-12T08:07:56.000Z","dependencies_parsed_at":null,"dependency_job_id":null,"html_url":"https://github.com/databricks-solutions/databricks-dbt-training-labs","commit_stats":null,"previous_names":["databricks-solutions/databricks-dbt-training-labs"],"tags_count":null,"template":false,"template_full_name":null,"purl":"pkg:github/databricks-solutions/databricks-dbt-training-labs","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/databricks-solutions%2Fdatabricks-dbt-training-labs","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/databricks-solutions%2Fdatabricks-dbt-training-labs/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/databricks-solutions%2Fdatabricks-dbt-training-labs/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/databricks-solutions%2Fdatabricks-dbt-training-labs/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/databricks-solutions","download_url":"https://codeload.github.com/databricks-solutions/databricks-dbt-training-labs/tar.gz/refs/heads/main","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/databricks-solutions%2Fdatabricks-dbt-training-labs/sbom","scorecard":null,"host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":286080680,"owners_count":33655440,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2026-05-26T15:22:16.424Z","status":"online","status_checked_at":"2026-05-29T02:00:06.066Z","response_time":107,"last_error":null,"robots_txt_status":"success","robots_txt_updated_at":"2025-07-24T06:49:26.215Z","robots_txt_url":"https://github.com/robots.txt","online":true,"can_crawl_api":true,"host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":[],"created_at":"2026-05-29T13:30:23.337Z","updated_at":"2026-05-29T13:30:24.938Z","avatar_url":"https://github.com/databricks-solutions.png","language":null,"funding_links":[],"categories":[],"sub_categories":[],"readme":"# databricks-dbt-training-labs\n\nSelf-service, hands-on labs that complement an upcoming [dbt](https://www.getdbt.com/) on [Databricks](https://www.databricks.com/) course. The labs let you work through the course material at your own pace in your own workspace, taking you from a clean dbt project to production-grade models on Unity Catalog with optional deployment via [Declarative Automation Bundles (DABs)](https://docs.databricks.com/dev-tools/bundles) and Lakeflow Jobs.\n\n## Mission\n\n- **Course companion.** Each lab maps to a section of the dbt-on-Databricks course and is designed to be opened, read, and run end-to-end without an instructor.\n- **Self-service.** Clone the repo, fill in your own workspace values locally, and run. No shared environment, no shared state.\n- **Production-shaped, not toy.** Models follow the staging + marts layering, use Unity Catalog with `dbt-databricks`, and are deployable as Lakeflow Jobs via DABs.\n- **Public-safe.** Everything tracked here is portable across any Databricks workspace; workspace-specific config stays local (see [`local_deployment/`](local_deployment/)).\n\n## What's inside\n\nThe first lab is the canonical [Jaffle Shop](https://github.com/dbt-labs/jaffle-shop-classic) dataset, modelled end-to-end with `dbt-databricks` against a SQL Warehouse. It is intentionally a **first step**: a small, well-known schema that lets you focus on getting the local dev loop, profiles, seeds, sources, staging/marts layering, tests, packages, and DAB deployment working before tackling harder modelling problems.\n\nFuture labs will build on this foundation with **several deeper and more diverse examples of data modelling in Databricks**, planned to cover patterns such as:\n\n- Slowly Changing Dimensions (Type 1 / 2) using dbt snapshots\n- Incremental models with merge / append / insert-overwrite strategies on Delta\n- Streaming sources and CDC ingestion patterns (DLT / Lakeflow + dbt)\n- Star and dimensional modelling beyond the Jaffle toy schema\n- Data Vault style raw / business vault layers\n- Wide event / semi-structured (JSON, variants) modelling\n- Performance tuning: liquid clustering, partitioning, Z-ORDER, photon, materialization choice\n- Unity Catalog governance: row/column masking, tags, lineage-aware models\n- Multi-environment promotion (dev / staging / prod) via DAB targets and CI\n\nEach subsequent lab lands as its own top-level project folder, mirroring the `jaffle_shop/` layout, so you can run them independently in the same workspace.\n\n| Path | Purpose |\n|------|---------|\n| [`jaffle_shop/`](jaffle_shop/) | Lab 01 — the dbt project: staging + marts models, seeds, macros, tests, packages. |\n| [`local_deployment/`](local_deployment/) | Workspace-specific DAB and deploy config. Only the README is tracked; everything else stays local. |\n| [`.cursor/rules/`](.cursor/rules/) | Cursor agent guardrails: public-repo compliance, branch conventions, coding standards. |\n\n## Prerequisites\n\n- A Databricks workspace with Unity Catalog enabled\n- A SQL Warehouse (Serverless or Classic)\n- A catalog and schema you own (or the right to create one)\n- [Databricks CLI](https://docs.databricks.com/dev-tools/cli/install.html) v0.218+ configured with a profile\n- Python 3.10+\n- `dbt-databricks\u003e=1.8.0,\u003c2.0.0`\n\n## Quick start (local dbt run)\n\n```bash\ncd jaffle_shop\n\n# 1. Copy the templates -- both copies are gitignored.\ncp .env.template .env\ncp profiles.yml.example profiles.yml\n\n# 2. Open .env in your editor and fill in REAL values for every variable:\n#      DATABRICKS_HOST          -\u003e https://\u003cyour-workspace\u003e.cloud.databricks.com\n#      DATABRICKS_HTTP_PATH     -\u003e /sql/1.0/warehouses/\u003cyour-warehouse-id\u003e\n#      DATABRICKS_CATALOG       -\u003e \u003cyour-catalog\u003e\n#      DATABRICKS_SCHEMA        -\u003e jaffle_shop  (or any schema you own)\n#      DATABRICKS_TOKEN         -\u003e \u003cyour-pat\u003e   (or use OAuth, see below)\n#    profiles.yml reads these via env_var() at runtime, so you only fill in .env.\n\n# 3. Install dbt and run.\npython -m venv .venv \u0026\u0026 source .venv/bin/activate\npip install -r requirements.txt\n\nset -a \u0026\u0026 source .env \u0026\u0026 set +a       # export the env vars to dbt\ndbt deps\ndbt debug                             # confirms connection before you run anything\ndbt seed --full-refresh --vars '{\"load_source_data\": true}'\ndbt run\ndbt test\n```\n\n`profiles.yml` and `.env` are gitignored. Never commit them.\n\n## Optional: deploy as a Lakeflow Job via DAB\n\nThe DAB bundle (`databricks.yml` at the repo root) is workspace-specific and therefore gitignored. See [local_deployment/README.md](local_deployment/README.md) for a placeholder bundle you can copy and fill in for your environment.\n\n### Authenticate the CLI\n\n`databricks bundle …` needs a CLI profile that points at the same workspace you're deploying to. The recommended approach is a [service principal](https://docs.databricks.com/dev-tools/auth.html#service-principal-oauth) with an OAuth client_id / client_secret -- no PATs, no expiring tokens. Add a profile to `~/.databrickscfg`:\n\n```ini\n[dbt-ws-profile]\nhost          = https://\u003cyour-workspace\u003e.cloud.databricks.com\nclient_id     = \u003cservice-principal-client-id\u003e\nclient_secret = \u003cservice-principal-oauth-secret\u003e\n```\n\nIf you'd rather use a PAT for a quick test, replace the OAuth pair with `token = \u003cyour-pat\u003e`.\n\n### Validate, deploy, run\n\nFrom the repo root, pass `--profile` to every bundle command (or set `DATABRICKS_CONFIG_PROFILE=dbt-ws-profile` once for the shell):\n\n```bash\ndatabricks bundle validate --profile=dbt-ws-profile\ndatabricks bundle deploy   --profile=dbt-ws-profile\ndatabricks bundle run jaffle_shop_dbt_job --profile=dbt-ws-profile\n```\n\n## How to get help\n\nDatabricks does not offer official support for this content. For questions or bugs, please open a GitHub issue and the team will help on a best-effort basis.\n\n## Security\n\nPlease report security issues per [SECURITY.md](SECURITY.md).\n\n## License\n\nSee [LICENSE.md](LICENSE.md) and [NOTICE.md](NOTICE.md). All third-party packages used by this lab are listed in `jaffle_shop/packages.yml` and `jaffle_shop/requirements.txt`; refer to each project's own license.\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fdatabricks-solutions%2Fdatabricks-dbt-training-labs","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fdatabricks-solutions%2Fdatabricks-dbt-training-labs","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fdatabricks-solutions%2Fdatabricks-dbt-training-labs/lists"}