https://github.com/redgerd/databricks-e2e-data-lakehouse
End-to-end data engineering pipeline on Azure Databricks. It covers data ingestion with Auto Loader and Structured Streaming, ETL with PySpark, dimensional modeling (star schema), and Slowly Changing Dimensions (SCD). The entire pipeline is orchestrated with Delta Live Tables (DLT).
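The description mentions Slowly Changing Dimensions without saying which type the repository implements. As a rough illustration only, the sketch below shows the general idea of an SCD Type 2 update in plain Python (not PySpark or DLT, and not the repository's code): when a tracked attribute changes, the current row is closed and a new version is opened. The names `DimRow` and `apply_scd2`, the field names, and the choice of Type 2 are all assumptions made for this example.

```python
from dataclasses import dataclass, replace
from datetime import date
from typing import Optional


@dataclass(frozen=True)
class DimRow:
    """One version of a dimension record (hypothetical layout)."""
    key: str                          # business key
    attrs: tuple                      # tracked attributes as sorted (name, value) pairs
    valid_from: date                  # date this version became current
    valid_to: Optional[date] = None   # None means the version is still current


def apply_scd2(dim, updates, as_of):
    """Return a new list of rows after applying `updates` (key -> attrs dict).

    Changed keys get their current row closed at `as_of` and a new current
    row appended; unchanged keys are left alone; unseen keys are inserted.
    """
    result = []
    keys_with_current_row = set()
    for row in dim:
        incoming = updates.get(row.key)
        if row.valid_to is None and incoming is not None:
            keys_with_current_row.add(row.key)
            incoming_attrs = tuple(sorted(incoming.items()))
            if incoming_attrs != row.attrs:
                # Close the old version and open a new current one.
                result.append(replace(row, valid_to=as_of))
                result.append(DimRow(row.key, incoming_attrs, as_of))
                continue
        result.append(row)
    # Keys that have no current row yet are inserted as new versions.
    for key, attrs in updates.items():
        if key not in keys_with_current_row:
            result.append(DimRow(key, tuple(sorted(attrs.items())), as_of))
    return result
```

A call such as `apply_scd2(dim, {"c1": {"city": "Bergen"}}, date(2025, 1, 1))` would keep the earlier version of `c1` as history and add a new current row. In a Databricks pipeline the same logic would more likely be expressed as a Delta merge or a DLT change-data-capture flow, which this sketch does not attempt to reproduce.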
Last synced: 9 months ago
- Host: GitHub
- URL: https://github.com/redgerd/databricks-e2e-data-lakehouse
- Owner: Redgerd
- Created: 2025-08-25T14:31:06.000Z (10 months ago)
- Default Branch: main
- Last Pushed: 2025-09-02T04:23:47.000Z (10 months ago)
- Last Synced: 2025-09-02T06:14:02.184Z (10 months ago)
- Language: Jupyter Notebook
- Size: 1000 Bytes
- Stars: 0
- Watchers: 0
- Forks: 0
- Open Issues: 0
Metadata Files: