https://github.com/airscholar/modern-data-eng-dbt-databricks-azure
In this project, we set up an end-to-end data engineering pipeline using Apache Spark, Azure Databricks, and Data Build Tool (DBT), with Azure as our cloud provider.
apache-spark azure databricks dbt modern-data-engineering
- Host: GitHub
- URL: https://github.com/airscholar/modern-data-eng-dbt-databricks-azure
- Owner: airscholar
- Created: 2023-12-18T15:00:39.000Z (about 1 year ago)
- Default Branch: main
- Last Pushed: 2023-12-18T15:35:38.000Z (about 1 year ago)
- Last Synced: 2024-04-18T02:57:13.347Z (9 months ago)
- Topics: apache-spark, azure, databricks, dbt, modern-data-engineering
- Homepage: https://youtu.be/divjURi-low
- Size: 118 KB
- Stars: 11
- Watchers: 3
- Forks: 9
- Open Issues: 0
Metadata Files:
- Readme: README.md
README
# Modern Data Engineering with Medallion Architecture using DBT, Databricks, Spark and Azure Cloud
In this project, we set up an end-to-end data engineering pipeline using Apache Spark, Azure Databricks, and Data Build Tool (DBT), with Azure as our cloud provider. The project illustrates data ingestion into the lakehouse, data integration with Azure Data Factory (ADF), and data transformation with Databricks and DBT.

## System Architecture
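As a rough illustration of the medallion layering named in the title, the sketch below moves a few records through bronze (raw), silver (cleaned), and gold (aggregated) stages. It is plain Python, not the project's actual code: in the real pipeline these stages would be lakehouse tables transformed with Spark and DBT models. The record fields (`order_id`, `customer`, `amount`) and the helper names `to_silver` and `to_gold` are hypothetical and chosen only for the example.

```python
from collections import defaultdict
from typing import Dict, List, Optional

# Hypothetical raw records, standing in for data landed in the bronze layer.
bronze: List[Dict[str, Optional[str]]] = [
    {"order_id": "1", "customer": " alice ", "amount": "20.50"},
    {"order_id": "2", "customer": "BOB", "amount": "5.00"},
    {"order_id": "2", "customer": "BOB", "amount": "5.00"},  # duplicate row
    {"order_id": "3", "customer": "alice", "amount": None},  # missing amount
]


def to_silver(rows):
    """Clean bronze rows: skip incomplete and duplicate rows, normalise types."""
    seen = set()
    silver = []
    for row in rows:
        if row["amount"] is None or row["order_id"] in seen:
            continue
        seen.add(row["order_id"])
        silver.append(
            {
                "order_id": int(row["order_id"]),
                "customer": row["customer"].strip().lower(),
                "amount": float(row["amount"]),
            }
        )
    return silver


def to_gold(rows):
    """Aggregate silver rows into a total amount per customer."""
    totals = defaultdict(float)
    for row in rows:
        totals[row["customer"]] += row["amount"]
    return dict(totals)


silver = to_silver(bronze)
gold = to_gold(silver)
print(gold)
```

Each layer only reads from the one before it, which is the property the medallion pattern relies on: raw data stays untouched in bronze, so the silver and gold steps can be rerun after a rule changes.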
![System Architecture.jpeg](System%20Architecture.jpeg)

## Commands
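The dbt commands listed in this section can also be chained from a script. The following is a minimal sketch, assuming the `dbt` executable is on the `PATH` and a project and profile are already configured. The function name `run_dbt_steps`, the step order, and the stop-on-first-failure behaviour are assumptions made for the example, not something the project prescribes. `dbt docs serve` is left out because it starts a local web server and does not return until stopped.

```python
import shlex
import subprocess
from typing import Callable, List, Sequence

# Commands taken from the README list; running them in this order is an assumption.
DBT_STEPS: List[str] = [
    "dbt run",
    "dbt test",
    "dbt snapshot",
    "dbt docs generate",
]


def _run_subprocess(argv: List[str]) -> int:
    """Default runner: execute the command and return its exit code."""
    return subprocess.run(argv).returncode


def run_dbt_steps(
    steps: Sequence[str] = DBT_STEPS,
    runner: Callable[[List[str]], int] = _run_subprocess,
) -> List[List[str]]:
    """Run each step in order, stopping at the first non-zero exit code.

    Returns the argument lists of the steps that completed successfully.
    """
    completed: List[List[str]] = []
    for step in steps:
        argv = shlex.split(step)
        if runner(argv) != 0:
            break
        completed.append(argv)
    return completed


if __name__ == "__main__":
    # Dry run: print each command instead of invoking dbt, so the sketch
    # works without a dbt project or a Databricks connection.
    def echo(argv: List[str]) -> int:
        print("would run:", " ".join(argv))
        return 0

    run_dbt_steps(runner=echo)
```

Passing the runner as a parameter keeps the sequencing logic testable without dbt installed. Calling `run_dbt_steps()` with no arguments invokes the real CLI.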
Try running the following commands:
- dbt run # for running models
- dbt test # for tests
- dbt snapshot # for snapshotting and slowly changing dimensions
- dbt docs generate # for documentation
- dbt docs serve # for documentation preview

## Resources:
* [Medium Article](https://medium.com/@yusuf.ganiyu/robust-data-pipelines-with-databricks-spark-dbt-and-azure-data-engineering-project-e5780fbc07a6)
* [DBT](https://docs.getdbt.com/guides)
* [Databricks](https://docs.databricks.com/)
* [Azure](https://docs.microsoft.com/en-us/azure/?product=featured)
* [Azure Data Factory](https://docs.microsoft.com/en-us/azure/data-factory/)
* [Azure Data Lake Storage Gen2](https://docs.microsoft.com/en-us/azure/storage/blobs/data-lake-storage-introduction)

### YouTube Video
[![Modern Data Engineering](https://img.youtube.com/vi/divjURi-low/0.jpg)](https://youtu.be/divjURi-low)