https://github.com/chathumiamarasinghe/dwh-project
Building a data warehouse project in two versions: `sqlserver_v1` (SQL Server implementation) and `snowflake_v2` (Snowflake implementation)
airflow airflow-on-docker bronze-silver-gold docker-compose medallion-architecture snowflake sql-server wsl2
Last synced: 4 months ago
- Host: GitHub
- URL: https://github.com/chathumiamarasinghe/dwh-project
- Owner: chathumiamarasinghe
- License: mit
- Created: 2025-10-20T12:01:37.000Z (8 months ago)
- Default Branch: main
- Last Pushed: 2025-11-28T16:56:38.000Z (7 months ago)
- Last Synced: 2025-11-30T21:54:50.275Z (7 months ago)
- Topics: airflow, airflow-on-docker, bronze-silver-gold, docker-compose, medallion-architecture, snowflake, sql-server, wsl2
- Language: TSQL
- Homepage:
- Size: 68.4 KB
- Stars: 0
- Watchers: 0
- Forks: 0
- Open Issues: 0
Metadata Files:
- Readme: README.md
- License: LICENSE
README
# 📦 Data Warehouse Project
This repository contains two versions of a data warehouse implementation for a fictional business use case.
Both follow the same business logic and data model, but use different platforms.
---
## 🚀 Available Versions
| Version | Technology Stack | Branch Name | Status |
|--------|------------------|-------------|--------|
| **v1** | SQL Server + SSIS | `sqlserver_v1` | ✔ Completed |
| **v2** | Snowflake + Tasks + Stored Procedures | `snowflake_v2` | ✔ Completed |
---
## 🏗️ Architecture
Both implementations are built using a **Medallion Architecture**:
Bronze → Silver → Gold
| Layer | Description |
|-------|------------|
| **Bronze** | Raw data ingestion (no transformations) |
| **Silver** | Cleaned, validated, standardized data |
| **Gold** | Business-ready tables, facts, dimensions |
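
The three layers in the table above can be illustrated with a minimal sketch. It uses Python's built-in `sqlite3` as a stand-in for SQL Server or Snowflake, so it is not the repository's actual code. The table names (`bronze_sales`, `silver_sales`, `gold_sales_by_customer`), the columns, and the sample rows are all hypothetical, and the gold layer is reduced to a single aggregate rather than full fact and dimension tables.

```python
import sqlite3


def build_layers(raw_rows):
    """Load raw rows into bronze, clean them into silver, aggregate into gold."""
    con = sqlite3.connect(":memory:")

    # Bronze: store the data exactly as received (all text, no transformations).
    con.execute(
        "CREATE TABLE bronze_sales (customer TEXT, product TEXT, amount TEXT)"
    )
    con.executemany("INSERT INTO bronze_sales VALUES (?, ?, ?)", raw_rows)

    # Silver: trim whitespace, standardize case, cast types,
    # and drop rows that have no usable amount.
    con.execute(
        """
        CREATE TABLE silver_sales AS
        SELECT UPPER(TRIM(customer)) AS customer,
               UPPER(TRIM(product))  AS product,
               CAST(amount AS REAL)  AS amount
        FROM bronze_sales
        WHERE TRIM(COALESCE(amount, '')) <> ''
        """
    )

    # Gold: a business-ready summary built only from the silver layer.
    con.execute(
        """
        CREATE TABLE gold_sales_by_customer AS
        SELECT customer,
               SUM(amount) AS total_amount,
               COUNT(*)    AS order_count
        FROM silver_sales
        GROUP BY customer
        """
    )
    return con


# Hypothetical sample data with typical raw-data problems:
# stray whitespace, inconsistent casing, and a missing amount.
sample_rows = [
    (" alice ", "widget", "10.50"),
    ("Alice", "gadget", "4.50"),
    ("bob", "widget", None),
    ("Bob ", "Widget", "20"),
]

con = build_layers(sample_rows)
for row in con.execute(
    "SELECT customer, total_amount, order_count "
    "FROM gold_sales_by_customer ORDER BY customer"
):
    print(row)
```

Each layer reads only from the one before it, so the raw bronze data stays untouched and the silver and gold tables can be rebuilt at any time.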
---
## 🧱 Architecture Overview
Both implementations follow the Medallion Architecture pattern:

    ┌──────────────┐
    │    Bronze    │  (Raw Data)
    └───────┬──────┘
            │
            ▼
    ┌──────────────┐
    │    Silver    │  (Cleaned + Standardized)
    └───────┬──────┘
            │
            ▼
    ┌──────────────┐
    │     Gold     │  (Analytics Models: Facts + Dimensions)
    └──────────────┘

## 📂 Repository Structure
📁 dwh-project
├── README.md
├── sqlserver_v1/ ← SQL Server Implementation
└── snowflake_v2/ ← Snowflake Implementation
---
## 🔧 How to Work with the Repo
### Clone the repository:
```sh
git clone https://github.com/chathumiamarasinghe/dwh-project.git
cd dwh-project
```
### Switch to a version:
```sh
git checkout sqlserver_v1
```
or
```sh
git checkout snowflake_v2
```
## 🧪 Data Sources Used
1. CRM system (Customer details)
2. Sales dataset
3. Product master data
## ▶️ Running the Pipeline
1️⃣ Start Airflow
```sh
docker compose up -d
```
2️⃣ Confirm DAGs are detected
```sh
airflow dags list
```
Expected DAGs:
- `bronze_layer_load`
- `silver_layer_load`
- `gold_layer_load`
- `full_etl_pipeline`
3️⃣ Trigger Pipeline Manually
```sh
airflow dags trigger full_etl_pipeline
```
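
The check in step 2 can also be automated before triggering the pipeline. The sketch below is a hypothetical helper, not part of the repository. It assumes the DAG list is captured as JSON (Airflow 2.x accepts `airflow dags list -o json`) and that each entry carries a `dag_id` key. The function name `missing_dags` is made up for illustration.

```python
import json

# DAG ids listed in this README as the expected result of `airflow dags list`.
EXPECTED_DAGS = {
    "bronze_layer_load",
    "silver_layer_load",
    "gold_layer_load",
    "full_etl_pipeline",
}


def missing_dags(dags_list_json, expected=EXPECTED_DAGS):
    """Return the expected DAG ids absent from the JSON DAG listing, sorted."""
    # Assumption: the input is a JSON array of objects with a "dag_id" key.
    found = {entry["dag_id"] for entry in json.loads(dags_list_json)}
    return sorted(set(expected) - found)
```

An empty result means all four DAGs were detected and `full_etl_pipeline` can be triggered. How the JSON is captured (on the host or inside a container) depends on the Docker Compose setup, which this README does not specify.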