https://github.com/nxion/sql-data-warehouse-project
Building a modern data warehouse with MS SQL server, ETL processes, data modeling and analyitics.
https://github.com/nxion/sql-data-warehouse-project
data data-analysis data-analytics data-engineering data-lakehouse data-warehouse datalake datascience etl etl-job medallion-architecture ms mssql sql sql-query sql-server
Last synced: 13 days ago
JSON representation
Building a modern data warehouse with MS SQL server, ETL processes, data modeling and analyitics.
- Host: GitHub
- URL: https://github.com/nxion/sql-data-warehouse-project
- Owner: nxion
- License: mit
- Created: 2025-03-02T17:59:14.000Z (over 1 year ago)
- Default Branch: main
- Last Pushed: 2025-03-02T18:19:30.000Z (over 1 year ago)
- Last Synced: 2025-03-02T19:18:41.745Z (over 1 year ago)
- Topics: data, data-analysis, data-analytics, data-engineering, data-lakehouse, data-warehouse, datalake, datascience, etl, etl-job, medallion-architecture, ms, mssql, sql, sql-query, sql-server
- Homepage:
- Size: 806 KB
- Stars: 0
- Watchers: 1
- Forks: 0
- Open Issues: 0
-
Metadata Files:
- Readme: README.md
- License: LICENSE
Awesome Lists containing this project
README
# My MS SQL Data Warehouse Project and Analysis
Hello to all who read this! This project demonstrates a comprehensive data warehousing and analytics solution, from building a data warehouse to generating actionable insights.
---
## π Project Overview
Since this is designed as a portfolio project, it highlights industryβs best practices in data engineering and analytics. The project involves:
* Data Architecture: Design a Data Warehouse using the medallion method (Bronze, Silver, and Gold layers).
* ETL Pipelines: Extraction, transformation and loading of data from a source system into our warehouse.
* Data Modeling: Develop a fact and dimension tables to optimize analytical queries
* Analytics & Reporting: Creating SQL based reports for actionable insights.
## π‘ Alternatives Database Systems
This portfolio project was based off using Microsoft SQL Server since it is a common industry standard. I will also be implementing in PostgreSQL, MariaDB, MySQL and Google BigQuery to showcase my adaptation of other database technologies.
## π¦Ί Project Expansion
While not part of the original project. I thought of a AI powered chat interface that could query the data and action results using NLP and a simple chat interface. More to come...
---
## ποΈ Data Architecture
The data architecture for this project follows Medallion Architecture Bronze, Silver, and Gold layers:

* π₯ **Bronze Layer**: Stores raw data as-is from the source systems. Data is ingested from CSV Files into SQL Server Database.
* π₯ **Silver Layer**: This layer includes data cleansing, standardization, and normalization processes to prepare data for analysis.
* π₯ **Gold Layer**: Houses business-ready data modeled into a star schema required for reporting and analytics.
---
## π Project Requirements
### Building the Data Warehouse (Data Engineering)
#### Objective
* Develop a modern data warehouse using SQL Server to consolidate sales data, enabling analytical reporting and informed decision-making.
#### Specifications
* **Data Sources**: Import data from two source systems (ERP and CRM) provided as CSV files.
* **Data Quality**: Cleanse and resolve data quality issues prior to analysis.
* **Integration**: Combine both sources into a single, user-friendly data model designed for analytical queries.
* **Scope**: Focus on the latest dataset only; historization of data is not required.
* **Documentation**: Provide clear documentation of the data model to support both business stakeholders and analytics teams.
### BI: Analytics & Reporting (Data Analysis)
#### Objective
* Develop SQL-based analytics to deliver detailed insights into:
1. Customer Behavior
2. Product Performance
3. Sales Trends
These insights empower stakeholders with key business metrics, enabling strategic decision-making.
Additional requirements, naming conventions and tables descriptions with details will be listed in the [docs folder](docs) of this project.
---
This project was created by [Data with Baraa](https://www.youtube.com/@DataWithBaraa). It was a wonderful project, and I learned a considerable amount. In my professional life I have done projects like this, but it was something that I was never formally trained in. My background is more Infrastructure Technology Specialist than Data focused. This project has encouraged me to do more projects related to data analysis/engineering. All credit should go to him. A link to [this project](https://github.com/DataWithBaraa/sql-data-warehouse-project) and his [Github](https://github.com/DataWithBaraa) are here.