https://github.com/harshindcoder/sql_data_warehouse_project
Building a modern data warehouse with MySQL, including ETL processes, data modeling and analytics.
https://github.com/harshindcoder/sql_data_warehouse_project
Last synced: about 1 year ago
JSON representation
Building a modern data warehouse with MySQL, including ETL processes, data modeling and analytics.
- Host: GitHub
- URL: https://github.com/harshindcoder/sql_data_warehouse_project
- Owner: harshindcoder
- License: mit
- Created: 2025-02-14T14:14:38.000Z (over 1 year ago)
- Default Branch: main
- Last Pushed: 2025-02-26T08:50:06.000Z (over 1 year ago)
- Last Synced: 2025-02-26T09:28:10.339Z (over 1 year ago)
- Size: 848 KB
- Stars: 0
- Watchers: 1
- Forks: 0
- Open Issues: 0
-
Metadata Files:
- Readme: README.md
- License: LICENSE
Awesome Lists containing this project
README
# Data Warehouse and Analytics Project 🚀
This repository demonstrates a comprehensive data warehousing and analytics solution, from building a data warehouse to generating actionable insights. It highlights industry best practices in data engineering and analytics.
## 📖 Project Overview
**This project involves:**
- **Data Architecture:** Designing a Modern Data Warehouse Using Medallion Architecture (Bronze, Silver, and Gold layers).
- **ETL Pipelines:** Extracting, transforming, and loading data from source systems into the warehouse.
- **Data Modeling:** Developing fact and dimension tables optimized for analytical queries.
- **Analytics & Reporting:** Creating SQL-based reports and dashboards for actionable insights.
**🎯 Skills Demonstrated:**
- SQL Development
- Data Architecture
- Data Engineering
- ETL Pipeline Development
- Data Modeling
- Data Analytics
---
## 🛠️ Important Links & Tools:
- **Datasets:** Project datasets (CSV files)
- **MySQL:** DBMS for hosting your SQL database
- **GitHub Repository:** Version control and collaboration
- **DrawIO:** Data architecture, models, flows, and diagrams
- **Notion:** Project management and organization
---
## 🚀 Project Requirements
### **Building the Data Warehouse (Data Engineering)**
- **Objective:** Develop a modern data warehouse using MySQL to consolidate sales data, enabling analytical reporting and informed decision-making.
- **Data Sources:** ERP and CRM CSV files
- **Data Quality:** Data cleansing and quality resolution
- **Integration:** Combine both sources into a star schema model
- **Scope:** Focus on the latest dataset only
- **Documentation:** Provide clear documentation for stakeholders and analytics teams
### **BI: Analytics & Reporting (Data Analysis)**
- **Objective:** Develop SQL-based analytics to deliver insights on:
- Customer Behavior
- Product Performance
- Sales Trends
- **Outcome:** Actionable insights for strategic decision-making
---
## 🏗️ Data Architecture (Medallion Architecture)
- **Bronze Layer:** Raw data from CSV files ingested into MySQL
- **Silver Layer:** Data cleansing, standardization, and normalization
- **Gold Layer:** Star schema with business-ready data for analytics
---
## 📂 Repository Structure
```
data-warehouse-project/
├── datasets/ # Raw datasets (ERP and CRM)
├── docs/ # Project documentation
│ ├── datawarehouse.drawio # Data architecture diagram
├── scripts/ # SQL scripts for ETL
│ ├── bronze/ # Raw data load scripts
│ ├── silver/ # Data transformation scripts
│ ├── gold/ # Analytical model scripts
├── README.md # Project overview and instructions
├── LICENSE # Project license (MIT)
```
---
## 🛡️ License
This project is licensed under the MIT License. You are free to use, modify, and share this project with proper attribution.