An open API service indexing awesome lists of open source software.

https://github.com/harshindcoder/sql_data_warehouse_project

Building a modern data warehouse with MySQL, including ETL processes, data modeling and analytics.
https://github.com/harshindcoder/sql_data_warehouse_project

Last synced: about 1 year ago
JSON representation

Building a modern data warehouse with MySQL, including ETL processes, data modeling and analytics.

Awesome Lists containing this project

README

          

# Data Warehouse and Analytics Project 🚀

This repository demonstrates a comprehensive data warehousing and analytics solution, from building a data warehouse to generating actionable insights. It highlights industry best practices in data engineering and analytics.

## 📖 Project Overview
**This project involves:**
- **Data Architecture:** Designing a Modern Data Warehouse Using Medallion Architecture (Bronze, Silver, and Gold layers).
- **ETL Pipelines:** Extracting, transforming, and loading data from source systems into the warehouse.
- **Data Modeling:** Developing fact and dimension tables optimized for analytical queries.
- **Analytics & Reporting:** Creating SQL-based reports and dashboards for actionable insights.

**🎯 Skills Demonstrated:**
- SQL Development
- Data Architecture
- Data Engineering
- ETL Pipeline Development
- Data Modeling
- Data Analytics

---

## 🛠️ Important Links & Tools:
- **Datasets:** Project datasets (CSV files)
- **MySQL:** DBMS for hosting your SQL database
- **GitHub Repository:** Version control and collaboration
- **DrawIO:** Data architecture, models, flows, and diagrams
- **Notion:** Project management and organization

---

## 🚀 Project Requirements
### **Building the Data Warehouse (Data Engineering)**
- **Objective:** Develop a modern data warehouse using MySQL to consolidate sales data, enabling analytical reporting and informed decision-making.
- **Data Sources:** ERP and CRM CSV files
- **Data Quality:** Data cleansing and quality resolution
- **Integration:** Combine both sources into a star schema model
- **Scope:** Focus on the latest dataset only
- **Documentation:** Provide clear documentation for stakeholders and analytics teams

### **BI: Analytics & Reporting (Data Analysis)**
- **Objective:** Develop SQL-based analytics to deliver insights on:
- Customer Behavior
- Product Performance
- Sales Trends
- **Outcome:** Actionable insights for strategic decision-making

---

## 🏗️ Data Architecture (Medallion Architecture)
- **Bronze Layer:** Raw data from CSV files ingested into MySQL
- **Silver Layer:** Data cleansing, standardization, and normalization
- **Gold Layer:** Star schema with business-ready data for analytics

---

## 📂 Repository Structure
```
data-warehouse-project/
├── datasets/ # Raw datasets (ERP and CRM)
├── docs/ # Project documentation
│ ├── datawarehouse.drawio # Data architecture diagram
├── scripts/ # SQL scripts for ETL
│ ├── bronze/ # Raw data load scripts
│ ├── silver/ # Data transformation scripts
│ ├── gold/ # Analytical model scripts
├── README.md # Project overview and instructions
├── LICENSE # Project license (MIT)
```

---

## 🛡️ License
This project is licensed under the MIT License. You are free to use, modify, and share this project with proper attribution.