https://github.com/vishal-bhandary/sql-data-analytics

This repository contains a collection of SQL scripts demonstrating analytical techniques such as change-over-time, cumulative, performance, segmentation, and part-to-whole analysis.

analytics business-intelligence customer-segmentation dashboarding data-analysis data-reporting data-visualization data-warehouse etl kpi product-analysis sql sql-server star-schema t-sql

# 📊 SQL Data Analytics Project

A hands-on, analytics-driven SQL project built to uncover insights from a modern data warehouse. This project uses advanced T-SQL queries to perform data exploration, segmentation, trend analysis, and KPI generation using dimensional modeling (star schema). It's ideal for showcasing data storytelling and analytical problem-solving skills.

---

## 📚 Table of Contents

* [🔍 Project Overview](#-project-overview)
* [🎯 Objectives](#-objectives)
* [🧱 Data Structure](#-data-structure)
* [📊 Analysis Modules](#-analysis-modules)
* [📈 Sample Insights](#-sample-insights)
* [🚀 How to Run](#-how-to-run)
* [🧰 Tech Stack](#-tech-stack)
* [🔮 Future Enhancements](#-future-enhancements)
* [📄 License](#-license)

---

## ๐Ÿ” Project Overview

This project simulates real-world analytics tasks performed over a cleaned and modeled SQL data warehouse. By leveraging structured queries, it enables business users and analysts to perform:

* Customer behavior analysis
* Product performance reviews
* Segmentation & cohort analysis
* Revenue and trend forecasting
* KPI dashboard reporting

The analysis is run on a **star schema** composed of dimension and fact tables in the `gold` layer.

---

## 🎯 Objectives

* Explore and validate database structures and metadata.
* Generate meaningful business metrics (sales, orders, customers).
* Perform segment-wise, time-series, and trend analyses.
* Build reusable analytical views and reports.
* Leverage SQL window functions for ranking and performance tracking.
* Enable customer and product segmentation.

---

## 🧱 Data Structure

**Star Schema Overview (Gold Layer):**

* `dim_customers` – Enriched customer demographics and CRM data
* `dim_products` – Product attributes, categories, cost, and life-cycle
* `fact_sales` – All transactional order data (sales, quantity, pricing)

Additional analytical views:

* `report_customers` – Customer-focused KPI summary and segmentation
* `report_products` – Product performance, ranking, and revenue classification
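
As a rough illustration of how the fact table relates to the two dimensions, the query below joins all three and aggregates sales by country and category. It is a minimal sketch, not taken from the repository's scripts: the `gold` schema name and the columns `customer_key`, `product_key`, `order_number`, `sales_amount`, `country`, and `category` are assumptions and may need renaming to match the actual tables.

```sql
-- Sketch only: the schema name and all column names below are assumed, not confirmed by this README.
SELECT TOP (10)
    c.country,
    p.category,
    COUNT(DISTINCT f.order_number) AS total_orders,  -- one order may span several fact rows
    SUM(f.sales_amount)            AS total_sales
FROM gold.fact_sales AS f
LEFT JOIN gold.dim_customers AS c
    ON c.customer_key = f.customer_key               -- assumed surrogate key
LEFT JOIN gold.dim_products AS p
    ON p.product_key = f.product_key                 -- assumed surrogate key
GROUP BY c.country, p.category
ORDER BY total_sales DESC;
```

The `LEFT JOIN`s keep fact rows whose dimension keys have no match, so such sales show up under a `NULL` country or category instead of silently disappearing.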

---

## 📊 Analysis Modules

The project contains well-documented scripts categorized as follows:

| 🔹 Module | 🧠 Purpose |
| -------------------------- | --------------------------------------------------------------------------- |
| **Database Exploration** | Understand schema and metadata using `INFORMATION_SCHEMA` |
| **Dimensions Exploration** | Extract unique values (e.g., countries, categories) for analysis |
| **Date Range Analysis** | Understand data availability over time using `MIN()`, `MAX()`, `DATEDIFF()` |
| **Key Metrics** | Calculate totals, averages, and counts (e.g., sales, orders, products) |
| **Magnitude Analysis** | Analyze grouped metrics by category, gender, region, etc. |
| **Ranking Analysis** | Use `TOP`, `RANK()`, `ROW_NUMBER()` to rank top/bottom performers |
| **Cumulative Analysis** | Track running totals and moving averages |
| **Performance Trends** | Perform YoY, MoM, and average deviation analysis using `LAG()` |
| **Segmentation Analysis** | Classify customers/products based on business logic |
| **Part-to-Whole Analysis** | Analyze category contribution to total revenue |
| **Customer Report View** | Full customer profiling: orders, lifetime, recency, segmentation |
| **Product Report View** | Full product performance: quantity, customers, segment, revenue metrics |
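
To make the Cumulative Analysis and Performance Trends modules more concrete, here is a hedged sketch that computes monthly sales, a running total, and the change versus the previous month with `LAG()`. The `gold` schema name and the columns `order_date` and `sales_amount` are assumptions; the repository's own scripts may be structured differently.

```sql
-- Sketch only: gold.fact_sales(order_date, sales_amount) is an assumed shape.
WITH monthly_sales AS (
    SELECT
        DATEFROMPARTS(YEAR(order_date), MONTH(order_date), 1) AS order_month,
        SUM(sales_amount) AS total_sales
    FROM gold.fact_sales
    WHERE order_date IS NOT NULL
    GROUP BY DATEFROMPARTS(YEAR(order_date), MONTH(order_date), 1)
)
SELECT
    order_month,
    total_sales,
    -- Running total from the first month up to the current row
    SUM(total_sales) OVER (
        ORDER BY order_month
        ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW
    ) AS running_total_sales,
    -- Sales of the previous row (NULL for the first month)
    LAG(total_sales) OVER (ORDER BY order_month) AS prev_month_sales,
    total_sales - LAG(total_sales) OVER (ORDER BY order_month) AS mom_change
FROM monthly_sales
ORDER BY order_month;
```

Note that `LAG()` looks at the previous row, not the previous calendar month, so a month with no sales would make `mom_change` compare non-adjacent months. Adding `PARTITION BY YEAR(order_month)` to the running total would restart it each year.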

---

## 📈 Sample Insights

> 📌 A few example outputs this project can generate:

* "Top 10 customers generated 40% of revenue"
* "Products costing over \$1,000 contribute 60% of total sales"
* "Category ‘Mountain Bikes’ has the highest average monthly revenue"
* "Customer churn can be identified by tracking recency vs lifespan"
* "Segmented customers into `VIP`, `Regular`, and `New` based on behavior"
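
A segmentation such as `VIP` / `Regular` / `New` can be expressed with a `CASE` expression over per-customer aggregates. The sketch below is illustrative only: the thresholds (12 months of history, 5,000 in spending) are hypothetical placeholders, and the `gold` schema name and the columns `customer_key`, `order_date`, and `sales_amount` are assumed.

```sql
-- Sketch only: thresholds and column names are hypothetical, not the project's actual rules.
WITH customer_spending AS (
    SELECT
        customer_key,
        SUM(sales_amount) AS total_spending,
        -- Months between a customer's first and last order
        DATEDIFF(MONTH, MIN(order_date), MAX(order_date)) AS lifespan_months
    FROM gold.fact_sales
    GROUP BY customer_key
),
segmented AS (
    SELECT
        customer_key,
        CASE
            WHEN lifespan_months >= 12 AND total_spending > 5000 THEN 'VIP'
            WHEN lifespan_months >= 12                           THEN 'Regular'
            ELSE 'New'
        END AS customer_segment
    FROM customer_spending
)
SELECT
    customer_segment,
    COUNT(*) AS total_customers
FROM segmented
GROUP BY customer_segment
ORDER BY total_customers DESC;
```

Customers with no usable order dates get a `NULL` lifespan and fall through to `New`, which may or may not be the desired behavior.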

---

## 🚀 How to Run

1. Ensure the Gold Layer tables (`dim_customers`, `dim_products`, `fact_sales`) are available and populated.
2. Run each script independently or as part of an orchestrated SQL job in SSMS.
3. You may also convert key queries into **SQL views** for reporting or dashboarding purposes.
4. Use tools like **Power BI**, **Tableau**, or **Excel** to connect to these views for visualization.
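
Before running the scripts, step 1 can be checked with a quick metadata query. This is a minimal sketch that assumes the objects live in a schema literally named `gold`; adjust the schema name if your warehouse differs.

```sql
-- Sketch only: assumes the Gold Layer objects live in a schema named 'gold'.
-- Lists which of the expected objects exist (TABLE_TYPE is 'BASE TABLE' or 'VIEW').
SELECT TABLE_SCHEMA, TABLE_NAME, TABLE_TYPE
FROM INFORMATION_SCHEMA.TABLES
WHERE TABLE_SCHEMA = 'gold'
  AND TABLE_NAME IN ('dim_customers', 'dim_products', 'fact_sales');

-- Confirms that each object is populated.
SELECT 'dim_customers' AS table_name, COUNT(*) AS row_count FROM gold.dim_customers
UNION ALL
SELECT 'dim_products', COUNT(*) FROM gold.dim_products
UNION ALL
SELECT 'fact_sales', COUNT(*) FROM gold.fact_sales;
```

If the first query returns fewer than three rows, or any `row_count` is zero, the analysis scripts will likely fail or return empty results.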

---

## 🧰 Tech Stack

* **SQL Server / T-SQL**
* **Dimensional Modeling (Star Schema)**
* **Advanced SQL Functions**
  * Window Functions (`RANK()`, `LAG()`, `ROW_NUMBER()`)
  * Aggregations (`SUM()`, `AVG()`, `COUNT()`)
  * String, Date, and CASE logic
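
The functions listed above are often combined in one query. As a hedged example, the sketch below ranks product categories by revenue with `RANK()` and computes each category's share of total sales with `SUM() OVER ()`. The `gold` schema name and the columns `product_key`, `sales_amount`, and `category` are assumptions.

```sql
-- Sketch only: gold.fact_sales(product_key, sales_amount) and
-- gold.dim_products(product_key, category) are assumed shapes.
WITH category_sales AS (
    SELECT
        p.category,
        SUM(f.sales_amount) AS total_sales
    FROM gold.fact_sales AS f
    LEFT JOIN gold.dim_products AS p
        ON p.product_key = f.product_key
    GROUP BY p.category
)
SELECT
    category,
    total_sales,
    RANK() OVER (ORDER BY total_sales DESC) AS sales_rank,
    -- 100.0 forces decimal arithmetic; NULLIF avoids division by zero
    CAST(
        100.0 * total_sales / NULLIF(SUM(total_sales) OVER (), 0)
        AS DECIMAL(5, 2)
    ) AS pct_of_total
FROM category_sales
ORDER BY sales_rank;
```

`RANK()` gives tied categories the same rank and leaves a gap afterwards; `ROW_NUMBER()` could be used instead when a strict 1..N ordering is needed.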

---

## 🔮 Future Enhancements

* Add parameterized stored procedures for dynamic filtering
* Build scheduled dashboards using Power BI
* Include predictive logic using SQL ML Services
* Introduce real-time metrics using SQL Change Data Capture (CDC)

---

## 📄 License

This project is released under the [MIT License](LICENSE).