https://github.com/shiven424/dth-data-warehouse
This repository provides a sample star‑schema data warehouse for Direct‑to‑Home (DTH) television services built for SQL Server Management Studio. It contains SQL scripts, CSV datasets, and example analytics queries for exploring subscriptions, churn, engagement, and advertising effectiveness.
https://github.com/shiven424/dth-data-warehouse
analytics data-modeling data-warehouse etl sql sql-server ssms star-schema
Last synced: 2 months ago
JSON representation
This repository provides a sample star‑schema data warehouse for Direct‑to‑Home (DTH) television services built for SQL Server Management Studio. It contains SQL scripts, CSV datasets, and example analytics queries for exploring subscriptions, churn, engagement, and advertising effectiveness.
- Host: GitHub
- URL: https://github.com/shiven424/dth-data-warehouse
- Owner: shiven424
- License: mit
- Created: 2025-07-30T17:29:25.000Z (3 months ago)
- Default Branch: main
- Last Pushed: 2025-07-30T18:45:51.000Z (3 months ago)
- Last Synced: 2025-07-30T20:57:29.080Z (2 months ago)
- Topics: analytics, data-modeling, data-warehouse, etl, sql, sql-server, ssms, star-schema
- Homepage:
- Size: 9.37 MB
- Stars: 0
- Watchers: 0
- Forks: 0
- Open Issues: 0
-
Metadata Files:
- Readme: README.md
- License: LICENSE
Awesome Lists containing this project
README
# DTH Data Warehouse Analytics (SQL Server Edition)
This project provides a comprehensive sample data warehouse designed for Direct‑to‑Home (DTH) television service analytics. It includes SQL Server scripts to build a star schema, sample CSV datasets, analytical queries, and diagrams to explore churn analysis, customer engagement, advertising performance, and more.
---
## Table of Contents
1. [Features](#features)
2. [Project Structure](#project-structure)
3. [Data Model](#data-model)
4. [Setup Guide](#setup-guide)
5. [Example Analytics](#example-analytics)
6. [Diagrams](#diagrams)---
## Features
* **Self-contained SQL Scripts**
* `DW_Tables.sql` creates all required dimension, fact, and aggregated tables and includes ETL logic for monthly summaries.
* **Sample Data for Instant Use**
* CSV files aligned with the schema enable direct loading without external data dependencies.
* **Analytics Query Library**
* `Queries.sql` includes pre-built queries for churn analysis, customer loyalty scoring, content feedback, promotion impact, and cohort-based retention.
* **Presentation & Documentation**
* ER diagrams, schema images, and a PowerPoint/PDF report describe the overall architecture and information flow.
---
## Project Structure
```
DTH-Data-Warehouse/
├── DW_Tables.sql # Schema creation & ETL
├── Queries.sql # Sample analytics queries
├── customer_dimension.csv # Sample dimension data
├── plan_dimension.csv # More CSVs for each table
├── DWdiagrams/ # ER diagrams and visuals
├── Schema.png # Star schema overview
├── Report_Slides.pdf # Documentation/presentation
└── README.md # This file
```> All scripts and datasets are placed in the root directory for ease of use.
---
## Data Model
This project uses a **star schema** centered around subscription and engagement activities with rich supporting dimensions.
### 📘 Dimension Tables
| Table Name | Description |
| ----------------------- | ------------------------------------------ |
| `Customer_dimension` | Subscriber demographics & contact info |
| `Plan_dimension` | Plan names, pricing, and included channels |
| `Time_dimension` | Date hierarchy (day, week, month, quarter) |
| `Channel_dimension` | Channel metadata and genres |
| `Content_dimension` | Program or episode details |
| `Reason_dimension` | Reasons for churn or unsubscription |
| `Promotion_dimension` | Promotional campaign metadata |
| `Event_dimension` | Major events or seasonal triggers |
| `Ad_exposure_dimension` | Advertisement details and impressions |
| `Genre_dimension` | Genre category mapping |### 📊 Fact Tables
| Table Name | Description |
| ------------------------------- | ---------------------------------------- |
| `Subscription_fact` | Subscription records and plan history |
| `Unsubscription_fact` | Customer churn details with reasons |
| `Feedback_fact` | Feedback mapped to plans and channels |
| `Customer_engagement_fact` | Ad exposure, content viewing, engagement |
| `Monthly_aggregate_fact` | Pre-computed monthly KPIs |
| `Series_monthly_aggregate_fact` | Series-level monthly summaries |### Sample CSV Structure
Example: `customer_dimension.csv`
```
customer_id, customer_name, customer_email, customer_address, customer_city, customer_zipcode
```Example: `plan_dimension.csv`
```
plan_id, plan_name, plan_price, channel_package, plan_duration
```These column structures match insert orders used in the SQL scripts.
---
## Setup Guide
### 1. Clone or Download the Repository
```bash
git clone https://github.com/shiven424/DTH-Data-Warehouse
```Alternatively, download the ZIP and extract it locally.
---
### 2. Create the Database
* Open **SQL Server Management Studio (SSMS)**.
* Execute `DW_Tables.sql` to:* Create all dimension, fact, and aggregate tables.
* Optionally populate them with sample data.---
### 3. Load CSV Data
You can load the sample CSVs into their respective tables using:
* **SSMS Import Data Wizard**, or
* `BULK INSERT` statements (included in `DW_Tables.sql`).Example tables:
* `Customer_dimension`
* `Plan_dimension`
* `Channel_dimension`---
### 4. Run ETL Logic
Run the ETL section in `DW_Tables.sql` to populate:
* `Monthly_aggregate_fact`
* `Series_monthly_aggregate_fact`These contain pre-computed KPIs for faster analytics.
---
### 5. Execute Analytics
* Open `Queries.sql` in SSMS.
* Run any of the provided analytical queries.Example analyses include:
* Churn timing
* Feedback sentiment
* Promotion lift
* Viewer engagement trends---
### 6. Connect to BI Tools (Optional)
You can connect the database to a BI tool such as:
* **Power BI**
* **Tableau**
* **Google Data Studio**Use them to create dashboards and visualize key metrics using the sample data.
---
## Example Analytics
**Query: Average Days to Churn by Reason**
```sql
SELECT
r.reason_category,
r.reason_description,
ROUND(AVG(DATEDIFF(day, s.subscription_start_date, u.unsubscription_date)), 1) AS avg_days_to_unsub
FROM Unsubscription_fact u
JOIN Subscription_fact s
ON u.customer_id = s.customer_id
AND u.unsubscription_date BETWEEN s.subscription_start_date AND s.subscription_end_date
JOIN Reason_dimension r
ON u.reason_id = r.reason_id
GROUP BY r.reason_category, r.reason_description
ORDER BY avg_days_to_unsub DESC;
```Other queries compute:
* Channel loyalty scores
* Promotion effectiveness
* Viewer cohort behavior
* Feedback trends by genre or content---
## Diagrams
Navigate to the `DWdiagrams/` folder to explore:
* Star schema diagram
* Information package visualizations
* `Schema.png` for a quick overviewThese assets assist in understanding relationships and metric derivations.