https://github.com/redgerd/superstore-data-warehouse
This repository showcases a data warehousing project focused on building and maintaining a robust data warehouse using Teradata for ETL processes. It covers everything from data extraction and transformation to the integration and creation of interactive dashboards for detailed analysis.
etl-pipeline relational-databases sql teradata
- Host: GitHub
- URL: https://github.com/redgerd/superstore-data-warehouse
- Owner: Redgerd
- Created: 2024-01-21T09:20:59.000Z (over 1 year ago)
- Default Branch: main
- Last Pushed: 2025-01-03T17:11:16.000Z (5 months ago)
- Last Synced: 2025-03-02T13:14:12.992Z (3 months ago)
- Topics: etl-pipeline, relational-databases, sql, teradata
- Language: Batchfile
- Homepage:
- Size: 50.8 KB
- Stars: 0
- Watchers: 1
- Forks: 0
- Open Issues: 0
Metadata Files:
- Readme: README.md
README
# ADW-2017


### **Overview**
This project focused on centralizing data into a data warehouse for efficient querying and reporting through automated ETL processes.

### **ETL Processes**
Utilizing **Teradata BTEQ** and **Teradata FastLoad**, the ETL processes included:
- **Data Cleansing**: Removing duplicates and correcting inconsistencies to ensure data quality.
- **Transformation**: Applying business rules to convert raw data into structured and usable information.
- **Data Loading**: Efficiently loading the transformed data into the data warehouse.

### **ETL Implementation**
- Implemented a **pre-existing normalized schema** for effective data storage and retrieval.
- Loaded and transformed raw CSV data into the warehouse, ensuring alignment with the schema and business rules.
- Used **Teradata FastLoad** for fast data ingestion and validated data integrity throughout the process.

## Technologies Used
- **Teradata Studio**: An administration toolkit that helps users to create and administer database objects.
- **Teradata BTEQ**: Core ETL scripting tool for data extraction, transformation, and loading.
- **Teradata FastLoad**: Specialized utility for rapid loading of large datasets into the database.
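
### **Illustrative ETL Sketch**
The cleanse, transform, and load sequence described under ETL Processes can be illustrated with a small, self-contained example. This is a sketch only, not the repository's code: it uses Python's built-in `sqlite3` module as a stand-in for Teradata, and the table names (`stg_orders`, `orders`), the column names, the sample rows, and the "uppercase the segment" business rule are all hypothetical. In the actual project the staging load would be done by FastLoad and the SQL would be submitted through BTEQ.

```python
import csv
import io
import sqlite3

# Hypothetical raw extract; columns and values are illustrative only.
RAW_CSV = """order_id,customer_name,segment,sales
CA-1001, alice smith ,consumer,120.50
CA-1001, alice smith ,consumer,120.50
CA-1002,Bob Jones,CORPORATE,80.00
CA-1003,Carol White,Home Office,
"""


def run_etl(raw_csv: str) -> sqlite3.Connection:
    """Stage raw CSV rows, then cleanse and load them into a target table."""
    conn = sqlite3.connect(":memory:")

    # Staging table: everything is text, no constraints (raw landing area).
    conn.execute(
        "CREATE TABLE stg_orders "
        "(order_id TEXT, customer_name TEXT, segment TEXT, sales TEXT)"
    )
    # Target table: typed columns and constraints from the (assumed) schema.
    conn.execute(
        "CREATE TABLE orders ("
        "order_id TEXT PRIMARY KEY, customer_name TEXT NOT NULL, "
        "segment TEXT NOT NULL, sales REAL NOT NULL)"
    )

    # Load step (stand-in for FastLoad): bulk-insert the CSV rows as-is.
    rows = list(csv.DictReader(io.StringIO(raw_csv)))
    conn.executemany(
        "INSERT INTO stg_orders "
        "VALUES (:order_id, :customer_name, :segment, :sales)",
        rows,
    )

    # Cleanse + transform step (stand-in for SQL run through BTEQ):
    # trim whitespace, normalize segment casing, drop rows with no sales
    # value, and collapse exact duplicates with SELECT DISTINCT.
    conn.execute(
        """
        INSERT INTO orders (order_id, customer_name, segment, sales)
        SELECT DISTINCT
            TRIM(order_id),
            TRIM(customer_name),
            UPPER(TRIM(segment)),
            CAST(sales AS REAL)
        FROM stg_orders
        WHERE TRIM(sales) <> ''
        """
    )
    conn.commit()
    return conn


if __name__ == "__main__":
    conn = run_etl(RAW_CSV)
    staged = conn.execute("SELECT COUNT(*) FROM stg_orders").fetchone()[0]
    loaded = conn.execute("SELECT COUNT(*) FROM orders").fetchone()[0]
    # Simple integrity check: report how many staged rows reached the target.
    print(f"staged={staged} loaded={loaded}")
```

Keeping an untyped staging table separate from the constrained target table means bad rows can be inspected after the load instead of aborting it, and the staged-versus-loaded row counts give a quick integrity check.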