
https://github.com/redgerd/superstore-data-warehouse

This repository showcases a data warehousing project that builds and maintains a data warehouse on Teradata, which also drives the ETL processes. It covers data extraction, transformation, and integration, as well as the creation of interactive dashboards for detailed analysis.



# ADW-2017

![Teradata](https://img.shields.io/badge/Teradata-F37440?style=for-the-badge&logo=teradata&logoColor=white)
![data-warehouse](https://img.shields.io/badge/Data_Warehouse-%23E57373?style=for-the-badge&logo=cloud&logoColor=white)
![ETL](https://img.shields.io/badge/ETL-%2381C784?style=for-the-badge&logo=cloud&logoColor=white)

### **Overview**
This project focused on centralizing data into a data warehouse for efficient querying and reporting through automated ETL processes.

### **ETL Processes**
The ETL processes, built with **Teradata BTEQ** and **Teradata FastLoad**, included:
- **Data Cleansing**: Removing duplicates and correcting inconsistencies to ensure data quality.
- **Transformation**: Applying business rules to convert raw data into structured and usable information.
- **Data Loading**: Efficiently loading the transformed data into the data warehouse.
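
The cleansing and transformation steps above can be illustrated with a small sketch. The project itself does this work in Teradata BTEQ; the Python below is only a tool-neutral illustration. The column names (`order_id`, `region`, `sales`, `profit`), the title-case rule for regions, and the `profit_margin` business rule are all assumptions made for the example, not taken from the repository.

```python
import csv
import io


def cleanse(rows):
    """Normalize text fields, then drop rows that become exact duplicates."""
    seen = set()
    cleaned = []
    for row in rows:
        # Correct inconsistencies: strip stray whitespace from every value.
        normalized = {key: value.strip() for key, value in row.items()}
        # Hypothetical rule: region names are stored in title case.
        normalized["region"] = normalized["region"].title()
        fingerprint = tuple(sorted(normalized.items()))
        if fingerprint in seen:
            continue  # duplicate of a row already kept
        seen.add(fingerprint)
        cleaned.append(normalized)
    return cleaned


def transform(row):
    """Hypothetical business rule: derive a profit margin per order line."""
    sales = float(row["sales"])
    profit = float(row["profit"])
    margin = round(profit / sales, 4) if sales else 0.0
    return {**row, "sales": sales, "profit": profit, "profit_margin": margin}


# Made-up sample data; the first two rows differ only in formatting.
RAW_CSV = """order_id,region,sales,profit
CA-1, west ,100.00,25.00
CA-1,West,100.00,25.00
CA-2,EAST,80.00,-8.00
"""

rows = list(csv.DictReader(io.StringIO(RAW_CSV)))
result = [transform(r) for r in cleanse(rows)]
```

Normalizing before deduplicating matters here: the first two sample rows only count as duplicates once whitespace and casing have been made consistent.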

### **ETL Implementation**
- Implemented a **pre-existing normalized schema** for effective data storage and retrieval.
- Loaded and transformed raw CSV data into the warehouse, ensuring alignment with the schema and business rules.
- Used **Teradata FastLoad** for fast data ingestion and validated data integrity throughout the process.
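
As a rough illustration of checking CSV data against a schema before loading, the sketch below separates loadable rows from rejects. It is written in Python for readability and is not the project's actual validation logic. The `SCHEMA` mapping, the column names, and the `validate` helper are hypothetical.

```python
import csv
import io

# Hypothetical target schema: column name -> converter applied to each value.
SCHEMA = {
    "order_id": str,
    "quantity": int,
    "sales": float,
}


def validate(csv_text, schema):
    """Split CSV rows into typed, loadable records and rejects with a reason."""
    reader = csv.DictReader(io.StringIO(csv_text))
    missing = set(schema) - set(reader.fieldnames or [])
    if missing:
        raise ValueError(f"missing columns: {sorted(missing)}")
    accepted, rejected = [], []
    for line_no, row in enumerate(reader, start=2):  # line 1 is the header
        try:
            accepted.append({col: cast(row[col]) for col, cast in schema.items()})
        except (TypeError, ValueError) as exc:
            rejected.append((line_no, str(exc)))
    return accepted, rejected


# Made-up sample: the second data row has a non-numeric quantity.
SAMPLE = """order_id,quantity,sales
CA-1,2,100.50
CA-2,two,80.00
CA-3,1,19.99
"""

accepted, rejected = validate(SAMPLE, SCHEMA)
```

Keeping rejects alongside their line numbers, rather than silently dropping them, makes it possible to reconcile row counts after a load. This is similar in spirit to the error tables that FastLoad populates with rows it cannot load.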

## Technologies Used
- **Teradata Studio**: An administration toolkit for creating and administering database objects.
- **Teradata BTEQ**: Core ETL scripting tool for data extraction, transformation, and loading.
- **Teradata FastLoad**: Specialized utility for rapid loading of large datasets into the database.

![LMu4W](https://github.com/Redgerd/ADW-2017/assets/117646793/226ebe9c-a632-4875-8c95-161813f148f7)