https://github.com/ndomah/data-engineering

Links to data engineering projects and learning materials.

airflow aws azure cassandra data-engineering databricks elt etl kafka pipelines snowflake


# Data Engineering
This README contains links to my data engineering portfolio projects and learning materials.

## Projects
[**AWS YouTube Data Analysis**](https://github.com/ndomah/AWS-YouTube-Data-Analysis)
- Tools Used: Python, SQL, AWS, Lambda, Athena, S3, IAM, Glue, QuickSight
- Analyzed YouTube trending video data using AWS services to build a scalable pipeline for data ingestion, ETL, and storage in a centralized data lake. Created QuickSight dashboards highlighting video views by country, category, and region. Workflow included ingestion, preprocessing, cataloging, and analysis.

[**Real-Time Data Streaming of Random User Data**](https://github.com/ndomah/Realtime-Data-Streaming-of-Random-User-Data)
- Tools Used: Python, PostgreSQL, Docker, Airflow, Kafka, Spark, Cassandra, Zookeeper
- Built a robust, scalable, and fault-tolerant pipeline using a modern tech stack. The pipeline ingests, processes, and stores random user-generated data from an API.

[**Azure Medallion Architecture Pipeline**](https://github.com/ndomah/Azure-Medallion-Pipeline)
- Tools Used: Python, SQL, Azure, dbt, Databricks
- Implemented a complete data engineering pipeline using the Medallion Architecture (Bronze, Silver, and Gold layers) within Azure Databricks. It integrates several Azure services and dbt (Data Build Tool) to orchestrate data ingestion, transformation, and storage, ensuring a robust, scalable, and secure solution.
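
As a rough illustration of the Bronze, Silver, and Gold layering described above, here is a minimal pure-Python sketch. It is not code from the linked repository: the record fields (`order_id`, `country`, `amount`) and the function names are hypothetical, and plain lists stand in for the Databricks tables the project actually uses.

```python
from collections import defaultdict

# Hypothetical raw records; field names are illustrative, not from the project.
RAW_ROWS = [
    {"order_id": "1", "country": "US", "amount": "10.50"},
    {"order_id": "2", "country": "us", "amount": "4.50"},
    {"order_id": "2", "country": "us", "amount": "4.50"},          # duplicate
    {"order_id": "3", "country": "DE", "amount": "not-a-number"},  # bad value
    {"order_id": "4", "country": "DE", "amount": "20.00"},
]


def to_bronze(rows):
    """Bronze: keep the raw data as received, with no cleaning."""
    return [dict(row) for row in rows]


def to_silver(bronze):
    """Silver: deduplicate, normalize, and drop rows that fail validation."""
    seen, silver = set(), []
    for row in bronze:
        if row["order_id"] in seen:
            continue  # skip duplicate order ids
        try:
            amount = float(row["amount"])
        except ValueError:
            continue  # skip rows whose amount cannot be parsed
        seen.add(row["order_id"])
        silver.append({
            "order_id": int(row["order_id"]),
            "country": row["country"].upper(),
            "amount": amount,
        })
    return silver


def to_gold(silver):
    """Gold: aggregate cleaned rows into a reporting-ready summary."""
    totals = defaultdict(float)
    for row in silver:
        totals[row["country"]] += row["amount"]
    return dict(totals)


gold = to_gold(to_silver(to_bronze(RAW_ROWS)))
print(gold)
```

Each layer only reads from the one before it, so a bad cleaning rule in the Silver step can be fixed and replayed without re-ingesting the raw data.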

[**ELT Pipeline**](https://github.com/ndomah/ELT-Pipeline)
- Tools Used: Python, SQL, Airflow, Snowflake, dbt
- Built a simple ELT pipeline using dbt (Data Build Tool) to transform data in Snowflake, with orchestration managed by Apache Airflow. This setup showcases a modern data engineering workflow, essential for handling large-scale data transformations efficiently.
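
The extract, load, then transform ordering can be sketched as follows. This is an assumption-heavy toy, not the project's code: an in-memory `sqlite3` database stands in for Snowflake, a plain SQL statement stands in for a dbt model, the function calls at the bottom stand in for Airflow tasks, and the table and column names are hypothetical.

```python
import sqlite3

# Hypothetical source records (sale_date, product, qty, unit_price).
SOURCE_ROWS = [
    ("2024-01-01", "widget", 3, 2.50),
    ("2024-01-01", "gadget", 1, 10.00),
    ("2024-01-02", "widget", 2, 2.50),
]


def extract():
    """Extract: pull rows from the source system unchanged."""
    return list(SOURCE_ROWS)


def load(conn, rows):
    """Load: write the raw rows into the warehouse before any transformation."""
    conn.execute(
        "CREATE TABLE raw_sales "
        "(sale_date TEXT, product TEXT, qty INTEGER, unit_price REAL)"
    )
    conn.executemany("INSERT INTO raw_sales VALUES (?, ?, ?, ?)", rows)


def transform(conn):
    """Transform: build a derived table with SQL inside the warehouse."""
    conn.execute(
        """
        CREATE TABLE daily_revenue AS
        SELECT sale_date, SUM(qty * unit_price) AS revenue
        FROM raw_sales
        GROUP BY sale_date
        """
    )


# In the real project an orchestrator would run these steps in order.
conn = sqlite3.connect(":memory:")
load(conn, extract())
transform(conn)
result = conn.execute(
    "SELECT sale_date, revenue FROM daily_revenue ORDER BY sale_date"
).fetchall()
print(result)
```

The point of the ordering is that the raw table stays in the warehouse, so the transformation can be rewritten and rerun in SQL without extracting the data again.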

## Learning Materials
[**The Data Engineering Academy**](https://github.com/ndomah/The-Data-Engineering-Academy)

[**Data Engineering Zoomcamp**]()