https://github.com/iobruno/data-engineering-zoomcamp

Data Engineering examples for Airflow, Prefect, and Mage.ai; dbt for BigQuery, Redshift, ClickHouse, PostgreSQL; Spark/PySpark for Batch processing; and Kafka for Stream processing



Host: GitHub
URL: https://github.com/iobruno/data-engineering-zoomcamp
Owner: iobruno
License: cc-by-sa-4.0
Created: 2023-01-19T16:22:49.000Z (over 2 years ago)
Default Branch: master
Last Pushed: 2025-02-06T00:55:47.000Z (5 months ago)
Last Synced: 2025-02-06T01:29:18.391Z (5 months ago)
Topics: airflow, airflow-dags, dbt-bigquery, dbt-clickhouse, dbt-postgres, dbt-redshift, kafka, ksqldb, mageai, prefect, pyspark, spark, typer-cli
Language: Python
Homepage: https://github.com/DataTalksClub/data-engineering-zoomcamp
Size: 4.94 MB
Stars: 56
Watchers: 2
Forks: 2
Open Issues: 0
Metadata Files:
- Readme: README.md
- License: LICENSE

Awesome Lists containing this project

awesome-clickhouse - iobruno/data-engineering-zoomcamp - The project provides a collection of resources and examples for Data Engineering, focusing on tools like Airflow, Prefect, and Kafka, along with various databases. (Integrations / Data Transfer and Synchronization)

README

# Data Engineering Zoomcamp

## Taking the course (2025 Cohort)

* **Start**: 13 January 2025

* **Registration link**: https://airtable.com/shr6oVXeQvSI5HuWD

* [Cohort folder](https://github.com/DataTalksClub/data-engineering-zoomcamp/tree/main/cohorts/2025) with homeworks and deadlines

## Syllabus

### [Module 1: Data ingestion](module1-data-ingestion/)

* [Python ingestion with polars and pandas](module1-data-ingestion/python-ingest/)

* Rust data ingestion

* [data load tool (dlt)](module1-data-ingestion/data-load-tool/)

* [IaC with Terraform (Google Cloud Platform)](infrastructure/terraform-gcp/)

* Homework

### [Module 2: Workflow orchestration](module2-workflow-orchestration/)

* [Workflow orchestration with Airflow](module2-workflow-orchestration/airflow/)

* [Workflow orchestration with Mage](module2-workflow-orchestration/mageai/)

* [Workflow orchestration with Prefect](module2-workflow-orchestration/prefect/)

* Homework

### [Module 3: Lakehouses & Data Warehouse](module3-lakehouse-data-warehouse/)

* [BigQuery Data Warehouse](module3-lakehouse-data-warehouse/bigquery/)

* [StarRocks Query Engine](module3-lakehouse-data-warehouse/starrocks/)

* Lakehouse with Delta Lake

* Homework

### [Module 4: Analytics engineering](module4-analytics-engineering/)

* [BigQuery and dbt](module4-analytics-engineering/bigquery/)

* [Redshift and dbt](module4-analytics-engineering/redshift/)

* Databricks and dbt

* [ClickHouse and dbt](module4-analytics-engineering/clickhouse/)

* [PostgreSQL and dbt](module4-analytics-engineering/postgres/)

* [DuckDB and dbt](module4-analytics-engineering/duckdb/)

* [Data visualization with Superset/Metabase](module4-analytics-engineering/visualization/)

* Homework

### [Module 5: Batch processing](module5-batch-processing/)

* [PySpark](module5-batch-processing/pyspark/)

* Spark + Kotlin API

* Spark (Scala)

* Homework

### [Module 6: Stream processing](module6-stream-processing/)

* [Stream processing with Kafka, ksqlDB and Kotlin](module6-stream-processing/kotlin/)

* [Kafka Streams with ksqlDB](module6-stream-processing/ksqldb/)

* [RisingWave: Streaming Database](module6-stream-processing/risingwave/)

* Homework

### Extras

* [LakeHouse with Delta, Iceberg, Hive](https://github.com/iobruno/lakehouse-labs/)

* Capstone Project
