https://github.com/davidag/data-engineering-zoomcamp

My Data Engineering Zoomcamp 2023 notes and homework

# Data Engineering Zoomcamp 2023

## Week 1: Introduction and Prerequisites

- How to use Docker and Docker Compose
- Python scripting and Jupyter notebooks
- pgcli and pgAdmin for interacting with PostgreSQL
- SQL refresher
- Google Cloud and Terraform

[Homework solution](week_1_basics_n_setup)

## Week 2: Workflow Orchestration

- Data Lakes and Data Warehouses
- Workflow orchestration with Prefect
- Google Cloud Storage and BigQuery
- Using a Docker registry to run Prefect flows in Docker containers

[Homework solution](week_2_workflow_orchestration)

## Week 3: Data Warehouse and BigQuery

- OLAP vs OLTP
- Data warehouses
- BigQuery, including partitioning and clustering
- BigQuery Machine Learning

[Homework solution](week_3_data_warehouse)

## Week 4: Analytics Engineering

- Analytics Engineering
- dbt (data build tool) and BigQuery
- dbt models
- Data visualization

[Homework solution](week_4_analytics_engineering)

## Week 5: Batch Processing

- Data processing: batch vs streaming
- Apache Spark
- DataFrames: Actions and Transformations
- Spark SQL: Join and GroupBy
- Resilient Distributed Datasets
- Google Cloud Dataproc
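
The distinction between transformations and actions listed above can be illustrated with a small plain-Python analogy. This is only a sketch and not Spark's API: `LazyDataset` and `traced_double` are hypothetical names invented for this example. It assumes the usual model in which transformations (`map`, `filter`) only describe work, and actions (`collect`, `count`) trigger it.

```python
from typing import Callable, Iterator, List


class LazyDataset:
    """Hypothetical toy class (not Spark): chains lazy transformations."""

    def __init__(self, make_iter: Callable[[], Iterator[int]]) -> None:
        # A factory that builds a fresh iterator each time an action runs.
        self._make_iter = make_iter

    @classmethod
    def from_list(cls, items: List[int]) -> "LazyDataset":
        return cls(lambda: iter(items))

    def map(self, fn: Callable[[int], int]) -> "LazyDataset":
        # Transformation: returns a new dataset and does no work yet.
        parent = self._make_iter
        return LazyDataset(lambda: (fn(x) for x in parent()))

    def filter(self, pred: Callable[[int], bool]) -> "LazyDataset":
        # Transformation: also lazy.
        parent = self._make_iter
        return LazyDataset(lambda: (x for x in parent() if pred(x)))

    def collect(self) -> List[int]:
        # Action: runs the whole chain and materializes the result.
        return list(self._make_iter())

    def count(self) -> int:
        # Action: runs the whole chain again and counts the elements.
        return sum(1 for _ in self._make_iter())


calls: List[int] = []


def traced_double(x: int) -> int:
    calls.append(x)  # record that the function actually ran
    return x * 2


ds = LazyDataset.from_list([1, 2, 3, 4]).map(traced_double).filter(lambda x: x > 4)
calls_before_action = len(calls)  # nothing has executed so far
result = ds.collect()             # the action triggers the computation
calls_after_collect = len(calls)
```

In this toy version every action recomputes the chain from the source, which is roughly why caching matters when a dataset is reused.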

[Homework solution](week_5_batch_processing)

## Week 6: Stream Processing

- [Stream processing](https://en.wikipedia.org/wiki/Stream_processing)
- [Apache Kafka](https://kafka.apache.org/)
- [Kafka Connect](https://kafka.apache.org/documentation/#connect)
- [Kafka Streams](https://kafka.apache.org/documentation/streams/)
- [Confluent Schema Registry](https://github.com/confluentinc/schema-registry)
- [ksqlDB](https://ksqldb.io/)
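
As a rough illustration of what stream processing involves, the sketch below counts events per key in fixed-size (tumbling) time windows using plain Python. It is not Kafka or Kafka Streams code: `tumbling_window_counts` and the sample events are hypothetical, and the sketch assumes events arrive in timestamp order, which real streams do not guarantee.

```python
from collections import defaultdict
from typing import Dict, Iterable, Iterator, Optional, Tuple

Event = Tuple[int, str]  # (timestamp in seconds, key)


def tumbling_window_counts(
    events: Iterable[Event], window_size: int
) -> Iterator[Tuple[int, Dict[str, int]]]:
    """Yield (window_start, counts per key) each time a window closes.

    Assumes events are ordered by timestamp; late events are not handled.
    """
    current_start: Optional[int] = None
    counts: Dict[str, int] = defaultdict(int)
    for ts, key in events:
        window_start = ts - ts % window_size
        if current_start is not None and window_start != current_start:
            # The event belongs to a later window, so emit the finished one.
            yield current_start, dict(counts)
            counts = defaultdict(int)
        current_start = window_start
        counts[key] += 1
    if current_start is not None:
        # Flush the last open window once the input ends.
        yield current_start, dict(counts)


sample_events = [(0, "a"), (3, "b"), (7, "a"), (12, "a"), (14, "b"), (25, "a")]
windows = list(tumbling_window_counts(sample_events, window_size=10))
```

Results are emitted incrementally as each window closes, rather than after the whole input has been read. That is the main contrast with the batch processing of Week 5.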

[Homework solution](week_6_stream_processing)