Ecosyste.ms: Awesome
An open API service indexing awesome lists of open source software.
https://github.com/longnguyen010203/spark-kafka-self-learning
📚🌊🎓 A third-year student is self-studying Spark and Kafka as part of their 👷 data engineering journey, with the goal of securing an 📬 internship or fresher job in 2024.
apache-kafka apache-spark cluster docker docker-compose zookeeper
Last synced: about 7 hours ago
- Host: GitHub
- URL: https://github.com/longnguyen010203/spark-kafka-self-learning
- Owner: longNguyen010203
- License: apache-2.0
- Created: 2024-07-17T14:27:00.000Z (2 months ago)
- Default Branch: main
- Last Pushed: 2024-09-04T03:28:59.000Z (21 days ago)
- Last Synced: 2024-09-24T09:02:02.338Z (about 13 hours ago)
- Topics: apache-kafka, apache-spark, cluster, docker, docker-compose, zookeeper
- Language: Shell
- Homepage:
- Size: 112 MB
- Stars: 2
- Watchers: 1
- Forks: 0
- Open Issues: 0
Metadata Files:
- Readme: README.md
- License: LICENSE
README
# 🌃 Apache Kafka Self-Learning 🌊
A third-year student is self-studying Spark and Kafka as part of their data engineering journey, with the goal of securing an internship or fresher job in 2024.

## 📦 Technologies
- `Docker`
- `PostgreSQL`
- `Apache Spark`
- `Apache Kafka`
- `Zookeeper`

## 🔦 Architecture
### 1. Apache Spark
- `SparkContext`
- `Driver Program`
- `Cluster Manager`
- `Worker Node`
- `Executor`
- `Cache`
- `Task`

### 2. Apache Kafka
- `Producer`
- `Consumer`
- `Broker`
- `Cluster`
- `Topic`
- `Partition`
- `Offset`
- `Consumer-group`
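
To show how the Kafka terms above relate to each other, here is a minimal in-memory sketch in plain Python. It is a toy model for study purposes, not the Kafka client API: the `Topic` and `ConsumerGroup` classes, the CRC32 key hashing, and the round-robin assignment are all assumptions made for illustration (the Java client's default partitioner hashes keys with murmur2, and real brokers coordinate group assignment over the network).

```python
import zlib
from collections import defaultdict


class Topic:
    """Toy topic: a fixed number of append-only partitions (hypothetical model)."""

    def __init__(self, name, num_partitions):
        self.name = name
        self.partitions = [[] for _ in range(num_partitions)]

    def produce(self, key, value):
        # Records with the same key always land in the same partition.
        # CRC32 is used here only because it is deterministic and in the stdlib.
        partition = zlib.crc32(key.encode("utf-8")) % len(self.partitions)
        self.partitions[partition].append((key, value))
        # The offset is the record's position inside its partition.
        offset = len(self.partitions[partition]) - 1
        return partition, offset


class ConsumerGroup:
    """Toy consumer group: each partition is owned by exactly one member."""

    def __init__(self, topic, members):
        self.topic = topic
        # partition -> next offset to read, shared by the whole group
        self.committed = defaultdict(int)
        # Round-robin assignment of partitions to members (an assumption).
        self.assignment = {member: [] for member in members}
        for partition in range(len(topic.partitions)):
            owner = members[partition % len(members)]
            self.assignment[owner].append(partition)

    def poll(self, member):
        """Return unread (partition, offset, value) records for one member."""
        records = []
        for partition in self.assignment[member]:
            log = self.topic.partitions[partition]
            start = self.committed[partition]
            for offset in range(start, len(log)):
                records.append((partition, offset, log[offset][1]))
            # Commit: the next poll resumes after the last record read.
            self.committed[partition] = len(log)
        return records


# Usage: two producers' worth of keyed events, read by a two-member group.
orders = Topic("orders", num_partitions=3)
for i in range(6):
    orders.produce(key=f"user-{i % 2}", value=f"event-{i}")

group = ConsumerGroup(orders, members=["consumer-1", "consumer-2"])
for member in ("consumer-1", "consumer-2"):
    for partition, offset, value in group.poll(member):
        print(member, partition, offset, value)
```

Because offsets are tracked per partition and committed by the group, each record is delivered to exactly one member, and ordering is only guaranteed within a partition, which is why all events for one key share a partition.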