https://github.com/javadbahoosh/spark-streaming-multi-language-docker

Dockerized infrastructure and boilerplate code for consuming Kafka topics with Spark Streaming in Scala, Python, and Java, featuring Redis integration for result aggregation.

# Multi-Language Spark Streaming with Kafka and Redis: A Comparative Boilerplate

This project provides the Docker infrastructure needed to consume data from a Kafka topic with
Spark Streaming in three languages (Scala, Python, and Java), with results aggregated in Redis.
It also includes boilerplate code for implementing a Spark Streaming consumer in each language.

## Prerequisites

- Docker
- redis-cli

## Setup

1. Clone the repository.
2. Build the Docker images: `docker compose build`
3. Run the project: `./run.sh`

## Monitoring

The `run.sh` script monitors Redis keys every 5 seconds:
- Scala: `scala_total_messages`, `scala_total_sum`
- Python: `python_total_messages`, `python_total_sum`
- Java: `java_total_messages`, `java_total_sum`
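
The polling described above can be approximated outside of `run.sh`. The following is a minimal sketch, not the script's actual implementation: it assumes `redis-cli` is on the `PATH` and that Redis is reachable on the default host and port. The helper names (`key_names`, `redis_get`, `snapshot`, `monitor`) are hypothetical and chosen only for illustration.

```python
import subprocess
import time

# Language prefixes used by the Redis keys listed above.
LANGUAGES = ("scala", "python", "java")


def key_names(language):
    """Return the two Redis keys tracked for one consumer language."""
    return [f"{language}_total_messages", f"{language}_total_sum"]


def redis_get(key):
    """Read one key with redis-cli (assumes default host and port)."""
    result = subprocess.run(
        ["redis-cli", "GET", key],
        capture_output=True,
        text=True,
        check=True,
    )
    return result.stdout.strip()


def snapshot(fetch=redis_get):
    """Fetch every monitored key once and return a key -> value mapping."""
    return {
        key: fetch(key)
        for language in LANGUAGES
        for key in key_names(language)
    }


def monitor(interval=5.0, rounds=None, fetch=redis_get, emit=print):
    """Emit all monitored keys every `interval` seconds.

    With rounds=None the loop runs until interrupted; otherwise it
    stops after `rounds` polling cycles.
    """
    completed = 0
    while rounds is None or completed < rounds:
        for key, value in snapshot(fetch).items():
            emit(f"{key} = {value}")
        completed += 1
        time.sleep(interval)
```

Calling `monitor()` with no arguments polls every 5 seconds until interrupted. The `fetch` and `emit` parameters exist so the loop can be exercised without a running Redis instance.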