https://github.com/scrapcodes/kafkaproducer

Benchmarks to measure latency using Spark and Kafka.

# Benchmark

To run the benchmark locally, follow the steps below.

## Start Kafka.

* Start two Kafka brokers locally.

* Create two topics, `test` and `output`, with 10 partitions each.
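
The broker and topic setup above can be scripted. The following is a minimal sketch and is not part of the original repository. It assumes a stock Apache Kafka distribution unpacked at `KAFKA_HOME`, running in ZooKeeper mode with ZooKeeper already started. The broker ports (9092 and 9093), the log directories under `/tmp`, the replication factor of 1, and the helper names `start_broker` and `create_topic` are all illustrative choices.

```shell
# Assumptions (not from this README): Kafka is unpacked at KAFKA_HOME,
# ZooKeeper is already running, and broker 0 listens on localhost:9092.
KAFKA_HOME="${KAFKA_HOME:-/path/to/kafka}"

# Hypothetical helper: start broker number $1 in the background.
# Each broker needs its own id, port (9092 + id), and log directory,
# supplied here as overrides on top of the stock server.properties.
start_broker() {
  "$KAFKA_HOME/bin/kafka-server-start.sh" -daemon \
    "$KAFKA_HOME/config/server.properties" \
    --override broker.id="$1" \
    --override listeners="PLAINTEXT://:$((9092 + $1))" \
    --override log.dirs="/tmp/kafka-logs-$1"
}

# Hypothetical helper: create topic $1 with 10 partitions.
# The replication factor of 1 is an assumption; the README does not state one.
create_topic() {
  "$KAFKA_HOME/bin/kafka-topics.sh" --create --topic "$1" --partitions 10 \
    --replication-factor 1 --bootstrap-server localhost:9092
}
```

After sourcing this snippet, `start_broker 0` and `start_broker 1` would bring up the two brokers, and `create_topic test` followed by `create_topic output` would create the topics. On Kafka releases older than 2.2, `kafka-topics.sh` takes `--zookeeper localhost:2181` in place of `--bootstrap-server localhost:9092`.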

## Start Spark.

* To run the Spark job locally:

```
# Quote the master URL so the shell does not treat [20] as a glob pattern.
bin/spark-submit --class com.github.scrapcodes.kafka.SparkSQLKafkaConsumer --master "local[20]" --executor-memory 6G /path/Kafka-Producer-assembly.jar
```

## Start Kafka Producer.

The producer writes to the `test` topic, to which the Spark job is subscribed. The Spark job, in turn, writes to the `output` topic.

```
cd Kafka-Producer
./sbt "run-main com.github.scrapcodes.kafka.LongRunningProducer test"
```

## Start Kafka Consumer.

The consumer reads from the `output` topic for 10 minutes and then displays statistics.

```
./sbt "run-main com.github.scrapcodes.kafka.KafkaConsumerStatistics output /path/timeStats.txt 10"
```