Ecosyste.ms: Awesome

An open API service indexing awesome lists of open source software.

Awesome Lists | Featured Topics | Projects

https://github.com/sderosiaux/kafka-streams-consumer-offsets-to-json

A Kafka Streams process to convert __consumer_offsets to a JSON-readable topic
https://github.com/sderosiaux/kafka-streams-consumer-offsets-to-json

kafka kafkastreams

Last synced: about 1 month ago
JSON representation

A Kafka Streams process to convert __consumer_offsets to a JSON-readable topic

Awesome Lists containing this project

README

        

# Use case

Read the famous internal topic `__consumer_offsets` with any projects that can only read JSON topics, such as timeseries databases, to enable some monitoring and alerting.

Details on the blog post: https://www.sderosiaux.com/articles/2017/08/07/looking-at-kafka-s-consumers-offsets

# How to run it

Clone the project, and `sbt run`, or via Intellij IDEA, run the `Main` class.

You can also build it using `sbt universal:packageBin` and unzip `target\universal\kafka-consumer-offsets-to-json-topic-1.0.zip`.

# Dependencies

It only depends on Kafka Streams 0.11 (it can be setup to 0.10 if needed).

It is using some classes from Kafka server to know how to parse the content of `__consumer_offsets` which is binary-based.

# Notes

Kafka is regularly commiting messages to `__consumer_offsets`, even if no-one is consuming.

By running this Kafka Streams project, and listening to the output topic, you'll get something like :

```
$ kafka-console-consumer.bat --topic __consumer_offsets_json --new-consumer --bootstrap-server localhost:9092
{"topic":"WordsWithCountsTopic","partition":0,"group":"console-consumer-26549","version":1,"offset":95,"metadata":"","commitTimestamp":1501542796444,"expireTimestamp":1501629196444}
{"topic":"WordsWithCountsTopic","partition":0,"group":"console-consumer-26549","version":1,"offset":95,"metadata":"","commitTimestamp":1501542801444,"expireTimestamp":1501629201444}
{"topic":"WordsWithCountsTopic","partition":0,"group":"console-consumer-26549","version":1,"offset":95,"metadata":"","commitTimestamp":1501542806444,"expireTimestamp":1501629206444}
{"topic":"WordsWithCountsTopic","partition":0,"group":"console-consumer-26549","version":1,"offset":95,"metadata":"","commitTimestamp":1501542811445,"expireTimestamp":1501629211445}
{"topic":"WordsWithCountsTopic","partition":0,"group":"console-consumer-26549","version":1,"offset":95,"metadata":"","commitTimestamp":1501542816447,"expireTimestamp":1501629216447}
{"topic":"__consumer_offsets","partition":5,"group":"consumer-offsets-consumer-app-2","version":1,"offset":106,"metadata":"","commitTimestamp":1501542868586,"expireTimestamp":1501629268586}
{"topic":"__consumer_offsets","partition":17,"group":"consumer-offsets-consumer-app-2","version":1,"offset":1025,"metadata":"","commitTimestamp":1501542868736,"expireTimestamp":1501629268736}
```

The output topic offsets are not published to not cause an infinite loop (listening to it would cause messages published to this same topic, that would move on the offsets, that would cause new messages to pop etc.).

# Warning

It's probably not a good idea to rely on this in production, because it's an implementation detail that can changed across versions.

# Options

The broker address and the output topic can be customized.

Defaults are:
- `localhost:9092`
- `__consumer_offsets_json`

```
-b, --broker
-o, --output-topic
--help Show help message
```