https://github.com/pastuhov/python-kafka-groupbykey
aggregation kafka stateful streams
Last synced: 10 months ago
- Host: GitHub
- URL: https://github.com/pastuhov/python-kafka-groupbykey
- Owner: pastuhov
- Created: 2021-03-23T13:32:40.000Z (about 5 years ago)
- Default Branch: master
- Last Pushed: 2021-03-24T06:20:32.000Z (about 5 years ago)
- Last Synced: 2025-02-23T11:36:42.981Z (over 1 year ago)
- Topics: aggregation, kafka, stateful, streams
- Language: Python
- Homepage:
- Size: 2.93 KB
- Stars: 1
- Watchers: 1
- Forks: 0
- Open Issues: 0
Metadata Files:
- Readme: README.md
README
# python-kafka-groupbykey
A Python implementation of the following Kafka Streams topology:
```java
builder.stream("commit_log_topic").groupByKey().aggregate(...).toStream().to("snapshot_topic");
```
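As a rough illustration of what that topology computes, here is a minimal in-memory sketch with no Kafka involved. The names `group_by_key_aggregate`, `total`, `commit_log`, and `snapshots` are hypothetical and do not come from this repository; the dictionary stands in for a state store, and each yielded pair stands in for a record written to the snapshot topic.

```python
from typing import Any, Callable, Dict, Hashable, Iterable, Iterator, Optional, Tuple

Record = Tuple[Hashable, Any]


def group_by_key_aggregate(
    records: Iterable[Record],
    aggregator: Callable[[Optional[Any], Any], Any],
) -> Iterator[Record]:
    """Yield (key, updated_aggregate) once for every input record."""
    state: Dict[Hashable, Any] = {}  # in-memory stand-in for a state store
    for key, value in records:
        # Look up the previous aggregate for this key (None if there is none),
        # fold the new value into it, and remember the result.
        state[key] = aggregator(state.get(key), value)
        yield key, state[key]


def total(accumulator, current):
    # Hypothetical aggregator: running sum per key.
    return current if accumulator is None else accumulator + current


commit_log = [("a", 1), ("b", 10), ("a", 2)]
snapshots = list(group_by_key_aggregate(commit_log, total))
print(snapshots)
```

Each input record produces one updated aggregate for its key, which is why the aggregator has to handle a `None` accumulator for keys it has not seen before.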
## Example
```python
from confluent_kafka import SerializingProducer, DeserializingConsumer
from http_server import start_server
from reducer import Reducer
from store import Store
from reduced import Reduced
from configuration import Configuration
import sys

config = Configuration(
    commit_log_topic='commit_log_topic_name',
    snapshot_topic='snapshot_topic_topic_name',
    bootstrap_servers='localhost',
    group_id='app_name',
)


def reduce(accumulator, current):
    # No aggregate yet: start from an empty one.
    if accumulator is None:
        accumulator = Reduced()
        accumulator.value = {}
    # your reduce code
    return accumulator


# Store built on a consumer of the snapshot topic.
s = Store(DeserializingConsumer(config.store_consumer), config.snapshot_topic)
s.hydrate()
start_server(s, 8333)

# Reducer reads from the commit log topic and writes to the snapshot topic.
r = Reducer(
    reduce,
    DeserializingConsumer(config.consumer),
    SerializingProducer(config.producer),
    s,
    config.commit_log_topic,
    config.snapshot_topic,
    config.batch_timeout_sec,
    config.messages_per_transaction)
r.process()
```
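The `# your reduce code` placeholder could be filled in along these lines. This is only a sketch under stated assumptions: the `Reduced` class below is a hypothetical stand-in for the repository's own class (only its `value` attribute is taken from the example above), and `current` is assumed to be a dict with a `"type"` field, which the README does not specify.

```python
class Reduced:
    """Hypothetical stand-in for the repository's Reduced class."""

    def __init__(self):
        self.value = None


def reduce(accumulator, current):
    if accumulator is None:
        accumulator = Reduced()
        accumulator.value = {}
    # Assumption: `current` is a dict carrying a "type" field.
    # Count how many events of each type have been seen.
    event_type = current["type"]
    accumulator.value[event_type] = accumulator.value.get(event_type, 0) + 1
    return accumulator


# Fold a few sample events by hand, the way a reducer would call it.
acc = None
for event in [{"type": "click"}, {"type": "view"}, {"type": "click"}]:
    acc = reduce(acc, event)
print(acc.value)
```

The function returns the accumulator on every call, so it can be passed back in as the first argument for the next message.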