https://github.com/cafjs/caf_kafka
A caf.js plugin to build a Kafka gateway
https://github.com/cafjs/caf_kafka
Last synced: 5 months ago
JSON representation
A caf.js plugin to build a Kafka gateway
- Host: GitHub
- URL: https://github.com/cafjs/caf_kafka
- Owner: cafjs
- Created: 2022-08-09T01:12:29.000Z (almost 4 years ago)
- Default Branch: master
- Last Pushed: 2023-04-19T00:32:40.000Z (about 3 years ago)
- Last Synced: 2025-10-02T06:46:01.306Z (9 months ago)
- Language: JavaScript
- Size: 50.8 KB
- Stars: 0
- Watchers: 1
- Forks: 0
- Open Issues: 0
-
Metadata Files:
- Readme: README.md
- Changelog: CHANGES.md
- Contributing: CONTRIBUTING.md
- License: LICENSE-2.0.txt
Awesome Lists containing this project
README
# Caf.js
Co-design cloud assistants with your web app and IoT devices.
See https://www.cafjs.com
## Caf.js library for processing Kafka streams
[](https://github.com/cafjs/caf_kafka/actions/workflows/push.yml)
The Kafka Streams API https://docs.confluent.io/platform/current/streams/index.html provides "exactly-once" processing of Kafka events, but requires that the input and output data are stored in a Kafka cluster.
Can we also provide "exactly-once" event processing when we need interactions with the external world?
For example, calling Web APIs of third-party providers, or micro-services hosted in a different cloud, or triggering actions on connected IoT devices, or notifying clients...
This Caf.js library helps you to build a reliable Kafka Gateway to do just that.
It is typically combined with the standard Kafka Streams API, providing a last stage of the processing pipeline:
-> -> -> ->
and it can also poll external services/devices regularly, i.e., using Caf.js autonomous agents, to create events for a new Kafka topic:
-> -> -> ->
A challenge is to provide a reasonable throughput while respecting event ordering. The strategy of the Kafka Streams API, one logical thread per partition, is too slow when one Web API call can take 100 msec of latency or more...
Luckily, only events with the same key need to be processed in order, and we can exploit concurrency between keys in the same partition to increase throughput.
But then events can be processed out of order, making "exactly-once" event processing much more complicated...
### Our Solution
We associate a Kafka partition with one Caf.js process, and then the `caf_kafka` plugin will consume events in that partition, dispatching them to local CAs, i.e., transactional actors called Cloud Assistants, based on the event key.
CAs never block the node.js loop, and now we can process concurrently thousands of events if they have different keys.
CAs remember the offset of the last processed event in internal (checkpointed) state, filtering duplicated events.
They also serialize requests with fencing, and CAs internal state is always externally consistent after recovering from a failure.
Moreover, external actions are mediated by transactional plugins, which delay them until a request commits, and also guarantee that recovery actions are idempotent.
This means that, with the right plugins, we can extend exactly-once semantics to the external world without needing distributed transactions.
See https://www.cafjslabs.com/orchestration for details.
## API
See {@link module:caf_kafka/proxy_kafka}
## Configuration
### framework.json
See {@link module:caf_kafka/plug_kafka}
### ca.json
See {@link module:caf_kafka/plug_ca_kafka}