Ecosyste.ms: Awesome
An open API service indexing awesome lists of open source software.
https://github.com/jaredpetersen/kafka-connect-arangodb
🥑 Kafka connect sink connector for ArangoDB
arango arangodb kafka kafka-connect
- Host: GitHub
- URL: https://github.com/jaredpetersen/kafka-connect-arangodb
- Owner: jaredpetersen
- License: mit
- Created: 2019-01-27T03:51:19.000Z (almost 6 years ago)
- Default Branch: master
- Last Pushed: 2020-07-20T06:15:15.000Z (over 4 years ago)
- Last Synced: 2024-05-01T14:42:51.004Z (7 months ago)
- Topics: arango, arangodb, kafka, kafka-connect
- Language: Java
- Homepage: https://www.confluent.io/connector/kafka-connect-arangodb/
- Size: 380 KB
- Stars: 29
- Watchers: 3
- Forks: 8
- Open Issues: 7
Metadata Files:
- Readme: README.md
- Changelog: CHANGELOG.md
- Funding: .github/FUNDING.yml
- License: LICENSE.md
- Codeowners: .github/CODEOWNERS
README
# Kafka Connect ArangoDB Connector
[![Build Status](https://github.com/jaredpetersen/kafka-connect-arangodb/workflows/Release/badge.svg)](https://github.com/jaredpetersen/kafka-connect-arangodb/actions)
[![Maven Central](https://maven-badges.herokuapp.com/maven-central/io.github.jaredpetersen/kafka-connect-arangodb/badge.svg)](https://maven-badges.herokuapp.com/maven-central/io.github.jaredpetersen/kafka-connect-arangodb)

Kafka Connect Sink Connector for ArangoDB
## Usage
Kafka Connect ArangoDB is a Kafka Connector that translates record data into `REPSERT` and `DELETE` queries that are performed against ArangoDB. Only sinking data is supported at this time.

Requires ArangoDB 3.4 or higher.
A full example of how Kafka Connect ArangoDB can be integrated into a Kafka cluster is available in the [development documentation](/docs/development/).
### Record Formats and Structures
The following record formats are supported:
* Avro (Recommended)
* JSON with Schema
* Plain JSON

With each of these formats, the record value can be structured in one of the following ways:
* Simple (Default)
* Change Data Capture

#### Simple
The Simple format is a slim record value structure that only provides the information necessary for writing to the database.

When written as plain JSON, the record value looks something like:
```json
{
  "someKey": "changed value",
  "otherKey": true
}
```

When the Kafka Connect ArangoDB Connector receives records adhering to this format, it translates them into the following ArangoDB database changes and performs them:
* Records with a `null` value are "tombstone" records and will result in the deletion of the document from the database
* Records with a non-`null` value will be repserted (replace the full document if it exists already, insert the full document if it does not)

This format is the default, so no extra configuration is required.
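
The two rules above amount to a single check on the record value. The following is a minimal sketch of that decision, assuming the record value has already been deserialized into a `Map`; the class, enum, and method names are hypothetical and are not part of the connector's API.

```java
import java.util.Map;

// Hypothetical illustration only; not the connector's actual implementation.
final class SimpleFormatSketch {
    /** The two database changes described for the Simple format. */
    enum Operation { REPSERT, DELETE }

    /** Picks the database change for a record value in the Simple format. */
    static Operation operationFor(Map<String, Object> recordValue) {
        // A null value marks a "tombstone" record, which deletes the document.
        if (recordValue == null) {
            return Operation.DELETE;
        }
        // Any non-null value replaces the existing document or inserts a new one.
        return Operation.REPSERT;
    }
}
```

For example, `SimpleFormatSketch.operationFor(null)` returns `DELETE`, while passing a map such as `Map.of("someKey", "changed value")` returns `REPSERT`.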
#### CDC
The CDC format is a record value structure that is designed to handle records produced by Change Data Capture systems like [Debezium](https://debezium.io/). Each record value should be an object with properties `before` and `after` that store the "before" and "after" state of the document, respectively.

When written as plain JSON, the record value looks something like:
```json
{
  "before": {
    "someKey": "some value",
    "otherKey": false
  },
  "after": {
    "someKey": "changed value",
    "otherKey": true
  }
}
```

When the Kafka Connect ArangoDB Connector receives this data, it translates it into the following ArangoDB database changes and performs them:
* Records with a `null` value are "tombstone" records and are ignored
* Records with a non-`null` value and a `null` value for `after` will result in the deletion of the document
* Records with a non-`null` value and a non-`null` value for `after` will be repserted (replace the full document if it exists already, insert the full document if it does not)

To use this record format, configure it as a Kafka Connect Single Message Transformation in the connector's config:
```json
{
. . .
"transforms": "cdc",
"transforms.cdc.type": "io.github.jaredpetersen.kafkaconnectarangodb.sink.transforms.Cdc"
}
```

### Topics
The name of the topic determines the name of the collection the record will be written to.

Records with topics that are just a plain string like `products` will go into a collection with the name `products`. If the record's topic name is period-separated like `dbserver1.mydatabase.customers`, the last period-separated value will be the collection's name (`customers` in this case). Each configured Kafka Connect ArangoDB Connector will only output data into a single database instance.
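
The naming rule can be sketched in a few lines of Java. This is an illustration of the rule as described here, not the connector's own code; the class and method names are hypothetical.

```java
// Hypothetical illustration of the topic-to-collection naming rule.
final class TopicToCollection {
    /** Returns the text after the last period, or the whole topic if it has none. */
    static String collectionName(String topic) {
        // lastIndexOf returns -1 when no period is present, so substring(0)
        // yields the full topic name in that case.
        int lastPeriod = topic.lastIndexOf('.');
        return topic.substring(lastPeriod + 1);
    }
}
```

With this sketch, `collectionName("products")` returns `products` and `collectionName("dbserver1.mydatabase.customers")` returns `customers`.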
### Foreign Keys and Edge Collections
In most situations, the record values that you will want to sink into ArangoDB are not in a format that ArangoDB can use effectively. ArangoDB has its own format for foreign keys (`{ "foreignKey": "MyCollection/1234" }`) and edges between vertices (`{ "_from": "MyCollection/1234", "_to": "MyCollection/5678" }`) that your input data likely doesn't implement by default. It is recommended that you build your own custom [Kafka Streams application](https://kafka.apache.org/documentation/streams/) to perform these mappings.

## Configuration
### Connector Properties
| Name | Description | Type | Default | Importance |
| ------------------------ | ----------------------------------- | -------- | ------- | ---------- |
| `arangodb.host` | ArangoDB server host. | string | | high |
| `arangodb.port` | ArangoDB server host port number. | int | | high |
| `arangodb.user` | ArangoDB connection username. | string | | high |
| `arangodb.password` | ArangoDB connection password. | password | "" | high |
| `arangodb.useSsl` | ArangoDB use SSL connection. | boolean | false | high |
| `arangodb.database.name` | ArangoDB database name. | string | | high |

### Single Message Transformations
| Type | Description |
| ------------------------------------------------------------------ | -------------------------------------------------- |
| `io.github.jaredpetersen.kafkaconnectarangodb.sink.transforms.Cdc` | Converts records from CDC format to Simple format. |
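
As a rough illustration of what converting CDC records to the Simple format involves, the sketch below applies the rules from the CDC section to a record value held in a plain `Map`. It is not the connector's actual `Cdc` transform, which operates on Kafka Connect records; the class, enum, and method names are hypothetical, and how the real transform represents an ignored record is not specified here.

```java
import java.util.Map;

// Hypothetical illustration only; not the connector's actual Cdc transform.
final class CdcToSimpleSketch {
    /** What should happen to a record after interpreting its CDC value. */
    enum Action { IGNORE, DELETE, REPSERT }

    /** Outcome of the conversion; document is non-null only for REPSERT. */
    static final class Result {
        final Action action;
        final Map<String, Object> document;

        Result(Action action, Map<String, Object> document) {
            this.action = action;
            this.document = document;
        }
    }

    /** Applies the three CDC rules to a deserialized record value. */
    @SuppressWarnings("unchecked")
    static Result convert(Map<String, Object> cdcValue) {
        // Tombstone records (null value) are ignored in the CDC format.
        if (cdcValue == null) {
            return new Result(Action.IGNORE, null);
        }
        Object after = cdcValue.get("after");
        // A null "after" state means the document was deleted upstream.
        if (after == null) {
            return new Result(Action.DELETE, null);
        }
        // Otherwise the "after" state becomes the Simple-format value to repsert.
        // Assumes "after" is a JSON object; any other type fails the cast.
        return new Result(Action.REPSERT, (Map<String, Object>) after);
    }
}
```

Returning an explicit `Action` keeps the three cases distinguishable: in the Simple format a `null` value means deletion, so the ignored tombstone case needs its own marker.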