Ecosyste.ms: Awesome
An open API service indexing awesome lists of open source software.
https://github.com/OneCricketeer/schema-registry-transfer-smt
Kafka Connect Transform to copy Avro schemas between Schema Registries
avro confluent kafka kafka-connect kafka-connect-smt schema-registry
- Host: GitHub
- URL: https://github.com/OneCricketeer/schema-registry-transfer-smt
- Owner: OneCricketeer
- License: apache-2.0
- Created: 2019-01-29T23:05:25.000Z (almost 6 years ago)
- Default Branch: master
- Last Pushed: 2023-10-06T07:41:36.000Z (over 1 year ago)
- Last Synced: 2024-05-18T22:23:09.819Z (8 months ago)
- Topics: avro, confluent, kafka, kafka-connect, kafka-connect-smt, schema-registry
- Language: Java
- Size: 62.5 KB
- Stars: 49
- Watchers: 5
- Forks: 25
- Open Issues: 7
Metadata Files:
- Readme: README.md
- License: LICENSE
Awesome Lists containing this project
- awesome-kafka - Schema Registry Transfer SMT
README
Schema Registry Transfer SMT
============================

A [Kafka Connect Single Message Transformation (SMT)][smt] that reads the serialized [wire format header][wire-format] of Confluent's `KafkaAvroSerializer`, performs a lookup against a source [Confluent Schema Registry][schema-registry] for the ID in the message, and registers that schema into a destination Registry for that topic/subject under a new ID.
Use it where it is not feasible to make the destination Schema Registry a follower of the source Registry, or when migrating topics to a new cluster.
> _Requires that the Kafka Connect tasks can reach both Schema Registries._
This transform doesn't mirror the contents of the `_schemas` topic, so each registry can be completely isolated from the other. As a side effect, subject configurations applied through the `/config` endpoint in the source registry are not copied to the destination. In other words, you might get schema registration errors if the registries use differing compatibility levels. Just a heads-up.
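The header-only lookup and registration described above can be sketched in plain Java. This is a minimal illustration, not the project's implementation: `ShallowSchemaCopy` and `FakeRegistry` are hypothetical names, in-memory maps stand in for the two Schema Registries, and the step that overwrites the ID bytes in the record is an assumption about how the new ID reaches downstream consumers. The only format detail relied on is the documented wire format: one magic byte (`0`) followed by a 4-byte big-endian schema ID.

```java
import java.nio.ByteBuffer;
import java.util.HashMap;
import java.util.Map;

/** Sketch only: in-memory maps stand in for the source and destination registries. */
final class ShallowSchemaCopy {
    static final byte MAGIC_BYTE = 0x0;
    static final int HEADER_SIZE = 5; // 1 magic byte + 4-byte schema ID

    /** Hypothetical stand-in for a Schema Registry client. */
    static final class FakeRegistry {
        private final Map<Integer, String> schemasById = new HashMap<>();
        private final Map<String, Integer> idsBySubjectAndSchema = new HashMap<>();
        private int nextId;

        FakeRegistry(int firstId) {
            this.nextId = firstId;
        }

        /** Registers a schema under a subject; re-registering returns the same ID. */
        int register(String subject, String schema) {
            String key = subject + "|" + schema;
            Integer existing = idsBySubjectAndSchema.get(key);
            if (existing != null) {
                return existing;
            }
            int id = nextId++;
            schemasById.put(id, schema);
            idsBySubjectAndSchema.put(key, id);
            return id;
        }

        String getById(int id) {
            String schema = schemasById.get(id);
            if (schema == null) {
                throw new IllegalArgumentException("Unknown schema ID " + id);
            }
            return schema;
        }
    }

    /** Reads the schema ID from the header; the Avro body is never decoded. */
    static int readSchemaId(byte[] record) {
        if (record == null || record.length < HEADER_SIZE) {
            throw new IllegalArgumentException("Record too short for wire format header");
        }
        ByteBuffer buffer = ByteBuffer.wrap(record); // big-endian by default
        if (buffer.get() != MAGIC_BYTE) {
            throw new IllegalArgumentException("Unexpected magic byte");
        }
        return buffer.getInt();
    }

    /** Looks up the source schema, registers it at the destination, and rewrites the ID. */
    static byte[] transfer(byte[] record, String subject, FakeRegistry src, FakeRegistry dest) {
        int sourceId = readSchemaId(record);
        String schema = src.getById(sourceId);
        int destId = dest.register(subject, schema);
        byte[] copy = record.clone();
        ByteBuffer.wrap(copy).putInt(1, destId); // overwrite only the 4 ID bytes
        return copy;
    }
}
```

Because only the header is touched, the bytes after position 4 pass through unchanged, which is why the records must stay as byte arrays for a transform of this kind to work.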
Example Kafka Connectors where this could be applied:
- [Comcast/MirrorTool-for-Kafka-Connect](https://github.com/Comcast/MirrorTool-for-Kafka-Connect) - Code was tested with this first, and verified that the topic-renaming logic of this connector worked fine with this SMT.
- [Salesforce/mirus](https://github.com/salesforce/mirus)
- [Confluent Replicator](https://docs.confluent.io/current/connect/kafka-connect-replicator/index.html) - While this can already copy the schema, we observed it is only possible via the `AvroConverter`, which must first parse the entire message into a Kafka Connect `Struct` object. Thus, the class here is considered a "shallow" copier: it only inspects [the first 5 bytes][wire-format] of the keys and values for the schema IDs.
- [KIP-382 (MirrorMaker 2.0)](https://cwiki.apache.org/confluence/display/KAFKA/KIP-382%3A+MirrorMaker+2.0) - Still open at the time of writing.

## Installation
1. Edit the Kafka Connect worker properties file on each worker to include a new directory. For example, `/opt/kafka-connect/plugins`

```properties
plugin.path=/usr/share/java,/opt/kafka-connect/plugins
```

2. Build this project

```sh
./mvnw clean package
```

3. Copy the JAR from `target` to all Kafka Connect workers under a directory set by `plugin.path`
4. (Re)start Kafka Connect processes
## Usage
Standalone Kafka Connect configuration section
```properties
# Requires that records are entirely byte-arrays. These can go in the worker or connector configuration.
key.converter=org.apache.kafka.connect.converters.ByteArrayConverter
value.converter=org.apache.kafka.connect.converters.ByteArrayConverter

# Setup the SMT
transforms=AvroSchemaTransfer

transforms.AvroSchemaTransfer.type=cricket.jmoore.kafka.connect.transforms.SchemaRegistryTransfer
transforms.AvroSchemaTransfer.src.schema.registry.url=http://schema-registry-1:8081
transforms.AvroSchemaTransfer.dest.schema.registry.url=http://schema-registry-2:8081
```

Distributed Kafka Connect configuration section
```json
"config" : {
    ...

    "__comment": "Requires that records are entirely byte-arrays. These can go in the worker or connector configuration.",
    "key.converter": "org.apache.kafka.connect.converters.ByteArrayConverter",
    "value.converter": "org.apache.kafka.connect.converters.ByteArrayConverter",

    "__comment": "Setup the SMT",
    "transforms": "AvroSchemaTransfer",

    "transforms.AvroSchemaTransfer.type": "cricket.jmoore.kafka.connect.transforms.SchemaRegistryTransfer",
    "transforms.AvroSchemaTransfer.src.schema.registry.url": "http://schema-registry-1:8081",
    "transforms.AvroSchemaTransfer.dest.schema.registry.url": "http://schema-registry-2:8081"
}
```

## Advanced Configuration

Configuration Parameter | Default | Description
----------------------- | ------- | -----------
**transfer.message.keys** | true | Indicates whether Avro schemas from message keys in source records should be copied to the destination Registry.
**include.message.headers** | true | Indicates whether message headers from source records should be preserved after the transform.
**schema.capacity** | 100 | Capacity of schemas that can be cached in each `CachedSchemaRegistryClient`

## Embedded Schema Registry Client Configuration

Schema Registry Transfer SMT passes some properties prefixed by either `src.` or `dest.` through to its embedded schema registry clients. The prefix identifies which client receives the value and is stripped before the property is passed on.

Properties prefixed by `src.` are passed through to the source consumer's schema registry client. Properties prefixed by `dest.` are passed through to the target producer's schema registry client.

Configuration Parameter | Default | Description
----------------------- | ------- | -----------
(src\|dest).basic.auth.credentials.source | URL | Specify how to pick credentials for Basic Auth header. Supported values are `URL`, `USER_INFO` and `SASL_INHERIT`
(src\|dest).basic.auth.user.info | | Specify credentials for Basic Auth in form of `{username}:{password}` when source is `USER_INFO`

## Subject Renaming

Renaming of a subject can be done with the `RegexRouter` Transform **before** this one.
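
To illustrate why the ordering matters, here is a rough sketch of the rename step in plain Java. It assumes that `RegexRouter` rewrites the topic only when the regex matches the whole topic name, and that the destination subject is derived from the already-renamed topic as `<topic>-value` (the usual topic-based subject naming). Both points are assumptions not stated in this README, and `TopicRenameSketch` is a hypothetical helper, not part of Kafka Connect or this project.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Sketch only: mimics a regex-based topic rename followed by subject derivation. */
final class TopicRenameSketch {

    /** Returns the renamed topic, or the original topic when the regex does not match it fully. */
    static String rename(String topic, String regex, String replacement) {
        Matcher matcher = Pattern.compile(regex).matcher(topic);
        return matcher.matches() ? matcher.replaceFirst(replacement) : topic;
    }

    /** Assumed topic-based naming for the value subject. */
    static String valueSubject(String topic) {
        return topic + "-value";
    }

    public static void main(String[] args) {
        String renamed = rename("orders", "(.*)", "replica.$1");
        // The schema would be registered under the subject of the renamed topic.
        System.out.println(valueSubject(renamed));
    }
}
```

If the rename ran after the schema transfer instead, the schema would be registered under the subject of the original topic name.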
Example Configuration
```properties
transforms=TopicRename,AvroSchemaTransfer

transforms.TopicRename.type=org.apache.kafka.connect.transforms.RegexRouter
transforms.TopicRename.regex=(.*)
transforms.TopicRename.replacement=replica.$1

transforms.AvroSchemaTransfer.type=...
```

[smt]: https://docs.confluent.io/current/connect/concepts.html#connect-transforms
[schema-registry]: https://docs.confluent.io/current/schema-registry/docs/index.html
[wire-format]: https://docs.confluent.io/current/schema-registry/docs/serializer-formatter.html#wire-format