https://github.com/siddhi-io/siddhi-io-cdc

Extension which consumes CDC events

Siddhi IO CDC
===================

[![Jenkins Build Status](https://wso2.org/jenkins/job/siddhi/job/siddhi-io-cdc/badge/icon)](https://wso2.org/jenkins/job/siddhi/job/siddhi-io-cdc/)
[![GitHub Release](https://img.shields.io/github/release/siddhi-io/siddhi-io-cdc.svg)](https://github.com/siddhi-io/siddhi-io-cdc/releases)
[![GitHub Release Date](https://img.shields.io/github/release-date/siddhi-io/siddhi-io-cdc.svg)](https://github.com/siddhi-io/siddhi-io-cdc/releases)
[![GitHub Open Issues](https://img.shields.io/github/issues-raw/siddhi-io/siddhi-io-cdc.svg)](https://github.com/siddhi-io/siddhi-io-cdc/issues)
[![GitHub Last Commit](https://img.shields.io/github/last-commit/siddhi-io/siddhi-io-cdc.svg)](https://github.com/siddhi-io/siddhi-io-cdc/commits/master)
[![License](https://img.shields.io/badge/License-Apache%202.0-blue.svg)](https://opensource.org/licenses/Apache-2.0)

The **siddhi-io-cdc extension** is an extension to Siddhi that captures change data from databases such as MySQL, MS SQL, PostgreSQL, H2 and Oracle.

For information on Siddhi and its features, refer to the Siddhi documentation.

## Download

* Versions 3.x and above with group id `io.siddhi.extension.*` from here.
* Versions 2.x and lower with group id `org.wso2.extension.siddhi.*` from here.

## Latest API Docs

The latest API docs are for version 2.0.15.

## Features

* cdc *(Source)*



The CDC source receives events when change events (i.e., INSERT, UPDATE, DELETE) are triggered for a database table. Events are received in the 'key-value' format.

CDC can be performed in two modes: listening mode and polling mode.

In polling mode, the data source is polled periodically to capture changes. The polling period can be configured.
In polling mode, you can only capture INSERT and UPDATE changes.

In listening mode, the source keeps listening to the change log of the database and notifies you when a change has taken place. Unlike polling mode, you are notified about the change immediately.
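To illustrate why polling mode sees INSERT and UPDATE changes but not DELETE, here is a minimal sketch of polling against an in-memory SQLite table. It is not the extension's implementation: the `poll_changes` helper, the `students` table and the `last_updated` polling column are all assumptions made up for this example.

```python
import sqlite3

def poll_changes(conn, table, polling_column, last_seen):
    """Return rows whose polling column is greater than last_seen,
    together with the new high-water mark (hypothetical helper)."""
    cur = conn.execute(
        f"SELECT * FROM {table} WHERE {polling_column} > ? ORDER BY {polling_column}",
        (last_seen,),
    )
    columns = [d[0] for d in cur.description]
    rows = [dict(zip(columns, r)) for r in cur.fetchall()]
    if rows:
        last_seen = rows[-1][polling_column]
    return rows, last_seen

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE students (id INTEGER PRIMARY KEY, name TEXT, last_updated INTEGER)"
)

# An INSERT sets the polling column, so the next poll picks it up.
conn.execute("INSERT INTO students VALUES (1, 'Ann', 100)")
first_poll, mark = poll_changes(conn, "students", "last_updated", 0)

# An UPDATE that advances the polling column is picked up as well.
conn.execute("UPDATE students SET name = 'Anne', last_updated = 200 WHERE id = 1")
second_poll, mark = poll_changes(conn, "students", "last_updated", mark)

# A DELETE leaves no row behind to select, so the poll returns nothing.
conn.execute("DELETE FROM students WHERE id = 1")
third_poll, mark = poll_changes(conn, "students", "last_updated", mark)
```

In this sketch an update is only visible if it also advances the polling column, which is why a timestamp column suits tables whose rows are modified after insertion.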

The key values of the map of a CDC change event are as follows.

For 'listening' mode:
    For insert: Keys are specified as columns of the table.
    For delete: Keys are the table column names prefixed with 'before_'. E.g., the key 'before_X' holds the value of the column named 'X' before the row was deleted.
    For update: Keys are the table column names, with and without the 'before_' prefix. E.g., 'before_X' holds the value of column 'X' before the update and 'X' holds the value after it.
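The listening-mode key layout can be sketched as a small function that builds the key-value map for each operation. This is an illustration only, not the extension's code: `change_event_keys` is a hypothetical helper, and it assumes that an update event carries both the 'before_'-prefixed keys (old values) and the plain column keys (new values).

```python
def change_event_keys(operation, before=None, after=None):
    """Build an illustrative key-value map for one change event.

    before/after are dicts of column name -> value for the row
    before and after the change (hypothetical inputs).
    """
    before = before or {}
    after = after or {}
    # Values from before the change are exposed under 'before_<column>'.
    prefixed = {f"before_{col}": val for col, val in before.items()}
    if operation == "insert":
        return dict(after)
    if operation == "delete":
        return prefixed
    if operation == "update":
        # Assumption: both old and new values are present in an update event.
        return {**prefixed, **after}
    raise ValueError(f"unsupported operation: {operation}")

insert_event = change_event_keys("insert", after={"id": 1, "name": "Ann"})
delete_event = change_event_keys("delete", before={"id": 1, "name": "Ann"})
update_event = change_event_keys(
    "update",
    before={"id": 1, "name": "Ann"},
    after={"id": 1, "name": "Anne"},
)
```

A stream that consumes delete events would therefore map attributes to keys such as 'before_id' and 'before_name' rather than 'id' and 'name'.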

For 'polling' mode: Keys are specified as the columns of the table.

To connect to the database table and receive CDC events, the url, username, password and driverClassName (in polling mode) can be provided in the deployment.yaml file under the siddhi namespace, as below.



    siddhi:
      extensions:
        -
          extension:
            name: 'cdc'
            namespace: 'source'
            properties:
              url: jdbc:sqlserver://localhost:1433;databaseName=CDC_DATA_STORE
              password: <password>
              username: <username>
              driverClassName: com.microsoft.sqlserver.jdbc.SQLServerDriver



***Preparations required for working with Oracle Databases in listening mode***

Using the extension on Windows, Mac OS X and AIX is fairly straightforward. To achieve the required behaviour, follow the steps given below.

  - Download the version of the Oracle Instant Client that is compatible with your database version from [here](https://www.oracle.com/database/technologies/instant-client/downloads.html) and extract it.
  - Set the environment variable LD_LIBRARY_PATH to the location of the extracted Instant Client, as shown below.
  



export LD_LIBRARY_PATH=<path to the instant client location>


  - The downloaded instantclient folder contains two jars, xstreams.jar and ojdbc<version>.jar. Convert them to OSGi bundles using the tools provided in `<distribution>/bin`: use spi-provider.sh|bat for ojdbc.jar and jni-provider.sh for xstreams.jar, as shown below. (Note: converting xstreams.jar this way applies only to Linux environments. On other operating systems this step is not required, and converting it with the jartobundle.sh tool is enough.)
  



./jni-provider.sh <input-jar> <destination> <comma separated native library names>


  Once the ojdbc and xstreams jars are converted to OSGi bundles, copy the generated jars to `<distribution>/lib`. Currently, siddhi-io-cdc supports only Oracle database versions 12 and above.

***Configurations for PostgreSQL***

When using listening mode with PostgreSQL, the following properties have to be configured to create the connection.

  - ***slot.name***: (default value = debezium) In PostgreSQL, only one connection can be created from a single slot, so a custom slot.name must be provided to create multiple connections.
  - ***plugin.name***: (default value = decoderbufs) The name of the logical decoding output plugin that the database is configured with. Supported values are decoderbufs, pgoutput and wal2json.
  - ***table.name***: The table name must be provided as `<schema_name>.<table_name>`, for example `public.customer`.
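As a rough pre-flight check for the PostgreSQL properties above, the sketch below flags slot names shared by more than one source and splits a table name into its schema and table parts. Both helpers and the dictionary layout of `sources` are hypothetical and not part of the extension. The only values taken from this document are the default slot name `debezium` and the `<schema_name>.<table_name>` format.

```python
from collections import Counter

DEFAULT_SLOT = "debezium"  # default slot.name stated in this document

def duplicate_slots(sources):
    """Return slot names used by more than one source definition."""
    counts = Counter(src.get("slot.name", DEFAULT_SLOT) for src in sources)
    return sorted(name for name, n in counts.items() if n > 1)

def split_table_name(table_name):
    """Split '<schema_name>.<table_name>' into (schema, table)."""
    schema, sep, table = table_name.partition(".")
    if not sep or not schema or not table:
        raise ValueError(
            f"expected <schema_name>.<table_name>, got {table_name!r}"
        )
    return schema, table

# Hypothetical source definitions, represented as plain dicts.
sources = [
    {"table.name": "public.customer"},                          # uses the default slot
    {"table.name": "public.orders"},                            # also the default: conflict
    {"table.name": "public.invoice", "slot.name": "invoices"},  # custom slot
]

conflicts = duplicate_slots(sources)
schema, table = split_table_name("public.customer")
```

Here `conflicts` contains the default slot name, which indicates that one of the first two sources needs a custom slot.name.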


See parameter: mode for supported databases and change events.


## Dependencies
A JDBC connector jar must be added to the runtime. Download the JDBC connector jar for the database type in use.

To identify the required JDBC connector jar, refer to the Debezium release documentation.

In addition, some prerequisites must be met depending on the CDC mode used, as described below.

**Default mode (Listening mode):**

Currently MySQL, PostgreSQL and SQLServer are supported in Listening Mode.
To capture the change events, databases have to be configured as shown below.

* MySQL - https://debezium.io/documentation/reference/connectors/mysql.html#setup-the-mysql-server
* PostgreSQL - https://debezium.io/docs/connectors/postgresql/#setting-up-PostgreSQL
* SQLServer - https://debezium.io/docs/connectors/sqlserver/#setting-up-sqlserver

**Polling mode:**

* The table from which change data is captured must have a polling column. An auto-increment column or a timestamp column can be used.

Please see API docs for more details about change data capturing modes.

## Installation

To install this extension on various Siddhi execution environments, refer to the Siddhi documentation section on adding extensions.

## Running Integration tests in docker containers (Optional)

The CDC functionality is tested with a Docker-based integration test framework.
The test framework initializes a Docker container with the required configuration before executing the test suite.

**Start integration tests**

1. Install and run docker

2. To run the integration tests, navigate to the siddhi-io-cdc/ directory and issue the following commands.

* H2 default:

mvn clean install

* MySQL 5.7:

mvn verify -P local-mysql -Dskip.surefire.test=true

* Postgres 9.6:

mvn verify -P local-postgres -Dskip.surefire.test=true

* MSSQL:

mvn verify -P local-mssql -Dskip.surefire.test=true

* Oracle 11.2.0.2-xe:

mvn verify -P local-oracle -Dskip.surefire.test=true

## Support and Contribution

* We encourage users to ask questions and get support via StackOverflow. Make sure to add the `siddhi` tag to your question for a better response.

* If you find any issues related to the extension, please report them on the issue tracker.

* For production support and other contribution-related information, refer to the Siddhi Community documentation.