https://github.com/danielepalaia/gemfiregreenplumconnector

A generic gemfire-greenplum JDBC connector using COPY API

gemfire gradle greenplum java

Host: GitHub
URL: https://github.com/danielepalaia/gemfiregreenplumconnector
Owner: DanielePalaia
Created: 2019-10-29T16:55:14.000Z (almost 6 years ago)
Default Branch: master
Last Pushed: 2020-04-24T15:22:20.000Z (over 5 years ago)
Last Synced: 2025-01-05T14:46:46.242Z (10 months ago)
Topics: gemfire, gradle, greenplum, java
Language: Java
Homepage:
Size: 4.13 MB
Stars: 0
Watchers: 2
Forks: 0
Open Issues: 0
Metadata Files:
- Readme: readme.md

# Summary

This project is an example of how to build a simple connector from GemFire to PostgreSQL/Greenplum using GemFire's AsyncEventListener (write-behind) support:
https://gemfire.docs.pivotal.io/98/geode/developing/events/implementing_write_behind_event_handler.html
When an entry in a GemFire region is modified (INSERT, UPDATE, DELETE), the change is propagated to the corresponding PostgreSQL/Greenplum table.
The Greenplum table consists of two fields, id and data: id holds the GemFire key and data holds the GemFire value of a region entry.
The connector is generic. It is configured with the JDBC connection string, the username/passwd, and the namespace.table of the Greenplum table that the rows are ingested into.
It uses COPY to batch rows and insert them into Greenplum, to maximize performance.
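
As a rough illustration of the batching step, the sketch below turns a batch of key/value pairs into the text payload that a `COPY ... FROM STDIN` would consume, one `id|data` line per entry. The class `CopyPayload` and its methods are hypothetical names, not the repository's code; the escaping follows the rules of PostgreSQL's COPY text format.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Hypothetical helper for illustration; not taken from the repository. */
final class CopyPayload {

    /** Escapes one field for the COPY text format (backslash escapes). */
    static String escape(String field, char delim) {
        StringBuilder out = new StringBuilder(field.length() + 8);
        for (int i = 0; i < field.length(); i++) {
            char c = field.charAt(i);
            switch (c) {
                case '\\': out.append("\\\\"); break; // backslash must be doubled
                case '\n': out.append("\\n"); break;  // newline would end the row
                case '\r': out.append("\\r"); break;
                case '\t': out.append("\\t"); break;
                default:
                    if (c == delim) {
                        out.append('\\');             // delimiter inside the data
                    }
                    out.append(c);
            }
        }
        return out.toString();
    }

    /** Builds one "id<delim>data" line per entry, in iteration order. */
    static String toCopyText(Map<String, String> batch, char delim) {
        StringBuilder sb = new StringBuilder();
        for (Map.Entry<String, String> e : batch.entrySet()) {
            sb.append(escape(e.getKey(), delim))
              .append(delim)
              .append(escape(e.getValue(), delim))
              .append('\n');
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        Map<String, String> batch = new LinkedHashMap<>();
        batch.put("first", "{\"name\": \"John\", \"age\": \"31\", \"city\": \"New York\"}");
        batch.put("second", "{\"note\": \"a|b\"}"); // contains the delimiter
        System.out.print(toCopyText(batch, '|'));
    }
}
```

With the PostgreSQL JDBC driver on the classpath, a payload like this could be streamed to the server through `CopyManager.copyIn(String sql, Reader from)`; whether the connector does exactly this is an assumption.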

## Build the project

Build the project with `gradle build`.

## Create the Greenplum database and table to ingest

The JDBC connection string can be passed to the AsyncEventListener when it is defined, but
for the moment the table name is embedded in the code, so you need to create
an example database containing a test table defined as follows. The connector connects as the gpadmin user with no password:

```
create database dashboard;
\c dashboard
create schema rws;
create table rws.test(id text, data json);
```

## How to deploy the project in Geode

The script below shows how to start a Geode system and deploy the code that integrates Geode with Greenplum.

```
start locator --name=locator
start server --name=server1
deploy --dir=/Users/dpalaia/Downloads/GemfireGreenplumConnector/geode-greenplum-listener/build/dependancies
y
deploy --dir=/Users/dpalaia/Downloads/GemfireGreenplumConnector/geode-greenplum-listener/build/libs
y
create async-event-queue --id=jdbc-queue --listener=example.geode.greenplum.GreenplumAsyncEventListener --listener-param=jdbcString#jdbc:postgresql://172.16.125.152:5432/dashboard,username#gpadmin,passwd#,tablename#rws.test,delim#|,rejectlimit#10 --batch-size=3 --batch-time-interval=3000000
create region --name=test --type=PARTITION --async-event-queue-id=jdbc-queue

```

## How to specify input
Inputs are specified with the --listener-param option, where:
- **jdbcString#jdbc:postgresql://172.16.125.152:5432/dashboard** is the connection string, giving the IP address of the Greenplum host and the name of the database to use.
- **username#gpadmin,passwd#** are the credentials used to connect to GPDB (here the password is empty).
- **tablename#rws.test** is the schemaname.tablename to write to, in our case rws.test.
- **delim#|** is the delimiter used by the COPY command, in this case a pipe.
- **rejectlimit#10** is the reject limit option of the COPY command; if the limit is reached, the whole COPY transaction is rejected.
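
To make the parameters concrete, here is a minimal sketch of how a listener might read them from a `java.util.Properties` object and turn them into a COPY statement. `ListenerConfig`, its default values, and the exact COPY options (`LOG ERRORS SEGMENT REJECT LIMIT ... ROWS`) are assumptions for illustration, not the repository's code. gfsh splits the `name#value` pairs itself; the `parse` method only imitates that so the example can run standalone.

```java
import java.util.Properties;

/** Hypothetical holder for the listener parameters; not the repository's code. */
final class ListenerConfig {
    final String jdbcString;
    final String username;
    final String passwd;
    final String tableName;
    final char delim;
    final int rejectLimit;

    private ListenerConfig(String jdbcString, String username, String passwd,
                           String tableName, char delim, int rejectLimit) {
        this.jdbcString = jdbcString;
        this.username = username;
        this.passwd = passwd;
        this.tableName = tableName;
        this.delim = delim;
        this.rejectLimit = rejectLimit;
    }

    /** Imitates gfsh's "name#value,name#value" syntax for demonstration only. */
    static Properties parse(String listenerParam) {
        Properties props = new Properties();
        for (String pair : listenerParam.split(",")) {
            int hash = pair.indexOf('#');
            String name = hash < 0 ? pair : pair.substring(0, hash);
            String value = hash < 0 ? "" : pair.substring(hash + 1);
            props.setProperty(name, value);
        }
        return props;
    }

    /** Validates the properties; the defaults here are assumed, not documented. */
    static ListenerConfig fromProperties(Properties p) {
        String jdbc = p.getProperty("jdbcString");
        String table = p.getProperty("tablename");
        String d = p.getProperty("delim", "|");
        if (jdbc == null || table == null) {
            throw new IllegalArgumentException("jdbcString and tablename are required");
        }
        // The table name is concatenated into SQL, so only accept schema.table identifiers.
        if (!table.matches("[A-Za-z_][A-Za-z0-9_]*\\.[A-Za-z_][A-Za-z0-9_]*")) {
            throw new IllegalArgumentException("expected schemaname.tablename: " + table);
        }
        if (d.length() != 1 || d.charAt(0) == '\'' || d.charAt(0) == '\\') {
            throw new IllegalArgumentException("delim must be one plain character");
        }
        int limit = Integer.parseInt(p.getProperty("rejectlimit", "10"));
        return new ListenerConfig(jdbc, p.getProperty("username", "gpadmin"),
                p.getProperty("passwd", ""), table, d.charAt(0), limit);
    }

    /** Builds the COPY statement (assumed form of the Greenplum options). */
    String copySql() {
        return "COPY " + tableName + " FROM STDIN WITH DELIMITER '" + delim
                + "' LOG ERRORS SEGMENT REJECT LIMIT " + rejectLimit + " ROWS";
    }

    public static void main(String[] args) {
        ListenerConfig cfg = fromProperties(parse(
                "jdbcString#jdbc:postgresql://172.16.125.152:5432/dashboard,"
                + "username#gpadmin,passwd#,tablename#rws.test,delim#|,rejectlimit#10"));
        System.out.println(cfg.copySql());
    }
}
```

Validating the table name and delimiter before building the statement matters because both end up inside the SQL text rather than in bind parameters.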

## Run some operations on Geode with JSON and see them propagated to Greenplum
```
Do some puts to create items:

put --region=test --key='first' --value='{"name": "John", "age": "31", "city": "New York"}'
put --region='test' --key='second' --value='{"name": "John", "age": "31", "city": "New York"}'
put --region='test' --key='third' --value='{"name": "John", "age": "31", "city": "New York"}'

Do an update:
put --region='test' --key='first' --value='{"name": "John", "age": "31", "city": "New YorkUpdated"}'

Do a delete:
remove --region='test' --key='first'
```
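
Because events reach the listener in batches, several operations on the same key (such as the put, update and remove of `first` above) can land in one batch. The sketch below shows one possible way to reduce such a batch to its final state per key before writing to the database. It is a plain-Java model with hypothetical types (`BatchCoalescer`, `Event`, `Op`), it does not use the Geode API, and it is not a description of how this connector actually handles updates and deletes.

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.Optional;

/** Hypothetical model of batch coalescing; not the repository's logic. */
final class BatchCoalescer {

    enum Op { PUT, REMOVE }

    /** Simplified stand-in for an event: operation, key, and JSON value. */
    static final class Event {
        final Op op;
        final String key;
        final String value; // null for REMOVE

        Event(Op op, String key, String value) {
            this.op = op;
            this.key = key;
            this.value = value;
        }
    }

    /**
     * Keeps only the last operation per key.
     * A present value means "write this row"; an empty Optional means "delete this id".
     */
    static Map<String, Optional<String>> coalesce(List<Event> batch) {
        Map<String, Optional<String>> last = new LinkedHashMap<>();
        for (Event e : batch) {
            last.remove(e.key); // re-insert so ordering follows the latest event
            last.put(e.key, e.op == Op.REMOVE
                    ? Optional.<String>empty()
                    : Optional.of(e.value));
        }
        return last;
    }

    public static void main(String[] args) {
        List<Event> batch = Arrays.asList(
                new Event(Op.PUT, "first", "{\"city\": \"New York\"}"),
                new Event(Op.PUT, "first", "{\"city\": \"New YorkUpdated\"}"),
                new Event(Op.REMOVE, "first", null),
                new Event(Op.PUT, "second", "{\"city\": \"New York\"}"));
        coalesce(batch).forEach((key, value) ->
                System.out.println(key + " -> " + value.orElse("<delete>")));
    }
}
```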

## How to detect copy errors

Use `SELECT * FROM gp_read_error_log('rws.test');` to read the errors detected by the COPY command and the rows that were skipped.
