{"id":20652637,"url":"https://github.com/cbrianpace/kafka-ora2pg","last_synced_at":"2025-04-18T21:37:35.977Z","repository":{"id":175139653,"uuid":"543264640","full_name":"cbrianpace/kafka-ora2pg","owner":"cbrianpace","description":"Database Replication between Postgres and Oracle","archived":false,"fork":false,"pushed_at":"2024-04-19T13:23:52.000Z","size":56397,"stargazers_count":9,"open_issues_count":0,"forks_count":2,"subscribers_count":1,"default_branch":"master","last_synced_at":"2025-03-29T06:33:48.807Z","etag":null,"topics":["debezium","oracle","oracle-database","postgres","postgresql"],"latest_commit_sha":null,"homepage":"","language":"Shell","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":null,"status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/cbrianpace.png","metadata":{"files":{"readme":"readme.md","changelog":null,"contributing":null,"funding":null,"license":null,"code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null}},"created_at":"2022-09-29T18:23:52.000Z","updated_at":"2024-12-11T11:56:34.000Z","dependencies_parsed_at":"2024-11-16T17:46:36.571Z","dependency_job_id":null,"html_url":"https://github.com/cbrianpace/kafka-ora2pg","commit_stats":null,"previous_names":["cbrianpace/kafka-ora2pg"],"tags_count":0,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/cbrianpace%2Fkafka-ora2pg","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/cbrianpace%2Fkafka-ora2pg/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/cbrianpace%2Fkafka-ora2pg/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/cbrianpace%2Fkafka-ora2pg/manifests","owner_url":"https://repos.ecosyste.ms/api/v1
/hosts/GitHub/owners/cbrianpace","download_url":"https://codeload.github.com/cbrianpace/kafka-ora2pg/tar.gz/refs/heads/master","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":249549936,"owners_count":21289577,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["debezium","oracle","oracle-database","postgres","postgresql"],"created_at":"2024-11-16T17:36:16.027Z","updated_at":"2025-04-18T21:37:35.948Z","avatar_url":"https://github.com/cbrianpace.png","language":"Shell","funding_links":[],"categories":[],"sub_categories":[],"readme":"# Database Replication with Debezium and Kafka\n\nIt is common for systems other than the 'source' database to require an up-to-date copy of its data.  This tutorial will step through setting up this replication from scratch.  \n\nThere are two sets of data that will be replicated.  The first is HR-related tables (employees, departments, jobs, locations) which will be replicated from Oracle to Postgres.  The second is baseball-related tables (game, team, venue) which will be replicated from Postgres to Oracle.\n\n## Components Overview\n\n### Volumes\n\nSeveral volumes need to be presented to various containers to persist data.  Be sure to review and update the volume references in docker-compose.yaml.\n\n| Container    | Target Mount         |\n|--------------|----------------------|\n| oracle       | /opt/oracle/oradata  |\n| postgres     | /pgdata              |\n\nTo prepare mounts, create local directories and update the docker-compose file with the correct source mappings.  
The following directories are used for this example:\n\n```shell\nmkdir -p /app/docker/oracle/oradata\nmkdir -p /app/docker/oracle/diag/rdbms\nmkdir -p /app/docker/postgres/pgdata\n```\n\n### Ports\n\nSeveral ports are exposed to interact with the deployed containers and software deployments.  Note that Postgres is using the non-default port setting of 5433 to avoid conflicts with any Postgres clusters that are already running.\n\n| Container    | Port(s)         |\n|--------------|-----------------|\n| oracle       | 1521            |\n| postgres     | 5433            |\n| zookeeper    | 2181,2888,3888  |\n| kafka        | 9092,29092,9999 |\n| connect      | 8083,9080,9012  |\n| prometheus   | 9090            |\n| grafana      | 3001            |\n\n### Oracle\n\nOracle Database Express Edition is used in this tutorial.  Prior to using this tutorial, be sure that you have read and understand the licenses for Express Edition.\n\nThe container used is container-registry.oracle.com/database/express:21.3.0-xe.  When this container spins up for the first time, it could take several minutes to create the new instance.  If deploying on Linux, modify the oracle/Dockerfile and comment out the COPY lines for runOracle.sh, checkDBStatus.sh, and createDB.sh.  These custom scripts are required to work around issues on Mac (and possibly Windows).\n\n## Setup\n\nAll of the steps assume the current directory is debezium-poc.\n\n### Oracle Instant Client\n\nBefore performing the build, the Kafka Connect container needs the Oracle Instant Client.  Refer to the Oracle Instant Client download page to download the necessary packages and review the license agreement.  
Download the Basic Package and extract the contents into the connect/instantclient directory.\n\nHere is an example of the process for Mac OS:\n\n```shell\ncd connect\nwget https://download.oracle.com/otn_software/mac/instantclient/198000/instantclient-basic-macos.x64-19.8.0.0.0dbru.zip \nunzip instantclient-basic-macos.x64-19.8.0.0.0dbru.zip\nmv instantclient_19_8 instantclient\n```\n\n### Docker Compose\n\nReview the docker compose file and make any necessary adjustments for volumes.  Note that if any port modifications are made there may be requirements to modify the connector json files.\n\nDeploy the environment using docker-compose.\n\n```shell\ncd debezium-poc\ndocker-compose -f docker-compose.yaml up --build -d\n```\n\n### Oracle Setup\n\nOnce the Oracle container is fully up and running (look for the 'Database is Ready' banner in the container log), the database needs to be modified to enable archive log mode, supplemental logging, etc.  Use the following steps to exec into the Oracle container and perform these setups.  The database will restart several times.\n\n```shell\ndocker exec -it oracle /bin/bash\nsu - oracle\n# Set ORACLE_SID to XE using oraenv\n. 
oraenv\n./setup-oracle.sh\nsqlplus sys/welcome1@localhost:1521/xepdb1 as sysdba @sql/hr-ora.sql\nsqlplus sys/welcome1@localhost:1521/xepdb1 as sysdba @sql/mlb-ora.sql \n```\n\nThe container may also report a few ORA-600 errors, which can be ignored because they are related to the container environment itself and do not affect the functionality needed for this tutorial.\n\n### Postgres Setup\n\nTo set up the Postgres database, execute the commands below.\n\n```shell\ndocker exec -it postgres /bin/bash\npsql -p 5433 -f /var/lib/pgsql/sql/hr-pg.sql\npsql -p 5433 -f /var/lib/pgsql/sql/mlb-pg.sql\n```\n\n### Register the Connectors\n\nUsing curl, register the connectors with Kafka-Connect.\n\n```shell\ncurl -i -X POST -H \"Accept:application/json\" -H  \"Content-Type:application/json\" http://localhost:8083/connectors/ -d @connectors/oracle-connector.json\ncurl -i -X POST -H \"Accept:application/json\" -H  \"Content-Type:application/json\" http://localhost:8083/connectors/ -d @connectors/postgres-connector.json\ncurl -i -X POST -H \"Accept:application/json\" -H  \"Content-Type:application/json\" http://localhost:8083/connectors/ -d @connectors/postgres-sink.json\ncurl -i -X POST -H \"Accept:application/json\" -H  \"Content-Type:application/json\" http://localhost:8083/connectors/ -d @connectors/oracle-sink.json\n```\n\n### Check Kafka Connect Logs\n\nTo ensure that everything is working from the Kafka-Connect perspective, view the logs from the kafka-connect container.\n\n```shell\ndocker logs kafka-connect -f\n```\n\n## Test Replication\n\n### Update Oracle\n\nExec into the Oracle container and start sqlplus.\n\n```shell\ndocker exec -it oracle /bin/bash\nsu - oracle\n# Set ORACLE_SID to XE via oraenv\n. 
oraenv\nsqlplus sys/welcome1@localhost:1521/xepdb1 as sysdba\n```\n\nFrom within the sqlplus session, execute the following SQL statements.\n\n```sql\nINSERT INTO hr.employees (employee_id, first_name, last_name, hire_date) VALUES (200, 'George', 'Washington', sysdate);\nSELECT SUM(salary) FROM hr.employees;\nUPDATE hr.employees SET salary=salary*1.1;\nCOMMIT;\nSELECT SUM(salary) FROM hr.employees;\n```\n\n### Verify Postgres\n\nWith the changes made to the employees table in Oracle, verify the updates in Postgres.  First, start psql using docker exec.\n\n```shell\ndocker exec -it postgres psql -p 5433 -d hr\n```\n\nRun the following SQL to verify replication.\n\n```sql\nSELECT * FROM employees WHERE last_name='Washington';\nSELECT SUM(salary) FROM employees;\n```\n\n### Update Postgres\n\nNow let us test the reverse replication using the sports tables.  Connect to Postgres using psql and the sport database.\n\n```shell\ndocker exec -it postgres psql -p 5433 -d sport\n```\n\nRun the following SQL statements to update data in Postgres.\n\n```sql\nINSERT INTO venue (venue_id, venue_name, city, country) VALUES (900,'Crunchy Park', 'Jacksonville, FL', 'USA');\nUPDATE venue SET venue_name='Pace Park' WHERE venue_id=136;\n```\n\n### Verify Oracle\n\nExec into the Oracle container and start sqlplus.\n\n```shell\ndocker exec -it oracle /bin/bash\nsqlplus sys/welcome1@localhost:1521/xepdb1 as sysdba\n```\n\nRun the following SQL statement to verify the updates in Oracle.\n\n```sql\nSELECT * FROM sport.venue WHERE venue_id IN (900,136);\n```\n\n## Conclusion\n\nDebezium helps bridge the data gap by performing change data capture in both Oracle and Postgres and publishing the resulting change events to Kafka.  The Oracle capture leverages LogMiner, which does have some scalability challenges.  
On the Postgres side, Debezium leverages the native logical replication capabilities and scales better.\n\nLastly, Prometheus and Grafana are deployed with built-in dashboards and collection of the metrics published by Debezium.  Be sure to check those out by accessing Grafana at http://localhost:3001.  The default user/password for Grafana is admin/admin.\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fcbrianpace%2Fkafka-ora2pg","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fcbrianpace%2Fkafka-ora2pg","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fcbrianpace%2Fkafka-ora2pg/lists"}