{"id":24475280,"url":"https://github.com/bdbao/kafka-vm","last_synced_at":"2026-04-16T07:31:22.771Z","repository":{"id":259117928,"uuid":"876314288","full_name":"bdbao/Kafka-VM","owner":"bdbao","description":"This project demonstrates a basic Kafka implementation: using the kafka-python library via Ubuntu virtual machine; and Change Data Capture (CDC) between 2 DBMS via Docker.","archived":false,"fork":false,"pushed_at":"2024-11-03T11:51:21.000Z","size":28309,"stargazers_count":1,"open_issues_count":0,"forks_count":0,"subscribers_count":1,"default_branch":"main","last_synced_at":"2025-01-21T09:14:15.042Z","etag":null,"topics":["apache-kafka","data-streaming","docker","mysql","postgresql","python"],"latest_commit_sha":null,"homepage":"","language":"Makefile","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":null,"status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/bdbao.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":null,"code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null}},"created_at":"2024-10-21T18:58:14.000Z","updated_at":"2024-11-06T15:05:36.000Z","dependencies_parsed_at":"2024-10-22T22:42:36.238Z","dependency_job_id":null,"html_url":"https://github.com/bdbao/Kafka-VM","commit_stats":null,"previous_names":["bdbao/kafka-vm"],"tags_count":0,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/bdbao%2FKafka-VM","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/bdbao%2FKafka-VM/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/bdbao%2FKafka-VM/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/bdbao%2FKafka-VM/manif
ests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/bdbao","download_url":"https://codeload.github.com/bdbao/Kafka-VM/tar.gz/refs/heads/main","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":243602746,"owners_count":20317700,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["apache-kafka","data-streaming","docker","mysql","postgresql","python"],"created_at":"2025-01-21T09:14:18.570Z","updated_at":"2026-04-16T07:31:22.741Z","avatar_url":"https://github.com/bdbao.png","language":"Makefile","funding_links":[],"categories":[],"sub_categories":[],"readme":"# Kafka Setup on Virtual Machine\n---\n# Demo 1: With `kafka-python` library on Virtual Ubuntu machine\n## Quick start\n```bash\nbrew install --cask multipass\nVM_NAME=\"kafka-vm\"\nmultipass launch --name \"$VM_NAME\" --disk 6G --mem 2.5G\nmultipass shell \"$VM_NAME\"\n```\nNow, switch to **Ubuntu shell**:\n```bash\nsudo adduser kafka # e.g: password is 1234\nsudo adduser kafka sudo\nsu -l kafka\n\ncd ~ \u0026\u0026 git clone https://github.com/bdbao/Kafka-VM\ncurl \"https://downloads.apache.org/kafka/3.8.0/kafka_2.13-3.8.0.tgz\" -o kafka.tgz\nmkdir kafka \u0026\u0026 cd kafka\ntar -xvzf ~/kafka.tgz --strip 1\ncp ~/Kafka-VM/config/server.properties ./config\ncp -r ~/Kafka-VM/scripts .\nsudo cp ~/Kafka-VM/system/* /etc/systemd/system\n\n# get python 3.8\nsudo add-apt-repository ppa:deadsnakes/ppa\nsudo apt update\nsudo apt install python3.8 -y\npython3.8 --version\n\n# Create Virtual Environment\nsudo apt install python3.8-venv -y\npython3.8 -m venv myenv \u0026\u0026 
source myenv/bin/activate\n# rm -rf ~/kafka/myenv\n\npip install kafka-python\nsudo apt install openjdk-11-jre-headless -y\n\nsudo systemctl enable zookeeper\nsudo systemctl start zookeeper\nsudo systemctl enable kafka\nsudo systemctl start kafka\nsudo systemctl status zookeeper\nsudo systemctl status kafka\n\npython3.8 scripts/consumer.py\n```\nOpen another terminal:\n```bash\nsu -l kafka # pass: 1234\ncd kafka \u0026\u0026 source myenv/bin/activate\npython3.8 scripts/producer.py --mess \"This is a message\"\n```\nThen we can see update in the first terminal. This is **DONE**!\n\n## Build from scratch\n```bash\nmultipass launch --name kafka-vm --disk 6G --mem 2.5G\nmultipass shell kafka-vm\n```\nMove to Ubuntu shell\n```bash\nsudo adduser kafka\nsudo adduser kafka sudo\nsu -l kafka # pass: 1234\n\nsudo apt update\nsudo apt install openjdk-11-jre-headless -y\njava --version\n\nmkdir ~/Downloads\ncurl \"https://downloads.apache.org/kafka/3.8.0/kafka_2.13-3.8.0.tgz\" -o ~/Downloads/kafka.tgz\nmkdir ~/kafka \u0026\u0026 cd ~/kafka\ntar -xvzf ~/Downloads/kafka.tgz --strip 1\nreadlink -f $(which java)\n```\nModify the file `nano ~/kafka/config/server.properties`:\n```\nlisteners=PLAINTEXT://localhost:9092 # uncomment\nadvertised.listeners=PLAINTEXT://localhost:9092 # uncomment, ip addr show\nlog.dirs=/home/kafka/logs # change dir\ndelete.topic.enable = true # add at EOF\n```\nAdd to file `sudo nano /etc/systemd/system/kafka.service`:\n```\n[Unit]\nRequires=zookeeper.service\nAfter=zookeeper.service\n\n[Service]\nType=simple\nUser=kafka\nEnvironment=\"JAVA_HOME=/usr/lib/jvm/java-11-openjdk-arm64\"\nExecStart=/bin/sh -c '/home/kafka/kafka/bin/kafka-server-start.sh /home/kafka/kafka/config/server.properties \u003e /home/kafka/kafka/kafka.log 2\u003e\u00261'\nExecStop=/home/kafka/kafka/bin/kafka-server-stop.sh\nRestart=on-abnormal\n\n[Install]\nWantedBy=multi-user.target\n```\nAdd to file `sudo nano 
/etc/systemd/system/zookeeper.service`:\n```\n[Unit]\nRequires=network.target remote-fs.target\nAfter=network.target remote-fs.target\n\n[Service]\nType=simple\nUser=kafka\nExecStart=/home/kafka/kafka/bin/zookeeper-server-start.sh /home/kafka/kafka/config/zookeeper.properties\nExecStop=/home/kafka/kafka/bin/zookeeper-server-stop.sh\nRestart=on-abnormal\n\n[Install]\nWantedBy=multi-user.target\n```\n```bash\nsudo systemctl enable zookeeper\nsudo systemctl start zookeeper\nsudo systemctl status zookeeper\nsudo systemctl enable kafka\nsudo systemctl start kafka\nsudo systemctl status kafka\n\n# get python 3.8:\nsudo add-apt-repository ppa:deadsnakes/ppa\nsudo apt update\nsudo apt install python3.8\npython3.8 --version\n\n# Create Virtual Environment\nsudo apt install python3.8-venv\npython3.8 -m venv kafkaenv \u0026\u0026 source kafkaenv/bin/activate \u0026\u0026 pip install --upgrade pip \u0026\u0026 pip install kafka-python \u0026\u0026 deactivate\n# rm -rf ~/kafka/kafkaenv # delete venv\n```\nOpen 2 terminals for these 2 commands:\n```bash\npython3.8 consumer.py\npython3.8 producer.py\n```\n\n## [Optional] Another demo ([DigitalOcean](https://www.digitalocean.com/community/tutorials/how-to-install-apache-kafka-on-ubuntu-20-04)):\nSomething more:\n```bash\n# install java, mem \u003e= 2GB\nsudo apt update\nsudo apt install openjdk-11-jre-headless -y\njava --version\n\n# For kafka 3.x\ncurl \"https://downloads.apache.org/kafka/3.8.0/kafka_2.13-3.8.0.tgz\" -o ~/Downloads/kafka.tgz\n\n~/kafka/bin/kafka-topics.sh --create --bootstrap-server localhost:9092 --replication-factor 1 --partitions 1 --topic TutorialTopic\n~/kafka/bin/kafka-topics.sh --delete --bootstrap-server localhost:9092 --topic TutorialTopic # delete topic\n```\n\n# Demo 2: Streaming between 2 DBMS using Kafka on Docker\n![demo2 info](images/demo2.png)\n## Quick start 1 (Host database on Docker)\n```bash\ngit clone https://github.com/bdbao/Kafka-VM\ncd Kafka-VM\n\ndocker compose -f 
database_docker/docker-compose.yml up -d\ndocker compose up -d\n\nmake source\nmake sink\n\nmysql -h 127.0.0.1 -P 3306 -u user_kafka -p db_kafka # pass: Admin@123\n  CREATE TABLE db_kafka.E00Status (id INT AUTO_INCREMENT PRIMARY KEY, status VARCHAR(50) NOT NULL, created_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP);\n  \npsql -h localhost -U user_kafka -d db_kafka # pass: 1234\n  CREATE TABLE \"E00Status\" (id SERIAL PRIMARY KEY, status VARCHAR(50) NOT NULL, created_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP);\n  INSERT INTO \"E00Status\" (status) VALUES ('Active'), ('Inactive'), ('Pending'), ('Completed'), ('Failed');\n  INSERT INTO \"E00Status\" (status) VALUES ('New Status');\n  UPDATE \"E00Status\" SET status = 'Archived' WHERE id = 2;\n```\n- Open: http://localhost:9000 to access the Kafka UI.\\\n  Open **DBeaver** for viewing databases.\n\n```bash\nmake clean # delete connections on Kafka\ndocker compose -f database_docker/docker-compose.yml down -v\ndocker compose down\n```\nThis is **DONE**!\n\n## Quick start 2 (Host database on local)\n```bash\ngit clone https://github.com/bdbao/Kafka-VM\ncd Kafka-VM\n\nbrew install postgresql@16 # or: postgresql\nbrew install mysql\nmake startdb\n\npsql -U postgres\n  # add new user\n  CREATE USER user_kafka WITH PASSWORD '1234';\n  ALTER USER user_kafka WITH SUPERUSER; # (optional)\n  ALTER USER user_kafka WITH REPLICATION;\n  # list all users\n  \\du\n  \\q # quit\n\npsql -h localhost -U user_kafka -d postgres\n  SHOW config_file;\n  # Go to file, change `wal_level = logical`, uncomment this line\n  # brew services restart postgresql@16\n  SHOW wal_level;\n\n  CREATE DATABASE db_kafka;\n  GRANT ALL PRIVILEGES ON DATABASE db_kafka TO user_kafka;\n  \\l # list all db\n  \n  \\c db_kafka\n\n  CREATE TABLE \"E00Status\" (\n      id SERIAL PRIMARY KEY,\n      status VARCHAR(50) NOT NULL,\n      created_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP\n  );\n  \\dt # list all tables in db\n\nmysql -u root\n  CREATE USER 'user_kafka'@'localhost' 
IDENTIFIED BY 'Admin@123'; # (use % for any host)\n  GRANT ALL ON *.* TO 'user_kafka'@'localhost' WITH GRANT OPTION;\n  FLUSH PRIVILEGES; # apply changes\n  SHOW GRANTS FOR 'user_kafka'@'localhost';\n  # list all users\n  SELECT User, Host FROM mysql.user;\n\n  \\q # quit\n\nmysql -h localhost -P 3306 -u user_kafka -p # Pass: Admin@123\n  CREATE DATABASE db_kafka;\n  USE db_kafka;\n\n  CREATE TABLE db_kafka.E00Status (id INT AUTO_INCREMENT PRIMARY KEY, status VARCHAR(50) NOT NULL, created_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP);\n  SHOW TABLES;\n\ndocker compose up -d\nmake source\nmake sink\n```\n- Open: http://localhost:9000 to access the Kafka UI.\\\n  Open **DBeaver** for viewing databases.\n\nModify the source database to trigger Change Data Capture (CDC):\n```sql\n-- Run: psql -h localhost -U user_kafka -d db_kafka\nINSERT INTO \"E00Status\" (status) VALUES ('Active'), ('Inactive'), ('Pending'), ('Completed'), ('Failed');\nINSERT INTO \"E00Status\" (status) VALUES ('New Status');\nUPDATE \"E00Status\" SET status = 'Archived' WHERE id = 2;\nDELETE FROM \"E00Status\" WHERE id = 3; -- deletes are not captured yet\n```\n- (Optional) Delete user and database:\n```bash\npsql -U postgres\n  DROP DATABASE db_kafka;\n  DROP PUBLICATION dbz_publication;\n  DROP USER user_kafka;\nmysql -u root\n  DROP DATABASE db_kafka;\n  DROP USER 'user_kafka'@'localhost';\n  FLUSH PRIVILEGES;\n```\n```bash\nmake clean # delete connections on Kafka\nmake stop\n```\nThis is **DONE**!\n## Build from scratch\n```bash\ncd Kafka-VM\n\nbrew install postgresql@16 # or: postgresql\nbrew install mysql\nmake startdb\n\ndocker compose up -d\n```\n- Open: http://localhost:9000 to access the Kafka UI and inspect the topics and messages.\n\nTo demonstrate the Debezium Postgres **source** connector and JDBC **sink** connector using your Docker Compose setup, follow these steps:\n1. 
Create the PostgreSQL database user and the PostgreSQL source connector:\n```bash\npsql -U postgres\n  \\l # list all db\n  \\c your_db # choose db\n  \\dt # list all tables in db\n\n  # create db\n  CREATE DATABASE db_kafka;\n  \n  # delete db\n  SELECT pg_terminate_backend(pg_stat_activity.pid)\n  FROM pg_stat_activity\n  WHERE pg_stat_activity.datname = 'your_db';\n  DROP DATABASE your_db;\n\n  # list all users\n  \\du \n\n  # add new user\n  CREATE USER user_kafka WITH PASSWORD '1234';\n  ALTER USER user_kafka WITH SUPERUSER; # (optional)\n  ALTER USER user_kafka WITH REPLICATION;\n  GRANT ALL PRIVILEGES ON DATABASE db_kafka TO user_kafka;\n\n  SELECT * FROM pg_replication_slots;\n  SELECT pg_drop_replication_slot('debezium');\n  \n  # delete user\n  SELECT pg_terminate_backend(pg_stat_activity.pid)\n  FROM pg_stat_activity\n  WHERE pg_stat_activity.usename = 'username_to_delete';\n  DROP USER username_to_delete;\n\n  # change password\n  ALTER USER your_username WITH PASSWORD 'new_password';\n\n  \\q # quit psql postgres\n```\n1.1. Fix the bug: Change `wal_level` to `logical`\n```bash\npsql -h localhost -U user_kafka -d db_kafka # Check connection to db\n  SHOW config_file;\n  # Change `wal_level = logical`, uncomment this line\n  # brew services restart postgresql@16\n  SHOW wal_level;\n\n  # (Optional) If creating a new connector: delete the old replication slot, or change 'slot.name'\n  SELECT * FROM pg_replication_slots;\n  SELECT pg_drop_replication_slot('debezium');\n  SELECT * FROM pg_replication_slots;\n```\n```bash\n# Send the connector configuration to Kafka Connect. 
\n# This will create the Debezium source connector that reads changes from your PostgreSQL database and publishes them to the Kafka topic.\ncurl -X POST http://localhost:8083/connectors \\\n-H \"Content-Type: application/json\" \\\n-d '{\n  \"name\": \"debezium-postgres-connector\",\n  \"config\": {\n    \"connector.class\": \"io.debezium.connector.postgresql.PostgresConnector\",\n    \"tasks.max\": \"1\",\n    \"database.hostname\": \"host.docker.internal\",\n    \"database.port\": \"5432\",\n    \"database.user\": \"user_kafka\",\n    \"database.password\": \"1234\",\n    \"database.dbname\": \"db_kafka\",\n    \"database.server.name\": \"source\",\n    \"plugin.name\": \"pgoutput\",\n    \"slot.name\": \"debezium\",\n    \"publication.name\": \"dbz_publication\",\n    \"table.include.list\": \"E00Status\",\n    \"database.history.kafka.bootstrap.servers\": \"kafka1:29092\",\n    \"database.history.kafka.topic\": \"schema-changes.sales\",\n    \"topic.prefix\": \"source\",\n    \"transforms\": \"route\",\n    \"transforms.route.type\": \"org.apache.kafka.connect.transforms.RegexRouter\",\n    \"transforms.route.regex\": \"([^.]+)\\\\.([^.]+)\\\\.([^.]+)\",\n    \"transforms.route.replacement\": \"$3\"\n  }\n}'\n```\nOutput is 
like:\n```\n{\"name\":\"debezium-postgres-connector\",\"config\":{\"connector.class\":\"io.debezium.connector.postgresql.PostgresConnector\",\"tasks.max\":\"1\",\"database.hostname\":\"host.docker.internal\",\"database.port\":\"5432\",\"database.user\":\"user_kafka\",\"database.password\":\"1234\",\"database.dbname\":\"db_kafka\",\"database.server.name\":\"source\",\"plugin.name\":\"pgoutput\",\"slot.name\":\"debezium\",\"publication.name\":\"dbz_publication\",\"table.include.list\":\"E00Status\",\"database.history.kafka.bootstrap.servers\":\"kafka1:29092\",\"database.history.kafka.topic\":\"schema-changes.sales\",\"topic.prefix\":\"source\",\"transforms\":\"route\",\"transforms.route.type\":\"org.apache.kafka.connect.transforms.RegexRouter\",\"transforms.route.regex\":\"([^.]+)\\\\.([^.]+)\\\\.([^.]+)\",\"transforms.route.replacement\":\"$3\",\"name\":\"debezium-postgres-connector\"},\"tasks\":[],\"type\":\"source\"}%\n```\n- Script to **Create sample table** for Change Data Capture:\n```bash\npsql -h localhost -U user_kafka -d db_kafka\n  CREATE TABLE \"E00Status\" (\n      id SERIAL PRIMARY KEY,\n      status VARCHAR(50) NOT NULL,\n      created_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP\n  );\n\n  INSERT INTO \"E00Status\" (status) VALUES\n  ('Active'),\n  ('Inactive'),\n  ('Pending'),\n  ('Completed'),\n  ('Failed');\n\n  INSERT INTO \"E00Status\" (status) VALUES ('New Status');\n  UPDATE \"E00Status\" SET status = 'Archived' WHERE id = 2;\n  DELETE FROM \"E00Status\" WHERE id = 3; # Not capture yet\n\n# Check the consumer\ndocker exec -it kafka1 /usr/bin/kafka-console-consumer --bootstrap-server localhost:29092 --topic E00Status --from-beginning\n```\nCheck if the connector is created and running:\n```bash\ncurl http://localhost:8083/connectors\n# Output: `[\"debezium-postgres-connector\"]%`\n```\n2. 
Create the MySQL database user and the JDBC sink connector\n```bash\nmysql -u root\nmysql -u root -p # if you’ve set a password\n  # list all db\n  SHOW DATABASES; \n  \n  # create db\n  CREATE DATABASE db_kafka;\n\n  USE db_kafka;\n  SHOW TABLES;\n  SELECT * FROM table_name;\n\n  # delete db\n  DROP DATABASE db_kafka;\n\n  # list all users\n  SELECT User, Host FROM mysql.user;\n  SHOW GRANTS FOR 'root'@'localhost'; # show user privileges\n\n  # add new user\n  CREATE USER 'user_kafka'@'localhost' IDENTIFIED BY 'Admin@123'; # (use % for any host)\n  GRANT ALL ON *.* TO 'user_kafka'@'localhost' WITH GRANT OPTION;\n  FLUSH PRIVILEGES; # apply changes\n  SHOW GRANTS FOR 'user_kafka'@'localhost';\n\n  # delete user\n  DROP USER 'user_kafka'@'localhost';\n\n  # change password\n  ALTER USER 'username'@'host' IDENTIFIED BY 'new_password';\n  FLUSH PRIVILEGES;\n\n  \\q # quit mysql\n\nmysql --version # See the version for choosing `MySQL8Dialect`\n\nmysql -h localhost -P 3306 -u user_kafka -p # pass: Admin@123\n  CREATE TABLE db_kafka.E00Status (\n      id INT AUTO_INCREMENT PRIMARY KEY,\n      status VARCHAR(50) NOT NULL,\n      created_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP\n  );\n\n  (DELETE FROM db_kafka.E00Status WHERE id=3;)\n```\n\n2.1. 
Fix the bug: ***org.hibernate.exception.GenericJDBCException: Unable to acquire JDBC Connection [Connections could not be acquired from the underlying database!]***\n- Add **MySQL JDBC Driver** to Kafka in Docker\\\nDownload: [Here](https://downloads.mysql.com/archives/get/p/3/file/mysql-connector-j-8.0.31.zip)\n```bash\n# docker volume create kafka-jdbc\n# docker run --rm -v kafka-jdbc:/jdbc busybox mkdir /jdbc/lib\n# docker cp ./mysql-connector-j-8.0.31/mysql-connector-j-8.0.31.jar $(docker create --rm -v kafka-jdbc:/jdbc busybox):/jdbc/lib/\ndocker cp ./mysql-connector-j-8.0.31/mysql-connector-j-8.0.31.jar debezium:/jdbc/lib/\ndocker exec -it debezium ls /jdbc/lib # check file .jar exists\n```\n- Update `docker-compose.yml` as in [here](https://github.com/bdbao/Kafka-VM/blob/27555ef1c7caf88a8c8330e7f49b12b444445bec/docker-compose.yml) (in comment part).\n```bash\ndocker-compose down\ndocker-compose up -d\n```\n```bash\n# Send the sink connector configuration\n# topic: {table_name}\ncurl -X POST http://localhost:8083/connectors \\\n-H \"Content-Type: application/json\" \\\n-d '{\n  \"name\": \"jdbc-sink-connector\",\n  \"config\": {\n    \"connector.class\": \"io.debezium.connector.jdbc.JdbcSinkConnector\",\n    \"tasks.max\": \"1\",\n    \"topics\": \"E00Status\",\n    \"connection.url\": \"jdbc:mysql://host.docker.internal:3306/db_kafka\",\n    \"connection.username\": \"user_kafka\",\n    \"connection.password\": \"Admin@123\",\n    \"auto.create\": \"true\",\n    \"auto.evolve\": \"true\",\n    \"insert.mode\": \"upsert\",\n    \"primary.key.fields\": \"id\",\n    \"primary.key.mode\": \"record_key\",\n    \"schema.evolution\": \"basic\",\n    \"transforms\": \"unwrap\",\n    \"transforms.unwrap.type\": \"io.debezium.transforms.ExtractNewRecordState\",\n    \"key.converter\": \"org.apache.kafka.connect.json.JsonConverter\",\n    \"key.converter.schemas.enable\": \"true\",\n    \"value.converter\": \"org.apache.kafka.connect.json.JsonConverter\",\n    
\"value.converter.schemas.enable\": \"true\",\n    \"hibernate.dialect\": \"org.hibernate.dialect.MySQL8Dialect\"\n  }\n}'\n```\nOutput is like:\n```\n{\"name\":\"jdbc-sink-connector\",\"config\":{\"connector.class\":\"io.debezium.connector.jdbc.JdbcSinkConnector\",\"tasks.max\":\"1\",\"topics\":\"source.E00Status\",\"connection.url\":\"jdbc:mysql://host.docker.internal:3306/db_kafka\",\"connection.username\":\"user_kafka\",\"connection.password\":\"Admin@123\",\"auto.create\":\"true\",\"auto.evolve\":\"true\",\"insert.mode\":\"upsert\",\"primary.key.fields\":\"id\",\"primary.key.mode\":\"record_key\",\"schema.evolution\":\"basic\",\"transforms\":\"unwrap\",\"transforms.unwrap.type\":\"io.debezium.transforms.ExtractNewRecordState\",\"key.converter\":\"org.apache.kafka.connect.json.JsonConverter\",\"key.converter.schemas.enable\":\"true\",\"value.converter\":\"org.apache.kafka.connect.json.JsonConverter\",\"value.converter.schemas.enable\":\"true\",\"hibernate.dialect\":\"org.hibernate.dialect.MySQL8Dialect\",\"name\":\"jdbc-sink-connector\"},\"tasks\":[],\"type\":\"sink\"}%\n```\nCheck the status of the JDBC connector:\n```bash\ncurl http://localhost:8083/connectors/\n# Output: `[\"debezium-postgres-connector\",\"jdbc-sink-connector\"]%`\n```\n\n2.2. Fix the bug: ***org.apache.kafka.connect.errors.ConnectException: Exiting WorkerSinkTask due to unrecoverable exception. at org.apache.kafka.connect.runtime.WorkerSinkTask.deliverMessages(WorkerSinkTask.java:635) at***\n```bash\n# Check consumer can catch the change or not\n\ndocker exec -it debezium ls /kafka/connect/debezium-connector-jdbc | grep \"mysql\"\ndocker cp ./mysql-connector-j-8.0.31/mysql-connector-j-8.0.31.jar debezium:/kafka/connect/debezium-connector-jdbc/mysql-connector-java-8.0.31.jar\ndocker exec -it debezium rm -rdf /kafka/connect/debezium-connector-jdbc/mysql-connector-j-9.0.0.jar\n\ndocker logs debezium \u003e log_debezium.txt\n```\n2.3. 
Fix the bug (in `log_debezium.txt`): ***org.apache.kafka.connect.errors.ConnectException: Failed to process a sink record. Caused by: java.lang.NullPointerException: Cannot invoke \"org.apache.kafka.connect.data.Schema.name()\" because the return value of \"org.apache.kafka.connect.sink.SinkRecord.valueSchema()\" is null***\n```bash\nmake clean\n\ncurl -X POST http://localhost:8083/connectors \\\n-H \"Content-Type: application/json\" \\\n-d '{\n  \"name\": \"debezium-postgres-connector\",\n  \"config\": {\n    \"connector.class\": \"io.debezium.connector.postgresql.PostgresConnector\",\n    \"tasks.max\": \"1\",\n    \"database.hostname\": \"host.docker.internal\",\n    \"database.port\": \"5432\",\n    \"database.user\": \"user_kafka\",\n    \"database.password\": \"1234\",\n    \"database.dbname\": \"db_kafka\",\n    \"database.server.name\": \"source\",\n    \"plugin.name\": \"pgoutput\",\n    \"slot.name\": \"debezium\",\n    \"publication.name\": \"dbz_publication\",\n    \"table.include.list\": \"public.E00Status\",\n    \"database.history.kafka.bootstrap.servers\": \"kafka1:29092\",\n    \"database.history.kafka.topic\": \"schema-changes.sales\",\n    \"topic.prefix\": \"source\",\n    \"transforms\": \"route\",\n    \"transforms.route.type\": \"org.apache.kafka.connect.transforms.RegexRouter\",\n    \"transforms.route.regex\": \"([^.]+)\\\\.([^.]+)\\\\.([^.]+)\",\n    \"transforms.route.replacement\": \"$3\",\n    \"key.converter\": \"org.apache.kafka.connect.json.JsonConverter\",\n    \"key.converter.schemas.enable\": \"true\",\n    \"value.converter\": \"org.apache.kafka.connect.json.JsonConverter\",\n    \"value.converter.schemas.enable\": \"true\"\n  }\n}'\n\ncurl -X POST http://localhost:8083/connectors \\\n-H \"Content-Type: application/json\" \\\n-d '{\n  \"name\": \"jdbc-sink-connector\",\n  \"config\": {\n    \"connector.class\": \"io.debezium.connector.jdbc.JdbcSinkConnector\",\n    \"tasks.max\": \"1\",\n    \"topics\": \"E00Status\",\n    
\"connection.url\": \"jdbc:mysql://host.docker.internal:3306/db_kafka\",\n    \"connection.username\": \"user_kafka\",\n    \"connection.password\": \"Admin@123\",\n    \"auto.create\": \"true\",\n    \"auto.evolve\": \"true\",\n    \"insert.mode\": \"upsert\",\n    \"primary.key.fields\": \"id\",\n    \"primary.key.mode\": \"record_key\",\n    \"schema.evolution\": \"basic\",\n    \"transforms\": \"unwrap\",\n    \"transforms.unwrap.type\": \"io.debezium.transforms.ExtractNewRecordState\",\n    \"key.converter\": \"org.apache.kafka.connect.json.JsonConverter\",\n    \"key.converter.schemas.enable\": \"true\",\n    \"value.converter\": \"org.apache.kafka.connect.json.JsonConverter\",\n    \"value.converter.schemas.enable\": \"true\",\n    \"errors.log.enable\": \"true\",\n    \"errors.log.include.messages\": \"true\",\n    \"errors.tolerance\": \"all\"\n  }\n}'\n```\n\n3. Verify Data Flow (DBeaver for viewing)\n   - PostgreSQL to Kafka: To test the data flow, modify a record in the PostgreSQL source database (insert, update, or delete). 
The Debezium source connector should capture the change and publish it to the Kafka topic.\n   - Kafka to MySQL: The JDBC sink connector will pick up this change and apply it to your MySQL destination.\n   - **Fix DBeaver** for MySQL: \"Public Key Retrieval is not allowed\":\n      + Right-click your connection, choose \"Edit Connection\"\n      + On the \"Connection settings\" screen (main screen), click on \"Driver properties\"\n      + Set these two properties: \"allowPublicKeyRetrieval\" to true and \"useSSL\" to false\n\nFor troubleshooting: `docker logs debezium`\n\n# Some notes:\n- Host DB on local machine / different container:\n  - Call db_server from local machine, use: `localhost`\n  - Call db_server from Docker container: `host.docker.internal`\n- Host DB on the same container (in same file **docker-compose.yml**) like **demo-3**:\n  - Call db_server from local machine, still use: `localhost`\n  - Call db_server from that Docker container: `localhost` or ***service-name***\n- Some other resources:\n  - [MySQL to PostgreSQL, demo-3](https://blog.devgenius.io/change-data-capture-from-mysql-to-postgresql-using-kafka-connect-and-debezium-ae8740ef3a1d)\n  - [MySQL to MySQL](https://medium.com/@alexander.murylev/kafka-connect-debezium-mysql-source-sink-replication-pipeline-fb4d7e9df790)\n  - [Debezium doc for PostgreSQL](https://debezium.io/documentation/reference/stable/connectors/postgresql.html)\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fbdbao%2Fkafka-vm","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fbdbao%2Fkafka-vm","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fbdbao%2Fkafka-vm/lists"}