{"id":13813008,"url":"https://github.com/blendle/pg2kafka","last_synced_at":"2025-05-14T22:31:32.131Z","repository":{"id":49212449,"uuid":"108138694","full_name":"blendle/pg2kafka","owner":"blendle","description":"Ship changes in Postgres 🐘 to Kafka 📖","archived":true,"fork":false,"pushed_at":"2021-06-23T11:10:42.000Z","size":117,"stargazers_count":66,"open_issues_count":6,"forks_count":5,"subscribers_count":9,"default_branch":"master","last_synced_at":"2025-03-04T17:40:44.025Z","etag":null,"topics":["golang","kafka","postgresql","stream-processing"],"latest_commit_sha":null,"homepage":"","language":"Go","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"isc","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/blendle.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null}},"created_at":"2017-10-24T14:29:04.000Z","updated_at":"2025-02-11T21:33:39.000Z","dependencies_parsed_at":"2022-09-10T15:11:26.197Z","dependency_job_id":null,"html_url":"https://github.com/blendle/pg2kafka","commit_stats":null,"previous_names":[],"tags_count":0,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/blendle%2Fpg2kafka","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/blendle%2Fpg2kafka/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/blendle%2Fpg2kafka/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/blendle%2Fpg2kafka/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/blendle","download_url":"https://codeload.github.com/blendle/pg2kafka/tar.gz/refs/heads/master","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositorie
s_count":254239602,"owners_count":22037734,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["golang","kafka","postgresql","stream-processing"],"created_at":"2024-08-04T04:00:59.947Z","updated_at":"2025-05-14T22:31:27.058Z","avatar_url":"https://github.com/blendle.png","language":"Go","funding_links":[],"categories":["Go"],"sub_categories":[],"readme":"pg2kafka\n--------\n\nThis service adds triggers to a given table in your Postgres database after\ntaking a snapshot of its initial representation, and tracks changes to that\ntable to deliver them to a Kafka topic.\n\nIt consists of two parts:\n\n- A schema in your DB containing an `outbound_event_queue` table and all the\n  necessary functions and triggers to take snapshots and track changes.\n- A small executable that reads from said table and ships the events to Kafka.\n\n*pg2kafka is still in early development*; it is not advised to use this in\nproduction. 
If you run into issues, please open an issue.\n\nWe use this as a way to reliably get data out of our hosted PostgreSQL databases\nwhere we cannot use systems like [debezium](http://debezium.io) or\n[bottled water](https://github.com/confluentinc/bottledwater-pg) since we do not\nhave access to the WAL logs and cannot install native extensions or run binaries\non the database host machine.\n\nThe following SQL statements are used to send updates to the relevant topic:\n\n* `INSERT`\n* `UPDATE`\n* `DELETE`\n\n## Usage\n\nConnect pg2kafka to the database you want to stream changes from, and set the\n`PERFORM_MIGRATIONS` env var to `true`, this will create a schema `pg2kafka` in\nsaid DB and will set up an `outbound_event_queue` table there, together with the\nnecessary functions and triggers to start exporting data.\n\nIn order to start tracking changes for a table, you need to execute the\n`pg2kafka.setup` function with the table name and a column to use as external\nID. The external ID will be what's used as a partitioning key in Kafka, this\nensures that messages for a given entity will always end up in order, on the\nsame partition. 
The example below will add the trigger to the `products` table\nand use its `sku` column as the external ID.\n\nLet's say we have a database called `shop_test`:\n\n```bash\n$ createdb shop_test\n```\n\nIt contains a table `products`:\n\n```sql\nCREATE TABLE products (\n  id BIGSERIAL,\n  sku TEXT,\n  name TEXT\n);\n```\n\nAnd it already has some data in it:\n\n```sql\nINSERT INTO products (sku, name) VALUES ('CM01-R', 'Red Coffee Mug');\nINSERT INTO products (sku, name) VALUES ('CM01-B', 'Blue Coffee Mug');\n```\n\nGiven that we've already connected pg2kafka to it, and it has run its\nmigrations, we can start tracking changes to the `products` table:\n\n```sql\nSELECT pg2kafka.setup('products', 'sku');\n```\n\nThis will create snapshots of the current data in that table:\n\n```json\n{\n  \"uuid\": \"ea76e080-6acd-413a-96b3-131a42ab1002\",\n  \"external_id\": \"CM01-B\",\n  \"statement\": \"SNAPSHOT\",\n  \"data\": {\n    \"id\": 2,\n    \"sku\": \"CM01-B\",\n    \"name\": \"Blue Coffee Mug\"\n  },\n  \"created_at\": \"2017-11-02T16:14:36.709116Z\"\n}\n{\n  \"uuid\": \"e1c0008d-6b7a-455a-afa6-c1c2eebd65d3\",\n  \"external_id\": \"CM01-R\",\n  \"statement\": \"SNAPSHOT\",\n  \"data\": {\n    \"id\": 1,\n    \"sku\": \"CM01-R\",\n    \"name\": \"Red Coffee Mug\"\n  },\n  \"created_at\": \"2017-11-02T16:14:36.709116Z\"\n}\n```\n\nNow once you start making changes to your table, you should start seeing events\ncome in on the `pg2kafka.shop_test.products` topic:\n\n```sql\nUPDATE products SET name = 'Big Red Coffee Mug' WHERE sku = 'CM01-R';\n```\n\n```json\n{\n  \"uuid\": \"d6521ce5-4068-45e4-a9ad-c0949033a55b\",\n  \"external_id\": \"CM01-R\",\n  \"statement\": \"UPDATE\",\n  \"data\": {\n    \"name\": \"Big Red Coffee Mug\"\n  },\n  \"created_at\": \"2017-11-02T16:15:13.94077Z\"\n}\n```\n\nThe producer topics are all in the form of\n`pg2kafka.$database_name.$table_name`. You need to make sure that this topic\nexists, or else pg2kafka will crash.\n\nYou can optionally 
prepend a namespace to the Kafka topic, by setting the\n`TOPIC_NAMESPACE` environment variable. When doing this, the final topic name\nwould be `pg2kafka.$namespace.$database_name.$table_name`.\n\n### Cleanup\n\nIf you decide not to use pg2kafka anymore you can clean up the database triggers\nusing the following command:\n\n```sql\nDROP SCHEMA pg2kafka CASCADE;\n```\n\n## Development\n\n### Setup\n\n#### Golang\n\nYou will need Go 1.9.\n\n#### PostgreSQL\n\nSet up a database and expose a connection string to it as an env variable. For\nlocal development we also specify `sslmode=disable`.\n\n```bash\n$ createdb pg2kafka_test\n$ export DATABASE_URL=\"postgres://user:password@localhost/pg2kafka_test?sslmode=disable\"\n```\n\n#### Kafka\n\nInstall [Kafka](http://kafka.apache.org/) if you don't already have it running.\nThis is not required to run the tests, but it is required if you want to run\npg2kafka locally against a real Kafka.\n\nCreate a topic for the table you want to track in your database:\n\n```bash\nkafka-topics \\\n  --zookeeper localhost:2181 \\\n  --create \\\n  --topic pg2kafka.pg2kafka_test.users \\\n  --replication-factor 1 \\\n  --partitions 3\n```\n\nThen export the Kafka host as a URL so pg2kafka can use it:\n\n```bash\n$ export KAFKA_BROKER=\"localhost:9092\"\n```\n\n### Running the service locally\n\nMake sure you export the `DATABASE_URL` and `KAFKA_BROKER`, and also\n`export PERFORM_MIGRATIONS=true`.\n\n```bash\n$ go run main.go\n```\n\nTo run the service without using Kafka, you can set a `DRY_RUN=true` flag, which\nwill produce the messages to stdout.\n\n### Running tests\n\nThe only thing required for the tests to run is that you've set up a database\nand exposed a connection string to it as `DATABASE_URL`. All the necessary\nschemas, tables and triggers will be created by the tests.\n\n```bash\n$ ./script/test\n```\n\n## License\npg2kafka is released under the ISC license. 
See [LICENSE](https://github.com/blendle/pg2kafka/blob/master/LICENSE) for details.\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fblendle%2Fpg2kafka","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fblendle%2Fpg2kafka","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fblendle%2Fpg2kafka/lists"}