{"id":13393689,"url":"https://github.com/hasura/pgdeltastream","last_synced_at":"2025-04-23T16:33:00.253Z","repository":{"id":57497446,"uuid":"124432682","full_name":"hasura/pgdeltastream","owner":"hasura","description":"Streaming Postgres logical replication changes atleast-once over websockets","archived":false,"fork":false,"pushed_at":"2018-06-13T11:00:52.000Z","size":8328,"stargazers_count":253,"open_issues_count":5,"forks_count":15,"subscribers_count":38,"default_branch":"master","last_synced_at":"2024-10-26T18:30:33.034Z","etag":null,"topics":["go","logical-replication","postgresql","websockets","write-ahead-log"],"latest_commit_sha":null,"homepage":"","language":"Go","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"apache-2.0","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/hasura.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":"CONTRIBUTING.md","funding":null,"license":"LICENSE","code_of_conduct":"CODE_OF_CONDUCT.md","threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null}},"created_at":"2018-03-08T18:37:36.000Z","updated_at":"2024-08-20T23:41:50.000Z","dependencies_parsed_at":"2022-09-03T23:52:23.128Z","dependency_job_id":null,"html_url":"https://github.com/hasura/pgdeltastream","commit_stats":null,"previous_names":[],"tags_count":6,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/hasura%2Fpgdeltastream","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/hasura%2Fpgdeltastream/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/hasura%2Fpgdeltastream/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/hasura%2Fpgdeltastream/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/hasura","download_url":"https://codeload.github.c
om/hasura/pgdeltastream/tar.gz/refs/heads/master","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":223929544,"owners_count":17226913,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["go","logical-replication","postgresql","websockets","write-ahead-log"],"created_at":"2024-07-30T17:00:58.662Z","updated_at":"2024-11-10T08:24:18.974Z","avatar_url":"https://github.com/hasura.png","language":"Go","funding_links":[],"categories":["Go","WAL","Data"],"sub_categories":["Logical Replication","Replication"],"readme":"# PGDeltaStream\n\nA Golang webserver to stream Postgres changes *at-least-once* over websockets, using Postgres's logical decoding feature.\n\n![PGDeltaStream Short Demo](demo.gif \"PGDeltaStream Short Demo\")\n\n**Note:** Currently, pgdeltastream is best treated as reference boilerplate: a Golang server showing how to connect to a Postgres logical replication slot, take a snapshot, and stream changes. It should not be used to expose websockets to arbitrary clients!\n\n## Introduction\n\nPGDeltaStream uses Postgres's logical decoding feature to stream table changes over a websocket connection. It is a stateless service and can be connected directly to a Postgres instance.\n\nPGDeltaStream gives you endpoints to snapshot your current data and then start streaming from the snapshot point, so that you don’t lose any event data. Clients can also ACK an offset value as frequently as they desire over the websocket connection. 
If a client reconnects, the stream resumes from the last ACKed offset.\n\nThis process **guarantees at-least-once delivery** of changes in Postgres.\n\n## How it works\n\nWhen a logical replication slot is created, Postgres creates a snapshot of the current state of the database and records the consistent point from which streaming should begin. The snapshot helps build an initial state of the database over which streaming changes can be applied.\n\nTo facilitate retrieving data from the snapshot and streaming changes from then onwards, the workflow is split into three phases:\n\n1. Init: Create a replication slot\n2. Snapshot: Get data from the snapshot over HTTP\n3. Stream: Stream WAL changes from the snapshot point over a websocket connection\n\n## Installation\n\nRun Postgres [configured for logical replication](#configuring-postgres-for-logical-replication) with [`wal2json`](https://github.com/eulerto/wal2json) installed:\n\n```bash\n# Run postgres\n$ docker run -it -p 5432:5432 debezium/postgres:10.0\n```\n\nLaunch PGDeltaStream:\n\n```bash\n$ docker run \\\n    -e DBNAME=\"postgres\" \\\n    -e PGUSER=\"postgres\" \\\n    -e PGPASS=\"''\" \\\n    -e PGHOST=\"localhost\" \\\n    -e PGPORT=5432 \\\n    -e SERVERHOST=\"localhost\" \\\n    -e SERVERPORT=12312 \\\n    --net host \\\n    -it hasura/pgdeltastream:v0.1.7\n```\n\n## Usage\n\n### Video guide\n\nWatch the [video guide](https://youtu.be/pMQxbbzq_gw) for a quick introduction to using PGDeltaStream.\n\n### Step 1: Init a replication slot\n\nCall the `/v1/init` endpoint to create a replication slot and get the slot name.\n\n```bash\n$ curl localhost:12312/v1/init\n{\"slotName\": \"delta_face56\"}\n```\n\nMake a note of this slot name; the next phases need it.\n\n### Step 2 (optional): Initialise data from snapshot\n\nTo get data from the snapshot, make a POST request to the `/v1/snapshot/data` endpoint with the slot name, table name, offset and limit. 
You can also specify the column to sort by and the sort order:\n```\ncurl -X POST \\\n  http://localhost:12312/v1/snapshot/data \\\n  -H 'content-type: application/json' \\\n  -d '{\"slotName\": \"delta_face56\", \"table\": \"test_table\", \"offset\": 0, \"limit\": 5, \"order_by\": {\"column\": \"id\", \"order\": \"ASC\"}}'\n```\n\nThe returned data will be a JSON formatted list of rows:\n\n```json\n[\n  {\n    \"id\": 1,\n    \"name\": \"abc\"\n  },\n  {\n    \"id\": 2,\n    \"name\": \"abc1\"\n  },\n  {\n    \"id\": 3,\n    \"name\": \"abc2\"\n  },\n  {\n    \"id\": 4,\n    \"name\": \"val1\"\n  },\n  {\n    \"id\": 5,\n    \"name\": \"val2\"\n  }\n]\n```\n\nNote that the snapshot contains only the data that existed up to the time the replication slot was created. \n\n### Step 3: Stream changes over a websocket\n\nConnect to the websocket endpoint `/v1/lr/stream`, passing the slot name as a query parameter, to start streaming the changes:\n\n```\nws://localhost:12312/v1/lr/stream?slotName=delta_face56\n```\n\nThe streaming data will contain the operation type (insert, update, delete), table details, old values (in case of an update or delete), new values and the `nextlsn` value. \n\nFor example, the query:\n\n```\nINSERT INTO test_table (name) VALUES ('newval1');\n```\nwill produce the following change record over the websocket connection:\n```javascript\n// Received over ws\n{\n  \"nextlsn\": \"0/170FCB0\",\n  \"change\": [\n    {\n      \"kind\": \"insert\",\n      \"schema\": \"public\",\n      \"table\": \"test_table\",\n      \"columnnames\": [\n        \"id\",\n        \"name\"\n      ],\n      \"columntypes\": [\n        \"integer\",\n        \"text\"\n      ],\n      \"columnvalues\": [\n        3,\n        \"newval1\"\n      ]\n    }\n  ]\n}\n```\n\nThe `nextlsn` is the Log Sequence Number (LSN) that points to the next record in the WAL. 
\n\n### Step 4: ACK the offset \n\nTo inform Postgres of the consumed position, send the `nextlsn` value over the websocket connection:\n```javascript\n// Send over ws\n{\"lsn\":\"0/170FCB0\"}\n```\n\nThis will commit to Postgres that you've consumed up to the WAL position `0/170FCB0`, so that if the websocket connection fails, streaming resumes from this record.\n\n### Reset stream\n\nThe application is designed for single-session use: for now, only one replication slot and its corresponding stream can be managed. Any call to `/v1/init` deletes the existing replication slot and creates a new one (along with its snapshot).\n\nIf at any point you wish to start over with a new replication slot, call `/v1/init` again to reset the \"stream\".\n\n## Configuring Postgres for logical replication\n\nTo use the logical replication feature, set the following parameters in `postgresql.conf`:\n\n```\nwal_level = logical\nmax_replication_slots = 4\n```\n\nFurther, add this to `pg_hba.conf`:\n\n```\nhost    replication     all             127.0.0.1/32            trust\n```\n\nRestart the `postgresql` service for the changes to take effect.\n\n## Slot names\n\nSlot names are autogenerated in the format `delta_\u003cword\u003e\u003cnumber\u003e`. 
\n\nThis is so that it is easy to remember the slot name instead of a string of random characters and the `delta_` prefix identifies it as a slot created by this application.\n\n## Contributing\n\nContributions are welcome!\n\nRead the [contributing guide](CONTRIBUTING.md) to learn about setting up the development environment, building the project and running tests.\n\nDo check the [issues](https://github.com/hasura/pgdeltastream/issues) page to see the backlog and help us in improving PGDeltaStream!\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fhasura%2Fpgdeltastream","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fhasura%2Fpgdeltastream","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fhasura%2Fpgdeltastream/lists"}