{"id":13639761,"url":"https://github.com/pgEdge/spock","last_synced_at":"2025-04-20T01:32:30.946Z","repository":{"id":121849895,"uuid":"583373845","full_name":"pgEdge/spock","owner":"pgEdge","description":"Logical Multi-master Replication","archived":false,"fork":false,"pushed_at":"2025-04-11T01:09:36.000Z","size":2587,"stargazers_count":194,"open_issues_count":12,"forks_count":21,"subscribers_count":13,"default_branch":"REL4_0_STABLE","last_synced_at":"2025-04-13T07:52:01.478Z","etag":null,"topics":[],"latest_commit_sha":null,"homepage":"https://github.com/pgedge/pgedge","language":"C","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"other","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/pgEdge.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":null,"code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null,"zenodo":null}},"created_at":"2022-12-29T15:32:33.000Z","updated_at":"2025-04-10T21:57:32.000Z","dependencies_parsed_at":null,"dependency_job_id":"67c44fa6-cf1d-4b7f-aec2-e153c5e8287d","html_url":"https://github.com/pgEdge/spock","commit_stats":null,"previous_names":[],"tags_count":84,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/pgEdge%2Fspock","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/pgEdge%2Fspock/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/pgEdge%2Fspock/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/pgEdge%2Fspock/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/pgEdge","download_url":"https://codeload.github.com/pgEdge/spock/tar.gz/refs/heads/REL4_0_STABLE"
,"host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":249838128,"owners_count":21332561,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":[],"created_at":"2024-08-02T01:01:04.534Z","updated_at":"2025-04-20T01:32:30.237Z","avatar_url":"https://github.com/pgEdge.png","language":"C","funding_links":[],"categories":["Replication and Clustering","C","High-Availability"],"sub_categories":[],"readme":"# Spock\n![Spockbench tests](https://github.com/pgedge/spock-private/actions/workflows/spockbench.yml/badge.svg)\n## Multi-Master Replication with Conflict Resolution \u0026 Avoidance\n\n\nThis SPOCK extension provides multi-master replication for PostgreSQL 14+.\nWe originally leveraged the [pgLogical](https://github.com/2ndQuadrant/pglogical) and [BDR2](https://github.com/2ndQuadrant/bdr/tree/REL0_9_94b2) \nprojects as a solid foundation to build upon for this enterprise-class extension. \n\n**Version 4.0** is our current version under active development.  
It presently includes the following important enhancements beyond v3.3:\n\n* Full re-work of the parallel slots implementation to support mixed OLTP workloads\n* Improved support for delta_apply columns to support various data types\n* Improved regression test coverage\n* Support for [Large Object LOgical Replication](https://github.com/pgedge/lolor)\n* Support for pg17beta\n\nOur current production version is v3.3 and includes the following enhancements over v3.2:\n\n* Automatic replication of DDL statements\n\nOur previous production version was v3.2 and included the following important enhancements beyond Spock v3.1:\n\n* Support for pg14\n* Support for [Snowflake Sequences](https://github.com/pgedge/snowflake)\n* Support for setting a database to ReadOnly\n* A couple of small bug fixes from pgLogical\n* Native support for Failover Slots via integrating the pg_failover_slots extension\n* Parallel slots support for insert-only workloads\n\n\nOur initial production version was 3.1 and included the following:\n\n* Support for both pg15 **AND** pg16\n* Preliminary testing for online upgrades between pg15 \u0026 pg16\n* Regression testing improvements\n* Improved support for in-region shadow nodes (in different AZs)\n* Improved and documented support for replicating and maintaining partitioned tables.\n\n\nOur beta version was 3.0 and included the following important enhancements beyond its bdr/pg_logical base:\n\n* Support for pg15 (support for pg10 thru pg14 dropped)\n* Support for Asynchronous Multi-Master Replication with conflict resolution\n* Conflict-free delta-apply columns\n* Replication of partitioned tables (to help support geo-sharding)\n* Making database clusters location aware (to help support geo-sharding)\n* Better error handling for conflict resolution\n* Better management \u0026 monitoring stats and integration\n* A 'pii' table for making it easy for personally identifiable data to be kept in country\n* Better support for minimizing system interruption during 
switch-over and failover\n\n\nWe use the following terms, borrowed from [Jan's](https://www.linkedin.com/in/jan-wieck-3140812) well-known [Slony](https://slony.info) project, to describe data streams between nodes:\n* Nodes - PostgreSQL database instances\n* Providers and Subscribers - roles taken by Nodes\n* Replication Set - a collection of tables\n\nUse cases supported are:\n* Asynchronous multi-active replication with conflict resolution\n* Upgrades between major versions\n* Full database replication\n* Selective replication of sets of tables using replication sets\n* Selective replication of table rows at either publisher or subscriber side (row_filter)\n* Selective replication of partitioned tables\n* Selective replication of table columns at publisher side\n* Data gather/merge from multiple upstream servers\n\nArchitectural details:\n* Spock works on a per-database level, not the whole-server level like physical streaming replication\n* One Provider may feed multiple Subscribers without incurring additional disk write overhead\n* One Subscriber can merge changes from several origins and detect conflicts\n  between changes with automatic and configurable conflict resolution (some,\n  but not all aspects required for multi-master).\n* Cascading replication is implemented in the form of changeset forwarding.\n\n# Major New Features\n\n## Snowflake Sequences\n[Snowflake Sequences](https://github.com/pgEdge/snowflake-sequences) is a PostgreSQL extension providing an int8, sequence-based unique ID solution to optionally replace the PostgreSQL built-in bigserial data type. This extension allows Snowflake IDs that are unique within one sequence across multiple PostgreSQL instances in a distributed cluster.\n\n## Automatic Replication of DDL\nDDL statements can now be automatically replicated. This feature can be enabled by setting the following to on: `spock.enable_ddl_replication`, `spock.include_ddl_repset`, and `spock.allow_ddl_from_functions`. 
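\nFor example, the three settings can be turned on together in `postgresql.conf`:\n\n    spock.enable_ddl_replication = on\n    spock.include_ddl_repset = on\n    spock.allow_ddl_from_functions = on\n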
It is recommended to set these to on only when the database schema matches exactly on all nodes - either when all databases have no objects, or when all databases have exactly the same objects and all tables are added to replication sets.\n\nBy default, these settings are set to off. When these settings are on, it is recommended that DDL statements dangerous for replication be executed in a maintenance window to avoid errors that will impact replication.\n\n`spock.enable_ddl_replication` will enable replication of DDL statements through the default replication set. Some DDL statements are intentionally not replicated (e.g. CREATE DATABASE), and some are replicated but could cause issues in two ways. Some DDL statements could lead to inconsistent data (e.g. CREATE TABLE... AS...) since the DDL statement is replicated before the table is added to the replication set. Some DDL statements are replicated, but are potentially an issue in a 3+ node cluster (e.g. DROP TABLE).\n\n`spock.include_ddl_repset` will enable spock to automatically add tables to replication sets at the time they are created on each node. Tables with Primary Keys will be added to the default replication set, and tables without Primary Keys will be added to the default_insert_only replication set. Altering a table to add or remove a Primary Key will make the correct adjustment to which replication set the table is part of. Setting a table to unlogged will remove it from replication. Detaching a partition will not remove it from replication.\n\n`spock.allow_ddl_from_functions` will enable spock to automatically replicate DDL statements that are called within functions. This can be turned off if these functions are expected to run on every node. When this is set to off, statements replicated from functions adhere to the same rule previously described for 'include_ddl_repset.' 
If a table possesses a defined primary key, it will be added to the 'default' replication set; otherwise, it will be added to the 'default_insert_only' replication set.\n\nDuring the auto replication process, various messages are generated to provide information about the execution. Here are the descriptions for each message:\n- \"DDL statement replicated.\"\nThis is an INFO-level message. It is displayed whenever a DDL statement is successfully replicated. To include these messages in the server log files, the configuration must have \"log_min_messages=INFO\" set.\n- \"DDL statement replicated, but could be unsafe.\"\nThis message serves as a warning. It is generated when certain DDL statements, though successfully replicated, are deemed potentially unsafe. For example, statements like \"CREATE TABLE... AS...\" will trigger this warning.\n- \"This DDL statement will not be replicated.\"\nThis warning message is generated when auto replication is active, but the specific DDL is either unsupported or intentionally excluded from replication.\n- \"table 'test' was added to 'default' replication set.\"\nThis is a LOG message providing information about the replication set used for a given table when 'spock.include_ddl_repset' is set.\n\n\n## Replication of Partitioned Tables\n\nPartitioned tables can now be replicated. By default, when adding a partitioned table to a replication set, it will include all of its present partitions. Partitions created later can be added using the `partition_add` function. The DDL for the partitioned table and its partitions should be present on the subscriber nodes (same as for normal tables).\n\nSimilarly, when removing a partitioned table from a replication set, by default, the partitions of the table will also be removed.\n\nThe replication of partitioned tables is a bit different from normal tables. 
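\nFor example, a partitioned table is added to a replication set like any other table (an illustrative table name; `repset_add_table` is described below):\n\n    SELECT spock.repset_add_table('default', 'measurements', true);\n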
When doing initial synchronization, we query the partitioned table (or parent) to get all the rows for synchronization purposes and don't synchronize the individual partitions. However, after the initial sync of data, the normal operations resume, i.e., the partitions start replicating like normal tables.\n\nIt's possible to add individual partitions to the replication set, in which case they will be replicated like regular tables (to the table of the same name as the partition on the subscriber). This has performance advantages when the partitioning definition is the same on both provider and subscriber, as the partitioning logic does not have to be executed.\n\n**Note:** There is an exception to individual partition replication: individual partitions won't sync up the existing data. It's equivalent to setting `synchronize_data = false`.\n\nWhen partitions are replicated through a partitioned table, the exception is the TRUNCATE command, which always replicates with the list of affected tables or partitions.\n\nAdditionally, `row_filter` can also be used with partitioned tables, as well as with individual partitions.\n\n\n## Conflict-Free Delta-Apply Columns (Conflict Avoidance)\n\nLogical Multi-Master replication can get itself into trouble on running sums (such as a YTD balance). Unlike other\nsolutions, we do NOT have a special data type for this. Any numeric data type will do (including numeric, float, double precision, int4, int8, etc).\n\nSuppose that a running bank account sum contains a balance of `$1,000`. Two transactions \"conflict\" because they overlap with each other, arriving from two different multi-master nodes. Transaction A is a `$1,000` withdrawal from the account. Transaction B is also a `$1,000` withdrawal from the account. The correct balance is `$-1,000`.  
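\nConcretely, instead of overwriting the column, delta-apply applies the difference between the remote old and new values (a sketch of the scenario above):\n\n    -- starting balance on both nodes:        1000\n    -- node A: withdraws 1000 (delta -1000),  local balance 0\n    -- node B: withdraws 1000 (delta -1000),  local balance 0\n    -- on replication, each node applies the other's delta:\n    --   0 + (-1000) = -1000   (the correct balance on both nodes)\n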
Our Delta-Apply algorithm fixes this problem, and highly conflicting workloads with this scenario (like a TPC-C-like benchmark) now run correctly at lightning speeds.\n\nThis feature is powerful AND simple in its implementation as follows:\n\n  - A small diff patch to PostgreSQL core\n    - a very small PostgreSQL-licensed patch is applied to a core PostgreSQL source tree before building a PG binary.\n    - the above diff patch adds functionality to support ALTER TABLE t1 ALTER COLUMN c1 SET (log_old_value=true)\n    - this patch will be submitted to pg16 core PostgreSQL and discussed at the Ottawa Conference.\n\n  - When an update occurs on a 'log_old_value' column\n    - First, the old value for that column is captured to the WAL\n    - Second, the new value comes in the transaction to be applied to a subscriber\n    - Before the new value overwrites the old value, a delta value is created from the above two steps and it is correctly applied\n\nNote that on a conflicting transaction, the delta column will get correctly calculated and applied. The configured conflict resolution strategy applies to non-delta columns (normally last-update-wins).\n\nAs a special safety-valve feature, if you ever need to reset a log_old_value column, you can temporarily alter the column to set \"log_old_value\" to false.\n\n## Conflicts Overview\n\nIn case the node is subscribed to multiple providers, or when local writes\nhappen on a subscriber, conflicts can arise for the incoming changes. These\nare automatically detected and can be acted on depending on the configuration.\n\nThe configuration of the conflict resolver is done via the\n`spock.conflict_resolution` setting.\n\nResolved conflicts are logged at the log level set by\n`spock.conflict_log_level`. This parameter defaults to `LOG`. 
If set to a\nlower level than `log_min_messages`, the resolved conflicts won't appear in\nthe server log.\n\n## Conflict Configuration Options\n\nSome aspects of Spock can be configured using configuration options that\ncan be either set in `postgresql.conf` or via `ALTER SYSTEM SET`.\n\n- `spock.conflict_resolution`\n  Sets the resolution method for any detected conflicts between local data\n  and incoming changes.\n\n  Possible values:\n  \u003c!--\n  - `error` - the replication will stop on error if conflict is detected and\n    manual action is needed for resolving\n  - `apply_remote` - always apply the change that's conflicting with local\n    data\n  - `keep_local` - keep the local version of the data and ignore the\n     conflicting change that is coming from the remote node\n  --\u003e\n  - `last_update_wins` - the version of data with the newest commit timestamp\n     will be kept (this can be either the local or the remote version)\n\n  For conflict resolution, the `track_commit_timestamp` PostgreSQL setting\n  is always enabled.\n\n- `spock.conflict_log_level`\n  Sets the log level for reporting detected conflicts when\n  `spock.conflict_resolution` is set to anything other than `error`.\n\n  The main use for this setting is to suppress logging of conflicts.\n\n  Possible values are the same as for the `log_min_messages` PostgreSQL setting.\n\n  The default is `LOG`.\n\n- `spock.batch_inserts`\n  Tells Spock to use the batch insert mechanism if possible. The batch mechanism\n  uses PostgreSQL's internal batch insert mode, which is also used by the `COPY`\n  command.\n\n\n## Requirements\n\nThe `spock` extension must be installed on both provider and subscriber.\nYou must `CREATE EXTENSION spock` on both. For major version upgrades, the old node\ncan be running a recent version of pgLogical2 before it is upgraded to become a Spock node.\n\nTables on the provider and subscriber must have the same names and be in the\nsame schema. 
Future revisions may add mapping features.\n\nTables on the provider and subscriber must have the same columns, with the same\ndata types in each column. `CHECK` constraints, `NOT NULL` constraints, etc., must\nbe the same or weaker (more permissive) on the subscriber than on the provider.\n\nTables must have the same `PRIMARY KEY`s. It is not recommended to add additional\n`UNIQUE` constraints other than the `PRIMARY KEY` (see below).\n\nSome additional requirements are covered in [Limitations and Restrictions](#limitations-and-restrictions).\n\n## Usage\n\nThis section describes basic usage of the Spock replication extension.  \nIt should be noted that pgEdge, when you install the Spock extension, does this quick setup for you (and more).\n\n### Quick setup\n\n\nFirst the PostgreSQL server has to be properly configured to support logical\ndecoding:\n\n    wal_level = 'logical'\n    max_worker_processes = 10   # one per database needed on provider node\n                                # one per node needed on subscriber node\n    max_replication_slots = 10  # one per node needed on provider node\n    max_wal_senders = 10        # one per node needed on provider node\n    shared_preload_libraries = 'spock'\n    track_commit_timestamp = on # needed for conflict resolution\n\n`pg_hba.conf` has to allow logical replication connections from\nlocalhost. 
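\nFor example (an illustrative `pg_hba.conf` entry; adjust the database, user, and auth method to your setup):\n\n    host    db    replication_user    127.0.0.1/32    scram-sha-256\n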
Logical replication connections are treated\nby `pg_hba.conf` as regular connections to the provider database.\n\nNext, the `spock` extension has to be installed in the database to be replicated, on all nodes:\n\n    CREATE EXTENSION spock;\n\nNow create the provider node:\n\n    SELECT spock.node_create(\n        node_name := 'provider1',\n        dsn := 'host=providerhost port=5432 dbname=db'\n    );\n\nAdd all tables in the `public` schema to the `default` replication set.\n\n    SELECT spock.repset_add_all_tables('default', ARRAY['public']);\n\nOptionally, you can also create additional replication sets and add tables to\nthem (see [Replication sets](#replication-sets)).\n\nIt's usually better to create replication sets before subscribing so that all\ntables are synchronized during initial replication setup in a single initial\ntransaction. However, users of bigger databases may instead wish to create them\nincrementally for better control.\n\nOnce the provider node is set up, subscribers can be subscribed to it. First the\nsubscriber node must be created:\n\n    SELECT spock.node_create(\n        node_name := 'subscriber1',\n        dsn := 'host=thishost port=5432 dbname=db'\n    );\n\nAnd finally, on the subscriber node, you can create the subscription, which will\nstart the synchronization and replication process in the background:\n\n    SELECT spock.sub_create(\n        subscription_name := 'subscription1',\n        provider_dsn := 'host=providerhost port=5432 dbname=db'\n    );\n\n    SELECT spock.sub_wait_for_sync('subscription1');\n\n### Creating subscriber nodes with base backups\n\nIn addition to the SQL-level node and subscription creation, spock also\nsupports creating a subscriber by cloning the provider with `pg_basebackup` and\nstarting it up as a spock subscriber. 
This is done with the\n`spock_create_subscriber` tool; see the `--help` output.\n\nUnlike `spock.sub_create`'s data sync options, this clone ignores\nreplication sets and copies all tables on all databases. However, it's often\nmuch faster, especially over high-bandwidth links.\n\n### Node management\n\nNodes can be added and removed dynamically using the SQL interfaces.\n\n#### spock-node-create\n- `spock.node_create(node_name name, dsn text)`\n  Creates a node.\n\n  Parameters:\n  - `node_name` - name of the new node, only one node is allowed per database\n  - `dsn` - connection string to the node, for nodes that are supposed to be\n    providers, this should be reachable from outside\n\n\n#### spock-node-drop\n- `spock.node_drop(node_name name, ifexists bool)`\n  Drops the spock node.\n\n  Parameters:\n  - `node_name` - name of an existing node\n  - `ifexists` - if true, error is not thrown when the node does not exist,\n    default is false\n\n#### spock-node-add-interface\n- `spock.node_add_interface(node_name name, interface_name name, dsn text)`\n  Adds an additional interface to a node.\n\n  When a node is created, the interface for it is also created with the `dsn`\n  specified in `node_create` and with the same name as the node. 
This\n  function allows adding alternative interfaces with different connection\n  strings to an existing node.\n\n  Parameters:\n  - `node_name` - name of an existing node\n  - `interface_name` - name of a new interface to be added\n  - `dsn` - connection string to the node used for the new interface\n\n#### spock-node-drop-interface\n- `spock.node_drop_interface(node_name name, interface_name name)`\n  Removes an existing interface from a node.\n\n  Parameters:\n  - `node_name` - name of an existing node\n  - `interface_name` - name of an existing interface\n\n### Subscription management\n\n#### spock-sub-create\n- `spock.sub_create(subscription_name name, provider_dsn text,\n  repsets text[], sync_structure boolean,\n  sync_data boolean, forward_origins text[], apply_delay interval)`\n  Creates a subscription from the current node to the provider node. Command does\n  not block, just initiates the action.\n\n  Parameters:\n  - `subscription_name` - name of the subscription, must be unique\n  - `provider_dsn` - connection string to a provider\n  - `repsets` - array of replication sets to subscribe to, these must\n    already exist, default is \"{default,default_insert_only,ddl_sql}\"\n  - `sync_structure` - specifies whether to synchronize structure from\n    provider to the subscriber, default false\n  - `sync_data` - specifies whether to synchronize data from provider to\n    the subscriber, default true\n  - `forward_origins` - array of origin names to forward, currently only\n    supported values are empty array meaning don't forward any changes\n    that didn't originate on provider node (this is useful for two-way\n    replication between the nodes), or \"{all}\" which means replicate all\n    changes no matter what their origin is, default is \"{all}\"\n  - `apply_delay` - how much to delay replication, default is 0 seconds\n  - `force_text_transfer` - force the provider to replicate all columns\n    using a text representation (which is slower, but may be used to\n    
change the type of a replicated column on the subscriber), default\n    is false\n\n  The `subscription_name` is used as `application_name` by the replication\n  connection. This means that it's visible in the `pg_stat_replication`\n  monitoring view. It can also be used in `synchronous_standby_names` when\n  spock is used as part of a\n  [synchronous replication](#synchronous-replication) setup.\n\n  Use `spock.sub_wait_for_sync(subscription_name)` to wait for the\n  subscription to asynchronously start replicating and complete any needed\n  schema and/or data sync.\n\n#### spock-sub-drop\n- `spock.sub_drop(subscription_name name, ifexists bool)`\n  Disconnects the subscription and removes it from the catalog.\n\n  Parameters:\n  - `subscription_name` - name of the existing subscription\n  - `ifexists` - if true, error is not thrown when subscription does not exist,\n    default is false\n\n#### spock-sub-disable\n- `spock.sub_disable(subscription_name name, immediate bool)`\n   Disables a subscription and disconnects it from the provider.\n\n  Parameters:\n  - `subscription_name` - name of the existing subscription\n  - `immediate` - if true, the subscription is stopped immediately, otherwise\n    it will only be stopped at the end of the current transaction, default is false\n\n#### spock-sub-enable\n- `spock.sub_enable(subscription_name name, immediate bool)`\n  Enables a disabled subscription.\n\n  Parameters:\n  - `subscription_name` - name of the existing subscription\n  - `immediate` - if true, the subscription is started immediately, otherwise\n    it will only be started at the end of the current transaction, default is false\n\n#### spock-sub-alter-interface\n- `spock.sub_alter_interface(subscription_name name, interface_name name)`\n  Switches the subscription to use a different interface to connect to the provider\n  node.\n\n  Parameters:\n  - `subscription_name` - name of an existing subscription\n  - `interface_name` - name of an existing interface of the current 
provider\n    node\n\n#### spock-sub-sync\n- `spock.sub_sync(subscription_name name, truncate bool)`\n  All unsynchronized tables in all sets are synchronized in a single operation.\n  Tables are copied and synchronized one by one. Command does not block, just\n  initiates the action. Use `spock.sub_wait_for_sync`\n  to wait for completion.\n\n  Parameters:\n  - `subscription_name` - name of the existing subscription\n  - `truncate` - if true, tables will be truncated before copy, default false\n\n#### spock-sub-resync-table\n- `spock.sub_resync_table(subscription_name name, relation regclass)`\n  Resynchronizes one existing table. The table may not be the target of any\n  foreign key constraints.\n  **WARNING: This function will truncate the table immediately, and only then\n  begin synchronising it, so it will be empty while being synced.**\n\n  Does not block, use `spock.sub_wait_table_sync` to wait for\n  completion.\n\n  Parameters:\n  - `subscription_name` - name of the existing subscription\n  - `relation` - name of an existing table, optionally qualified\n\n#### spock-sub-wait-for-sync\n- `spock.sub_wait_for_sync(subscription_name name)`\n\n   Waits for a subscription to finish synchronization after a\n   `spock.sub_create` or `spock.sub_sync`.\n\n  This function waits until the subscription's initial schema/data sync,\n  if any, are done, and until any tables pending individual resynchronisation\n  have also finished synchronising.\n\n  For best results, run `SELECT spock.wait_slot_confirm_lsn(NULL, NULL)` on the\n  provider after any replication set changes that requested resyncs, and only\n  then call `spock.sub_wait_for_sync` on the subscriber.\n\n#### spock-sub-wait-table-sync\n- `spock.sub_wait_table_sync(subscription_name name, relation regclass)`\n\n  Same as `spock.sub_wait_for_sync`, but waits only for\n  the subscription's initial sync and the named table. 
Other tables pending\n  resynchronisation are ignored.\n\n- `spock.wait_slot_confirm_lsn`\n\n  `SELECT spock.wait_slot_confirm_lsn(NULL, NULL)`\n\n  Waits until all replication slots on the current node have replayed up to the\n  xlog insert position at time of call on all providers. Returns when\n  all slots' `confirmed_flush_lsn` passes the `pg_current_wal_insert_lsn()` at\n  time of call.\n\n  Optionally may wait for only one replication slot (first argument).\n  Optionally may wait for an arbitrary LSN passed instead of the insert LSN\n  (second argument). Both are usually just left null.\n\n  This function is very useful to ensure all subscribers have received changes\n  up to a certain point on the provider.\n\n#### spock-sub-show-status\n- `spock.sub_show_status(subscription_name name)`\n  Shows status and basic information about a subscription.\n\n  Parameters:\n  - `subscription_name` - optional name of the existing subscription, if no\n    name is provided, the function will show status for all subscriptions on the\n    local node\n\n#### spock-sub-show-table\n- `spock.sub_show_table(subscription_name name, relation regclass)`\n  Shows synchronization status of a table.\n\n  Parameters:\n  - `subscription_name` - name of the existing subscription\n  - `relation` - name of an existing table, optionally qualified\n\n#### spock-sub-add-repset\n- `spock.sub_add_repset(subscription_name name, replication_set name)`\n  Adds one replication set to a subscriber. 
Does not synchronize, only\n  activates consumption of events.\n\n  Parameters:\n  - `subscription_name` - name of the existing subscription\n  - `replication_set` - name of replication set to add\n\n#### spock-sub-remove-repset\n- `spock.sub_remove_repset(subscription_name name, replication_set name)`\n  Removes one replication set from a subscriber.\n\n  Parameters:\n  - `subscription_name` - name of the existing subscription\n  - `replication_set` - name of replication set to remove\n\n\nThere is also a `postgresql.conf` parameter,\n`spock.extra_connection_options`, that may be set to assign connection\noptions that apply to all connections made by spock. This can be a useful\nplace to set up custom keepalive options, etc.\n\nspock defaults to enabling TCP keepalives to ensure that it notices\nwhen the upstream server disappears unexpectedly. To disable them, add\n`keepalives = 0` to `spock.extra_connection_options`.\n\n### Replication sets\n\nReplication sets provide a mechanism to control which tables in the database\nwill be replicated and which actions on those tables will be replicated.\n\nEach replication set can specify individually whether `INSERTs`, `UPDATEs`,\n`DELETEs` and `TRUNCATEs` on the set are replicated. Every table can be in\nmultiple replication sets and every subscriber can subscribe to multiple\nreplication sets as well. The resulting set of tables and actions replicated\nis the union of the sets the table is in. The tables are not replicated until\nthey are added to a replication set.\n\nThere are three preexisting replication sets named \"default\",\n\"default_insert_only\" and \"ddl_sql\". The \"default\" replication set is defined\nto replicate all changes to tables in it. 
The \"default_insert_only\" only\nreplicates INSERTs and is meant for tables that don't have a primary key (see\n[Limitations](#primary-key-required) section for details).\nThe \"ddl_sql\" replication set is defined to replicate schema changes specified by\n`spock.replicate_ddl`.\n\nThe following functions are provided for managing the replication sets:\n\n#### spock-repset-create\n- `spock.repset_create(set_name name, replicate_insert bool, replicate_update bool, replicate_delete bool, replicate_truncate bool)`\n  This function creates a new replication set.\n\n  Parameters:\n  - `set_name` - name of the set, must be unique\n  - `replicate_insert` - specifies if `INSERT` is replicated, default true\n  - `replicate_update` - specifies if `UPDATE` is replicated, default true\n  - `replicate_delete` - specifies if `DELETE` is replicated, default true\n  - `replicate_truncate` - specifies if `TRUNCATE` is replicated, default true\n\n#### spock-repset-alter\n- `spock.repset_alter(set_name name, replicate_insert bool, replicate_update bool, replicate_delete bool, replicate_truncate bool)`\n  This function changes the parameters of the existing replication set.\n\n  Parameters:\n  - `set_name` - name of the existing replication set\n  - `replicate_insert` - specifies if `INSERT` is replicated, default true\n  - `replicate_update` - specifies if `UPDATE` is replicated, default true\n  - `replicate_delete` - specifies if `DELETE` is replicated, default true\n  - `replicate_truncate` - specifies if `TRUNCATE` is replicated, default true\n\n#### spock-repset-drop\n- `spock.repset_drop(set_name text)`\n  Removes the replication set.\n\n  Parameters:\n  - `set_name` - name of the existing replication set\n\n#### spock-repset-add-table\n- `spock.repset_add_table(set_name name, relation regclass, sync_data boolean, columns text[], row_filter text)`\n  Adds a table to a replication set.\n\n  Parameters:\n  - `set_name` - name of the existing replication set\n  - `relation` - name 
or OID of the table to be added to the set
  - `sync_data` - if true, the table data is synchronized on all
    subscribers which are subscribed to the given replication set, default false
  - `columns` - list of columns to replicate. Normally, when all columns
    should be replicated, this will be set to NULL, which is the
    default
  - `row_filter` - row filtering expression, default NULL (no filtering),
    see [Row Filtering](#row-filtering) for more info.

  **WARNING: Use caution when synchronizing data with a row filter.**
  Using `sync_data=true` with a `row_filter` is a one-time operation for a table.
  Executing it again with a modified `row_filter` won't synchronize data to the subscriber.
  Subscribers may need to call `spock.alter_sub_resync_table()` to fix it.

#### spock-repset-add-all-tables
- `spock.repset_add_all_tables(set_name name, schema_names text[], sync_data boolean)`
  Adds all tables in the given schemas. Only existing tables are added; tables
  created in the future will not be added automatically. To ensure
  that tables created in the future are added to the correct replication set, see
  [Automatic assignment of replication sets for new tables](#automatic-assignment-of-replication-sets-for-new-tables).

  Parameters:
  - `set_name` - name of the existing replication set
  - `schema_names` - array of names of existing schemas from which tables
    should be added
  - `sync_data` - if true, the table data is synchronized on all
    subscribers which are subscribed to the given replication set, default false

#### spock-repset-remove-table
- `spock.repset_remove_table(set_name name, relation regclass)`
  Removes a table from a replication set.

  Parameters:
  - `set_name` - name of the existing replication set
  - `relation` - name or OID of the table to be removed from the set

#### spock-repset-add-seq
*Warning:* For a multi-master system, adding sequences to replication sets is not recommended.
Use our new [Snowflake Sequences](https://github.com/pgEdge/snowflake-sequences) instead.
- `spock.repset_add_seq(set_name name, relation regclass, sync_data boolean)`
  Adds a sequence to a replication set.

  Parameters:
  - `set_name` - name of the existing replication set
  - `relation` - name or OID of the sequence to be added to the set
  - `sync_data` - if true, the sequence value will be synchronized immediately, default false

#### spock-repset-add-all-seqs
*Warning:* For a multi-master system, adding sequences to replication sets is not recommended. Use our new [Snowflake Sequences](https://github.com/pgEdge/snowflake-sequences) instead.
- `spock.repset_add_all_seqs(set_name name, schema_names text[], sync_data boolean)`
  Adds all sequences from the given schemas. Only existing sequences are added; any sequences
  created in the future will not be added automatically.

  Parameters:
  - `set_name` - name of the existing replication set
  - `schema_names` - array of names of existing schemas from which sequences
    should be added
  - `sync_data` - if true, the sequence value will be synchronized immediately, default false

#### spock-repset-remove-seq
- `spock.repset_remove_seq(set_name name, relation regclass)`
  Removes a sequence from a replication set.

  Parameters:
  - `set_name` - name of the existing replication set
  - `relation` - name or OID of the sequence to be removed from the set

You can view information about which table is in which set by querying the
`spock.tables` view.

#### Automatic assignment of replication sets for new tables

The event trigger facility can be used to describe rules which define
replication sets for newly created tables.

Example:

    CREATE OR REPLACE FUNCTION spock_assign_repset()
    RETURNS event_trigger AS $$
    DECLARE obj record;
    BEGIN
        FOR obj IN SELECT * FROM pg_event_trigger_ddl_commands()
        LOOP
            IF obj.object_type = 'table' THEN
                IF obj.schema_name = 'config' THEN
                    PERFORM spock.repset_add_table('configuration', obj.objid);
                ELSIF NOT obj.in_extension THEN
                    PERFORM spock.repset_add_table('default', obj.objid);
                END IF;
            END IF;
        END LOOP;
    END;
    $$ LANGUAGE plpgsql;

    CREATE EVENT TRIGGER spock_assign_repset_trg
        ON ddl_command_end
        WHEN TAG IN ('CREATE TABLE', 'CREATE TABLE AS')
        EXECUTE PROCEDURE spock_assign_repset();

The above example will put all new tables created in the `config` schema into the
`configuration` replication set; all other new tables which are not created
by extensions will go to the `default` replication set.

### Additional functions

#### spock-replicate-ddl
- `spock.replicate_ddl(command text, repsets text[])`
  Executes the specified command locally and then sends it to the replication queue
  for execution on subscribers which are subscribed to one of the specified
  `repsets`.

  Parameters:
  - `command` - DDL query to execute
  - `repsets` - array of replication sets which this command should be
    associated with, default "{ddl_sql}"

#### spock-seq-sync
- `spock.seq_sync(relation regclass)`
  Pushes sequence state to all subscribers. Unlike the subscription and table
  synchronization functions, this function should be run on the provider.
It forces an
  update of the tracked sequence state which will be consumed by all
  subscribers (replication set filtering still applies) once they replicate the
  transaction in which this function has been executed.

  Parameters:
  - `relation` - name of existing sequence, optionally qualified

### Row Filtering

Spock allows row-based filtering on both the provider side and the subscriber
side.

#### Row Filtering on Provider

On the provider, row filtering can be done by specifying the `row_filter`
parameter for the `spock.repset_add_table` function. The
`row_filter` is a normal PostgreSQL expression with the same limitations
on what's allowed as a `CHECK` constraint.

A simple `row_filter` would look something like `row_filter := 'id > 0'`, which
would ensure that only rows where the value of the `id` column is bigger than zero
will be replicated.

It's allowed to use volatile functions inside `row_filter`, but caution must
be exercised with regard to writes, as any expression which performs writes
will throw an error and stop replication.

It's also worth noting that the `row_filter` runs inside the replication
session, so session-specific expressions such as `CURRENT_USER` will have the
values of the replication session and not the session which did the writes.

#### Row Filtering on Subscriber

On the subscriber, row-based filtering can be implemented using the standard
`BEFORE` trigger mechanism.

It is required to mark any such triggers as either `ENABLE REPLICA` or
`ENABLE ALWAYS`, otherwise they will not be executed by the replication
process.

## Synchronous Replication

Synchronous replication is supported using the same standard mechanism provided
by PostgreSQL for physical replication.

The `synchronous_commit` and `synchronous_standby_names` settings will affect
when the `COMMIT` command reports success to the client if a spock subscription
name is used in `synchronous_standby_names`.
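As a sketch, assuming a spock subscription named `sub1` (an illustrative name), making commits wait for that subscriber could look like:

```sql
-- Illustrative settings; "sub1" is an assumed subscription name.
ALTER SYSTEM SET synchronous_standby_names = 'sub1';
ALTER SYSTEM SET synchronous_commit = 'on';
SELECT pg_reload_conf();
```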
Refer to the PostgreSQL
documentation for more info about how to configure these two variables.

- `spock.batch_inserts`
  The batch inserts will improve replication performance of transactions that
  did many inserts into one table. Spock will switch to batch mode when a
  transaction does more than 5 `INSERT`s.

  It's only possible to switch to batch mode when there are no
  `INSTEAD OF INSERT` and `BEFORE INSERT` triggers on the table and when
  there are no defaults with volatile expressions for columns of the table.
  Also, batch mode will only work when `spock.conflict_resolution` is
  set to `error`.

  The default is `true`.

- `spock.use_spi`
  Tells Spock to use the SPI interface to form actual SQL
  (`INSERT`, `UPDATE`, `DELETE`) statements to apply incoming changes instead
  of using the internal low-level interface.

  This is mainly useful for debugging purposes.

  The default in PostgreSQL is `false`.

  This can be set to `true` only when `spock.conflict_resolution` is set to `error`.
  In this state, conflicts are not detected.

- `spock.temp_directory`
  Defines the system path where to put temporary files needed for schema
  synchronization. This path needs to exist and be writable by the user running
  Postgres.

  The default is empty, which tells Spock to use the default temporary directory
  based on environment and operating system settings.

## Limitations and restrictions

### Superuser is required

Currently, spock replication and administration requires superuser
privileges. It may later be extended to more granular privileges.

### `UNLOGGED` and `TEMPORARY` not replicated

`UNLOGGED` and `TEMPORARY` tables will not and cannot be replicated, much like
with physical streaming replication.

### One database at a time

To replicate multiple databases you must set up individual provider/subscriber
relationships for each.
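For instance, a sketch of subscribing to two databases (subscription names and DSNs are illustrative, using the `spock.sub_create` subscription management function):

```sql
-- One subscription per database; names and DSNs are assumptions.
-- Run in database "app1" on the subscriber:
SELECT spock.sub_create('sub_app1', 'host=provider dbname=app1');
-- Run in database "app2" on the subscriber:
SELECT spock.sub_create('sub_app2', 'host=provider dbname=app2');
```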
There is no way to configure replication for all databases
in a PostgreSQL install at once.

### PRIMARY KEY or REPLICA IDENTITY required

`UPDATE`s and `DELETE`s cannot be replicated for tables that lack a `PRIMARY
KEY` or other valid replica identity, such as a unique index that is
not partial, not deferrable, and includes only columns marked `NOT NULL`.
Replication has no way to find the tuple that should be updated/deleted since
there is no unique identifier.
`REPLICA IDENTITY FULL` is not supported yet.


### Only one unique index/constraint/PK

If more than one upstream is configured, or the downstream accepts local writes,
then only one `UNIQUE` index should be present on downstream replicated tables.
Conflict resolution can only use one index at a time, so conflicting rows may
`ERROR` if a row satisfies the `PRIMARY KEY` but violates a `UNIQUE` constraint
on the downstream side. This will stop replication until the downstream table
is modified to remove the violation.

It's fine to have extra unique constraints on an upstream if the downstream only
gets writes from that upstream and nowhere else. The rule is that the downstream
constraints must *not be more restrictive* than those on the upstream(s).

Partial secondary unique indexes are permitted, but will be ignored for
conflict resolution purposes.

### Unique constraints must not be deferrable

On the downstream end, spock does not support index-based constraints
defined as `DEFERRABLE`. It will emit the error

    ERROR: spock doesn't support index rechecks needed for deferrable indexes
    DETAIL: relation "public"."test_relation" has deferrable indexes: "index1", "index2"

if such an index is present when it attempts to apply changes to a table.

### DDL

Automatic DDL replication is not supported.
Managing DDL so that the provider and
subscriber database(s) remain compatible is the responsibility of the user.

spock provides the `spock.replicate_ddl` function to allow DDL
to be run on the provider and subscriber at a consistent point.

### No replication queue flush

There's no support for freezing transactions on the master and waiting until
all pending queued xacts are replayed from slots. Support for making the
upstream read-only for this will be added in a future release.

This means that care must be taken when applying table structure changes. If
there are committed transactions that aren't yet replicated and the table
structure of the provider and subscriber is changed at the same time in a way
that makes the subscriber table incompatible with the queued transactions,
replication will stop.

Administrators should either ensure that writes to the master are stopped
before making schema changes, or use the `spock.replicate_ddl`
function to queue schema changes so they're replayed at a consistent point
on the replica.

Once multi-master replication support is added, using
`spock.replicate_ddl` will not be enough, as the subscriber may be
generating new xacts with the old structure after the schema change is
committed on the publisher. Users will have to ensure writes are stopped on all
nodes and all slots are caught up before making schema changes.

### FOREIGN KEYS

Foreign key constraints are not enforced for the replication process - what
succeeds on the provider side gets applied to the subscriber even if the `FOREIGN KEY`
would be violated.

### TRUNCATE

Using `TRUNCATE ... CASCADE` will only apply the `CASCADE` option on the
provider side.

(Properly handling this would probably require the addition of `ON TRUNCATE CASCADE`
support for foreign keys in PostgreSQL.)

`TRUNCATE ... RESTART IDENTITY` is not supported.
The identity restart step is
not replicated to the replica.

### Sequences

We strongly recommend that you use our new [Snowflake Sequences](https://github.com/pgEdge/snowflake-sequences) rather
than the legacy sequences described below.

The state of sequences added to replication sets is replicated periodically
and not in real-time. A dynamic buffer is used for the value being replicated so
that the subscribers actually receive a future state of the sequence. This
minimizes the chance of the subscriber's notion of the sequence's `last_value` falling
behind, but does not completely eliminate the possibility.

It might be desirable to call `spock.seq_sync` to ensure all subscribers
have up-to-date information about a given sequence after "big events" in the
database such as data loading or during an online upgrade.

It's generally recommended to use `bigserial` and `bigint` types for sequences
on multi-node systems, as smaller sequences might reach the end of the sequence
space fast.

Users who want to have independent sequences on the provider and subscriber can
avoid adding sequences to replication sets and instead create sequences with a step
interval equal to or greater than the number of nodes, then set a
different offset on each node. Use the `INCREMENT BY` option for
`CREATE SEQUENCE` or `ALTER SEQUENCE`, and use `setval(...)` to set the start
point.

### Triggers

The apply process and the initial COPY process both run with
`session_replication_role` set to `replica`, which means that `ENABLE REPLICA`
and `ENABLE ALWAYS` triggers will be fired.

### PostgreSQL Version differences

Spock can replicate across PostgreSQL major versions. Despite that, long-term
cross-version replication is not considered a design target, though it may
often work.
Issues where changes are valid on the provider but not on the
subscriber are more likely to arise when replicating across versions.

It is safer to replicate from an old version to a newer version, since PostgreSQL
maintains solid backward compatibility but only limited forward compatibility.
Initial schema synchronization is only supported when replicating between the same
version of PostgreSQL or from a lower version to a higher version.

Replicating between different minor versions makes no difference at all.

### Database encoding differences

Spock does not support replication between databases with different
encodings. We recommend using `UTF-8` encoding in all replicated databases.

### Large objects

PostgreSQL's logical decoding facility does not support decoding changes
to [large objects](https://www.postgresql.org/docs/current/largeobjects.html),
so spock cannot replicate large objects.

Also, any DDL limitations apply, so extra care needs to be taken when using
`spock.replicate_ddl`.


## Spock Read Only

Spock supports enabling a cluster to be operated in read-only mode.

The read-only status is managed using a GUC (Grand Unified Configuration) parameter
named `spock.readonly`. This parameter can be set to enable or disable the read-only
mode.
The read-only mode restricts non-superusers to read-only operations, while
superusers can still perform both read and write operations regardless of the setting.

The flag is at cluster level: either all databases are read-only or all databases
are read-write (the usual setting).

The read-only mode is implemented by filtering SQL statements:

- SELECT statements are allowed if they don't call functions that write.
- DML (INSERT, UPDATE, DELETE) and DDL statements, including TRUNCATE, are
  forbidden entirely.
- DCL statements GRANT and REVOKE are also forbidden.

This means that the databases are in read-only mode at the SQL level; however, the
checkpointer, background writer, walwriter, and the autovacuum launcher are still
running. This means that the database files are not read-only and that in some
cases the database may still write to disk.

### Cluster Read-Only Mode

The cluster read-only mode can be controlled using the GUC parameter `spock.readonly`.
This configuration parameter allows you to set the cluster to read-only mode. Note that only a
superuser can change this setting. When the cluster is set to read-only mode, non-superusers will
be restricted to read-only operations, while superusers will still be able to perform read and write
operations regardless of the setting.

### Setting Read-Only Mode

This value can be changed using the `ALTER SYSTEM` command:

```sql
ALTER SYSTEM SET spock.readonly = 'on';
SELECT pg_reload_conf();
```

To set the cluster to read-only mode for a session, use the `SET` command.
For example:

```sql
SET spock.readonly TO on;
```

To query the current status of the cluster, you can use the following SQL command:

```sql
SHOW spock.readonly;
```

This command will return `on` if the cluster is in read-only mode and `off` if it is not.

Notes:
 - Only superusers can set and unset the `spock.readonly` parameter.
 - When the cluster is in read-only mode, only non-superusers are restricted to read-only operations. Superusers can continue to perform both read and write operations.
 - By using a GUC parameter, you can easily manage the cluster's read-only status through standard PostgreSQL configuration mechanisms.

Spock is licensed under the [pgEdge Community License v1.0](PGEDGE-COMMUNITY-LICENSE.md)