{"id":15567871,"url":"https://github.com/the4thdoctor/repcloud","last_synced_at":"2025-08-19T11:09:18.577Z","repository":{"id":35113877,"uuid":"195761328","full_name":"the4thdoctor/repcloud","owner":"the4thdoctor","description":"postgresql repack in cloud with a twist","archived":false,"fork":false,"pushed_at":"2021-03-31T08:44:27.000Z","size":371,"stargazers_count":12,"open_issues_count":2,"forks_count":3,"subscribers_count":2,"default_branch":"master","last_synced_at":"2025-07-29T19:26:32.691Z","etag":null,"topics":["cloud","database","maintenance","postgresql","repack"],"latest_commit_sha":null,"homepage":null,"language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"postgresql","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/the4thdoctor.png","metadata":{"files":{"readme":"README.rst","changelog":null,"contributing":null,"funding":".github/FUNDING.yml","license":"LICENSE.txt","code_of_conduct":"CODE_OF_CONDUCT.md","threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null},"funding":{"github":"the4thdoctor","patreon":null,"open_collective":null,"ko_fi":"the4thdoctor","tidelift":null,"community_bridge":null,"liberapay":null,"issuehunt":null,"otechie":null,"custom":null}},"created_at":"2019-07-08T07:42:59.000Z","updated_at":"2024-08-18T18:28:47.000Z","dependencies_parsed_at":"2022-07-24T20:32:17.396Z","dependency_job_id":null,"html_url":"https://github.com/the4thdoctor/repcloud","commit_stats":null,"previous_names":[],"tags_count":1,"template":false,"template_full_name":null,"purl":"pkg:github/the4thdoctor/repcloud","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/the4thdoctor%2Frepcloud","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/the4thdoctor%2Frepcloud/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/the4thdoctor%2Frepcloud/releases","manif
ests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/the4thdoctor%2Frepcloud/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/the4thdoctor","download_url":"https://codeload.github.com/the4thdoctor/repcloud/tar.gz/refs/heads/master","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/the4thdoctor%2Frepcloud/sbom","scorecard":null,"host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":271143398,"owners_count":24706346,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","status":"online","status_checked_at":"2025-08-19T02:00:09.176Z","response_time":63,"last_error":null,"robots_txt_status":"success","robots_txt_updated_at":"2025-07-24T06:49:26.215Z","robots_txt_url":"https://github.com/robots.txt","online":true,"can_crawl_api":true,"host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["cloud","database","maintenance","postgresql","repack"],"created_at":"2024-10-02T17:13:45.607Z","updated_at":"2025-08-19T11:09:18.553Z","avatar_url":"https://github.com/the4thdoctor.png","language":"Python","funding_links":["https://github.com/sponsors/the4thdoctor","https://ko-fi.com/the4thdoctor"],"categories":[],"sub_categories":[],"readme":"repcloud\n------------------------------\n.. image:: https://img.shields.io/github/issues/the4thdoctor/repcloud\n\t\t:alt: GitHub issues\n\t\t:target: https://github.com/the4thdoctor/repcloud/issues\n\n.. image:: https://img.shields.io/github/forks/the4thdoctor/repcloud\n\t\t:alt: GitHub forks\n\t\t:target: https://github.com/the4thdoctor/repcloud/network\n\n.. 
image:: https://img.shields.io/badge/License-PostgreSQL-blue\n\t\t:target: https://github.com/the4thdoctor/repcloud/issues\n\n.. image:: https://img.shields.io/github/release/the4thdoctor/repcloud\n\t\t:target: https://github.com/the4thdoctor/repcloud/release\n\n.. image:: https://img.shields.io/pypi/dm/repcloud\n    :target: https://pypi.org/project/repcloud\n\n\nrepcloud is a repacker for PostgreSQL tables. Unlike pgrepack, there's no need for extensions or external libraries.\n\nThe procedure can repack the tables using a strategy similar to pgrepack's, but without the physical file swap.\n\nThis allows the procedure to be executed in an environment where it is not possible to install external libraries, or\nwhere there is no superuser access (e.g. cloud-hosted databases, hence the name).\n\nWhen repacking, the process creates a copy of the original table and copies the existing data into the new relation using an insert from select.\nA trigger on the original table stores the data changes, which are replayed on the new one before attempting the swap.\n\nAll the existing indices, foreign keys, and referencing foreign keys are created before the swap.\nViews and materialised views referencing the repacked table are dropped and re-created as well.\n\nAcknowledgement\n...................................\nCoding repcloud has been possible thanks to the sponsorship of `Cleo AI. https://www.meetcleo.com/ \u003chttps://www.meetcleo.com/\u003e`_ \n\n.. image:: https://raw.githubusercontent.com/the4thdoctor/repcloud/master/images/cleo_logo.png\n        :target: https://www.meetcleo.com/\n        :scale: 30 %\n\n\nConfiguration\n...................................\n\nThe script, which executes the repack, is rpcl. 
At its first execution it creates a directory named .repcloud in the user's home.\nUnder this directory there are three subfolders.\n\n.repcloud/logs where the procedure's logs are stored\n.repcloud/pid where the procedure's pid file is stored\n.repcloud/config where the configurations are stored.\nThe file config-example.toml is copied into the folder .repcloud/config. It is a template for the configuration.\n\nThe rpcl command line accepts the following options:\n\n* --config specifies the config file to use in .repcloud/config. If omitted, the default configuration default.toml is used\n* --connection specifies which connection to use within the configuration file. If omitted, any connection is used for repacking\n* --debug forces the process to run in the foreground, with the log sent to both file and console\n* --start-replay starts the replay_data process as soon as prepare_repack is finished. It applies only to prepare_repack.\n\nWithout --debug and with log_dest set to file, the process starts in the background.\n\nrpcl accepts the following commands:\n\n* show_connections shows the connections defined within the configuration file\n* create_schema creates the repack helper schemas in the target database\n* drop_schema drops the repack helper schemas from the target database. If any table failed the repack, its copy is dropped as well\n* repack_tables repacks the tables listed within the connection\n* prepare_repack prepares the tables for the repack, creates the new table, copies the data, and builds the indices. It stops before the swap.\n* abort_repack cancels any table prepared for repack and resets the status of any table left in failed or in-progress status. The logging triggers, the log table, and the copy table, if present, are dropped by the command.\n* replay_data starts a replay data daemon. Useful to avoid a large replay lag building up between prepare_repack and the final swap. 
It can be started automatically at the end of prepare_repack with the option --start-replay\n* stop_prepare terminates the background process of prepare_repack\n* stop_repack terminates the background process of repack_tables\n* stop_replay terminates the background process of replay_data\n\nPlease note that prepare_repack requires much more space than repack_tables because all tables are copied and prepared for the repack instead of being repacked and dropped\none by one.\n\n\nIn the configuration file the notifier and notifier.email sections allow setting up an email notification, which is triggered when the repack or prepare_repack process is complete.\n\nFillfactor\n+++++++++++++++\nThe tool supports the **fillfactor** setup for the repacked tables. This is possible using a specific configuration file stored in the directory *~/.repcloud/config/table_conf*.\n\nThe file describing the storage settings must be named after the configuration and the connection to which the settings apply, in the form *\u003cconfiguration\u003e_\u003cconnection\u003e.toml*.\n\nFor example, if we are using the configuration *default.toml* with the connection *repack*, the table configuration file's name should be\n*default_repack.toml*.\n\nIf the table settings file is not present then the default values are used.\n\nInside the directory *~/.repcloud/config/table_conf* there is an example file to help with the configuration.\n\nAt the moment the configuration supports only **fillfactor** as a storage parameter.\n\nA global fillfactor which applies to any table in the database can be set under the section **[storage]**.\n\nA schema-wide fillfactor is supported by adding the value under the section **[storage.schemaname]**.\n\nThe fillfactor for a single table can be set using the section named after the schema and the table **[storage.schemaname.tablename]**.\n\nThe example configuration file sets the fillfactor:\n\n  * for all the tables in the database to 100\n  * for all the tables in the schema foo to 
80\n  * for the table foo.bar to 30\n\n\n::\n\n    #table configuration example\n    # storage data. currently only fillfactor is allowed\n    \n    #set the fillfactor for all the tables \n    [storage]\n    fillfactor = 100 \n    \n    #sets the fillfactor for all the tables in the schema foo\n    [storage.foo]\n    fillfactor = 80 \n    \n    #set the fillfactor for the table foo.bar\n    [storage.foo.bar]\n    fillfactor = 30 \n\nCleanup json/jsonb\n++++++++++++++++++++++++++++++++\n\nIn the table's configuration file it's possible to specify whether to strip keys with null values from json/jsonb fields.\nIt's also possible to remove specific keys entirely, but this applies only to the jsonb data type.\nThe table's configuration file provides both examples.\n\n::\n\n\t[public.foobar]\n\t#cleanup_nulls and remove_keys for the same field are  currently mutually exclusive with cleanup_nulls taking the precedence\n\t#strip nulls from a json/jsonb field\n\tfoo.cleanup_nulls = true\n\n\t#filtering data, based on the key currently only jsonb is supported\n\tbar.remove_keys = [ \"key1\" ]\n\nExample files\n++++++++++++++++++++++++++++++++\n\n\nExample configuration file: config-example.toml_.\n\n.. _config-example.toml: https://github.com/the4thdoctor/repcloud/blob/master/repcloud/config/config-example.toml\n\n\nExample table setup for configuration **config-example** and connection **repack**: config-example_repack.toml_.\n\n.. _config-example_repack.toml: https://github.com/the4thdoctor/repcloud/blob/master/repcloud/config/config-example_repack.toml\n\n\nLimitations\n............................\n\nThe procedure needs to be able to drop all the objects involved in the repack. Therefore the login user must own the objects or\notherwise be able to drop them.\n\nThe swap requires an exclusive lock on the old relation for the time necessary to move the new relation into the correct schema and drop the old relation.\nIf an error occurs during this phase, everything is rolled back. 
The procedure resumes the replay and will attempt the swap again after a sufficient amount of data has been replayed.\n\nCurrently there is no support for single index repack or tablespace change.\n\nA connection must have a header in the form [connections.\u003cconnection_name\u003e]\n\nEach connection requires the database connection data: user, password, port, host, database, sslmode.\n\nThe lists schemas and tables specify which schemas or tables to repack. If omitted, the repack will process any table within the database.\n\nThe parameter max_replay_rows specifies how many rows should be replayed at once during the replay phase.\nlock_timeout specifies how long the process should wait to acquire the lock on the table to swap before giving up. If the lock_timeout expires, the swap is delayed\nuntil a sufficient number of rows have been replayed again.\n\ncheck_time specifies the time between two checks for changed data on the repacked table. The value will be matched against the replay speed in order to determine\nwhether the replay can reach a consistent status with the original table.\nIf that is not possible, the swap attempt aborts.\n\nIn case of deadlock, it's possible to specify the resolution strategy with the connection's parameter **deadlock_resolution**.\nThe possible values are *nothing, cancel_query, kill_query*.\n\nWith **nothing** the deadlock resolution will be managed by the database. With **cancel_query** the blocking queries will be cancelled with **pg_cancel_backend**. 
\nWith **kill_query** the blocking queries will be terminated with **pg_terminate_backend**.\n\nThe example configuration file has the parameter set to nothing.\n\n::\n\n\tdeadlock_resolution = \"nothing\"\n\n\nLicense\n------------------------------\nrepcloud is released under the terms of the `PostgreSQL license - https://opensource.org/licenses/postgresql \u003chttps://opensource.org/licenses/postgresql\u003e`_\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fthe4thdoctor%2Frepcloud","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fthe4thdoctor%2Frepcloud","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fthe4thdoctor%2Frepcloud/lists"}