{"id":26650609,"url":"https://github.com/aiven/pghoard","last_synced_at":"2025-03-25T02:01:53.930Z","repository":{"id":33884311,"uuid":"37595806","full_name":"Aiven-Open/pghoard","owner":"Aiven-Open","description":"PostgreSQL® backup and restore service","archived":false,"fork":false,"pushed_at":"2025-03-14T12:07:52.000Z","size":8743,"stargazers_count":1336,"open_issues_count":55,"forks_count":98,"subscribers_count":78,"default_branch":"main","last_synced_at":"2025-03-20T22:09:56.767Z","etag":null,"topics":["aiven","backup","cloud-object-storage","postgresql","restore"],"latest_commit_sha":null,"homepage":"http://aiven-open.github.io/pghoard/","language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"apache-2.0","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/Aiven-Open.png","metadata":{"files":{"readme":"README.rst","changelog":"NEWS","contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":"CODE_OF_CONDUCT.md","threat_model":null,"audit":null,"citation":null,"codeowners":".github/CODEOWNERS","security":"SECURITY.md","support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null}},"created_at":"2015-06-17T13:13:40.000Z","updated_at":"2025-03-18T11:30:21.000Z","dependencies_parsed_at":"2023-10-11T20:43:01.906Z","dependency_job_id":"63f2b4cc-ea67-4cfc-a783-75c91e000140","html_url":"https://github.com/Aiven-Open/pghoard","commit_stats":{"total_commits":858,"total_committers":72,"mean_commits":"11.916666666666666","dds":0.7144522144522145,"last_synced_commit":"28ee54fe2f15415048bfc374cd398b87bf8c0132"},"previous_names":["aiven/pghoard","ohmu/pghoard"],"tags_count":27,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/Aiven-Open%2Fpghoard","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/Aiven-Open%2Fpghoard/tags","releases
_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/Aiven-Open%2Fpghoard/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/Aiven-Open%2Fpghoard/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/Aiven-Open","download_url":"https://codeload.github.com/Aiven-Open/pghoard/tar.gz/refs/heads/main","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":245383037,"owners_count":20606265,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["aiven","backup","cloud-object-storage","postgresql","restore"],"created_at":"2025-03-25T02:01:50.828Z","updated_at":"2025-03-25T02:01:53.891Z","avatar_url":"https://github.com/Aiven-Open.png","language":"Python","funding_links":[],"categories":["Python","Compiled list","Backups"],"sub_categories":["plv8:"],"readme":"PGHoard |BuildStatus|_\n======================\n\n.. |BuildStatus| image:: https://github.com/aiven/pghoard/actions/workflows/build.yml/badge.svg?branch=main\n.. _BuildStatus: https://github.com/aiven/pghoard/actions\n.. 
image:: https://codecov.io/gh/aiven/pghoard/branch/main/graph/badge.svg?token=nLr7M7hvCx\n   :target: https://codecov.io/gh/aiven/pghoard\n\n``pghoard`` is a PostgreSQL® backup daemon and restore tooling that stores backup data in cloud object stores.\n\nFeatures:\n\n* Automatic periodic basebackups\n* Automatic transaction log (WAL/xlog) backups (using either ``pg_receivexlog``,\n  ``archive_command`` or experimental PG native replication protocol support with ``walreceiver``)\n* Optional Standalone Hot Backup support\n* Cloud object storage support (AWS S3, Google Cloud, OpenStack Swift, Azure, Ceph)\n* Backup restoration directly from object storage, compressed and encrypted\n* Point-in-time-recovery (PITR)\n* Initialize a new standby from object storage backups, automatically configured as\n  a replicating hot-standby\n\nFault-resilience and monitoring:\n\n* Persists over temporary object storage connectivity issues by retrying transfers\n* Verifies WAL file headers before upload (backup) and after download (restore),\n  so that e.g. files recycled by PostgreSQL are ignored\n* Automatic history cleanup (backups and related WAL files older than N days)\n* \"Archive sync\" tool for detecting holes in WAL backup streams and fixing them\n* \"Archive cleanup\" tool for deleting obsolete WAL files from the archive\n* Keeps statistics updated in a file on disk (for monitoring tools)\n* Creates alert files on disk on problems (for monitoring tools)\n\nPerformance:\n\n* Parallel compression and encryption\n* WAL pre-fetching on restore\n\n\nOverview\n========\n\nPostgreSQL Point In Time Recovery (PITR) consists of having a database\nbasebackup plus the changes made after that point, which go into WAL files that can be\nreplayed to reach the desired recovery point.\n\nPGHoard supports multiple operating models.  
In the basic mode, where you have a\nseparate backup machine, ``pghoard`` can simply connect with\n``pg_receivexlog`` to receive WAL files from the database as they're\nwritten.  Another model is to use ``pghoard_postgres_command`` as a\nPostgreSQL ``archive_command``. There is also experimental support for PGHoard to\nuse PostgreSQL's native replication protocol with the experimental\n``walreceiver`` mode.\n\nIn all of these modes of operation PGHoard creates periodic basebackups using\n``pg_basebackup``, which is run against the database in question.\n\nThe PostgreSQL write-ahead log (WAL) and basebackups are compressed with\nSnappy (default), Zstandard (configurable, level 3 by default) or LZMA (configurable,\nlevel 0 by default) in order to ensure good compression speed and relatively small backup size.\nFor performance critical applications it is recommended to test compression\nalgorithms to find the most suitable trade-off for the particular use-case.\nFor example, Snappy is fast but yields larger compressed files, while Zstandard (zstd)\noffers a very wide range of compression/speed trade-offs.\n\nOptionally, PGHoard can encrypt backed up data at rest. Each individual\nfile is encrypted and authenticated with file specific keys. The file\nspecific keys are included in the backup, in turn encrypted with a master\nRSA private/public key pair.\n\nPGHoard supports backing up and restoring from either a local filesystem or\nfrom various object stores (AWS S3, Azure, Ceph, Google Cloud and OpenStack\nSwift).\n\nIn case you just have a single database machine, it is strongly recommended\nto utilize one of the object storage services to allow backup recovery even\nif the host running PGHoard is incapacitated.\n\n\nRequirements\n============\n\nPGHoard can back up and restore PostgreSQL versions 9.6 and above, but is\nonly tested and actively developed with version 12 and above.\n\nThe daemon is implemented in Python and is tested and developed with version\n3.10 and above. 
The following Python modules are required:\n\n* psycopg2_ to look up transaction log metadata\n* requests_ for the internal client-server architecture\n\n.. _`psycopg2`: http://initd.org/psycopg/\n.. _`requests`: http://www.python-requests.org/en/latest/\n\nOptional requirements include:\n\n* azure_ for Microsoft Azure object storage (patched version required, see link)\n* botocore_ for AWS S3 (or Ceph-S3) object storage\n* google-api-client_ for Google Cloud object storage\n* cryptography_ for backup encryption and decryption (version 0.8 or newer required)\n* snappy_ for Snappy compression and decompression\n* zstandard_ for Zstandard (zstd) compression and decompression\n* systemd_ for systemd integration\n* swiftclient_ for OpenStack Swift object storage\n* paramiko_  for sftp object storage\n\n.. _`azure`: https://github.com/aiven/azure-sdk-for-python/tree/aiven/rpm_fixes\n.. _`botocore`: https://github.com/boto/botocore\n.. _`google-api-client`: https://github.com/google/google-api-python-client\n.. _`cryptography`: https://cryptography.io/\n.. _`snappy`: https://github.com/andrix/python-snappy\n.. _`zstandard`: https://github.com/indygreg/python-zstandard\n.. _`systemd`: https://github.com/systemd/python-systemd\n.. _`swiftclient`: https://github.com/openstack/python-swiftclient\n.. _`paramiko`: https://github.com/paramiko/paramiko\n\nDeveloping and testing PGHoard also requires the following utilities:\nflake8_, pylint_ and pytest_.\n\n.. _`flake8`: https://flake8.readthedocs.io/\n.. _`pylint`: https://www.pylint.org/\n.. _`pytest`: http://pytest.org/\n\nPGHoard has been developed and tested on modern Linux x86-64 systems, but\nshould work on other platforms that provide the required modules.\n\nVagrant\n=======\n\nThe Vagrantfile can be used to setup a vagrant development environment.   
The vagrant environment has\npython 3.10, 3.11 and 3.12 virtual environments and installations of postgresql 12, 13, 14, 15 and 16.\n\nBy default vagrant up will start a Virtualbox environment. The Vagrantfile will also work for libvirt, just prefix\n``VAGRANT_DEFAULT_PROVIDER=libvirt`` to the ``vagrant up`` command.\n\nAny combination of Python (3.10, 3.11 and 3.12) and Postgresql (12, 13, 14, 15, 16 and 17)\n\nBring up vagrant instance and connect via ssh::\n\n  vagrant up\n  vagrant ssh\n  vagrant@ubuntu2004:~$ cd /vagrant\n\nTest with Python 3.11 and Postgresql 12::\n\n  vagrant@ubuntu2004:~$ source ~/venv3.11/bin/activate\n  vagrant@ubuntu2004:~$ PG_VERSION=12 make unittest\n  vagrant@ubuntu2004:~$ deactivate\n\nTest with Python 3.12 and Postgresql 13::\n\n  vagrant@ubuntu2004:~$ source ~/venv3.12/bin/activate\n  vagrant@ubuntu2004:~$ PG_VERSION=13 make unittest\n  vagrant@ubuntu2004:~$ deactivate\n\nAnd so on\n\n\nBuilding\n========\n\nTo build an installation package for your distribution, go to the root\ndirectory of a PGHoard Git checkout and run:\n\nDebian::\n\n  make deb\n\nThis will produce a ``.deb`` package into the parent directory of the Git\ncheckout.\n\nFedora::\n\n  make rpm\n\nThis will produce a ``.rpm`` package usually into ``rpm/RPMS/noarch/``.\n\nPython/Other::\n\n  python setup.py bdist_egg\n\nThis will produce an egg file into a dist directory within the same folder.\n\n\nInstallation\n============\n\nTo install it run as root:\n\nDebian::\n\n  dpkg -i ../pghoard*.deb\n\nFedora::\n\n  dnf install rpm/RPMS/noarch/*\n\nOn Linux systems it is recommended to simply run ``pghoard`` under\n``systemd``::\n\n  systemctl enable pghoard.service\n\nand eventually after the setup section, you can just run::\n\n  systemctl start pghoard.service\n\nPython/Other::\n\n  easy_install dist/pghoard-1.7.0-py3.6.egg\n\nOn systems without ``systemd`` it is recommended that you run ``pghoard``\nunder Supervisor_ or other similar process control system.\n\n.. 
_`Supervisor`: http://supervisord.org\n\n\nSetup\n=====\n\nAfter this you need to create a suitable JSON configuration file for your\ninstallation.\n\n0.  Make sure PostgreSQL is configured to allow WAL archival and retrieval.\n    ``postgresql.conf`` should have ``wal_level`` set to ``archive`` or\n    higher and ``max_wal_senders`` set to at least ``1`` (``archive_command`` mode)\n    or at least ``2`` (``pg_receivexlog`` and ``walreceiver`` modes), for example::\n\n        wal_level = archive\n        max_wal_senders = 4\n\n    Note that changing ``wal_level`` or ``max_wal_senders`` settings requires\n    restarting PostgreSQL.\n\n1. Create a suitable PostgreSQL user account for ``pghoard``::\n\n     CREATE USER pghoard PASSWORD 'putyourpasswordhere' REPLICATION;\n\n2. Edit the local ``pg_hba.conf`` to allow access for the newly created\n   account to the ``replication`` database from the primary and standby\n   nodes. For example::\n\n     # TYPE  DATABASE     USER     ADDRESS       METHOD\n     host    replication  pghoard  127.0.0.1/32  md5\n\n   After editing, please reload the configuration with either::\n\n     SELECT pg_reload_conf();\n\n   or by sending directly a ``SIGHUP`` to the PostgreSQL ``postmaster`` process.\n\n3. Fill in the created user account and primary/standby addresses into the\n   configuration file ``pghoard.json`` to the section ``backup_sites``.\n\n4. Fill in the possible object storage user credentials into the\n   configuration file ``pghoard.json`` under section ``object_storage``\n   in case you wish ``pghoard`` to back up into the cloud.\n\n5. Now copy the same ``pghoard.json`` configuration to the standby\n   node if there are any.\n\nOther possible configuration settings are covered in more detail under the\n`Configuration keys`_ section of this README.\n\n6. 
If all has been set up correctly up to this point, ``pghoard`` should now be\n   ready to be started.\n\n\nBacking up your database\n========================\n\nPostgreSQL backups consist of full database backups, *basebackups*, plus\nwrite ahead logs and related metadata, *WAL*.  Both *basebackups* and *WAL*\nare required to create and restore a consistent database (does not apply\nfor standalone hot backups).\n\nTo enable backups with PGHoard the ``pghoard`` daemon must be running\nlocally.  The daemon will periodically take full basebackups of the database\nfiles to the object store.  Additionally, PGHoard and PostgreSQL must be set\nup correctly to archive the WAL.  There are two ways to do this:\n\nThe default option is to use PostgreSQL's own WAL-archive mechanism with\n``pghoard`` by running the ``pghoard`` daemon locally and entering the\nfollowing configuration keys in ``postgresql.conf``::\n\n    archive_mode = on\n    archive_command = pghoard_postgres_command --mode archive --site default --xlog %f\n\nThis instructs PostgreSQL to call the ``pghoard_postgres_command`` whenever\na new WAL segment is ready.  The command instructs PGHoard to store the\nsegment in its object store.\n\nThe other option is to set up PGHoard to read the WAL stream directly from\nPostgreSQL.  To do this ``archive_mode`` must be disabled in\n``postgresql.conf`` and ``pghoard.json`` must set ``active_backup_mode`` to\n``pg_receivexlog`` in the relevant site, for example::\n\n    {\n        \"backup_sites\": {\n            \"default\": {\n                \"active_backup_mode\": \"pg_receivexlog\",\n                ...\n             },\n         },\n         ...\n     }\n\nNote that as explained in the `Setup`_ section, ``postgresql.conf`` setting\n``wal_level`` must always be set to ``archive``, ``hot_standby`` or\n``logical`` and ``max_wal_senders`` must allow 2 connections from PGHoard,\ni.e. 
it should be set to 2 plus the number of streaming replicas, if any.\n\nWhile ``pghoard`` is running it may be useful to read the JSON state file\n``pghoard_state.json`` that exists where ``json_state_file_path`` points.\nThe JSON state file is human readable and is meant to describe the current\nstate of ``pghoard`` 's backup activities.\n\n\nStandalone Hot Backup Support\n=============================\n\nPghoard has the option to enable standalone hot backups.\n\nTo do this ``archive_mode`` must be disabled in ``postgresql.conf`` and\n``pghoard.json`` must set ``active_backup_mode`` to ``standalone_hot_backup``\nin the relevant site, for example::\n\n\n    {\n        \"backup_sites\": {\n            \"default\": {\n                \"active_backup_mode\": \"standalone_hot_backup\",\n                ...\n             },\n         },\n         ...\n     }\n\n\nFor more information refer to the postgresql documentation\nhttps://www.postgresql.org/docs/9.5/continuous-archiving.html#BACKUP-STANDALONE\n\n\nRestoring databases\n===================\n\nYou can list your database basebackups by running::\n\n  pghoard_restore list-basebackups --config /var/lib/pghoard/pghoard.json\n\n  Basebackup                       Size  Start time            Metadata\n  -------------------------------  ----  --------------------  ------------\n  default/basebackup/2016-04-12_0  8 MB  2016-04-12T07:31:27Z  {'original-file-size': '48060928',\n                                                                'start-wal-segment': '000000010000000000000012',\n                                                                'compression-algorithm': 'snappy'}\n\nIf we'd want to restore to the latest point in time we could fetch the\nrequired basebackup by running::\n\n  pghoard_restore get-basebackup --config /var/lib/pghoard/pghoard.json \\\n      --target-dir /var/lib/pgsql/9.5/data --restore-to-primary\n\n  Basebackup complete.\n  You can start PostgreSQL by running pg_ctl -D foo start\n  On 
systemd based systems you can run systemctl start postgresql\n  On SYSV Init based systems you can run /etc/init.d/postgresql start\n\nNote that the ``target-dir`` needs to be either an empty directory or a non-existent\none, in which case PGHoard will automatically create it.\n\nAfter this we'd proceed to start both the PGHoard server process and the\nPostgreSQL server normally by running (on systemd based systems, assuming\nPostgreSQL 9.5 is used)::\n\n  systemctl start pghoard\n  systemctl start postgresql-9.5\n\nThis will make PostgreSQL start the recovery process to the latest point\nin time. PGHoard must be running before you start up the\nPostgreSQL server. To see other possible restoration options please run::\n\n  pghoard_restore --help\n\n\nCommands\n========\n\nOnce correctly installed, there are six commands available:\n\n``pghoard`` is the main daemon process that should be run under a service\nmanager, such as ``systemd`` or ``supervisord``.  It handles the backup of\nthe configured sites.\n\n``pghoard_restore`` is a command line tool that can be used to restore a\nprevious database backup from either ``pghoard`` itself or from one of the\nsupported object stores.  ``pghoard_restore`` can also configure\n``recovery.conf`` to use ``pghoard_postgres_command`` as the WAL\n``restore_command``.\n\n``pghoard_archive_cleanup`` can be used to clean up any orphan WAL files\nfrom the object store.  After the configured number of basebackups has been\nexceeded (configuration key ``basebackup_count``), ``pghoard`` deletes the\noldest basebackup and all WAL associated with it.  Transient object storage\nfailures and other interruptions can cause the WAL deletion process to leave\norphan WAL files behind; they can be deleted with this tool.\n\n``pghoard_archive_sync`` can be used to see if any local files should\nbe archived but haven't been or if any of the archived files have unexpected\ncontent and need to be archived again. 
Its other use case is to determine\nif there are any gaps in the required files in the WAL archive\nfrom the current WAL file back to the latest basebackup's first WAL file.\n\n``pghoard_create_keys`` can be used to generate and output encryption keys\nin the ``pghoard`` configuration format.\n\n``pghoard_postgres_command`` is a command line tool that can be used as\nPostgreSQL's ``archive_command`` or ``restore_command``.  It communicates with\n``pghoard`` 's locally running webserver to let it know there's a new file that\nneeds to be compressed, encrypted and stored in an object store (in archive\nmode) or its inverse (in restore mode).\n\n\nConfiguration keys\n==================\n\n``active`` (default ``true``)\n\nCan be set on a per ``backup_site`` level to ``false`` to disable the taking\nof new backups and to stop the deletion of old ones.\n\n``active_backup_mode`` (default ``pg_receivexlog``)\n\nCan be either ``pg_receivexlog`` or ``archive_command``. If set to\n``pg_receivexlog``, ``pghoard`` will start up a ``pg_receivexlog`` process to be\nrun against the database server.  If ``archive_command`` is set, we rely on the\nuser setting the correct ``archive_command`` in\n``postgresql.conf``. You can also set this to the experimental ``walreceiver`` mode\nwhereby pghoard will start communicating directly with PostgreSQL\nthrough the replication protocol. (Note: this requires an unreleased version\nof the psycopg2 library.)\n\n``alert_file_dir`` (default ``backup_location`` if set else ``os.getcwd()``)\n\nDirectory in which alert files for replication warning and failover are\ncreated.\n\n``backup_location`` (no default)\n\nPlace where ``pghoard`` will create its internal data structures for local state\ndata and the actual backups (if no object storage is used).\n\n``backup_sites`` (default ``{}``)\n\nThis object contains names and configurations for the different PostgreSQL\nclusters (here called ``sites``) from which to take backups.  
The\nconfiguration keys for sites are listed below.\n\n* ``compression`` WAL/basebackup compression parameters\n\n * ``algorithm`` default ``\"snappy\"`` if available, otherwise ``\"lzma\"`` or ``\"zstd\"``\n * ``level`` default ``\"0\"`` compression level for ``\"lzma\"`` or ``\"zstd\"`` compression\n * ``thread_count`` (default max(cpu_count, ``5``)) number of parallel compression threads\n\n``hash_algorithm`` (default ``\"sha1\"``)\n\nThe hash algorithm used for calculating checksums for WAL or other files. Must\nbe one of the algorithms supported by Python's hashlib.\n\n``http_address`` (default ``\"127.0.0.1\"``)\n\nAddress to bind the PGHoard HTTP server to.  Set to an empty string to\nlisten to all available IPv4 addresses.   Set it to the IPv6 ``::`` wildcard\naddress to bind to all available IPv4 and IPv6 addresses.\n\n``http_port`` (default ``16000``)\n\nHTTP webserver port. Used for the archive command and for fetching of\nbasebackups/WAL's when restoring if not using an object store.\n\n``json_state_file_path`` (default ``\"/var/lib/pghoard/pghoard_state.json\"``)\n\nLocation of a JSON state file which describes the state of the ``pghoard``\nprocess.\n\n``log_level`` (default ``\"INFO\"``)\n\nDetermines log level of ``pghoard``.\n\n``maintenance_mode_file`` (default ``\"/var/lib/pghoard/maintenance_mode_file\"``)\n\nIf a file exists in this location, no new backup actions will be started.\n\n``pg_receivexlog``\n\nWhen active backup mode is set to ``\"pg_receivexlog\"`` this object may\noptionally specify additional configuration options. The currently available\noptions are all related to monitoring disk space availability and optionally\npausing xlog/WAL receiving when disk space goes below configured threshold.\nThis is useful when PGHoard is configured to create its temporary files on\na different volume than where the main PostgreSQL data directory resides. 
By\ndefault this logic is disabled and the minimum free bytes must be configured\nto enable it.\n\n``pg_receivexlog.disk_space_check_interval`` (default ``10``)\n\nHow often to check available disk space.\n\n``pg_receivexlog.min_disk_free_bytes`` (default undefined)\n\nMinimum bytes (as an integer) that must be available in order to keep on\nreceiving xlogs/WAL from PostgreSQL. If available disk space goes below this\nlimit a ``STOP`` signal is sent to the ``pg_receivexlog`` / ``pg_receivewal``\napplication.\n\n``pg_receivexlog.resume_multiplier`` (default ``1.5``)\n\nNumber of times the ``min_disk_free_bytes`` bytes of disk space that is\nrequired to start receiving xlog/WAL again (i.e. send the ``CONT`` signal to\nthe ``pg_receivexlog`` / ``pg_receivewal`` process). Multiplier above 1\nshould be used to avoid stopping and continuing the process constantly.\n\n``restore_prefetch`` (default ``transfer.thread_count``)\n\nNumber of files to prefetch when performing archive recovery.  The default\nis the number of Transfer Agent threads to try to utilize them all.\n\n``statsd`` (default: disabled)\n\nEnables metrics sending to a statsd daemon that supports Telegraf\nor DataDog syntax with tags.\n\nThe value is a JSON object::\n\n  {\n      \"host\": \"\u003cstatsd address\u003e\",\n      \"port\": \u003cstatsd port\u003e,\n      \"format\": \"\u003cstatsd message format\u003e\",\n      \"tags\": {\n          \"\u003ctag\u003e\": \"\u003cvalue\u003e\"\n      }\n  }\n\n``format`` (default: ``\"telegraf\"``)\n\nDetermines statsd message format. Following formats are supported:\n\n* ``telegraf`` `Telegraf spec`_\n\n.. _`Telegraf spec`: https://github.com/influxdata/telegraf/tree/master/plugins/inputs/statsd\n\n* ``datadog`` `DataDog spec`_\n\n.. 
_`DataDog spec`: http://docs.datadoghq.com/guides/dogstatsd/#datagram-format\n\nThe ``tags`` setting can be used to enter optional tag values for the metrics.\n\n``pushgateway`` (default: disabled)\n\nEnables metrics sending to a Prometheus Pushgateway with tags.\n\nThe value is a JSON object::\n\n  {\n      \"endpoint\": \"\u003cpushgateway address\u003e\",\n      \"tags\": {\n          \"\u003ctag\u003e\": \"\u003cvalue\u003e\"\n      }\n  }\n\nThe ``tags`` setting can be used to enter optional tag values for the metrics.\n\n``prometheus`` (default: disabled)\n\nExpose metrics through a Prometheus endpoint.\n\nThe value is a JSON object::\n\n  {\n      \"tags\": {\n          \"\u003ctag\u003e\": \"\u003cvalue\u003e\"\n      }\n  }\n\nThe ``tags`` setting can be used to enter optional tag values for the metrics.\n\n``syslog`` (default ``false``)\n\nDetermines whether syslog logging should be turned on or not.\n\n``syslog_address`` (default ``\"/dev/log\"``)\n\nDetermines syslog address to use in logging (requires syslog to be true as\nwell)\n\n``syslog_facility`` (default ``\"local2\"``)\n\nDetermines syslog log facility. (requires syslog to be true as well)\n\n* ``transfer`` WAL/basebackup transfer parameters\n\n * ``thread_count`` (default max(cpu_count, ``5``)) number of parallel uploads/downloads\n\n``upload_retries_warning_limit`` (default ``3``)\n\nAfter this many failed upload attempts for a single file, create an\nalert file.\n\n``tar_executable`` (default ``\"pghoard_gnutaremu\"``)\n\nThe tar command to use for restoring basebackups. This must be GNU tar because some\nadvanced switches like ``--transform`` are needed. If this value is not defined (or\nis explicitly set to ``\"pghoard_gnutaremu\"``), Python's internal tarfile\nimplementation is used. 
The Python implementation is somewhat slower than the\nactual tar command and in environments with fast disk IO (compared to available CPU\ncapacity) it is recommended to set this to ``\"tar\"``.\n\nBackup site configuration\n=========================\n\nThe following options control the behavior of each backup site.  A backup\nsite means an individual PostgreSQL installation (\"cluster\" in PostgreSQL\nterminology) from which to take backups.\n\n``basebackup_age_days_max`` (default undefined)\n\nMaximum age for basebackups. Basebackups older than this will be removed. By\ndefault this value is not defined and basebackups are deleted based on total\ncount instead.\n\n``basebackup_chunks_in_progress`` (default ``5``)\n\nHow many basebackup chunks there can be on disk simultaneously while\nthe basebackup is being taken. For chunk size configuration see ``basebackup_chunk_size``.\n\n``basebackup_chunk_size`` (default ``2147483648``)\n\nThe size of the chunks in which a ``local-tar`` basebackup is taken. Disk\nspace needed for a successful backup is this variable multiplied by\n``basebackup_chunks_in_progress``.\n\n``basebackup_compression_threads`` (default ``0``)\n\nNumber of threads to use within the compression library during basebackup. Only\napplicable when using a compression library that supports internal multithreading,\nnamely zstd at the moment. Default value 0 means not to use multithreading.\n\n``basebackup_count`` (default ``2``)\n\nHow many basebackups should be kept around for restoration purposes.  The\nmore there are the more disk space will be used. If ``basebackup_age_days_max`` is\ndefined this controls the maximum number of basebackups to keep; if backup\ninterval is less than 24 hours or extra backups are created there can be more\nthan one basebackup per day and it is often desirable to set\n``basebackup_count`` to something slightly higher than the max age in days.\n\n``basebackup_count_min`` (default ``2``)\n\nMinimum number of basebackups to keep. 
This is only effective when\n``basebackup_age_days_max`` has been defined. If for example the server is\npowered off and then back on a month later, all existing backups would be very\nold. However, in that case it is usually not desirable to immediately delete\nall old backups. This setting allows specifying a minimum number of backups\nthat should always be preserved regardless of their age.\n\n``basebackup_hour`` (default undefined)\n\nThe hour of day during which to start a new basebackup. If the backup interval is\nless than 24 hours this is the base hour used to calculate the hours at which\nbackups should be taken. E.g. if the backup interval is 6 hours and this value is\nset to 1, backups will be taken at hours 1, 7, 13 and 19. This value is only\neffective if ``basebackup_interval_hours`` and ``basebackup_minute`` are\nalso set.\n\n``basebackup_interval_hours`` (default ``24``)\n\nHow often to take a new basebackup of a cluster.  The shorter the interval,\nthe faster your recovery will be, but the more CPU/IO usage is required from\nthe servers it takes the basebackup from.  If set to a null value basebackups\nare not automatically taken at all.\n\n``basebackup_minute`` (default undefined)\n\nThe minute of hour during which to start a new basebackup. This value is only\neffective if ``basebackup_interval_hours`` and ``basebackup_hour`` are\nalso set.\n\n``basebackup_mode`` (default ``\"basic\"``)\n\nThe way basebackups should be created.  The default mode, ``basic``, runs\n``pg_basebackup`` and waits for it to write an uncompressed tar file on the\ndisk before compressing and optionally encrypting it.  
The alternative mode\n``pipe`` pipes the data directly from ``pg_basebackup`` to PGHoard's\ncompression and encryption processing, reducing the amount of temporary disk\nspace that's required.\n\nNeither the ``basic`` nor the ``pipe`` mode supports multiple tablespaces.\n\nSetting ``basebackup_mode`` to ``local-tar`` avoids using ``pg_basebackup``\nentirely when ``pghoard`` is running on the same host as the database.\nPGHoard reads the files directly from ``$PGDATA`` in this mode and\ncompresses and optionally encrypts them.  This mode allows backing up user\ntablespaces.\n\nWhen using ``delta`` mode, only changed files are uploaded into the storage.\nOn every backup a snapshot of the data files is taken, which results in a manifest file\ndescribing the hashes of all the files that need to be backed up.\nNew hashes are uploaded to the storage and used together with the complementary\nmanifest from the control file for restoration.\nIn order to properly assess the efficiency of ``delta`` mode in comparison with\n``local-tar``, one can use ``local-tar-delta-stats`` mode, which behaves the same as\n``local-tar``, but also collects the metrics as if it were ``delta`` mode. This can help\nwhen deciding whether to switch to ``delta`` mode.\n\n``basebackup_threads`` (default ``1``)\n\nHow many threads to use for tar, compress and encrypt tasks. Only applies for\n``local-tar`` basebackup mode. Only values 1 and 2 are likely to be sensible for\nthis; with higher thread counts the speed improvement is negligible and CPU time is\nlost switching between threads.\n\n``encryption_key_id`` (no default)\n\nSpecifies the encryption key used when storing encrypted backups. If this\nconfiguration directive is specified, you must also define the public key\nfor storing as well as the private key for retrieving stored backups. These\nkeys are specified with the ``encryption_keys`` dictionary.\n\n``encryption_keys`` (no default)\n\nThis key is a mapping from key id to keys. 
Keys in turn are mapping from\n``public`` and ``private`` to PEM encoded RSA public and private keys\nrespectively. Public key needs to be specified for storing backups. Private\nkey needs to be in place for restoring encrypted backups.\n\nYou can use ``pghoard_create_keys`` to generate and output encryption keys\nin the ``pghoard`` configuration format.\n\n``object_storage`` (no default)\n\nConfigured in ``backup_sites`` under a specific site.  If set, it must be an\nobject describing a remote object storage.  The object must contain a key\n``storage_type`` describing the type of the store, other keys and values are\nspecific to the storage type.\n\n``proxy_info`` (no default)\n\nDictionary specifying proxy information. The dictionary must contain keys ``type``,\n``host`` and ``port``. Type can be either ``socks5`` or ``http``.  Optionally,\n``user`` and ``pass`` can be specified for proxy authentication.  Supported by\nAzure, Google and S3 drivers.\n\nThe following object storage types are supported:\n\n* ``local`` makes backups to a local directory, see ``pghoard-local-minimal.json``\n  for example. Required keys:\n\n * ``directory`` for the path to the backup target (local) storage directory\n\n* ``sftp`` makes backups to a sftp server, required keys:\n\n * ``server``\n * ``port``\n * ``username``\n * ``password`` or ``private_key``\n\n* ``google`` for Google Cloud Storage, required configuration keys:\n\n * ``project_id`` containing the Google Storage project identifier\n * ``bucket_name`` bucket where you want to store the files\n * ``credential_file`` for the path to the Google JSON credential file\n\n* ``s3`` for Amazon Web Services S3, required configuration keys:\n\n * ``aws_access_key_id`` for the AWS access key id\n * ``aws_secret_access_key`` for the AWS secret access key\n * ``region`` S3 region of the bucket\n * ``bucket_name`` name of the S3 bucket\n\nOptional keys for Amazon Web Services S3:\n\n * ``encrypted`` if True, use server-side encryption. 
Default is False.\n\n* ``s3`` for other S3 compatible services such as Ceph, required\n  configuration keys:\n\n * ``aws_access_key_id`` for the AWS access key id\n * ``aws_secret_access_key`` for the AWS secret access key\n * ``bucket_name`` name of the S3 bucket\n * ``host`` for overriding host for non AWS-S3 implementations\n * ``port`` for overriding port for non AWS-S3 implementations\n * ``is_secure`` for overriding the requirement for https for non AWS-S3\n * ``is_verify_tls`` for configuring tls verify for non AWS-S3\n   implementations\n\n* ``azure`` for Microsoft Azure Storage, required configuration keys:\n\n * ``account_name`` for the name of the Azure Storage account\n * ``account_key`` for the secret key of the Azure Storage account\n * ``bucket_name`` for the name of Azure Storage container used to store\n   objects\n * ``azure_cloud`` Azure cloud selector, ``\"public\"`` (default) or ``\"germany\"``\n\n* ``swift`` for OpenStack Swift, required configuration keys:\n\n * ``user`` for the Swift user ('subuser' in Ceph RadosGW)\n * ``key`` for the Swift secret_key\n * ``auth_url`` for Swift authentication URL\n * ``container_name`` name of the data container\n\n * Optional configuration keys for Swift:\n\n  * ``auth_version`` - ``2.0`` (default) or ``3.0`` for keystone, use ``1.0`` with\n    Ceph Rados GW.\n  * ``segment_size`` - defaults to ``1024**3`` (1 gigabyte).  Objects larger\n    than this will be split into multiple segments on upload.  
Many Swift\n    installations require large files (usually those over 5 gigabytes) to be segmented.\n  * ``tenant_name``\n  * ``region_name``\n  * ``user_id`` - for auth_version 3.0\n  * ``user_domain_id`` - for auth_version 3.0\n  * ``user_domain_name`` - for auth_version 3.0\n  * ``tenant_id`` - for auth_version 3.0\n  * ``project_id`` - for auth_version 3.0\n  * ``project_name`` - for auth_version 3.0\n  * ``project_domain_id`` - for auth_version 3.0\n  * ``project_domain_name`` - for auth_version 3.0\n  * ``service_type`` - for auth_version 3.0\n  * ``endpoint_type`` - for auth_version 3.0\n\n``nodes`` (no default)\n\nArray of one or more nodes from which the backups are taken.  A node can be\ndescribed as an object of libpq key: value connection info pairs, a libpq\nconnection string, or a ``postgres://`` connection URI.  For example, to use\na streaming replication slot, use the syntax {... \"slot\": \"slotname\"}.\n\n``pg_bin_directory`` (default: find binaries from well-known directories)\n\nSite-specific option for finding ``pg_basebackup`` and ``pg_receivexlog``\ncommands matching the given backup site's PostgreSQL version.  If a value is\nnot supplied, PGHoard will attempt to find matching binaries from various\nwell-known locations.  If ``pg_data_directory`` is set and points to a\nvalid data directory, the lookup is restricted to the version contained in\nthe given data directory.\n\n``pg_data_directory`` (no default)\n\nUsed when ``basebackup_mode`` is set to ``local-tar``.  The data\ndirectory must point to PostgreSQL's ``$PGDATA`` and must be readable by the\n``pghoard`` daemon.\n\n``prefix`` (default: site name)\n\nPath prefix to use for all backups related to this site.  Defaults to the\nname of the site.\n\n\nAlert files\n===========\n\nAlert files are created whenever an error condition occurs that requires human\nintervention to resolve.  
We recommend adding checks for the existence\nof these files to your alerting system.\n\n``authentication_error``\n\nThere has been a problem in the authentication of at least one of the\nPostgreSQL connections.  This usually denotes a wrong username and/or\npassword.\n\n``configuration_error``\n\nThere has been a problem in the configuration of at least one of the\nPostgreSQL connections.  This usually denotes a missing ``pg_hba.conf`` entry or\nincompatible settings in ``postgresql.conf``.\n\n``upload_retries_warning``\n\nUpload of a file has failed more times than\n``upload_retries_warning_limit``. Needs human intervention to figure\nout why and to delete the alert once the situation has been fixed.\n\n``version_mismatch_error``\n\nYour local PostgreSQL client versions of ``pg_basebackup`` or\n``pg_receivexlog`` do not match the server's PostgreSQL version.  You\nneed to update them to be on the same version level.\n\n``version_unsupported_error``\n\nServer PostgreSQL version is not supported.\n\n\nLicense\n=======\n\nPGHoard is licensed under the Apache License, Version 2.0. Full license text\nis available in the ``LICENSE`` file and at\nhttp://www.apache.org/licenses/LICENSE-2.0.txt\n\n\nCredits\n=======\n\nPGHoard was created by Hannu Valtonen \u003channu.valtonen@aiven.io\u003e for\n`Aiven`_ and is now maintained by Aiven developers \u003copensource@aiven.io\u003e.\n\n.. _`Aiven`: https://aiven.io/\n\nRecent contributors are listed on the GitHub project page,\nhttps://github.com/aiven/pghoard/graphs/contributors\n\n\nContact\n=======\n\nBug reports and patches are very welcome; please post them as GitHub issues\nand pull requests at https://github.com/aiven/pghoard .  
Any possible\nvulnerabilities or other serious issues should be reported directly to the\nmaintainers \u003copensource@aiven.io\u003e.\n\n\nTrademarks\n==========\n\nPostgres, PostgreSQL and the Slonik Logo are trademarks or registered trademarks of the PostgreSQL Community Association of Canada, and used with their permission.\n\nTelegraf, Vagrant and Datadog are trademarks and property of their respective owners. All product and service names used in this website are for identification purposes only and do not imply endorsement.\n\n\nCopyright\n=========\n\nCopyright (C) 2015 Aiven Ltd\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Faiven%2Fpghoard","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Faiven%2Fpghoard","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Faiven%2Fpghoard/lists"}