{"id":19458472,"url":"https://github.com/postgrespro/ptrack","last_synced_at":"2025-09-26T15:41:14.986Z","repository":{"id":40366361,"uuid":"233435914","full_name":"postgrespro/ptrack","owner":"postgrespro","description":"Block-level incremental backup engine for PostgreSQL","archived":false,"fork":false,"pushed_at":"2024-10-04T13:32:16.000Z","size":197,"stargazers_count":50,"open_issues_count":8,"forks_count":17,"subscribers_count":27,"default_branch":"master","last_synced_at":"2025-04-05T22:03:32.240Z","etag":null,"topics":["backups","c","incremental-backups","postgres","postgresql","postgresql-extension"],"latest_commit_sha":null,"homepage":"","language":"C","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"other","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/postgrespro.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":"AUTHORS.md","dei":null,"publiccode":null,"codemeta":null}},"created_at":"2020-01-12T18:07:22.000Z","updated_at":"2025-01-03T21:42:55.000Z","dependencies_parsed_at":"2023-02-18T09:20:37.810Z","dependency_job_id":"9e1a9b5b-5c5e-4d32-9671-f691d9450648","html_url":"https://github.com/postgrespro/ptrack","commit_stats":null,"previous_names":[],"tags_count":10,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/postgrespro%2Fptrack","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/postgrespro%2Fptrack/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/postgrespro%2Fptrack/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/postgrespro%2Fptrack/manifests","owner_url":"https://repos.ecos
yste.ms/api/v1/hosts/GitHub/owners/postgrespro","download_url":"https://codeload.github.com/postgrespro/ptrack/tar.gz/refs/heads/master","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":248036063,"owners_count":21037092,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["backups","c","incremental-backups","postgres","postgresql","postgresql-extension"],"created_at":"2024-11-10T17:27:14.572Z","updated_at":"2025-09-26T15:41:14.888Z","avatar_url":"https://github.com/postgrespro.png","language":"C","funding_links":[],"categories":[],"sub_categories":[],"readme":"[![Test](https://github.com/postgrespro/ptrack/actions/workflows/test.yml/badge.svg)](https://github.com/postgrespro/ptrack/actions/workflows/test.yml)\n[![Codecov](https://codecov.io/gh/postgrespro/ptrack/branch/master/graph/badge.svg)](https://codecov.io/gh/postgrespro/ptrack)\n[![GitHub release](https://img.shields.io/github/v/release/postgrespro/ptrack?include_prereleases)](https://github.com/postgrespro/ptrack/releases/latest)\n\n# ptrack\n\n## Overview\n\nPtrack is a block-level incremental backup engine for PostgreSQL. You can [effectively use](https://postgrespro.github.io/pg_probackup/#pbk-setting-up-ptrack-backups) `ptrack` engine for taking incremental backups with [pg_probackup](https://github.com/postgrespro/pg_probackup) backup and recovery manager for PostgreSQL.\n\nIt is designed to allow false positives (i.e. block/page is marked in the `ptrack` map, but actually has not been changed), but to never allow false negatives (i.e. 
losing any `PGDATA` changes, except for hint bits).\n\nCurrently, the `ptrack` codebase is split between a small PostgreSQL core patch and an extension. All public SQL API methods and the main engine are placed in the `ptrack` extension, while the core patch contains only certain hooks and modifies binary utilities to ignore `ptrack.map.*` files.\n\nThis extension is compatible with PostgreSQL [11](patches/REL_11_STABLE-ptrack-core.diff), [12](patches/REL_12_STABLE-ptrack-core.diff), [13](patches/REL_13_STABLE-ptrack-core.diff), [14](patches/REL_14_STABLE-ptrack-core.diff), [15](patches/REL_15_STABLE-ptrack-core.diff).\n\n## Installation\n\n1) Specify the PostgreSQL branch to work with:\n\n```shell\nexport PG_BRANCH=REL_15_STABLE\n```\n\n2) Get the latest PostgreSQL sources:\n\n```shell\ngit clone https://github.com/postgres/postgres.git -b $PG_BRANCH\n```\n\n3) Get the latest `ptrack` sources:\n\n```shell\ngit clone https://github.com/postgrespro/ptrack.git postgres/contrib/ptrack\n```\n\n4) Change to the `ptrack` directory:\n\n```shell\ncd postgres/contrib/ptrack\n```\n\n5) Apply the PostgreSQL core patch:\n\n```shell\nmake patch\n```\n\n6) Compile and install PostgreSQL:\n\n```shell\nmake install-postgres prefix=$PWD/pgsql  # or some other prefix of your choice\n```\n\n7) Add the newly created binaries to the PATH:\n\n```shell\nexport PATH=$PWD/pgsql/bin:$PATH\n```\n\n8) Compile and install `ptrack`:\n\n```shell\nmake install USE_PGXS=1\n```\n\n9) Load `ptrack` via `shared_preload_libraries` and set `ptrack.map_size` (in MB):\n\n```shell\necho \"shared_preload_libraries = 'ptrack'\" \u003e\u003e \u003cDATA_DIR\u003e/postgresql.conf\necho \"ptrack.map_size = 64\" \u003e\u003e \u003cDATA_DIR\u003e/postgresql.conf\n```\n\n10) Run PostgreSQL and create the `ptrack` extension:\n\n```sql\npostgres=# CREATE EXTENSION ptrack;\n```\n\n## Configuration\n\nThe only configurable option is `ptrack.map_size` (in MB). The default is `0`, which means `ptrack` is turned off. 
To reduce the number of false positives, it is recommended to set `ptrack.map_size` to `1 / 1000` of the expected `PGDATA` size (e.g. `1000` for a 1 TB database).\n\nTo disable `ptrack` and clean up all remaining service files, set `ptrack.map_size` to `0`.\n\n## Public SQL API\n\n * ptrack_version() — returns ptrack version string.\n * ptrack_init_lsn() — returns LSN of the last ptrack map initialization.\n * ptrack_get_pagemapset(start_lsn pg_lsn) — returns a set of changed data files with a number of changed blocks and their bitmaps since specified `start_lsn`.\n * ptrack_get_change_stat(start_lsn pg_lsn) — returns statistics of changes (number of files, pages and size in MB) since specified `start_lsn`.\n\nUsage example:\n\n```sql\npostgres=# SELECT ptrack_version();\n ptrack_version \n----------------\n 2.4\n(1 row)\n\npostgres=# SELECT ptrack_init_lsn();\n ptrack_init_lsn \n-----------------\n 0/1814408\n(1 row)\n\npostgres=# SELECT * FROM ptrack_get_pagemapset('0/185C8C0');\n        path         | pagecount |                pagemap                 \n---------------------+-----------+----------------------------------------\n base/16384/1255     |         3 | \\x001000000005000000000000\n base/16384/2674     |         3 | \\x0000000900010000000000000000\n base/16384/2691     |         1 | \\x00004000000000000000000000\n base/16384/2608     |         1 | \\x000000000000000400000000000000000000\n base/16384/2690     |         1 | \\x000400000000000000000000\n(5 rows)\n\npostgres=# SELECT * FROM ptrack_get_change_stat('0/285C8C8');\n files | pages |        size, MB        \n-------+-------+------------------------\n    20 |    25 | 0.19531250000000000000\n(1 row)\n```\n\n## Upgrading\n\nUsually, you only have to install the new version of `ptrack` and run `ALTER EXTENSION ptrack UPDATE;`. 
However, some specific actions may be required as well:\n\n#### Upgrading from 2.0.0 to 2.1.*:\n\n* Put `shared_preload_libraries = 'ptrack'` into `postgresql.conf`.\n* Rename `ptrack_map_size` to `ptrack.map_size`.\n* Do `ALTER EXTENSION ptrack UPDATE;`.\n* Restart your server.\n\n#### Upgrading from 2.1.* to 2.2.*:\n\nSince version 2.2 we use a different algorithm for tracking changed pages. Thus, data recorded in the `ptrack.map` by pre-2.2 versions of `ptrack` is incompatible with newer versions. After the extension upgrade and server restart, the old `ptrack.map` will be discarded with a `WARNING` and initialized from scratch.\n\n#### Upgrading from 2.2.* to 2.3.*:\n\n* Stop your server\n* Update ptrack binaries\n* Remove global/ptrack.map.mmap if it exists in the server data directory\n* Start server\n* Do `ALTER EXTENSION ptrack UPDATE;`.\n\n#### Upgrading from 2.3.* to 2.4.*:\n\n* Stop your server\n* Update ptrack binaries\n* Start server\n* Do `ALTER EXTENSION ptrack UPDATE;`.\n\n## Limitations\n\n1. You can only use `ptrack` safely with `wal_level \u003e= 'replica'`. Otherwise, you can lose tracking of some changes if crash recovery occurs, since [certain commands are designed not to write WAL at all if wal_level is minimal](https://www.postgresql.org/docs/12/populate.html#POPULATE-PITR), but we only durably flush the `ptrack` map at checkpoint time.\n\n2. The only production-ready backup utility that fully supports `ptrack` is [pg_probackup](https://github.com/postgrespro/pg_probackup).\n\n3. You cannot resize the `ptrack` map at runtime, only on postmaster start. Resizing also discards all tracked changes, so it is recommended to do it in a maintenance window and follow it with a full backup.\n\n4. You will need up to `ptrack.map_size * 2` of additional disk space, since `ptrack` uses an additional temporary file for durability purposes. 
See [Architecture section](#Architecture) for details.\n\n## Benchmarks\n\nBriefly, the overhead of using `ptrack` on TPS usually does not exceed a couple of percent (~1-3%) for a database of dozens to hundreds of gigabytes in size, while the backup time scales down linearly with backup size with a coefficient ~1. It means that an incremental `ptrack` backup of a database with only 20% of changed pages will be 5 times faster than a full backup. More details [here](benchmarks).\n\n## Architecture\n\nWe use a single shared hash table in `ptrack`. Due to the fixed size of the map there may be false positives (when some block is marked as changed without being actually modified), but no false negatives. However, these false positives may be completely eliminated by setting a high enough `ptrack.map_size`.\n\nAll reads/writes are made using atomic operations on `uint64` entries, so the map is completely lockless during normal PostgreSQL operation. Because we do not use locks for read/write access, `ptrack` keeps the map (`ptrack.map`) from the last checkpoint intact and uses up to one additional temporary file:\n\n* temporary file `ptrack.map.tmp` to durably replace `ptrack.map` during checkpoint.\n\nThe map is written to disk atomically, block by block, at the end of each checkpoint, together with a CRC32 checksum that is verified the next time the whole map is re-read after crash recovery or restart.\n\nTo gather the whole changeset of modified blocks in `ptrack_get_pagemapset()` we walk the entire `PGDATA` (`base/**/*`, `global/*`, `pg_tblspc/**/*`) and use the map to check whether each block of each relation was modified since the specified LSN.\n\n## Contribution\n\nFeel free to [send a pull request](https://github.com/postgrespro/ptrack/compare), [create an issue](https://github.com/postgrespro/ptrack/issues/new) or [reach us by e-mail](mailto:team-wd40@lists.postgrespro.ru??subject=[GitHub]%20Ptrack) if you are interested in `ptrack`.\n\n## Tests\n\nAll changes of 
the source code in this repository are checked by CI - see commit statuses and the project status badge. You can also run tests locally by executing a few Makefile targets.\n\n### Prerequisites\n\nTo run Python tests install the following packages:\n\nOS packages:\n  - python3-pip\n  - python3-six\n  - python3-pytest\n  - python3-pytest-xdist\n\nPIP packages:\n  - testgres\n\nFor example, for Ubuntu:\n\n```shell\nsudo apt update\nsudo apt install python3-pip python3-six python3-pytest python3-pytest-xdist\nsudo pip3 install testgres\n```\n\n### Testing\n\nInstall PostgreSQL and ptrack as described in [Installation](#installation), install the testing prerequisites, then do (assuming the current directory is `ptrack`):\n```shell\ngit clone https://github.com/postgrespro/pg_probackup.git ../pg_probackup  # clone the repository into postgres/contrib/pg_probackup\n# remember to export PATH=/path/to/pgsql/bin:$PATH\nmake install-pg-probackup USE_PGXS=1 top_srcdir=../..\nmake test-tap USE_PGXS=1\nmake test-python\n```\n\nIf `pg_probackup` is not located in `postgres/contrib` then additionally specify the path to the `pg_probackup` directory when building `pg_probackup`:\n```shell\nmake install-pg-probackup USE_PGXS=1 top_srcdir=/path/to/postgres pg_probackup_dir=/path/to/pg_probackup\n```\n\nYou can use a public Docker image which already has the necessary build environment (but not the testing prerequisites):\n\n```shell\ndocker run  -e USER_ID=`id -u` -it -v $PWD:/work --name=ptrack ghcr.io/postgres-dev/ubuntu-22.04:1.0\ndev@a033797d2f73:~$ \n```\n\n## Environment variables\n\n| Variable  | Possible values | Required | Default value  | Description |\n| -         | -               | -        | -              | -           |\n| NPROC     | An integer greater than 0 | No | Output of `nproc` | The number of threads used for building and running tests |\n| PG_CONFIG | File path | No | pg_config (from the PATH) | The path to the `pg_config` binary |\n| TESTS     | A Pytest 
filter expression | No | Not set (run all Python tests) | A filter to include only selected tests in the run. See the Pytest `-k` option for more information. This variable is only applicable to `test-python` for the tests located in [tests](https://github.com/postgrespro/pg_probackup/tree/master/tests). |\n| TEST_MODE | normal, legacy, paranoia | No | normal | The \"legacy\" mode runs tests in an environment similar to a 32-bit Windows system. This mode is only applicable to `test-tap`. The \"paranoia\" mode compares the checksums of each block of the database catalog (PGDATA) contents before making a backup and after the restoration. This mode is only applicable to `test-python`.|\n\n### TODO\n\n* Should we introduce `ptrack.map_path` to allow `ptrack` service files storage outside of `PGDATA`? Doing so would avoid patching PostgreSQL binary utilities to ignore `ptrack.map.*` files.\n* Can we resize the `ptrack` map on restart but keep the previously tracked changes?\n* Can we write a formal proof that we never lose any modified page with `ptrack`? With TLA+?\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fpostgrespro%2Fptrack","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fpostgrespro%2Fptrack","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fpostgrespro%2Fptrack/lists"}