{"id":19458468,"url":"https://github.com/postgrespro/pg_query_state","last_synced_at":"2025-04-04T10:06:49.561Z","repository":{"id":45543712,"uuid":"66460640","full_name":"postgrespro/pg_query_state","owner":"postgrespro","description":"Tool for query progress monitoring in PostgreSQL","archived":false,"fork":false,"pushed_at":"2025-02-25T10:12:01.000Z","size":235,"stargazers_count":157,"open_issues_count":4,"forks_count":26,"subscribers_count":31,"default_branch":"master","last_synced_at":"2025-03-28T09:06:24.922Z","etag":null,"topics":["postgresql","query-monitoring"],"latest_commit_sha":null,"homepage":"","language":"C","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"other","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/postgrespro.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null}},"created_at":"2016-08-24T12:06:12.000Z","updated_at":"2025-02-20T07:50:25.000Z","dependencies_parsed_at":"2024-03-14T18:47:27.778Z","dependency_job_id":"829b328a-086a-4046-b8d6-6f3faaf0735f","html_url":"https://github.com/postgrespro/pg_query_state","commit_stats":null,"previous_names":[],"tags_count":2,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/postgrespro%2Fpg_query_state","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/postgrespro%2Fpg_query_state/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/postgrespro%2Fpg_query_state/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/postgrespro%2Fpg_query_state/manifests","owner_url":"https://repos.ecosyste.ms/
api/v1/hosts/GitHub/owners/postgrespro","download_url":"https://codeload.github.com/postgrespro/pg_query_state/tar.gz/refs/heads/master","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":247157107,"owners_count":20893214,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["postgresql","query-monitoring"],"created_at":"2024-11-10T17:27:13.809Z","updated_at":"2025-04-04T10:06:49.537Z","avatar_url":"https://github.com/postgrespro.png","language":"C","funding_links":[],"categories":[],"sub_categories":[],"readme":"[![Build Status](https://travis-ci.com/postgrespro/pg_query_state.svg?branch=master)](https://travis-ci.com/postgrespro/pg_query_state)\n[![codecov](https://codecov.io/gh/postgrespro/pg_query_state/branch/master/graph/badge.svg)](https://codecov.io/gh/postgrespro/pg_query_state)\n\n# pg\\_query\\_state\nThe `pg_query_state` module provides facility to know the current state of query execution on working backend. To enable this extension you have to patch the stable version of PostgreSQL, recompile it and deploy new binaries. All patch files are located in `patches/` directory and tagged with suffix of PostgreSQL version number.\n\n## Overview\nEach nonutility query statement (SELECT/INSERT/UPDATE/DELETE) after optimization/planning stage is translated into plan tree which is kind of imperative representation of SQL query execution algorithm. EXPLAIN ANALYZE request allows to demonstrate execution statistics gathered from each node of plan tree (full time of execution, number rows emitted to upper nodes, etc). 
But these statistics are collected only after the query has finished executing. This module shows the actual statistics of a running query, gathered from an external backend. The format of the resulting output is almost identical to ordinary EXPLAIN ANALYZE, so users can track query execution in progress.\n\nIn fact, this module can inspect an external backend and determine its actual state. This is particularly helpful when a backend executes a heavy query and gets stuck.\n\n## Use cases\nThis module can help with the following:\n - detect a long query (along with other monitoring tools)\n - monitor query execution\n\n## Installation\nTo install `pg_query_state`, apply the corresponding patches `custom_signals_(PG_VERSION).patch` and `runtime_explain_(PG_VERSION).patch` (or `runtime_explain.patch` for PG version \u003c= 10.0) from the `patches/` directory to the required stable version of PostgreSQL, then rebuild PostgreSQL.\n\nTo do this, run the following commands from the postgresql directory:\n```\npatch -p1 \u003c path_to_pg_query_state_folder/patches/runtime_explain_(PG_VERSION).patch\npatch -p1 \u003c path_to_pg_query_state_folder/patches/custom_signals_(PG_VERSION).patch\n```\n\nThen execute this in the module's directory:\n```\nmake install USE_PGXS=1\n```\nTo execute the command correctly, make sure you have the PATH or PG_CONFIG variable set.\n```\nexport PATH=path_to_your_bin_folder:$PATH\n# or\nexport PG_CONFIG=path_to_your_bin_folder/pg_config\n```\n\nAdd the module name to the `shared_preload_libraries` parameter in `postgresql.conf`:\n```\nshared_preload_libraries = 'pg_query_state'\n```\nIt is essential to restart the PostgreSQL instance. 
After that, execute the following query in psql:\n```sql\nCREATE EXTENSION pg_query_state;\n```\nDone!\n\n## Tests\nTest using parallel sessions with Python 3+ compatible script:\n```shell\npython3 tests/pg_qs_test_runner.py [OPTION]...\n```\n*prerequisite packages*:\n* `psycopg2` version 2.6 or later\n* `PyYAML` version 3.11 or later\n* `progressbar2` for stress test progress reporting\n\n*options*:\n* *- -host* --- postgres server host, default value is *localhost*\n* *- -port* --- postgres server port, default value is *5432*\n* *- -database* --- database name, default value is *postgres*\n* *- -user* --- user name, default value is *postgres*\n* *- -password* --- user's password, default value is empty\n* *- -tpc-ds-setup* --- setup database to run TPC-DS benchmark\n* *- -tpc-ds-run* --- runs only stress tests on TPC-DS benchmark\n\nOr run all tests in `Docker` using:\n\n```shell\nexport LEVEL=hardcore\nexport USE_TPCDS=1\nexport PG_VERSION=12\n\n./mk_dockerfile.sh\n\ndocker-compose build\ndocker-compose run tests\n```\n\nThere are different test levels: `hardcore`, `nightmare` (runs tests under `valgrind`) and `stress` (runs tests under `TPC-DS` load).\n\n## Function pg\\_query\\_state\n```plpgsql\npg_query_state(\n        integer     pid,\n        verbose     boolean DEFAULT FALSE,\n        costs       boolean DEFAULT FALSE,\n        timing      boolean DEFAULT FALSE,\n        buffers     boolean DEFAULT FALSE,\n        triggers    boolean DEFAULT FALSE,\n        format      text    DEFAULT 'text'\n) returns TABLE (\n    pid             integer,\n    frame_number    integer,\n    query_text      text,\n    plan            text,\n    leader_pid      integer\n)\n```\nextracts the current query state from backend with specified `pid`. 
Since a parallel query can spawn multiple workers, and a function call causes nested subqueries so that the state of execution may be viewed as a stack of running queries, the return value of `pg_query_state` has type `TABLE (pid integer, frame_number integer, query_text text, plan text, leader_pid integer)`. It represents a tree structure consisting of the leader process and its spawned workers, identified by `pid`. Each worker refers to its leader through the `leader_pid` column. For the leader process the value of this column is `null`. The state of each process is represented as a stack of function calls. Each frame of that stack is specified as a correspondence between `frame_number` (starting from zero), `query_text` and `plan` with online statistics.\n\nThus, the user can see the states of the main query and of queries generated from function calls, for the leader process and all workers spawned from it.\n\nDuring execution, some nodes of the plan tree can be executed in full several times (loops). Therefore the statistics for each node consist of two parts: average statistics for previous loops, just as in EXPLAIN ANALYZE output, and statistics for the current loop if the node has not finished.\n\nOptional arguments:\n\n - `verbose` --- use EXPLAIN VERBOSE for plan printing;\n - `costs` --- add costs for each node;\n - `timing` --- print timing data for each node; if collection of timing statistics is turned off on the called side, the resulting output will contain the WARNING message `timing statistics disabled`;\n - `buffers` --- print buffers usage; if collection of buffers statistics is turned off on the called side, the resulting output will contain the WARNING message `buffers statistics disabled`;\n - `triggers` --- include trigger statistics in the resulting plan trees;\n - `format` --- EXPLAIN format to be used for printing plans, possible values: {`text`, `xml`, `json`, `yaml`}.\n\nIf the called backend is not executing any query, the function prints an INFO message about the backend's state taken from the `pg_stat_activity` view if it exists 
there.\n\n**_Warning_**: The calling role must be a superuser or a member of the role whose backend is being called. Otherwise the function prints the ERROR message `permission denied`.\n\n## Configuration settings\nThere are several user-accessible [GUC](https://www.postgresql.org/docs/9.5/static/config-setting.html) variables that toggle the whole module and the collection of specific statistics while a query is running:\n\n - `pg_query_state.enable` --- disable (or enable) `pg_query_state` completely, default value is `true`\n - `pg_query_state.enable_timing` --- collect timing data for each node, default value is `false`\n - `pg_query_state.enable_buffers` --- collect buffers usage, default value is `false`\n\nThese parameters are set on the called side before running any queries whose state is to be extracted. **_Warning_**: if `pg_query_state.enable_timing` is turned off, the calling side cannot get timing statistics; the same applies to the `pg_query_state.enable_buffers` parameter.\n\n## Examples\nSet the maximum number of parallel workers per `gather` node to `2`:\n```sql\npostgres=# set max_parallel_workers_per_gather = 2;\n```\nAssume one backend with pid = 49265 performs a simple query:\n```sql\npostgres=# select pg_backend_pid();\n pg_backend_pid\n ----------------\n          49265\n(1 row)\npostgres=# select count(*) from foo join bar on foo.c1=bar.c1;\n```\nAnother backend can extract the intermediate execution state of that query:\n```sql\npostgres=# \\x\npostgres=# select * from pg_query_state(49265);\n-[ RECORD 1 ]+-------------------------------------------------------------------------------------------------------------------------\npid          | 49265\nframe_number | 0\nquery_text   | select count(*) from foo join bar on foo.c1=bar.c1;\nplan         | Finalize Aggregate (Current loop: actual rows=0, loop number=1)                                                         +\n             |   -\u003e  Gather (Current loop: actual rows=0, loop number=1)            
                                                   +\n             |         Workers Planned: 2                                                                                              +\n             |         Workers Launched: 2                                                                                             +\n             |         -\u003e  Partial Aggregate (Current loop: actual rows=0, loop number=1)                                              +\n             |               -\u003e  Nested Loop (Current loop: actual rows=12, loop number=1)                                             +\n             |                     Join Filter: (foo.c1 = bar.c1)                                                                      +\n             |                     Rows Removed by Join Filter: 5673232                                                                +\n             |                     -\u003e  Parallel Seq Scan on foo (Current loop: actual rows=12, loop number=1)                          +\n             |                     -\u003e  Seq Scan on bar (actual rows=500000 loops=11) (Current loop: actual rows=173244, loop number=12)\nleader_pid   | (null)\n-[ RECORD 2 ]+-------------------------------------------------------------------------------------------------------------------------\npid          | 49324\nframe_number | 0\nquery_text   | \u003cparallel query\u003e\nplan         | Partial Aggregate (Current loop: actual rows=0, loop number=1)                                                          +\n             |   -\u003e  Nested Loop (Current loop: actual rows=10, loop number=1)                                                         +\n             |         Join Filter: (foo.c1 = bar.c1)                                                                                  +\n             |         Rows Removed by Join Filter: 4896779                                                                            +\n             |         
-\u003e  Parallel Seq Scan on foo (Current loop: actual rows=10, loop number=1)                                      +\n             |         -\u003e  Seq Scan on bar (actual rows=500000 loops=9) (Current loop: actual rows=396789, loop number=10)\nleader_pid   | 49265\n-[ RECORD 3 ]+-------------------------------------------------------------------------------------------------------------------------\npid          | 49323\nframe_number | 0\nquery_text   | \u003cparallel query\u003e\nplan         | Partial Aggregate (Current loop: actual rows=0, loop number=1)                                                          +\n             |   -\u003e  Nested Loop (Current loop: actual rows=11, loop number=1)                                                         +\n             |         Join Filter: (foo.c1 = bar.c1)                                                                                  +\n             |         Rows Removed by Join Filter: 5268783                                                                            +\n             |         -\u003e  Parallel Seq Scan on foo (Current loop: actual rows=11, loop number=1)                                      +\n             |         -\u003e  Seq Scan on bar (actual rows=500000 loops=10) (Current loop: actual rows=268794, loop number=11)\nleader_pid   | 49265\n```\nIn example above working backend spawns two parallel workers with pids `49324` and `49323`. Their `leader_pid` column's values clarify that these workers belong to the main backend.\n`Seq Scan` node has statistics on passed loops (average number of rows delivered to `Nested Loop` and number of passed loops are shown) and statistics on current loop. 
Other nodes have statistics only for the current loop, as this loop is the first (`loop number` = 1).\n\nAssume the first backend executes some function:\n```sql\npostgres=# select n_join_foo_bar();\n```\nAnother backend can get the following output:\n```sql\npostgres=# select * from pg_query_state(49265);\n-[ RECORD 1 ]+------------------------------------------------------------------------------------------------------------------\npid          | 49265\nframe_number | 0\nquery_text   | select n_join_foo_bar();\nplan         | Result (Current loop: actual rows=0, loop number=1)\nleader_pid   | (null)\n-[ RECORD 2 ]+------------------------------------------------------------------------------------------------------------------\npid          | 49265\nframe_number | 1\nquery_text   | SELECT (select count(*) from foo join bar on foo.c1=bar.c1)\nplan         | Result (Current loop: actual rows=0, loop number=1)                                                              +\n             |   InitPlan 1 (returns $0)                                                                                        +\n             |     -\u003e  Aggregate (Current loop: actual rows=0, loop number=1)                                                   +\n             |           -\u003e  Nested Loop (Current loop: actual rows=51, loop number=1)                                          +\n             |                 Join Filter: (foo.c1 = bar.c1)                                                                   +\n             |                 Rows Removed by Join Filter: 51636304                                                            +\n             |                 -\u003e  Seq Scan on bar (Current loop: actual rows=52, loop number=1)                                +\n             |                 -\u003e  Materialize (actual rows=1000000 loops=51) (Current loop: actual rows=636355, loop number=52)+\n             |                       -\u003e  Seq Scan on foo (Current loop: actual 
rows=1000000, loop number=1)\nleader_pid   | (null)\n```\nFirst row corresponds to function call, second - to query which is in the body of that function.\n\nWe can get result plans in different format (e.g. `json`):\n```sql\npostgres=# select * from pg_query_state(pid := 49265, format := 'json');\n-[ RECORD 1 ]+------------------------------------------------------------\npid          | 49265\nframe_number | 0\nquery_text   | select * from n_join_foo_bar();\nplan         | {                                                          +\n             |   \"Plan\": {                                                +\n             |     \"Node Type\": \"Function Scan\",                          +\n             |     \"Parallel Aware\": false,                               +\n             |     \"Function Name\": \"n_join_foo_bar\",                     +\n             |     \"Alias\": \"n_join_foo_bar\",                             +\n             |     \"Current loop\": {                                      +\n             |       \"Actual Loop Number\": 1,                             +\n             |       \"Actual Rows\": 0                                     +\n             |     }                                                      +\n             |   }                                                        +\n             | }\nleader_pid   | (null)\n-[ RECORD 2 ]+------------------------------------------------------------\npid          | 49265\nframe_number | 1\nquery_text   | SELECT (select count(*) from foo join bar on foo.c1=bar.c1)\nplan         | {                                                          +\n             |   \"Plan\": {                                                +\n             |     \"Node Type\": \"Result\",                                 +\n             |     \"Parallel Aware\": false,                               +\n             |     \"Current loop\": {                                      +\n             |       \"Actual Loop 
Number\": 1,                             +\n             |       \"Actual Rows\": 0                                     +\n             |     },                                                     +\n             |     \"Plans\": [                                             +\n             |       {                                                    +\n             |         \"Node Type\": \"Aggregate\",                          +\n             |         \"Strategy\": \"Plain\",                               +\n             |         \"Partial Mode\": \"Simple\",                          +\n             |         \"Parent Relationship\": \"InitPlan\",                 +\n             |         \"Subplan Name\": \"InitPlan 1 (returns $0)\",         +\n             |         \"Parallel Aware\": false,                           +\n             |         \"Current loop\": {                                  +\n             |           \"Actual Loop Number\": 1,                         +\n             |           \"Actual Rows\": 0                                 +\n             |         },                                                 +\n             |         \"Plans\": [                                         +\n             |           {                                                +\n             |             \"Node Type\": \"Nested Loop\",                    +\n             |             \"Parent Relationship\": \"Outer\",                +\n             |             \"Parallel Aware\": false,                       +\n             |             \"Join Type\": \"Inner\",                          +\n             |             \"Current loop\": {                              +\n             |               \"Actual Loop Number\": 1,                     +\n             |               \"Actual Rows\": 610                           +\n             |             },                                             +\n             |             \"Join Filter\": \"(foo.c1 
= bar.c1)\",            +\n             |             \"Rows Removed by Join Filter\": 610072944,      +\n             |             \"Plans\": [                                     +\n             |               {                                            +\n             |                 \"Node Type\": \"Seq Scan\",                   +\n             |                 \"Parent Relationship\": \"Outer\",            +\n             |                 \"Parallel Aware\": false,                   +\n             |                 \"Relation Name\": \"bar\",                    +\n             |                 \"Alias\": \"bar\",                            +\n             |                 \"Current loop\": {                          +\n             |                   \"Actual Loop Number\": 1,                 +\n             |                   \"Actual Rows\": 611                       +\n             |                 }                                          +\n             |               },                                           +\n             |               {                                            +\n             |                 \"Node Type\": \"Materialize\",                +\n             |                 \"Parent Relationship\": \"Inner\",            +\n             |                 \"Parallel Aware\": false,                   +\n             |                 \"Actual Rows\": 1000000,                    +\n             |                 \"Actual Loops\": 610,                       +\n             |                 \"Current loop\": {                          +\n             |                   \"Actual Loop Number\": 611,               +\n             |                   \"Actual Rows\": 73554                     +\n             |                 },                                         +\n             |                 \"Plans\": [                                 +\n             |                   {                                        
+\n             |                     \"Node Type\": \"Seq Scan\",               +\n             |                     \"Parent Relationship\": \"Outer\",        +\n             |                     \"Parallel Aware\": false,               +\n             |                     \"Relation Name\": \"foo\",                +\n             |                     \"Alias\": \"foo\",                        +\n             |                     \"Current loop\": {                      +\n             |                       \"Actual Loop Number\": 1,             +\n             |                       \"Actual Rows\": 1000000               +\n             |                     }                                      +\n             |                   }                                        +\n             |                 ]                                          +\n             |               }                                            +\n             |             ]                                              +\n             |           }                                                +\n             |         ]                                                  +\n             |       }                                                    +\n             |     ]                                                      +\n             |   }                                                        +\n             | }\nleader_pid   | (null)\n```\n\n## Feedback\nDo not hesitate to post your issues, questions and new ideas at the [issues](https://github.com/postgrespro/pg_query_state/issues) page.\n\n## Authors\n[Maksim Milyutin](https://github.com/maksm90)  \nAlexey Kondratov \u003ca.kondratov@postgrespro.ru\u003e Postgres Professional Ltd., 
Russia\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fpostgrespro%2Fpg_query_state","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fpostgrespro%2Fpg_query_state","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fpostgrespro%2Fpg_query_state/lists"}