https://github.com/nickjj/flask-pg-extras

A Flask extension to obtain useful information from your PostgreSQL database.

# What is Flask-PG-Extras? ![CI](https://github.com/nickjj/latest-releases/workflows/CI/badge.svg?branch=master)

It's a Flask extension that ports [Heroku's PG Extras
plugin](https://github.com/heroku/heroku-pg-extras) to Flask / SQLAlchemy
without needing to use Heroku.

You'll get over a dozen `flask pg-extras` commands that you can use to gain
insights on your PostgreSQL database. This includes information about locks,
index usage, buffer cache hit ratios, query times and more.

This could be useful when trying to analyze and improve the performance of your
database.

## Table of contents

- [Installation](#installation)
- [Ensuring the `pg-extras` command is available](#ensuring-the-pg-extras-command-is-available)
- [Going over the `pg-extras` sub-commands](#going-over-the-pg-extras-sub-commands)
- [FAQ](#faq)
  - [What about MySQL, MS SQL and SQLite?](#what-about-mysql-ms-sql-and-sqlite)
  - [Is it safe to run this against your production database?](#is-it-safe-to-run-this-against-your-production-database)
- [About the Author](#about-the-author)

## Installation

`pip install Flask-PG-Extras`

To use this extension you'll need:

- Python 3.6+
- Flask 1.0+
- SQLAlchemy 1.3+ (Flask-SQLAlchemy is also fine)
- PostgreSQL 9.x+

#### Flask app factory example using this extension

```py
# hello/app.py

from flask import Flask
from flask_pg_extras import FlaskPGExtras

# Create the extension once at module level; it's bound to an app later.
flask_pg_extras = FlaskPGExtras()


def create_app():
    app = Flask(__name__)

    # Bind the extension to this app instance.
    flask_pg_extras.init_app(app)

    @app.route('/')
    def index():
        return 'Hello world'

    return app
```

*A more complete example app can be found in the [tests/
directory](https://github.com/nickjj/flask-pg-extras/tree/master/tests/example_app).*

## Ensuring the `pg-extras` command is available

At a minimum, you'll want to set the `FLASK_APP` environment variable:

```sh
export FLASK_APP=hello.app
export FLASK_ENV=development
```

Then run the `flask` binary to see its help menu:

```sh
$ flask

Usage: flask [OPTIONS] COMMAND [ARGS]...

  A general utility script for Flask applications.

  Provides commands from Flask, extensions, and the application. Loads the
  application defined in the FLASK_APP environment variable, or from a
  wsgi.py file. Setting the FLASK_ENV environment variable to 'development'
  will enable debug mode.

    $ export FLASK_APP=hello.py
    $ export FLASK_ENV=development
    $ flask run

Options:
  --version  Show the flask version
  --help     Show this message and exit.

Commands:
  ...

  pg-extras  Obtain useful information from your PostgreSQL database.

  ...
```

If all went as planned you should see the new `pg-extras` command added to the
list of commands.

## Going over the `pg-extras` sub-commands

Running `flask pg-extras` will produce this help menu:

```sh
$ flask pg-extras

Usage: flask pg-extras [OPTIONS] COMMAND [ARGS]...

  Obtain useful information from your PostgreSQL database.

Options:
  --help  Show this message and exit.

Commands:
  extensions            Installed and available extensions.
  index-size            Size of individual indexes.
  index-unused          Unused and almost unused indexes.
  queries-active-locks  Queries with active locks.
  queries-blocking      Queries holding locks that are waiting.
  queries-long-running  Queries actively running for longer than 5 minutes.
  queries-outliers      10 longest executing queries.
  queries-popular       10 most called queries.
  table-bloat           Table bloat (beware of 10+ bloat ratios).
  table-cache-hit       Table cache hit rate (aim for 99%+).
  table-est-row-count   Estimated count of rows for each table (n_live_tup).
  table-index-size      Total size of all the indexes for each table.
  table-index-usage     Calculates your index hit rate for each table.
  table-seq-scans       Count of sequential scans for each table.
  table-size            Total size of the tables (excluding indexes).
  total-cache-hit       Total index and table hit rate.
  total-index-size      Total size of all indexes for every table.
  total-table-size      Total size of the tables (including indexes).
  vacuum-stats          Show dead rows and expected vacuum triggers.
```

### Commands

Most but not all of these commands are taken straight from Heroku's plugin. I
also renamed some of the commands to make it easier to group them up by what
they do.

Some of these examples and explanations are directly copy / pasted from
Heroku's plugin documentation.

---

#### `pg-extras extensions`

```
name                default_version  installed_version  comment
------------------  ---------------  -----------------  ------------------------------------------------------------------
plpgsql             1                1                  PL/pgSQL procedural language
moddatetime         1                                   functions for tracking last modification time
(truncated results for brevity)
tcn                 1                                   Triggered change notifications
pg_stat_statements  1.7                                 track execution statistics of all SQL statements executed
pg_freespacemap     1.2                                 examine the free space map (FSM)
intarray            1.2                                 functions, operators, and index support for 1-D arrays of integers
```

If an extension is listed here then it's either installed and ready to use, or
it's available but still needs to be enabled explicitly. It comes down to what
PostgreSQL does for a specific extension.

For example, the `pg_stat_statements` extension gives you access to query
performance stats. It's available in the output above, but you still need to
enable it explicitly since turning it on has implications. That's why
PostgreSQL doesn't turn it on by default.

---

#### `pg-extras index-size`

```
name | size
---------------------------------------------------------------+---------
idx_activity_attemptable_and_type_lesson_enrollment | 5196 MB
index_enrollment_attemptables_by_attempt_and_last_in_group | 4045 MB
index_attempts_on_student_id | 2611 MB
enrollment_activity_attemptables_pkey | 2513 MB
index_attempts_on_student_id_final_attemptable_type | 2466 MB
attempts_pkey | 2466 MB
index_attempts_on_response_id | 2404 MB
index_attempts_on_enrollment_id | 1957 MB
index_enrollment_attemptables_by_enrollment_activity_id | 1789 MB
enrollment_activities_pkey | 458 MB
index_enrollment_activities_by_lesson_enrollment_and_activity | 402 MB
index_placement_attempts_on_response_id | 109 MB
index_placement_attempts_on_placement_test_id | 108 MB
index_placement_attempts_on_grade_level_id | 97 MB
index_lesson_enrollments_on_lesson_id | 93 MB
```

This command displays the size of each index in the database, in MB. It is
calculated by taking the number of pages (reported in `relpages`) and
multiplying it by the page size (8192 bytes).
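
As a rough illustration of that arithmetic, here's a minimal Python sketch. The
`index_size_mb` helper and the sample page count are hypothetical and not part
of this extension. Only the 8192 byte page size comes from the description
above.

```python
# Hypothetical helper, not part of Flask-PG-Extras. It only mirrors the
# arithmetic described above: number of pages * page size.
PAGE_SIZE_BYTES = 8192  # page size quoted in the description


def index_size_mb(relpages: int) -> float:
    """Convert a page count (as reported in relpages) to megabytes (MiB)."""
    return relpages * PAGE_SIZE_BYTES / (1024 * 1024)


# 128 pages * 8192 bytes = 1,048,576 bytes
print(index_size_mb(128))  # 1.0
```

Keep in mind that `relpages` is an estimate that PostgreSQL refreshes during
operations such as `VACUUM` and `ANALYZE`, so the reported size can lag behind
the real size on disk.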

---

#### `pg-extras index-unused`

```
table | index | index_size | index_scans
---------------------+--------------------------------------------+------------+-------------
public.grade_levels | index_placement_attempts_on_grade_level_id | 97 MB | 0
public.observations | observations_attrs_grade_resources | 33 MB | 0
public.messages | user_resource_id_idx | 12 MB | 0
```

This command displays indexes that have < 50 scans recorded against them, and
are greater than 5 pages in size, ordered by size relative to the number of
index scans.

This command is generally useful for finding unused indexes that you can
remove. Unused indexes still slow down writes, and they can hurt read
performance too if they occupy space in memory.
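
The selection rule above (fewer than 50 scans, more than 5 pages) can be
sketched in Python. This is only an illustration and not the SQL the extension
runs. The `IndexStat` type, the `unused_indexes` helper and the sample data are
all made up, and sorting never-scanned indexes first is an assumption about
what "size relative to the number of index scans" means when there are zero
scans.

```python
from typing import NamedTuple


class IndexStat(NamedTuple):
    """Hypothetical per-index stats; not a type from Flask-PG-Extras."""
    name: str
    size_pages: int
    scans: int


def unused_indexes(stats, max_scans=50, min_pages=5):
    """Keep indexes with < max_scans scans and > min_pages pages."""
    candidates = [
        s for s in stats if s.scans < max_scans and s.size_pages > min_pages
    ]

    def sort_key(s):
        # Assumption: an index that was never scanned ranks ahead of the rest.
        per_scan = float("inf") if s.scans == 0 else s.size_pages / s.scans
        return (per_scan, s.size_pages)

    # Largest size per scan first.
    return sorted(candidates, key=sort_key, reverse=True)


sample = [
    IndexStat("idx_never_used", 12_000, 0),
    IndexStat("idx_rarely_used", 4_000, 10),
    IndexStat("idx_busy", 9_000, 25_000),  # dropped: too many scans
    IndexStat("idx_tiny", 3, 0),           # dropped: 5 pages or fewer
]
for stat in unused_indexes(sample):
    print(stat.name)
```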

---

#### `pg-extras queries-active-locks`

```
procpid | relname | transactionid | granted | query_snippet | age
---------+---------+---------------+---------+-----------------------+-----------------
31776 | | | t | in transaction | 00:19:29.837898
31776 | | 1294 | t | in transaction | 00:19:29.837898
31912 | | | t | select * from hello; | 00:19:17.94259
3443 | | | t | +| 00:00:00
| | | | select +|
| | | | pg_stat_activi |
```

This command displays queries that have taken out an exclusive lock on a
relation. Exclusive locks typically prevent other operations on that relation
from taking place, and can be a cause of "hung" queries that are waiting for a
lock to be granted.

---

#### `pg-extras queries-blocking`

```
blocked_pid | blocking_statement | blocking_duration | blocking_pid | blocked_statement | blocked_duration
-------------+--------------------------+-------------------+--------------+------------------------------------------------------------------------------------+------------------
461 | select count(*) from app | 00:00:03.838314 | 15682 | UPDATE "app" SET "updated_at" = '2013-03-04 15:07:04.746688' WHERE "id" = 12823149 | 00:00:03.821826
```

This command displays statements that are currently holding locks that other
statements are waiting on. This can be used in conjunction with
`pg-extras queries-active-locks` to determine which statements need to be
terminated in order to resolve lock contention.

---

#### `pg-extras queries-long-running`

```
pid | duration | query
-------+-----------------+---------------------------------------------------------------------------------------
19578 | 02:29:11.200129 | EXPLAIN SELECT "students".* FROM "students" WHERE "students"."id" = 1450645 LIMIT 1
19465 | 02:26:05.542653 | EXPLAIN SELECT "students".* FROM "students" WHERE "students"."id" = 1889881 LIMIT 1
19632 | 02:24:46.962818 | EXPLAIN SELECT "students".* FROM "students" WHERE "students"."id" = 1581884 LIMIT 1
(truncated results for brevity)
```

This command displays queries that have been running for longer than 5
minutes, descending by duration. Very long running queries can be a source of
multiple issues, such as preventing DDL statements from completing or vacuum
being unable to update `relfrozenxid`.
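
To make the filter and ordering concrete, here's a small Python sketch. It's
not how the extension works internally (that's a SQL query). The
`long_running` helper and the sample pids and durations are hypothetical.

```python
from datetime import timedelta

THRESHOLD = timedelta(minutes=5)  # cut-off mentioned in the description


def long_running(queries, threshold=THRESHOLD):
    """Return (pid, duration) pairs above the threshold, longest first."""
    slow = [(pid, duration) for pid, duration in queries if duration > threshold]
    return sorted(slow, key=lambda item: item[1], reverse=True)


# Made-up sample data.
sample = [
    (101, timedelta(minutes=2)),
    (102, timedelta(hours=2, minutes=29)),
    (103, timedelta(minutes=7)),
]
for pid, duration in long_running(sample):
    print(pid, duration)
```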

---

#### `pg-extras queries-outliers`

*Requires enabling
[pg_stat_statements](https://www.postgresql.org/docs/current/pgstatstatements.html)
before it will work.*

```
qry | exec_time | prop_exec_time | ncalls | sync_io_time
-----------------------------------------+------------------+----------------+-------------+--------------
SELECT * FROM archivable_usage_events.. | 154:39:26.431466 | 72.2% | 34,211,877 | 00:00:00
COPY public.archivable_usage_events (.. | 50:38:33.198418 | 23.6% | 13 | 13:34:21.00108
COPY public.usage_events (id, reporte.. | 02:32:16.335233 | 1.2% | 13 | 00:34:19.784318
INSERT INTO usage_events (id, retaine.. | 01:42:59.436532 | 0.8% | 12,328,187 | 00:00:00
SELECT * FROM usage_events WHERE (alp.. | 01:18:10.754354 | 0.6% | 102,114,301 | 00:00:00
UPDATE usage_events SET reporter_id =.. | 00:52:35.683254 | 0.4% | 23,786,348 | 00:00:00
INSERT INTO usage_events (id, retaine.. | 00:49:24.952561 | 0.4% | 21,988,201 | 00:00:00
COPY public.app_ownership_events (id,.. | 00:37:14.31082 | 0.3% | 13 | 00:12:32.584754
INSERT INTO app_ownership_events (id,.. | 00:26:59.808212 | 0.2% | 383,109 | 00:00:00
SELECT * FROM app_ownership_events .. | 00:19:06.021846 | 0.1% | 744,879 | 00:00:00
```

This command displays statements, obtained from `pg_stat_statements`, ordered
by the amount of time to execute in aggregate. This includes the statement
itself, the total execution time for that statement, the proportion of total
execution time for all statements that statement has taken up, the number of
times that statement has been called, and the amount of time that statement
spent on synchronous I/O (reading / writing from the filesystem).

Typically, an efficient query will have an appropriate ratio of calls to total
execution time, with as little time spent on I/O as possible. Queries that have
a high total execution time but low call count should be investigated to
improve their performance. Queries that have a high proportion of execution
time being spent on synchronous I/O should also be investigated.
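
The `prop_exec_time` column is each statement's share of the total execution
time across all statements. Here's a minimal Python sketch of that calculation
and ordering. The `rank_by_total_time` helper and the numbers are made up for
illustration and aren't real `pg_stat_statements` output.

```python
def rank_by_total_time(rows):
    """Sort (query, total_seconds, calls) rows by total time, adding a % share."""
    grand_total = sum(seconds for _, seconds, _ in rows)
    ranked = sorted(rows, key=lambda row: row[1], reverse=True)
    return [
        (
            query,
            seconds,
            calls,
            round(100 * seconds / grand_total, 1) if grand_total else 0.0,
        )
        for query, seconds, calls in ranked
    ]


# Made-up numbers: (query snippet, total execution seconds, number of calls).
sample = [
    ("INSERT INTO events ..", 50.0, 5000),
    ("SELECT * FROM events ..", 750.0, 1000),
    ("COPY events ..", 200.0, 13),
]
for query, seconds, calls, share in rank_by_total_time(sample):
    print(f"{query:<24} {seconds:>7.1f}s {share:>5.1f}% {calls:>6} calls")
```

In this made-up sample the `COPY` statement accounts for 20% of the total time
across only 13 calls, which is the kind of "high total time, low call count"
pattern described above.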

---

#### `pg-extras queries-popular`

*Requires enabling
[pg_stat_statements](https://www.postgresql.org/docs/current/pgstatstatements.html)
before it will work.*

```
qry | exec_time | prop_exec_time | ncalls | sync_io_time
-----------------------------------------+------------------+----------------+-------------+--------------
SELECT * FROM usage_events WHERE (alp.. | 01:18:11.073333 | 0.6% | 102,120,780 | 00:00:00
BEGIN | 00:00:51.285988 | 0.0% | 47,288,662 | 00:00:00
COMMIT | 00:00:52.31724 | 0.0% | 47,288,615 | 00:00:00
SELECT * FROM archivable_usage_event.. | 154:39:26.431466 | 72.2% | 34,211,877 | 00:00:00
UPDATE usage_events SET reporter_id =.. | 00:52:35.986167 | 0.4% | 23,788,388 | 00:00:00
INSERT INTO usage_events (id, retaine.. | 00:49:25.260245 | 0.4% | 21,990,326 | 00:00:00
INSERT INTO usage_events (id, retaine.. | 01:42:59.436532 | 0.8% | 12,328,187 | 00:00:00
SELECT * FROM app_ownership_events .. | 00:19:06.289521 | 0.1% | 744,976 | 00:00:00
INSERT INTO app_ownership_events(id, .. | 00:26:59.885631 | 0.2% | 383,153 | 00:00:00
UPDATE app_ownership_events SET app_i.. | 00:01:22.282337 | 0.0% | 359,741 | 00:00:00
```

This command is much like `pg-extras queries-outliers`, but ordered by the
number of times a statement has been called.

---

#### `pg-extras table-bloat`

```
type | schemaname | object_name | bloat | waste
-------+------------+-------------------------------+-------+----------
table | public | bloated_table | 1.1 | 98 MB
table | public | other_bloated_table | 1.1 | 58 MB
index | public | bloated_table::bloated_index | 3.7 | 34 MB
table | public | clean_table | 0.2 | 3808 kB
table | public | other_clean_table | 0.3 | 1576 kB
```

This command displays an estimation of table "bloat": space allocated to a
relation that is full of dead tuples and has yet to be reclaimed. Tables with
a high bloat ratio, typically 10 or greater, should be investigated to see if
vacuuming is aggressive enough. High bloat can also be a sign of high table
churn.

---

#### `pg-extras table-cache-hit`

```
name | buffer_hits | block_reads | total_read | ratio
-----------------------+-------------+-------------+------------+-------------------
plans | 32123 | 2 | 32125 | 0.999937743190662
subscriptions | 95021 | 8 | 95029 | 0.999915815172211
teams | 171637 | 200 | 171837 | 0.99883610631005
```

This command provides information on the efficiency of the buffer cache for
both index reads (index hit rate) as well as table reads (table hit rate). A
low buffer cache hit ratio can be a sign that you need a more powerful DB
server.
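
Judging by the sample output above, the ratio is `buffer_hits` divided by
`total_read`, where `total_read` is `buffer_hits + block_reads`. A minimal
Python sketch of that arithmetic follows. The `cache_hit_ratio` helper is
hypothetical, and returning `None` for a table with no reads is my own choice
rather than something the extension is known to do.

```python
def cache_hit_ratio(buffer_hits, block_reads):
    """Fraction of reads served from the buffer cache, or None if no reads."""
    total_read = buffer_hits + block_reads
    if total_read == 0:
        return None  # assumption: nothing to measure yet
    return buffer_hits / total_read


# First row of the sample output above: 32123 hits and 2 block reads.
print(cache_hit_ratio(32123, 2))
```

That row works out to roughly 0.99994, comfortably above the 99% target
mentioned in the command's help text.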

---

#### `pg-extras table-est-row-count`

```
name | estimated_count
-----------------------------------+-----------------
tastypie_apiaccess | 568891
notifications_event | 381227
core_todo | 178614
core_comment | 123969
notifications_notification | 102101
django_session | 68078
```

This command displays an estimated count of rows per table, descending by
estimated count. The estimated count is derived from `n_live_tup`, which is
updated by vacuum operations. Due to the way `n_live_tup` is populated, sparse
vs. dense pages can result in estimations that are significantly off from the
real count of rows.

---

#### `pg-extras table-index-size`

```
table | indexes_size
---------------------------------------------------------------+--------------
learning_coaches | 153 MB
states | 125 MB
charities_customers | 93 MB
charities | 16 MB
grade_levels | 11 MB
```

This command displays the total size of indexes for each table, in MB. It is
calculated by using the system administration function `pg_indexes_size()`.

---

#### `pg-extras table-index-usage`

```
relname | percent_of_times_index_used | rows_in_table
---------------------+-----------------------------+---------------
events | 65 | 1217347
app_infos | 74 | 314057
app_infos_user_info | 0 | 198848
user_info | 5 | 94545
delayed_jobs | 27 | 0
```

This command provides information on the efficiency of indexes, represented as
what percentage of total scans were index scans. A low percentage can indicate
under indexing or wrong data being indexed.
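
Here's a minimal Python sketch of that percentage. The `index_usage_percent`
helper is hypothetical. Truncating to a whole number is an assumption based on
the whole numbers in the sample output, and returning `None` for a table with
no scans at all is my own choice.

```python
def index_usage_percent(seq_scans, index_scans):
    """Percentage of all scans that were index scans, or None if no scans."""
    total_scans = seq_scans + index_scans
    if total_scans == 0:
        return None  # assumption: no scans recorded yet
    # Integer division, assuming whole percentages like the sample output.
    return 100 * index_scans // total_scans


# Made-up scan counts: 35 sequential scans and 65 index scans.
print(index_usage_percent(35, 65))  # 65
```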

---

#### `pg-extras table-seq-scans`

```
name | count
-----------------------------------+----------
learning_coaches | 44820063
states | 36794975
grade_levels | 13972293
charities_customers | 8615277
charities | 4316276
messages | 3922247
contests_customers | 2915972
classroom_goals | 2142014
```

This command displays the number of sequential scans recorded against all
tables, descending by count of sequential scans. Tables that have very high
numbers of sequential scans may be under indexed, and it may be worth
investigating queries that read from these tables.

---

#### `pg-extras table-size`

```
name | size
---------------------------------------------------------------+---------
learning_coaches | 196 MB
states | 145 MB
grade_levels | 111 MB
charities_customers | 73 MB
charities | 66 MB
```

This command displays the size of each table in the database, in MB. It is
calculated by using the system administration function `pg_table_size()`, which
includes the size of the main data fork, free space map, visibility map and
TOAST data.

---

#### `pg-extras total-cache-hit`

```
name | ratio
----------------+------------------------
index hit rate | 0.99957765013541945832
table hit rate | 1.00
```

This command is similar to `table-cache-hit` except it's across all of your
tables.

---

#### `pg-extras total-index-size`

```
size
-------
28194 MB
```

This command displays the total size of all indexes on the database, in MB. It
is calculated by taking the number of pages (reported in `relpages`) and
multiplying it by the page size (8192 bytes).

---

#### `pg-extras total-table-size`

```
name | size
---------------------------------------------------------------+---------
learning_coaches | 349 MB
states | 270 MB
charities_customers | 166 MB
grade_levels | 122 MB
charities | 82 MB
```

This command displays the total size of each table in the database, in MB. It
is calculated by using the system administration function
`pg_total_relation_size()`, which includes table size, total index size and
TOAST data.

---

#### `pg-extras vacuum-stats`

```
schema | table | last_vacuum | last_autovacuum | rowcount | dead_rowcount | autovacuum_threshold | expect_autovacuum
--------+-----------------------+-------------+------------------+----------------+----------------+----------------------+-------------------
public | log_table | | 2013-04-26 17:37 | 18,030 | 0 | 3,656 |
public | data_table | | 2013-04-26 13:09 | 79 | 28 | 66 |
public | other_table | | 2013-04-26 11:41 | 41 | 47 | 58 |
public | queue_table | | 2013-04-26 17:39 | 12 | 8,228 | 52 | yes
public | picnic_table | | | 13 | 0 | 53 |
```

This command displays statistics related to vacuum operations for each table,
including an estimation of dead rows, last `autovacuum` and the current
`autovacuum` threshold. This command can be useful when determining if current
vacuum thresholds require adjustments, and to determine when the table was last
vacuumed.
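
The `autovacuum_threshold` values in the sample output line up with
PostgreSQL's usual formula of a base threshold plus a scale factor multiplied
by the row count. Here's a minimal Python sketch assuming PostgreSQL's default
settings (`autovacuum_vacuum_threshold = 50` and
`autovacuum_vacuum_scale_factor = 0.2`). The helper names are hypothetical, and
your server or individual tables may use different settings.

```python
BASE_THRESHOLD = 50  # PostgreSQL default for autovacuum_vacuum_threshold
SCALE_FACTOR = 0.2   # PostgreSQL default for autovacuum_vacuum_scale_factor


def autovacuum_threshold(rowcount, base=BASE_THRESHOLD, scale=SCALE_FACTOR):
    """Number of dead rows needed before autovacuum is expected to kick in."""
    return round(base + scale * rowcount)


def expect_autovacuum(rowcount, dead_rowcount):
    """True when the dead rows exceed the computed threshold."""
    return dead_rowcount > autovacuum_threshold(rowcount)


# (table, rowcount, dead_rowcount) taken from the sample output above.
sample = [
    ("log_table", 18030, 0),
    ("other_table", 41, 47),
    ("queue_table", 12, 8228),
]
for table, rows, dead in sample:
    print(table, autovacuum_threshold(rows), expect_autovacuum(rows, dead))
```

With those defaults, `queue_table` is the only sample row where the dead rows
exceed the threshold, which matches the `yes` in the `expect_autovacuum`
column.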

---

## FAQ

### What about MySQL, MS SQL and SQLite?

While this extension does use SQLAlchemy, it only uses it to piggyback off
your existing connection details and to execute raw SQL queries.

A majority of the queries access extensions, functions and tables that only
exist within PostgreSQL. If you take a look at the queries being run, you'll
find references to a few PostgreSQL specific things prefixed with `pg_*`.
That's why it will not work with any other database.

### Is it safe to run this against your production database?

You can run these commands in development to get a feel for things but this
extension is meant to be run against your production database to help figure
out how to improve the efficiency of your database while it's under your real
production load. Your results will likely be very different in development vs
production.

I encourage you to look at the [queries in the source
code](https://github.com/nickjj/flask-pg-extras/tree/master/flask_pg_extras/queries)
of this extension to get an idea of what's being run since you may run these
commands against your production database.

Some of these queries are above my pay grade to fully understand but I trust
the folks over at Heroku. Many thousands of people are running these exact
queries against their Heroku PostgreSQL databases.

## About the author

- Nick Janetakis | [@nickjanetakis](https://twitter.com/nickjanetakis)

If you're interested in learning Flask I have a 20+ hour video course called
[Build a SAAS App with
Flask](https://buildasaasappwithflask.com/?utm_source=github&utm_medium=pgextras&utm_campaign=readme).
It's a course where we build a real world SAAS app. Everything about the course
and demo videos of what we build is on the site linked above.