https://github.com/hlinnaka/pg_quota
PostgreSQL extension for quotas
https://github.com/hlinnaka/pg_quota
Last synced: 2 months ago
JSON representation
PostgreSQL extension for quotas
- Host: GitHub
- URL: https://github.com/hlinnaka/pg_quota
- Owner: hlinnaka
- License: other
- Created: 2018-05-15T20:54:10.000Z (about 7 years ago)
- Default Branch: master
- Last Pushed: 2018-08-22T04:55:02.000Z (almost 7 years ago)
- Last Synced: 2024-11-21T21:42:21.535Z (7 months ago)
- Language: C
- Size: 15.6 KB
- Stars: 16
- Watchers: 1
- Forks: 8
- Open Issues: 4
-
Metadata Files:
- Readme: README.md
- License: LICENSE
Awesome Lists containing this project
- awesome-pg - pg_quota
README
Overview
========pg_quota is an extension that provides disk space quotas for PostgreSQL.
Using pg_quota, you can limit the amount of disk space that a user can use.The disk space used by each relation is attributed to the relation's owner.
Space used for by other things, like temporary files or temporary relations,
catalog objects, etc. is ignored.Limitations
-----------* Quotas are counted against the direct owner of each relation. You cannot
have quotas on groups, for example.* The owner of each relation is determined by the effects of committed
transactions only. Uncommitted transactions are not taken into account.
For example, if you change the owner of a table with ALTER TABLE, the
table is counted towards the old owner's quota until the transaction
commits. Likewise, if you create a table and load it with data in the
same transaction, it is not counted towards the user's quota until the
transaction commits.* Tracking disk usage is implemented by periodically scanning through the
data directory. That can be slow, if you have a lot of tables or
partitions. It also means that there is a significant delay before
changes to disk usage is reflected in the quota status.* Quotas are only checked at the beginning of INSERT and COPY statements.
As long as the user has not exceeded the quota at the beginning of the
statement, the INSERT or COPY is allowed to go through, even if it
causes the quota to be exceeded.As a consequence of the above limitation , quotas are rather "soft". It's
not hard to exceed them if you try.Usage
=====GUCs:
pg_quota.refresh_naptime:
Delay between scans of the data directory.pg_quota.databases:
List of databases to enforce quotas on.In each database that you want to use the quotas on, install the extension,
and add the database name to disk_quotas.databases setting. It cannot be
changed while the server is running, server restart is required. A background
worker process is launched for each database, so if you wish to use quotas on
many databases, you might need to bump up max_worker_processes.The extension comes with a configuration table, "disk_quotas.disk_quotas".
Insert quota configuration into the table, e.g:INSERT INTO quota.config VALUES ('alice'::regrole, pg_size_bytes('10 GB'));
You can view the quotas in effect, and current disk space usage with:
SELECT rolname,
pg_size_pretty(space_used) as used,
pg_size_pretty(quota) as quota
FROM quota.status;NULL in 'quota' means no quota is set for the role.
Example:
rolname | used | quota
---------+--------+-------
foouser | 360 kB |
heikki | 46 MB | 13 MB
(2 rows)Design
======There is a background worker process for each database, for which quotas
are activated. The worker process maintains an in-memory model of every
relation file and their owner.A configuration table to hold the quotas.
A shared memory hash table containing the current total disk space usage,
and the quota loaded from the configuration table.There are two different problems:
1. Keeping the in-memory tree "model" of all the files and their owners up to
date.2. Enforcing the quota. When quota is exceeded, prevent operations that
extend files.Keeping the model up-to-date
----------------------------To bootstrap, when the background worker starts, it scans the whole data
directory, and adds all files to the model. It then scans pg_class, and fills
in the owner of each file in the model.The model is refreshed every X seconds, by scanning the data directory and
pg_class again, like at startup. To detect deleted files, we keep track of
when we last saw each file. Every scan increments a "generation" counter, and
every file we encounter is stamped with the current generation. If, after
scanning the data directory, there are any files in the model with an older
generation stamp, we know that it has been deleted.TODO:
In order to react more quickly to changes, we should use something like Linux
inotify to detect changes to files continuously. The current polling approach
doesn't scale very well if you have hundreds of thousands of files in the
data directory. Another alternative would be to have the backends themselves
notify the worker process, whenever a file is extended or shrunk (there is no
convenient "hook" location for that, currently). Or perhaps use logical
decoding, although that would not work for unlogged tables.Enforcing the quota
-------------------The disk_quota module hooks into the ExecutorCheckPerms_hook, which gets
executed whenever an INSERT or COPY operation starts. If the table owner's
quota has been exceeded, you get an error.There are some limitations to this approach:
* The quota is only checked at the beginning of the statement. If you have a
quota of 1 GB, and use COPY to load 10 GB of data, it will succeed as long
as you are below the quota at the beginning of the operation.* The quota is not enforced at UPDATEs, or utility commands like CREATE INDEX.
(If an UPDATE or CREATE INDEX exceeds the quota, any subsequent INSERTs or
COPYs will fail, though.)TODO:
It would be nice to have a hook directly in smgrextend() or similar lower
level function, so that we could check the quota whenever a relation file is
extended, for any command.