https://github.com/rjuju/pg_anonymize
PostgreSQL dynamic data anonymization
- Host: GitHub
- URL: https://github.com/rjuju/pg_anonymize
- Owner: rjuju
- License: gpl-3.0
- Created: 2022-10-01T05:59:37.000Z (about 2 years ago)
- Default Branch: main
- Last Pushed: 2024-06-03T10:05:59.000Z (5 months ago)
- Last Synced: 2024-11-01T04:42:39.808Z (7 days ago)
- Language: C
- Size: 77.1 KB
- Stars: 41
- Watchers: 6
- Forks: 5
- Open Issues: 0
Metadata Files:
- Readme: README.md
- License: LICENSE
![Tests](https://github.com/rjuju/pg_anonymize/actions/workflows/tests.yml/badge.svg?branch=main)
pg_anonymize
============

pg_anonymize is a PostgreSQL extension that performs data anonymization
transparently on the database.

Requirements
------------

pg_anonymize is compatible with PostgreSQL 10 and above.

Installation
------------

You need the PostgreSQL header files for each major version you want to build
the extension for. Make sure that `pg_config` is available and points
to the correct major version. For instance, for PostgreSQL 14, `pg_config`
returned the following at the time this document was written:

```
$ pg_config --version
PostgreSQL 14.5
```

Decompress the tarball or clone the git repository. In the pg_anonymize source
directory, run:

```
make
sudo make install
```

NOTE: you have to make sure that `sudo pg_config` sees the correct version. If
you want to install the extension on multiple major versions, or used the wrong
`pg_config`, you first have to clean the compiled files using:

```
make clean
```

Configuration
-------------

pg_anonymize provides the following configuration options:

- **pg_anonymize.enabled** (bool): globally enables or disables
pg_anonymize. The default value is **on**.
- **pg_anonymize.check_labels** (bool): performs sanity checks (expression
validity, read-only, returned type and lack of SQL injection) on the defined
expression when declaring security labels. The default value is **on**.
- **pg_anonymize.inherit_labels** (bool): inherits security labels from relation
ancestors (partitioned tables and inheritance tables) if any. The default
value is **on**.

NOTE: even if **pg_anonymize.check_labels** is disabled, pg_anonymize will
still check that the defined expression doesn't contain any SQL injection.

Usage
-----

pg_anonymize must be loaded before it can be used. There are multiple
ways to do it. Usually, only a few roles should require data anonymization, so
the recommended way is to only load the extension for such roles. For
instance, assuming the role **alice** should be used:

```
ALTER ROLE alice SET session_preload_libraries = 'pg_anonymize';
```

NOTE: only sessions opened by alice **after** this command has been
successfully run will load pg_anonymize.

You can alternatively load it explicitly, for instance:

```
LOAD 'pg_anonymize';
```

NOTE: LOAD requires superuser privileges.

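As a quick sanity check, the per-role setting can be inspected from the system catalogs. The following is a minimal sketch that relies only on standard PostgreSQL objects (`pg_roles.rolconfig` and `SHOW`), not on anything specific to pg_anonymize; it reuses the **alice** role from the example above:

```sql
-- As a privileged role: list the per-role settings stored for alice.
-- After the ALTER ROLE command, rolconfig should contain an entry
-- for session_preload_libraries.
SELECT rolname, rolconfig
FROM pg_roles
WHERE rolname = 'alice';

-- In a NEW session opened by alice: show the effective value.
SHOW session_preload_libraries;
```

If the second command is run in a session that was opened before the `ALTER ROLE`, the old value is expected, since per-role settings are applied at session start.
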
You then need to declare the wanted role(s) as needing anonymized data. This
is done by adding a SECURITY LABEL **anonymize** on the target role(s). For
instance:

```
-- pg_anonymize needs to be loaded before declaring SECURITY LABEL
LOAD 'pg_anonymize';
SECURITY LABEL FOR pg_anonymize ON ROLE alice IS 'anonymize';
```

NOTE: declaring a SECURITY LABEL on a role requires CREATEROLE privilege.

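To confirm which labels have been declared, PostgreSQL's standard `pg_seclabels` system view can be queried. This is a minimal sketch, assuming the provider name `pg_anonymize` used in the `SECURITY LABEL FOR pg_anonymize` command above:

```sql
-- List every security label declared for the pg_anonymize provider.
-- pg_seclabels covers both database-local objects (e.g. columns)
-- and shared objects (e.g. roles).
SELECT objtype, objname, provider, label
FROM pg_seclabels
WHERE provider = 'pg_anonymize'
ORDER BY objtype, objname;
```

The same query can be rerun after declaring column labels to review the anonymization expressions in one place.
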
You can then declare how to anonymize each column with SECURITY LABELS,
defining an expression to replace the actual content.

For instance, assuming a simplistic customer table:

```
CREATE TABLE public.customer(id integer,
    first_name text,
    last_name text,
    birthday date,
    phone_number text);

INSERT INTO public.customer VALUES (1, 'Nice', 'Customer', '1970-03-04', '+886 1234 5678');
GRANT SELECT ON TABLE public.customer TO alice;
```

Let's anonymize the last name, birthday and phone number:

```
SECURITY LABEL FOR pg_anonymize ON COLUMN public.customer.last_name
    IS $$substr(last_name, 1, 1) || '*****'$$;
SECURITY LABEL FOR pg_anonymize ON COLUMN public.customer.birthday
    IS $$date_trunc('year', birthday)::date$$;
SECURITY LABEL FOR pg_anonymize ON COLUMN public.customer.phone_number
    IS $$regexp_replace(phone_number, '\d', 'X', 'g')$$;
```

NOTE: declaring a SECURITY LABEL on a column requires being the owner of the
underlying relation.

The **alice** role will now automatically see anonymized data. For instance:

```
-- current role sees the normal data
=# SELECT * FROM public.customer;
 id | first_name | last_name |  birthday  |  phone_number
----+------------+-----------+------------+----------------
  1 | Nice       | Customer  | 1970-03-04 | +886 1234 5678
(1 row)

-- but alice will see anonymized data
=# \c - alice
You are now connected to database "rjuju" as user "alice".

=> SELECT * FROM public.customer;
 id | first_name | last_name |  birthday  |  phone_number
----+------------+-----------+------------+----------------
  1 | Nice       | C*****    | 1970-01-01 | +XXX XXXX XXXX
(1 row)

-- pg_dump will also see anonymized data
$ pg_dump -U alice -t public.customer -a rjuju | grep "COPY" -A2
COPY public.customer (id, first_name, last_name, birthday, phone_number) FROM stdin;
1 Nice C***** 1970-01-01 +XXX XXXX XXXX
\.
```
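
The options listed under Configuration appear to be regular PostgreSQL settings, so they can presumably be inspected with `SHOW` once the library is loaded. The sketch below rests on two assumptions that this README does not state: that the settings can be changed per session with `SET`, and that doing so requires a sufficiently privileged role (otherwise an anonymized role could simply turn anonymization off):

```sql
-- Load the library so that its settings are registered in this session
-- (LOAD requires superuser privileges, as noted above).
LOAD 'pg_anonymize';

-- Inspect the current values; all three default to "on" per the README.
SHOW pg_anonymize.enabled;
SHOW pg_anonymize.check_labels;
SHOW pg_anonymize.inherit_labels;

-- Assumption: a privileged role may override a setting for the current
-- session only. Whether this is permitted is not confirmed by the README.
SET pg_anonymize.inherit_labels = off;

-- Return to the configured default.
RESET pg_anonymize.inherit_labels;
```
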