https://github.com/jackall3n/pg-anonymize
- Host: GitHub
- URL: https://github.com/jackall3n/pg-anonymize
- Owner: jackall3n
- License: mit
- Created: 2023-06-16T18:15:03.000Z (over 1 year ago)
- Default Branch: main
- Last Pushed: 2024-10-28T23:02:18.000Z (3 months ago)
- Last Synced: 2024-10-29T00:17:15.915Z (3 months ago)
- Language: TypeScript
- Size: 290 KB
- Stars: 1
- Watchers: 1
- Forks: 0
- Open Issues: 15
Metadata Files:
- Readme: README.md
- Changelog: CHANGELOG.md
- License: LICENSE
pg-anonymize
=============

Export an anonymized copy of your PostgreSQL database. Sensitive data is replaced using `faker`, and the output is a file that you can easily import with `psql`.
[![oclif](https://img.shields.io/badge/cli-oclif-brightgreen.svg)](https://oclif.io)
[![Version](https://img.shields.io/npm/v/pg-anonymize.svg)](https://npmjs.org/package/pg-anonymize)
[![Downloads](https://img.shields.io/npm/dt/pg-anonymize.svg)](https://npmjs.org/package/pg-anonymize)
[![License](https://img.shields.io/npm/l/pg-anonymize.svg)](https://github.com/rap2hpoutre/pg-anonymize/blob/master/package.json)
## Usage

Run this command with a connection string and an output file name (no need to install first, thanks to `npx`):
```bash
npx pg-anonymize postgres://user:secret@localhost:1234/mydb -o dump.sql
```

☝️ This command requires `pg_dump`. It is probably already available if PostgreSQL is installed.

The output can also be stdout (`-`), so you can pipe it to zip, gz, or psql:
```bash
npx pg-anonymize postgres://user:secret@localhost:1234/mydb -o - | psql DATABASE_URL
```

## API
### `--columns | -c`
#### Specify list of columns to anonymize
Use the `--columns` option with a comma-separated list of column names:
```bash
npx pg-anonymize postgres://localhost/mydb \
--columns=email,firstName,lastName,phone
```

Specifying a list via `--columns` replaces the default list of automatically anonymized columns:
```csv
email,name,description,address,city,country,phone,comment,birthdate
```

You can also specify the table for a column using dot notation:
```csv
public.user.email,public.product.description,email,name
```

#### Customize replacements
You can also choose which faker function you want to use to replace data (default is `faker.random.word`):
```bash
npx pg-anonymize postgres://localhost/mydb \
--columns=firstName:faker.name.firstName,lastName:faker.name.lastName
```

:point_right: You don't need to specify a faker function, since the command will try to find the correct function from the column name.
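
To give an idea of what name-based lookup could look like, here is a minimal sketch. It is not the actual implementation of `pg-anonymize`: the `GUESSES` table and the `resolveColumn` helper are hypothetical names invented for this illustration, and the real matching rules may differ. It only shows the general idea: an explicit `column:replacement` wins, otherwise the column name is matched against known fragments, falling back to `faker.random.word`.

```javascript
// Hypothetical lookup table: lowercase column-name fragments mapped to
// faker function paths. Illustrative assumption, not pg-anonymize's real list.
const GUESSES = [
  ['firstname', 'faker.name.firstName'],
  ['lastname', 'faker.name.lastName'],
  ['email', 'faker.internet.email'],
];

// Default replacement named in this README.
const DEFAULT_FUNCTION = 'faker.random.word';

// Resolve one `--columns` entry such as `email`, `public.user.email`
// or `firstName:faker.name.firstName` to a column and a replacement.
function resolveColumn(entry) {
  const separator = entry.indexOf(':');
  // An explicit replacement after the first colon always wins.
  if (separator !== -1) {
    return {
      column: entry.slice(0, separator),
      replacement: entry.slice(separator + 1),
    };
  }
  // Otherwise guess from the last dot-separated segment (the column name).
  const name = entry.split('.').pop().toLowerCase();
  const match = GUESSES.find(([fragment]) => name.includes(fragment));
  return { column: entry, replacement: match ? match[1] : DEFAULT_FUNCTION };
}

console.log(resolveColumn('public.user.email'));
console.log(resolveColumn('comment'));
```

In this sketch, a column whose name matches nothing in the table simply gets the default function, which mirrors the default described above.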
You can use plain text too for static replacements:
```bash
npx pg-anonymize postgres://localhost/mydb \
--columns=textcol:hello,jsoncol:{},intcol:12
```

### `--extension`
#### Use an extension file to create your own custom replacements
Create an extension file written in JavaScript:
```javascript
// myExtension.js
module.exports = {
  maskEmail: (email) => {
    const [name, domain] = email.split('@');
    const { length: len } = name;
    const maskedName = name[0] + '...' + name[len - 1];
    const maskedEmail = maskedName + '@' + domain;
    return maskedEmail;
  }
};
```

Pass the path to `--extension` and use the module exports in `--columns`:
```bash
npx pg-anonymize postgres://localhost/mydb \
--extension ./myExtension.js \
--columns=email:extension.maskEmail
```

### `--config | -f`
#### Use a configuration file
You can use the `--config` option to specify a file with a list of column names and optional replacements, one per line.

Create a configuration file:
```csv
name
password:faker.random.word
```

Pass the path to the file to `--config`:
```bash
npx pg-anonymize postgres://localhost/mydb \
--config /path/to/file
```

### `--skip`
#### Skip tables
Use `--skip` to skip anonymizing entire tables:
```bash
npx pg-anonymize postgres://localhost/mydb --skip public.posts
```

### `--preserve-null | -n`
#### Preserve `NULL` values
Use `--preserve-null` to skip anonymization on fields with `NULL` values.
```bash
npx pg-anonymize postgres://localhost/mydb --preserve-null
```

### `--faker-locale`
#### Set faker's locale (i18n)
Use `--faker-locale` to change the locale used by faker (default: `en`).
## Import the anonymized file
The anonymized output file is plain SQL text, so you can import it with `psql`:
```bash
psql -d mylocaldb < output.sql
```

## Why
There are several alternatives, but I failed to get any of them working:
- [`postgresql_anonymizer`](https://postgresql-anonymizer.readthedocs.io/en/stable/) may be [hard to set up](https://postgresql-anonymizer.readthedocs.io/en/stable/INSTALL/#install-on-macos) and may be cumbersome for simple usage. Still, I guess it's the best solution.
- [`pganonymize`](https://pypi.org/project/pganonymize/) fails when the database does not use the `public` schema or when column names contain uppercase characters.
- [`pganonymizer`](https://github.com/asgeirrr/pgantomizer) also fails on simple cases, and its errors are silent or unclear.

## Credit
The original version of this package was created by @rap2hpoutre. It worked very well, but it was missing a few features and became stale. I decided to make significant changes to the CLI and update the core functionality. Because my PRs received no response, I decided to re-publish the package under a similar name. If the original author would like to get involved again, I'll happily merge all my changes into the original repository.