https://github.com/embulk/embulk-filter-encrypt
Encrypt filter plugin for Embulk
https://github.com/embulk/embulk-filter-encrypt
Last synced: 23 days ago
JSON representation
Encrypt filter plugin for Embulk
- Host: GitHub
- URL: https://github.com/embulk/embulk-filter-encrypt
- Owner: embulk
- License: apache-2.0
- Created: 2016-03-02T01:41:48.000Z (about 9 years ago)
- Default Branch: master
- Last Pushed: 2021-06-24T04:16:43.000Z (almost 4 years ago)
- Last Synced: 2025-04-14T19:54:39.269Z (about 1 month ago)
- Language: Java
- Size: 175 KB
- Stars: 6
- Watchers: 9
- Forks: 1
- Open Issues: 1
-
Metadata Files:
- Readme: README.md
- Changelog: ChangeLog
- License: LICENSE
Awesome Lists containing this project
README
# Encrypt filter plugin for Embulk
Converts columns using an encryption algorithm such as AES.
Encrypted data is encoded using base64. For example, if you have following input records:
id,password,comment
1,super,a
2,secret,bYou can apply encryption to password column and get following outputs:
id,password,comment
1,ayxU9lMA1iASdHGy/eAlWw==,a
2,v8ffsUOfspaqZ1KI7tPz+A==,b## Overview
* **Plugin type**: filter
## Configuration
- **algorithm**: encryption algorithm (see below) (enum, required)
- **column_names**: names of string columns to encrypt (array of string, required)
- **key_type**: encryption key (enum, optional, default: inline), can be either "inline" or "s3"
- **key_hex**: encryption key (string, required if key_type is inline)
- **iv_hex**: encryption initialization vector (string, required if mode of the algorithm is CBC and key_type is inline)
- **output_encoding**: the encoding of encrypted value, can be either "base64" or "hex" (base16)
- **aws_params**: AWS/S3 parameters (hash, required if key_type is s3)
- **region**: a valid AWS region
- **access_key**: a valid AWS access key
- **secret_key**: a valid AWS secret key
- **bucket**: a valid S3 bucket
- **path**: a valid S3 key (S3 file path)
## AlgorithmsAvailable algorithms are:
* **AES-256-CBC** (recommended)
* AES-192-CBC
* AES-128-CBC
* AES-256-ECB
* AES-192-ECB
* AES-128-ECBAES-256-CBC is the recommended algorithm. The other algorithms are prepared for compatibility with other components (see below "Decrypting data" section).
## Generating key and iv
### Using standard PBKDF2 Password-based Encryption algorithm
PBKDF2 is a standard (PKCS #5) algorithm to generate key and iv from a password.
To generate it, you can use [genkey.rb](https://raw.githubusercontent.com/embulk/embulk-filter-encrypt/master/genkey.rb) script.
You save above text as "genkey.rb", and run it as following:
$ ruby genkey.rb AES-256-CBC "my-pass-wo-rd"
It shows key and iv as following:
key=D0867C9310D061F17ACD11EB30DE68265DCB79849BE5FB2BE157919D19BF2F42
iv =2A1D6BD59D2DB50A59364BAD3B9B6544### Using openssl EVP_BytesToKey algorithm
You can use `openssl` EVP_BytesToKey algorithm to generate key and iv from a password. If you use AES-256-CBC cipher algorithm, you type following command:
$ echo secret | openssl enc -aes-256-cbc -a -nosalt -p
You will be asked to enter password. Then it shows key and iv:
key=DAFFED346E29C5654F54133D1FC65CCB5930071ACEAF5B64A22A11406F467DC9
iv =C92D28D70B4440DA3F0F05577ECFEE54
6aEGvMrGx7tODkPF7x5Yog==You can copy key and iv to key_hex and iv_hex parameters.
## Decrypting data
### openssl command
You can use openssl command as following:
$ echo | openssl enc -d -base64 | openssl enc -aes-256-cbc -d -K -iv
For example:
$ echo 6aEGvMrGx7tODkPF7x5Yog== | openssl enc -d -base64 | openssl enc -aes-256-cbc -d -K DAFFED346E29C5654F54133D1FC65CCB5930071ACEAF5B64A22A11406F467DC9 -iv C92D28D70B4440DA3F0F05577ECFEE54
secret### PostgreSQL
You can use PostgreSQL's `decrypt_iv` or `decrypt` function to decrypt values (provided as pgcrypto extension). If you use CBC,
decrypt_iv(decode(encrypted_column, 'base64'), decode('here_is_key_hex', 'hex'), decode('here_is_iv_hex', 'hex'), 'aes')
If you use ECB,
decrypt(decode(encrypted_column, 'base64'), decode('here_is_key_hex', 'hex'), 'aes')
## Example
* Inline key type
```yaml
filters:
- type: encrypt
algorithm: AES-256-CBC
column_names: [password, ip]
key_hex: 098F6BCD4621D373CADE4E832627B4F60A9172716AE6428409885B8B829CCB05
iv_hex: C9DD4BB33B827EB1FBA1B16A0074D460
output_encoding: hex
```* S3 key type
```yaml
filters:
- type: encrypt
algorithm: AES-256-CBC
column_names: [password, ip]
output_encoding: hex
key_type: s3
aws_params:
region: us-east-2
access_key: XXXXXXXXXXXXXXXXXXXX
secret_key: xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx
bucket: com.sample.keys
path: key.aes
```## Build
```
$ ./gradlew gem # -t to watch change of files and rebuild continuously
```