{"id":21882081,"url":"https://github.com/shgtkshruch/embulk-masking-sample","last_synced_at":"2026-04-10T01:17:16.672Z","repository":{"id":73276190,"uuid":"296651016","full_name":"shgtkshruch/embulk-masking-sample","owner":"shgtkshruch","description":"study for loading concealed data with Embulk","archived":false,"fork":false,"pushed_at":"2020-09-29T21:44:32.000Z","size":120,"stargazers_count":0,"open_issues_count":1,"forks_count":0,"subscribers_count":3,"default_branch":"master","last_synced_at":"2025-01-26T19:13:25.786Z","etag":null,"topics":["docker","embulk","lambda","rds","step-functions","terraform"],"latest_commit_sha":null,"homepage":"","language":"HCL","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":null,"status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/shgtkshruch.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":null,"code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null}},"created_at":"2020-09-18T14:53:39.000Z","updated_at":"2020-09-29T21:44:35.000Z","dependencies_parsed_at":"2023-04-06T23:07:28.366Z","dependency_job_id":null,"html_url":"https://github.com/shgtkshruch/embulk-masking-sample","commit_stats":null,"previous_names":[],"tags_count":0,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/shgtkshruch%2Fembulk-masking-sample","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/shgtkshruch%2Fembulk-masking-sample/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/shgtkshruch%2Fembulk-masking-sample/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/shgtkshruch%2Fembulk-masking-sample/manifests","owner_
url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/shgtkshruch","download_url":"https://codeload.github.com/shgtkshruch/embulk-masking-sample/tar.gz/refs/heads/master","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":244890097,"owners_count":20527035,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["docker","embulk","lambda","rds","step-functions","terraform"],"created_at":"2024-11-28T09:26:45.573Z","updated_at":"2025-12-30T23:54:24.672Z","avatar_url":"https://github.com/shgtkshruch.png","language":"HCL","funding_links":[],"categories":[],"sub_categories":[],"readme":"A study of loading masked data with [Embulk](https://www.embulk.org/).\n\n## Diagram\n\n\u003cp align=\"center\"\u003e\n  \u003cimg src=\"https://user-images.githubusercontent.com/5207601/94560318-6f93a480-029d-11eb-8c80-c6f4f226733a.png\"\u003e\n\u003c/p\u003e\n\n## Requirements\n\n- [Docker Compose](https://docs.docker.com/compose/)\n- [dip](https://github.com/bibendi/dip)\n- [GitHub CLI](https://cli.github.com/)\n\n## Setup\n\n```sh\n# Download test data\n# ref: https://dev.mysql.com/doc/employee/en/\n$ gh repo clone datacharmer/test_db\n\n# Launch MySQL server on port 4306\n$ dip provision\n\n# Import test data to MySQL\n$ docker-compose exec db /bin/bash -c 'mysql -u root -p\"$MYSQL_ROOT_PASSWORD\" \u003c employees.sql'\n\n# Install Embulk gems\n$ docker-compose exec -w /tmp/embulk/bundle embulk bash\n$ embulk bundle install\n```\n\n## Example\n\n[Official example](https://www.embulk.org/)\n\n```sh\n$ embulk example ./try1\n$ embulk guess ./try1/seed.yml -o 
./try1/config.yml\n\n$ embulk preview ./try1/config.yml\n+---------+--------------+-------------------------+-------------------------+----------------------------+\n| id:long | account:long |          time:timestamp |      purchase:timestamp |             comment:string |\n+---------+--------------+-------------------------+-------------------------+----------------------------+\n|       1 |       32,864 | 2015-01-27 19:23:49 UTC | 2015-01-27 00:00:00 UTC |                     embulk |\n|       2 |       14,824 | 2015-01-27 19:01:23 UTC | 2015-01-27 00:00:00 UTC |               embulk jruby |\n|       3 |       27,559 | 2015-01-28 02:20:02 UTC | 2015-01-28 00:00:00 UTC | Embulk \"csv\" parser plugin |\n|       4 |       11,270 | 2015-01-29 11:54:36 UTC | 2015-01-29 00:00:00 UTC |                            |\n+---------+--------------+-------------------------+-------------------------+----------------------------+\n\n$ embulk run ./try1/config.yml\n1,32864,2015-01-27 19:23:49,20150127,embulk\n2,14824,2015-01-27 19:01:23,20150127,embulk jruby\n3,27559,2015-01-28 02:20:02,20150128,Embulk \"csv\" parser plugin\n4,11270,2015-01-29 11:54:36,20150129,\n```\n\n## MySQL\n\n```sh\n$ docker-compose exec embulk bash\n$ embulk guess -b bundle -o ./mysql/config.yml ./mysql/seed.yml\n$ embulk preview -b bundle ./mysql/config.yml\n+-------------+-------------------------+-------------------+------------------+---------------+-------------------------+\n| emp_no:long |    birth_date:timestamp | first_name:string | last_name:string | gender:string |     hire_date:timestamp |\n+-------------+-------------------------+-------------------+------------------+---------------+-------------------------+\n|      10,001 | 1953-09-01 15:00:00 UTC |            Georgi |          Facello |             M | 1986-06-25 15:00:00 UTC |\n|      10,002 | 1964-06-01 15:00:00 UTC |           Bezalel |           Simmel |             F | 1985-11-20 15:00:00 UTC |\n|      10,003 | 1959-12-02 15:00:00 UTC 
|             Parto |          Bamford |             M | 1986-08-27 15:00:00 UTC |\n|      10,004 | 1954-04-30 15:00:00 UTC |         Chirstian |          Koblick |             M | 1986-11-30 15:00:00 UTC |\n|      10,005 | 1955-01-20 15:00:00 UTC |           Kyoichi |         Maliniak |             M | 1989-09-11 15:00:00 UTC |\n...\n...\n\n$ embulk run -b bundle ./mysql/config.yml\n```\n\n## AWS\n\n### Create IAM user for Terraform\n\nCreate a `.env-aws` file.\n\n```\nAWS_ACCESS_KEY_ID=xxx\nAWS_SECRET_ACCESS_KEY=xxx\nAWS_DEFAULT_REGION=xxx\n```\n\n```sh\n$ dip aws iam create-user --user-name embulk-mysql-rds-masking\n$ dip aws iam create-access-key --user-name embulk-mysql-rds-masking\n$ dip aws iam attach-user-policy \\\n  --policy-arn arn:aws:iam::aws:policy/AdministratorAccess \\\n  --user-name embulk-mysql-rds-masking\n```\n\n### Terraform\n\nCreate a `.env-tf` file with the `embulk-mysql-rds-masking` credentials.\n\n```\nAWS_ACCESS_KEY_ID=xxx\nAWS_SECRET_ACCESS_KEY=xxx\nAWS_DEFAULT_REGION=xxx\n```\n\n```sh\n# Generate zip files for the Lambda functions\n$ cd terraform/lambda \u0026\u0026 zip -r create-onetime-rds.zip create-onetime-rds.js \u0026\u0026 zip -r delete-onetime-rds.zip delete-onetime-rds.js \u0026\u0026 cd -\n\n$ dip terraform init\n$ dip terraform plan\n$ dip terraform apply\n```\n\n### Load `test_data` to RDS\n\n```sh\n$ docker-compose exec db /bin/bash -c 'mysql -h HOST -u dbuser -ppassword \u003c employees.sql'\n```\n\n### Create onetime RDS\n\n```sh\n$ dip aws stepfunctions start-execution \\\n  --state-machine-arn \u003cvalue\u003e \\\n  --input '{ \"DBInstanceIdentifier\": \"RDS_IDENTIFIER\" }'\n```\n\n### Transfer data from RDS to local MySQL\n\n1. Set the db `host` in `mysql/seed.yml`\n2. 
Run Embulk\n\n```sh\n$ docker-compose exec embulk bash\n$ embulk guess -b bundle -o ./mysql/config.yml ./mysql/seed.yml\n$ embulk preview -b bundle ./mysql/config.yml\n$ embulk run -b bundle ./mysql/config.yml\n```\n\n## Cleaning\n\nRemove the AWS resources.\n\n```sh\n$ dip terraform destroy\n```\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fshgtkshruch%2Fembulk-masking-sample","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fshgtkshruch%2Fembulk-masking-sample","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fshgtkshruch%2Fembulk-masking-sample/lists"}