{"id":19651264,"url":"https://github.com/embulk/embulk-output-gcs","last_synced_at":"2025-08-22T18:16:39.550Z","repository":{"id":28617723,"uuid":"32136327","full_name":"embulk/embulk-output-gcs","owner":"embulk","description":"Google Cloud Storage output plugin for Embulk","archived":false,"fork":false,"pushed_at":"2023-11-24T03:13:14.000Z","size":316,"stargazers_count":12,"open_issues_count":3,"forks_count":5,"subscribers_count":11,"default_branch":"master","last_synced_at":"2025-07-27T08:21:49.451Z","etag":null,"topics":["embulk","google-cloud","google-cloud-storage"],"latest_commit_sha":null,"homepage":"https://github.com/embulk/embulk-output-gcs","language":"Java","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"apache-2.0","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/embulk.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null}},"created_at":"2015-03-13T06:50:16.000Z","updated_at":"2025-07-08T12:04:36.000Z","dependencies_parsed_at":"2024-01-12T15:12:21.550Z","dependency_job_id":null,"html_url":"https://github.com/embulk/embulk-output-gcs","commit_stats":{"total_commits":80,"total_committers":6,"mean_commits":"13.333333333333334","dds":"0.48750000000000004","last_synced_commit":"427f9fdc677885a7467606393f6a343ceda2c4c9"},"previous_names":[],"tags_count":11,"template":false,"template_full_name":null,"purl":"pkg:github/embulk/embulk-output-gcs","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/embulk%2Fembulk-output-gcs","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/embulk%2Fembulk-output-gcs/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/embulk%2Fembulk-outpu
t-gcs/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/embulk%2Fembulk-output-gcs/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/embulk","download_url":"https://codeload.github.com/embulk/embulk-output-gcs/tar.gz/refs/heads/master","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/embulk%2Fembulk-output-gcs/sbom","scorecard":null,"host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":271680511,"owners_count":24802074,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","status":"online","status_checked_at":"2025-08-22T02:00:08.480Z","response_time":65,"last_error":null,"robots_txt_status":"success","robots_txt_updated_at":"2025-07-24T06:49:26.215Z","robots_txt_url":"https://github.com/robots.txt","online":true,"can_crawl_api":true,"host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["embulk","google-cloud","google-cloud-storage"],"created_at":"2024-11-11T15:05:49.480Z","updated_at":"2025-08-22T18:16:39.523Z","avatar_url":"https://github.com/embulk.png","language":"Java","funding_links":[],"categories":[],"sub_categories":[],"readme":"[![Build Status](https://travis-ci.org/embulk/embulk-output-gcs.svg?branch=master)](https://travis-ci.org/embulk/embulk-output-gcs)\n\n# Google Cloud Storage output plugin for Embulk\n\nGoogle Cloud Storage output plugin for [Embulk](https://github.com/embulk/embulk).\n\n## Overview\n\n* **Plugin type**: file output\n* **Load all or nothing**: no\n* **Resume supported**: yes\n* **Cleanup supported**: no\n\n- The connector does not support retries if there is a problem with the streaming 
channel. In this case, we need to run the job again.\n\n## Configuration\n\n- **bucket**: Google Cloud Storage bucket name (string, required)\n- **path_prefix**: Prefix of output keys (string, required)\n- **file_ext**: Extension of output file (string, required)\n- **sequence_format**: Format of the sequence number of the output files (string, default value is \".%03d.%02d\")\n- **content_type**: Content type of output file (string, optional, default value is \"application/octet-stream\")\n- **auth_method**: Authentication method `private_key`, `json_key` or `compute_engine` (string, optional, default value is \"private_key\")\n- **service_account_email**: Google Cloud Platform service account email (string, required when auth_method is private_key)\n- **p12_keyfile**: Full path of the private key file of the Google Cloud Platform service account (string, required when auth_method is private_key)\n- **json_keyfile**: Full path of the JSON key file (string, required when auth_method is json_key)\n- **application_name**: Application name, anything you like (string, optional, default value is \"embulk-output-gcs\")\n- **max_connection_retry**: Number of connection retries to GCS (number, default value is 10)\n\n## Example\n\n```yaml\nout:\n  type: gcs\n  bucket: your-gcs-bucket-name\n  path_prefix: logs/out\n  file_ext: .csv\n  auth_method: private_key  # default\n  service_account_email: 'XYZ@developer.gserviceaccount.com'\n  p12_keyfile: '/path/to/private/key.p12'\n  formatter:\n    type: csv\n    encoding: UTF-8\n```\n\n## Authentication\n\nThree methods are supported to fetch an access token for the service account.\n\n1. Public-Private key pair of GCP (Google Cloud Platform)'s service account\n2. JSON key of GCP (Google Cloud Platform)'s service account\n3. 
Pre-defined access token (Google Compute Engine only)\n\n### Public-Private key pair of GCP's service account\n\nYou first need to create a service account (client ID), download its private key and deploy the key with embulk.\n\n```yaml\nout:\n  type: gcs\n  auth_method: private_key\n  service_account_email: ABCXYZ123ABCXYZ123.gserviceaccount.com\n  p12_keyfile: /path/to/p12_keyfile.p12\n```\n\n### JSON key of GCP's service account\n\nYou first need to create a service account (client ID), download its JSON key and deploy the key with embulk.\n\n```yaml\nout:\n  type: gcs\n  auth_method: json_key\n  json_keyfile: /path/to/json_keyfile.json\n```\n\nYou can also embed the contents of json_keyfile in config.yml.\n\n```yaml\nout:\n  type: gcs\n  auth_method: json_key\n  json_keyfile:\n    content: |\n      {\n          \"private_key_id\": \"123456789\",\n          \"private_key\": \"-----BEGIN PRIVATE KEY-----\\nABCDEF\",\n          \"client_email\": \"...\"\n      }\n```\n\n### Pre-defined access token (GCE only)\n\nOn the other hand, you don't need to explicitly create a service account for embulk when you\nrun embulk on Google Compute Engine. 
In this third authentication method, you need to\nadd the API scope \"https://www.googleapis.com/auth/devstorage.read_write\" to the scope list of your\nCompute Engine VM instance; then you can configure embulk like this.\n\n[Setting the scope of service account access for instances](https://cloud.google.com/compute/docs/authentication)\n\n```yaml\nout:\n  type: gcs\n  auth_method: compute_engine\n```\n\n## Build\n\n```\n$ ./gradlew gem\n```\n\n## Test\n\n```\n$ ./gradlew test  # -t to watch for file changes and rebuild continuously\n```\n\nTo run unit tests, we need to configure the following environment variables.\n\nWhen these environment variables are not set, most test cases are skipped.\n\n```\nGCP_EMAIL\nGCP_P12_KEYFILE\nGCP_JSON_KEYFILE\nGCP_BUCKET\nGCP_BUCKET_DIRECTORY(optional, if needed)\n```\n\nIf you're using Mac OS X El Capitan and GUI applications (IDE), set the variables as follows.\n\n```\n$ vi ~/Library/LaunchAgents/environment.plist\n\u003c?xml version=\"1.0\" encoding=\"UTF-8\"?\u003e\n\u003c!DOCTYPE plist PUBLIC \"-//Apple//DTD PLIST 1.0//EN\" \"http://www.apple.com/DTDs/PropertyList-1.0.dtd\"\u003e\n\u003cplist version=\"1.0\"\u003e\n\u003cdict\u003e\n  \u003ckey\u003eLabel\u003c/key\u003e\n  \u003cstring\u003emy.startup\u003c/string\u003e\n  \u003ckey\u003eProgramArguments\u003c/key\u003e\n  \u003carray\u003e\n    \u003cstring\u003esh\u003c/string\u003e\n    \u003cstring\u003e-c\u003c/string\u003e\n    \u003cstring\u003e\n      launchctl setenv GCP_EMAIL ABCXYZ123ABCXYZ123.gserviceaccount.com\n      launchctl setenv GCP_P12_KEYFILE /path/to/p12_keyfile.p12\n      launchctl setenv GCP_JSON_KEYFILE /path/to/json_keyfile.json\n      launchctl setenv GCP_BUCKET my-bucket\n      launchctl setenv GCP_BUCKET_DIRECTORY unittests\n    \u003c/string\u003e\n  \u003c/array\u003e\n  \u003ckey\u003eRunAtLoad\u003c/key\u003e\n  \u003ctrue/\u003e\n\u003c/dict\u003e\n\u003c/plist\u003e\n\n$ launchctl load ~/Library/LaunchAgents/environment.plist\n$ launchctl getenv GCP_EMAIL //try to 
get value.\n\nThen start your applications.\n```\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fembulk%2Fembulk-output-gcs","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fembulk%2Fembulk-output-gcs","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fembulk%2Fembulk-output-gcs/lists"}