https://github.com/embulk/embulk-output-s3
Embulk S3 output plugin
- Host: GitHub
- URL: https://github.com/embulk/embulk-output-s3
- Owner: embulk
- License: apache-2.0
- Created: 2015-03-15T07:50:09.000Z (about 10 years ago)
- Default Branch: master
- Last Pushed: 2023-01-06T00:08:30.000Z (over 2 years ago)
- Last Synced: 2025-04-05T09:41:58.677Z (about 2 months ago)
- Language: Java
- Homepage:
- Size: 258 KB
- Stars: 16
- Watchers: 6
- Forks: 18
- Open Issues: 5
Metadata Files:
- Readme: README.md
- License: LICENSE
# S3 file output plugin for Embulk
## Developers
* Manabu Takayama
* toyama hiroshi
* Civitaspo

## Overview
* **Plugin type**: file output
* **Load all or nothing**: no
* **Resume supported**: yes
* **Cleanup supported**: yes

## Configuration
- **path_prefix**: prefix of target keys (string, required)
- **file_ext**: suffix of target keys (string, required)
- **sequence_format**: format for sequence part of target keys (string, default: '.%03d.%02d')
- **bucket**: S3 bucket name (string, required)
- **endpoint**: S3 endpoint, such as `s3-us-west-1.amazonaws.com` (string, optional)
- **access_key_id**: AWS access key ID. Required when your agent is not running on an EC2 instance with an IAM role. (string, default: null)
- **secret_access_key**: AWS secret key. Required when your agent is not running on an EC2 instance with an IAM role. (string, default: null)
- **tmp_path**: temporary file directory. If null, it is associated with the default FileSystem. (string, default: null)
- **tmp_path_prefix**: prefix of temporary files (string, default: 'embulk-output-s3-')
- **canned_acl**: canned access control list for created objects ([enum](#cannedaccesscontrollist), default: null)
- [Deprecated] **proxy_host**: proxy host to use when accessing AWS S3 via proxy. (string, default: null)
- [Deprecated] **proxy_port**: proxy port to use when accessing AWS S3 via proxy. (string, default: null)
- **http_proxy**: http proxy configuration to use when accessing AWS S3 via http proxy. (optional)
  - **host**: proxy host (string, required)
  - **port**: proxy port (int, optional)
  - **https**: use https or not (boolean, default: true)
  - **user**: proxy user (string, optional)
  - **password**: proxy password (string, optional)
- **auth_method**: name of mechanism to authenticate requests (basic, env, instance, profile, properties, anonymous, session, or default. default: basic)
  - "basic": uses access_key_id and secret_access_key to authenticate.
    - **access_key_id**: AWS access key ID (string, required)
    - **secret_access_key**: AWS secret access key (string, required)
  - "env": uses AWS_ACCESS_KEY_ID (or AWS_ACCESS_KEY) and AWS_SECRET_KEY (or AWS_SECRET_ACCESS_KEY) environment variables.
  - "instance": uses EC2 instance profile.
  - "profile": uses credentials written in a file. The format of the file is as follows, where `[...]` is the name of a profile.
    - **profile_file**: path to a profiles file. (string, default: given by the AWS_CREDENTIAL_PROFILES_FILE environment variable, or ~/.aws/credentials).
    - **profile_name**: name of a profile. (string, default: `"default"`)

    ```
    [default]
    aws_access_key_id=YOUR_ACCESS_KEY_ID
    aws_secret_access_key=YOUR_SECRET_ACCESS_KEY

    [profile2]
    ...
    ```

  - "properties": uses aws.accessKeyId and aws.secretKey Java system properties.
  - "anonymous": uses anonymous access. This auth method can access only public files.
  - "session": uses temporarily generated access_key_id, secret_access_key and session_token.
    - **access_key_id**: AWS access key ID (string, required)
    - **secret_access_key**: AWS secret access key (string, required)
    - **session_token**: session token (string, required)
  - "default": uses the AWS SDK's default strategy to look up available credentials from the runtime environment. This method behaves like the combination of the following methods.
    1. "env"
    1. "properties"
    1. "profile"
    1. "instance"

### CannedAccessControlList
You can choose one of the following values.

- AuthenticatedRead
- AwsExecRead
- BucketOwnerFullControl
- BucketOwnerRead
- LogDeliveryWrite
- Private
- PublicRead
- PublicReadWrite

cf. http://docs.aws.amazon.com/AWSJavaSDK/latest/javadoc/com/amazonaws/services/s3/model/CannedAccessControlList.html
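
The nested options above can be easier to read in context. The following is a minimal sketch, not a tested configuration. It assumes the usual Embulk layout where the plugin is selected with `type: s3` under `out:`, and that the per-method options (`profile_file`, `profile_name`) sit at the same level as `auth_method`. The proxy host, proxy credentials, and file path are hypothetical placeholders.

```yaml
out:
  type: s3                        # assumption: plugin type name for embulk-output-s3
  path_prefix: logs/out
  file_ext: .csv
  sequence_format: '.%03d.%02d'   # the documented default, written out explicitly
  bucket: my-s3-bucket
  canned_acl: BucketOwnerFullControl   # one of the CannedAccessControlList values
  auth_method: profile
  profile_file: /path/to/credentials   # hypothetical path
  profile_name: profile2
  http_proxy:
    host: proxy.example.com       # hypothetical proxy host
    port: 8080
    https: true
    user: proxy-user              # hypothetical
    password: proxy-password      # hypothetical
  formatter:
    type: csv
```

Using `http_proxy` rather than the deprecated `proxy_host` and `proxy_port` keeps all proxy settings in one nested block.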
## Example
```yaml
path_prefix: logs/out
file_ext: .csv
bucket: my-s3-bucket
endpoint: s3-us-west-1.amazonaws.com
access_key_id: ABCXYZ123ABCXYZ123
secret_access_key: AbCxYz123aBcXyZ123
formatter:
  type: csv
```

## Build
```
$ ./gradlew gem
```

For Maintainers
----------------

### Release
Modify `version` in `build.gradle` at a detached commit, and then tag the commit with an annotation.
```
git checkout --detach master

(Edit: Remove "-SNAPSHOT" in "version" in build.gradle.)

git add build.gradle
git commit -m "Release vX.Y.Z"
git tag -a vX.Y.Z
(Edit: Write a tag annotation in the changelog format.)
```

See [Keep a Changelog](https://keepachangelog.com/en/1.0.0/) for the changelog format. We adopt a part of it for Git's tag annotation like below.
```
## [X.Y.Z] - YYYY-MM-DD

### Added
- Added a feature.

### Changed
- Changed something.

### Fixed
- Fixed a bug.
```

Then push the annotated tag. It triggers a release operation on GitHub Actions after approval.
```
git push -u origin vX.Y.Z
```