Ecosyste.ms: Awesome
An open API service indexing awesome lists of open source software.
https://github.com/mrpowers/redshift_copier
Redshift COPY command with Ruby
https://github.com/mrpowers/redshift_copier
Last synced: about 1 month ago
JSON representation
Redshift COPY command with Ruby
- Host: GitHub
- URL: https://github.com/mrpowers/redshift_copier
- Owner: MrPowers
- License: mit
- Created: 2016-03-03T06:54:30.000Z (almost 9 years ago)
- Default Branch: master
- Last Pushed: 2016-04-12T16:58:37.000Z (over 8 years ago)
- Last Synced: 2024-04-25T11:22:15.604Z (8 months ago)
- Language: Ruby
- Size: 6.84 KB
- Stars: 0
- Watchers: 3
- Forks: 0
- Open Issues: 0
-
Metadata Files:
- Readme: README.md
- License: LICENSE.txt
Awesome Lists containing this project
README
# RedshiftCopier
The redshift_copier gem can be used to copy a single CSV file or a collection of files in S3 into Redshift. If you would like to move data from one Redshift cluster to another, see the [redshift_extractor gem](https://github.com/MrPowers/redshift_extractor/).
## Copying without a manifest file
Here is how to use the redshift_copier gem to copy file(s) without a manifest file.
```ruby
db_config = {
user: "some user",
password: "password",
host: "host",
sslmode: 'require',
port: 5439,
dbname: "database name"
}
args = {
schema: "schema name",
table: "table name",
create_sql: File.read("path to create.sql file"),
aws_access_key_id: "aws_access_key_id",
aws_secret_access_key: "aws_secret_access_key",
s3_path: "s3://some_bucket/some_file.csv",
db_config: db_config,
copy_command_options: "dateformat 'auto' timeformat 'auto' blanksasnull emptyasnull escape removequotes delimiter ',' ignoreheader 1;"
}
copier = RedshiftCopier::Copy.new(args)
copier.run
```The `copy_command_options` need to be updated depending on your data. For example, if your CSV file does not have a header row, the `ignoreheader 1` option should be removed.
To copy a collection of files, set the `s3_path` to a folder: `"s3://some_bucket/some_folder/"`. All the files in the folder will be copied and they all must have the same CSV structure.
## Copying with a manifest file
Copying a collection of files is similar to copying a single file, but it requires a manifest file.
```json
{
"entries": [
{"url":"s3://some_bucket/some_folder/0000_part_00.gz"},
{"url":"s3://some_bucket/some_folder/0001_part_00.gz"}
]
}
```The path to the manifest file must be set in `s3_path` and the `copy_command_options` must specify the manifest option.
```ruby
args = {
schema: "schema name",
table: "table name",
create_sql: File.read("path to create.sql file"),
aws_access_key_id: "aws_access_key_id",
aws_secret_access_key: "aws_secret_access_key",
s3_path: "s3://some_bucket/some_folder/manifest",
db_config: db_config,
copy_command_options: "manifest dateformat 'auto' timeformat 'auto' blanksasnull emptyasnull escape gzip removequotes delimiter '|';"
}
copier = RedshiftCopier::Copy.new(args)
copy.run
```## Installation
Add this line to your application's Gemfile:
```ruby
gem 'redshift_copier'
```And then execute:
$ bundle
## Contributing
Bug reports and pull requests are welcome on GitHub at https://github.com/MrPowers/redshift_copier.
## License
The gem is available as open source under the terms of the [MIT License](http://opensource.org/licenses/MIT).