https://github.com/mrkamel/s3sync
Sync S3 buckets to your filesystem
- Host: GitHub
- URL: https://github.com/mrkamel/s3sync
- Owner: mrkamel
- Created: 2013-07-18T09:37:50.000Z (over 11 years ago)
- Default Branch: master
- Last Pushed: 2015-05-20T07:57:11.000Z (over 9 years ago)
- Last Synced: 2024-10-09T16:47:37.679Z (about 1 month ago)
- Language: Ruby
- Size: 152 KB
- Stars: 1
- Watchers: 3
- Forks: 0
- Open Issues: 0
Metadata Files:
- Readme: README.md
# s3sync
s3sync is a low-memory, highly parallelizable (via prefixes) S3 client that
syncs a bucket to a local filesystem.

## Setup

You need to have Ruby installed on your system.
First, install Bundler:

```
$ gem install bundler
```

Then, install s3sync's dependencies:

```
$ cd /path/to/s3sync
$ bundle
```

Create a config file, e.g. config.yml:

```
endpoint: s3-eu-west-1.amazonaws.com
access_key: YOUR ACCESS KEY
secret_key: YOUR SECRET KEY
```

## Usage

```
$ ./s3sync --help
Usage: s3sync [options]
    --config PATH
    --bucket BUCKET
    --prefix PREFIX
    --path PATH
```

The most common way to use it is:

```
$ s3sync --config config.yml --bucket BUCKET --prefix PREFIX --path /path/to/destination
```

s3sync is currently a fetch-only client: it fetches every file matching the
bucket and the optional prefix, unless the file already exists on the local
filesystem with the same file size as on S3.

## Example

The strength of s3sync is that it works great when parallelized by starting it multiple
times (e.g. via threads) with disjoint prefixes:

```
$ s3sync --config config.yml --bucket BUCKET --prefix images/1 --path /path/to/destination
$ s3sync --config config.yml --bucket BUCKET --prefix images/2 --path /path/to/destination
$ s3sync --config config.yml --bucket BUCKET --prefix images/3 --path /path/to/destination
...
```

To run it in parallel from your own Ruby scripts, you can use something similar to:

```ruby
require "thread"

# Runs the given block once per element of collection, using n worker threads.
def in_parallel(collection, n)
  queue = Queue.new
  collection.each { |element| queue.push element }

  threads = []

  n.times do
    threads << Thread.new do
      begin
        until queue.empty?
          # Non-blocking pop: raises ThreadError if another thread emptied the queue first
          yield queue.pop(true)
        end
      rescue ThreadError
        # Queue empty
      end
    end
  end

  threads.each(&:join)
end

prefixes = (1..9).to_a

in_parallel prefixes, 5 do |prefix|
  system "/path/to/s3sync", "--config", "/path/to/config.yml", "--bucket", "BUCKET", "--prefix", "images/#{prefix}", "--path", "/path/to/destination"
end
```
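
The script above discards the return value of `system`, so a run that fails goes unnoticed. The following is a minimal sketch of one way to collect the prefixes whose command did not succeed. It assumes that s3sync exits with a non-zero status on failure, which this README does not state. `failed_prefixes` and the block that builds the command are hypothetical names introduced here for illustration; they are not part of s3sync.

```ruby
# Hypothetical helper, not part of s3sync: runs one command per prefix on
# n worker threads and returns the prefixes whose command did not succeed.
# The block receives a prefix and must return the command as an array of strings.
def failed_prefixes(prefixes, n)
  queue = Queue.new
  prefixes.each { |prefix| queue.push prefix }

  failed = Queue.new # Queue is thread-safe, so workers can push concurrently

  threads = Array.new(n) do
    Thread.new do
      loop do
        prefix = begin
          queue.pop(true) # non-blocking pop, raises ThreadError when empty
        rescue ThreadError
          break
        end

        # Kernel#system returns true only for a zero exit status
        # (false for a non-zero status, nil if the command could not be run).
        failed.push(prefix) unless system(*yield(prefix))
      end
    end
  end

  threads.each(&:join)
  Array.new(failed.size) { failed.pop }
end
```

Called with a block that returns the same argument list as in the script above (for example `["/path/to/s3sync", "--config", "/path/to/config.yml", "--bucket", "BUCKET", "--prefix", "images/#{prefix}", "--path", "/path/to/destination"]`), it returns an array of the prefixes to retry. Since s3sync skips files that already exist locally with the same size, re-running a failed prefix should only fetch what is still missing.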