An open API service indexing awesome lists of open source software.

https://github.com/reagentx/s3-copy-concurrent

Script to copy files between places in AWS s3 concurrently
https://github.com/reagentx/s3-copy-concurrent

aws boto3 s3

Last synced: about 2 months ago
JSON representation

Script to copy files between places in AWS s3 concurrently

Awesome Lists containing this project

README

          

# s3-copy-concurrent

This is a Python 3 script that provides a function to concurrenrly copy files from one location in AWS S3 to another. Concurrent copy operations on multiple directories expedites copy times. When running, it prints:

364_1 -> 37303
Folder doesnt exist
original/364/1 moves to new/37303
Copying 23490 items using in 32 processes.
23490 copy operations in 1:19

The first line means that we are checking the files under the prefix 364_1 to ensure they are all in 37303. Since 37303 doesn't exist, we copy from original/364/1 to new/37303. This runs in 32 processes, i.e. 32 concurrent copy operations, and completes in 1 minute and 19 seconds.

The only dependency is the AWS s3 library `boto3` which can be installed into your `venv` with `pip install boto3`.

## Analysis

You can analyze the printout by piping the output to a file and running `analyze.py` to get some interesting numbers:

Total copies: 10,130,334
Total seconds: 117,735
Total minutes: 1962.25
Total hours: 32.70
Total days: 1.36