https://github.com/reagentx/s3-copy-concurrent
Script to copy files between places in AWS s3 concurrently
- Host: GitHub
- URL: https://github.com/reagentx/s3-copy-concurrent
- Owner: ReagentX
- License: gpl-3.0
- Created: 2018-09-15T23:01:59.000Z (almost 8 years ago)
- Default Branch: master
- Last Pushed: 2019-01-19T21:44:13.000Z (over 7 years ago)
- Last Synced: 2025-03-26T14:53:02.830Z (over 1 year ago)
- Topics: aws, boto3, s3
- Language: Python
- Size: 16.6 KB
- Stars: 3
- Watchers: 2
- Forks: 0
- Open Issues: 0
Metadata Files:
- Readme: README.md
- License: LICENSE
README
# s3-copy-concurrent
This is a Python 3 script that provides a function to concurrently copy files from one location in AWS S3 to another. Running copy operations on multiple directories concurrently shortens copy times. When running, it prints:
364_1 -> 37303
Folder doesnt exist
original/364/1 moves to new/37303
Copying 23490 items using in 32 processes.
23490 copy operations in 1:19
The first line means that the script checks the files under the prefix `364_1` to ensure they are all in `37303`. Since `37303` doesn't exist, it copies from `original/364/1` to `new/37303`. This runs in 32 processes, i.e. 32 concurrent copy operations, and completes in 1 minute and 19 seconds.
The only dependency is `boto3`, the AWS SDK for Python, which can be installed into your `venv` with `pip install boto3`.
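The copy step described above can be sketched roughly as follows. This is an illustration, not the script's actual code: the function names `list_keys` and `copy_prefix` are hypothetical, and the sketch uses a thread pool instead of the 32 processes the script reports, which keeps the example short and lets one client be shared. The `client` argument is assumed to be a `boto3` S3 client, and only its `get_paginator("list_objects_v2")` and `copy_object` calls are used.

```python
from concurrent.futures import ThreadPoolExecutor


def list_keys(client, bucket, prefix):
    """Yield every object key under `prefix`, following pagination."""
    paginator = client.get_paginator("list_objects_v2")
    for page in paginator.paginate(Bucket=bucket, Prefix=prefix):
        # "Contents" is absent when a page holds no objects.
        for obj in page.get("Contents", []):
            yield obj["Key"]


def copy_prefix(client, bucket, src_prefix, dst_prefix, workers=32):
    """Copy all objects under src_prefix to dst_prefix; return the count.

    Hypothetical helper: assumes source and destination share one bucket.
    """

    def copy_one(key):
        # Swap the leading source prefix for the destination prefix.
        new_key = dst_prefix + key[len(src_prefix):]
        client.copy_object(
            Bucket=bucket,
            Key=new_key,
            CopySource={"Bucket": bucket, "Key": key},
        )
        return new_key

    keys = list(list_keys(client, bucket, src_prefix))
    # Threads stand in for the script's processes in this sketch.
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return len(list(pool.map(copy_one, keys)))
```

With real credentials, a call such as `copy_prefix(boto3.client("s3"), "my-bucket", "original/364/1/", "new/37303/")` would copy the example folder from the printout; the bucket name here is a placeholder. Note that `copy_object` only handles objects up to 5 GB, so larger objects would need a multipart copy.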
## Analysis
You can analyze the printout by piping the output to a file and running `analyze.py` to get some interesting numbers:
Total copies: 10,130,334
Total seconds: 117,735
Total minutes: 1962.25
Total hours: 32.70
Total days: 1.36
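As a rough illustration of how these totals could be derived from the saved printout, the sketch below sums every summary line of the form `23490 copy operations in 1:19`. It is not the contents of `analyze.py`: the names `to_seconds`, `summarize`, and `report` are hypothetical, and the duration is assumed to be `M:SS` or `H:MM:SS`.

```python
import re

# Matches summary lines such as "23490 copy operations in 1:19".
LINE_RE = re.compile(r"^(\d+) copy operations in (\d+(?::\d+)*)$")


def to_seconds(duration):
    """Convert "M:SS" or "H:MM:SS" to a number of seconds."""
    seconds = 0
    for part in duration.split(":"):
        seconds = seconds * 60 + int(part)
    return seconds


def summarize(lines):
    """Return (total copies, total seconds) over all summary lines."""
    copies = seconds = 0
    for line in lines:
        match = LINE_RE.match(line.strip())
        if match:
            copies += int(match.group(1))
            seconds += to_seconds(match.group(2))
    return copies, seconds


def report(copies, seconds):
    """Format the totals in the same layout as the printout above."""
    return "\n".join([
        f"Total copies: {copies:,}",
        f"Total seconds: {seconds:,}",
        f"Total minutes: {seconds / 60:.2f}",
        f"Total hours: {seconds / 3600:.2f}",
        f"Total days: {seconds / 86400:.2f}",
    ])
```

Assuming the output was saved to a file named `output.log` (a placeholder name), `print(report(*summarize(open("output.log"))))` would print the totals for that run.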