{"id":24512043,"url":"https://github.com/terrykong/s3tos3","last_synced_at":"2025-10-30T11:50:51.717Z","repository":{"id":80974581,"uuid":"174768742","full_name":"terrykong/s3tos3","owner":"terrykong","description":"Sync files and directories between s3 object stores","archived":false,"fork":false,"pushed_at":"2019-03-11T18:52:20.000Z","size":28,"stargazers_count":0,"open_issues_count":0,"forks_count":0,"subscribers_count":1,"default_branch":"master","last_synced_at":"2025-01-22T00:44:06.567Z","etag":null,"topics":["object-storage","object-store","s3","s3-bucket","s3-buckets","s3-storage","s3cmd","s3fs","sync","utility"],"latest_commit_sha":null,"homepage":"","language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"apache-2.0","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/terrykong.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null}},"created_at":"2019-03-10T02:25:00.000Z","updated_at":"2019-03-11T22:06:06.000Z","dependencies_parsed_at":null,"dependency_job_id":"4d320e79-e2a6-48d2-85dc-ec308a91124d","html_url":"https://github.com/terrykong/s3tos3","commit_stats":null,"previous_names":[],"tags_count":0,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/terrykong%2Fs3tos3","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/terrykong%2Fs3tos3/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/terrykong%2Fs3tos3/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/terrykong%2Fs3tos3/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts
/GitHub/owners/terrykong","download_url":"https://codeload.github.com/terrykong/s3tos3/tar.gz/refs/heads/master","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":243713416,"owners_count":20335567,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["object-storage","object-store","s3","s3-bucket","s3-buckets","s3-storage","s3cmd","s3fs","sync","utility"],"created_at":"2025-01-22T00:44:10.761Z","updated_at":"2025-10-30T11:50:46.696Z","avatar_url":"https://github.com/terrykong.png","language":"Python","funding_links":[],"categories":[],"sub_categories":[],"readme":"# s3tos3\nThis is a simple utility function that helps sync files or directories between s3 object stores.\n\nSometimes I would come across a situation where there were two clusters with s3 object stores (not necessarily on the cloud) and I wanted to move content between them. 
[s4cmd](https://github.com/bloomreach/s4cmd) is a great utility for accessing a single object store, so I wanted this tool to build upon s4cmd.\n\nPrereqs\n===\n+ python3\n+ `pip install s4cmd`\n\nUsage\n===\n```\nUsing this script requires a json config in ~/.s3tos3.config with lists of storages, here's an example:\n\n[\n  {\n    \"AWS_HOST\": \"host1\",\n    \"AWS_ACCESS_KEY_ID\": \"XXXXXXXXXXXXXXXXXXXX\",\n    \"AWS_SECRET_ACCESS_KEY\": \"YYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYY\"\n  },\n  {\n    \"AWS_HOST\": \"host2\",\n    \"AWS_ACCESS_KEY_ID\": \"ZZZZZZZZZZZZZZZZZZZZ\",\n    \"AWS_SECRET_ACCESS_KEY\": \"WWWWWWWWWWWWWWWWWWWWWWWWWWWWWWWWWWWWWWWW\"\n  }\n]\n\n# Lists all buckets in all object stores listed in config\npython s3tos3.py --ls_all \n\n# List content of path in all object stores\npython s3tos3.py --ls_all --ls_path s3://bucket1\n\n# List content of path in the second object store\npython s3tos3.py --ls_idx 1 --ls_path s3://bucket1\n\n# Dry run of the sync. src_idx and dest_idx refer to the index of the object store within the config\n# This will copy s3://root/file.txt -\u003e s3://workspace/file.txt\npython s3tos3.py --src_idx 0 --dest_idx 1 --src_path s3://root/file.txt --dest_path s3://workspace/ --dry_run\npython s3tos3.py --src_idx 0 --dest_idx 1 --src_path s3://root/file.txt --dest_path s3://workspace/file.txt --dry_run\n\n# Dry run of the sync. src_idx and dest_idx refer to the index of the object store within the config\n# This will copy s3://root/* -\u003e s3://workspace/*\npython s3tos3.py --src_idx 0 --dest_idx 1 --src_path s3://root/ --dest_path s3://workspace/ --dry_run\n\n# You can also pass forward args to s4cmd. 
Any arg that this script does not consume (with the exception of --dry_run)\n# is passed straight to s4cmd\npython s3tos3.py --src_idx 0 --dest_idx 1 --src_path s3://root/ --dest_path s3://workspace/ --multipart-split-size=100000000 -c 8 -t 3\n\n# If / doesn't have enough space for your copies and you want to use a different tmp dir:\npython s3tos3.py --src_idx 0 --dest_idx 1 --src_path s3://root/file.txt --dest_path s3://workspace/ --tmp_dir /a/different/tmp/dir\n```\n\nTODOs\n===\n+ Right now the script copies one file to disk at a time and then syncs that file to the destination object store. An option to do these in parallel might be worth adding.\n+ Even better, avoid copying to local disk :)\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fterrykong%2Fs3tos3","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fterrykong%2Fs3tos3","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fterrykong%2Fs3tos3/lists"}