https://github.com/engineervix/s3tob2
transfer files from Amazon S3 to Backblaze B2
- Host: GitHub
- URL: https://github.com/engineervix/s3tob2
- Owner: engineervix
- Created: 2025-07-29T09:36:07.000Z (8 months ago)
- Default Branch: main
- Last Pushed: 2025-07-29T09:50:09.000Z (8 months ago)
- Last Synced: 2025-08-14T07:47:49.573Z (7 months ago)
- Language: Python
- Size: 13.7 KB
- Stars: 0
- Watchers: 0
- Forks: 0
- Open Issues: 1
# S3 to B2 Transfer Script
A Python script to efficiently transfer files from Amazon S3 to Backblaze B2 storage with support for concurrent transfers, checksum verification, and resume capability.
## Features
- **Concurrent transfers** - Transfer multiple files in parallel for faster migration
- **Resume support** - Skip files that already exist in B2 (configurable)
- **Checksum verification** - Verify file integrity after transfer
- **Move or copy** - Option to delete files from S3 after successful transfer
- **Prefix filtering** - Transfer only files matching a specific prefix
- **Progress logging** - Detailed logs with transfer progress and summary
- **Error handling** - Graceful handling of transfer failures with detailed error reporting
## Requirements
- Python 3.10+
- Required packages:
  - boto3
  - b2sdk
  - python-dotenv

Install the dependencies:
```bash
pip install -r requirements.txt
```
## Configuration
Create a `.env` file in the same directory as the script with the following variables:
```env
# Required S3 settings
S3_BUCKET=your-s3-bucket-name
# Required B2 settings
B2_BUCKET=your-b2-bucket-name
B2_APPLICATION_KEY_ID=your-b2-key-id
B2_APPLICATION_KEY=your-b2-application-key
# Optional S3 settings
S3_PREFIX=path/to/files/ # Filter objects by prefix (default: "")
AWS_ACCESS_KEY_ID=your-aws-key # Uses default AWS credentials if not set
AWS_SECRET_ACCESS_KEY=your-aws-secret # Uses default AWS credentials if not set
AWS_REGION=us-east-1 # AWS region (default: us-east-1)
# Optional transfer settings
DELETE_FROM_S3=false # Delete from S3 after transfer (default: false)
MAX_WORKERS=5 # Number of concurrent transfers (default: 5)
VERIFY_CHECKSUMS=true # Verify file integrity (default: true)
SKIP_EXISTING=true # Skip files that exist in B2 (default: true)
```
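As a rough illustration of how the settings above could be read with their documented defaults, here is a minimal sketch. The names `TransferConfig`, `load_config`, and `parse_bool` are hypothetical and are not taken from `s3tob2.py`; the actual script may parse its settings differently.

```python
from dataclasses import dataclass
from typing import Mapping

# Assumption: these spellings count as "true"; the real script may differ.
TRUE_VALUES = {"1", "true", "yes", "on"}


def parse_bool(value: str) -> bool:
    """Interpret common truthy spellings; anything else is False."""
    return value.strip().lower() in TRUE_VALUES


@dataclass(frozen=True)
class TransferConfig:
    """Hypothetical container mirroring the variables in the .env example."""
    s3_bucket: str
    b2_bucket: str
    b2_key_id: str
    b2_key: str
    s3_prefix: str = ""
    aws_region: str = "us-east-1"
    delete_from_s3: bool = False
    max_workers: int = 5
    verify_checksums: bool = True
    skip_existing: bool = True


def load_config(env: Mapping[str, str]) -> TransferConfig:
    """Build a config from an environment-like mapping, applying the documented defaults."""
    required = ("S3_BUCKET", "B2_BUCKET", "B2_APPLICATION_KEY_ID", "B2_APPLICATION_KEY")
    missing = [name for name in required if not env.get(name)]
    if missing:
        raise ValueError(f"Missing required settings: {', '.join(missing)}")
    return TransferConfig(
        s3_bucket=env["S3_BUCKET"],
        b2_bucket=env["B2_BUCKET"],
        b2_key_id=env["B2_APPLICATION_KEY_ID"],
        b2_key=env["B2_APPLICATION_KEY"],
        s3_prefix=env.get("S3_PREFIX", ""),
        aws_region=env.get("AWS_REGION") or "us-east-1",
        delete_from_s3=parse_bool(env.get("DELETE_FROM_S3", "false")),
        max_workers=int(env.get("MAX_WORKERS", "5")),
        verify_checksums=parse_bool(env.get("VERIFY_CHECKSUMS", "true")),
        skip_existing=parse_bool(env.get("SKIP_EXISTING", "true")),
    )
```

With python-dotenv installed, calling `load_dotenv()` from the `dotenv` package and then `load_config(os.environ)` would pick up the values from `.env`.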
## Usage
1. Set up your `.env` file with the required credentials
2. Run the script:
```bash
python s3tob2.py
```
The script will:
- List all objects in the S3 bucket (with optional prefix filter)
- Transfer each file to B2, skipping existing files if configured
- Log progress to both console and `s3_to_b2_transfer.log`
- Display a summary of successful and failed transfers
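The listing step could look roughly like the sketch below, which uses boto3's `list_objects_v2` paginator so that buckets with more than one page of results are fully enumerated. The function name `iter_s3_objects` is hypothetical, and whether the script paginates this way is an assumption.

```python
from typing import Any, Iterator, Tuple


def iter_s3_objects(s3_client: Any, bucket: str, prefix: str = "") -> Iterator[Tuple[str, int]]:
    """Yield (key, size) for every object under the prefix, following pagination.

    `s3_client` is expected to be a boto3 S3 client, e.g. boto3.client("s3").
    """
    paginator = s3_client.get_paginator("list_objects_v2")
    for page in paginator.paginate(Bucket=bucket, Prefix=prefix):
        # "Contents" is absent from a page when nothing matches.
        for obj in page.get("Contents", []):
            yield obj["Key"], obj["Size"]
```

Passing the client in as an argument keeps the function easy to exercise with a stub, without touching a real bucket.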
## AWS Credentials
The script supports two methods for AWS authentication:
1. **Explicit credentials** - Set `AWS_ACCESS_KEY_ID` and `AWS_SECRET_ACCESS_KEY` in `.env`
2. **Default credentials** - Leave AWS credentials empty to use:
   - IAM instance role (when running on EC2)
   - AWS credentials file (`~/.aws/credentials`)
   - Environment variables
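One way to express the two authentication methods is to pass explicit keys to boto3 only when both are set, and otherwise let boto3 resolve credentials itself. This is a sketch under that assumption; `s3_client_kwargs` is a hypothetical helper, not a function from the script.

```python
from typing import Dict, Mapping


def s3_client_kwargs(env: Mapping[str, str]) -> Dict[str, str]:
    """Return keyword arguments for boto3.client("s3", ...).

    Explicit keys are included only when both are present; otherwise boto3
    falls back to its default credential resolution.
    """
    kwargs = {"region_name": env.get("AWS_REGION") or "us-east-1"}
    key_id = env.get("AWS_ACCESS_KEY_ID")
    secret = env.get("AWS_SECRET_ACCESS_KEY")
    if key_id and secret:
        kwargs["aws_access_key_id"] = key_id
        kwargs["aws_secret_access_key"] = secret
    return kwargs
```

A client could then be created with `boto3.client("s3", **s3_client_kwargs(os.environ))`.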
## B2 Setup
1. Create a B2 application key with the following capabilities:
   - `listBuckets`
   - `listFiles`
   - `readFiles`
   - `writeFiles`
   - `deleteFiles` (only needed if `DELETE_FROM_S3=true`)
2. Create or use an existing B2 bucket for the destination
## Transfer Options
### Copy vs Move
- **Copy mode** (`DELETE_FROM_S3=false`) - Files remain in S3 after transfer
- **Move mode** (`DELETE_FROM_S3=true`) - Files are deleted from S3 after successful transfer
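The difference between the two modes comes down to whether the source object is deleted once the upload has succeeded. The sketch below illustrates that ordering; `copy_or_move` and the `upload` callable are hypothetical, and the script's own control flow may differ.

```python
from typing import Any, Callable


def copy_or_move(
    s3_client: Any,
    bucket: str,
    key: str,
    upload: Callable[[str], bool],
    delete_from_s3: bool,
) -> str:
    """Upload one object, deleting the S3 source only after a successful upload.

    `upload` is a hypothetical callable that returns True when the object
    reached B2 (and passed any verification the caller performs).
    """
    if not upload(key):
        # Never delete the source when the transfer did not succeed.
        return "failed"
    if delete_from_s3:
        s3_client.delete_object(Bucket=bucket, Key=key)
        return "moved"
    return "copied"
```

Keeping the delete strictly after the success check means a failed transfer leaves the S3 object untouched in either mode.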
### Resume Capability
When `SKIP_EXISTING=true`, the script checks if each file already exists in B2 before transferring. This allows you to resume interrupted transfers without re-uploading files.
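A name-based existence check might be sketched as follows. It assumes the destination is a b2sdk `Bucket`, whose `ls()` method yields `(file_version, folder_name)` pairs, and that `prefix` names a folder in B2. The helpers `existing_b2_names` and `keys_to_transfer` are hypothetical, and the script may instead look up each file individually.

```python
from typing import Any, Iterable, List, Set


def existing_b2_names(b2_bucket: Any, prefix: str = "") -> Set[str]:
    """Collect the names of files already present in the B2 bucket under a folder."""
    return {
        file_version.file_name
        for file_version, _folder in b2_bucket.ls(folder_to_list=prefix, recursive=True)
    }


def keys_to_transfer(
    s3_keys: Iterable[str],
    already_in_b2: Set[str],
    skip_existing: bool = True,
) -> List[str]:
    """Return the S3 keys that still need transferring, preserving order."""
    if not skip_existing:
        return list(s3_keys)
    return [key for key in s3_keys if key not in already_in_b2]
```

Note that this sketch compares names only; it does not detect a file that exists in B2 with different contents.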
### Concurrent Transfers
Adjust `MAX_WORKERS` based on your network capacity and file sizes. Higher values may improve throughput for many small files, while lower values may be better for large files.
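To make the role of `MAX_WORKERS` concrete, here is a minimal sketch of a bounded thread pool that keeps going when an individual transfer raises. `transfer_all` and `transfer_one` are hypothetical names, and using `ThreadPoolExecutor` is an assumption about how the script achieves concurrency.

```python
from concurrent.futures import ThreadPoolExecutor, as_completed
from typing import Callable, Dict, Iterable, List, Tuple


def transfer_all(
    keys: Iterable[str],
    transfer_one: Callable[[str], None],
    max_workers: int = 5,
) -> Tuple[List[str], Dict[str, str]]:
    """Run transfer_one for each key in a thread pool.

    Returns (succeeded_keys, failed), where failed maps key -> error message.
    An exception in one transfer does not stop the others.
    """
    succeeded: List[str] = []
    failed: Dict[str, str] = {}
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        futures = {pool.submit(transfer_one, key): key for key in keys}
        for future in as_completed(futures):
            key = futures[future]
            try:
                future.result()
            except Exception as exc:  # record the failure and continue
                failed[key] = str(exc)
            else:
                succeeded.append(key)
    return succeeded, failed
```

Because results are collected as they complete, the order of `succeeded` is not guaranteed to match the input order.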
## Logging
The script creates detailed logs in `s3_to_b2_transfer.log` including:
- File transfer progress
- Error messages for failed transfers
- Transfer summary with success/failure counts
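Logging to both the console and a file can be set up with two handlers sharing one formatter, as in the sketch below. The format string is inferred from the sample log lines in this README, and `build_logger` and the logger name are hypothetical.

```python
import logging


def build_logger(log_path: str = "s3_to_b2_transfer.log") -> logging.Logger:
    """Return a logger that writes INFO and above to the console and to log_path."""
    logger = logging.getLogger("s3tob2_sketch")  # hypothetical logger name
    logger.setLevel(logging.INFO)
    logger.handlers.clear()  # avoid duplicate handlers on repeated calls
    logger.propagate = False
    formatter = logging.Formatter(
        "%(asctime)s - %(levelname)s - %(message)s",
        datefmt="%Y-%m-%d %H:%M:%S",
    )
    for handler in (logging.StreamHandler(), logging.FileHandler(log_path)):
        handler.setFormatter(formatter)
        logger.addHandler(handler)
    return logger
```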
## Error Handling
The script continues transferring remaining files even if individual transfers fail. Failed transfers are logged and counted in the final summary.
## Example Output
```
Configuration loaded:
S3 Bucket: my-s3-bucket
S3 Prefix: 'uploads/2024/'
AWS Region: us-east-1
B2 Bucket: my-b2-bucket
Delete from S3: False
Max Workers: 5
Verify Checksums: True
Skip Existing: True
2024-01-15 10:30:45 - INFO - Found 150 objects in S3 bucket
2024-01-15 10:30:45 - INFO - Starting transfer of 150 objects...
2024-01-15 10:30:46 - INFO - Downloading uploads/2024/file1.jpg (1048576 bytes) from S3...
2024-01-15 10:30:47 - INFO - Uploading uploads/2024/file1.jpg to B2...
2024-01-15 10:30:48 - INFO - Successfully transferred uploads/2024/file1.jpg
...
==================================================
TRANSFER SUMMARY
==================================================
Total objects: 150
Successful transfers: 148
Failed transfers: 2
Action: Copy
```
## Troubleshooting
### Authentication Errors
- Verify your AWS and B2 credentials are correct
- Ensure your B2 application key has the required capabilities
- Check that both buckets exist and are accessible with the configured credentials
### Transfer Failures
- Check the log file for detailed error messages
- Verify network connectivity to both S3 and B2
- Ensure sufficient permissions on both source and destination
### Performance Issues
- Adjust `MAX_WORKERS` based on your network and file characteristics
- For very large files, consider reducing concurrent transfers
- Monitor memory usage for large file transfers
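On the memory point, integrity checks do not require holding a whole file in memory: a digest can be computed in fixed-size chunks. The sketch below hashes a stream with SHA-1, which is the checksum B2 records for uploaded files. Whether the script's verification uses SHA-1, and whether it streams data this way, is not stated here; `sha1_of_stream` is a hypothetical helper.

```python
import hashlib
from typing import BinaryIO


def sha1_of_stream(stream: BinaryIO, chunk_size: int = 1024 * 1024) -> str:
    """Return the hex SHA-1 of a binary stream, reading chunk_size bytes at a time."""
    digest = hashlib.sha1()
    # iter() with a sentinel stops when read() returns an empty bytes object.
    for chunk in iter(lambda: stream.read(chunk_size), b""):
        digest.update(chunk)
    return digest.hexdigest()
```

Memory use stays bounded by `chunk_size` regardless of file size, which matters more as `MAX_WORKERS` grows.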