https://github.com/momokatte/s3-bulk-delete
A Go app for bulk-deleting files from an Amazon S3 bucket using the Multi-Object Delete operation.
- Host: GitHub
- URL: https://github.com/momokatte/s3-bulk-delete
- Owner: momokatte
- License: mit
- Created: 2018-12-12T22:39:15.000Z (over 7 years ago)
- Default Branch: master
- Last Pushed: 2019-01-10T03:45:21.000Z (over 7 years ago)
- Last Synced: 2025-08-15T03:55:41.922Z (10 months ago)
- Topics: amazon-s3, amazon-s3-bucket, aws-s3, concurrent, delete-multiple, file-delete-utility, file-deletion, golang, golang-application, multithreaded, parallel, s3
- Language: Go
- Homepage:
- Size: 28.3 KB
- Stars: 8
- Watchers: 1
- Forks: 3
- Open Issues: 1
Metadata Files:
- Readme: README.md
- License: LICENSE
s3-bulk-delete
==============
A Go app for bulk-deleting files from an Amazon S3 bucket using the Multi-Object Delete operation.
Supports stop/resume: optionally writes completed batch numbers to a file, and reads from that file on startup to load batch numbers to skip.
Usage
-----
The application reads a list of S3 keys from standard input, one key per line. Example input:
```
prefix_01/file_01.dat
prefix_01/file_02.dat
prefix_01/file_03.dat
prefix_01/file_04.dat
prefix_02/file_01.dat
prefix_02/file_02.dat
prefix_02/file_03.dat
prefix_02/file_04.dat
```
Provide the AWS region and S3 bucket name via flags:
```bash
s3-keys.txt
```
If you just want to delete a few hundred or thousand keys with a particular prefix without previewing them, the AWS command-line interface provides a [recursive delete option](https://docs.aws.amazon.com/cli/latest/reference/s3/rm.html).
Request rate and concurrency
----------------------------
Through trial and error I have determined that the S3 Multi-Object Delete operation is governed by the [documented rate limit](https://docs.aws.amazon.com/AmazonS3/latest/dev/request-rate-perf-considerations.html) of 3500 DELETE requests per second (per bucket + prefix), with each object in a batch counting toward that limit. Short bursts above that rate are tolerated, but sustained deletes of 3500 or more objects per second will eventually result in throttled API responses.
I've also found that API response times degrade with high concurrency, which sometimes results in "internal error" responses. 12 concurrent requests provide a good balance between response time and total batch throughput.
With the default settings, the application will realistically delete about 3000 objects per second, or just under 11 million objects per hour (3000 × 3600 = 10.8 million). If you have hundreds of millions of objects to delete under different prefixes, it is best to split up the input keys by prefix and run multiple instances of the application.
API request costs
-----------------
DELETE requests are free for the 'S3 Standard' and 'S3 Intelligent-Tiering' storage classes. Pricing for other storage classes is listed on the [S3 request pricing page](https://aws.amazon.com/s3/pricing/#Request_pricing).
Roadmap
-------
January 2019:
- IAM role support
- Dockerize and build to DockerHub
- More metrics