https://github.com/badsyntax/s3-etag
Generate an accurate S3 ETAG in Node.js for any file (including multipart)
- Host: GitHub
- URL: https://github.com/badsyntax/s3-etag
- Owner: badsyntax
- License: mit
- Created: 2021-12-16T13:54:43.000Z (about 4 years ago)
- Default Branch: master
- Last Pushed: 2024-05-22T10:15:42.000Z (over 1 year ago)
- Last Synced: 2024-05-22T11:30:01.270Z (over 1 year ago)
- Topics: aws, etag, hash, md5, s3
- Language: TypeScript
- Homepage: https://www.npmjs.com/package/s3-etag
- Size: 194 KB
- Stars: 4
- Watchers: 3
- Forks: 2
- Open Issues: 2
Metadata Files:
- Readme: README.md
- License: LICENSE.md
- Codeowners: CODEOWNERS
# S3 ETAG
Generate an accurate S3 ETAG in Node.js for any file (including multipart).
Please note: this only works for unencrypted buckets.
## Installation
```console
npm install s3-etag
```
## Usage
```ts
import { generateETag } from 's3-etag';
const absoluteFilePath = '/path/to/file'; // placeholder: absolute path to a local file
// Simple MD5 hash of contents for non-multipart files
const etag = generateETag(absoluteFilePath);
// MD5 hash of combined part hashes, suffixed with the part count (see below), for multipart files
const partSizeInBytes = 10 * 1024 * 1024; // 10 MiB
const multipartETag = generateETag(absoluteFilePath, partSizeInBytes);
```
## How It Works
This is a Node.js implementation of [this algorithm](https://stackoverflow.com/a/19896823/492325).
At a high level:
- If no `partSizeInBytes` is specified, return the MD5 hash of the file contents.
- If `partSizeInBytes` is specified:
  - Work out the parts by dividing the file size by `partSizeInBytes`.
  - Read each part from the file, MD5 hash the part, and append that digest to a combined buffer of part digests.
  - Once all parts are processed, generate a new MD5 hash from the combined digests and suffix it with `-` and the number of parts.
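The steps above can be sketched in a few lines of TypeScript. This is a minimal illustration, not the package's actual source: `sketchETag` and `sketchETagForFile` are hypothetical names, and the file variant loads the whole file into memory for brevity rather than reading it part by part.

```ts
import { createHash } from 'node:crypto';
import { readFileSync } from 'node:fs';

// Raw (binary) MD5 digest of a buffer.
function md5(data: Buffer): Buffer {
  return createHash('md5').update(data).digest();
}

// Hypothetical helper: ETag-style hash for in-memory contents.
function sketchETag(contents: Buffer, partSizeInBytes = 0): string {
  // No part size: plain MD5 of the whole contents, hex encoded.
  if (partSizeInBytes <= 0) {
    return md5(contents).toString('hex');
  }

  // Hash each part separately and keep the binary digests.
  const partDigests: Buffer[] = [];
  for (let offset = 0; offset < contents.length; offset += partSizeInBytes) {
    // subarray clamps at the end, so the last part may be shorter.
    partDigests.push(md5(contents.subarray(offset, offset + partSizeInBytes)));
  }

  // MD5 of the concatenated binary digests, suffixed with the part count.
  const combined = md5(Buffer.concat(partDigests)).toString('hex');
  return `${combined}-${partDigests.length}`;
}

// Hypothetical convenience wrapper; reads the entire file into memory.
function sketchETagForFile(filePath: string, partSizeInBytes = 0): string {
  return sketchETag(readFileSync(filePath), partSizeInBytes);
}
```

Note that the part digests are concatenated as raw bytes, not as hex strings, before the final hash. Empty input with a part size is not handled specially in this sketch.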
If `partSizeInBytes` is unknown, you can find it with the AWS CLI:
1. Use `head-object` to retrieve the object's metadata with the following command, where `raw-files` is the bucket name and `IMG2345.CR2` is the key:
`aws s3api head-object --bucket raw-files --key IMG2345.CR2`
This returns:
```
{
  "AcceptRanges": "bytes",
  "LastModified": "2024-05-22T08:45:10+00:00",
  "ContentLength": 28333464,
  "ETag": "\"85ae33db28930d3afe594da14cd190bb-2\"",
  "VersionId": "28P4UkX5sCO.8vbyMojvecHndkHDwf",
  "ContentType": "binary/octet-stream",
  "ServerSideEncryption": "AES256",
  "Metadata": {}
}
```
2. Look for a `-n` suffix at the end of the `ETag`, where `n` is a number >= 2 representing the number of parts/chunks. For single-part objects, the `ETag` is simply the MD5 of the object.
3. Use `aws s3api head-object --bucket raw-files --key IMG2345.CR2 --part-number 1` to get the metadata of the first part/chunk. This returns:
```
{
  "AcceptRanges": "bytes",
  "LastModified": "2024-05-22T08:45:10+00:00",
  "ContentLength": 17179870,
  "ETag": "\"85ae33db28930d3afe594da14cd190bb-2\"",
  "VersionId": "28P4UkX5sCO.8vbyMojvecHndkHDwf",
  "ContentType": "binary/octet-stream",
  "ServerSideEncryption": "AES256",
  "Metadata": {},
  "PartsCount": 2
}
```
4. Use the `ContentLength` value of the first part as `partSizeInBytes`.
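As a sanity check on the values found this way, the part count in the `ETag` suffix should agree with the object size divided by the part size, rounded up. The sketch below uses two hypothetical helpers (`parseS3ETag` and `expectedPartCount`, neither of which is part of `s3-etag`) and assumes every part except the last has the same size.

```ts
interface ParsedETag {
  hash: string;
  // null when the ETag has no "-n" suffix (single-part object)
  partCount: number | null;
}

// Hypothetical helper: strips the surrounding quotes that S3 includes in the
// ETag value and splits off the optional "-n" part-count suffix.
function parseS3ETag(raw: string): ParsedETag {
  const value = raw.replace(/^"|"$/g, '');
  const match = /^([0-9a-f]{32})(?:-(\d+))?$/i.exec(value);
  if (!match) {
    throw new Error(`Unrecognised ETag: ${raw}`);
  }
  const [, hash = '', suffix] = match;
  return { hash, partCount: suffix ? Number(suffix) : null };
}

// Hypothetical helper: number of parts implied by a total size and part size.
function expectedPartCount(totalBytes: number, partSizeInBytes: number): number {
  return Math.ceil(totalBytes / partSizeInBytes);
}

// Values taken from the two head-object responses above.
const parsed = parseS3ETag('"85ae33db28930d3afe594da14cd190bb-2"');
const parts = expectedPartCount(28333464, 17179870);
console.log(parsed.partCount === parts); // true: both are 2
```

If the two counts disagree, the part size taken from the first part is probably not the one used for the whole upload, and the locally generated hash is unlikely to match.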
## License
See [LICENSE.md](./LICENSE.md).