{"id":28968353,"url":"https://github.com/widdix/s3-getobject-accelerator","last_synced_at":"2025-07-22T13:33:22.650Z","repository":{"id":147013620,"uuid":"616849131","full_name":"widdix/s3-getobject-accelerator","owner":"widdix","description":"Get large objects from S3 by using parallel byte-rangefetches/parts to improve performance.","archived":false,"fork":false,"pushed_at":"2025-06-03T18:03:44.000Z","size":271,"stargazers_count":17,"open_issues_count":2,"forks_count":0,"subscribers_count":1,"default_branch":"main","last_synced_at":"2025-06-24T09:07:33.438Z","etag":null,"topics":["aws","aws-nodejs","aws-s3"],"latest_commit_sha":null,"homepage":"","language":"JavaScript","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"mit","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/widdix.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null,"zenodo":null}},"created_at":"2023-03-21T07:58:41.000Z","updated_at":"2025-05-22T09:25:09.000Z","dependencies_parsed_at":"2024-02-07T12:53:24.917Z","dependency_job_id":"c69499ca-98de-44dd-9736-504eb6211baa","html_url":"https://github.com/widdix/s3-getobject-accelerator","commit_stats":null,"previous_names":[],"tags_count":35,"template":false,"template_full_name":null,"purl":"pkg:github/widdix/s3-getobject-accelerator","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/widdix%2Fs3-getobject-accelerator","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/widdix%2Fs3-getobject-accelerator/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/widdix%2Fs3-getobject-accelerator/releases","manifests_url":"https://
repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/widdix%2Fs3-getobject-accelerator/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/widdix","download_url":"https://codeload.github.com/widdix/s3-getobject-accelerator/tar.gz/refs/heads/main","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/widdix%2Fs3-getobject-accelerator/sbom","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":266506134,"owners_count":23940019,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","status":"online","status_checked_at":"2025-07-22T02:00:09.085Z","response_time":66,"last_error":null,"robots_txt_status":null,"robots_txt_updated_at":null,"robots_txt_url":"https://github.com/robots.txt","online":true,"can_crawl_api":true,"host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["aws","aws-nodejs","aws-s3"],"created_at":"2025-06-24T09:07:22.776Z","updated_at":"2025-07-22T13:33:22.612Z","avatar_url":"https://github.com/widdix.png","language":"JavaScript","funding_links":[],"categories":[],"sub_categories":[],"readme":"# S3 GetObject Accelerator\n\nGet large objects from S3 by using parallel byte-range fetches/parts without the AWS SDK to improve performance.\n\n\u003e We measured a throughput of 6.5 Gbit/s on an m5zn.6xlarge in eu-west-1 using this lib with these settings: `{concurrency: 64}`.\n\n## Installation\n\n```bash\nnpm i s3-getobject-accelerator\n```\n\n## Examples\n\n### Compact\n\n```js\nconst {createWriteStream} = require('node:fs');\nconst {pipeline} = require('node:stream');\nconst {download} = require('s3-getobject-accelerator');\n\npipeline(\n  
download({bucket: 'bucket', key: 'key', version: 'optional version'}, {partSizeInMegabytes: 8, concurrency: 4}).readStream(),\n  createWriteStream('/tmp/test'),\n  (err) =\u003e {\n    if (err) {\n      console.error('something went wrong', err);\n    } else {\n      console.log('done');\n    }\n  }\n);\n```\n\n### More verbose\n\nGet insights into the part downloads and write directly to a file, without a stream, if the object is smaller than 1 TiB:\n\n```js\nconst {download} = require('s3-getobject-accelerator');\n\nconst d = download({bucket: 'bucket', key: 'key', version: 'optional version'}, {partSizeInMegabytes: 8, concurrency: 4});\n\nd.on('part:downloading', ({partNo}) =\u003e {\n  console.log('start downloading part', partNo);\n});\nd.on('part:downloaded', ({partNo}) =\u003e {\n  console.log('part downloaded, write to disk next in correct order', partNo);\n});\nd.on('part:writing', ({partNo}) =\u003e {\n  console.log('start writing part to disk', partNo);\n});\nd.on('part:done', ({partNo}) =\u003e {\n  console.log('part written to disk', partNo);\n});\n\nd.meta((err, metadata) =\u003e {\n  if (err) {\n    console.error('something went wrong', err);\n  } else {\n    if (metadata.lengthInBytes \u003e 1024 * 1024 * 1024 * 1024) {\n      console.error('file is larger than 1 TiB');\n    } else {\n      d.file('/tmp/test', (err) =\u003e {\n        if (err) {\n          console.error('something went wrong', err);\n        } else {\n          console.log('done');\n        }\n      });\n    }\n  }\n});\n```\n\n## API\n\n### download(s3source, options)\n\n* `s3source` `\u003cObject\u003e`\n  * `bucket` `\u003cstring\u003e`\n  * `key` `\u003cstring\u003e`\n  * `version` `\u003cstring\u003e` (optional)\n* `options` `\u003cObject\u003e`\n  * `partSizeInMegabytes` `\u003cnumber\u003e` (optional, defaults to uploaded part size)\n  * `concurrency` `\u003cnumber\u003e`\n  * `requestTimeoutInMilliseconds` `\u003cnumber\u003e` Maximum time for a request to complete from start to finish 
(optional, defaults to 300,000, 0 := no timeout)\n  * `resolveTimeoutInMilliseconds` `\u003cnumber\u003e` Maximum time for a DNS query to resolve (optional, defaults to 3,000, 0 := no timeout)\n  * `connectionTimeoutInMilliseconds` `\u003cnumber\u003e` Maximum time for a socket to connect (optional, defaults to 3,000, 0 := no timeout)\n  * `readTimeoutInMilliseconds` `\u003cnumber\u003e` Maximum time to read the response body (optional, defaults to 300,000, 0 := no timeout)\n  * `dataTimeoutInMilliseconds` `\u003cnumber\u003e` Maximum time between two data events while reading the response body (optional, defaults to 3,000, 0 := no timeout)\n  * `writeTimeoutInMilliseconds` `\u003cnumber\u003e` Maximum time to write the request body (optional, defaults to 300,000, 0 := no timeout)\n  * `region` `\u003cstring\u003e` (optional, defaults to [see AWS credentials \u0026 region](#aws-region))\n  * `v2AwsSdkCredentials` `\u003cAWS.Credentials\u003e` (optional)\n  * `endpointHostname` `\u003cstring\u003e` (optional, defaults to ${bucket}.s3.${region}.amazonaws.com or s3.${region}.amazonaws.com if the bucket contains a dot)\n  * `agent` `\u003chttps.Agent\u003e` (optional)\n* Returns:\n  * `meta(cb)` `\u003cFunction\u003e` Get metadata before starting the download (downloads the first part and keeps the body in memory until download starts)\n    * `cb(err, metadata)` `\u003cFunction\u003e`\n      * `err` `\u003cError\u003e`\n      * `metadata` `\u003cObject\u003e`\n        * `lengthInBytes` `\u003cnumber\u003e`\n        * `parts` `\u003cnumber\u003e` Number of parts available (optional)\n  * `readStream()` `\u003cFunction\u003e` Start download\n    * Returns: [ReadStream](https://nodejs.org/api/stream.html#class-streamreadable)\n  * `file(path, cb)` `\u003cFunction\u003e` Start download\n    * `path` `\u003cstring\u003e`\n    * `cb(err)` `\u003cFunction\u003e`\n      * `err` `\u003cError\u003e`\n  * `abort([err])` `\u003cFunction\u003e` Abort download\n    * `err` 
`\u003cError\u003e`\n  * `partsDownloading()` `\u003cFunction\u003e` Number of parts downloading at the moment\n    * Returns `\u003cnumber\u003e`\n  * `addListener(eventName, listener)` See https://nodejs.org/api/events.html#emitteraddlistenereventname-listener\n  * `off(eventName, listener)` See https://nodejs.org/api/events.html#emitteroffeventname-listener\n  * `on(eventName, listener)` See https://nodejs.org/api/events.html#emitteroneventname-listener\n  * `once(eventName, listener)` See https://nodejs.org/api/events.html#emitteronceeventname-listener\n  * `removeListener(eventName, listener)` See https://nodejs.org/api/events.html#emitterremovelistenereventname-listener\n\n## AWS credentials\n\nAWS credentials are fetched in the following order:\n\n1. `options.v2AwsSdkCredentials`\n2. Environment variables\n  * `AWS_ACCESS_KEY_ID`\n  * `AWS_SECRET_ACCESS_KEY`\n  * `AWS_SESSION_TOKEN` (optional)\n3. IMDSv2\n\n## AWS region\n\nAWS region is fetched in the following order:\n\n1. `options.region`\n2. Environment variable `AWS_REGION`\n3. IMDSv2\n\n## Considerations\n\n* Typical values for `partSizeInMegabytes` are 8 MB or 16 MB. If objects are uploaded using a multipart upload, it’s a good practice to download them in the same part sizes (do not specify `partSizeInMegabytes`), or at least aligned to part boundaries, for best performance (see https://docs.aws.amazon.com/whitepapers/latest/s3-optimizing-performance-best-practices/use-byte-range-fetches.html).\n* Keep in mind that you pay per GET request to Amazon S3. The smaller the parts, the more requests a download needs and the more expensive it is.\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fwiddix%2Fs3-getobject-accelerator","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fwiddix%2Fs3-getobject-accelerator","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fwiddix%2Fs3-getobject-accelerator/lists"}