# High Level Amazon S3 Client

Forked from [andrewrk/node-s3-client](https://github.com/andrewrk/node-s3-client).

## Installation

`npm install @auth0/s3 --save`

## Features

* Automatically retries a configurable number of times when S3 returns an error.
* Makes multiple requests transparently when a listing or delete operation
  exceeds S3's 1,000-object-per-request limit.
* Lets you cap the maximum number of parallel S3 requests.
  Retries get pushed to the end of the parallelization queue.
* Syncs a directory to and from S3.
* Progress reporting.
* Supports files of any size (up to S3's maximum 5 TB object size limit).
* Uploads large files quickly using parallel multipart uploads.
* Uses heuristics to compute multipart ETags client-side to avoid uploading
  or downloading files unnecessarily.
* Automatically provides Content-Type for uploads based on file extension.
* Supports third-party S3-compatible services such as Ceph.
See also the companion CLI tool which is meant to be a drop-in replacement for
s3cmd: [s3-cli](https://github.com/andrewrk/node-s3-cli).

## Synopsis

### Create a client

```js
var s3 = require('s3');

var client = s3.createClient({
  maxAsyncS3: 20,     // this is the default
  s3RetryCount: 3,    // this is the default
  s3RetryDelay: 1000, // this is the default
  multipartUploadThreshold: 20971520, // this is the default (20 MB)
  multipartUploadSize: 15728640,      // this is the default (15 MB)
  s3Options: {
    accessKeyId: "your s3 key",
    secretAccessKey: "your s3 secret",
    region: "your region",
    // endpoint: 's3.yourdomain.com',
    // sslEnabled: false,
    // any other options are passed to new AWS.S3()
    // See: http://docs.aws.amazon.com/AWSJavaScriptSDK/latest/AWS/Config.html#constructor-property
  },
});
```

### Create a client from existing AWS.S3 object

```js
var s3 = require('s3');
var AWS = s3.AWS; // reference to the aws-sdk module (see `s3.AWS` below)
var awsS3Client = new AWS.S3(s3Options);
var options = {
  s3Client: awsS3Client,
  // more options available. See API docs below.
};
var client = s3.createClient(options);
```

### Upload a file to S3

```js
var params = {
  localFile: "some/local/file",

  s3Params: {
    Bucket: "s3 bucket name",
    Key: "some/remote/file",
    // other options supported by putObject, except Body and ContentLength.
    // See: http://docs.aws.amazon.com/AWSJavaScriptSDK/latest/AWS/S3.html#putObject-property
  },
};
var uploader = client.uploadFile(params);
uploader.on('error', function(err) {
  console.error("unable to upload:", err.stack);
});
uploader.on('progress', function() {
  console.log("progress", uploader.progressMd5Amount,
      uploader.progressAmount, uploader.progressTotal);
});
uploader.on('end', function() {
  console.log("done uploading");
});
```

### Download a file from S3

```js
var params = {
  localFile: "some/local/file",

  s3Params: {
    Bucket: "s3 bucket name",
    Key: "some/remote/file",
    // other options supported by getObject
    // See: http://docs.aws.amazon.com/AWSJavaScriptSDK/latest/AWS/S3.html#getObject-property
  },
};
var downloader = client.downloadFile(params);
downloader.on('error', function(err) {
  console.error("unable to download:", err.stack);
});
downloader.on('progress', function() {
  console.log("progress", downloader.progressAmount, downloader.progressTotal);
});
downloader.on('end', function() {
  console.log("done downloading");
});
```

### Sync a directory to S3

```js
var params = {
  localDir: "some/local/dir",
  deleteRemoved: true, // default false, whether to remove s3 objects
                       // that have no corresponding local file.

  s3Params: {
    Bucket: "s3 bucket name",
    Prefix: "some/remote/dir/",
    // other options supported by putObject, except Body and ContentLength.
    // See: http://docs.aws.amazon.com/AWSJavaScriptSDK/latest/AWS/S3.html#putObject-property
  },
};
var uploader = client.uploadDir(params);
uploader.on('error', function(err) {
  console.error("unable to sync:", err.stack);
});
uploader.on('progress', function() {
  console.log("progress", uploader.progressAmount, uploader.progressTotal);
});
uploader.on('end', function() {
  console.log("done uploading");
});
```

## Tips

* Consider increasing the socket pool size in the `http` and `https` global
  agents. This will improve bandwidth when using the `uploadDir` and
  `downloadDir` functions. For example:

```js
var http = require('http');
var https = require('https');

http.globalAgent.maxSockets = https.globalAgent.maxSockets = 20;
```

## API Documentation

### s3.AWS

This contains a reference to the aws-sdk module. It is a valid use case to use
both this module and the lower level aws-sdk module in tandem.

### s3.createClient(options)

Creates an S3 client.

`options`:

* `s3Client` - optional, an instance of `AWS.S3`. Leave blank if you provide `s3Options`.
* `s3Options` - optional. Leave blank if you provide `s3Client`.
  - See the AWS SDK documentation for the available options, which are passed to `new AWS.S3()`:
    http://docs.aws.amazon.com/AWSJavaScriptSDK/latest/AWS/Config.html#constructor-property
* `maxAsyncS3` - maximum number of simultaneous requests this client will
  ever have open to S3. Defaults to `20`.
* `s3RetryCount` - how many times to try an S3 operation before giving up.
  Defaults to `3`.
* `s3RetryDelay` - how many milliseconds to wait before retrying an S3
  operation. Defaults to `1000`.
* `multipartUploadThreshold` - if a file is this many bytes or greater, it
  will be uploaded via a multipart request. Defaults to 20 MB. Minimum is 5 MB.
  Maximum is 5 GB.
* `multipartUploadSize` - when uploading via multipart, this is the part size.
  The minimum size is 5 MB and the maximum is 5 GB. Defaults to 15 MB. Note
  that S3 allows at most 10,000 parts per multipart upload, so if this value
  is too small, it is ignored in favor of the minimum part size necessary to
  upload the file (see the sketch below).
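
To get a sense of when that override kicks in, here is a back-of-the-envelope
sketch of the arithmetic. The numbers are hypothetical and the exact rounding
the client applies internally may differ slightly:

```js
// Hypothetical sketch: estimating the effective part size for a large file.
// A 200 GB file at the default 15 MB part size would need ~13,654 parts,
// which exceeds S3's 10,000-part limit, so the part size must grow.
var fileSize = 200 * 1024 * 1024 * 1024; // 200 GB in bytes
var configuredPartSize = 15728640;       // 15 MB (the default)
var maxParts = 10000;                    // S3's per-upload part limit

var effectivePartSize = Math.max(
    configuredPartSize,
    Math.ceil(fileSize / maxParts));

console.log(effectivePartSize); // 21474837 bytes, roughly 20.5 MB
```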

### s3.getPublicUrl(bucket, key, [bucketLocation])

* `bucket` S3 bucket
* `key` S3 key
* `bucketLocation` string, one of these:
  - "" (default) - US Standard
  - "eu-west-1"
  - "us-west-1"
  - "us-west-2"
  - "ap-southeast-1"
  - "ap-southeast-2"
  - "ap-northeast-1"
  - "sa-east-1"

You can find out your bucket location programmatically by using this API:
http://docs.aws.amazon.com/AWSJavaScriptSDK/latest/AWS/S3.html#getBucketLocation-property

Returns a string which looks like this:

`https://s3.amazonaws.com/bucket/key`

or maybe this if you are not in US Standard:

`https://s3-eu-west-1.amazonaws.com/bucket/key`

### s3.getPublicUrlHttp(bucket, key)

* `bucket` S3 Bucket
* `key` S3 Key

Works for any region, and returns a string which looks like this:

`http://bucket.s3.amazonaws.com/key`
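
For example, based on the URL formats documented above (bucket name and key
are placeholders):

```js
var s3 = require('s3');

// HTTPS URL, region-aware:
s3.getPublicUrl("my-bucket", "path/to/file.txt", "eu-west-1");
// => "https://s3-eu-west-1.amazonaws.com/my-bucket/path/to/file.txt"

// HTTP URL, works for any region:
s3.getPublicUrlHttp("my-bucket", "path/to/file.txt");
// => "http://my-bucket.s3.amazonaws.com/path/to/file.txt"
```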

### client.uploadFile(params)

See http://docs.aws.amazon.com/AWSJavaScriptSDK/latest/AWS/S3.html#putObject-property

`params`:

* `s3Params`: params to pass to AWS SDK `putObject`.
* `localFile`: path to the file on disk you want to upload to S3.
* (optional) `defaultContentType`: Unless you explicitly set the `ContentType`
parameter in `s3Params`, it will be automatically set for you based on the
file extension of `localFile`. If the extension is unrecognized,
`defaultContentType` will be used instead. Defaults to
`application/octet-stream`.

The difference between using AWS SDK `putObject` and this one:

* This works with files, not streams or buffers.
* If the reported MD5 upon upload completion does not match, it retries.
* If the file size is large enough, uses multipart upload to upload parts in
parallel.
* Retry based on the client's retry settings.
* Progress reporting.
* Sets the `ContentType` based on file extension if you do not provide it.

Returns an `EventEmitter` with these properties:

* `progressMd5Amount`
* `progressAmount`
* `progressTotal`

And these events:

* `'error' (err)`
* `'end' (data)` - emitted when the file is uploaded successfully
  - `data` is the same object that you get from `putObject` in AWS SDK
* `'progress'` - emitted when `progressMd5Amount`, `progressAmount`, and
`progressTotal` properties change. Note that it is possible for progress to
go backwards when an upload fails and must be retried.
* `'fileOpened' (fdSlicer)` - emitted when `localFile` has been opened. The file
is opened with the [fd-slicer](https://github.com/andrewrk/node-fd-slicer)
module because we might need to read from multiple locations in the file at
the same time. `fdSlicer` is an object for which you can call
`createReadStream(options)`. See the fd-slicer README for more information.
* `'fileClosed'` - emitted when `localFile` has been closed.

And these methods:

* `abort()` - call this to stop the upload.

### client.downloadFile(params)

See http://docs.aws.amazon.com/AWSJavaScriptSDK/latest/AWS/S3.html#getObject-property

`params`:

* `localFile` - the destination path on disk to write the s3 object into
* `s3Params`: params to pass to AWS SDK `getObject`.

The difference between using AWS SDK `getObject` and this one:

* This works with a destination file, not a stream or a buffer.
* If the reported MD5 upon download completion does not match, it retries.
* Retry based on the client's retry settings.
* Progress reporting.

Returns an `EventEmitter` with these properties:

* `progressAmount`
* `progressTotal`

And these events:

* `'error' (err)`
* `'end'` - emitted when the file is downloaded successfully
* `'progress'` - emitted when `progressAmount` and `progressTotal`
properties change.

### client.downloadBuffer(s3Params)

See http://docs.aws.amazon.com/AWSJavaScriptSDK/latest/AWS/S3.html#getObject-property

* `s3Params`: params to pass to AWS SDK `getObject`.

The difference between using AWS SDK `getObject` and this one:

* This works with a buffer only.
* If the reported MD5 upon download completion does not match, it retries.
* Retry based on the client's retry settings.
* Progress reporting.

Returns an `EventEmitter` with these properties:

* `progressAmount`
* `progressTotal`

And these events:

* `'error' (err)`
* `'end' (buffer)` - emitted when the file is downloaded successfully.
`buffer` is a `Buffer` containing the object data.
* `'progress'` - emitted when `progressAmount` and `progressTotal`
properties change.
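
A minimal usage sketch, based on the events and properties documented above
(bucket and key are placeholders):

```js
var downloader = client.downloadBuffer({
  Bucket: "s3 bucket name",
  Key: "some/remote/file",
});
downloader.on('error', function(err) {
  console.error("unable to download:", err.stack);
});
downloader.on('progress', function() {
  console.log("progress", downloader.progressAmount, downloader.progressTotal);
});
downloader.on('end', function(buffer) {
  console.log("downloaded", buffer.length, "bytes");
});
```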

### client.downloadStream(s3Params)

See http://docs.aws.amazon.com/AWSJavaScriptSDK/latest/AWS/S3.html#getObject-property

* `s3Params`: params to pass to AWS SDK `getObject`.

The difference between using AWS SDK `getObject` and this one:

* This works with a stream only.

If you want retries, progress, or MD5 checking, you must code it yourself.

Returns a `ReadableStream` with these additional events:

* `'httpHeaders' (statusCode, headers)` - contains the HTTP response
headers and status code.
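
A minimal usage sketch, piping the stream to a local file (paths are
placeholders):

```js
var fs = require('fs');

var stream = client.downloadStream({
  Bucket: "s3 bucket name",
  Key: "some/remote/file",
});
stream.on('httpHeaders', function(statusCode, headers) {
  console.log("response status:", statusCode);
});
stream.on('error', function(err) {
  console.error("unable to download:", err.stack);
});
stream.pipe(fs.createWriteStream("some/local/file"));
```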

### client.listObjects(params)

See http://docs.aws.amazon.com/AWSJavaScriptSDK/latest/AWS/S3.html#listObjects-property

`params`:

* `s3Params` - params to pass to AWS SDK `listObjects`.
* (optional) `recursive` - whether to recurse into directories.
  Default `false`.

Note that if you set `Delimiter` in `s3Params` then you will get a list of
objects and folders in the directory you specify. You probably do not want to
set `recursive` to `true` at the same time as specifying a `Delimiter` because
this will cause a request per directory. If you want all objects that share a
prefix, leave the `Delimiter` option `null` or `undefined`.

Be sure that `s3Params.Prefix` ends with a trailing slash (`/`) unless you
are requesting the top-level listing, in which case `s3Params.Prefix` should
be an empty string.

The difference between using AWS SDK `listObjects` and this one:

* Retries based on the client's retry settings.
* Supports recursive directory listing.
* Makes multiple requests if the number of objects to list is greater than 1000.

Returns an `EventEmitter` with these properties:

* `progressAmount`
* `objectsFound`
* `dirsFound`

And these events:

* `'error' (err)`
* `'end'` - emitted when done listing and no more 'data' events will be emitted.
* `'data' (data)` - emitted when a batch of objects are found. This is
the same as the `data` object in AWS SDK.
* `'progress'` - emitted when `progressAmount`, `objectsFound`, and
`dirsFound` properties change.

And these methods:

* `abort()` - call this to stop the find operation.
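
A minimal usage sketch, based on the events documented above (bucket and
prefix are placeholders):

```js
var lister = client.listObjects({
  recursive: true,
  s3Params: {
    Bucket: "s3 bucket name",
    Prefix: "some/remote/dir/",
  },
});
lister.on('error', function(err) {
  console.error("unable to list:", err.stack);
});
lister.on('data', function(data) {
  // `data` has the same shape as the AWS SDK listObjects response.
  (data.Contents || []).forEach(function(object) {
    console.log(object.Key, object.Size);
  });
});
lister.on('end', function() {
  console.log("found", lister.objectsFound, "objects");
});
```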

### client.deleteObjects(s3Params)

See http://docs.aws.amazon.com/AWSJavaScriptSDK/latest/AWS/S3.html#deleteObjects-property

`s3Params` are the same.

The difference between using AWS SDK `deleteObjects` and this one:

* Retry based on the client's retry settings.
* Make multiple requests if the number of objects you want to delete is
greater than 1000.

Returns an `EventEmitter` with these properties:

* `progressAmount`
* `progressTotal`

And these events:

* `'error' (err)`
* `'end'` - emitted when all objects are deleted.
* `'progress'` - emitted when the `progressAmount` or `progressTotal` properties change.
* `'data' (data)` - emitted when a delete request completes; more `data` events may follow.
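
A minimal usage sketch (bucket and keys are placeholders):

```js
var deleter = client.deleteObjects({
  Bucket: "s3 bucket name",
  Delete: {
    Objects: [
      { Key: "some/remote/file" },
      { Key: "another/remote/file" },
    ],
  },
});
deleter.on('error', function(err) {
  console.error("unable to delete:", err.stack);
});
deleter.on('end', function() {
  console.log("done deleting");
});
```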

### client.uploadDir(params)

Syncs an entire directory to S3.

`params`:

* `localDir` - source path on local file system to sync to S3
* `s3Params`
- `Prefix` (required)
- `Bucket` (required)
* (optional) `deleteRemoved` - delete s3 objects with no corresponding local file.
  Default `false`.
* (optional) `getS3Params` - function which will be called for every file that
needs to be uploaded. You can use this to skip some files. See below.
* (optional) `defaultContentType`: Unless you explicitly set the `ContentType`
  parameter in `s3Params`, it will be automatically set for you based on each
  uploaded file's extension. If the extension is unrecognized,
  `defaultContentType` will be used instead. Defaults to
  `application/octet-stream`.
* (optional) `followSymlinks` - Set this to `false` to ignore symlinks.
Defaults to `true`.

```js
function getS3Params(localFile, stat, callback) {
  // call callback like this:
  var err = new Error(...); // only if there is an error
  var s3Params = { // if there is no error
    ContentType: getMimeType(localFile), // just an example
  };
  // pass `null` for `s3Params` if you want to skip uploading this file.
  callback(err, s3Params);
}
```

Returns an `EventEmitter` with these properties:

* `progressAmount`
* `progressTotal`
* `progressMd5Amount`
* `progressMd5Total`
* `deleteAmount`
* `deleteTotal`
* `filesFound`
* `objectsFound`
* `doneFindingFiles`
* `doneFindingObjects`
* `doneMd5`

And these events:

* `'error' (err)`
* `'end'` - emitted when all files are uploaded
* `'progress'` - emitted when any of the above progress properties change.
* `'fileUploadStart' (localFilePath, s3Key)` - emitted when a file begins
uploading.
* `'fileUploadEnd' (localFilePath, s3Key)` - emitted when a file successfully
finishes uploading.

`uploadDir` works like this:

0. Start listing all S3 objects for the target `Prefix`. S3 guarantees
returned objects to be in sorted order.
0. Meanwhile, recursively find all files in `localDir`.
0. Once all local files are found, we sort them (the same way that S3 sorts).
0. Next we iterate over the sorted local file list one at a time, computing
MD5 sums.
0. Now S3 object listing and MD5 sum computing are happening in parallel. As
each operation progresses we compare both sorted lists side-by-side,
iterating over them one at a time, uploading files whose MD5 sums don't
match the remote object (or the remote object is missing), and, if
`deleteRemoved` is set, deleting remote objects whose corresponding local
files are missing.

### client.downloadDir(params)

Syncs an entire directory from S3.

`params`:

* `localDir` - destination directory on local file system to sync to
* `s3Params`
  - `Prefix` (required)
  - `Bucket` (required)
* (optional) `deleteRemoved` - delete local files with no corresponding s3 object. default `false`
* (optional) `getS3Params` - function which will be called for every object that
needs to be downloaded. You can use this to skip downloading some objects.
See below.
* (optional) `followSymlinks` - Set this to `false` to ignore symlinks.
Defaults to `true`.

```js
function getS3Params(localFile, s3Object, callback) {
  // localFile is the destination path where the object will be written to
  // s3Object is same as one element in the `Contents` array from here:
  // http://docs.aws.amazon.com/AWSJavaScriptSDK/latest/AWS/S3.html#listObjects-property

  // call callback like this:
  var err = new Error(...); // only if there is an error
  var s3Params = { // if there is no error
    VersionId: "abcd", // just an example
  };
  // pass `null` for `s3Params` if you want to skip downloading this object.
  callback(err, s3Params);
}
```

Returns an `EventEmitter` with these properties:

* `progressAmount`
* `progressTotal`
* `progressMd5Amount`
* `progressMd5Total`
* `deleteAmount`
* `deleteTotal`
* `filesFound`
* `objectsFound`
* `doneFindingFiles`
* `doneFindingObjects`
* `doneMd5`

And these events:

* `'error' (err)`
* `'end'` - emitted when all files are downloaded
* `'progress'` - emitted when any of the progress properties above change
* `'fileDownloadStart' (localFilePath, s3Key)` - emitted when a file begins
downloading.
* `'fileDownloadEnd' (localFilePath, s3Key)` - emitted when a file successfully
finishes downloading.

`downloadDir` works like this:

0. Start listing all S3 objects for the target `Prefix`. S3 guarantees
returned objects to be in sorted order.
0. Meanwhile, recursively find all files in `localDir`.
0. Once all local files are found, we sort them (the same way that S3 sorts).
0. Next we iterate over the sorted local file list one at a time, computing
MD5 sums.
0. Now S3 object listing and MD5 sum computing are happening in parallel. As
each operation progresses we compare both sorted lists side-by-side,
iterating over them one at a time, downloading objects whose MD5 sums don't
match the local file (or the local file is missing), and, if
`deleteRemoved` is set, deleting local files whose corresponding objects
are missing.
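
A minimal usage sketch, mirroring the `uploadDir` example in the synopsis
(paths are placeholders):

```js
var downloader = client.downloadDir({
  localDir: "some/local/dir",
  deleteRemoved: true, // default false
  s3Params: {
    Bucket: "s3 bucket name",
    Prefix: "some/remote/dir/",
  },
});
downloader.on('error', function(err) {
  console.error("unable to sync:", err.stack);
});
downloader.on('progress', function() {
  console.log("progress", downloader.progressAmount, downloader.progressTotal);
});
downloader.on('end', function() {
  console.log("done downloading");
});
```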

### client.deleteDir(s3Params)

Deletes an entire directory on S3.

`s3Params`:

* `Bucket`
* `Prefix`
* (optional) `MFA`

Returns an `EventEmitter` with these properties:

* `progressAmount`
* `progressTotal`

And these events:

* `'error' (err)`
* `'end'` - emitted when all objects are deleted.
* `'progress'` - emitted when the `progressAmount` or `progressTotal` properties change.

`deleteDir` works like this:

0. Start listing all objects in a bucket recursively. S3 returns 1000 objects
per response.
0. For each response that comes back with a list of objects in the bucket,
immediately send a delete request for all of them.
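
A minimal usage sketch (bucket and prefix are placeholders):

```js
var deleter = client.deleteDir({
  Bucket: "s3 bucket name",
  Prefix: "some/remote/dir/",
});
deleter.on('error', function(err) {
  console.error("unable to delete:", err.stack);
});
deleter.on('progress', function() {
  console.log("progress", deleter.progressAmount, deleter.progressTotal);
});
deleter.on('end', function() {
  console.log("done deleting");
});
```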

### client.copyObject(s3Params)

See http://docs.aws.amazon.com/AWSJavaScriptSDK/latest/AWS/S3.html#copyObject-property

`s3Params` are the same. Don't forget that `CopySource` must contain the
source bucket name as well as the source key name.

The difference between using AWS SDK `copyObject` and this one:

* Retry based on the client's retry settings.

Returns an `EventEmitter` with these events:

* `'error' (err)`
* `'end' (data)`
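
A minimal usage sketch (bucket and key names are placeholders):

```js
var copier = client.copyObject({
  Bucket: "destination bucket",
  Key: "destination/key",
  CopySource: "source-bucket/source/key", // source bucket AND key
});
copier.on('error', function(err) {
  console.error("unable to copy:", err.stack);
});
copier.on('end', function(data) {
  console.log("done copying");
});
```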

### client.moveObject(s3Params)

See http://docs.aws.amazon.com/AWSJavaScriptSDK/latest/AWS/S3.html#copyObject-property

`s3Params` are the same. Don't forget that `CopySource` must contain the
source bucket name as well as the source key name.

Under the hood, this uses `copyObject` and then `deleteObjects` only if the
copy succeeded.

Returns an `EventEmitter` with these events:

* `'error' (err)`
* `'copySuccess' (data)`
* `'end' (data)`
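
A minimal usage sketch (bucket and key names are placeholders):

```js
var mover = client.moveObject({
  Bucket: "destination bucket",
  Key: "destination/key",
  CopySource: "source-bucket/source/key", // source bucket AND key
});
mover.on('error', function(err) {
  console.error("unable to move:", err.stack);
});
mover.on('copySuccess', function(data) {
  console.log("copy succeeded; source object will be deleted");
});
mover.on('end', function(data) {
  console.log("done moving");
});
```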

## Examples

### Check if a file exists in S3

Using the AWS SDK, you can send a HEAD request, which will tell you if a file exists at `Key`.

See http://docs.aws.amazon.com/AWSJavaScriptSDK/latest/AWS/S3.html#headObject-property

```js
var client = require('s3').createClient({ /* options */ });
client.s3.headObject({
  Bucket: 's3 bucket name',
  Key: 'some/remote/file'
}, function(err, data) {
  if (err) {
    // file does not exist (err.statusCode === 404)
    return;
  }
  // file exists
});
```

## Testing

`aws-vault exec -- S3_KEY= npm test`

The tests upload and download large amounts of data to and from S3. The test
timeout is set to 40 seconds because Internet connectivity varies wildly.