https://github.com/scality/s3blaster

S3 Blaster
https://github.com/scality/s3blaster
Last synced: 4 months ago
JSON representation
S3 Blaster
Host: GitHub
URL: https://github.com/scality/s3blaster
Owner: scality
License: apache-2.0
Created: 2016-06-01T23:38:00.000Z (about 10 years ago)
Default Branch: development/8.0
Last Pushed: 2022-07-22T03:18:32.000Z (almost 4 years ago)
Last Synced: 2025-04-22T01:38:37.725Z (about 1 year ago)
Language: JavaScript
Size: 224 KB
Stars: 10
Watchers: 58
Forks: 2
Open Issues: 3
Metadata Files:
- Readme: README.md
- License: LICENSE
Awesome Lists containing this project

README

          # S3Blaster program

S3Blaster program measures performance of S3 servers. For a given test scenario,

S3Blaster emits requests to S3 servers. By collecting responses, S3Blaster

computes different metrics to measure performance of S3.

## Performance metrics

S3Blaster computes two performance metrics:

* Latency of a request

    Latency of a request is the interval between the time the request is sent to

S3 and the time S3Blaster receives the request's response. S3Blaster

computes different statistics on request latencies: average, standard-deviation,

percentiles.

* Number of operations per second

    This is the number of responses that S3Blaster receives per second.

S3Blaster computes average number of operations/s.

## Working schema

![Schema](./img/s3blaster.png)

Figure 1: Working schema of S3Blaster program

A working schema of S3Blaster is shown in Figure 1. Note, if there is no fork

operation (clusters), S3Blaster workers in the schema are replaced by the

S3Blaster Master. Roles of each component are described below.

### S3Blaster Master

Besides forking workers (if relevant), S3Blaster Master does following jobs:

* Monitoring resources' consuming (CPU & Memory usage) of running processes on

    all servers.

* Gathering statistics results computed from all workers.

* Forwarding obtained results to `Plotter` that plots appropriated graphs.

### S3Blaster Worker

* Based on programmed scenarios, each worker sends requests to S3 Servers and

    then computes performance metrics based on responses received from the

    servers.

* Forwarding obtained results to `Plotter` that plots appropriated graphs.

### Plotter

The component displays performance metrics and monitored resources' consuming

on graphs.

## Installation

Simple as `npm install`

### Plot dependencies

Output graphs would be created using `gnuplot` program. To install it on Ubuntu

```

sudo apt-get install libcairo2-dev

sudo apt-get install libpango1.0-dev

sudo apt-get install gnuplot

```

For other OS, see Cairo (https://cairographics.org/download/) and

gnuplot (http://www.gnuplot.info/download.html)

## Program structure

The program consists of three main files:

* `s3blaster.js`

    is the **core** of S3Blaster. S3Blaster master does the following jobs:

    * Forking S3Blaster workers (if relevant)

    * Gathering statistics results computed from all workers

    * Storing statistics in files that are used for plotting graphs. There are

    2 files `.txt` containing statistics whose suffix names are

    *_ssm* (for system monitoring) and *_summary* (for summary results).

    Whereas, S3Blaster worker does the following jobs:

    * Sending/receiving requests to/from S3

    * Calculating statistics on request latency, number of operations/s

    * Storing statistics in files that are used for plotting graphs. There are

    5 files `.txt` containing statistics whose suffix names are

    *_live*, *_stats*, *_thread*, *_pdf_func*, *_cdf_func*.

* `config.js`

    get all input parameters for S3Blaster. All input parameters for s3blaster

    can be assigned via the following ways with a decreasing priority

    level, i.e. parameter will get value given by the highest priority one:

    - command line

    - a JSON file defined by `--config` command, `config.json` by default

    - initialized in `config.js` file

    For credential keys `accessKeyId` and `secretAccessKey`, they can be

    assigned by the following ways with a decreasing priority level:

    - command line:

    ```

    --accessKeyId = your_access_key_id

    --secretAccessKey = your_secret_access_key

    ```

    - config JSON file, in "acc" sub-object:

    ```

    "accessKeyId": "your_access_key_id"

    "secretAccessKey": "your_secret_access_key"

    ```

    - environment variable:

    ```

    ACCESSKEYID=your_access_key_id

    SECRETACCESSKEY=your_secret_access_key

    ```

    - credential file defined by `--accFile`, default file `~/.s3cfg`

    ```

    access_key = your_access_key_id

    secret_key = your_secret_access_key

    ```

* `plotter.js`

    plots graphs showing statistics measured by S3Blaster. Plotter generates

    two graphs for *_ssm* and *_summary* files sent from S3Blaster Master.

    For each S3Blaster worker, Plotter could generate up to 5 graph files

    whose suffix names are *_stats*, *_thread*, *_size*, *_pdf*, *_cdf*.

An additional file `runS3Blaster.js` allows executing S3Blaster with input

parameters stored in an object that is used for pre-programmed tests (see

    [Pre-programmed tests](#pre-programmed-tests))

Besides, monitoring of resources' consuming of processes on servers and

monintoring storage size of Metadata store are implemented in files in `ssm`

folder.

## How to run S3Blaster

S3Blaster is executed with default input parameters with: `npm start`

By default, output results are stored in a sub-folder of the `results`

directory. A default name for the sub-folder is the current date. A specified

output folder can be defined by using the environment variable FOLDERNAME, e.g.

`FOLDERNAME=release/today npm start`

Then, all output files will be stored in the `./results/release/today`

directory.

### Plot manipulation

There are two plot actions applicable for simulation results. Configuration

info and results of a simulation is stored in an `.id` file whose prefix is

same as stats files.

* Re-plot graphs

    All graphs of a simulation can be re-plotted with modifying on

    - Prefix for output graph files

    - Type of output files (pdf or png)

    - Font size

    - Font type.

    CLI for re-plot:

    ```

    node lib/plotter.js --replot path_to_id_file --outputPrefixName 

    --outputType  --fontType  --fontSize 

    ```

* Performance comparison

    Performance of different tests are shown in a same file. The graph file

    contains two pages for latency and #operations/s performance metrics.

    In each page, graphs are placed in a layout where each column corresponds

    to a request type, e.g. put, get

    Each graph displays performances for an object size. The x-axis is number of

    parallel requests. Each curve in the graph shows performance of each test.

    CLI for comparison:

    ```

    node lib/plotter.js --compare

    path_to_id_file_test_1,path_to_id_file_test_2,path_to_id_file_test_3

    ```

### Main input parameters

There are multiple configurable input parameters as shown in

`config.json` file. However, some of them should be correctly

assigned for an appropriated measurement:

Name | Related keys

--- | ---

Credential keys | `accessKeyId`, `secretAccessKey`

S3 servers' address | `host`, `port`

Requests for measuring | `requests`

Schedule of requests | `schedule`

Buckets | `bucketPrefix`, `bucketsNb`

Objects | `objectsNb`, `sizes`, `unit`

Measurement time | `runTime`, `observationsNb`

Monitoring | `ssm`, `servers`, `user`, `publicKey`, `privateKey`, `mdPath`

Multipart Upload | `partSizes`

To run S3Blaster with other specified parameters, please see [configuration]

(./CONFIGURATION.md).

## Output: statistics and graphs

For each test, s3blaster generates files containing statistics results (`.txt`)

and graph files (`.pdf` or `.png`). Main configuration  information is shown on

top of each statistics/graph file.

* Summary results

    Summary results are shown in *_summary* text/graph files. It contains

performance metrics that are computed based on results from all workers (if

relevant). Moreover, if the monitoring of servers was active, the file show

amount of storage using by metadata. The amount is computed as the difference

of meta-stores between before and after test.

* Detailed result files

    For each S3Blaster worker, there are five `.txt` files and five graph files.

    * *_live* file contains real-time performance metrics. The 'real-time' means

    that the metrics are calculated from requests/responses received within the

    last duration time. The time duration is determined by `rate` in the

    `lib/config.js` file. Results of the file will be used to generate *_live*

    graph.

    * *_stats* file contains performance metrics computed after some given

    number of requests successes. A graph file is generated for each such file.

    * *_thread* file contains statistics vs. number of parallel requests and

        number of object sizes. A graph file is generated for each such file.

    * *_pdf_func* file contains estimated probability distribution function. A

    graph file is generated for each such file.

    * *_cdf_func* file contains estimated cumulative distribution function. A

    graph file is generated for each such file.

In each graph file, there could be some figures that are located in rows and

columns layout. Graphs on a row/column have a same configuration parameter,

e.g. same type of request, same object size etc.

* Monitored resources' consuming

    If the monitoring of resources' consuming by running processes was active,

collected  are stored in *_ssm.txt* file. These results combined with *_live*

statistics from each worker will be shown in a single graph *_live* file. In

that file, measured performance and resources' consuming are shown versus

elapsed time.

## Examples

* Example 1

Single connector, single bucket with 1000 objects, mixed requests, object 8KB,

run in 100 seconds

```javascript

node index.js --host localhost --port 8000

--bucketsNb 1 --objectsNb 1000 --sizes 8 --unit KB --requests

put,get,delete --schedule mixed --runTime 100 --output example1

```

* Example 2

Haproxy connector with port 80, multiple buckets, sequential requests, object

8KB & 100KB

```javascript

node index.js --host localhost --port 80

--bucketsNb 100 --objectsNb 1000 --sizes 8,100 --unit KB --requests

put,get,delete --schedule each --runTime 100 --output example2

```

* Example 3

*Balancing* connectors, 10 forks, monitoring servers, multiple buckets,

mixed requests with their proportion and range

```javascript

node index.js --host balancing --port 8000 --forksNb 10

--bucketsNb 100 --objectsNb 1000 --sizes 8 --unit KB --requests

put,get --proprReqs 2,8 --range 0:500,500:1000 --schedule each --ssm true

--runTime 100 --output example3

```

## Pre-programmed tests

There are six pre-programmed tests as below.

 Parameters | S3Simple | S3Standard | S3Full | S3Sizes | S3Parallel | S3MPU

 --- | --- | --- | --- | --- | --- | ---

 S3 Connector | single | single, balancing | single, haproxy, balancing | single | single | single

 #S3blaster workers  | 1 | 1 & multiple | 1 & multiple | 1 | 1 & multiple | 1

 Bucket | single & multiple |  single & multiple | single & multiple | single & multiple | single & multiple | single & multiple

 Requests | `put`, `get`, `delete` | `put`, `get`, `delete` | `put`, `list`, `get`, `get ACL`, `delete`, `combine` | `put`, `get`, `delete` | `put`, `get`, `delete` | `multi-upload`, `get`, `delete`

 Schedule | `sequential` & `mixed` | `sequential`| `sequential` & `mixed` | `sequential` & `mixed` | `sequential` | `sequential` & `mixed`

 Object size | `OB`, `10KB` | `OB`, `10KB`, `1MB` | `OB`, `10KB`, `512KB`, `1MB`, `10MB` | `0, 10, .., 90KB` & `100, 200, .., 900KB` & `1, 2, .., 10MB` | `OB`, `10KB`, `1MB` | `1GB` (part size `5MB`, `20MB`)

 #parallel requests | 64 | 1, 32, 64, 128, .., 2048 | 1, 8, 16, 32, .., 2048 | 1, 50 | 32, 64, 128, .., 10240 | 1, 2, 4, 8, 16

 Run time (s) | 60 | 120 | 240 | 600 | 600 | 1200

 |

 **Execution cmd** | `npm run s3simple` | `npm run s3standard` | `npm run s3full` | `npm run s3sizes` | `npm run s3parallel` | `npm run s3mpu`
ecosyste.ms

Data

Tools

Indexes

Applications

Experiments

Awesome

https://github.com/scality/s3blaster

Awesome Lists containing this project

README