https://github.com/khailey-zz/fio_scripts

scripts for running fio, parsing the data and graphing the output

fio_scripts
===========

These scripts facilitate running the I/O benchmark tool
fio, parsing the fio data and graphing the output.
There are a lot of I/O benchmarking tools out there, most notably
iozone and bonnie++, but fio seems to be the most flexible and to have
the most active user community.

NOTE: the scripts work on fio 2.0.7 but possibly not on other versions

* fio project: http://freecode.com/projects/fio
* download: http://brick.kernel.dk/snaps/fio-2.0.7.tar.gz
* man page: http://linux.die.net/man/1/fio
* how to: http://www.bluestop.org/fio/HOWTO.txt
* mail list subscription: http://vger.kernel.org/vger-lists.html
* mail list archives : http://www.spinics.net/lists/fio/

files in this project

+ fio.sh - run a set of I/O benchmarks using fio
+ fioparse.sh - parse the output files from fio.sh runs
+ fio.r - create a function called graphit() in R
+ fiog.r - run graphit on different combinations of data from fioparse.sh
+ fiop.r - newer version of graphit that adds scaling info
+ fiopg.r - use to graph with the newer graphit in fiop.r
+ example data from fioparse.sh

> data_emc.r
> data_ssd.r
> data_pharos.r
> data_mem.r

NOTE: the scripts in this project require that you have already
downloaded fio and compiled a binary of fio.
The scripts also require version 2.0.7 of fio to work correctly.

Running fio.sh
---------------------------
First run fio.sh.
The script fio.sh will run a series of I/O benchmarks
aimed at simulating the typical workload of an Oracle database.
There are 3 types of I/O run:

* random small reads
* sequential large reads
* sequential writes

For each of these, the number of users and the I/O request size are varied.

    usage: ./fio.sh [options]

    run a set of I/O benchmarks

    OPTIONS:
       -h              Show this message
       -b binary       name of fio binary, defaults to ./fio
       -w directory    work directory where fio creates a fio and reads and writes, default /domain0/fiotest
       -o directory    output directory, where to put output files, defaults to ./
       -t tests        tests to run, defaults to all, options are
                          readrand - IOPS test : 8k by 1,8,16,32 users
                          read     - MB/s test : 1M by 1,8,16,32 users & 8k,32k,128k,1m by 1 user
                          write    - redo test, ie sync seq writes : 1k, 4k, 8k, 128k, 1024k by 1 user
                          randrw   - workload test: 8k read write by 1,8,16,32 users
       -s seconds      seconds to run each test for, default 60
       -m megabytes    megabytes for the test I/O file to be used, default 65536 (ie 64G)
       -i              individual file per process, default size 100m (otherwise uses the -m size)
       -f              force run, ie don't ask for confirmation on options
       -c              force creation of work file otherwise if it exists we use it as is
       -u #users       test only use this many users
       -l blocksize    test only use this blocksize in KB, ie 1-1024
       -e recordsize   use this recordsize if/when creating the zfs file system, default 8K
       -d              Use DTrace on the run
       -x              remove work file after run
       -y              initialize raw devices to "-m megabytes" with writes
                       writes will be evenly written across multiple devices, default is 64GB
       -z raw_sizes    size of each raw device. If multiple, colon separate, list inorder of raw_device
       -r raw_device   use raw device instead of file, multi devices colon separated
example
    ./fio.sh -b ./fio.opensolaris -w /domain0/fiotest -t readrand -s 10 -m 1000 -f
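
To run the three I/O types one after another, a small wrapper can generate one fio.sh command per test. The following is only a sketch: it prints the commands instead of running them, and the paths in `FIO_BIN`, `WORK_DIR` and `OUT_ROOT` are hypothetical examples, not defaults taken from fio.sh. Only options listed in the usage text above are used.

```shell
#!/bin/sh
# Dry-run helper: print one fio.sh command line per test type.
# FIO_BIN, WORK_DIR and OUT_ROOT are hypothetical example paths.
FIO_BIN=${FIO_BIN:-./fio}
WORK_DIR=${WORK_DIR:-/domain0/fiotest}
OUT_ROOT=${OUT_ROOT:-./results}

fio_commands() {
    seconds=$1      # -s: seconds to run each test
    megabytes=$2    # -m: size of the test I/O file in MB
    for t in readrand read write; do
        # -f skips the confirmation prompt; -o keeps each test's output apart
        echo "./fio.sh -b $FIO_BIN -w $WORK_DIR -o $OUT_ROOT/$t -t $t -s $seconds -m $megabytes -f"
    done
}

fio_commands 10 1000
```

Review the printed commands first. Whether fio.sh creates a missing output directory is not stated here, so it may be safest to create the directories under `OUT_ROOT` before running the commands.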

Running fioparse.sh
---------------------------
Once the benchmarks have been run, use fioparse.sh to extract a concise
set of statistics from the output files.

./fioparse.sh -v *out
test users size MB ms IOPS 50us 1ms 4ms 10ms 20ms 50ms .1s 1s 2s 2s+
read 1 8K r 28.299 0.271 3622 99 0 0 0
read 1 32K r 56.731 0.546 1815 97 1 1 0 0 0
read 1 128K r 78.634 1.585 629 26 68 3 1 0 0
read 1 1M r 91.763 10.890 91 14 61 14 8 0 0
read 8 1M r 50.784 156.160 50 3 25 31 38 2
read 16 1M r 52.895 296.290 52 2 24 23 38 11
read 32 1M r 55.120 551.610 55 0 13 20 34 30
read 64 1M r 58.072 1051.970 58 3 6 23 66 0
randread 1 8K r 0.176 44.370 22 0 1 5 2 15 42 20 10
randread 8 8K r 2.763 22.558 353 0 2 27 30 30 6 1
randread 16 8K r 3.284 37.708 420 0 2 23 28 27 11 6
randread 32 8K r 3.393 73.070 434 1 20 24 25 12 15
randread 64 8K r 3.734 131.950 478 1 17 16 18 11 33
write 1 1K w 2.588 0.373 2650 98 1 0 0 0
write 1 8K w 26.713 0.289 3419 99 0 0 0 0
write 1 128K w 11.952 10.451 95 52 12 16 7 10 0 0 0
write 4 1K w 6.684 0.581 6844 90 9 0 0 0 0
write 4 8K w 15.513 2.003 1985 68 18 10 1 0 0 0
write 4 128K w 34.005 14.647 272 0 34 13 25 22 3 0
write 16 1K w 7.939 1.711 8130 45 52 0 0 0 0 0 0
write 16 8K w 10.235 12.177 1310 5 42 27 15 5 2 0 0
write 16 128K w 13.212 150.080 105 0 0 3 10 55 26 0 2

The above output is for human consumption, but when run with "-r" the output
will be given in R format:

./fioparse.sh -r *out

m <- NULL
m <- matrix(c(
"read", 1, "8K", 35.647, 0.217, 0.2, 8, 0.1, 4562 , 0, 0,92, 7, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,0.266,0.438,0.506,0.572,0.756,4.080
, "read", 1, "32K", 98.439, 0.315, 0.1, 22, 0.2, 3150 , 0, 0, 4,94, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,0.418,0.490,0.604,0.772,0.948,3.632
, "read", 1,"128K", 223.127, 0.556, 0.3, 40, 0.3, 1785 , 0, 0, 0,21,78, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,0.652,0.692,0.748,1.112,1.816,11.328
, "read", 1, "1M", 388.821, 2.567, 0.3, 16, 0.7, 388 , 0, 0, 0, 4, 0, 1,93, 0, 0, 0, 0, 0, 0, 0, 0, 0,2.768,3.376,4.832,10.432,15.424,16.192
, "read", 8, "1M", 265.195, 18.608, 0.3, 33, 5.8, 265 , 0, 0, 0, 5, 0, 0, 2, 0,19,71, 0, 0, 0, 0, 0, 0,21.376,22.144,24.448,29.056,29.568,33.536
, "read", 16, "1M", 239.514, 35.450, 3.0, 41, 6.0, 239 , 0, 0, 0, 0, 0, 0, 0, 0, 4,94, 0, 0, 0, 0, 0, 0,38.144,38.656,38.656,41.728,41.728,41.728
, "read", 32, "1M", 288.621, 48.080, 0.3, 76, 26.3, 288 , 0, 0, 0,10, 0, 0, 3, 0, 5,24,54, 0, 0, 0, 0, 0,74.240,75.264,75.264,75.264,76.288,77.312
, "read", 64, "1M", 326.718, 65.409, 0.3, 132, 43.0, 326 , 0, 0, 0,10, 1, 0, 2, 5, 1, 9,44,23, 0, 0, 0, 0,132.096,132.096,132.096,132.096,132.096,132.096
,"randread", 1, "8K", 28.188, 0.274, 0.0, 19, 0.2, 3608 ,11,34, 2,44, 6, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,0.506,0.524,0.524,0.628,0.740,1.640
,"randread", 8, "8K", 369.144, 0.166, 0.0, 12, 0.2,47250 , 0,63,20, 2,11, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,0.636,0.788,0.852,1.208,1.512,2.640
,"randread", 16, "8K", 482.962, 0.254, 0.0, 16, 0.3,61819 , 0,20,57, 5,14, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,0.820,1.192,1.784,4.128,5.408,9.024
,"randread", 32, "8K", 511.212, 0.480, 0.0, 53, 0.9,65435 , 0,14,52, 6,15, 6, 3, 1, 0, 0, 0, 0, 0, 0, 0, 0,1.912,4.384,5.536,8.640,10.304,17.792
,"randread", 64, "8K", 525.351, 0.904, 0.0, 1040, 6.6,67244 , 0, 8,47,11,16, 9, 3, 1, 0, 0, 0, 0, 0, 0, 0, 0,2.384,8.640,16.768,71.168,121.344,309.248
, "write", 1, "1K", 11.306, 0.084, 0.1, 2, 0.0,11577 , 0,91, 8, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,0.106,0.124,0.133,0.197,0.510,1.144
, "write", 1, "8K", 67.812, 0.113, 0.1, 15, 0.1, 8679 , 0,37,62, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,0.149,0.183,0.205,0.964,0.988,1.128
, "write", 1,"128K", 270.647, 0.458, 0.4, 22, 0.2, 2165 , 0, 0, 0,95, 0, 4, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,0.478,1.160,1.192,1.272,1.448,6.304
, "write", 4, "1K", 27.946, 0.102, 0.1, 12, 0.1,28617 , 0,68,30, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,0.147,0.193,0.213,0.980,1.080,1.384
, "write", 4, "8K", 145.804, 0.158, 0.1, 16, 0.1,18662 , 0, 0,96, 2, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,0.239,0.314,1.032,1.144,1.208,3.920
, "write", 4,"128K", 373.462, 0.999, 0.4, 40, 0.7, 2987 , 0, 0, 0, 0,73,24, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0,1.352,3.312,4.512,6.496,9.536,39.680
, "write", 16, "1K", 44.294, 0.195, 0.1, 30, 0.2,45357 , 0, 0,91, 7, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,0.306,0.422,0.572,1.336,2.224,5.472
, "write", 16, "8K", 210.676, 0.329, 0.1, 13, 0.3,26966 , 0, 0,42,51, 3, 2, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,0.524,1.256,1.368,2.992,4.960,8.384
, "write", 16,"128K", 317.903, 3.517, 0.4, 85, 5.1, 2543 , 0, 0, 0, 0, 0, 8,83, 5, 0, 1, 0, 0, 0, 0, 0, 0,4.576,38.144,43.264,55.552,58.624,62.720
),nrow=31)
tm <- t(m)
m <-tm
colnames <- c("name","users","bs","MB","lat","min","max","std","iops"
, "us50","us100","us250","us500","ms1","ms2","ms4","ms10","ms20"
, "ms50","ms100","ms250","ms500","s1","s2","s5"
,"p95_00", "p99_00", "p99_50", "p99_90", "p99_95", "p99_99"
)
colnames(m)=colnames
m <- data.frame(m)
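
As a quick sanity check on parsed data, the "MB" column in the sample rows appears to equal "iops" times the block size. The sketch below recomputes it, assuming 1 MB = 1024 KB; the function name `mb_per_sec` is made up for this example and is not part of fioparse.sh.

```shell
#!/bin/sh
# Expected MB/s from IOPS and block size in KB (assumes 1 MB = 1024 KB).
mb_per_sec() {
    awk -v iops="$1" -v kb="$2" 'BEGIN { printf "%.3f\n", iops * kb / 1024 }'
}

mb_per_sec 4562 8      # the "read", 1 user, 8K row reports 35.647 MB/s
mb_per_sec 47250 8     # the "randread", 8 users, 8K row reports 369.144 MB/s
```

Small differences from the reported values are expected if the IOPS figures are rounded to whole numbers.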

Graphing in R
-----------------------------------------
To get started with R see: http://scs.math.yorku.ca/index.php/R:_Getting_started_with_R
QUICK START: To install R, you can go to http://cran.r-project.org/ and follow the instructions.

Start R and run the above output in it; this creates the data frame "m":

> m
name users bs MB lat min max std iops us50 us100 us250 us500 ms1 ms2 ms4 ms10 ms20 ms50 ms100 ms250 ms500 s1 s2 s5 p95_00 p99_00 p99_50 p99_90 p99_95 p99_99
1 read 1 8K 14.67 0.529 0.2 4 0.1 1877 0 0 3 26 70 0 0 0 0 0 0 0 0 0 0 0 0.644 0.708 0.74 1.032 1.416 3.152
2 read 1 32K 7.183 4.345 0.3 100 9.8 229 0 0 0 12 59 5 2 8 5 5 0 0 0 0 0 0 24.96 49.92 58.112 80.384 82.432 99.84
3 read 1 128K 13.277 9.408 0.6 222 17.3 106 0 0 0 0 29 32 2 8 10 13 2 0 0 0 0 0 41.728 80.384 90.624 166.912 222.208 222.208
4 read 1 1M 64.841 15.41 3 227 24.9 64 0 0 0 0 0 0 34 44 1 10 8 1 0 0 0 0 65 116 145 227 227 227
5 read 8 1M 129.512 38.45 3 395 48.7 129 0 0 0 0 0 0 0 30 33 5 18 9 0 0 0 0 139 219 265 306 396 396
6 read 16 1M 140.513 63.47 3 597 71.3 140 0 0 0 0 0 0 10 19 9 9 31 16 3 0 0 0 194 338 371 594 594 594
7 read 32 1M 179.886 96.23 3 1546 114.7 179 0 0 0 0 0 0 4 10 14 9 25 28 6 0 0 0 293 515 676 1483 1549 1549
8 read 64 1M 261.523 111.7 3 1270 114.6 261 0 0 0 0 0 0 0 1 7 32 15 32 8 1 0 0 318 545 685 857 865 1270
9 randread 1 8K 0.553 14.114 0.2 296 15.3 70 0 0 0 3 14 0 1 15 44 17 1 0 0 0 0 0 32.384 60.16 68.096 296.96 296.96 296.96
10 randread 8 8K 3.77 16.489 0.2 257 15.7 482 0 0 0 2 4 0 2 21 44 20 2 0 0 0 0 0 39.168 73.216 97.792 191.488 220.16 257.024
11 randread 16 8K 6.628 18.684 0.2 285 18 848 0 0 0 2 4 0 2 18 40 27 3 0 0 0 0 0 46.848 87.552 117.248 201.728 238.592 284.672
12 randread 32 8K 8.957 27.3 0.2 377 22.5 1146 0 0 0 0 0 0 1 8 32 48 7 1 0 0 0 0 65 117 143 241 306 367
13 randread 64 8K 9.989 44.843 0.2 348 24 1278 0 0 0 0 0 0 0 0 5 64 26 2 0 0 0 0 85 137 161 258 281 310
14 write 1 1K 1.51 0.643 0.2 106 2 1546 0 0 2 25 71 0 0 0 0 0 0 0 0 0 0 0 0.716 1.032 4.64 24.96 38.656 102.912
15 write 1 8K 13.922 0.557 0.2 4 0.1 1782 0 0 0 27 71 0 0 0 0 0 0 0 0 0 0 0 0.708 0.844 0.908 1.416 1.816 4.048
16 write 1 128K 62.081 2.009 0.7 547 8.4 496 0 0 0 0 39 51 0 5 2 0 0 0 0 0 0 0 9.92 10.176 20.096 40.192 49.92 544.768
17 write 4 1K 6.694 0.434 0.1 283 2.1 6855 0 0 26 54 18 0 0 0 0 0 0 0 0 0 0 0 0.7 0.828 1.384 15.296 23.424 71.168
18 write 4 8K 56.823 0.409 0.2 8 0.1 7273 0 0 1 88 9 0 0 0 0 0 0 0 0 0 0 0 0.692 0.788 0.852 1.848 2.48 4.32
19 write 4 128K 63.777 4.002 0.7 4650 95.1 510 0 0 0 0 8 84 0 3 2 0 0 0 0 0 0 0 9.92 19.84 20.352 40.192 49.92 4620.288
20 write 16 1K 14.483 0.6 0.2 815 3.7 14831 0 0 8 78 10 0 0 0 0 0 0 0 0 0 0 0 0.708 5.92 12.224 33.536 47.36 124.416
21 write 16 8K 179.438 0.387 0.2 691 2 22968 0 0 0 95 2 0 0 0 0 0 0 0 0 0 0 0 0.454 0.644 0.804 9.92 19.84 40.192
22 write 16 128K 67.58 16.606 0.7 5325 227.1 540 0 0 0 0 0 1 84 3 4 4 0 0 0 0 0 0 20.096 49.92 60.16 5341.184 5341.184 5341.184

In R we can now source "fio.r" which creates a function "graphit(m)"

source("fio.r") # create the graphit() function
source("data_ssd.r") # load some fio data, data_ssd.r is provided in github distro
graphit(m) # graph the data
source("fiog.r") # this will graph various combinations and save the png files
# to C:\temp
# the graphs will be for readrand, read and write tests
# the graphs will graph different user loads and I/O sizes in the data

By default it will graph 8K random reads.
If you source "fiog.r" it will run through a series of different combinations graphing them and saving the output.
The output is saved to png files in the directory C:\TMP

Example data files are included

* data_emc.r
* data_ssd.r
* data_pharos.r
* data_mem.r

collected from different systems. The EMC data is from a single spindle. The pharos data is from a striped but
shared filer. The SSD data is from two striped SSD devices. The mem data is from using /tmp where /tmp
is a memory filesystem.
To graph these datasets, simply source them

source("data_ssd.r")

Then graph them

source("fiog.r")

NOTE: to source files they have to be in R's working directory.
You can get the working directory with

getwd()

You can set the working directory with

setwd("C:\\Temp\\")

for example, to set it to C:\Temp.

If the data is in a CSV file you can load it with

m <- read.csv("data_ssd.csv")

GRAPH Examples:

https://sites.google.com/site/oraclemonitor/i-o-graphics#TOC-Percentile-Latency

Running fiop.r and fiopg.r
-----------------------------------------------
A new set of graphs is available with the fiop.r version of graphit(). Use fiopg.r to create a set of graphs from a data set,
for example

source("fiop.r")
source("data_emc.r")
source("fiopg.r")

Each PNG file will have 3 graphs

1. latency on a log scale
2. latency on a base 10 scale
3. throughput in MB/s

1: the log scale latency has several parts
-------------------------------------------

Four lines:

1. max latency - dashed red line
2. 99% latency - top of light grey shaded area
3. 95% latency - top of dark grey shaded area
4. avg latency - black line

Plus:

+ background is bar charts, 0% at bottom to 100% at top

light blue - % of I/Os below 1ms, probably some sort of cache read
green - % of I/Os below 10ms
yellow - % of I/Os over 10ms

+ histograms of latency buckets

at each user load level, color coded. Each bucket's height (horizontal) is the % of I/Os in that bucket,
like a fine grain breakdown of the background
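
The three background bands can be approximated from the latency bucket columns produced by fioparse.sh. This is a sketch, not the code in fiop.r. It assumes each bucket is labelled by its upper bound (so us50 through ms1 are at or below 1ms) and that the green band covers I/Os between 1ms and 10ms. The function name `background_split` is made up for this example.

```shell
#!/bin/sh
# Split the 16 latency-bucket percentages into the three background bands.
# Input: one line of bucket percentages in the column order
#   us50 us100 us250 us500 ms1 ms2 ms4 ms10 ms20 ms50 ms100 ms250 ms500 s1 s2 s5
# Assumption: each bucket is labelled by its upper bound.
background_split() {
    awk '{
        blue = 0; green = 0; yellow = 0
        for (i = 1; i <= NF; i++) {
            if (i <= 5)      blue   += $i   # up to 1ms
            else if (i <= 8) green  += $i   # above 1ms, up to 10ms
            else             yellow += $i   # above 10ms
        }
        printf "blue=%d green=%d yellow=%d\n", blue, green, yellow
    }'
}

# "randread", 64 users, 8K row from the sample fioparse.sh -r output
echo "0 8 47 11 16 9 3 1 0 0 0 0 0 0 0 0" | background_split
```

For that row the script prints `blue=82 green=13 yellow=0`. The bands do not add up to 100 because the bucket percentages are rounded to whole numbers.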

2. base 10 latency
-------------------------------------------

shows the average latency - black line
the background is a type of scaling factor
blue means latency actually gets faster from one point to the next
light red means it gets slower
darker red means latency gets slower and throughput actually decreases
the height of these bars is (ratio of increase in latency) / (ratio of increase in users)
for example if 1 user was 1ms and 8 users was 8ms then that would be (8ms/1ms)/(8users/1user) = 1
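
The bar height and color described above can be worked through with a few lines of awk. This is a sketch of the arithmetic only, not the code used in fiop.r. The function name `scaling_factors` and the color labels it prints are made up for this example. The input is "users, average latency in ms, MB/s" per load level.

```shell
#!/bin/sh
# Read "users latency_ms MB_per_s" lines, one per load level in increasing
# user order, and print the bar height and color for each step.
scaling_factors() {
    awk 'NR > 1 {
        lat_ratio  = $2 / prev_lat
        user_ratio = $1 / prev_users
        if (lat_ratio < 1)       color = "blue"       # latency got faster
        else if ($3 < prev_mb)   color = "dark-red"   # slower, throughput fell
        else                     color = "light-red"  # slower, throughput did not fall
        printf "%d->%d %.2f %s\n", prev_users, $1, lat_ratio / user_ratio, color
    }
    { prev_users = $1; prev_lat = $2; prev_mb = $3 }'
}

# 1M sequential reads from the sample fioparse.sh -r output (users, lat, MB)
scaling_factors <<'EOF'
1 2.567 388.821
8 18.608 265.195
16 35.450 239.514
32 48.080 288.621
64 65.409 326.718
EOF
```

The first step prints `1->8 0.91 dark-red`: latency grew a little less than the user count did, but throughput dropped from 388.821 to 265.195 MB/s.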

3. the throughput bar chart, shows MB/s
-------------------------------------------

the bars are color coded by the percentage of throughput that had a latency of that color, where the colors
are defined in the right-hand axis legend of the top graph (latency on a log scale)

see:

https://sites.google.com/site/oraclemonitor/i-o-graphics#TOC-percentile-latency-with-scaling

New Graphics
-----------------------------------------------------
A new version of the function graphit() is created by
fiop.r, and fiopg.r will go through a set of I/O data
and print out various graphs of the data.

Examples of the graphs are on

https://plus.google.com/photos/105986002174480058008/albums/5773655476406055489?authkey=CIvKiJnA2eXSbQ

A visual explanation is here

https://plus.google.com/photos/105986002174480058008/albums/5773661884246310993

A Summary of the graph contents is:

The charts are mainly for exploring the data as opposed to a polished final graph showing I/O performance

A quick recap of the graphics:
There are 3 graphs

1. latency on log graph
2. latency on base 10 graph
3. throughput bar charts

On the log latency graph latency is shown for

* max latency - dashed red line
* average latency - solid black line
* 95% latency - dashed black line with grey fill between 95% and average
* 99% latency - dashed black line with light grey fill between 95% and 99% latency
* latency histogram - buckets represent % of I/Os for that latency. Each bucket is drawn at the y axis height that represents that latency. The buckets are also color coded to help identify them more quickly
* background color - for each load test the background is coded one of 3 colors.
* ... yellow - % of I/Os over 10ms
* ... green - % of I/Os under 10ms
* ... blue - % of I/Os under 1ms

The idea is that the graphs should be all green. If the backgrounds are yellow then the I/Os are slow. If the backgrounds are blue then the I/Os represent a certain amount of cached reads as opposed to physical spindle reads.

The second graph is latency on base 10 in order to more easily see the slopes of the increasing I/O latency with load.
On this second graph is also a bar chart in the background. The bars are color coded

* dark red - latency increased and throughput decreases
* light red - latency increased but throughput also increased
* light blue - latency actually got faster (shouldn't happen but does)

Ideally the bars are so small they aren't visible which means latency stays the same as load increases. The higher the bar the more the latency changed between tests

The third chart is simply the throughput, ie the MB/s. These bars have slices that represent the percentage of the I/O at the latency that corresponds to that color. The colors are defined in the legend of the top chart.
