https://github.com/hdfgroup/hdf5-iotest
HDF5 Performance Analysis Checklist
https://github.com/hdfgroup/hdf5-iotest
Last synced: 10 months ago
JSON representation
HDF5 Performance Analysis Checklist
- Host: GitHub
- URL: https://github.com/hdfgroup/hdf5-iotest
- Owner: HDFGroup
- License: other
- Created: 2020-07-21T00:33:26.000Z (almost 6 years ago)
- Default Branch: master
- Last Pushed: 2024-12-23T23:27:45.000Z (over 1 year ago)
- Last Synced: 2025-08-12T21:11:35.264Z (10 months ago)
- Language: Jupyter Notebook
- Homepage:
- Size: 1.3 MB
- Stars: 13
- Watchers: 4
- Forks: 9
- Open Issues: 6
-
Metadata Files:
- Readme: README.org
- License: COPYING
Awesome Lists containing this project
README
#+TITLE: HDF5 I/O Test
#+AUTHOR: Gerd Heber
#+EMAIL: gheber@hdfgroup.org
#+DATE: [2020-12-30 Wed]
#+PROPERTY: header-args :eval never-export
* Description
This is a simple I/O performance tester for HDF5. (See also [[https://github.com/ornladios/ADIOS2/tree/master/source/utils/adios_iotest][ADIOS IOTEST]].) Its
purpose is to assess the /performance variability/ of a set of logically
equivalent HDF5 representations of a common pattern. The test repeatedly writes
(and reads) in parallel a set of 2D array variables in a tiled fashion, over a
set of "time steps." A schematic is shown below.
#+begin_src plantuml :hidden :file ./img/flow.png :exports results
start
:read configuration;
repeat
:set internal parameters;
:create HDF5 file;
repeat
:[create group];
repeat
:[create dataset(s)];
fork
:write 2D tile 1;
fork again
:...;
fork again
:write 2D tile P;
end fork
repeat while (last 2D array?) is (no)
->yes;
repeat while (last time step?) is (no)
->yes;
:close file;
:...
read back the data
**details not shown**;
:report timings;
repeat while (last combination?) is (no)
->yes;
stop
#+end_src
#+RESULTS:
[[file:./img/flow.png]]
A configuration file (see below) is used to control parameters such as the
number of steps, the count and shape of the 2D array variables, etc. (See
section [[sec:parameters]].) The test then uses between 48 and 192 different
combinations of up to 7 internal parameters (see section [[sec:internal-parameters]])
to configure the HDF5 library and output, to perform write and read operations,
and to report certain timings. (The number of combinations is different for
sequential and parallel runs, and depends on other choices.)
For a /baseline/, this test can (and should!) be run with a single MPI process
using the POSIX, core, and MPI-IO VFDs. Typically, each baseline variant covers
a different aspect of the underlying system, its configuration, and the
particular HDF5 library version. Doing MPI-parallel runs without this baseline
is the proverbial "getting off on the wrong foot."
* Installation
While it can be built manually, we recommend building =hdf5_iotest= from [[https://computing.llnl.gov/projects/spack-hpc-package-manager][Spack]].
1. Create a new Spack package
#+begin_src sh
spack create --name hdf5iotest
#+end_src
This creates a boilerplate =package.py= file, which we'll need to edit.
2. Edit the freshly minted =package.py= via
#+begin_src sh
spack edit hdf5iotest
#+end_src
=package.py= should look like this:
#+begin_src python
# Copyright 2013-2020 Lawrence Livermore National Security, LLC and other
# Spack Project Developers. See the top-level COPYRIGHT file for details.
#
# SPDX-License-Identifier: (Apache-2.0 OR MIT)
from spack import *
class Hdf5iotest(AutotoolsPackage):
"""
A simple I/O performance test for HDF5.
Run with 'hdf5_iotest '. A sample INI file called
hdf5_iotest.ini can be found in the share/hdf5-iotest subdirectory.
It is recommended to run multiple configurations. The combinator.sh
script in share/hdf5-iotest creates 24 "standard" parameter cominations.
"""
homepage = "https://github.com/HDFGroup/hdf5-iotest"
git = "https://github.com/HDFGroup/hdf5-iotest.git"
maintainers = ['gheber']
version('master', branch='master')
depends_on('autoconf', type='build')
depends_on('automake', type='build')
depends_on('libtool', type='build')
depends_on('m4', type='build')
depends_on('mpi')
depends_on('hdf5')
depends_on('util-linux-uuid')
def configure_args(self):
args = []
if "+gperftools" not in self.spec:
args += ["--enable-gperftools"]
return args
def test(self):
pass
#+end_src
3. Show the explicit dependency configuration (libraries, versions, compiler, architecture, etc. ) via
#+begin_src sh
spack spec hdf5iotest
#+end_src
Sample output can be found in the [[sec:spack-spec-out][appendix]].
4. Install the hdf5iotest package and all dependencies
#+begin_src sh
spack install hdf5iotest
#+end_src
** Installation w/o Spack
You need a working installation of parallel HDF5 (which depends on MPI).
#+begin_src sh
./autogen.sh
CC=h5pcc ./configure --prefix=
make
make install
#+end_src
* Usage
=hdf5_iotest= accepts a single argument, the name of a configuration file. If no
configuration file name is provided, it looks for a configuration named
=hdf5_iotest.ini= in the current working directory.
#+begin_src sh
hdf5_iotest [INI file]
#+end_src
The configuration file syntax is shown [[sec:parameters][below]]:
#+begin_src conf-unix :tangle src/hdf5_iotest.ini :noweb no-export
[DEFAULT]
<>
#+end_src
A complete configuration can be found [[https://raw.githubusercontent.com/HDFGroup/hdf5-iotest/master/src/hdf5_iotest.ini][here]].
** Custom Settings
Rather than modifying the =[DEFAULT]= section, we recommend that testers create
a new section, e.g., =[CUSTOM]=, and overwrite the default values as needed.
#+begin_example
[DEFAULT]
...
[CUSTOM]
...
#+end_example
Any parameter specified in the =[CUSTOM]= section overwrites its =[DEFAULT]=
counterpart.
** Parameters<>
The following configuration parameters are supported.
- Version :: The HDF5 I/O test configuration version
#+begin_src conf-unix :noweb-ref hdf5-iotest-conf
version = 0
#+end_src
Currently, 0 is the only valid version.
- Steps :: The number of steps or repetitions, a positive integer.
#+begin_src conf-unix :noweb-ref hdf5-iotest-conf
steps = 20
#+end_src
- Number of 2D Array Variables :: The number of 2D array variables to be
written, a positive integer.
#+begin_src conf-unix :noweb-ref hdf5-iotest-conf
arrays = 500
#+end_src
- Array Rows :: HDF5 I/O test can be run in /strong/ or /weak/ scaling mode (see
[[sec:scaling][below]]). For /strong/ scaling, this is the total number (across all MPI ranks)
of rows of each 2D array variable. For /weak/ scaling, this is the number of
rows per MPI process per 2D array variable.
#+begin_src conf-unix :noweb-ref hdf5-iotest-conf
rows = 100
#+end_src
- Array Columns :: HDF5 I/O test can be run in /strong/ or /weak/ scaling mode
(see [[sec:scaling][below]]). For /strong/ scaling, this is the total number (across all MPI
ranks) of columns of each 2D array variable. For /weak/ scaling, this is the
number of columns per MPI process per 2D array variable.
#+begin_src conf-unix :noweb-ref hdf5-iotest-conf
columns = 200
#+end_src
- Number of MPI Process Rows :: HDF5 I/O test is run over a logical 2D grid
of MPI processes. This is the number of MPI process rows.
#+begin_src conf-unix :noweb-ref hdf5-iotest-conf
process-rows = 1
#+end_src
For strong scaling, the =rows= must be divisible by =process-rows=.
- Number of MPI Process Columns :: HDF5 I/O test is run over a logical 2D grid
of MPI processes. This is the number of MPI process columns.
#+begin_src conf-unix :noweb-ref hdf5-iotest-conf
process-columns = 1
#+end_src
For strong scaling, the =columns= parameter must be divisible by
=process-columns=.
- Scaling<> :: HDF5 I/O test can be run with strong or weak
scaling. In /strong scaling/ mode, the total amount of data written and read
is independent of the number of MPI processes, i.e., the per process I/O share
diminishes with an increase in the number of I/O processes. In /weak scaling/
mode, the amount of data written and read by each MPI-process is kept
constant, and the total I/O increases with the number of MPI processes.
#+begin_src conf-unix :noweb-ref hdf5-iotest-conf
# [weak, strong]
scaling = weak
#+end_src
- Alignment Increment :: Align HDF5 objects greater than or equal to an
alignment threshold on addresses which are a multiple of this increment.
#+begin_src conf-unix :noweb-ref hdf5-iotest-conf
# align along increment [bytes] boundaries
alignment-increment = 1
#+end_src
By default, there are no alignment restrictions in effect, and only
increments greater than 1 have any effect.
- Alignment Threshold :: The minimum object size (in bytes) for which alignment
constraints will be enforced. A threshold of 0 forces everything to be
aligned.
#+begin_src conf-unix :noweb-ref hdf5-iotest-conf
# minimum object size [bytes] to force alignment (0=all objects)
alignment-threshold = 0
#+end_src
- Meta Block Size :: The minimum size of metadata block allocations
when ~H5FD_FEAT_AGGREGATE_METADATA~ is set by a VFL driver.
#+begin_src conf-unix :noweb-ref hdf5-iotest-conf
# minimum metadata block allocation size in [bytes]
meta-block-size = 2048
#+end_src
Setting the value to 0 with this function will turn off metadata
aggregation, even if the VFL driver attempts to use the metadata
aggregation strategy.
The POSIX, core, and MPI-IO VFDs all support metadata allocation
aggregation.
- Single Process I/O :: The I/O driver or mode to be used when running with a
single process.
#+begin_src conf-unix :noweb-ref hdf5-iotest-conf
# [posix, core, mpi-io-uni]
single-process = posix
#+end_src
This setting is important when establishing a single-process
/baseline/. =posix= uses the default POSIX VFD. =core= uses a memory-backed
HDF5 file where the underlying memory buffer grows in 64 MB
increments. =mpi-io-uni= uses the MPI-IO VFD (with a single process).
- HDF5 Output File Name :: The default HDF5 output file name is
=hdf5_iotest.h5=. Use this parameter to select a different name.
*Note*: The character "#" in the filename is reserved for creating an HDF5
with the "#" replaced with the accumulated case number, for example *hdf5_iotest.#.h5*.
#+begin_src conf-unix :noweb-ref hdf5-iotest-conf
hdf5-file = hdf5_iotest.h5
#+end_src
- Results File :: When running the HDF5 I/O test, certain metrics are printed to
=stdout=. To simplify the analysis of results from multiple
runs, they are also written to a CSV file whose name is
configurable.
#+begin_src conf-unix :noweb-ref hdf5-iotest-conf
csv-file = hdf5_iotest.csv
#+end_src
- Restart :: The simulations will resume from (and including) the last successful
entry in the result's CSV file. A value of 1 indicates a restart run,
and 0 is no restart. If the keyword is not present, the default is
not a restart.
#+begin_src conf-unix :noweb-ref hdf5-iotest-conf
# [0, 1]
restart = 1
#+end_src
- Split :: The split I/O driver will be used [1]. Zero indicates to not use
the split file diriver, default.
#+begin_src conf-unix :noweb-ref hdf5-iotest-conf
# [0, 1]
split = 1
#+end_src
- One-case :: The simulation will run only one case in the parameter space.
A value not equal to 0 indicates which parameter case to run, and the
value is a cumulative counter of the nested loops over the parameter space.
For example, to rerun the 100th case in the CSV file, the value would be 100.
If the keyword is not present, the default is to do all the cases.
#+begin_src conf-unix :noweb-ref hdf5-iotest-conf
# case number in the parameter space to run.
one-case = 100
#+end_src
- Compression :: Specifies the compression filter for chunked datasets and
currently supports /gzip/ and /szip/. The value corresponds to
parameters in the corresponding HDF5 API. Valid parameters for "/gzip/" is an
integer for the /level/ (see H5Pset_deflate). For "/szip/" value is
/options_mask/, and /pixels_per_block/ (see H5Pset_szip).
#+begin_src conf-unix :noweb-ref hdf5-iotest-conf
# compression filter (gzip, szip).
szip = H5_SZIP_NN_OPTION_MASK, 8
#+end_src
- Async :: Specifies calling the async APIs (requires HDF5 version > 1.12) and
[[https://github.com/hpc-io/vol-async][ASYNC VOL]]
#+begin_src conf-unix :noweb-ref hdf5-iotest-conf
# enable async [0 - no, 1 - yes]
async = 1
#+end_src
- Delay :: Add a delay between time steps. Helpful in simulating a computing phase
when doing async I/O.
#+begin_src conf-unix :noweb-ref hdf5-iotest-conf
# delay in the form of an integer and time unit second 's 'or milliseconds 'ms'
delay = 1s
#+end_src
* Internal Parameters<>
Currently, the I/O test varies the following parameters:
- Dataset Rank :: The 2D array variables can be stored individually, or embedded
into 3D or 4D datasets. In other words, the rank can be 2, 3, or 4.
- Slowest Dimension :: The slowest dimension can be array (count) or time.
- Initialization with Fill Values :: The default behavior of the HDF5 library is
to initialize storage with the default or a user-specified fill value. This
incurs additional I/O and may reduce performance.
- Storage Layout :: The dataset storage layout in the HDF5 file can be chunked
or contiguous (or compact or virtual or user-defined).
- Alignment :: HDF5 objects greater than or equal to an alignment threshold can
be aligned on addresses that are a multiple of a certain increment.
- Lower Library Version Bound :: The HDF5 library can be configured to use the
earliest or latest available file format micro-versions when generating
objects.
- MPI I/O Operations :: With MPI, the write and read operations can be collective
or independent.
Since there is no shortage of knobs in the HDF5 API, other parameters might be
added in the future.
* Appendix <>
** Sample =spack spec hdf5iotest= output <>
#+begin_example
==> Using specified package name: 'hdf5iotest'
==> Created template for hdf5iotest package
==> Created package file: /home/gerdheber/GitHub/spack/var/spack/repos/builtin/packages/hdf5iotest/package.py
Waiting for Emacs...
% spack spec hdf5iotest
Input spec
--------------------------------
hdf5iotest
Concretized
--------------------------------
hdf5iotest@spack%gcc@8.3.0 arch=linux-debian10-skylake
^autoconf@2.69%gcc@8.3.0 arch=linux-debian10-skylake
^m4@1.4.18%gcc@8.3.0+sigsegv patches=3877ab548f88597ab2327a2230ee048d2d07ace1062efe81fc92e91b7f39cd00,fc9b61654a3ba1a8d6cd78ce087e7c96366c290bc8d2c299f09828d793b853c8 arch=linux-debian10-skylake
^libsigsegv@2.12%gcc@8.3.0 arch=linux-debian10-skylake
^perl@5.32.0%gcc@8.3.0+cpanm+shared+threads arch=linux-debian10-skylake
^berkeley-db@18.1.40%gcc@8.3.0 arch=linux-debian10-skylake
^gdbm@1.18.1%gcc@8.3.0 arch=linux-debian10-skylake
^readline@8.0%gcc@8.3.0 arch=linux-debian10-skylake
^ncurses@6.2%gcc@8.3.0~symlinks+termlib arch=linux-debian10-skylake
^pkgconf@1.7.3%gcc@8.3.0 arch=linux-debian10-skylake
^automake@1.16.3%gcc@8.3.0 arch=linux-debian10-skylake
^hdf5@1.10.7%gcc@8.3.0~cxx~debug~fortran~hl~java+mpi+pic+shared~szip~threadsafe api=none arch=linux-debian10-skylake
^openmpi@4.0.5%gcc@8.3.0~atomics~cuda~cxx~cxx_exceptions+gpfs~java~legacylaunchers~lustre~memchecker~pmi~singularity~sqlite3+static~thread_multiple+vt+wrapper-rpath fabrics=none schedulers=none arch=linux-debian10-skylake
^hwloc@2.2.0%gcc@8.3.0~cairo~cuda~gl~libudev+libxml2~netloc~nvml+pci+shared arch=linux-debian10-skylake
^libpciaccess@0.16%gcc@8.3.0 arch=linux-debian10-skylake
^libtool@2.4.6%gcc@8.3.0 arch=linux-debian10-skylake
^util-macros@1.19.1%gcc@8.3.0 arch=linux-debian10-skylake
^libxml2@2.9.10%gcc@8.3.0~python arch=linux-debian10-skylake
^libiconv@1.16%gcc@8.3.0 arch=linux-debian10-skylake
^xz@5.2.5%gcc@8.3.0~pic arch=linux-debian10-skylake
^zlib@1.2.11%gcc@8.3.0+optimize+pic+shared arch=linux-debian10-skylake
^numactl@2.0.14%gcc@8.3.0 patches=4e1d78cbbb85de625bad28705e748856033eaafab92a66dffd383a3d7e00cc94 arch=linux-debian10-skylake
#+end_example
** Sample CSV Output
The CSV output looks like this. It's then easy to concatenate several of these
(after stripping out the header), and load them into [[https://pandas.pydata.org/][pandas]] or [[https://www.r-project.org/][R]].
#+begin_example
steps,arrays,rows,cols,scaling,proc-rows,proc-cols,slowdim,rank,version,alignment-increment,alignment-threshold,layout,fill,fmt,io,wall [s],fsize [B],write-phase-min [s],write-phase-max [s],creat-min [s],creat-max [s],write-min [s],write-max [s],read-phase-min [s],read-phase-max [s],read-min [s],read-max [s]
20,500,100,200,weak,1,1,step,2,"1.8.22",1,0,contiguous,true,earliest,core,15.37,1689710848,14.10,14.10,7.25,7.25,0.77,0.77,1.27,1.27,0.27,0.27
20,500,100,200,weak,1,1,step,2,"1.8.22",1,0,contiguous,true,latest,core,20.70,1682116757,19.53,19.53,12.08,12.08,0.78,0.78,1.17,1.17,0.28,0.28
...
#+end_example
*** Metrics
All timings are obtained via =MPI_Wtime=. For some metrics, we record minima and
maxima across MPI ranks. The columns =steps= through =mpi-io= are just
reiterations of the configuration parameters. The remaining columns are as
follows:
- =wall [s]= :: Wall time in seconds.
- =fsize [B]= :: The HDF5 output file size in bytes
- =write-phase-min [s],write-phase-max [s]= :: The fastest and slowest
cumulative write phase time in seconds. This includes the time for file and
dataset creation(s).
- =creat-min [s],creat-max [s]= :: The fastest and slowest time spent in
=H5Fcreate=, =H5Fclose=, and =H5Dcreate= in seconds
- =write-min [s],write-max [s]= :: The fastest and slowest cumulative =H5Dwrite=
time in seconds
- =read-phase-min [s],read-phase-max [s]= :: The fastest and slowest cumulative
read phase time in seconds. This includes the times for opening and closing
the HDF5 file and for creating dataset selections.
- =read-min [s],read-max [s]= :: The fastest and slowest cumulative =H5Dread=
time in seconds