
https://github.com/criteo/hwbench


# What is hwbench?
**hwbench** is a benchmark orchestrator to automate the low-level testing of servers.

## What makes hwbench different?
### Scripting language
hwbench embeds a simplified scripting language, heavily inspired by [fio](https://github.com/axboe/fio), that expands a short script file into a large list of individual tests.
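
To illustrate how a short script can fan out into many tests, here is a minimal sketch in Python. The INI layout, the key names (`engine`, `workers`, `runtime`) and the comma-separated value syntax are hypothetical and are not hwbench's actual syntax; only the expansion idea is shown.

```python
import configparser
import itertools

# Hypothetical job file: comma-separated values mean "one test per value".
SCRIPT = """
[cpu_job]
engine = stress-ng
workers = 1,2,4
runtime = 30,60
"""


def expand_jobs(text):
    """Expand every section into the cartesian product of its value lists."""
    parser = configparser.ConfigParser()
    parser.read_string(text)
    tests = []
    for section in parser.sections():
        keys = list(parser[section])
        # Split each option into its list of candidate values.
        choices = [[v.strip() for v in parser[section][k].split(",")] for k in keys]
        for combo in itertools.product(*choices):
            tests.append({"job": section, **dict(zip(keys, combo))})
    return tests


tests = expand_jobs(SCRIPT)
print(len(tests))  # 1 engine x 3 worker counts x 2 runtimes = 6 tests
```

Four lines of script yield six tests here, and each extra value list multiplies the count further.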

### Prepares the server before the benchmark
hwbench can apply some tuning automatically so that system settings stay identical over time and across reboots, which avoids many human mistakes.

### Collects server's context
At startup, hwbench collects as much of the server's context as possible, such as:
- BIOS configuration
- server properties (via DMI)
- kernel logs
- software versions
- list of hardware components (PCI, CPU, Storage, ...)
- ...

This context will be attached to the performance metrics for later analysis.
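
The idea of capturing context and attaching it to the metrics can be sketched as follows. This is an illustration only: the command list, the function names and the JSON layout are assumptions, not hwbench's real collectors or output format.

```python
import json
import subprocess

# Hypothetical mapping of context names to commands; hwbench's real
# collectors and their arguments may differ.
COMMANDS = {
    "dmi": ["dmidecode"],
    "pci": ["lspci", "-nn"],
    "kernel_logs": ["dmesg"],
}


def run_command(argv):
    """Run a command and return its stdout, or None if it cannot be started."""
    try:
        done = subprocess.run(argv, capture_output=True, text=True, check=False)
    except OSError:
        return None
    return done.stdout


def collect_context(commands, runner=run_command):
    """Return a dict mapping each context name to the command output."""
    return {name: runner(argv) for name, argv in commands.items()}


def attach_context(metrics, context):
    """Bundle performance metrics and context into one JSON document."""
    return json.dumps({"context": context, "metrics": metrics}, indent=2)
```

Calling `attach_context(metrics, collect_context(COMMANDS))` would produce a single document holding both; tools such as `dmidecode` usually need root privileges.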

### Can run any type of benchmark
hwbench uses *engines* to define how to execute a particular external application.
The current version of hwbench supports three engines:
- [stress-ng](https://github.com/ColinIanKing/stress-ng): a very popular low-level benchmarking tool that needs no introduction
- spike: a custom engine used to make fans spike, which is very useful for studying the cooling strategy of a server
- sleep: a plain sleep call used to observe how the system behaves when idle

Benchmark performance metrics are extracted and saved for later analysis.
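
A minimal sketch of what an engine abstraction could look like is shown below. The `Engine` and `SleepEngine` classes and their methods are hypothetical names chosen for illustration; they are not hwbench's actual classes.

```python
import subprocess
import sys
from abc import ABC, abstractmethod


class Engine(ABC):
    """Hypothetical engine interface: build a command, then parse its output."""

    @abstractmethod
    def command(self):
        """Return the argv list of the external application to run."""

    @abstractmethod
    def parse(self, stdout):
        """Turn the application's stdout into a dict of metrics."""

    def run(self):
        done = subprocess.run(self.command(), capture_output=True, text=True, check=True)
        return self.parse(done.stdout)


class SleepEngine(Engine):
    """Idle for a fixed duration; reports no performance metric."""

    def __init__(self, seconds):
        self.seconds = seconds

    def command(self):
        # Use the Python interpreter so the sketch does not depend on a `sleep` binary.
        return [sys.executable, "-c", f"import time; time.sleep({self.seconds})"]

    def parse(self, stdout):
        return {"slept_seconds": self.seconds}


print(SleepEngine(0).run())  # {'slept_seconds': 0}
```

With this split, supporting another external application only requires a new subclass that knows how to build the command line and parse the output.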

### Collects server's environment
If the server is equipped with a [BMC](https://en.wikipedia.org/wiki/Intelligent_Platform_Management_Interface#Baseboard_management_controller),
and only if the monitoring feature is enabled, hwbench will collect environmental metrics and associate them with the final results for later analysis.

This release supports Dell and HPE servers and collects:
- Thermal sensors
- Fan speed
- Power consumption metrics

This feature uses the [Redfish](https://www.dmtf.org/standards/redfish) protocol with both generic and OEM-specific endpoints.
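
As an illustration of the generic side, the sketch below flattens readings from payloads shaped like the standard Redfish `Thermal` and `Power` chassis resources. The sample values and the `extract_metrics` helper are made up for this example; hwbench's own queries, including the OEM-specific ones, are not shown and may differ.

```python
# Made-up sample shaped like the generic Redfish Thermal and Power resources
# (typically served under /redfish/v1/Chassis/<id>/Thermal and .../Power).
THERMAL = {
    "Temperatures": [{"Name": "CPU1 Temp", "ReadingCelsius": 54}],
    "Fans": [{"Name": "Fan1", "Reading": 7200, "ReadingUnits": "RPM"}],
}
POWER = {"PowerControl": [{"Name": "System Power Control", "PowerConsumedWatts": 312}]}


def extract_metrics(thermal, power):
    """Flatten thermal, fan and power readings into one dict of samples."""
    return {
        "thermal_celsius": {t["Name"]: t["ReadingCelsius"] for t in thermal.get("Temperatures", [])},
        "fans": {f["Name"]: (f["Reading"], f.get("ReadingUnits")) for f in thermal.get("Fans", [])},
        "power_watts": {p["Name"]: p["PowerConsumedWatts"] for p in power.get("PowerControl", [])},
    }


sample = extract_metrics(THERMAL, POWER)
print(sample["power_watts"])  # {'System Power Control': 312}
```

In a real setup the two payloads would be fetched over HTTPS from the BMC with its credentials, and sampled repeatedly during the benchmark.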

If the server is connected to a [PDU](https://en.wikipedia.org/wiki/Power_distribution_unit), and only if the monitoring feature is enabled,
hwbench can collect power metrics from it.

This release supports the following brands:
- Raritan

For more details and usage, see the specific [documentation](./documentation/monitoring.md).

# How can results be analyzed?
The **hwgraph** tool, bundled in the same repository, generates graphs from **hwbench** output files.
If a single output file is provided, **hwgraph** plots the following for each benchmark:
- performance metrics
- performance metrics per watt
- environmental metrics along the run:
  - fan speed
  - thermal sensors
  - power consumption
  - CPU frequency
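
The source does not state how the performance-per-watt figure is computed. A plausible reading, taken here as an assumption, is the benchmark score divided by the mean power drawn during the run:

```python
from statistics import mean


def performance_per_watt(performance, power_samples_watts):
    """Divide a benchmark score by the mean power drawn during the run."""
    if not power_samples_watts:
        raise ValueError("no power samples collected")
    return performance / mean(power_samples_watts)


# Made-up numbers: a score of 12000 with power sampled three times.
print(performance_per_watt(12000, [290, 300, 310]))  # 40.0
```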

If multiple output files are passed as arguments, and only if they were generated with the same script file, **hwgraph** compares the performance metrics of each benchmark across the files.

For more details, see the specific documentation.

# Examples
Running the **simple.conf** job:

`python3 -m hwbench.hwbench -j configs/simple.conf -m monitoring.cfg`

# Requirements
## Mandatory
- python >= 3.9
- [python dependencies](./requirements/base.in)
- turbostat >= 2022.04.16
- numactl
- dmidecode
- util-linux >= 2.32
- lspci
- rpm

## Optional
- ipmitool
- ilorest (for HPE servers)
- stress-ng >= 0.17.04
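
Before a first run, the presence of these tools can be checked with a few lines of Python. This is a sketch rather than part of hwbench: it assumes each requirement is available under the executable name listed above, and it does not verify the minimum versions or the util-linux package.

```python
import shutil
import sys

# Executable names taken from the requirement lists; version checks are not covered.
MANDATORY = ["turbostat", "numactl", "dmidecode", "lspci", "rpm"]
OPTIONAL = ["ipmitool", "ilorest", "stress-ng"]


def missing_tools(names, which=shutil.which):
    """Return the executables from `names` that are not found on PATH."""
    return [name for name in names if which(name) is None]


if sys.version_info < (3, 9):
    print("Python >= 3.9 is required")
print("missing mandatory tools:", missing_tools(MANDATORY))
print("missing optional tools:", missing_tools(OPTIONAL))
```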