https://github.com/swisscom/check_ilorest
check_ilorest is a monitoring plugin to check the hardware health on HPE servers.
https://github.com/swisscom/check_ilorest
hardware hardware-health hp hpe ilo ilom ilorest monitoring monitoring-plugins nagios-plugins observability
Last synced: 16 days ago
JSON representation
check_ilorest is a monitoring plugin to check the hardware health on HPE servers.
- Host: GitHub
- URL: https://github.com/swisscom/check_ilorest
- Owner: swisscom
- License: apache-2.0
- Created: 2025-09-25T11:12:57.000Z (9 months ago)
- Default Branch: main
- Last Pushed: 2025-11-19T13:20:28.000Z (7 months ago)
- Last Synced: 2026-03-06T12:52:46.869Z (3 months ago)
- Topics: hardware, hardware-health, hp, hpe, ilo, ilom, ilorest, monitoring, monitoring-plugins, nagios-plugins, observability
- Language: Python
- Homepage:
- Size: 45.9 KB
- Stars: 4
- Watchers: 1
- Forks: 2
- Open Issues: 0
-
Metadata Files:
- Readme: README.md
- Contributing: CONTRIBUTING.md
- License: LICENSE
- Code of conduct: CODE_OF_CONDUCT.md
Awesome Lists containing this project
README
# check_ilorest
This is the public repository for the check_ilorest monitoring plugin.
check_ilorest is a monitoring plugin to monitor the hardware health on HPE servers.
The plugin uses the `ilorest` command in the background, to query the hardware status from ILO (on HPE servers).
Authors:
- Swisscom (Schweiz) AG
- Claudio Kuenzler
This monitoring plugin was developed at and for Swisscom (Schweiz) AG and was published as open source software in 2025.
# Usage
## Requirements
As mentioned, `check_ilorest` uses the `ilorest` command. This command can be installed as RPM package, as PyPI package (using `pip3`) or built from source. Have a look at the official [ilorest repository](https://github.com/HewlettPackard/python-redfish-utility). The [ilorest user guide](https://servermanagementportal.ext.hpe.com/docs/redfishclients/ilorest-userguide/installation/) lists different installation method.
Besides `ilorest` available as command, the following requirements are needed:
- python3
- Must be run locally on a HPE server (remote ILO target is TODO)
- Must be run as root (`ilorest` requirement)
- TMPDIR (default: `/tmp`) must be mounted with `exec` option => alternatively use `-t` parameter (see `--help`)
## Monitoring Plugin
The plugin was developped for the purpose to easily integrate into classic system monitoring software, such as Icinga or Nagios.
```
root@linux ~ # /usr/lib64/nagios/plugins/check_ilorest.py
ILOREST HARDWARE OK: Hardware is healthy. Server: HPE ProLiant DL380 Gen11, S/N: XXXXXXXXXX, System BIOS: U54 v1.40 (06/29/2023)
```
The exit codes of `check_ilorest` are based on other monitoring plugins and follow the monitoring plugin development guidelines.
- Exit Code 0: OK, no hardware problems detected
- Exit Code 1: WARNING, degraded hardware detected
- Exit Code 2: CRITICAL, hardware failure detected
- Exit Code 3: UNKNOWN, there was a problem executing `check_ilorest` correctly
## Prometheus readiness
`check_ilorest` was developed for both classic and metrics-based monitoring tools. The plugin can be executed with the optional parameter `-o` to choose a different output format. By using `-o prometheus` the plugin outputs the information in Prometheus metrics exposition format. It could be used in combination with the Prometheus [Script Exporter](https://github.com/ricoberger/script_exporter) to execute the plugin and scrape the metrics through Script Exporter's API.
```
root@linux ~ # /usr/lib64/nagios/plugins/check_ilorest.py -o prometheus
#HELP ilorest_hardware_health Overall health status of server. 0=OK. 1=Warning. 2=Critical
#TYPE ilorest_hardware_health gauge
ilorest_hardware_health 0
#HELP ilorest_hardware_health_biosorhardwarehealth Health status of hardware component BiosOrHardwareHealth
#TYPE ilorest_hardware_health_biosorhardwarehealth gauge
ilorest_hardware_health_biosorhardwarehealth 0
#HELP ilorest_hardware_health_fans Health status of hardware component Fans
#TYPE ilorest_hardware_health_fans gauge
ilorest_hardware_health_fans 0
#HELP ilorest_hardware_health_memory Health status of hardware component Memory
#TYPE ilorest_hardware_health_memory gauge
ilorest_hardware_health_memory 0
#HELP ilorest_hardware_health_network Health status of hardware component Network
#TYPE ilorest_hardware_health_network gauge
ilorest_hardware_health_network 0
#HELP ilorest_hardware_health_powersupplies Health status of hardware component PowerSupplies
#TYPE ilorest_hardware_health_powersupplies gauge
ilorest_hardware_health_powersupplies 0
#HELP ilorest_hardware_health_processors Health status of hardware component Processors
#TYPE ilorest_hardware_health_processors gauge
ilorest_hardware_health_processors 0
#HELP ilorest_hardware_health_smartstoragebattery Health status of hardware component SmartStorageBattery
#TYPE ilorest_hardware_health_smartstoragebattery gauge
ilorest_hardware_health_smartstoragebattery 0
#HELP ilorest_hardware_health_storage Health status of hardware component Storage
#TYPE ilorest_hardware_health_storage gauge
ilorest_hardware_health_storage 0
#HELP ilorest_hardware_health_temperatures Health status of hardware component Temperatures
#TYPE ilorest_hardware_health_temperatures gauge
ilorest_hardware_health_temperatures 0
```
## Compatibility with ProLiant server generations
The `ilorest` command, used in the background of the plugin, queries the Redfish API integrated into the ILO controller (available since ILO4 v2.10+, ILO5 v1.20+, ILO6 v1.05+). Newer server generations have a better and more detailled implementation inside ILO, allowing to query more information and more hardware elements.
|HW generation|supported|notes|
|---------------|---------|-----|
|Gen9 | :white_check_mark: | limited number of hardware elements, shows overall health state |
|Gen10| :white_check_mark: | |
|Gen11| :white_check_mark: | |
## Compatibility with ilorest releases
The `check_ilorest` monitoring plugin has been successfully tested on the following `ilorest` releases. This list is non-conclusive. Most `ilorest` releases should actually work.
|ilorest release|supported|notes|
|---------------|---------|-----|
|4.6.0 | :white_check_mark: | |
|6.2.0 | :x: | `--nocache` not working |
|6.3.0 | :white_check_mark: | Slower than 4.6.x release |
## ToDos
- Support remote ILO (using `-a` and credentials)
- Support ignore list to ignore specific discovered hardware elements