{"id":19947979,"url":"https://github.com/roseswe/clustermon","last_synced_at":"2026-05-15T07:31:37.097Z","repository":{"id":40641288,"uuid":"465338791","full_name":"roseswe/ClusterMon","owner":"roseswe","description":"OCF ClusterMon helper script for cluster monitoring","archived":false,"fork":false,"pushed_at":"2026-04-15T15:30:09.000Z","size":35,"stargazers_count":1,"open_issues_count":2,"forks_count":0,"subscribers_count":1,"default_branch":"master","last_synced_at":"2026-04-15T17:28:10.067Z","etag":null,"topics":["cluster","ha","monitoring-tool","pacemaker","sles15"],"latest_commit_sha":null,"homepage":"","language":"Shell","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":null,"status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/roseswe.png","metadata":{"files":{"readme":"README.md","changelog":"ChangeLog.txt","contributing":null,"funding":null,"license":null,"code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null,"zenodo":null,"notice":null,"maintainers":null,"copyright":null,"agents":null,"dco":null,"cla":null}},"created_at":"2022-03-02T14:26:09.000Z","updated_at":"2026-04-15T15:30:14.000Z","dependencies_parsed_at":"2024-11-13T00:38:23.267Z","dependency_job_id":"3a030a15-8182-4683-a106-dc0eb4f82ee5","html_url":"https://github.com/roseswe/ClusterMon","commit_stats":null,"previous_names":[],"tags_count":0,"template":false,"template_full_name":null,"purl":"pkg:github/roseswe/ClusterMon","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/roseswe%2FClusterMon","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/roseswe%2FClusterMon/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/roseswe%2FClusterMon/releases","manifests_url":"https://repos.ecosyste.ms/a
pi/v1/hosts/GitHub/repositories/roseswe%2FClusterMon/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/roseswe","download_url":"https://codeload.github.com/roseswe/ClusterMon/tar.gz/refs/heads/master","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/roseswe%2FClusterMon/sbom","scorecard":null,"host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":286080680,"owners_count":33057830,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2026-05-13T13:14:54.681Z","status":"online","status_checked_at":"2026-05-15T02:00:06.351Z","response_time":103,"last_error":null,"robots_txt_status":"success","robots_txt_updated_at":"2025-07-24T06:49:26.215Z","robots_txt_url":"https://github.com/robots.txt","online":true,"can_crawl_api":true,"host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["cluster","ha","monitoring-tool","pacemaker","sles15"],"created_at":"2024-11-13T00:38:13.765Z","updated_at":"2026-05-15T07:31:37.091Z","avatar_url":"https://github.com/roseswe.png","language":"Shell","funding_links":[],"categories":[],"sub_categories":[],"readme":"# ReadMe for ClusterMon Wrapper for Generic Pacemaker Cluster Alerting\n\nThese files are hosted at \u003chttps://github.com/roseswe/ClusterMon\u003e\n\nThis project is a set of (shell script) wrappers for the ClusterMon resource agent. They address different use cases and are meant to be used as blueprints.\n\nFrom the man page:  ocf_heartbeat_ClusterMon (7) - Runs crm_mon in the background, recording the cluster status (events) to an HTML file.\n\n## Technical Details\n\nA pacemaker cluster is an event-driven system.\n
In this context, an event is, for example, a resource failure or a configuration change (the list is not exhaustive).\n\nThe ClusterMon resource agent parameters and further details can be found in the man page ocf_heartbeat_ClusterMon (7) [older]/ocf_pacemaker_ClusterMon (7), at \u003chttp://linux-ha.org/doc/man-pages/re-ra-ClusterMon.html\u003e, on GitHub at \u003chttps://github.com/ClusterLabs/resource-agents/blob/main/heartbeat/ClusterMon\u003e, or with crm:\n\n````\n# crm  ra info ClusterMon\nRuns crm_mon in the background, recording the cluster status to an HTML file (ocf:heartbeat:ClusterMon)\n\nThis is a ClusterMon Resource Agent.\nIt outputs current cluster status to the html.\n\nParameters (*: required, []: default):\n\nuser (string, [root]):\n    The user we want to run crm_mon as\n\nupdate (integer, [15000]): Update interval\n    How frequently should we update the cluster status\n\nextra_options (string): Extra options\n    Additional options to pass to crm_mon.  Eg. -n -r\n\npidfile (string, [/run/resource-agents/ClusterMon_ClusterMon.pid]): PID file\n    PID file location to ensure only one instance is running\n\nhtmlfile (string, [/run/resource-agents/ClusterMon_ClusterMon.html]): HTML output\n    Location to write HTML output to.\n\nOperations' defaults (advisory minimum):\n\n    start         timeout=20s\n    stop          timeout=20s\n    monitor       depth=0 timeout=20s interval=10s\n````\n\nThe `ocf:heartbeat:ClusterMon` resource monitors the cluster status and triggers an alert on each cluster event. It runs crm_mon in the background at a configurable interval and uses crm_mon's capabilities to send emails (SMTP) or SNMP traps, or to execute an external program, via the extra_options parameter. crm_mon is a binary that provides a summary of the cluster’s current state and has options to send an email (SMTP) or a trap (SNMP) to a chosen recipient on any transition.\n
Therefore you need a crm_mon binary that supports sending SNMP traps or SMTP mail. If yours does not, see the workaround script `crm_mail_agent.sh`!\n\nOn SUSE SLES15, `crm_mon` comes with the package pacemaker-cli. NOTE: SUSE changed from pacemaker-cli-1 to pacemaker-cli-2 between SLE 12 and SLE 15.\n\n## ClusterMon Easy Logger Script\n\n\u003e[!NOTE]\n\u003eThis chapter is written for the SUSE HA solution, which is based on pacemaker. It was tested with SLES12 and SLES15.\n\nThis is a very primitive script that simply writes cluster changes to syslog (via logger) and to a local text file. The pacemaker primitive and clone definitions to use this script:\n\n    primitive rsc_ClusterMonEL ocf:pacemaker:ClusterMon \\\n            params user=root update=10000 htmlfile=\"/tmp/cmeasylogger.html\" extra_options=\"-E /root/bin/cmeasylogger.sh -W\" \\\n            op monitor on-fail=restart interval=60\n    clone ClusterMon-clone rsc_ClusterMonEL \\\n            meta target-role=Started\n\nIn this ClusterMon resource configuration, the parameter update=10000 specifies the update interval in milliseconds, so the cluster status is refreshed every 10 seconds. Adjust it to your needs.\n\nCopy the cmeasylogger.sh file to all nodes and make it executable (chmod +x) :-)\n\nIf you want to play around and test the provided scripts, you can create a dummy resource.\n\n
Example:\n\n    # crm configure show rsc_Dummy01\n    primitive rsc_Dummy01 Dummy \\\n    meta target-role=Started \\\n    op start timeout=20s interval=0s \\\n    op stop timeout=20s interval=0s \\\n    op monitor timeout=20s interval=10s\n\n\nExample output of created logfile (/root/cm_easylogger.txt):\n\n    # tail cm_easylogger.txt\n    ClusterMon-Easy:::20220629-170649,sap12ha1,rsc_ClusterMon,monitor,ok,0,7,0,:::\n    ClusterMon-Easy:::20220629-170649,sap12ha2,admin-ip,stop,OK,0,0,0,:::\n    ClusterMon-Easy:::20220629-170649,sap12ha2,mariadb-ip,stop,OK,0,0,0,:::\n    ClusterMon-Easy:::20220629-170649,sap12ha1,rsc_ClusterMon,monitor,OK,0,0,0,:::\n    ClusterMon-Easy:::20220629-170649,sap12ha1,admin-ip,start,unknown error,1,0,0,:::\n    ClusterMon-Easy:::20220629-170649,sap12ha1,mariadb-ip,start,unknown error,1,0,0,:::\n    ClusterMon-Easy:::20220629-170649,sap12ha1,admin-ip,start,unknown error,1,0,0,:::\n    ClusterMon-Easy:::20220629-170649,sap12ha1,mariadb-ip,start,unknown error,1,0,0,:::\n    ClusterMon-Easy:::20220629-170650,sap12ha1,admin-ip,stop,OK,0,0,0,:::\n    ClusterMon-Easy:::20220629-170650,sap12ha1,mariadb-ip,stop,OK,0,0,0,:::\n\nOther example on SLES15SP5 KVM Cluster\n\n    kvm02cs:~ # cat cm_easylogger.txt\n    ClusterMon-Easy:::20250224-170300,kvm01cs,rsc_ClusterMonEL,monitor,OK,0,0,0,,0,4778:::\n    ClusterMon-Easy:::20250224-170300,kvm02cs,rsc_ClusterMonEL,monitor,OK,0,0,0,,0,4775:::\n    ClusterMon-Easy:::20250224-170639,kvm01cs,rsc_ClusterMonEL,stop,pending,193,0,-1,,0,6317:::\n    ClusterMon-Easy:::20250224-170639,kvm01cs,rsc_ClusterMonEL,stop,OK,0,0,0,,0,6325:::\n    ClusterMon-Easy:::20250224-170716,kvm01cs,rsc_ClusterMonEL,start,pending,193,0,-1,,0,6654:::\n    ClusterMon-Easy:::20250224-170717,kvm01cs,rsc_ClusterMonEL,start,OK,0,0,0,,0,6664:::\n    ClusterMon-Easy:::20250224-170717,kvm01cs,rsc_ClusterMonEL,monitor,pending,193,0,-1,,0,6666:::\n    
ClusterMon-Easy:::20250224-170717,kvm01cs,rsc_ClusterMonEL,monitor,OK,0,0,0,,0,6680:::\n\nWe move the rsc_Dummy01 resource\n\n    ClusterMon-Easy:::20250224-170734,kvm02cs,rsc_Dummy01,monitor,pending,193,0,-1,,0,6862:::\n    ClusterMon-Easy:::20250224-170734,kvm02cs,rsc_Dummy01,monitor,OK,0,0,0,,0,6877:::\n    ClusterMon-Easy:::20250224-170734,kvm02cs,rsc_Dummy01,start,OK,0,0,0,,0,6860:::\n    ClusterMon-Easy:::20250224-170734,kvm01cs,rsc_Dummy01,stop,pending,193,0,-1,,0,6830:::\n    ClusterMon-Easy:::20250224-170734,kvm02cs,rsc_Dummy01,start,pending,193,0,-1,,0,6851:::\n    ClusterMon-Easy:::20250224-170734,kvm01cs,rsc_Dummy01,stop,OK,0,0,0,,0,6841:::\n\nWe stopped and started the rsc_vd_cirros resource\n\n    ClusterMon-Easy:::20250224-171211,kvm01cs,rsc_vd_cirros,monitor,not running,7,0,0,,1,9077:::\n    ClusterMon-Easy:::20250224-171211,kvm01cs,rsc_vd_cirros,stop,pending,193,0,-1,,1,9082:::\n    ClusterMon-Easy:::20250224-171211,kvm01cs,rsc_vd_cirros,stop,OK,0,0,0,,1,9087:::\n    ClusterMon-Easy:::20250224-171211,kvm01cs,rsc_vd_cirros,start,pending,193,0,-1,,1,9092:::\n    ClusterMon-Easy:::20250224-171213,kvm01cs,rsc_vd_cirros,start,OK,0,0,0,,1,9215:::\n    ClusterMon-Easy:::20250224-171213,kvm01cs,rsc_vd_cirros,monitor,pending,193,0,-1,,1,9217:::\n    ClusterMon-Easy:::20250224-171213,kvm01cs,rsc_vd_cirros,monitor,OK,0,0,0,,1,9245:::\n\nNote: crm node fence kvm01cs\n\n    ClusterMon-Easy:::20250224-171421,kvm01cs,rsc_vd_cirros,st_notify_fence,Operation reboot of kvm01cs by kvm02cs for pacemaker-controld.8623@kvm02cs: OK (ref=6259287c-e825-493c-9f0a-527830618b15),0,0,0,,0,10203:::\n    ClusterMon-Easy:::20250224-171421,kvm02cs,rsc_vd_cirros,start,pending,193,0,-1,,0,10220:::\n    ClusterMon-Easy:::20250224-171421,kvm02cs,rsc_vd_cirros,start,OK,0,0,0,,0,10263:::\n    ClusterMon-Easy:::20250224-171421,kvm02cs,rsc_vd_cirros,monitor,pending,193,0,-1,,0,10264:::\n    ClusterMon-Easy:::20250224-171421,kvm02cs,rsc_vd_cirros,monitor,OK,0,0,0,,0,10395:::\n    
ClusterMon-Easy:::20250224-171444,kvm02cs,sbd-fencing,monitor,OK,0,0,0,,0,10676:::\n    ClusterMon-Easy:::20250224-171444,kvm02cs,rsc_vd_cirros,monitor,OK,0,0,0,,0,10682:::\n    ClusterMon-Easy:::20250224-171444,kvm02cs,rsc_ClusterMonEL,monitor,OK,0,0,0,,0,10672:::\n    ClusterMon-Easy:::20250224-171444,kvm02cs,rsc_vd_cirros,start,OK,0,0,0,,0,10679:::\n    ClusterMon-Easy:::20250224-171444,kvm02cs,rsc_cl_NodeUtil,start,OK,0,0,0,,0,10673:::\n    ClusterMon-Easy:::20250224-171444,kvm02cs,sbd-fencing,start,OK,0,0,0,,0,10675:::\n    ClusterMon-Easy:::20250224-171444,kvm02cs,rsc_ClusterMonEL,start,OK,0,0,0,,0,10671:::\n    ClusterMon-Easy:::20250224-171444,kvm01cs,sbd-fencing,monitor,pending,193,7,-1,,0,10727:::\n    ClusterMon-Easy:::20250224-171444,kvm01cs,rsc_vd_cirros,monitor,pending,193,7,-1,,0,10739:::\n    ClusterMon-Easy:::20250224-171444,kvm01cs,rsc_cl_NodeUtil,monitor,pending,193,7,-1,,0,10740:::\n    ClusterMon-Easy:::20250224-171444,kvm02cs,rsc_vd_sles15,monitor,OK,0,0,0,,0,10684:::\n    ClusterMon-Easy:::20250224-171444,kvm02cs,rsc_Dummy01,start,OK,0,0,0,,0,10669:::\n    ClusterMon-Easy:::20250224-171444,kvm02cs,rsc_vd_sles15,start,OK,0,0,0,,0,10683:::\n    ClusterMon-Easy:::20250224-171444,kvm01cs,rsc_ClusterMonEL,monitor,pending,193,7,-1,,0,10737:::\n    ClusterMon-Easy:::20250224-171444,kvm01cs,rsc_cl_NodeUtil,start,pending,193,0,-1,,0,10751:::\n    ClusterMon-Easy:::20250224-171444,kvm01cs,rsc_cl_NodeUtil,monitor,pending,193,0,-1,,0,10765:::\n    ClusterMon-Easy:::20250224-171444,kvm01cs,rsc_ClusterMonEL,start,pending,193,0,-1,,0,10752:::\n    ClusterMon-Easy:::20250224-171444,kvm02cs,rsc_Dummy01,monitor,OK,0,0,0,,0,10670:::\n    ClusterMon-Easy:::20250224-171444,kvm01cs,rsc_ClusterMonEL,monitor,OK,0,0,0,,0,10791:::\n    ClusterMon-Easy:::20250224-171444,kvm02cs,rsc_cl_NodeUtil,monitor,OK,0,0,0,,0,10674:::\n    ClusterMon-Easy:::20250224-171444,kvm01cs,rsc_vd_sles15,monitor,pending,193,7,-1,,0,10741:::\n    
ClusterMon-Easy:::20250224-171444,kvm01cs,rsc_Dummy01,monitor,pending,193,7,-1,,0,10729:::\n    ClusterMon-Easy:::20250224-171444,kvm01cs,rsc_cl_NodeUtil,start,OK,0,0,0,,0,10764:::\n    ClusterMon-Easy:::20250224-171444,kvm01cs,rsc_ClusterMonEL,monitor,pending,193,0,-1,,0,10785:::\n    ClusterMon-Easy:::20250224-171444,kvm01cs,rsc_ClusterMonEL,start,OK,0,0,0,,0,10784:::\n    ClusterMon-Easy:::20250224-171444,kvm01cs,rsc_cl_NodeUtil,monitor,OK,0,0,0,,0,10782:::\n\n\n### Debugging ClusterMon and the Easy Logger Script by hand\n\n    crm_mon -d -i 5 --output-as=html --output-to=/tmp/cmel.html -E \"/root/bin/cmeasylogger.sh\" --watch-fencing\n\n\u003e[!NOTE]\n\u003e-W, --watch-fencing - Listen for fencing events. For use with --external-agent.\n\n## No more mail support in ClusterMon! New helper script for sending mail\n\nIt seems that, as of SLES12, the mail-to option has been dropped from crm_mon. So we need a workaround that uses a small helper script (and filter script) :-(\n\n    # rpm -q --changelog pacemaker-cli | grep -m1 SMTP\n        - tools: remove crm_mon SMTP support (fate#324508)\n\n\n### SLES12+SLES15 and a helper script\n\nClusterMon must run on all cluster nodes, therefore we define it as a clone resource.\n\ncrm_mail_agent.sh - example CIB (must be adapted to your environment):\n\n    primitive rsc_ClusterMon ocf:pacemaker:ClusterMon \\\n      params user=root update=10000 pidfile=\"/crm_scripts/crm_monitor/crmMon.pid\" \\\n      htmlfile=\"/crm_scripts/crm_monitor/crmHtml.html\"  \\\n      extra_options=\"-E /crm_scripts/crm_monitor/crm_mail_agent.sh\" \\\n      op monitor on-fail=restart interval=60\n\n    clone ClusterMon-clone rsc_ClusterMon \\\n      meta target-role=Started\n\n\u003e[!NOTE]\n\u003eChanging the shell script (e.g. crm_mail_agent.sh): It will be executed with your changes at the next monitor interval.\n\u003e
So you can edit it on the fly.\n\n\n### SLES11 and maybe other distros, Get Mail\n\n(Feedback highly appreciated)\n\n    primitive rsc_ClusterMon ocf:pacemaker:ClusterMon \\\n        params user=root update=30 extra_options=\"--mail-to=root\" \\\n        op monitor on-fail=restart interval=60\n\n    clone ClusterMon-clone rsc_ClusterMon \\\n        meta target-role=Started\n\n### RHEL 7.9 Example\n\nHere we write a status file to /var/www/html/status.html. Semanage/chown/chmod the HTML file after creation of the resource.\n\n```\npcs resource create rsc_mon_status ocf:pacemaker:ClusterMon \\\n    htmlfile=\"/var/www/html/status.html\" \\\n    update=\"10000\" \\\n    --group g_app01\n```\n\n## Available Variables for crm_mon\n\nAt least valid for pacemaker 1.x and 2.0\n\n    Environment Variable    Description\n    CRM_notify_desc         The textual output relevant error code of the operation (if any) that caused the status change\n    CRM_notify_node         The node on which the status change happened\n    CRM_notify_rc           The return code of the operation\n    CRM_notify_recipient    The static external-recipient from the resource definition\n    CRM_notify_rsc          The name of the resource that changed the status\n    CRM_notify_status       The numerical representation of the status of the operation\n    CRM_notify_task         The operation that caused the status change\n    CRM_notify_target_rc    The expected return code of the operation\n\nSource:  \u003chttps://clusterlabs.org/pacemaker/doc/deprecated/en-US/Pacemaker/1.1/html/Pacemaker_Explained/s-notification-external.html\u003e  and \u003chttps://fossies.org/linux/pacemaker/lib/common/alerts.c\u003e\n\nTested with SLES12SP5, SLES15SP2, SLES15SP4, SLES15SP5\n\n    # crm_mon --version\n\n    # SLES15SP2\n    Pacemaker 2.0.4+20200616.2deceaa3a-3.15.1\n    Written by Andrew Beekhof\n    # SLES15SP4\n    Pacemaker 2.1.2+20211124.ada5c3b36-150400.2.43\n    Written by Andrew Beekhof\n    # 
SLES15SP5\n    Pacemaker 2.1.5+20221208.a3f44794f-150500.6.20.1\n    Written by Andrew Beekhof and the Pacemaker project contributors\n    # SLES15SP7\n    Pacemaker 2.1.10+20250718.fdf796ebc8-150700.3.3.1\n    Written by Andrew Beekhof and the Pacemaker project contributors\n\n    # RHEL/MLS 7.9\n    Pacemaker 1.1.23-1.el7_9.1\n    # RHEL/MLS 9.7\n    Pacemaker 2.1.10-1.1.el9_7\n\n## Too complex? Need something easy?\n\n### The mailto cluster script\n\nThe **mailto** script in the context of **SUSE Linux Enterprise Server (SLES) 15** refers to a specialized **Resource Agent** (RA) within the High Availability (HA) Extension. Its primary job is to provide automated email notifications to administrators whenever a cluster event occurs—most commonly a **resource takeover** or a failover between nodes. In SLES 15, this is typically managed through the Cluster Resource Management Shell (`crmsh`).\n\n#### Core Functionality\n\nThe `mailto` agent is a \"Basic\" cluster script used to:\n\n* **Notify Recipients:** Send an email to one or more addresses when a service moves from one node to another.\n* **Provide Status:** It helps admins track the health of the cluster without constantly monitoring the Hawk2 web interface or running `crm status`.\n\nYou can view the details and parameters of this script directly from the command line on your SLES node:\n\n```bash\n# View the script details\ncrm script show mailto\n\n```\n#### Configuration Example\n\nTo set up a mailto resource in your cluster, you would typically define a \"primitive\" resource. 
Below is a conceptual example of how to add it via the `crm` shell:\n\n```bash\ncrm configure primitive admin_email ocf:heartbeat:MailTo \\\n    params email=\"admin@example.com\" subject=\"Cluster Alert: Resource Failover\" \\\n    op monitor interval=120s\n```\n\n/*end*/\n\u003c!-- vim:set fileencoding=utf8 fileformat=unix filetype=gfm tabstop=2 expandtab:\n@(#)$Id: README.md,v 1.14 2026/04/16 13:13:49 ralph Exp $  --\u003e\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Froseswe%2Fclustermon","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Froseswe%2Fclustermon","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Froseswe%2Fclustermon/lists"}