Error: Your Requested widget " wp-github-commits-5" is not in the widget list.
  • [do_widget_area sidebar-1]
    • [do_widget_area sidebar-2]
      • [do_widget id="tag_cloud-2"]
    • [do_widget_area widgets_for_shortcodes]
      • [do_widget id="wp-github-commits-14"]
      • [do_widget id="wp-github-commits-17"]
      • [do_widget id="wp-github-commits-15"]
      • [do_widget id="wp-github-commits-25"]
      • [do_widget id="wp-github-commits-3"]
      • [do_widget id="wp-github-commits-13"]
      • [do_widget id="wp-github-commits-20"]
      • [do_widget id="wp-github-commits-16"]
      • [do_widget id="wp-github-commits-2"]
      • [do_widget id="wp-github-commits-24"]
      • [do_widget id="wp-github-commits-23"]
      • [do_widget id="wp-github-commits-21"]
      • [do_widget id="wp-github-commits-5"]
      • [do_widget id="wp-github-commits-6"]
      • [do_widget id="wp-github-commits-26"]
      • [do_widget id="wp-github-commits-9"]
      • [do_widget id="wp-github-commits-22"]
      • [do_widget id="wp-github-commits-19"]
      • [do_widget id="wp-github-commits-8"]
      • [do_widget id="wp-github-commits-7"]
      • [do_widget id="wp-github-commits-18"]
      • [do_widget id="wp-github-commits-11"]
      • [do_widget id="wp-github-commits-4"]
      • [do_widget id="wp-github-commits-12"]
      • [do_widget id="wp-github-commits-27"]
    • [do_widget_area wp_inactive_widgets]
      • [do_widget id="recent-posts-2"]
      • [do_widget id="recent-comments-2"]

    Introduction

    Clustering is a very important technology to ensure application availability and performance. Accurate monitoring of your clusters is crucial to keep your applications stable. Not monitoring is not an option, as you might never know when a failover has taken place and waste valuable time looking in the wrong places for solutions to your problems. There are different options and levels of monitoring you can choose for monitoring Microsoft Windows failover clusters.

    Preferred Node Option

    One of the easiest is checking each cluster service if it is still running on it’s preferred node. The plugin from Nedstars (link) to check if MS cluster services are running on their preferred node only works for Windows 2008 R2 or later versions. Apart from that I noticed that there seemed to be a bug in Nedstars script. If a MS cluster contained more then one cluster service, it seemed to only check one of them. I did not investigate this further, so forgive me if I’m wrong here. As we are still using multiple Windows 2003 failover clusters,  I decided to write a completely new plugin that uses WMI to get information about failover cluster services on Microsoft Windows 2003 failover clusters. Windows Server 2008 R2 Failover Cluster Preferred Node: preferred node on 2008 Windows Server 2003 R2 Failover Cluster Preferred Node: check_ms_cluster_preferrede_node_2003R2 The plugin starts by checking the version of the Windows server where it is running on with WMI. If the version is lesser then 6.1 (which is the version number of Windows Server 2008 R2), the script will continue to use WMI to get all cluster services and then check each cluster service individually to see where it is running and compare that with it’s preferred node. If the OS version is not lesser then 6.1, the plugin will import the module ‘failoverclusters’ and start using module commands instead of WMI. I did not check if the script works on a cluster with more then two nodes, as we don’t have clusters with three or more nodes. If anyone could test this for me and let me know… 

    How to use the check_ms_cluster_preferred_node?

    1. Copy the ‘check_ms_cluster_preferred_node.ps1’ Powershell script to the NSClient++ scripts folder, preferably in a sub-folder ‘Powershell’ on all the failover cluster nodes you wish to monitor.
    2. In the nsclient.ini configuration file, define the external command like this (and restart the NSClient++ service (nscp) afterwards):
    3. Make a command in Nagios like this:
    4. Configure your service in Nagios. Use the command you previously made. A healthy check in Nagios XI would look like this:check_ms_cluster_preferred_node

    Other Microsoft Failover Cluster monitoring options?

    So is monitoring the preferred node the best way to monitor clusters. Definitely not. A cluster might have no or multiple preferred nodes for some reason. If you happen to own some MS clusters for critical applications, you will most likely want to be alerted for more issues then only a cluster service failover. I’ve seen multiple clusters with issues that did not consist of a failover. Instead one or more cluster service just went offline and online on the same node. In order to monitor this, I wrote another Powershell script that checks for certain event id’s in the Windows application eventlog and alerts when these events contain information about failing cluster services.

    Willem D'Haese
    Expert Monitoring at Digipolis
    Expert Monitoring with a demonstrated history of working in the information technology and services industry. Strong ICT skills such as monitoring, virtualization, automation.

    Leave a Reply

    Your email address will not be published. Required fields are marked *

    You may use these HTML tags and attributes: <a href="" title=""> <abbr title=""> <acronym title=""> <b> <blockquote cite=""> <cite> <code class="" title="" data-url=""> <del datetime=""> <em> <i> <q cite=""> <s> <strike> <strong> <pre class="" title="" data-url=""> <span class="" title="" data-url="">