Monitoring MS SharePoint Health

 

 


Error: Your Requested widget " wp-github-commits-19" is not in the widget list.
  • [do_widget_area sidebar-1]
    • [do_widget_area sidebar-2]
      • [do_widget id="tag_cloud-2"]
    • [do_widget_area widgets_for_shortcodes]
      • [do_widget id="wp-github-commits-14"]
      • [do_widget id="wp-github-commits-17"]
      • [do_widget id="wp-github-commits-15"]
      • [do_widget id="wp-github-commits-25"]
      • [do_widget id="wp-github-commits-3"]
      • [do_widget id="wp-github-commits-13"]
      • [do_widget id="wp-github-commits-20"]
      • [do_widget id="wp-github-commits-16"]
      • [do_widget id="wp-github-commits-2"]
      • [do_widget id="wp-github-commits-24"]
      • [do_widget id="wp-github-commits-23"]
      • [do_widget id="wp-github-commits-21"]
      • [do_widget id="wp-github-commits-5"]
      • [do_widget id="wp-github-commits-6"]
      • [do_widget id="wp-github-commits-26"]
      • [do_widget id="wp-github-commits-9"]
      • [do_widget id="wp-github-commits-22"]
      • [do_widget id="wp-github-commits-19"]
      • [do_widget id="wp-github-commits-8"]
      • [do_widget id="wp-github-commits-7"]
      • [do_widget id="wp-github-commits-18"]
      • [do_widget id="wp-github-commits-11"]
      • [do_widget id="wp-github-commits-4"]
      • [do_widget id="wp-github-commits-12"]
      • [do_widget id="wp-github-commits-27"]
    • [do_widget_area wp_inactive_widgets]
      • [do_widget id="recent-posts-2"]
      • [do_widget id="recent-comments-2"]

    Introduction

    SharePoint is a web application platform in the Microsoft Office server suite. Launched in 2001, SharePoint combines various functions which are traditionally separate applications: intranet, extranet, content management, document management, personal cloud, enterprise social networking, enterprise search, business intelligence, workflow management, web content management, and an enterprise application store.
    SharePoint Health Analyzer is a feature in Microsoft SharePoint Foundation 2010 that enables administrators to schedule regular, automatic checks for potential configuration, performance, and usage problems in the server farm. Any errors that SharePoint Health Analyzer finds are identified in status reports that are made available to farm administrators in Central Administration. Status reports explain each issue, list the servers where the problem exists, and outline the steps that an administrator can take to remedy the problem.
    SharePoint Health Analyzer monitors the farm by applying a set of health rules. A number of these rules ship with SharePoint Foundation. You can create and deploy additional rules by writing code that uses the SharePoint Foundation object model. When a health rule executes, SharePoint Health Analyzer creates a status report and adds it to the Health Analyzer Reports list in the Monitoring section of Central Administration.
    This plugin will create a PSObject for each item in the status report that has no ‘Success’ severity and return a critical state if any problems are found, together with information about each problem in the status report, such as the failed service and date modified. 

    SharePoint_2010

    How to use check_ms_sharepoint_health?

    1. Put the script in the NSClient++ scripts folder, preferably in a subfolder Powershell, enabling you to use this Reactor action to update your plugins folder without having to edit the script.
    2. In the nsclient.ini configuration file, define the script like this:
      check_ms_sharepoint health=cmd /c echo scripts/powershell/check_ms_sharepoint_health.ps1; exit $LastExitCode | powershell.exe -command –
    3. Make a command in Nagios like this:
      check_ms_sharepoint_health => $USER1$/check_nrpe -H $HOSTADDRESS$ -p 5666 -t 60 -c check_ms_sharepoint_health 
    4. Configure your service in Nagios, make use of the above created command. 

    Monitoring Microsoft Failover Cluster Preferred Node

     

     


    Error: Your Requested widget " wp-github-commits-5" is not in the widget list.
    • [do_widget_area sidebar-1]
      • [do_widget_area sidebar-2]
        • [do_widget id="tag_cloud-2"]
      • [do_widget_area widgets_for_shortcodes]
        • [do_widget id="wp-github-commits-14"]
        • [do_widget id="wp-github-commits-17"]
        • [do_widget id="wp-github-commits-15"]
        • [do_widget id="wp-github-commits-25"]
        • [do_widget id="wp-github-commits-3"]
        • [do_widget id="wp-github-commits-13"]
        • [do_widget id="wp-github-commits-20"]
        • [do_widget id="wp-github-commits-16"]
        • [do_widget id="wp-github-commits-2"]
        • [do_widget id="wp-github-commits-24"]
        • [do_widget id="wp-github-commits-23"]
        • [do_widget id="wp-github-commits-21"]
        • [do_widget id="wp-github-commits-5"]
        • [do_widget id="wp-github-commits-6"]
        • [do_widget id="wp-github-commits-26"]
        • [do_widget id="wp-github-commits-9"]
        • [do_widget id="wp-github-commits-22"]
        • [do_widget id="wp-github-commits-19"]
        • [do_widget id="wp-github-commits-8"]
        • [do_widget id="wp-github-commits-7"]
        • [do_widget id="wp-github-commits-18"]
        • [do_widget id="wp-github-commits-11"]
        • [do_widget id="wp-github-commits-4"]
        • [do_widget id="wp-github-commits-12"]
        • [do_widget id="wp-github-commits-27"]
      • [do_widget_area wp_inactive_widgets]
        • [do_widget id="recent-posts-2"]
        • [do_widget id="recent-comments-2"]

      Introduction

      Clustering is a very important technology to ensure application availability and performance. Accurate monitoring of your clusters is crucial to keep your applications stable. Not monitoring is not an option, as you might never know when a failover has taken place and waste valuable time looking in the wrong places for solutions to your problems. There are different options and levels of monitoring you can choose for monitoring Microsoft Windows failover clusters.

      Preferred Node Option

      One of the easiest is checking each cluster service if it is still running on it’s preferred node. The plugin from Nedstars (link) to check if MS cluster services are running on their preferred node only works for Windows 2008 R2 or later versions. Apart from that I noticed that there seemed to be a bug in Nedstars script. If a MS cluster contained more then one cluster service, it seemed to only check one of them. I did not investigate this further, so forgive me if I’m wrong here. As we are still using multiple Windows 2003 failover clusters,  I decided to write a completely new plugin that uses WMI to get information about failover cluster services on Microsoft Windows 2003 failover clusters. Windows Server 2008 R2 Failover Cluster Preferred Node: preferred node on 2008 Windows Server 2003 R2 Failover Cluster Preferred Node: check_ms_cluster_preferrede_node_2003R2 The plugin starts by checking the version of the Windows server where it is running on with WMI. If the version is lesser then 6.1 (which is the version number of Windows Server 2008 R2), the script will continue to use WMI to get all cluster services and then check each cluster service individually to see where it is running and compare that with it’s preferred node. If the OS version is not lesser then 6.1, the plugin will import the module ‘failoverclusters’ and start using module commands instead of WMI. I did not check if the script works on a cluster with more then two nodes, as we don’t have clusters with three or more nodes. If anyone could test this for me and let me know… 

      How to use the check_ms_cluster_preferred_node?

      1. Copy the ‘check_ms_cluster_preferred_node.ps1’ Powershell script to the NSClient++ scripts folder, preferably in a sub-folder ‘Powershell’ on all the failover cluster nodes you wish to monitor.
      2. In the nsclient.ini configuration file, define the external command like this (and restart the NSClient++ service (nscp) afterwards):
      3. Make a command in Nagios like this:
      4. Configure your service in Nagios. Use the command you previously made. A healthy check in Nagios XI would look like this:check_ms_cluster_preferred_node

      Other Microsoft Failover Cluster monitoring options?

      So is monitoring the preferred node the best way to monitor clusters. Definitely not. A cluster might have no or multiple preferred nodes for some reason. If you happen to own some MS clusters for critical applications, you will most likely want to be alerted for more issues then only a cluster service failover. I’ve seen multiple clusters with issues that did not consist of a failover. Instead one or more cluster service just went offline and online on the same node. In order to monitor this, I wrote another Powershell script that checks for certain event id’s in the Windows application eventlog and alerts when these events contain information about failing cluster services.

      Monitoring Microsoft Exchange 2010 Mailbox

       
       

      Error: Your Requested widget " wp-github-commits-6" is not in the widget list.
      • [do_widget_area sidebar-1]
        • [do_widget_area sidebar-2]
          • [do_widget id="tag_cloud-2"]
        • [do_widget_area widgets_for_shortcodes]
          • [do_widget id="wp-github-commits-14"]
          • [do_widget id="wp-github-commits-17"]
          • [do_widget id="wp-github-commits-15"]
          • [do_widget id="wp-github-commits-25"]
          • [do_widget id="wp-github-commits-3"]
          • [do_widget id="wp-github-commits-13"]
          • [do_widget id="wp-github-commits-20"]
          • [do_widget id="wp-github-commits-16"]
          • [do_widget id="wp-github-commits-2"]
          • [do_widget id="wp-github-commits-24"]
          • [do_widget id="wp-github-commits-23"]
          • [do_widget id="wp-github-commits-21"]
          • [do_widget id="wp-github-commits-5"]
          • [do_widget id="wp-github-commits-6"]
          • [do_widget id="wp-github-commits-26"]
          • [do_widget id="wp-github-commits-9"]
          • [do_widget id="wp-github-commits-22"]
          • [do_widget id="wp-github-commits-19"]
          • [do_widget id="wp-github-commits-8"]
          • [do_widget id="wp-github-commits-7"]
          • [do_widget id="wp-github-commits-18"]
          • [do_widget id="wp-github-commits-11"]
          • [do_widget id="wp-github-commits-4"]
          • [do_widget id="wp-github-commits-12"]
          • [do_widget id="wp-github-commits-27"]
        • [do_widget_area wp_inactive_widgets]
          • [do_widget id="recent-posts-2"]
          • [do_widget id="recent-comments-2"]
         

        Introduction

        I based this scipt on the MS Exchange 2010 DAG Mailbox check by Matt Haynes, but made the following changes:

        • Recovery mailbox databases are excluded. This way we don’t get any false alerts when colleagues are restoring a backup to a temporary recovery database.
        • For MS Exchange, backups are crucial, as your logs will grow very fast when they don’t get backupped. Therefore a warning is generated if the last backup date is over 26 hours and an error is generated if it was over 50 hours. (these time periods can be altered, depending on the total time your backup software requires to backup all your mailbox databases)
        • Performance data is gathered from the amount of mounted, healthy, and unhealthy mailbox databases.
        • If no mailbox databases are found on the Exchange server, an error is also generated.

        I would like to thank my colleague, De Clerck John, who created the exclusion for the recovery mailbox database and the check for last backup. His knowledge of MS Exchange and Powershell seems to be unlimited. As previously said, the script will count all healthy, failed and mounted mailbox databases and output performance data. This way, you can easily see when exactly something went wrong with your mailbox databases:

        check-ms-exchange-2010-health-graph-01

        In this screenshot it appears we had an issue on 28 April. In fact our Exchange administrators are busy migrating the mailbox databases to different MS Exchange servers, which is also clear in the next days of the graph.

        exchange

        How to monitor your MS Exchange databases?

        1. Copy the ‘check_ms_exchange_2010_health.ps1’ Powershell script to the NSClient++ scripts folder, preferably in a sub-folder ‘Powershell’ on all the Exchange 2010 servers you wish to monitor.
        2. In the nsclient.ini configuration file, define the external command like this (and restart the NSClient++ service (nscp) afterwards):
        3. Make a command in Nagios like this:
        4. Configure your service in Nagios. Use the command you previously made. Argument 1 should be the hostname of the MS Exchange server

        Grtz Willem