Monitoring Windows Scheduled Tasks

Introduction

Task Scheduler is a Microsoft Windows component that allows you to schedule programs or scripts to start at predefined times or intervals. There are two major versions of Task Scheduler. In version 1.0, definitions and schedules are stored in binary .job files, and every task corresponds to a single action. This plugin will not work with version 1.0, which is the version found on Windows 2000 and Windows Server 2003.

In version 2.0, Task Scheduler got a redesigned user interface based on the Microsoft Management Console. Version 2.0 also supports calendar and event-based triggers, such as starting a task when a particular event is written to the event log, or when a combination of events has occurred. In addition, several tasks that are triggered by the same event can be configured to run either simultaneously or in a predetermined chained sequence.

Tasks can also be configured to run based on system status, such as being idle for a preconfigured amount of time, on startup, on logoff, or only during or for a specified time. Other new features include a credential manager that stores passwords so they cannot be retrieved easily. Scheduled tasks are also executed in their own session, instead of the same session as system services or the current user. You can find a list of all Task Scheduler 2.0 interfaces here.

Requirements

Starting from Windows PowerShell 4.0, you can use a whole range of cmdlets to manage your scheduled tasks. This plugin for Nagios does not use these cmdlets, as it has to remain compatible with PowerShell 2.0. Maybe in a few years, when PowerShell 2.0 becomes obsolete, I’ll patch the script to make use of the new cmdlets. You can find the complete list of cmdlets here.

Failing tasks always end with some sort of error code. You can find the complete list of error codes here. This plugin prints the exit codes of failing tasks in the Nagios service output, and it also reports tasks that are still running.

We have multiple Windows servers at work with a growing number of scheduled tasks, and each scheduled task needs to be monitored. With the help of Nagios and this plugin you can find out:

  • How many are running at the same time?
  • How many are failing?
  • How long are they running?
  • Who created them?
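To make these questions concrete, here is a minimal sketch of how task records could be condensed into a single Nagios status line and exit code. It is written in Python purely for illustration; the plugin itself is a PowerShell script, and the `Task` record, the `summarize` function and the wording of the status line are assumptions of this sketch, not the plugin's actual output format. Only the exit codes 0 (OK) and 2 (CRITICAL) follow the standard Nagios plugin convention.

```python
from dataclasses import dataclass

# Standard Nagios plugin exit codes.
OK, WARNING, CRITICAL, UNKNOWN = 0, 1, 2, 3


@dataclass
class Task:
    """Hypothetical task record; field names are illustrative only."""
    name: str         # task name as shown in Task Scheduler
    author: str       # who created the task
    running: bool     # True while the task is still executing
    last_result: int  # result of the last run (0 means success)


def summarize(tasks):
    """Return (exit_code, status_line) for a list of Task records."""
    failed = [t for t in tasks if not t.running and t.last_result != 0]
    running = [t for t in tasks if t.running]
    if failed:
        details = ", ".join(
            f"{t.name} (author {t.author}, exit code {t.last_result})"
            for t in failed
        )
        code = CRITICAL
        line = f"CRITICAL: {len(failed)} / {len(tasks)} tasks failed: {details}"
    else:
        code = OK
        line = f"OK: 0 / {len(tasks)} tasks failed"
    if running:
        line += "; still running: " + ", ".join(t.name for t in running)
    return code, line


tasks = [
    Task("Backup", "alice", False, 1),
    Task("Report", "bob", True, 0),
    Task("Cleanup", "carol", False, 0),
]
exit_code, status_line = summarize(tasks)
print(status_line)
```

A real check would finish by exiting with `exit_code`, since Nagios derives the service state from the exit code rather than from the text.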

Versions

Disabled scheduled tasks are excluded by default since v3.14.12.06. In earlier versions, you had to exclude them manually with -EF or -ET. Excluding disabled tasks by default seemed like a logical decision, and it was suggested by someone reviewing the plugin on the Nagios Exchange. Maybe one day I’ll add a switch to include them again if specified. As some scheduled tasks do not need to be monitored, the script also enables you to exclude complete folders.

Since v5.13.160614 it is possible to include hidden tasks. Just add the ‘-Hidden 1’ switch to your parameters and your hidden tasks will be monitored.

One of the folders I tend to exclude almost all the time is the “Microsoft” folder. Several tasks in the Microsoft folder tend to fail from time to time. So unless you absolutely need to know the state of every single scheduled task running on your Windows Server, I advise you to exclude it too. You can find the folders and tasks in this location: C:\Windows\System32\Tasks

It is possible to include tasks or task folders with the ‘-InclFolders’ and ‘-InclTasks’ parameters. This filter is applied after the exclude parameters. Please note that including a folder is not recursive: only tasks in the root of the folder will be included.
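The order of the two filters can be sketched as follows. This is a Python illustration under stated assumptions, not the plugin's PowerShell code: the `select_tasks` function and its parameter names are hypothetical, folders are compared as plain strings for simplicity, and treating an excluded folder as covering its subfolders is an assumption based on the phrase "exclude complete folders".

```python
def select_tasks(tasks, exclude_folders=(), include_folders=()):
    """Filter (folder, name) tuples: excludes first, then includes."""
    # Step 1: drop tasks in excluded folders (assumed to cover subfolders).
    kept = [
        (folder, name)
        for folder, name in tasks
        if not any(
            folder == ex or folder.startswith(ex + "\\")
            for ex in exclude_folders
        )
    ]
    # Step 2: if includes are given, keep only exact folder matches.
    # This is the non-recursive part: subfolders are not pulled in.
    if include_folders:
        kept = [(f, n) for f, n in kept if f in include_folders]
    return kept


tasks = [
    ("\\Microsoft\\Windows", "Defrag"),
    ("\\Backup", "Nightly"),
    ("\\Backup\\Old", "Legacy"),
    ("\\", "RootTask"),
]
selected = select_tasks(
    tasks, exclude_folders=["\\Microsoft"], include_folders=["\\Backup"]
)
print(selected)
```

In this sketch the task in `\Backup\Old` is dropped even though its parent folder is included, which mirrors the non-recursive behaviour described above.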

Help

This is the help of the plugin, which lists all valid parameters:

You could put every scheduled task you don’t want to monitor in a separate folder and exclude it with the -EF parameter. Alternatively, you can use the -ET parameter to exclude tasks based on name patterns. One important thing to know is that in order to exclude or include the root folder, you need to escape the backslash, like this: “\\”.
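The need to escape the backslash suggests that these patterns are treated as regular expressions, although that is an assumption here. Under that assumption, the small Python sketch below shows why a single backslash cannot work while the escaped form does. The `matches_folder` helper is hypothetical, and whether the plugin matches the whole path or only part of it is not something this sketch can confirm.

```python
import re


def matches_folder(pattern, folder_path):
    """True if the whole folder path matches the regular expression."""
    return re.fullmatch(pattern, folder_path) is not None


# A lone backslash is an incomplete escape sequence, so the regex
# engine rejects it before any matching can happen.
try:
    re.compile("\\")  # the one-character pattern: \
    lone_backslash_ok = True
except re.error:
    lone_backslash_ok = False

# The two-character pattern \\ stands for one literal backslash,
# which is the path of the root task folder.
root_matches = matches_folder(r"\\", "\\")
subfolder_matches = matches_folder(r"\\", "\\Microsoft")
print(lone_backslash_ok, root_matches, subfolder_matches)
```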

How to monitor your scheduled tasks?

  1. Put the script in the NSClient++ scripts folder, preferably in a subfolder Powershell.
  2. In the nsclient.ini configuration file, define the script like this:

    For more information about external scripts configuration, please review the NSClient documentation. You can also consider defining a wrapped script in nsclient.ini to simplify configuration.
  3. Make a command in Nagios like this:
  4. Configure your service in Nagios, making use of the command created above. Configure something similar to this as $ARG1$:

Some things to consider to make it work:

  • Set the PowerShell execution policy: “Set-ExecutionPolicy RemoteSigned”
  • Nscp service account permissions => Running as local system should suffice, but some users told me it only worked with a local admin. I found out that on some NSClient++ versions, more specifically version 0.4.3.88 and probably some earlier versions too, the following error occurred when running the nscp service as local system: “CHECK_NRPE: Invalid packet type received from server”. After I filed an issue on the GitHub project page of NSClient++, Michael Medin quickly acknowledged the issue and fixed it in version 0.4.3.102, so the plugin should work again as local system.

Examples

If you run the script from the CLI in your Nagios plugin folder, this would be the command:

If you want to exclude one noisy, unimportant scheduled task, the command used in the CLI would look like this:

If you only want the scheduled tasks in the root to be monitored, you can use this command:

This would only give you the scheduled tasks available in the root folder. The output now looks like this:

Final Words

It seems the perfdata in the Highcharts graphs sometimes contains decimal numbers (see screenshot), which is kind of strange, as I’m sure I only pass rounded numbers. This appears to be related to the way RRD files work. To reduce the amount of storage space used, NPCD and RRD will average out the data, resulting in decimals even when you don’t expect them.
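The effect is easy to reproduce. The Python sketch below is only an illustration of averaging consolidation, not of RRDtool's actual implementation: the `consolidate` function and the sample values are made up, and real RRD files also interpolate values onto fixed time steps, which is not modelled here.

```python
def consolidate(samples, points_per_row):
    """Average consecutive groups of samples, like an AVERAGE roll-up."""
    rows = []
    for start in range(0, len(samples), points_per_row):
        group = samples[start:start + points_per_row]
        rows.append(sum(group) / len(group))
    return rows


# Whole numbers as reported by the check (hypothetical values).
failed_tasks = [0, 1, 1, 2, 0, 0]
averaged = consolidate(failed_tasks, 2)
print(averaged)
```

Every input is a whole number, yet averaging two neighbouring samples such as 0 and 1 can only be stored as 0.5.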

This is a small to do list:

  • Add switches to change returned values and output.
  • Add array parameter with exit codes that should be excluded.
  • Test remote execution. In some cases it might be useful to be able to check remotely for failed Windows tasks.
  • Include a warning / critical threshold when discovered tasks exceed a certain duration.
  • I was hoping to add some more exit codes to check, which would make failed tasks easier to troubleshoot. You can find the list of scheduled task exit codes here. The constants that begin with SCHED_S_ are success constants, and the constants that begin with SCHED_E_ are error constants.
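As a rough idea of what such an exit code check could look like, here is a Python sketch that sorts a task's last result into success constants, error constants and plain exit codes. The four SCHED_S_ values are taken from the Windows SDK and are only a small subset; the `classify` function and its return labels are assumptions of this sketch, not part of the plugin.

```python
# A small subset of the Task Scheduler success constants.
KNOWN_SUCCESS = {
    0x00041300: "SCHED_S_TASK_READY",
    0x00041301: "SCHED_S_TASK_RUNNING",
    0x00041302: "SCHED_S_TASK_DISABLED",
    0x00041303: "SCHED_S_TASK_HAS_NOT_RUN",
}


def classify(last_result):
    """Return a short label for a task's last result value."""
    # Results are sometimes shown as negative 32-bit integers;
    # masking brings them back to the unsigned form.
    code = last_result & 0xFFFFFFFF
    if code == 0:
        return "success"
    if code in KNOWN_SUCCESS:
        return KNOWN_SUCCESS[code]
    if code & 0x80000000:
        # The top bit is set for error values, which is where
        # the SCHED_E_ constants live.
        return "error 0x%08X" % code
    # Anything else is treated as the exit code of the program itself.
    return "exit code %d" % code


for value in (0, 0x41301, 0x80041309, 1):
    print(hex(value), "->", classify(value))
```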

Screenshots:

These are some screenshots of the Nagios XI Graph Explorer for two of our servers making use of the plugin to monitor scheduled tasks:

[Screenshots: Tasks 01, check_ms_win_tasks_graph_02]

Let me know on the Nagios Exchange what you think of my plugin by rating it or submitting a review. Please also consider starring the project on GitHub.

Willem


Monitoring Windows Disk Load

Introduction

I rolled out check_diskstat on our Linux servers in September 2014 and really missed a similar plugin for monitoring disk load on Windows servers. Hence, I started thinking about a new PowerShell script, which would use the PowerShell cmdlet ‘Get-Counter’ to gather all disk-related information from the Performance Monitor. I started by making a list of the requirements:

  • The main requirement was that it had to be multilingual, as I work on English and Dutch versions of Windows Server 2003, 2003 R2, 2008 and 2008 R2. 
  • Another requirement was that the script had to accept an argument that specifies the number of samples over which an average is calculated.
  • The perfdata had to be output in such a way that all disk load related values are visible in one graph. I had to deal with very high values, e.g. 8763098004, and very small decimals, e.g. 0,00014. This implied I had to find some way to make it visually attractive and correct in Highcharts, for example by outputting milliseconds instead of seconds or megabytes instead of bytes.
  • The plugin also had to work independently of the culture. Some cultures use ‘,’ and others use ‘.’ as the decimal separator. I solved this by replacing [System.Threading.Thread]::CurrentThread.CurrentCulture with ‘en-US’ and setting it back to the original value once I’m done.
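The save, switch and restore pattern from the last requirement can be sketched in Python as well. This is an analogy rather than the plugin's code: Python's `locale` module plays the role of the .NET thread culture, the portable "C" locale stands in for ‘en-US’, and the `invariant_numbers` helper is a hypothetical name.

```python
import locale
from contextlib import contextmanager


@contextmanager
def invariant_numbers():
    """Temporarily format numbers with the portable "C" locale."""
    original = locale.setlocale(locale.LC_NUMERIC)  # query current setting
    locale.setlocale(locale.LC_NUMERIC, "C")
    try:
        yield
    finally:
        # Always put the original setting back, even after an error.
        locale.setlocale(locale.LC_NUMERIC, original)


before = locale.setlocale(locale.LC_NUMERIC)
with invariant_numbers():
    text = locale.str(0.00014)  # locale-aware number formatting
after = locale.setlocale(locale.LC_NUMERIC)
print(text)
```

Restoring in a `finally` block matters for the same reason it does in the plugin: other code running afterwards should see the culture it expects.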

Monitoring disk load can be useful in finding the cause of performance issues. If a component of an application starts writing huge logs or large amounts of data to a database on your Windows disks, a bottleneck could be created in your application’s flow. This bottleneck could quickly result in lag, latency or slowness for end users, resulting in more incidents, calls or complaints. An integral part of the job of a monitoring engineer is to avoid situations like the one described above. Here Nagios can help you, by alerting you before applications start getting slow.

Up until now, the only way to monitor performance counters for Windows servers was to use an agent like NSClient++ (or NCPA?) to retrieve one performance counter at a time. My check_ms_windows_disk_load plugin enables you to combine several disk load related performance counters in only one service. This method has several advantages:

  • You don’t need to worry what counters to monitor. The plugin will do that for you.
  • As the plugin monitors 8 performance counters and you only need one service, this saves you 7 services for each disk. So your Nagios server has less work, which enables you to monitor other things instead or to run your checks more frequently.
  • As you can pass maxsamples (-ms or -MaxSamples) as a parameter, you can choose yourself how long you want the plugin to run before calculating averages. Each sample takes about one second.
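The averaging and unit scaling described above can be sketched like this. It is a Python illustration with made-up sample values; the labels `avg_read_latency` and `read_throughput` and the helper names are hypothetical and do not reflect the plugin's real perfdata labels. Only the general `'label'=value[unit]` perfdata shape follows the Nagios plugin convention.

```python
def average(samples):
    """Arithmetic mean of the collected samples."""
    return sum(samples) / len(samples)


def perfdata(label, value, unit):
    """Format one Nagios perfdata item: 'label'=value[unit]."""
    return f"'{label}'={value:.2f}{unit}"


# Hypothetical one-second samples for two counters.
read_latency_s = [0.00012, 0.00016, 0.00014]             # seconds per read
read_bytes_per_s = [8763098004, 8763098004, 8763098004]  # bytes per second

# Scale to units that read well on one graph.
latency_ms = average(read_latency_s) * 1000             # seconds to ms
read_mb_per_s = average(read_bytes_per_s) / 1024 ** 2   # bytes to MB

output = " ".join([
    perfdata("avg_read_latency", latency_ms, "ms"),
    perfdata("read_throughput", read_mb_per_s, "MB"),
])
print(output)
```

With three samples the check runs for roughly three seconds, so a larger sample count gives a smoother average at the cost of a longer check.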

You could also prove to your application engineers that the storage is or is not the cause of their application’s performance problems, using comprehensive graphs that visualize a collection of disk performance related information. You also need knowledge about your disk load in order to choose the right disk type for the job. Are your 3TB SATA disks strong enough to handle the job, or will you have to buy more expensive SSDs to achieve the performance you need?

How to monitor your disk load?

  1. Put the script in the NSClient++ scripts folder, preferably in a subfolder Powershell.
  2. In the nsclient.ini configuration file, define the script like this:
  3. Make a command in Nagios like this:
  4. Configure your service in Nagios, making use of the command created above. Configure something similar to this as $ARG1$:

Examples:

One day after everything is configured correctly, your Highcharts graphs should look like this:

[Graph: disk load graph 01]

If you want to test the load on your Windows disks, you can play with DiskSPD, a storage load generator from Microsoft. (Yes, Microsoft has a GitHub account!)

I hope this plugin can help you monitor the disk load on your Windows hosts. Please rate it on the Nagios Exchange if you like my work.