Real-time Eventlog Monitoring with Nagios and NSClient++

Introduction to real-time eventlog monitoring

NSClient++ has a very powerful component that enables you to achieve real-time eventlog monitoring on Windows systems. This feature requires passive monitoring of Windows eventlogs via NSCA or NRDP.

The biggest benefits of real-time eventlog monitoring are:

  • It can help you find problems faster (real-time), as NSClient++ will send each event with NSCA the moment it occurs.
  • It is much more resource efficient than using active checks for monitoring eventlogs. It actually requires fewer resources on both the Nagios server and the client where NSClient++ is running!
  • There is no need to search through every application’s documentation, as you can just catch all the errors and filter them out if not needed.

The biggest drawbacks of real-time eventlog monitoring are:

  • Because these are passive services, a new event overwrites the previous one, which could cause you to miss a problem on your Nagios dashboards.
  • You need a dedicated database table to store the real-time eventlog exclusions.
  • You will need some basic scripting skills to automate building the real-time eventlog exclusion string in the NSClient configuration file.

General requirements for using real-time eventlog monitoring

NSCA Configuration of your NSClient++

As NSClient++’s real-time eventlog monitoring component will send the events passively to your Nagios server, you will need to set up NSCA. Please read through this documentation for configuring NSCA in NSClient++.

NSCA Configuration of your Nagios server

NSCA also requires some configuration on your Nagios server. Please read through this documentation for configuring NSCA in Nagios Core or this documentation for configuring NSCA in Nagios XI.

Passive services for each Windows host on your Nagios server

Each Windows host needs at least one passive service, which is able to accept the filtered Windows eventlogs. You can make as many of them as you require. I chose to use one for all application eventlog errors and one for all system eventlog errors:

Real-Time Eventlog Monitoring Passive Services
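Before wiring up NSClient++, it can help to push a hand-made result at a passive service to confirm it accepts results. The sketch below is only an illustration: it formats one service check result in the tab-separated layout that send_nsca reads on standard input (host, service, return code, output). The host name winsrv01 and the service name EVT_Application are hypothetical; use the names of your own passive services.

```python
def format_passive_result(host: str, service: str, code: int, output: str) -> str:
    """Return one service check result in the tab-separated layout send_nsca reads."""
    if code not in (0, 1, 2, 3):  # OK, WARNING, CRITICAL, UNKNOWN
        raise ValueError(f"invalid Nagios return code: {code}")
    # Tabs and newlines inside the message would break the field layout,
    # so collapse all whitespace runs into single spaces.
    clean = " ".join(output.split())
    return f"{host}\t{service}\t{code}\t{clean}\n"


if __name__ == "__main__":
    # Hypothetical host and service names, for illustration only.
    print(format_passive_result("winsrv01", "EVT_Application", 2,
                                "Error 1000\nApplication Error"), end="")
```

The printed line can be piped into send_nsca on any machine that has it installed and configured for your Nagios server.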

A database to store your real-time eventlog exclusions

If you want to generate a real-time eventlog exclusion filter, you need to somehow store a combination of hostnames, event IDs and event sources. We are using MSSQL at the moment and generate the exclusions with PowerShell. This database needs at least a servername, eventlog, eventid, eventsource and comment column. The combination of those allows you to make an exclusion for almost any type of Windows event.

Real-time Eventlog Monitoring Exclusion Database
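As a rough illustration of such a table, here is a minimal Python sketch using the built-in sqlite3 module as a stand-in for MSSQL. Only the five column names come from this post. The table name eventlog_exclusions, the helper functions, and the convention that an empty eventsource means "exclude this ID for every source" are assumptions made for the example.

```python
import sqlite3

# Column names follow the post; the table name is hypothetical.
SCHEMA = """
CREATE TABLE IF NOT EXISTS eventlog_exclusions (
    servername  TEXT NOT NULL,
    eventlog    TEXT NOT NULL,
    eventid     INTEGER NOT NULL,
    eventsource TEXT,
    comment     TEXT
)
"""


def add_exclusion(conn, servername, eventlog, eventid, eventsource=None, comment=""):
    """Store one exclusion. eventsource=None is assumed to mean 'any source'."""
    conn.execute(
        "INSERT INTO eventlog_exclusions VALUES (?, ?, ?, ?, ?)",
        (servername, eventlog, eventid, eventsource, comment),
    )
    conn.commit()


def exclusions_for(conn, servername, eventlog):
    """Return (eventid, eventsource) pairs for one host and one eventlog."""
    rows = conn.execute(
        "SELECT eventid, eventsource FROM eventlog_exclusions "
        "WHERE servername = ? AND eventlog = ? ORDER BY eventid",
        (servername, eventlog),
    )
    return rows.fetchall()


# In-memory database with two hypothetical exclusions.
conn = sqlite3.connect(":memory:")
conn.execute(SCHEMA)
add_exclusion(conn, "winsrv01", "Application", 1000, comment="noisy crash reports")
add_exclusion(conn, "winsrv01", "Application", 1509, "Userenv", "roaming profile warnings")
```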

Some sort of automation software which can be called with a Nagios XI quick action

Thanks to Nagios XI quick actions, you can quickly exclude noisy events by updating the NSClient++ configuration file with the correct filter. With the correct customization and scripts, this allows you to create a self-learning system. For this to work, you basically need two scripts: one which stores a new real-time eventlog exclusion in a database, and another which generates the NSClient++ configuration file with the latest combination of real-time eventlog exclusions. We are using Rundeck, a free and open source automation tool, to execute the above jobs.

Detailed NSClient++ configuration

Minimal nsclient.ini ‘modules’ settings:

Minimal nsclient.ini ‘NSCA’ settings:

The above configuration doesn’t use any encryption. Once your tests work, I advise you to configure some form of encryption to prevent attackers from sniffing your NSCA packets. Please note that at the time of writing (31/05/17) the official Nagios NSCA project does not support AES, only Rijndael. This GitHub issue has been created to fix this problem. Until then, you’ll have to use one of the other, weaker encryption methods.

Example nsclient.ini ‘eventlog’ settings:

This is an example configuration for getting real-time eventlog monitoring to work. Please note that this has been tested on NSClient++ 0.5.1.28. I’m not 100 % sure it works on earlier versions.

The above configuration template is just an example. As you can see it contains a DUMMYAPPLICATIONFILTER and a DUMMYSYSTEMFILTER. You can easily replace these with the generated exclusion filter. A few examples of how such a filter might look:

(id NOT IN (1,3,10,12,13,23,26,33,37,38,58,67,101,103,104,107,108,110,112,274,502,511,1000,1002,1004,1005,1009,1010,1026,1027,1053,1054,1085,1101,1107,1116,1301,1325,1334,1373,1500,1502,1504,1508,1511,1515,1521,1533)) AND (id NOT IN (1509) OR source NOT IN ('Userenv')) AND (id NOT IN (1055) OR source NOT IN ('Userenv')) AND (id NOT IN (1030) OR source NOT IN ('Userenv')) AND (id NOT IN (1006) OR source NOT IN ('Userenv')) 

Or

(id NOT IN (1,3,4,5,8,9,10,11,12,15,19,27,37,39,50,54,56,137,1030,1041,1060,1066,1069,1071,1111,1196,3621,4192,4224,4243,4307,5722,5723)) AND (id NOT IN (36888) OR source NOT IN ('Schannel')) AND (id NOT IN (36887) OR source NOT IN ('Schannel')) AND (id NOT IN (36874) OR source NOT IN ('Schannel')) AND (id NOT IN (36870) OR source NOT IN ('Schannel')) AND (id NOT IN (12292) OR source NOT IN ('VSS')) AND (id NOT IN (7030) OR source NOT IN ('ServiceControlManager')) 

Only errors which are not filtered by the real-time eventlog filters such as the examples above will be sent to your Nagios passive services.
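To show how filters like the ones above could be assembled, here is a minimal Python sketch. It is not the PowerShell code we use in production. It assumes the exclusions arrive as (eventid, eventsource) pairs, where a source of None means the ID is excluded for every source, and it fills the DUMMYAPPLICATIONFILTER and DUMMYSYSTEMFILTER placeholders of a configuration template. The function names are made up for this example.

```python
def build_filter(rows):
    """Build an exclusion filter from (eventid, eventsource-or-None) pairs."""
    ids_any_source = sorted({eid for eid, src in rows if src is None})
    id_source_pairs = sorted({(eid, src) for eid, src in rows if src is not None})

    clauses = []
    if ids_any_source:
        # IDs excluded regardless of their source go into one NOT IN list.
        clauses.append("(id NOT IN (%s))" % ",".join(str(i) for i in ids_any_source))
    for eid, src in id_source_pairs:
        if "'" in src:
            # Keep the sketch simple: refuse values that would break the quoting.
            raise ValueError(f"unsupported character in source name: {src!r}")
        # An event is dropped only when both the ID and the source match.
        clauses.append(f"(id NOT IN ({eid}) OR source NOT IN ('{src}'))")
    return " AND ".join(clauses)


def render_config(template, application_filter, system_filter):
    """Replace the two placeholders from the template with real filters."""
    return (template
            .replace("DUMMYAPPLICATIONFILTER", application_filter)
            .replace("DUMMYSYSTEMFILTER", system_filter))


if __name__ == "__main__":
    print(build_filter([(1, None), (3, None), (1509, "Userenv")]))
```

Note that build_filter returns an empty string when there are no exclusions at all, so the calling script has to decide what to write into the template in that case.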

Multiple NSCA Targets

This is an nsclient.ini config file where two NSCA targets are defined. This can be useful in scenarios where a backup Nagios server needs to be identical to the primary Nagios server:

How to generate errors in your Windows eventlogs?

In order to test and debug your filters, you will need a way to generate errors with specific sources or IDs. You can do this very easily with PowerShell:

If you get an error saying that the source passed with the above command does not exist, you can create it like this:

Or another way:

(Almost) Final Words

I can hear some of you thinking: “Why don’t you post the code that generates the real-time eventlog exclusion filter?” The answer is simple: I don’t have the time to clean up all the code so that it no longer contains any sensitive information. But as a special gift for all my blog readers who made it to the end of this post, I’ll post a snippet of the exclusion-generating PowerShell code here. The rest you will have to write yourself for now.

I will open the comments section for now, but please only use it for constructive information. 

Grtz

Willem

Monitoring Microsoft IIS Application Pools

Introduction

For those who are not aware, IIS is an HTTP web server from Microsoft which can host both static and dynamic content. Incoming requests are handled by a Windows kernel-mode driver named http.sys. It listens for incoming TCP requests on a configured port, performs some basic security checks and passes each request to a user-mode worker process. The worker fulfills the request and sends the response back to the requester. Web applications are grouped into IIS application pools, each of which has its own worker process assigned to it.

As we were migrating all our IIS applications to a new IIS 8.5 farm on Windows 2012 R2 servers, we needed a way to reliably monitor the state of our most critical IIS application pools. So I created a PowerShell script which is able to check the state of an application pool and count the number of web applications using it. As each IIS application pool has one w3wp.exe IIS worker process assigned, I added the % processor usage and memory usage of that process to the perfdata.

The latest version also contains a new method to retrieve the IIS application pool information. As Get-ChildItem IIS:\AppPools has a weird bug where the command sometimes hangs, I had to look for an alternative. This method uses C:\Windows\system32\inetsrv\appcmd.exe instead, which seems to perform much better.
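To give an idea of what the appcmd.exe approach boils down to, here is a small Python sketch of the parsing and state-checking logic. It is not my PowerShell script. It assumes that appcmd.exe list apppool prints one line per pool in the form APPPOOL "Name" (MgdVersion:v4.0,MgdMode:Integrated,state:Started), so verify that against the output on your own server. The function names are hypothetical, and the exit codes follow the usual Nagios plugin convention (0 OK, 2 CRITICAL, 3 UNKNOWN).

```python
import re

# Assumed line layout: APPPOOL "Name" (key:value,key:value,...)
APPPOOL_LINE = re.compile(r'^APPPOOL "(?P<name>[^"]+)" \((?P<attrs>.*)\)$')


def parse_apppools(text):
    """Map each application pool name to its state from appcmd-style output."""
    pools = {}
    for line in text.splitlines():
        match = APPPOOL_LINE.match(line.strip())
        if not match:
            continue  # skip blank lines and anything not in the assumed layout
        attrs = dict(
            part.split(":", 1)
            for part in match.group("attrs").split(",")
            if ":" in part
        )
        pools[match.group("name")] = attrs.get("state", "Unknown")
    return pools


def check_pool(pools, name):
    """Return a (Nagios exit code, status message) pair for one pool."""
    state = pools.get(name)
    if state is None:
        return 3, f"UNKNOWN: application pool '{name}' not found"
    if state == "Started":
        return 0, f"OK: application pool '{name}' is Started"
    return 2, f"CRITICAL: application pool '{name}' is {state}"


if __name__ == "__main__":
    sample = 'APPPOOL "DefaultAppPool" (MgdVersion:v4.0,MgdMode:Integrated,state:Started)'
    print(check_pool(parse_apppools(sample), "DefaultAppPool")[1])
```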

How to monitor your MS IIS Application Pools with Nagios?

  • Put the script in the NSClient++ scripts folder, preferably in a subfolder Powershell.
  • In the nsclient.ini configuration file, define the script like this:
  • Make a command in Nagios like this:
  • Configure your service in Nagios. Make use of the command created above. Configure something similar to this as $ARG1$:

    Or, if you want to monitor an application pool with the OnDemand start mode, where no IIS worker process exists while the pool is not in use:

    IIS application pools OnDemand Startmode
    When you want to use the AppCmd.exe method:

Final Words

I only had the chance to test this on Windows Server 2012 R2. It’s very possible you will experience issues on older IIS versions. You need to install the IIS Management Scripts and Tools feature for the script to work properly.

IIS Application Pool

Once you have it up and running, your Nagios server should look like this:

monitoring iis application pools

 

Monitor Raspberry Pi with Nagios

Introduction

Over the past week, I received multiple questions about how to monitor a Raspberry Pi with Nagios. Monitoring is crucial to proactively find any issues that might come up. There are multiple ways to achieve this. I’ll try to build up this ‘how to’ from the ground up, starting with the traditional method, which is using the official Nagios NRPE Agent.

NSClient++ does not support Raspbian yet. Michael Medin told me in this forum thread that he is planning to port it once he finds some spare time.

It’s also possible to install Go and Telegraf on your Raspbian, but I haven’t had the time to test that.

How to Monitor Raspberry Pi with NRPE Agent?

The code below worked fine for me on Raspbian Jessie.

Create nrpe.cfg in /usr/local/nagios/etc

The relevant part of my nrpe.cfg looks like this:

Make sure to replace <ip-of-your-Nagios-server-here> with (you’ll never guess) the IP of your Nagios server.

Let me know if you experience any issues.

Grtz

Willem

Nagios XI Docker Container


I think most of you have heard of Docker. It’s free, it’s fast, there are a lot of prebuilt packages, in short: it’s the playground we’ve all been waiting for.
But when I searched for a Nagios XI Docker container, there seemed to be no such thing….
Therefore I built one myself to experiment with.

So if you want to play with Nagios XI 5, check out Docker Hub and fire up a container within minutes
https://hub.docker.com/r/tgoetheyn/docker-nagiosxi/

For those not that familiar with Docker, there is a lot of helpful information on the Docker site itself: https://docs.docker.com/
Make sure to check out how to install Docker and take some time to look at the different Docker Run options.

Enjoy!