Monitoring F5 BIG-IP Platform

Introduction

The F5 BIG-IP platform consists of software and hardware that acts as a reverse proxy and distributes network or application traffic across a number of servers. Load balancers are used to increase capacity and reliability of applications. They improve the overall performance of applications by decreasing the burden on servers associated with managing and maintaining application and network sessions, as well as by performing application-specific tasks. As it’s a critical and important part of your network, monitoring F5 BIG-IP health is critical to ensure operations are working as expected. This can be achieved via traditional SNMP polling, but in order to get a detailed view of the performance of F5 network services, you will require a combination of SNMP polling and F5 syslog message analysis.

The best platform to do syslog message analysis is still the Elastic stack. Over the past years I’ve been working on a set of F5 Logstash filters, which can be used to create beautiful Kibana dashboards which can give you detailed insights in the working and processes of your F5 BIG Load Balancer.

Load balancers are generally grouped into two categories: Layer 4 and Layer 7. Layer 4 load balancers act upon data found in network and transport layer protocols (IP, TCP, FTP, UDP). Layer 7 load balancers distribute requests based upon data found in application layer protocols such as HTTP. Requests are received by both types of load balancers and they are distributed to a particular server based on a configured algorithm. Some industry standard algorithms are:

  • Round robin
  • Weighted round robin
  • Least connections
  • Least response time

Layer 7 load balancers can further distribute requests based on application specific data such as HTTP headers, cookies, or data within the application message itself, such as the value of a specific parameter. Load balancers ensure reliability and availability by monitoring the “health” of applications and only sending requests to servers and applications that can respond in a timely manner.

Monitoring F5 BIG-IP Platform

Nagios

Nagios allows you to actively monitor the health of your F5 Load Balancer with SNMP. I’ll add some examples here asap.

Elastic

You can find the required configuration files on GitHub. The project includes F5 Logstash filters, F5 elasticsearch templates and F5 Logstash patterns.

Logstash configuration

F5 Logstash input

F5 Logstash filters

dcc => ASM related messages. BIG-IP Application Security Manager (ASM) enables organizations to protect against OWASP top 10 threats, application vulnerabilities, and zero-day attacks. Leading Layer 7 DDoS defenses, detection and mitigation techniques, virtual patching, and granular attack visibility thwart even the most sophisticated threats before they reach your servers.

apd => Access Policy Demon. The apd process runs a BIG-IP APM access policy for a user session.

tmm => The traffic management microkernel is the process running on the BIG-IP host O/S that performs all of the local / global traffic management for the system.

sshd => The ssh daemon provides remote access to the BIG-IP system command line interface

logger => If a BIG-IP high-availability redundant pair has the Detect ConfigSync Status feature enabled, each unit in the pair sends periodic iControl queries to its peer to determine if the redundant pair configuration is synchronized. These iControl requests occur approximately every 30 seconds on each unit. Each inbound request generates an entry in both the local /var/log/httpd/ssl_access_log file and the /var/log/httpd/ssl_request_log file. As I never saw anything useful coming out of it, I asked our F5 engineer to have a look at this F5 article , which describes how to exclude these messages in the F5 syslog configuration.

F5 Logstash custom grok patterns

You will need to add these F5 Logstash custom grok patterns to your Logstash patterns directory. For me it’s located in /etc/logstash/patterns

Elasticsearch configuration

Included in the GitHub project you can find my f5 elasticsearch template, with the correct mappings for each field. This enables you to use your data more efficiently and allow for advanced ip aggregations. You can find more information about mapping types here. If you have ideas about better mappings (I know they need some work), please let me know on GitHub by making an issue.

F5 Remote Logging Configuration

You will need to configure your F5 with one or more remote syslog servers  to send logs your Logstash nodes. Ideally you will want to specify a custom port dedicated for F5 syslog traffic. You can find the official F5 remote syslog documentation here

You can use the F5 Configuration Utility to add a remote syslog server like this:

  1. Log on to the Configuration utility.
  2. Navigate to System > Logs > Configuration > Remote Logging.
  3. Enter the destination syslog server IP address in the Remote IP text box.
  4. Enter the remote syslog server UDP port (default is 514) in the Remote Port text box.
  5. Enter the local IP address of the BIG-IP system in the Local IP text box (optional).

    Note: For BIG-IP systems in a high availability (HA) configuration, the non-floating self IP address is recommended if using a Traffic Management Microkernel (TMM) based IP address.

  6. Click Add.
  7. Click Update.

Kibana Dashboards

The Logstash filters I created allow you do some awesome things in Kibana. I’m working on a set of dashboards with a menu which will allow you to drilldown to interesting stuff, such as apd processors, session, dcc scraping and other violations. 

Elastic F5 Home Dashboard

F5 Logstash fitler Kibana dashboard

Elastic F5 dcc scraping dashboard

F5 Logstash filter dcc scraping

I will open source them when I consider them ready for public. If you are truly interested in helping me develop and expand them, send me an email (willem.dhaeseatgmail), and I’ll consider sending you my Kibana 5 dashboards json exports. The only requirement is that you use f5-* as index pattern.

Greetings

Willem

Monitoring Network Connections Nagios 2

Monitoring Windows Network Connections

Introduction

Monitoring the network connections on your Windows servers can be crucial to examine server load and investigate bottlenecks and anomalies. There are many ways to monitor your network connections. This blog post will go into detail of some of the tools that can be used to achieve optimal monitoring of your Windows network connections.

How To monitor your Windows Network Connections?

PerfMon

In the Windows Performance Monitor, you can find several counters for all kinds network connections. This set of counters is available for TCPv4 and TCPv6 connections.

Counter NameCounter Description
Connection FailuresConnection Failures is the number of times TCP connections have made a direct transition to the CLOSED state from the SYN-SENT state or the SYN-RCVD state, plus the number of times TCP connections have made a direct transition to the LISTEN state from the SYN-RCVD state.
Connections ActiveConnections Active is the number of times TCP connections have made a direct transition to the SYN-SENT state from the CLOSED state. In other words, it shows a number of connections which are initiated by the local computer. The value is a cumulative total.
Connections EstablishedConnections Established is the number of TCP connections for which the current state is either ESTABLISHED or CLOSE-WAIT.
Connections PassiveConnections Passive is the number of times TCP connections have made a direct transition to the SYN-RCVD state from the LISTEN state. In other words, it shows a number of connections to the local computer, which are initiated by remote computers. The value is a cumulative total.
Connections ResetConnections Reset is the number of times TCP connections have made a direct transition to the CLOSED state from either the ESTABLISHED state or the CLOSE-WAIT state.
Segments Received/secSegments Received/sec is the rate at which segments are received, including those received in error. This count includes segments received on currently established connections.
Segments Retransmitted/secSegments Retransmitted/sec is the rate at which segments are retransmitted, that is, segments transmitted containing one or more previously transmitted bytes.
Segments Sent/secSegments Sent/sec is the rate at which segments are sent, including those on current connections, but excluding those containing only retransmitted bytes.
Segments/secSegments/sec is the rate at which TCP segments are sent or received using the TCP protocol.

At the moment there seems to be no Performance Monitor counter available  in Windows to show the UDP connection count.  Although the Windows Performance Monitor is an easy choice to have a quick glance at how many TCP connections are currently active, it is not an optimal tool to use for debugging or alerting. The PerfMon user interface also hasn’t changed much over the years. 

UDP Connection Count

This means that we will have to look at other options, such as Netstat:

Netstat

Netstat is a command-line tool that displays very detailed information about your network connections, both incoming and outgoing, routing tables, network interfaces and network protocol statistics.
It is mostly used for finding problems in the network and to determine the amount of traffic on the network as a performance measurement. 

Although Netstat is the perfect tool for looking in real-time at your network connections, you will need some way to graph the Netstat values. Being able to analyze the connection count over time really helps with getting a better understanding of what your servers and applications are doing.

Nagios

As I saw multiple plugins to check network connections with Netstat on Linux hosts, but not on Windows hosts, I decided to write a Powershell script which uses Netstat to monitor your TCP and UDP network connections on Windows hosts.

How to monitor your network connections with Nagios?

  1. Download the latest version of check_ms_win_network_connections on GitHub.
  2. Put the script in the NSClient++ scripts folder, preferably in a subfolder Powershell.
  3. In the nsclient.ini configuration file, define the script like this:

  4. Make a command in Nagios like this:

  5. Configure your service in Nagios. Make use of the above created command. Configure something similar like this as $ARG1$:

Additional Information

The script initiates a ‘netstat -ano’ , which will display all active network connections with their respective ip addresses, port number and the corresponding process id’s, parse the results and apply the optional filters.
This could of course also be accomplished by just retrieving the ‘\TCPv4Connections Established’ performance countera and it’s UDP variant, but the real strength of the script are it’s parameters. If you think your systems have been compromised by a virus or other malicious software, you can distribute the check_ms_network_connections plugin to all Windows servers and then check your network connections for a given process, port or ip address. This could quickly result in an overview of all impacted systems.

Usage

Because the Powershell command  get-process  doesn’t add file extensions, the -P parameter also does not need it’s file extensions eg ‘.exe’. For example in order to look for all connections made by svchost.exe, the parameters would look like this: -H server.fqdn -P svchost 

Another usage example could be the need to monitor a server that needs a continuous link with another server. By specifying, the -wl and -cl parameters like this -H server.fqdn -wl 2 -cl 0 -wh 10 -ch 15  , you should get a warning alert when the amount of TCP connections drops below 2 and a critical alert when there is no TCP connection with the remote server.
Please note that when using different filter parameters, ‘or’ is used, not ‘and’. So if any of the filters apply’s, the connection should be added. 

If you don’t want to filter on IP address or port, I suggest you use the ‘-c’ parameter, which improves performance a lot. If you are running the plugin on a server with a very high amount of connections, I also suggest using the -c parameter.
The ‘-c’ parameter will execute  (netstat -abn -proto TCP).count which is way faster then having to loop through each individual connection. It does imply you will get less information, as it only counts the active TCP connections.

Results

The result of using Nagios XI to monitor your network connections looks like this:

Monitoring Network Connections Nagios

TIG

A third option is to use a TIG stack, which will use Telegraf to query the counters from PerfMon and sends them to an InfluxDB time series database. Visualization is done with Grafana.

The Telegraf agent configuration file needs this input:

TIG Network Connections

Grafana allows you to create a query which will show all values for all hosts with a certain tag. With the help of templates, it becomes very easy to create beautiful graphs with filterable, sortable min, max, avg and current values o all your network connections counters. And this with a one second granular interval.

TIG-Windows-Network-Connections-Top-Avg

A disadvantage of using Telegraf is that you are limited to using PerfMon counters. This means it’s not possible to get the UDP connection count. There seems to be a way to execute Powershell scripts with telegraf, but my guess is that the resulting load will be too high to execute this with a one second interval.

Final Words

As you can seen there are multiple options to monitor your Windows network connections. I’ll try to extend this documentation with some alerting examples.

NSClient++ 0.5.0.7

Introduction

There seems to be quite some development work done on NSClient++ lately by Michael Medin, as you can see in this GitHub commit graph.

nsclient

As I’m still on 0.4.1.105, I though it was time to make a little review on the latest (nightly) version of NSClient++, which is at this moment 0.5.0.7. To be honest, I’m not looking forward to migrating all our old NSClient installations to later versions. As the nsclient.ini configuration has changed drastically, this will imply I will have some work to migrate everything without issues.

There aren’t really any alternatives at the moment. As far as I know NSClient++ is still the only client offering real-time eventlog monitoring capabilities and this is imho a must-have. 

So in this review I will go through all my old 0.4.1.105 checks, and check out if they are still working in 0.5.0.7. Please not that this is a nightly build and is not fit for production environments yet.

Installation

So download the latest version of NSClient++ and start the installation.
Choose the generic monitoring option. 

nsclient-0.5.0.7-installation-01

Choose custom setup:

nsclient-0.5.0.7-installation-02

Set the ip address of your Nagios server in the allowed hosts field and a strong password in the password field. You will need this passsword later to log in to the website. For now, choose the ‘Insecure legacy mode’ option. In order to use the ‘Safe mode’ and ‘Secure’ mode, you will have to install NSClient on your Nagios server too, but if you are only monitoring internal servers (not over the Internet), the ‘Insecure legacy mode’ should be ok. I’ll try to make a post about the other modes in the future.

nsclient-0.5.0.7-installation-03

Click next and NSClient++ will finish the installation.In order to understand al the default options and settings, it’s generally a good idea to add the default settings to the nsclient.ini file. This can be done with the following command from the NSClient++ installation folder:

In my case, with a fresh install, this generated some errors, but I’m quite sure this won’t be a real issue. Probabaly just related to the nightly build 0.5.0.7.

Webserver

So I was curious to see if I could get the NSClient++ webserver working, as in my previous tests (0.4.3.x) I never got it to work properly.

As checked the ‘Enable Web server’ checkbox, I was expecting it to work out of the box, which was not the case. Browsing to https://localhost:8443/ resulted in an ‘ERR_CONNECTION_REFUSED’ error.

So I had a look through the nsclient.ini and noticed that although I did enable it in the installation I still found this:

So after setting it to 1 and restarting the nscp service, I was able to log in with the password I configured during the installation. The webserver is using a self-signed certificate, which is better then nothing. If you have a certificate authority, you should be able to generate secure certificates so don’t get the ‘red cross’ in your browser.

nsclient-0.5.0.7-webserver-01

Home

After logging in, you immediately arrive in the Home webpage with some basic information, such as CPU Load, Processes, threads, handles and uptime

nsclient-0.5.0.7-webserver-02

It looks a bit like a remake of the Windows Task Manager, but with a little less information.

nsclient-0.5.0.7-webserver-03

There seems to be no X-axis information in the NSClient PCU webpage. The interval used in the Task Manager is one second, while in the NSclient webpage it is  seconds, which of course results in slightly different results. 

The NSClient webpage is of course accessible from ‘anywhere’ if set up correctly, which is definitely a plus. 

One more small remark, is that Michael seems to have chosen ‘CPU Load’ as the name which represents the CPU utilization. Imho this is quite confusing as on Linux servers, CPU load is more a value representing the current CPU queue. As NSClient is supposed to also work on Linux servers now, I think it should be named ‘CPU Usage’ (which is a bit shorter then ‘CPU Utilization’)

Besides CPU info, there is also some memory information:

nsclient-0.5.0.7-webserver-04

And a list of 38 metrics. I think these are all the metrics NSClient++ is caching, enabling it to calculate nice averages instead of current values.

nsclient-0.5.0.7-webserver-05

Modules

The second menu item ‘Modules’ lists all the available modules and their state. 

nsclient-0.5.0.7-webserver-06

So I tried checking an extra module to see if it is changed in the nsclient.ini, but apart from the checkbox being checked, nothing really changed. 

nsclient-0.5.0.7-webserver-07

As there is almost no documentation about the new webserver, I tried some things myself, but to no effect. I’m not quite sure what the reload and shutdown actions are supposed to do.

nsclient-0.5.0.7-webserver-08

I’ve tested this and it does not restart the nscp service. Shutdown doesn’t really seem to do anything yet.

And then I suddenly noticed that a new menu item appears ‘Changes’, which allows me to Save or Undo the configuration.

nsclient-0.5.0.7-webserver-10

It felt a bit weird that this menu items just appeared out of nowhere. Maybe it would better if it was always there, but with a green icon or when there are no detected changes. Something else I noticed is that when loading a module, you cannot enable this module unless you save it first.

In the nsclient.ini the modules I activated were properly adjusted. the only weird thing is that changes done with the web gui are using ‘enabled or diasbled’, while changes done in commandline, such as generating the defaults are using ‘0’ or ‘1’ to disable or enable a module. It would be nice if this was somehow more consistent.

Settings

The settings menu seems to need some work, as I saw a lot of ‘TODO’ and ‘Unknown’ strings for several items.

Also, I’m not quite sure what the ‘Changed’, ‘Basic’, ‘Advanced’ tabs are supposed to do.

nsclient-0.5.0.7-webserver-11

Queries

The queries menu gives an overview of all possible queries. 

nsclient-0.5.0.7-webserver-12

When you click on a query, you are linked to the module which enable you to use this query and you are able to see a ‘Help’ file with the usable arguments for the selected query.

nsclient-0.5.0.7-webserver-13

And it seems Michael also enables us to test a query:

nsclient-0.5.0.7-webserver-16

Which is a very nice feature. It would be nice to see a list of more complex working examples.

Log

The Log menu gives a nice filterable overview of the NSClient logfile:

nsclient-0.5.0.7-webserver-14

Console

Similar to the Logs menu, the Console menu gives also a filterable overview of all console messages.

nsclient-0.5.0.7-webserver-15

(Almost) Final words

The features I just listed are just a few of the many new exiting features in the new NSClient++. The webserver has a nice gui and is a nice preview of things to come. Thanks a lot Michael for sharing your work with the world.

I will continue writing on this review when I find the time.