Monitoring Windows with check_mk
May 19, 2009
The Windows Agent

Do you want to try an alternative to NSClient++? Check_mk ships its own agent for monitoring Windows hosts: check_mk_agent.exe. This agent installs as a Windows service and has several advantages over NSClient++:
Installation of the agent

Installing the agent is easy. Just copy check_mk_agent.exe to your Windows host and run it from a command shell with the option install:

C:\some\directory\> check_mk_agent.exe install

This installs a new Windows service called Check_MK_Agent. The service can be started with the Windows service manager or simply by entering:

C:\some\directory\> net start check_mk_agent

Test

If the agent is running properly, you should be able to connect from the Nagios host to TCP port 6556 on the Windows host. You can test this, for example, with telnet:

user@host> telnet windowshost 6556
<<<check_mk>>>
Version: 1.0.28
<<<df>>>
C:\ NTFS 18434584 6796648 11637936 37% C:\
<<<ps>>>
[System Process]
System
smss.exe
csrss.exe
winlogon.exe
services.exe
lsass.exe
... and so on...

Integration into check_mk

Integrating Windows hosts is nothing special and goes the usual way: add the host to all_hosts in main.mk and run the inventory with check_mk -I. After that, update your Nagios configuration files with check_mk -U and restart Nagios.
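The telnet test from the Test section can also be scripted. The following is a minimal sketch in Python and not part of check_mk. The function names fetch_agent_output and split_sections are hypothetical. The sketch assumes that the agent sends its complete output and then closes the connection, and that every section starts with a header line of the form <<<name>>>, as in the telnet transcript.

```python
import re
import socket

# Header lines look like <<<df>>>, as seen in the telnet transcript.
SECTION_HEADER = re.compile(r"^<<<(.+)>>>$")


def fetch_agent_output(host, port=6556, timeout=10.0):
    """Read everything the agent sends until it closes the connection."""
    chunks = []
    with socket.create_connection((host, port), timeout=timeout) as conn:
        while True:
            data = conn.recv(4096)
            if not data:  # empty read: the peer closed the connection
                break
            chunks.append(data)
    # Assumption: the output is plain text; undecodable bytes are replaced.
    return b"".join(chunks).decode("utf-8", errors="replace")


def split_sections(text):
    """Group output lines by the <<<section>>> header that precedes them."""
    sections = {}
    current = None
    for line in text.splitlines():
        match = SECTION_HEADER.match(line.strip())
        if match:
            current = match.group(1)
            sections.setdefault(current, [])
        elif current is not None and line.strip():
            sections[current].append(line)
    return sections
```

Calling split_sections(fetch_agent_output("windowshost")) against a host that answers like the transcript would yield a dictionary with the keys check_mk, df and ps. The host name windowshost is the placeholder used in the telnet example.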
Trying out the agent without installing it

The check_mk_agent lets you try it out without installing it as a service. The simplest way is to call it with the option test. This does not open a TCP socket but simply outputs all current data to your console:

C:\some\directory\> check_mk_agent.exe test
<<<check_mk>>>
Version: 1.0.29rc
<<<df>>>
C:\ NTFS 18434584 6559144 11875440 36% C:\
<<<ps>>>
[System Process]
System
smss.exe
csrss.exe
winlogon.exe

Another option is to start the agent with the option adhoc. Now it will open TCP port 6556 and handle requests just like the service. It will do so until you abort it by pressing Ctrl-C:

C:\some\directory\> check_mk_agent.exe adhoc
Listening for TCP connections on port 6556
Close window or press Ctrl-C to exit

Information provided by the agent

As of version 1.0.28 the agent provides access to the following data:
The performance counters provide useful information about a variety of system and application parameters. Check_mk currently extracts disk throughput and CPU usage. Please contact us if you need support for monitoring further aspects of your system or application.

Configuring Checks in main.mk

Most of the items to be checked are found by the inventory function. This does not hold for processes and services, since it is up to you which of them to monitor.

Processes

The output of the Windows agent is compatible with that of the Linux and UNIX agents with respect to processes. Please refer to "How to monitor processes".

Services

In order to monitor services you first need to determine which services are of interest to you. The easiest way is to look at the raw output of the agent and find the section <<<services>>>. You can use check_mk -d for this:

user@host> check_mk -d winhostxy | fgrep -A 10 '<<

The first column of the output is the exact internal name of the service. Let's say you want to check whether ALG is running on host winhostxy. Then put the following line into your checks variable:

main.mk
checks = [
    ( 'winhostxy', 'services', 'ALG', None ),
    # some other checks...
]

If you have a larger number of Windows hosts, defining for each host which services you expect is tedious and error-prone. Check_mk helps you by providing an inventory mechanism for services. All you have to do is provide a list of relevant services. This list is global and needs to be defined only once, in main.mk, in the variable inventory_services. When check_mk scans a Windows host during the inventory, it looks for these relevant services and automatically creates a check for each one found running. Let's assume that the services TSMListener, Httpd and TapiSrv should always be monitored if found running on a machine.
Simply add the following to your main.mk:

main.mk
inventory_services = [ 'TSMListener', 'Httpd', 'TapiSrv' ]

At the next inventory, all hosts where those services run will be detected and the checks created automatically.

Eventlog

The Windows agent sends output that is fully compatible with that of the Logwatch extension of the Linux/UNIX agent and is thus handled in the same way. For the sake of simplicity there are nevertheless some differences:
What does this mean in detail? When the agent is started (most probably at boot time of the host), it seeks to the current end of the Eventlogs and waits there for new records. Only records appearing while the agent is running are sent to Nagios. If the agent is stopped and started again, it could theoretically miss some messages. As the agent runs permanently, this should not be a practical problem, though.

Since the agent is completely configuration-less, it does no specific filtering of events. It simply looks for messages of type Warning or Error. If such a message is seen, the complete check interval is declared relevant and the agent sends all messages of that logfile that appeared since the previous check to Nagios, even those of type Information. This gives the administrator more context information about the problem at hand on the Nagios server.

If you want to suppress some messages or reclassify them from Warning to Critical or vice versa, you can define a message filter in main.mk. This is done by setting the variable logwatch_patterns. This is a Python dictionary with a key for each logfile. The value is a list of pairs:

main.mk
logwatch_patterns = {
    'System': [
        ( 'W', 'sshd' ),
        ( 'W', 'rebooting.*system' ),
        ( 'C', 'path link down' ),
        ( 'I', 'ORA-4711' )
    ],
    'Application': [
        ( 'W', 'crash.exe' ),
        ( 'C', 'ssh' ),
        ( 'I', 'test.*failed' )
    ]
}
All patterns for a logfile are executed from first to last. The first match wins. The entry ( 'W', 'sshd' ) reclassifies all messages containing sshd to Warning. There are three possible types:
Note that the patterns are regular expressions. Thus the entry ( 'I', 'test.*failed' ) reclassifies all messages containing the word test followed later by the word failed. Messages that do not match any pattern retain their classification from the agent. Messages that are classified as context messages by the agent are never reclassified.

Host specific filtering of messages

As of version 1.0.37 of check_mk, host specific message filtering is supported. That means that the reclassification in logwatch_patterns can depend on the host where the message was found. Host specific patterns include a host list, or a host tag list and a host list, as the first elements of the entry. This works much like many other configuration variables. Please read more about host tags for details on that. The following example makes some of the patterns from the example above host specific:

main.mk
logwatch_patterns = {
    'System': [
        # reclassify only on host abc123
        ( ["abc123"], 'W', 'sshd' ),
        # the following holds for all hosts
        ( 'C', 'path link down' ),
        # reclassify message to "ignore" on all hosts with the tag "test"
        ( ["test"], ALL_HOSTS, 'I', 'ORA-4711' )
    ],
    'Application': [
        # Do not reclassify on host "testhost"
        ( ["!testhost"], 'W', 'crash.exe' ),
        # make ssh critical on "dmz" hosts that do not have the tag "test"
        ( ["dmz", "!test"], ALL_HOSTS, 'C', 'ssh' ),
        # this is for all hosts again
        ( 'I', 'test.*failed' )
    ]
}
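To illustrate the first-match-wins evaluation described earlier in this section, here is a minimal sketch in plain Python. It is not check_mk's actual implementation. The function name reclassify is hypothetical, and the sketch handles only the simple two-element entries (level, pattern). Host lists, host tags and the rule that context messages are never reclassified are deliberately left out. It assumes that a pattern may match anywhere in the message text.

```python
import re

# Simple two-element entries copied from the first logwatch_patterns example.
logwatch_patterns = {
    'System': [
        ('W', 'sshd'),
        ('W', 'rebooting.*system'),
        ('C', 'path link down'),
        ('I', 'ORA-4711'),
    ],
    'Application': [
        ('W', 'crash.exe'),
        ('C', 'ssh'),
        ('I', 'test.*failed'),
    ],
}


def reclassify(patterns, logfile, level, message):
    """Return the level of the first matching pattern, else the original level."""
    for new_level, pattern in patterns.get(logfile, []):
        # re.search finds the pattern anywhere in the message text.
        if re.search(pattern, message):
            return new_level  # first match wins, later entries are not tried
    return level  # no pattern matched: keep the agent's classification


# Both 'crash.exe' and 'ssh' match; the earlier entry decides.
print(reclassify(logwatch_patterns, 'Application', 'C', 'crash.exe lost its ssh session'))  # prints W
# No pattern matches, so the original level is kept.
print(reclassify(logwatch_patterns, 'System', 'C', 'disk C: is almost full'))  # prints C
```

The order of the entries therefore matters: a broad pattern placed early hides more specific patterns that follow it.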
Extending the Windows agent (requires 1.1.7i3 or later)

Plugins and local checks

As of version 1.1.7i3, the Check_MK agent for Windows can be extended with local checks and plugins, just like the Unix agents. Local checks are (usually simple) scripts or programs that perform self-written checks and compute the results directly on the target machine. Plugins are scripts or programs that output agent sections similar to those built into the agent. Several such scripts are shipped together with the agent and are found in the subdirectory plugins of the directory where the agent is located. In order to use such plugins you need to:
One example of such a plugin is wmicchecks.bat, which uses wmic to output a list of processes with their resource consumption:

wmicchecks.bat
@echo off
echo ^<^<^<wmic_process:sep^(44^)^>^>^>
wmic process get name,pagefileusage,virtualsize,workingsetsize,usermodetime,kernelmodetime /format:csv

In order to make use of that agent information, your installation of Check_MK needs a check that can process that data. The checks needed for the shipped plugins are part of Check_MK. A tutorial for writing your own checks can be found here.

MRPE

MK's Remote Plugin Executor is not yet part of the Windows agent, but it will come in a future version of the agent and will allow calling classical Nagios plugins.
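Returning to the wmicchecks.bat plugin: its section header, <<<wmic_process:sep(44)>>>, carries a separator option. 44 is the ASCII code of the comma, which fits the CSV output requested from wmic. The following Python sketch shows how such a header might be interpreted. It is an illustration only and not the parser Check_MK uses. The function names parse_header and split_row and the sample data line are made up, and splitting on whitespace when no separator is given is an assumption.

```python
import re

# Matches <<<name>>> and <<<name:sep(NN)>>>; other header options are not handled.
HEADER = re.compile(r"^<<<([^:>]+)(?::sep\((\d+)\))?>>>$")


def parse_header(line):
    """Return (section_name, separator), or None if the line is not a header.

    The separator is None when the header carries no sep(...) option.
    """
    match = HEADER.match(line.strip())
    if match is None:
        return None
    name, code = match.groups()
    return name, (chr(int(code)) if code is not None else None)


def split_row(row, separator):
    """Split a data line on the separator, or on whitespace if there is none."""
    return row.split(separator)  # str.split(None) splits on runs of whitespace


name, sep = parse_header("<<<wmic_process:sep(44)>>>")
# Hypothetical data line for illustration, not real wmic output.
print(name, split_row("HOST1,1000,example.exe,2048", sep))
```

A header without the option, such as <<<df>>>, yields a separator of None, and the row is then split on whitespace instead.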