Local checks - how to write your own checks
Last updated: 4 March 2013
1. The principle of local checks
Local checks are an easy way to write your own checks with Check_MK. You do not need to change or extend Check_MK itself. You just write simple check scripts, in the programming language of your choice, that run directly on the host being checked. Since version 1.1.7i3 all agents support local checks.
2. How to write local checks
To write a local check, put an executable program (usually a shell script or similar) into a specific directory that the agent scans. On Linux/UNIX target hosts the default directory for local checks is /usr/lib/check_mk_agent/local.
The Windows agent looks in the subdirectory local of the directory where check_mk_agent.exe is installed. If you are unsure, you can see the path in the first few lines of the agent output:
    root@linux# check_mk -d winhost | head
    <<<check_mk>>>
    Version: 1.1.7i2
    AgentOS: windows
    WorkingDirectory: C:\WINDOWS\system32
    AgentDirectory: C:\check_mk
    PluginsDirectory: C:\check_mk\plugins
    LocalDirectory: C:\check_mk\local
    <<<df>>>
    C:\ NTFS 4184900 4083036 101864 98% C:\
    <<<ps>>>
    close failed: [Errno 32] Broken pipe
On Windows you have to manually create the local directory.
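The agent simply executes every executable file in that directory and prints its output in the section <<<local>>>. Each local check may output one or more lines with four columns of text: the status (0 for OK, 1 for WARNING, 2 for CRITICAL, 3 for UNKNOWN), the item name, the performance data (or a dash if there is none), and the check output text.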
This is an example output of a check script counting the number of files in certain directories:
    2 Filecount_/var/log count=102 CRITICAL - 102 files in /var/log
    0 Filecount_/tmp count=7 OK - 7 files in /tmp
And this is a possible implementation of the check above:
    #!/bin/bash
    # Count the files in each directory and print one local check line per directory.
    DIRS="/var/log /tmp"
    for dir in $DIRS
    do
        count=$(ls "$dir" | wc -l)
        if [ "$count" -lt 50 ] ; then
            status=0
            statustxt=OK
        elif [ "$count" -lt 100 ] ; then
            status=1
            statustxt=WARNING
        else
            status=2
            statustxt=CRITICAL
        fi
        # Columns: status, item name, performance data (value;warn;crit;min;max), text
        echo "$status Filecount_$dir count=$count;50;100;0; $statustxt - $count files in $dir"
    done
3. Inventory - Integration into Nagios
Integrating the local checks into Nagios is easy: simply do an inventory of the host, either with all checks (cmk -I HOSTNAME) or only with the local checks (check_mk -I --checks local HOSTNAME). All (new) local checks will be found and Nagios services will be created for you. The performance data will be collected and processed just like that of the other checks.
4. PNP Templates for local checks
Note for OMD users: The setup will work out of the box as described below. If you want to have custom graph templates, read on.
PNP4Nagios uses the name of the check command for selecting a template for rendering the RRD graph. Since all local checks have the same command check_mk-local, the template check_mk-local.php will be used for all cases.
Since version 1.0.37, Check_MK ships a wrapper template with that name, which tries to find a service-specific template file matching a prefix of the service description. Let's assume the service description is LDAP auth/4. The wrapper first tries to find the file LDAP auth_4.php (slashes are replaced with underscores). If that file is present, it is used as the template. Otherwise the file LDAP auth_.php is tried, then LDAP auth.php, LDAP aut.php and so on until L.php. If none of those files is present, PNP4Nagios' default.php is loaded and generates a graph based on your warn/crit levels.
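The lookup order described above can be sketched in a few lines of shell. This is only an illustration of the prefix search, not the wrapper's actual code; the function name template_candidates is hypothetical.

```shell
#!/bin/bash
# Illustration only: print the template file names that would be tried,
# in order, for a given service description.
# template_candidates is a hypothetical name, not part of Check_MK or PNP4Nagios.
template_candidates() {
    local name="${1//\//_}"    # slashes are replaced with underscores
    while [ -n "$name" ]; do
        echo "${name}.php"
        name="${name%?}"       # drop the last character and try the shorter prefix
    done
    echo "default.php"         # fallback when no prefix matches
}

template_candidates "LDAP auth/4"
```

The first name printed is LDAP auth_4.php and the last one before default.php is L.php, so a template for a whole family of services only needs a file name that is a common prefix of their descriptions.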
5. Multiple performance values
As of version 1.1.5 it is possible to send more than one variable in a check. Simply separate your variable definitions with a pipe symbol (do not use spaces). Here is an example output of a check sending four performance values:
0 Nagios_Status ok=33012|crit=17|warn=120|unkw=16 Services ok/cr/wa/un: 33012/17/120/16
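A script could build such a line roughly as follows. This is a minimal sketch: the helper join_perfdata is hypothetical, and the values are hard-coded here only to reproduce the example line.

```shell
#!/bin/bash
# Sketch: join several name=value definitions with "|" and no spaces
# to form the performance data column. join_perfdata is a hypothetical helper.
join_perfdata() {
    local IFS='|'
    echo "$*"      # "$*" joins all arguments with the first character of IFS
}

perf=$(join_perfdata "ok=33012" "crit=17" "warn=120" "unkw=16")
echo "0 Nagios_Status $perf Services ok/cr/wa/un: 33012/17/120/16"
```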
6. Cached local checks (new in 1.2.3i1 in the Linux agent)
Sometimes a script runs for longer than a few seconds. If the total run time of all scripts and plugins of an agent exceeds the timeout for active checks of the monitoring core (usually 60 or 120 seconds), the complete check is aborted. To avoid this, you can have local checks run asynchronously and use cache files. To do so, put your script into a subdirectory whose name is a number: the number of seconds for which the output of the script is valid. For example, the output of a script placed in a subdirectory named 300 is considered valid for 300 seconds.
In this case the agent runs the script asynchronously and delivers its output from a cache file for as long as that output is valid.
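As a rough illustration of the caching idea (not the agent's actual implementation), the following sketch reuses a cache file while it is younger than a maximum age and refreshes it otherwise. The function name run_with_cache and the cache file location are assumptions, the sketch runs the command synchronously rather than in the background, and stat -c %Y assumes GNU stat as found on Linux.

```shell
#!/bin/bash
# Simplified sketch of output caching; NOT the agent's real code.
# Usage: run_with_cache MAXAGE CACHEFILE COMMAND [ARGS...]
run_with_cache() {
    local maxage="$1" cachefile="$2"
    shift 2
    local now mtime
    now=$(date +%s)
    if [ -f "$cachefile" ]; then
        mtime=$(stat -c %Y "$cachefile")    # modification time in epoch seconds (GNU stat)
        if [ $((now - mtime)) -lt "$maxage" ]; then
            cat "$cachefile"                # cache is still valid: reuse it
            return 0
        fi
    fi
    # Cache is missing or expired: run the command and refresh the cache.
    # Writing to a temporary file first avoids exposing partial output.
    "$@" > "$cachefile.new" && mv "$cachefile.new" "$cachefile"
    cat "$cachefile"
}

# Demo in a scratch directory: the second call is answered from the cache.
demo_dir=$(mktemp -d)
run_with_cache 300 "$demo_dir/demo.cache" echo "0 Demo_Check - fresh output"
run_with_cache 300 "$demo_dir/demo.cache" echo "0 Demo_Check - not shown while the cache is valid"
rm -rf "$demo_dir"
```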
Werk #0016 (1.2.3i7) - Linux+Windows agent: allow spooling plugin outputs via files
The Windows and Linux agents now have a new feature for sending the contents of files as part of the agent output. This is useful for generating monitoring data asynchronously, e.g. by a cron job.
Simply let the job create or update a file in the directory /etc/check_mk/spool (Linux) or in the subdirectory spool of the agent directory (Windows). If that directory is missing, simply create it. The agent will then add the contents of all files contained in that directory to its output. You can use any filename you like. Only files beginning with a dot are ignored. This is an easy way to have applications on the host drop monitoring data into Check_MK. Using a <<<local>>> section here is especially convenient.
If you prefix the file name with a number (e.g. 600MyOutput or 3600_app_data.txt), that number is interpreted as a number of seconds. If the last modification of the file is older than that many seconds, the file is ignored. This will usually set the corresponding services in the monitoring to UNKNOWN. That way you can make sure that you are alerted if no fresh monitoring data is available.
Here is an example for a spool file using a local section:
    <<<local>>>
    0 Service_FOO V=1 This Check is OK
    1 Bar_Service - This is WARNING and has no performance data
    2 NotGood V=120;50;100;0;1000 A critical check
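A job that produces such a spool file might look like the sketch below. The function name write_spool_file and the file name 600_myapp are hypothetical. The sketch writes to a temporary file whose name begins with a dot, relying on the rule above that such files are ignored, and then renames it, so the agent should never pick up a half-written file.

```shell
#!/bin/bash
# Sketch: store stdin as SPOOLDIR/NAME without exposing partial content.
# write_spool_file is a hypothetical helper, not part of the agent.
write_spool_file() {
    local spooldir="$1" name="$2"
    local tmp="$spooldir/.$name.tmp"    # files beginning with a dot are ignored
    mkdir -p "$spooldir" || return 1
    cat > "$tmp" && mv "$tmp" "$spooldir/$name"
}

# Demo against a scratch directory; on a Linux host the target would be
# /etc/check_mk/spool. The prefix 600 marks the data as valid for 600 seconds.
demo_dir=$(mktemp -d)
write_spool_file "$demo_dir" "600_myapp" <<'EOF'
<<<local>>>
0 Service_FOO V=1 This Check is OK
EOF
cat "$demo_dir/600_myapp"
rm -rf "$demo_dir"
```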
Werk #0017 (1.2.3i7) - local: New state type P for state computation based on perfdata
The section <<<local>>> now allows a new state marker P (in addition to 0, 1, 2 and 3). When this marker is set, the check plugin computes the state from the levels contained in the performance data. Take the following example:
    <<<local>>>
    P Environment temp=30;28;35|humidity=33;40:60;35:70;0;100 This is a text
The check will first check the variable temp. Its current value is 30. The levels for warning and critical are 28 and 35, respectively. Because 30 is greater than 28, this triggers a warning.
The second performance value, humidity, has both lower and upper levels for warning and critical, separated by a colon. The current value is 33, which is lower than the lower critical level of 35. This makes this variable, and thus the total check, critical.
Example for a local line without a text:
    <<<local>>>
    P Environment temp=30;28;35|humidity=33;40:60;35:70;0;100
The output will be:
CRIT - temp 30.0 > 28.0(!), humidity 33.0 < 35.0(!!)
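The level evaluation for a single performance value can be sketched as follows. This is an illustration of the rules described above, not the check plugin's code. The function name level_state is hypothetical, and treating a value exactly equal to a level as still OK (strict comparison) is an assumption.

```shell
#!/bin/bash
# Sketch of level evaluation for one performance value; not the plugin code.
# Usage: level_state VALUE WARN CRIT
# Prints 0 (OK), 1 (WARN) or 2 (CRIT). WARN and CRIT are "UPPER" or "LOWER:UPPER".
level_state() {
    awk -v v="$1" -v warn="$2" -v crit="$3" '
        # Returns 1 if v lies outside the level. Strict comparison is an assumption.
        function violated(level,    parts, n) {
            if (level == "") return 0
            n = split(level, parts, ":")
            if (n == 2) return (v + 0 < parts[1] + 0 || v + 0 > parts[2] + 0)
            return (v + 0 > level + 0)
        }
        BEGIN {
            if (violated(crit)) print 2
            else if (violated(warn)) print 1
            else print 0
        }'
}

level_state 30 28 35          # temp: above the warning level 28, prints 1
level_state 33 40:60 35:70    # humidity: below the lower critical level 35, prints 2
```

The state of the whole check is then the worst state among all of its performance values, which is why the example above ends up critical.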
7. Limitations of local checks
Local checks are easy to set up and have many other advantages. There are a few limitations, however: