Manually configured checks
September 19, 2009
When not to use inventory
Inventory is one of the most important reasons to use check_mk. In some cases, however, inventory is either not possible or the check parameters it chooses are not desirable for you. For example, the check cpu.threads monitors the number of processes and threads on a Linux host. This check is automatically found and configured by the inventory. The warning and critical levels can be set globally by setting threads_default_levels in main.mk.
But what if levels other than the defaults should apply to some hosts? In that case manual configuration is necessary.
Another example is checks that do not support inventory at all. Currently, as of version 1.0.36, this holds for the check type ps. If you want to monitor certain processes for their existence, this always has to be done manually.
How to configure manual checks
Checks are configured manually by entering them into the configuration variable checks in main.mk. This variable is a list of entries (tuples). Each entry specifies the host or hosts the check applies to, the check type, the check item, and the check parameter.
A first example
Let's first see an example of two manually configured checks:
checks = [
    ( "abc123", "df", "/var", (80.0, 90.0) ),
    ( "abc123", "df", "/tmp", (90.0, 95.0) ),
]
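To make the four-part structure of an entry concrete, here is a minimal sketch in plain Python that unpacks each tuple into its parts. The function name describe and the output format are made up for this illustration and are not part of check_mk.

```python
# The two entries from the example above, in main.mk syntax.
checks = [
    ("abc123", "df", "/var", (80.0, 90.0)),
    ("abc123", "df", "/tmp", (90.0, 95.0)),
]

def describe(entry):
    """Unpack a four-element entry and return a readable summary.

    Hypothetical helper for illustration only; check_mk does not provide it.
    """
    host, check_type, item, params = entry
    return "host=%s type=%s item=%s params=%r" % (host, check_type, item, params)

for entry in checks:
    print(describe(entry))
```

Each position in the tuple has a fixed meaning, so the order of the four elements matters.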
All four elements of an entry will now be discussed in detail:
1. Host specification
In the example above we used a single hostname (abc123) as host specification. It is also possible to specify more than one host for a check. There are three ways to do this. The first one is to use a list of hostnames. Lists are written in square brackets. The following example configures a filesystem check for the two hosts abc123 and def456:
checks = [
    ( ["abc123", "def456"], "df", "/var", (80.0, 90.0) ),
]
If you have more than a couple of hosts, lists become inconvenient. Host tags are a much more powerful and yet simple way to specify groups of hosts. Tags are described in detail in a separate article, so I will just give a simple example here. Let's assume you want to configure a process check for /usr/sbin/ntpd on all Linux hosts. First tag all Linux hosts with some tag, let's say lnx, in all_hosts. Use a vertical bar to separate the tags and the hostname:
all_hosts = [ "abc123|lnx", "def456|lnx", "xyz987|win", ]
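As a rough illustration of the "hostname|tag" notation, the following sketch splits such entries at the vertical bar and groups host names by tag. The helper hosts_by_tag is hypothetical and only mirrors the syntax shown here; it is not how check_mk itself processes all_hosts.

```python
# Same "hostname|tag" syntax as in the all_hosts example.
all_hosts = ["abc123|lnx", "def456|lnx", "xyz987|win"]

def hosts_by_tag(entries):
    """Return a dict mapping each tag to the host names that carry it.

    Hypothetical helper for illustration; not part of check_mk.
    """
    index = {}
    for entry in entries:
        # The first field is the host name, all following fields are tags.
        name, *tags = entry.split("|")
        for tag in tags:
            index.setdefault(tag, []).append(name)
    return index

tag_index = hosts_by_tag(all_hosts)
print(tag_index["lnx"])
```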
Now replace the list of hosts with a list of tags and append the keyword ALL_HOSTS. Without that keyword check_mk cannot know that you mean tags and not hostnames:
checks = [ ( ["lnx"], ALL_HOSTS, "df", "/var", (80.0, 90.0) ), ]
That check will be performed on all hosts with the tag lnx. A third way is to configure a check for all hosts without exception. This is done by using just the keyword ALL_HOSTS as host specification:
checks = [
    # This check will be performed on simply all hosts
    ( ALL_HOSTS, "df", "/var", (80.0, 90.0) ),
]
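The host specification forms described in this section (single hostname, list of hostnames, list of tags plus ALL_HOSTS, and ALL_HOSTS alone) can be sketched as one small resolver. Everything here is an assumption made for illustration: the ALL_HOSTS object is a stand-in for check_mk's keyword, hosts_for_entry is a hypothetical function, and the sketch assumes a host must carry every listed tag to match.

```python
# Stand-in for check_mk's ALL_HOSTS keyword (assumption for this sketch).
ALL_HOSTS = ["@all"]

all_hosts = ["abc123|lnx", "def456|lnx", "xyz987|win"]

def hosts_for_entry(entry, host_entries):
    """Return the host names an entry of checks applies to (illustrative only)."""
    # Map each host name to its set of tags.
    known = {}
    for host_entry in host_entries:
        name, *tags = host_entry.split("|")
        known[name] = set(tags)

    if len(entry) == 5 and entry[1] is ALL_HOSTS:
        # List of tags followed by ALL_HOSTS: every listed tag must be present.
        wanted = set(entry[0])
        return sorted(h for h, tags in known.items() if wanted <= tags)

    spec = entry[0]
    if spec is ALL_HOSTS:
        # ALL_HOSTS alone: simply all hosts.
        return sorted(known)
    if isinstance(spec, str):
        # A single hostname.
        return [spec]
    # An explicit list of hostnames.
    return list(spec)

print(hosts_for_entry((["lnx"], ALL_HOSTS, "df", "/var", (80.0, 90.0)), all_hosts))
```

The five-element form is recognized by its length and by ALL_HOSTS in second position, which is why the keyword is needed to tell tags apart from hostnames.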
2. The check type
The second entry in the tuple is the check type. A list of all check types your version of check_mk supports is output by check_mk -L. Note that it is intentional, not a bug, that some check types contain underscores and others contain dots:
root@linux# check_mk -L
Available check types:

                   plugin  perf-  in-
Name               type    data   vent.  service description
-------------------------------------------------------------------------
blade_bays         snmp    no     yes    BAY %s
blade_misc         snmp    yes    yes    SENSOR %s
blade_powerfan     snmp    yes    yes    Power Module Cooling Device %s
blade_powermod     snmp    no     yes    Power Module %s
bluecoat_diskcpu   snmp    yes    yes    %s
bluecoat_sensors   snmp    yes    yes    %s
cisco_fan          snmp    no     yes    %s
cisco_locif        snmp    yes    yes    Port %s
cisco_power        snmp    no     yes    %s
cisco_temp         snmp    yes    yes    %s
cpu.loads          tcp     yes    yes    CPU load
cpu.threads        tcp     yes    yes    Number of threads
df                 tcp     yes    yes    fs_%s
df_netapp          snmp    yes    yes    fs_%s
diskstat           tcp     yes    yes    Disk IO %s
fc_brocade_port    snmp    yes    yes    PORT %s
fsc_fans           snmp    yes    yes    FSC %s
fsc_subsystems     snmp    no     yes    FSC %s
fsc_temp           snmp    yes    yes    FSC TEMP %s
ifoperstatus       snmp    yes    yes    Interface %s
ipmi               tcp     yes    yes    IPMI Sensor %s
ironport_misc      snmp    yes    yes    %s
kernel             tcp     yes    no     Kernel %s
kernel.util        tcp     yes    yes    CPU utilization
local              tcp     yes    yes    %s
logwatch           tcp     no     yes    LOG %s
lsi.array          tcp     no     yes    RAID array %s
lsi.disk           tcp     no     yes    RAID disk %s
...
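For each check type the output also shows the plugin type (snmp or tcp), whether the check provides performance data, whether it supports inventory, and the template for its service description.
Some checks (hopefully all of them at some point in the future) provide a manual page. Please refer to the table of checks for a list of all check manual pages.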
3. Check item
The check item comes next after the check type. The item specifies which particular piece of hardware or software should be checked. For example, the check type df needs the mount point of the filesystem (or the Windows drive letter) as item. Other check types just have a single item. One example is mem, which measures the current usage of physical and virtual memory. In such cases the check item must be None (without quotes!). Please refer to the check manual for details of each check. Here is an example:
checks = [ ( "abc123", "mem", None, (80.0, 120.0) ), ]
4. The check parameter
The kind of parameter that a check type expects varies. Some check types simply ignore the parameter. Specify None or an empty string "" in such a case. If more than one value is needed, all values must be enclosed in round brackets (a Python tuple), as seen in the examples above.
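For checks that take a pair of levels, the tuple can be read as (warning, critical). The sketch below shows how such a pair could be unpacked and applied to a measured value. The helper state_for and the use of "greater than or equal" as comparison are assumptions for illustration; the exact semantics depend on the individual check type and are described in its manual page.

```python
def state_for(value, levels):
    """Map a measured value to a state name using a (warn, crit) tuple.

    Hypothetical helper; the comparison rule is an assumption of this sketch.
    """
    warn, crit = levels  # tuple unpacking, e.g. (80.0, 90.0)
    if value >= crit:
        return "CRIT"
    if value >= warn:
        return "WARN"
    return "OK"

# Levels taken from the df example: warning at 80.0, critical at 90.0.
print(state_for(85.0, (80.0, 90.0)))
```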
What happens if a check for a specific host and item is defined more than once? Only one check will be created from your configuration. Two rules are applied here:
Example of how to define CPU load parameters for all Linux hosts but make an exception for two specific hosts:
checks = [
    # All hosts with tag 'lnx' get levels 10 and 20...
    ( ["lnx"], ALL_HOSTS, "cpu.loads", None, (10, 20) ),
    # ...but sv01 and sv02 get levels 5 and 10
    ( ["sv01", "sv02"], "cpu.loads", None, (5, 10) ),
]
Checks found by the inventory are stored in text files within /var/lib/check_mk/autochecks/. Those files contain lists with the same syntax as needed for checks. Here is an example:
# /var/lib/check_mk/autochecks/mrpe-2009-09-18_01.20.54.mk
[
  # === localhost ===
  ("localhost", "mrpe", 'LOAD', ""), #
  ("localhost", "mrpe", 'FS_var', ""), #
  ("localhost", "mrpe", 'FS_log', ""), #
  ("localhost", "mrpe", 'Aptitude', ""), #
  ("localhost", "mrpe", 'Smart_sda', ""), #
]
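Because these files use plain Python list syntax, their content can be inspected with a few lines of Python. The sketch below parses a shortened copy of the example with ast.literal_eval from the standard library. This is only a safe stand-in chosen for illustration, not necessarily how check_mk itself reads the files, and load_autochecks is a hypothetical name.

```python
import ast

# Shortened copy of the autochecks example, held in a string for this sketch.
autochecks_text = '''[
  # === localhost ===
  ("localhost", "mrpe", 'LOAD', ""),
  ("localhost", "mrpe", 'FS_var', ""),
]'''

def load_autochecks(text):
    """Parse the list literal of an autochecks file into a list of tuples.

    Hypothetical helper; literal_eval only accepts plain Python literals,
    so a keyword like ALL_HOSTS would be rejected.
    """
    return [tuple(entry) for entry in ast.literal_eval(text)]

entries = load_autochecks(autochecks_text)
items = [item for (_host, _check_type, item, _params) in entries]
print(items)
```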
You can copy lines you need into checks. If a check exists
both in checks and in