Inventory - How Check_MK finds services to check

1. Introduction

Configuring which checks should run on which hosts is tedious work in Nagios. Keeping that configuration up to date is a further problem: your colleagues introduce new filesystems, new network interfaces and new database instances without always informing you. How can you be sure that every important item is really being monitored?

Check_MK helps you not only to scan new hosts for items to check but also to keep track of your existing hosts. It can do so because of the special nature of its agents: They always send all interesting data about the host regardless of which items are checked with Nagios.

All of Check_MK's check plugins support automatic detection of services, i.e. inventory. A few of them need a bit of configuration (for example checks for processes and services), but in most cases everything happens automatically. If you are curious which checks are shipped with Check_MK, use the option -L (the list is abbreviated here):

root@linux# cmk -L
Available check types:

                      plugin   perf-  in-
Name                  type     data   vent.  service description
-------------------------------------------------------------------------
3ware_disks           tcp      no     yes    RAID 3ware disk %s
3ware_info            tcp      no     yes    RAID 3ware controller %s
3ware_units           tcp      no     yes    RAID 3ware unit %s
ad_replication        tcp      no     yes    AD Replication %s
aironet_clients       snmp     yes    yes    Average client signal %s
aironet_errors        snmp     yes    yes    MAC CRC errors radio %s
apc_symmetra          snmp     yes    yes    APC Symmetra status
apc_symmetra_ext_temp   snmp     yes    yes    APC External Temp %s
apc_symmetra_power    snmp     yes    yes    Power phase %s
apc_symmetra_temp     snmp     yes    yes    %s
blade_bays            snmp     no     yes    BAY %s
blade_blowers         snmp     yes    yes    Blower %s
blade_health          snmp     no     yes    Summary health state
blade_mediatray       snmp     no     yes    Media tray
blade_misc            snmp     yes    yes    SENSOR %s
blade_powerfan        snmp     yes    yes    Power Module Cooling Device %s
blade_powermod        snmp     no     yes    Power Module %s
bluecoat_diskcpu      snmp     yes    yes    %s
bluecoat_sensors      snmp     yes    yes    %s
cisco_fan             snmp     no     yes    FAN %s
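
If you want to process that list in a script, the following Python sketch may help. It is only an illustration and assumes that the column layout is exactly as shown above (four columns without spaces, followed by a non-empty service description); parse_check_list is a made-up helper name, not part of Check_MK.

```python
def parse_check_list(output):
    """Parse the table printed by `cmk -L` into a list of dicts."""
    checks = []
    in_table = False
    for line in output.splitlines():
        if line.startswith("---"):
            in_table = True  # table rows start after the dashed separator
            continue
        if not in_table or not line.strip():
            continue
        # Assumption: the first four columns contain no spaces,
        # the service description may contain some.
        name, plugin_type, perfdata, inventory, description = line.split(None, 4)
        checks.append({
            "name": name,
            "type": plugin_type,
            "perfdata": perfdata == "yes",
            "inventory": inventory == "yes",
            "description": description,
        })
    return checks


SAMPLE = """\
Available check types:

                      plugin   perf-  in-
Name                  type     data   vent.  service description
-------------------------------------------------------------------------
3ware_disks           tcp      no     yes    RAID 3ware disk %s
apc_symmetra          snmp     yes    yes    APC Symmetra status
cisco_fan             snmp     no     yes    FAN %s
"""

parsed = parse_check_list(SAMPLE)
snmp_checks = [c["name"] for c in parsed if c["type"] == "snmp"]
print(snmp_checks)
```

In practice you would feed the function the real output of cmk -L instead of SAMPLE.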

2. Performing an inventory

Inventory is not done automatically (for good reasons). You perform it by calling cmk with the option -I and the list of hosts to inventorize (i.e. to scan for new checks on):

root@linux# cmk -I somehost otherhost

You may also leave out the host names. Check_MK will then inventorize all hosts (you'll probably do this only in small installations):

root@linux# cmk -I

When you want to restrict the inventory to one or several check types, you need the option --checks= before the option -I. Separate several check types with commas. The following call inventorizes the checks snmp_info and df_netapp:

root@linux# cmk --checks=snmp_info,df_netapp -I filer01 filer02
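
If you trigger inventories from a script, the ordering rule for --checks= is easy to get wrong. The following Python sketch builds the argument list in the right order. build_inventory_command is a hypothetical helper, and the sketch only assembles the command; it does not run it.

```python
def build_inventory_command(hosts=(), check_types=()):
    """Assemble the argument list for an inventory call of cmk."""
    argv = ["cmk"]
    if check_types:
        # --checks= has to come before -I; several types are comma separated
        argv.append("--checks=" + ",".join(check_types))
    argv.append("-I")
    argv.extend(hosts)  # no hosts at all means: inventorize all hosts
    return argv


command = build_inventory_command(
    hosts=["filer01", "filer02"],
    check_types=["snmp_info", "df_netapp"],
)
print(" ".join(command))
```

The resulting list could be handed to subprocess.run(command, check=True) on a machine where cmk is installed.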

2.1. More flexible host specification

As of version 1.1.13i2 you can also specify one or more host tags by prefixing them with an @:

root@linux# cmk -I @linux @windows

The call above will inventorize all linux hosts and all windows hosts. When you need a combination of host tags in order to make the inventory more specific, join the tags with commas. The following example will inventorize all hosts that have both the tag prod and the tag linux:

root@linux# cmk -I @linux,prod

As long as none of your hosts happens to have the same name as a tag, you may also leave out the @:

root@linux# cmk -I linux,prod

If you have defined clusters (configuration variable clusters), please note that inventory is always done on the physical nodes. As of version 1.1.13i2, however, you can specify the cluster when doing inventory. Check_MK will automatically replace it with the list of the cluster's nodes.
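
The selection rules of this section can be sketched in a few lines of Python. This is not Check_MK's actual code: resolve_targets and the two dictionaries are hypothetical, simplified stand-ins for the real host and cluster configuration.

```python
def resolve_targets(args, host_tags, clusters):
    """Turn the arguments after -I into a list of physical hosts."""
    selected = []

    def add(host):
        if host not in selected:
            selected.append(host)

    for arg in args:
        if not arg.startswith("@") and arg in clusters:
            for node in clusters[arg]:      # a cluster is replaced by its nodes
                add(node)
        elif not arg.startswith("@") and arg in host_tags:
            add(arg)                        # a plain host name
        else:
            # a tag specification: the host must carry ALL listed tags
            wanted = set(arg.lstrip("@").split(","))
            for host, tags in host_tags.items():
                if wanted <= set(tags):
                    add(host)
    return selected


# Hypothetical configuration, for illustration only
host_tags = {
    "web01": ["linux", "prod"],
    "web02": ["linux", "test"],
    "win01": ["windows", "prod"],
    "nodeA": ["linux", "prod"],
    "nodeB": ["linux", "prod"],
}
clusters = {"dbcluster": ["nodeA", "nodeB"]}

by_tags = resolve_targets(["@linux,prod"], host_tags, clusters)
by_cluster = resolve_targets(["dbcluster"], host_tags, clusters)
mixed = resolve_targets(["windows", "web02"], host_tags, clusters)
print(by_tags, by_cluster, mixed)
```

Note how a name without @ is only treated as a tag if it is neither a cluster nor a host, which mirrors the rule about leaving out the @.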

3. Cache files

When you do not specify hosts to -I, Check_MK scans all hosts for new services. In order to speed up that procedure, Check_MK does not retrieve the data from hosts that have already been checked at least once. Each time a check runs, a cache file is kept in /var/lib/check_mk/cache. Inventory information is drawn from there if available.

You can force Check_MK to retrieve fresh data with the option --no-cache:

root@linux# cmk --no-cache -I

This should not be necessary in normal situations. The only drawback of the cache is that a change on a host can take up to a minute (the normal Nagios check interval) to be reflected by the inventory. If the change happened more than one check interval ago, it will already be in your cached data.

The cache is not used when you specify one or more hosts. In that case the inventory always retrieves fresh data.
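
The caching rules boil down to a small decision, sketched here in Python. The function name use_cached_data is made up, and the assumption that the cache directory holds one file named after each host is ours, not something stated above.

```python
import tempfile
from pathlib import Path


def use_cached_data(hosts_given, no_cache, cache_dir, hostname):
    """Return True if the inventory may read cached data for hostname."""
    if hosts_given or no_cache:
        return False  # explicit hosts or --no-cache: always fetch fresh data
    # Assumption: one cache file per host, named after the host
    return (Path(cache_dir) / hostname).is_file()


with tempfile.TemporaryDirectory() as cache_dir:
    # srv01 has been checked before, srv02 has not
    (Path(cache_dir) / "srv01").write_text("dummy agent output\n")
    results = {
        "cmk -I": use_cached_data(False, False, cache_dir, "srv01"),
        "cmk --no-cache -I": use_cached_data(False, True, cache_dir, "srv01"),
        "cmk -I srv01": use_cached_data(True, False, cache_dir, "srv01"),
        "cmk -I, new host": use_cached_data(False, False, cache_dir, "srv02"),
    }

for call, cached in results.items():
    print(call, "->", "cache" if cached else "fresh data")
```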

4. SNMP checks

SNMP based checks can also be inventorized, as the example in section 2 has shown. They differ little from the checks based on the Check_MK agent. The good news: Check_MK does not have to retrieve the complete SNMP data in order to find interesting OIDs. Each SNMP check provides a specific scan function that retrieves just one or two single OIDs in order to know whether the check makes sense on that particular device. Since most checks use the same OIDs for scanning, only a few OIDs need to be fetched in order to know which of the more than 100 shipped SNMP checks need to be inventorized.

The net result: running cmk -I on an SNMP device will automatically find all services that Check_MK supports. Please note that SNMP hosts need to be tagged with snmp. Consult the SNMP page for more details.
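
To illustrate why so few OIDs have to be fetched, here is a minimal Python sketch of the scan function idea. Everything in it is an assumption made for the example: the scan conditions are invented, the device is faked with a dictionary, and the real scan functions shipped with Check_MK look different. Only the two OIDs (sysDescr.0 and sysObjectID.0) are standard.

```python
SYS_DESCR = ".1.3.6.1.2.1.1.1.0"      # sysDescr.0
SYS_OBJECT_ID = ".1.3.6.1.2.1.1.2.0"  # sysObjectID.0


def make_cached_getter(fetch_oid):
    """Wrap fetch_oid so that every OID is fetched from the device only once."""
    cache = {}

    def get(oid):
        if oid not in cache:
            cache[oid] = fetch_oid(oid)
        return cache[oid]

    return get


# Invented scan functions: each one looks at a single OID
scan_functions = {
    "cisco_fan": lambda oid: "cisco" in oid(SYS_DESCR).lower(),
    "apc_symmetra": lambda oid: oid(SYS_OBJECT_ID).startswith(".1.3.6.1.4.1.318."),
    "blade_health": lambda oid: "bladecenter" in oid(SYS_DESCR).lower(),
}

# A faked device instead of a real SNMP request
FAKE_DEVICE = {SYS_DESCR: "Cisco IOS Software", SYS_OBJECT_ID: ".1.3.6.1.4.1.9.1.1"}
fetched = []


def fake_fetch(oid):
    fetched.append(oid)  # record every "network" access
    return FAKE_DEVICE.get(oid, "")


get = make_cached_getter(fake_fetch)
applicable = [name for name, scan in scan_functions.items() if scan(get)]
print(applicable, "found with", len(fetched), "OIDs fetched")
```

Three scan functions ran, but only two OIDs were requested from the device, because two of the functions share the same OID.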

5. What happens with the items found?

All new items Check_MK finds are saved in configuration files that are similar to main.mk, but not quite compatible with it. They are created in a separate directory which defaults to /var/lib/check_mk/autochecks. During setup.sh you were asked for a "working directory of check_mk"; autochecks is created as a subdirectory of that.

Each time you call check_mk it reads in all files in that directory and appends the entries to your checks variable. Let's look at such a file:

/var/lib/check_mk/autochecks/df-2009-05-20_19.21.44.mk
# /var/lib/check_mk/autochecks/df-2009-05-20_19.21.44.mk
[
  # === zwin17 ===
  ("zwin17", "df", 'C:/', filesystem_default_levels), # 36

  # === zsrv01 ===
  ("zsrv01", "df", '/', filesystem_default_levels), # 24
  ("zsrv01", "df", '/home', filesystem_default_levels), # 17
]
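
Since such a file is nothing but a Python list, reading it can be sketched as follows. This is a simplified illustration, not the code Check_MK really uses: load_autochecks is a hypothetical name, the file contents are passed as strings instead of being read from disk, and the value given to filesystem_default_levels is an arbitrary placeholder.

```python
def load_autochecks(file_contents, namespace):
    """Evaluate autochecks files and return all their entries as one list."""
    entries = []
    for text in file_contents:
        # Each file is a single list expression. Names such as
        # filesystem_default_levels are looked up in namespace.
        # Only do this with files you trust: eval() executes code.
        entries += eval(text, {"__builtins__": {}}, dict(namespace))
    return entries


AUTOCHECKS_FILE = """[
  # === zwin17 ===
  ("zwin17", "df", 'C:/', filesystem_default_levels), # 36

  # === zsrv01 ===
  ("zsrv01", "df", '/', filesystem_default_levels), # 24
  ("zsrv01", "df", '/home', filesystem_default_levels), # 17
]"""

# Placeholder value, chosen for this example only
namespace = {"filesystem_default_levels": (80, 90)}

checks = []  # whatever main.mk has defined manually
checks += load_autochecks([AUTOCHECKS_FILE], namespace)
for entry in checks:
    print(entry)
```

The sketch also shows why these files are not quite compatible with main.mk: they contain one bare list rather than variable assignments.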

6. Changing and removing inventorized checks

Check_MK's inventory usually does not remove checks but only adds new ones. Why? If, for example, a previously found filesystem is now missing, either that is a critical problem or the filesystem has been removed by the host's administrator. Check_MK cannot reliably tell which of the two is the case and therefore leaves the check in place.

There are two ways to remove checks found by previous inventories:

6.1. Edit or delete autochecks files

Check_MK never overwrites files in autochecks. It is completely safe to edit them and remove checks that are no longer needed. You can either delete whole files or open them with an editor and delete single entries:

/var/lib/check_mk/autochecks/df-2009-05-20_19.21.44.mk
# /var/lib/check_mk/autochecks/df-2009-05-20_19.21.44.mk
[
  # === zwin17 ===
  ("zwin17", "df", 'C:/', filesystem_default_levels), # 36

  # === zsrv01 ===
  ("zsrv01", "df", '/', (98, 99) ), # DELETE THIS LINE
  ("zsrv01", "df", '/home', filesystem_default_levels), # 17
]

6.2. Reinventorize with -II

As of version 1.1.7i1 Check_MK supports the option -II. It does exactly the same as -I but removes all existing checks before doing the inventory. Only checks of the hosts and check types that are being inventorized are affected. Example 1:

root@linux# cmk --checks=df -II xyzsrv01

This first removes all checks of type df on host xyzsrv01 and then does inventory.

Example 2:

root@linux# cmk -II xyzsrv01

This removes all agent-based checks of host xyzsrv01 before doing inventory.

You can even run check_mk -II without any host names and thus reinventorize all agent-based checks on all hosts, removing all checks that are currently not found on the target hosts.
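
The difference between -I and -II can be expressed as a small list operation. The following Python sketch only models the behaviour described above; reinventorize is a hypothetical function, and check entries are simplified to (host, check type, item, parameters) tuples.

```python
def reinventorize(existing, found, hosts=None, check_types=None):
    """Model of -II: drop the affected checks, then add what was found.

    hosts / check_types of None mean "all hosts" / "all check types".
    """
    def affected(entry):
        host, check_type = entry[0], entry[1]
        return ((hosts is None or host in hosts) and
                (check_types is None or check_type in check_types))

    kept = [e for e in existing if not affected(e)]
    return kept + [e for e in found if affected(e)]


existing = [
    ("xyzsrv01", "df", "/", None),
    ("xyzsrv01", "df", "/old", None),        # filesystem no longer exists
    ("xyzsrv01", "cpu.loads", None, None),
    ("othersrv", "df", "/", None),
]
found = [
    ("xyzsrv01", "df", "/", None),
    ("xyzsrv01", "df", "/new", None),
]

result = reinventorize(existing, found, hosts={"xyzsrv01"}, check_types={"df"})
for entry in result:
    print(entry)
```

The df check for /old disappears, while the other check type on xyzsrv01 and the checks of othersrv stay untouched.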

7. Cleaning up autochecks

The fact that Check_MK creates new files for each inventory is handy if you want to revert or modify the results of recent inventories. Over time, however, quite a lot of files accumulate in the autochecks directory.

As of version 1.1.7i1, Check_MK offers the new option -u or --cleanup-autochecks, which reads in all files in /var/lib/check_mk/autochecks, creates one new file per host and then removes the files that are no longer needed. That greatly reduces the number of files in the directory and also makes it easy to remove all data of a host. This option can either be used standalone...

root@linux# cmk -u

... or as a modifier to -I:

root@linux# cmk -uI host123

If called that way, the cleanup is done right after the inventory. If you like that feature, you can make Check_MK always clean up immediately after each inventory by setting in your main.mk:

main.mk
always_cleanup_autochecks = True

Note: As of version 1.2.3i1 the default for always_cleanup_autochecks is True.
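
Conceptually, the cleanup regroups all entries by host. A rough Python sketch of that idea follows. It is an assumption-laden model, not the real implementation: group_by_host is a made-up name, and the sketch works on lists in memory instead of reading and writing files.

```python
from collections import defaultdict


def group_by_host(entries):
    """Regroup autochecks entries so that each host gets one list of its own."""
    per_host = defaultdict(list)
    for entry in entries:
        per_host[entry[0]].append(entry)  # entry[0] is the host name
    return dict(per_host)


# Entries as they might be spread over several inventory files
file_1 = [("zwin17", "df", "C:/", None), ("zsrv01", "df", "/", None)]
file_2 = [("zsrv01", "df", "/home", None)]

grouped = group_by_host(file_1 + file_2)
for host, entries in grouped.items():
    print(host, "->", len(entries), "entries")
```

With one list per host, removing all data of a host means deleting a single file.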

8. Updating your Nagios configuration

Please do not forget to update your monitoring configuration and restart the monitoring core with:

root@linux# cmk -R

... after every inventory (that found something new) or manual change in the autochecks. That will not only update your Nagios configuration files but also recompile all host checks.

9. Inventorized versus manual checks

Even when checks can be found via inventory, you may still configure them manually. There can be various reasons for that. One is that you want to define levels other than those the inventory sets.

Whenever a check is defined manually in main.mk the inventory will never find that item again.
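
In other words, an item counts as already known if either a manual check or an earlier inventory covers it. The following Python sketch models that rule under simplifying assumptions: new_inventory_items is a hypothetical function, and manual checks are written as plain (host, check type, item, parameters) tuples for a single host.

```python
def new_inventory_items(found, manual_checks, autochecks):
    """Return only those found entries that are not configured yet."""
    known = {(host, check_type, item)
             for host, check_type, item, *_ in manual_checks + autochecks}
    # The parameters are ignored on purpose: a manual check with
    # different levels still hides the item from the inventory.
    return [e for e in found if (e[0], e[1], e[2]) not in known]


manual_checks = [("zsrv01", "df", "/", (98, 99))]      # defined in main.mk
autochecks = [("zsrv01", "df", "/home", (80, 90))]     # from an earlier inventory
found = [
    ("zsrv01", "df", "/", (80, 90)),
    ("zsrv01", "df", "/home", (80, 90)),
    ("zsrv01", "df", "/var", (80, 90)),
]

new_items = new_inventory_items(found, manual_checks, autochecks)
print(new_items)
```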

10. Excluding items from the inventory

Sometimes the inventory finds things that you do not want to check. Removing those items from the files in autochecks is not ideal: at the next inventory they will reappear.

It is better to exclude them explicitly. Check_MK provides three configuration variables for doing that:

Config variable      Meaning
-------------------------------------------------------------------------
ignored_checktypes   Simple list of checktypes to exclude from inventory
ignored_services     Host specific configuration list of service names to exclude
ignored_checks       Host specific configuration list of checktypes to exclude (NEW in 1.1.9i1)

In ignored_checktypes you can switch off inventory for certain check types completely and globally. Let's assume that you do not want to monitor network interface throughput and link settings at all. Simply list the corresponding check types (see check_mk -L) in this list:

main.mk
ignored_checktypes = [ "netctr.combined", "netif.params" ]

If you want to control inventory more specifically, you need ignored_services. This is a configuration list with the following values in each entry:

  1. Optional: List of host tags
  2. List of hosts
  3. List of service patterns

The following example will exclude the Eventlog Security from the two hosts win01 and win02:

main.mk
ignored_services = [
  ( [ "win01", "win02" ], [ "LOG Security" ] )
]

Note that the entries in the list of services are interpreted as regular expressions matching the beginning of the service description as displayed in Nagios. The following example ignores not just one logfile but all of them, i.e. all services beginning with LOG, as well as the filesystem services of the drive C: (all services beginning with fs_C:):

main.mk
ignored_services = [
  ( [ "win01", "win02" ], [ "LOG", "fs_C:" ] )
]

If you are unsure about the correct spelling of a service you can call check_mk -D to dump all services.

If you have tagged all your Windows hosts with win, the following configuration snippet will do the same for all Windows hosts:

main.mk
ignored_services = [
  ( [ "win" ], ALL_HOSTS, [ "LOG", "fs_C:" ] )
]
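
How such entries are evaluated can be sketched in Python. Please treat this as a rough model only: service_ignored is a hypothetical function, and ALL_HOSTS is replaced by a simple stand-in object, which is not how Check_MK defines it. The important detail is the use of re.match, which anchors each pattern at the beginning of the service description.

```python
import re

ALL_HOSTS = object()  # stand-in for Check_MK's ALL_HOSTS, for this sketch only


def service_ignored(hostname, host_tags, description, ignored_services):
    """Return True if one entry of ignored_services matches the service."""
    for entry in ignored_services:
        if len(entry) == 3:
            tags, hosts, patterns = entry   # with optional list of host tags
        else:
            tags = []
            hosts, patterns = entry
        if hosts is not ALL_HOSTS and hostname not in hosts:
            continue
        if not set(tags) <= set(host_tags):
            continue
        # re.match only matches at the beginning of the description
        if any(re.match(pattern, description) for pattern in patterns):
            return True
    return False


by_host = [(["win01", "win02"], ["LOG", "fs_C:"])]
by_tag = [(["win"], ALL_HOSTS, ["LOG", "fs_C:"])]

print(service_ignored("win01", ["win"], "LOG Security", by_host))
print(service_ignored("win01", ["win"], "fs_D:/", by_host))
print(service_ignored("win99", ["win"], "fs_C:/", by_tag))
print(service_ignored("lnx01", ["linux"], "LOG /var/log/messages", by_tag))
```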

NEW in 1.1.9i1: Using the option ignored_checks you can exclude specific checktypes for selected hosts. This option behaves like ignored_checktypes, with the advantage that you can configure different exclusions for different hosts.

To disable all hr_* checks for all your linux hosts you can use the following configuration:

ignored_checks = [
  ( [ "hr_cpu", "hr_mem", "hr_fs" ], [ "linux" ], ALL_HOSTS)
]

This is useful when you monitor servers (for example your Windows servers) using the Check_MK agent and SNMP at the same time. That setup can result in duplicate services, e.g. for the filesystem, memory and CPU checks. With a configuration like the one above you can prevent these duplicate service names by disabling the corresponding SNMP checks.

You can also use this option very selectively. This line disables the df check on the host win01:

ignored_checks = [
  ( "df", [ "win01" ])
]

Please note that the ignored_... variables only affect future inventories. They have no effect on the checking or on previously inventorized services.