Configuring which check should be done on which host is tedious work in Nagios. Keeping your configuration up to date is a further problem: your colleagues introduce new filesystems, new network interfaces and new database instances without always informing you. How can you be sure that every important item is really being monitored?
Check_MK helps you not only to scan new hosts for items to check but also to keep track of your existing hosts. It can do so because of the special nature of its agents: They always send all interesting data about the host regardless of which items are checked with Nagios.
All of Check_MK's check plugins support automatic detection of services, i.e. inventory. A few of them need a bit of configuration (for example checks for processes and services), but in most cases everything happens automatically. If you are curious which checks are shipped with Check_MK, use the option -L (the list is abbreviated here):
root@linux# cmk -L
Available check types:

                       plugin  perf-  in-
Name                   type    data   vent.  service description
-------------------------------------------------------------------------
3ware_disks            tcp     no     yes    RAID 3ware disk %s
3ware_info             tcp     no     yes    RAID 3ware controller %s
3ware_units            tcp     no     yes    RAID 3ware unit %s
ad_replication         tcp     no     yes    AD Replication %s
aironet_clients        snmp    yes    yes    Average client signal %s
aironet_errors         snmp    yes    yes    MAC CRC errors radio %s
apc_symmetra           snmp    yes    yes    APC Symmetra status
apc_symmetra_ext_temp  snmp    yes    yes    APC External Temp %s
apc_symmetra_power     snmp    yes    yes    Power phase %s
apc_symmetra_temp      snmp    yes    yes    %s
blade_bays             snmp    no     yes    BAY %s
blade_blowers          snmp    yes    yes    Blower %s
blade_health           snmp    no     yes    Summary health state
blade_mediatray        snmp    no     yes    Media tray
blade_misc             snmp    yes    yes    SENSOR %s
blade_powerfan         snmp    yes    yes    Power Module Cooling Device %s
blade_powermod         snmp    no     yes    Power Module %s
bluecoat_diskcpu       snmp    yes    yes    %s
bluecoat_sensors       snmp    yes    yes    %s
cisco_fan              snmp    no     yes    FAN %s
2. Performing an inventory
Inventory is not done automatically (for good reasons). You perform it by calling cmk with the option -I and the list of hosts to inventorize (i.e. to scan for new checks on):
root@linux# cmk -I somehost otherhost
It is also allowed to leave out the host names - Check_MK will then inventorize all hosts (you'll probably do this only in small installations):
root@linux# cmk -I
When you want to restrict the inventory to one or several check types, you need the option --checks= before the option -I. Separate several check types with commas. The following call inventorizes the checks snmp_info and df_netapp:
root@linux# cmk --checks=snmp_info,df_netapp -I filer01 filer02
2.1. More flexible host specification
As of version 1.1.13i2 it is also allowed to specify one or more host tags by prefixing them with an @:
root@linux# cmk -I @linux @windows
The call above will inventorize all hosts tagged linux and all hosts tagged windows. When you need a combination of host tags in order to make the inventory more specific, join the tags with commas. The following example will inventorize all hosts that have both the tag prod and the tag linux:
root@linux# cmk -I @linux,prod
As long as none of your hosts happens to have the same name as a tag, it's also allowed to leave out the @:
root@linux# cmk -I linux,prod
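The selection rules described above can be illustrated with a small Python sketch. It is not Check_MK's own code: the host table and the function name select_hosts are made up for this example, and only the behaviour stated in the text is modelled (an argument naming a host selects that host, comma-joined tags must all be present).

```python
# Hypothetical host table: host name -> set of host tags (illustration only).
HOST_TAGS = {
    "web01": {"linux", "prod"},
    "web02": {"linux", "test"},
    "win01": {"windows", "prod"},
}

def select_hosts(spec, host_tags=HOST_TAGS):
    """Return the sorted host names selected by one command line argument."""
    if not spec.startswith("@") and spec in host_tags:
        # A plain argument that names a known host selects exactly that host.
        return [spec]
    wanted = set(spec.lstrip("@").split(","))
    # A host qualifies only if it carries every requested tag.
    return sorted(h for h, tags in host_tags.items() if wanted <= tags)

print(select_hosts("@linux"))       # all hosts tagged linux
print(select_hosts("@linux,prod"))  # hosts tagged linux AND prod
print(select_hosts("linux,prod"))   # same, without the @
print(select_hosts("web02"))        # a plain host name
```

The host-name test comes first, which mirrors the caveat above: leaving out the @ is only unambiguous as long as no host is named like a tag.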
When you have defined clusters (configuration variable clusters), please note that inventory is always done on the physical nodes. As of version 1.1.13i2, however, it is possible to specify the cluster when doing inventory. Check_MK will automatically replace it with the list of nodes of the cluster.
3. Cache files
When you do not specify hosts to -I, Check_MK scans all hosts for new services. In order to speed up that procedure, Check_MK does not retrieve the data from hosts that have already been checked at least once. Each time a check runs, a cache file is kept in /var/lib/check_mk/cache. Inventory information is drawn from there if available.
You can force Check_MK to retrieve fresh data with the option --no-cache:
root@linux# cmk --no-cache -I
This should not be necessary in normal situations. It's just that a change on a host can take up to a minute (the normal Nagios check interval) to be reflected in the inventory. If the change happened more than one check interval ago, it will already be in your cached data.
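The caching behaviour can be pictured with the following Python sketch. It is only a model under simplifying assumptions: the function get_host_data, the fake fetch function and the placeholder agent output are invented for the example, and a temporary directory stands in for /var/lib/check_mk/cache.

```python
import os
import tempfile

def get_host_data(host, cache_dir, fetch, use_cache=True):
    """Return agent output for a host, preferring its cache file if allowed."""
    path = os.path.join(cache_dir, host)
    if use_cache and os.path.exists(path):
        with open(path) as f:
            return f.read()          # no network access needed
    data = fetch(host)               # contact the host
    with open(path, "w") as f:
        f.write(data)                # keep a cache file for later runs
    return data

calls = []

def fake_fetch(host):
    """Stand-in for the real agent query; records each call."""
    calls.append(host)
    return "<<<df>>>\n"              # placeholder agent output

cache_dir = tempfile.mkdtemp()       # stands in for the real cache directory
get_host_data("zsrv01", cache_dir, fake_fetch)                   # no cache yet
get_host_data("zsrv01", cache_dir, fake_fetch)                   # from cache
get_host_data("zsrv01", cache_dir, fake_fetch, use_cache=False)  # like --no-cache
print(len(calls))  # number of times the host was actually contacted
```

Only the first and the third call contact the host; the second one is answered from the cache file.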
4. SNMP checks
SNMP based checks can also be inventorized, as the example with snmp_info and df_netapp above has shown. There is not much difference from the checks based on the Check_MK agent. The good news: Check_MK does not have to retrieve the complete SNMP data in order to find interesting OIDs. Each SNMP check provides a specific scan function that retrieves just one or two single OIDs in order to know whether the check makes sense on that particular device. Since most checks use the same OIDs for scanning, only a few OIDs need to be fetched in order to know which of the more than 100 shipped SNMP checks need to be inventorized.
The net result: running cmk -I on an SNMP device automatically finds all services that are supported by Check_MK. Please note that SNMP hosts need to be tagged with snmp. Consult the SNMP page for more details.
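The idea of scan functions sharing a few OIDs can be sketched in Python as follows. This is an illustration, not the shipped implementation: the check names are taken from the list above, but the scan conditions, the simulated device and the helper get_oid are assumptions made for this example.

```python
# Simulated SNMP device: OID -> value (made-up sample values).
SYS_DESCR = ".1.3.6.1.2.1.1.1.0"
SYS_OBJECT_ID = ".1.3.6.1.2.1.1.2.0"
DEVICE = {
    SYS_DESCR: "Cisco IOS Software",
    SYS_OBJECT_ID: ".1.3.6.1.4.1.9.1.516",
}

fetched = []   # every OID that really had to be requested
cache = {}     # OID -> value, shared by all scan functions

def get_oid(oid):
    """Fetch a single OID from the device, at most once per inventory."""
    if oid not in cache:
        fetched.append(oid)
        cache[oid] = DEVICE.get(oid)
    return cache[oid]

# Illustrative scan functions: each looks at one OID only.
SCAN_FUNCTIONS = {
    "cisco_fan": lambda oid: "cisco" in (oid(SYS_DESCR) or "").lower(),
    "apc_symmetra": lambda oid: (oid(SYS_OBJECT_ID) or "").startswith(".1.3.6.1.4.1.318."),
    "blade_health": lambda oid: "BladeCenter" in (oid(SYS_DESCR) or ""),
}

candidates = sorted(name for name, scan in SCAN_FUNCTIONS.items() if scan(get_oid))
print(candidates)      # check types worth inventorizing on this device
print(len(fetched))    # OIDs fetched, although three scan functions ran
```

Three scan functions run, but only two OIDs are requested, because the scan functions share the same cached lookups.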
5. What happens with the items found?
All new items Check_MK finds are saved in configuration files whose format is similar to, but not quite compatible with, main.mk. They are created in a separate directory which defaults to /var/lib/check_mk/autochecks. When running setup.sh you were asked for a "working directory of check_mk"; autochecks is created as a subdirectory of that.
Each time you call check_mk it reads in all files in that directory and appends the entries to your checks variable. Let's look at such a file:
# /var/lib/check_mk/autochecks/df-2009-05-20_19.21.44.mk
[
  # === zwin17 ===
  ("zwin17", "df", 'C:/', filesystem_default_levels), # 36
  # === zsrv01 ===
  ("zsrv01", "df", '/', filesystem_default_levels), # 24
  ("zsrv01", "df", '/home', filesystem_default_levels), # 17
]
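How such files could be read and appended to a list of checks is sketched below in Python. This is a rough model, not Check_MK's actual loader: the function load_autochecks is invented for the example, a temporary directory replaces the real autochecks directory, and the value (80, 90) for filesystem_default_levels is a placeholder assumption.

```python
import glob
import os
import tempfile

SAMPLE = """[
  # === zwin17 ===
  ("zwin17", "df", 'C:/', filesystem_default_levels),
  # === zsrv01 ===
  ("zsrv01", "df", '/', filesystem_default_levels),
  ("zsrv01", "df", '/home', filesystem_default_levels),
]
"""

def load_autochecks(directory, known_names):
    """Evaluate every *.mk file in the directory and collect its entries."""
    checks = []
    for path in sorted(glob.glob(os.path.join(directory, "*.mk"))):
        with open(path) as f:
            # Each file holds one Python list; names such as
            # filesystem_default_levels must be known when evaluating it.
            checks += eval(f.read(), dict(known_names))
    return checks

autochecks_dir = tempfile.mkdtemp()  # stands in for the autochecks directory
with open(os.path.join(autochecks_dir, "df-2009-05-20_19.21.44.mk"), "w") as f:
    f.write(SAMPLE)

# (80, 90) is a placeholder value chosen for this sketch.
checks = load_autochecks(autochecks_dir, {"filesystem_default_levels": (80, 90)})
for entry in checks:
    print(entry)
```

The sketch also suggests why the files are not quite compatible with main.mk: each one contains a bare list rather than variable assignments.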
6. Changing and removing inventorized checks
Check_MK's inventory usually does not remove checks but only adds new ones. Why? If, for example, a previously found filesystem is now missing, that is either a critical problem or the filesystem has been removed by the host's administrator. Check_MK cannot safely know which of the two is the case, so it leaves the check in place.
There are two ways to remove checks found by previous inventories:
6.1. Edit or delete autochecks files
Check_MK never overwrites files in autochecks. It is completely safe to edit them and remove checks that are no longer needed. You can either delete whole files or open them with an editor and delete single entries:
# /var/lib/check_mk/autochecks/df-2009-05-20_19.21.44.mk
[
  # === zwin17 ===
  ("zwin17", "df", 'C:/', filesystem_default_levels), # 36
  # === zsrv01 ===
  ("zsrv01", "df", '/', (98, 99) ), # DELETE THIS LINE
  ("zsrv01", "df", '/home', filesystem_default_levels), # 17
]
6.2. Reinventorize with -II
As of version 1.1.7i1 Check_MK supports the option -II. It does exactly the same as -I but removes all existing checks before doing the inventory. Only the checks that are being inventorized are affected. Example 1:
root@linux# cmk -II df xyzsrv01
This first removes all checks of type df on host xyzsrv01 and then does the inventory. Example 2:
root@linux# cmk -II xyzsrv01
This removes all agent based checks of host xyzsrv01 before doing the inventory.
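The effect of -II on the list of checks can be modelled with a few lines of Python. The function reinventorize and the sample entries are assumptions made for this sketch; it only mirrors the rule stated above that existing checks are dropped solely where inventory is being done.

```python
def reinventorize(existing, found, hosts, checktypes=None):
    """Drop the existing checks being inventorized, then add what was found.

    existing, found: lists of (host, checktype, item) tuples
    hosts:           hosts named on the command line
    checktypes:      restrict to these check types, or None for all of them
    """
    def affected(entry):
        host, checktype = entry[0], entry[1]
        return host in hosts and (checktypes is None or checktype in checktypes)

    kept = [e for e in existing if not affected(e)]
    return kept + found

existing = [
    ("xyzsrv01", "df", "/"),
    ("xyzsrv01", "df", "/old"),         # filesystem that no longer exists
    ("xyzsrv01", "cpu.loads", None),
    ("otherhost", "df", "/"),
]
found = [("xyzsrv01", "df", "/"), ("xyzsrv01", "df", "/data")]

# Corresponds to Example 1: only df checks of xyzsrv01 are replaced.
result = reinventorize(existing, found, {"xyzsrv01"}, {"df"})
for entry in result:
    print(entry)
```

The stale entry for /old disappears, while the other check type on xyzsrv01 and the checks of other hosts stay untouched.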
7. Cleaning up autochecks
The fact that Check_MK creates new files for each inventory is handy if you want to revert or modify the results of recent inventories. Over time, however, quite a lot of files accumulate in the autochecks directory.
As of version 1.1.7i1, Check_MK offers the new option -u or --cleanup-autochecks, which reads in all files in /var/lib/check_mk/autochecks, creates one new file per host and removes the superfluous files afterwards. That greatly reduces the number of files in the directory and also makes the removal of all data of a host an easy task. This option can either be used standalone...
root@linux# cmk -u
... or as a modifier to -I:
root@linux# cmk -uI host123
If called that way, the cleanup is done right after the inventory. If you like that feature, you can make Check_MK always clean up immediately after each inventory by setting the following in your main.mk:
always_cleanup_autochecks = True
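What the cleanup amounts to can be sketched in Python as a regrouping of entries by host. This is a simplified model: it works on in-memory data instead of files, and the function name cleanup_autochecks, the second file name and the per-host naming scheme are assumptions made for the example.

```python
from collections import defaultdict

def cleanup_autochecks(files):
    """Regroup {file name: [entries]} into one list of entries per host."""
    per_host = defaultdict(list)
    for name in sorted(files):
        for entry in files[name]:
            per_host[entry[0]].append(entry)   # entry[0] is the host name
    # Hypothetical naming scheme for this sketch: one file per host.
    return {host + ".mk": entries for host, entries in per_host.items()}

files = {
    "df-2009-05-20_19.21.44.mk": [("zwin17", "df", "C:/"), ("zsrv01", "df", "/")],
    "df-2009-06-01_08.00.00.mk": [("zsrv01", "df", "/home")],
}

merged = cleanup_autochecks(files)
for name in sorted(merged):
    print(name, merged[name])
```

With one file per host, removing all inventorized data of a host comes down to deleting a single file.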
8. Updating your Nagios configuration
Please do not forget to update your monitoring configuration and restart the monitoring core with:
root@linux# cmk -R
9. Inventorized versus manual checks
Even when checks can be found via inventory, you are allowed to configure them manually. There can be various reasons for that. One is that you want to define levels other than those the inventory sets.
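A manually configured check could look like the following main.mk fragment. It is a sketch that assumes the same entry format as in the autochecks files shown earlier; the host name and the levels (98, 99) are taken from the example in section 6.1 and carry no special meaning.

```python
# main.mk fragment (sketch): a manually configured df check with explicit
# levels instead of filesystem_default_levels.
checks = [
    ("zsrv01", "df", "/", (98, 99)),
]
```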
10. Excluding items from the inventory
Sometimes the inventory finds things that you do not want to check. Removing those items from the files in autochecks is not a good idea: at the next inventory they will reappear.
It is better to exclude them explicitly. Check_MK provides three configuration variables for doing that:
|ignored_checktypes|Simple list of checktypes to exclude from inventory|
|ignored_services|Host specific configuration list of service names to exclude|
|ignored_checks|Host specific configuration list of checktypes to exclude (new in 1.1.9i1)|
In ignored_checktypes you can switch off inventory for certain check types completely and globally. Let's assume that you do not want to monitor network interface throughput and link settings at all. Simply put the corresponding check types (see check_mk -L) into this list:
ignored_checktypes = [ "netctr.combined", "netif.params" ]
If you want to control the inventory more specifically, you need ignored_services. This is a configuration list with the following values in each entry:
- Optional: List of host tags
- List of hosts
- List of service patterns
The following example will exclude the Eventlog Security from the two hosts win01 and win02:
ignored_services = [ ( [ "win01", "win02" ], [ "LOG Security" ] ) ]
Note that the entries in the list of services are interpreted as regular expressions matching the beginning of the service description as displayed in Nagios. The following example will ignore not just one logfile but all of them, i.e. all services beginning with LOG, as well as the filesystem services of drive C:
ignored_services = [ ( [ "win01", "win02" ], [ "LOG", "fs_C:" ] ) ]
If you are unsure about the correct spelling of a service you can call check_mk -D to dump all services.
If you have tagged all your Windows hosts with win, the following configuration snippet will do the same for all Windows hosts:
ignored_services = [ ( [ "win" ], ALL_HOSTS, [ "LOG", "fs_C:" ] ) ]
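The prefix matching of service patterns can be tried out with a short Python sketch. It is not Check_MK's own matching code: the function service_ignored is invented for the example and, to keep it short, it handles only the two-element form of an entry (list of hosts, list of patterns) without host tags.

```python
import re

ignored_services = [
    (["win01", "win02"], ["LOG", "fs_C:"]),
]

def service_ignored(host, description, rules=ignored_services):
    """Return True if a rule for this host matches the service description."""
    for hosts, patterns in rules:
        # re.match anchors at the beginning of the string, so each pattern
        # has to match a prefix of the service description.
        if host in hosts and any(re.match(p, description) for p in patterns):
            return True
    return False

print(service_ignored("win01", "LOG Security"))  # matched by "LOG"
print(service_ignored("win01", "fs_C:/"))        # matched by "fs_C:"
print(service_ignored("win01", "CPU load"))      # no pattern matches
print(service_ignored("win03", "LOG Security"))  # host is not listed
```

Because the patterns are regular expressions, characters such as a dot or a plus sign in a service description have a special meaning in a pattern.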
New in 1.1.9i1: using the option ignored_checks you can exclude specific checktypes for individual hosts. This option behaves like ignored_checktypes, with the advantage that you can configure different settings for different hosts.
To disable all hr_* checks for all your linux hosts you can use the following configuration:
ignored_checks = [ ( [ "hr_cpu", "hr_mem", "hr_fs" ], [ "linux" ], ALL_HOSTS) ]
This is useful when, for some reason, you monitor your Windows servers using the Check_MK agent and SNMP at the same time. That setup could result in duplicate services, e.g. for the filesystem, memory and CPU checks. With the above line you can prevent these duplicate service names by disabling the SNMP based checks.
You can also use this option very selectively. This line disables the df check on the host win01:
ignored_checks = [ ( "df", [ "win01" ]) ]
Please note that the ignored_... variables only affect future inventories. They have no effect on the checking or on previously inventorized services.