Guidelines for writing checks for the official distribution
December 10, 2013
Writing a really good check has many aspects. If you want your check to be part
of the official Check_MK distribution, it must adhere to the following guidelines.
Check file names should be short and unique. They must consist only of lowercase letters, digits and underscores, and must begin with a lowercase letter.
Vendor specific checks must be prefixed with a unique vendor abbreviation of your own choosing. Example: fsc_ for Fujitsu Siemens Computers.
Product specific checks must be prefixed with a product abbreviation, for example steelhead_status for a Riverbed Steelhead appliance.
SNMP based checks: if the check makes use of a standardized MIB which is or might be implemented by more than one vendor, then the check should not be named after the vendor but after the MIB. The hr_* checks are an example.
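The character rules for file names can be expressed as a single regular expression. The following is only a minimal sketch: the helper name is_valid_check_name is hypothetical and not part of Check_MK, and it checks the character rules only, not the vendor or product prefix.

```python
import re

# A lowercase letter first, then any number of lowercase letters,
# digits or underscores. \Z anchors at the very end of the string.
_CHECK_NAME_RE = re.compile(r"[a-z][a-z0-9_]*\Z")


def is_valid_check_name(name):
    """Return True if name satisfies the character rules for check file names."""
    return _CHECK_NAME_RE.match(name) is not None
```

With this helper, names such as steelhead_status or hr_cpu are accepted, while names that start with a digit, an underscore or an uppercase letter are rejected.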
The service descriptions of different check types that essentially
do the same thing must be identical (e.g. if/if64/ifoperstatus). Reason:
this makes rules in main.mk simpler for the user!
All checks follow the same order of implementation:
Add an author
If the check is contributed by a third party (i.e. not by the developers of Check_MK), you must
add your name and your email address as a comment into the check, right after the header.
Avoid long lines. Ideally, your lines do not exceed 100 characters.
Use four spaces for indenting your code. Do not use tab characters.
If you really can't live without tabs, set the tab width to 8 spaces.
For checks that are part of the official Check_MK project, the file header with the
copyright information must be present. It is created automatically
if you call 'make headers' in the main source directory.
Including example output of the agent is very helpful for understanding how the check parser works.
TCP-Agent based checks must include an output example of the agent. If the agent output can have different formats or output styles, include an example for each style the check supports (e.g. the output of multipath -l changed its layout between SLES 10 and SLES 11).
For SNMP based checks, include examples if the output is
remarkable in some respect.
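As an illustration of documenting agent output inside the check file, here is a hedged sketch. The section name acme_temp, both output formats and the function name are invented for this example. It assumes that the check function receives the agent section as a list of lines, each already split into words.

```python
# Example output from agent, old format (hypothetical):
# <<<acme_temp>>>
# sensor1 42
#
# Example output from agent, new format (hypothetical):
# <<<acme_temp>>>
# sensor1 temp=42


def check_acme_temp(item, _no_params, info):
    for line in info:
        if line[0] != item:
            continue
        value = line[1]
        # The new format prefixes the reading with "temp=".
        if value.startswith("temp="):
            value = value[len("temp="):]
        return (0, "OK - temperature is %d C" % int(value))
    return (3, "UNKNOWN - sensor not found in agent output")
```

With both formats shown at the top of the file, a reader can see at a glance why the check strips the "temp=" prefix.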
Configuration variables for main.mk should be named after the check if they are only used by this check. This does not hold for variables that are used by several checks (e.g. filesystem_default_levels is used by df, hr_fs, df_netapp, ...).
If a check does not use check parameters, then the inventory function must return None as parameter and the check function must name the parameter argument _no_params.
The names of the inventory and check functions must consist of
the prefix inventory_ or check_ followed by the name of the check type,
for example inventory_h3c_lanswitch_cpu for the check h3c_lanswitch_cpu.
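The parameter and naming rules above can be sketched together as follows. The check name acme_fan and its agent output are hypothetical. The check_info dictionary is defined here only so that the example runs on its own, and the four-element tuple is assumed to be the declaration format this document refers to (perfdata flag in third position).

```python
check_info = {}  # stand-in for the dictionary that Check_MK provides


def inventory_acme_fan(info):
    # The check has no parameters, so each item is paired with None.
    return [(line[0], None) for line in info]


def check_acme_fan(item, _no_params, info):
    for line in info:
        if line[0] == item:
            if line[1] == "ok":
                return (0, "OK - fan is running")
            return (2, "CRIT - fan state is %s" % line[1])
    return (3, "UNKNOWN - fan not found in agent output")


# Assumed layout: (check function, service description,
#                  perfdata flag, inventory function)
check_info["acme_fan"] = (check_acme_fan, "Fan %s", 0, inventory_acme_fan)
```

Both function names are derived from the check name, and the unused parameter argument is named _no_params so that readers see immediately that the check is not configurable.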
Default values for check parameters (e.g. switch_cpu_default_levels) must be
chosen in a way that they make sense for everybody, not just for your
If you are unsure, choose levels that are too loose rather than too tight.
This helps to avoid false alarms.
If the same configuration variable is used in multiple checks, all of them
must set a default value and all those values must be identical!
Your check should assume that the agent always produces valid data.
It should not try to handle cases where the agent output is broken.
Such cases are handled by Check_MK via Python exceptions. Catching them
yourself would bypass the debug handler and make the code uglier.
int(s) will throw an exception if s is not a valid number string (or is empty). Check_MK will then catch the exception and set the check result to "UNKNOWN" with a corresponding error message. saveint(s) will assume 0 if s is not valid.
Use saveint() in all places where you know or suspect that some device does not supply valid data, but the check can work with the rest of the data and still produce useful results.
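The difference between the two conversions can be sketched like this. The saveint below is a stand-in that mirrors the behaviour described above, not the code shipped with Check_MK, and parse_strict is a hypothetical name used only for the comparison.

```python
def saveint(s):
    # Stand-in: fall back to 0 when s is not a valid number string.
    try:
        return int(s)
    except (ValueError, TypeError):
        return 0


def parse_strict(s):
    # Let int() raise, so that the framework can catch the exception
    # and report the broken value.
    return int(s)
```

Here saveint("") and saveint("n/a") both yield 0, whereas parse_strict("") raises a ValueError. Use the strict variant by default and the lenient one only for fields that are known to be unreliable on some devices.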
Only set the perfdata flag (the third parameter in the check_info declaration)
to 1 if the check really produces performance data output.
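A check that sets the perfdata flag returns a third element with its result. The sketch below is hypothetical: the check name acme_cpu, its default levels and the single-value agent output are invented, and the perfdata entries are assumed to have the form (name, value, warn, crit, min, max).

```python
acme_cpu_default_levels = (80.0, 90.0)  # hypothetical warn/crit in percent


def check_acme_cpu(item, params, info):
    warn, crit = params
    util = float(info[0][0])
    # One performance value, together with its levels and its range.
    perfdata = [("util", util, warn, crit, 0, 100)]
    if util >= crit:
        return (2, "CRIT - utilization is %.1f%%" % util, perfdata)
    elif util >= warn:
        return (1, "WARN - utilization is %.1f%%" % util, perfdata)
    return (0, "OK - utilization is %.1f%%" % util, perfdata)
```

A check whose result tuples never carry such a list should leave the flag at 0.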
Each check that outputs performance data must have a dedicated PNP
graph definition in pnp-templates. If the check has warning and critical
levels then the graph must display those levels as yellow and red
Each check that outputs performance data must also have an RRA definition
that specifies which of MAX, MIN and AVERAGE are needed to display the
graph in its current (and maybe future) forms. Those are in pnp-rraconf.
Use a symlink here.
Each check that outputs performance data should have a Perf-O-Meter.
For checks that are part of Check_MK this must be done in
web/plugins/perfometer/check_mk.py; for third party checks this should
be done in a separate file in web/plugins/perfometer.
Only use numeric OIDs in your checks. Name based OIDs rely on MIB files
and the check won't work when the MIB files are not in place.
Always have your OIDs start with a root, for example: .1.3.6.1.4.1
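Whether an OID string is purely numeric and starts at the root with a leading dot can be checked mechanically. This is a minimal sketch; the helper name is_numeric_oid is hypothetical and not part of Check_MK.

```python
import re

# A leading dot, then one or more dot-separated groups of digits.
_NUMERIC_OID_RE = re.compile(r"\.\d+(\.\d+)*\Z")


def is_numeric_oid(oid):
    """Return True for OIDs written numerically with a leading dot."""
    return _NUMERIC_OID_RE.match(oid) is not None
```

Name based forms such as SNMPv2-MIB::sysDescr.0 are rejected by this helper, as are numeric OIDs without the leading dot.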
Neither the check function nor the inventory function may use the print command
or otherwise write any data to stdout or stderr, nor communicate with the
outside in any other way. A rare exception to this are checks that need a dedicated
data storage (such as logwatch: it keeps unread log messages in files).
Each check must have a man page. This should be:
Information that must be contained in the check description: