Writing checks for Check_MK
March 04. 2011
Why not use local checks or MRPE?
Using local checks or MRPE for adding your own self-written checks to Check_MK is easy. Even inventory and performance data are supported. So why should you want to write native checks for Check_MK? Well, there can be several reasons:
If one or more of those issues are relevant for you, then you'll find all information needed for writing your own checks in this article and a couple of further articles.
Do I have to learn Python?
Well, to be honest: yes - at least to a certain basic degree. People have suggested changing Check_MK so that checks can be written in other languages as well. I understand this request very well. But from a technical point of view I cannot imagine how such an integration could be done in a clean, simple and performant way. Check_MK's checks are not standalone programs or scripts but are closely integrated into the check mechanism. They need to have access to some of Check_MK's internal functions. And in the end, for each host one Python program is created by combining a base and all checks used by that host into one new program. That feature saves about 75% of the CPU resources when compared to directly calling check_mk for checking.
On the other hand, Python is a language which is cleanly designed, elegant and easy to learn. I'm sure you'll like it once you have some experience with it (even if you dislike its style of indentation).
Within this tutorial I assume that you have some basic knowledge of Python. Looking at the code of some of the existing checks might help, if you are new to Python.
How Check_MK's checks work
Each check consists of at least the following three components:
Two further components are optional but strongly recommended:
If your check outputs performance data, then two further components form a perfect check:
The data source
Everything begins with the data source, i.e. the source of the data the check operates on. Currently there are two different kinds of data sources: agent sections (tcp) and SNMP queries (snmp). An agent section is a part of the output of an agent, for example the output of the Linux command df. An SNMP based data source returns data retrieved by one or several SNMP queries on certain OIDs. Both data sources are presented to the check function as a table (a Python list of lists). We will call that data the "agent data".
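To make the table form a bit more concrete, here is a minimal sketch that turns a few lines of df-like text into a list of lists. The sample text and the helper name to_table are invented for this illustration and are not part of Check_MK; a check receives its agent data already in table form and does not have to do this splitting itself.

```python
# Hypothetical sample of an agent section; real df output will look different.
SAMPLE_SECTION = """\
/dev/sda1 ext3 1008888 223832 733808 24% /
/dev/sdc1 ext3 1032088 284648 695012 30% /lib/modules
"""


def to_table(section_text):
    """Split every non-empty line into words, giving a list of lists."""
    return [line.split() for line in section_text.splitlines() if line.strip()]


agent_data = to_table(SAMPLE_SECTION)
for row in agent_data:
    print(row)
```

Each inner list is one line of the section, and each element is one whitespace-separated word of that line.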
The agent plugin
If you write a TCP based check you need a plugin for the agent. This is usually a small executable script that is put into the plugins directory of the agent. That plugin uses standard operating system methods for retrieving the data of interest.
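As a rough sketch of what such a plugin can look like, the following script prints one agent section: a header line with the section name in triple angle brackets, followed by the raw data lines. The section name my_demo, the helper render_section and the two data lines are made up for this example; a real plugin would print what it gets from the operating system instead.

```python
#!/usr/bin/env python
import sys


def render_section(name, lines):
    """Build the text of one agent section: a header line plus the data lines."""
    header = "<<<%s>>>" % name
    return "\n".join([header] + list(lines)) + "\n"


if __name__ == "__main__":
    # Hypothetical data; a real plugin would query the operating system here.
    sys.stdout.write(render_section("my_demo", ["eth0 up 1000", "eth1 down 0"]))
```

The script deliberately does nothing but output data. It contains no thresholds and makes no decisions.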
It is important to understand the philosophy of Check_MK at that point. The plugin should:
The inventory function
If you want your check to support inventory (which is always a good idea), you have to supply an inventory function. That function examines the agent data and creates a list of all items to be checked on that specific host. An item uniquely identifies a thing to be checked on a host within that type of check. Some examples of items are:
Some checks do not need to distinguish different items. That is because the thing they check exists at most once on a host. An example is the check mem. But Check_MK always requires an item, so those checks simply use None as the item.
Please note that this does not mean that you cannot do an inventory on mem. It's just that the number of items the inventory returns is at most one. In some cases it is even zero: when the agent output does not contain the information needed for the check. This is a very useful feature and enables the Nagios administrator to automatically perform the right checks on the right operating systems.
Your inventory function does not need to worry about whether a certain item was already configured manually or detected by a previous inventory. Check_MK handles this in a general way and makes sure that only newly detected items are added to the list of services.
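The idea behind an inventory function can be sketched as follows. Both functions are simplified illustrations with invented names. They return bare items only, and the exact signature and return format that Check_MK expects from an inventory function are not shown here. The first function assumes df-like agent data with the mount point in the last column.

```python
def inventory_filesystems(agent_data):
    """One item per row: the mount point, assumed to be the last column."""
    return [row[-1] for row in agent_data if row]


def inventory_single(agent_data):
    """For a check like mem: the item None if data is present, else nothing."""
    if agent_data:
        return [None]
    return []


# Hypothetical agent data for demonstration.
df_data = [
    ["/dev/sda1", "ext3", "1008888", "223832", "733808", "24%", "/"],
    ["/dev/sdc1", "ext3", "1032088", "284648", "695012", "30%", "/lib/modules"],
]
print(inventory_filesystems(df_data))
print(inventory_single([]))
```

Note that neither function checks whether an item is already known. As described above, that is left to Check_MK.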
The check function
When an actual check of a host is done, all services for that host will be checked in turn. When it is your check's turn, Check_MK will call your check function for each item that is automatically or manually configured for that check and host. Your function will be provided with the checked item, the (optional) parameters of the check and the agent data.
The check function then:
This is very similar to what standard Nagios plugins do, with the important difference that our check is already provided with data from the agent and does not have to retrieve it by itself.
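A check function along these lines might look like the following sketch. It takes the three arguments named above (item, parameters, agent data) and returns a status together with an explanatory text. The function name, the meaning of the parameters (warning and critical levels for used space in percent), the column layout of the data and the exact return shape are assumptions made for this example.

```python
OK, WARN, CRIT, UNKNOWN = 0, 1, 2, 3  # the usual Nagios status codes


def check_filesystem(item, params, agent_data):
    """Find the row for the item and compare its usage against the levels."""
    warn, crit = params  # assumed: levels for used space in percent
    for row in agent_data:
        if row[-1] == item:  # assumed: mount point in the last column
            used_percent = int(row[-2].rstrip("%"))
            text = "%d%% used" % used_percent
            if used_percent >= crit:
                return (CRIT, text)
            if used_percent >= warn:
                return (WARN, text)
            return (OK, text)
    return (UNKNOWN, "item %s not found in agent data" % item)


# Hypothetical agent data for demonstration.
sample = [["/dev/sda1", "ext3", "1008888", "223832", "733808", "85%", "/"]]
print(check_filesystem("/", (80, 90), sample))
```

The function only looks at the data it is handed. It never calls df or contacts the host on its own.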
The manual page
If you want to pass your check along to others, a manual page for the check is strongly recommended. Check_MK has its own concept and syntax of check manuals. You do not need to learn NROFF syntax or stuff like that. A check manual is a relatively simple text file named after the check and usually installed in
Please read our article on how to write man pages for further information.
The PNP template
If your check outputs performance data (i.e. it returns not only a status and an explanatory text but also values like memused=77364), you should provide a template for PNP4Nagios that nicely lays out the history of that value.
If you use no graphing tool, or a different one, then a PNP template is of course of no use to you. In that case you only need one if you want your check to become an official part of Check_MK.
The same holds for the Perf-O-Meter for Multisite. People like Perf-O-Meters. If you do not use Multisite then Perf-O-Meters are of no use to you. Checks that are to become part of Check_MK must provide Perf-O-Meters (even if some older checks of Check_MK still do not have one).
Let's jump to practice: Preparing the agent
Let's now jump into practice and write our first check of our own. We offer two tutorials: