Check_MK BI - Introduction
April 23, 2011
BI - Make more out of your monitoring data
Check_MK Business Intelligence - or simply BI - is an add-on to Multisite that helps you get more out of your monitoring data. Even medium-sized Nagios systems monitor several thousand individual items (hosts and services). For many tasks the classical GUI and views are sufficient to keep track of those, but other tasks call for aggregated, top-level views. For the colleagues whose task is to repair things, a detailed list of currently unhandled problems is what they need in the first place. Other colleagues, however, have questions like "Which applications are affected by a certain problem?" or "What availability did my application XYZ have?" Such questions call for tools that aggregate the basic details into higher-level information. Check_MK BI comes with two modules addressing that topic:
BI Aggregation
BI Aggregations compute the overall state of applications, hosts or other items of interest from a subset of your basic Nagios hosts and services. Each aggregation defines a tree of dependencies, which is visualized and executed by the BI component in Multisite. Such aggregation trees help answer many questions that arise in day-to-day monitoring.
How to aggregate - best or worst?
Unlike plain Nagios or NagVis - which always use the worst state of a list of items when displaying grouped data - Check_MK BI is much more flexible in this area. Consider the following example: You have several database instances running on HA clusters made of two nodes. You monitor both nodes and also - by making use of Check_MK Clusters - have a virtual cluster host in your monitoring, to which all components that might move from one node to the other are attached.

By defining a tree of dependencies made out of your basic host and service states, you can compute an overall state of your database instance. Where things (hardware and software) are redundant, the aggregated state should use the best state of the underlying items. In other cases the worst state is used. In yet other cases, only the state of one of the underlying items is relevant and the rest should be ignored. A good example is the operating system state of the two physical cluster hosts. At any point in time the database is running on just one of them, so problems with memory consumption or a high CPU load affect the database only if they occur on the host it is currently running on. This is especially interesting if you want to compute availability reports. You surely do not want your availability to be reported as degraded just because of a high CPU load on the stand-by host.

The following screenshot shows a (somewhat simplified) aggregation for such a scenario:

Features and Advantages
Check_MK BI Aggregations provide a lot of interesting features, many of which are unique in the world of Nagios.
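To make the best/worst/single-item idea from the "How to aggregate" section concrete, here is a minimal Python sketch of evaluating such a dependency tree. It is not Check_MK's rule syntax or implementation. The names `Leaf`, `Aggregation`, `best`, `worst`, `only` and `evaluate`, the host and service names, and the severity ordering (CRIT counted as worse than UNKNOWN) are all assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import Callable, List, Sequence, Union

# Nagios state codes
OK, WARN, CRIT, UNKNOWN = 0, 1, 2, 3

# Assumed severity ranking for this sketch: OK < WARN < UNKNOWN < CRIT
_SEVERITY = {OK: 0, WARN: 1, UNKNOWN: 2, CRIT: 3}


def worst(states: Sequence[int]) -> int:
    """Return the most severe state (classical Nagios/NagVis behaviour)."""
    return max(states, key=_SEVERITY.__getitem__)


def best(states: Sequence[int]) -> int:
    """Return the least severe state (suitable for redundant items)."""
    return min(states, key=_SEVERITY.__getitem__)


def only(index: int) -> Callable[[Sequence[int]], int]:
    """Build a function that uses one child's state and ignores the rest."""
    def pick(states: Sequence[int]) -> int:
        return states[index]
    return pick


@dataclass
class Leaf:
    """A basic host or service state taken from the monitoring."""
    name: str
    state: int


@dataclass
class Aggregation:
    """An inner tree node that combines the states of its children."""
    name: str
    func: Callable[[Sequence[int]], int]
    children: List[Union[Leaf, "Aggregation"]]


def evaluate(node: Union[Leaf, Aggregation]) -> int:
    """Compute the state of a node by recursively evaluating its subtree."""
    if isinstance(node, Leaf):
        return node.state
    return node.func([evaluate(child) for child in node.children])


# Hypothetical database instance on a two-node HA cluster; node2 is active.
db = Aggregation("Database instance", worst, [
    Aggregation("Node hardware (redundant)", best, [
        Leaf("node1 host", CRIT),
        Leaf("node2 host", OK),
    ]),
    Aggregation("OS of active node", only(1), [
        Leaf("node1 CPU load", CRIT),  # stand-by node, ignored
        Leaf("node2 CPU load", OK),    # active node
    ]),
    Aggregation("Services on cluster host", worst, [
        Leaf("DB instance", OK),
        Leaf("DB tablespaces", WARN),
    ]),
])

db_state = evaluate(db)
```

In this sketch the failed stand-by node does not degrade the database: the redundant hardware branch and the active-node branch both evaluate to OK, so `db_state` ends up as WARN, driven only by the tablespace warning.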
BI Reporting
The second module of BI - the reporting - will be able to make use of the BI aggregations. That way you can compute the availability of your applications. One key feature will be that you can modify your aggregations and recreate a report - even for the past! That way you can make sure that your reports always reflect reality - even if you detect an inaccuracy in your aggregation rules. The reporting module will be introduced during the 1.1.13i series. Please stay tuned...