Monitoring MySQL with Check_MK


This article is no longer maintained and may no longer be valid!

1. How to monitor MySQL

(or any other application)

A walkthrough for setting up MySQL server monitoring. This walkthrough tries to touch on many Check_MK usage topics and give you an idea of how you could monitor other kinds of software.

Please understand this is an example, intended as inspiration for which configuration options (which are, after all, documented) you might want to look at.

1.1. The Agent

The agent part is delivered as a plugin right now, but we might add it to the base agent later on - as soon as we have collected more experience with it and received enough feedback.

To install the plugin, copy the file mk_mysql to your $MK_LIBDIR/plugins directory. $MK_LIBDIR is configured at the beginning of the base Check_MK agent and is often set to /usr/lib/check_mk_agent.

If you're not on Linux, you might need to change the shell it uses to run, which is defined in the first line; by default it points to #!/bin/bash. After that, make the script executable using chmod 700 mk_mysql and try running it with ./mk_mysql.

For most people it will complain that it's not authorized to access MySQL.

This is because the MySQL root user (or another suitable account) needs to be accessible to the OS user the check_mk_agent runs as. Almost any way of solving this has some security impact; we settled for storing the password in a file in $MK_CONFDIR, which is normally /etc/check_mk. This file should be called "mysql.cfg" and have mode "400" (set with chmod again), which ensures it is not readable by non-root users.

In it, you store the password and optionally a user for MySQL to use for authentication, in the following MySQL format:

/etc/check_mk/mysql.cfg
[client]
user=monitor
password=MyPassWord

You can specify any user (it does not need to be the system monitoring user) that has the following global privileges in MySQL to perform the SQL statements:

  • Select
  • Show Databases

If you create a MySQL user just for monitoring, you can grant these privileges with the following SQL statement:

GRANT SELECT, SHOW DATABASES ON *.*
TO 'username'@'servername';

Please replace username with the user you want to use; in this example it is monitor. The servername can be localhost, a specific server name, or %, which is a wildcard that matches every server name.

After creating the file, running the Agent should now give a lot more output.

The output comes from the default MySQL monitoring command "show global status". It gives a lot of useful information, but still not very much compared to other databases.

The patch sets by Facebook / Percona bring much better monitoring than MySQL allows out of the box. If those ever get added to MySQL, or if you run the Percona edition, you can extend the client to monitor a lot(!) more.

1.2. MySQL global checks

MySQL Sessions allows you to track the number of sessions and to set limits on them. If you want, you can use this to alert in case the "sane" usage numbers for your server are exceeded. You also get reporting for connections/s.

Let's set it up so that we have a rather high limit for the total number of connections, a lower one for those that are running queries, and a low one for the connection rate to catch constantly reconnecting clients (in case your application is supposed to reuse connections!)

main.mk
checkgroup_parameters['mysql_sessions'] += [
  ( {'connections': (20, 40),
     'running'    : (30, 60),
     'total'      : (100, 300)}, [], ALL_HOSTS ),
]

1.3. InnoDB

InnoDB tables report an "available_mb" and a "free_mb" value. I'm not entirely sure what the basis of "free" is, unless you have deployed database quotas. If you know more about this value, let us know.

We monitor that and will alert when it reaches configurable thresholds. Since the check is based on "fs", it also brings along averaging and trending, which means you can now track capacity growth and both plan ahead and alert before the database grows beyond the size you assumed.

Additionally, InnoDB keeps track of the bytes it wrote/read, and we monitor these as well. This is not on a per-database level, since there is only one "engine" for all of them.

We can define a service that will alert us if exceptionally high throughput of 90 MB/s arises.

We then tune it so that it would only alert if it's really needed, by enabling a 15 minute average - if you run the numbers, this check will go to WARN if 27 GB were read from the database in just 15 minutes. So if that ever triggers, you'll know some extremely read-heavy thing is running, which indicates a design issue.

main.mk
checkgroup_parameters['mysql_innodb_io'] += [
  ( {'read'   : (30.0, 90.0),
     'write'  : (20.0, 50.0),
     'average': 15}, [], ALL_HOSTS ),
]

This way you get to catch the classic "but that select * worked just fine on my laptop" developer, and anything that is normal application behaviour will stay under the threshold. A non-snapshotting online backup like mysqldump will probably also trigger this, but you can use downtimes for that.

Last, to monitor the existence and size of the InnoDB data files and redo logs, we use the fileinfo check.

/etc/check_mk/fileinfo.cfg
  /var/lib/mysql/ib*

On the server side we can then configure checks on those files. The maximum file size should be in line with what your databases are configured for. On my test box I configured an alert in case it grows over a few GB, since there is only 40 GB of space.

main.mk
checkgroup_parameters['fileinfo'] += [
  ( {'maxsize': (10737418240, 21474836480),
     'maxage': (0, 0)}, [], ALL_HOSTS, ['/var/lib/mysql/ibdata'] ),
]

In just the same way you can also create a check that ensures your ib_logfiles never grow beyond their configured size, as sketched below.
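
A minimal sketch, assuming the redo logs live in /var/lib/mysql and innodb_log_file_size is set to 256 MB - adjust the limits to match your own configuration:

main.mk
checkgroup_parameters['fileinfo'] += [
  ( {'maxsize': (268435456, 285212672),   # warn at the configured 256 MiB, crit a bit above that
     'maxage': (0, 0)}, [], ALL_HOSTS, ['/var/lib/mysql/ib_logfile'] ),
]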

Hint: since this is all Python, you can of course also write 100 * 1024 * 1024 * 1024 or anything else that is valid for Python and Check_MK.
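
For example, the ibdata rule above could just as well be written with the arithmetic spelled out (same 10 GB / 20 GB limits, just easier to read):

main.mk
checkgroup_parameters['fileinfo'] += [
  ( {'maxsize': (10 * 1024 * 1024 * 1024, 20 * 1024 * 1024 * 1024),
     'maxage': (0, 0)}, [], ALL_HOSTS, ['/var/lib/mysql/ibdata'] ),
]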

Additionally, it would be possible to define a check on the file age.
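
A sketch of what that could look like - the one hour / two hour levels are made-up values that would alert if the ibdata file has not been modified for that long:

main.mk
checkgroup_parameters['fileinfo'] += [
  ( {'maxage': (3600, 7200)},   # seconds since last modification
    [], ALL_HOSTS, ['/var/lib/mysql/ibdata'] ),
]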

1.4. MyISAM

The database usage check also runs for MyISAM (actually, for all engines), and there we see that small, unused databases can trigger false alarms: if your database size is under 1 MB, things can go wrong. You need to filter those out by ignoring the service (see the sketch below), until someone can hint at a better solution.
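
A sketch of how ignoring such a service could look in main.mk - the service description "MySQL DB Size tinydb" is only an assumed example, so copy the exact name from your GUI:

main.mk
ignored_services += [
  ( ALL_HOSTS, [ "MySQL DB Size tinydb" ] ),   # "tinydb" stands for the small, unused database
]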

MyISAM tables are stored as plain files in subdirectories of the MySQL data directory, usually /var/lib/mysql.

To monitor these, we need to find a way to get their names while avoiding the other subdirectories like mysql or performance_schema.
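
One simple way, assuming you know which databases you actually care about, is to list their MyISAM files explicitly in fileinfo.cfg (mydatabase is just a placeholder for your database's directory):

/etc/check_mk/fileinfo.cfg
  /var/lib/mysql/mydatabase/*.MYD
  /var/lib/mysql/mydatabase/*.MYI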

The MyISAM engine does not track the blocks it reads/writes. You can work around that using Bytes_sent and Bytes_received with the community exchange MySQL plugin.

We don't want to track more than one value of this kind; besides, it is still open whether we're more interested in I/O specifics or in the total size of what applications requested. Since the InnoDB reads are also reflected in Bytes_sent, it should be enough for the real world.

1.5. Monitoring arbitrary parameters

Monitoring any parameter you ever want - this is for the always hungry :)

FIRST: This section uses a community plugin. These plugins are only supported via the Check_MK mailing list; there is no official support for them.

If needed, it is possible to extend the agent and query many more parameters. The downside is that you'll have to configure them manually, and either run "show global status" twice or modify your agent to do caching. None of this is really elegant. We're not recommending it and also NOT supporting it, but that does not necessarily mean you can't be doing it. :)

Monitoring arbitrary parameters is done using a plugin from the Check_MK Community Exchange at http://exchange.check-mk.org/

Head there and grab the MySQL status plugin. Its agent part just runs 'show status', so the first step is to modify the agent; if you want, you can also add a query for InnoDB stats like show variables where variable_name like '%innodb%';

The rest of the configuration is done on the server side. The check is configured by explicitly listing, in your configuration, the variables you care about and their expected values / thresholds (since MySQL sends boolean, integer and other values all in one big batch).

Of course, don't add any parameters you already get from the standard mysql checks, and remove them from this list as we extend those checks.

This is a snippet from my old "live" configuration:

conf.d/mysqlmon.mk
if "mysql_status" in inventorable_checktypes("all"):
    mysql_status_vars = [
         ( "Innodb_os_log_pending_fsyncs",          5, 10),
         ( "Innodb_data_pending_fsyncs",          50, 100),
         ( "Innodb_buffer_pool_pages_free",    None, None),
         ( "Qcache_free_memory",               32*1024*1024, 10*1024*1024),
         ( "Qcache_hits",                      None, None),
         ( "Aborted_clients",                      10, 20),
         ( "Aborted_connects",                   909, 500),
         ( "Bytes_received",                   None, None),
         ( "Bytes_sent",                1*1024*1024, 10*1024*1024),
         ( "Slave_running",                    "ON", "ON"),
    ]

The first line makes sure it's only loaded if the plugin is installed.

If you set "None, None" as levels for an integer value, it will only be tracked, but not alerted on.

Keep an eye on how the different warn/crit values are configured. Some of the values MySQL reports as positive, some as negative (meaning: for those, CRIT must be a lower number than WARN).

Last, the check by default assumes the value in question is a gauge. If it's not, you need to configure that inside the check. Same goes for the negative values.

This is simply a design issue in MySQL, but it's not too hard to work around it by configuring the check.

1.6. Processes

MySQL Daemon

Catch the MySQL daemon at work by using inventory_processes. Then also make it track performance / usage for the MySQL processes in nice graphs simply by adding _perf!

conf.d/processes.mk
inventory_processes_perf += [
    ( "MySQL", "~.*sbin/mysqld", ANY_USER, 0, 1, 5, 8),
]

1.7. Filesystems

Ideally you'd have different filesystems for the InnoDB redo logs, the InnoDB data files, and maybe even one per database. That would give you the same level of control as an Oracle system, at the same penalty in administrative overhead :)

On the systems we have access to this is not the case, so we have either a single / filesystem or a /var/lib/mysql filesystem.

I'll outline what to do for an "idealized" MySQL server.

FS check for the main database filesystem.

This filesystem will not alert based on usage until it hits 95%. Instead, we mostly use trend alerting as a main indicator here.

We'll get alerted if the filesystem is growing at a rate that will fill it in less than 10 days, or get a critical alert if it would be full within 2 days. The trend is set to be based on a monthly average.

main.mk
checkgroup_parameters['filesystem'] += [
  ( {'trend_timeleft': (10, 2),
     'levels': (95.0, 98.0),
     'trend_range': 720,
     'trend_perfdata': True}, [], ALL_HOSTS, ['/data/mysql/db'] ),
]

That way, we'll still always have time to attach another disk and extend the filesystem (unless you work for $MEGACORP and it takes 9 weeks to get a filesystem extension).

Trend check for the redo logs

For our InnoDB redo logs we want to quickly be alerted if something grows unexpectedly - they should have a static size.

We allow for 99% FS usage, but quickly alert on even 1MByte of daily growth.

main.mk
checkgroup_parameters['filesystem'] += [
  ( {'levels': (99.0, 99.0),
     'trend_range': 24,
     'trend_mb': (1, 1),
     'trend_perfdata': True}, [], ALL_HOSTS, ['/data/mysql/redo'] ),
]

1.8. Logfiles

After making sure the MK Logwatch Agent module is installed, configure it to track the MySQL error log using

/etc/check_mk/logwatch.cfg
/var/log/mysql/error.log
 W Can't create/write to file
 C [ERROR] Can't start server
 C mysqld_safe mysqld from pid file /var/run/mysql/mysqld.pid ended

And then also double-check that this log is actually written to by MySQL. This is enabled in the server section of your MySQL config file, usually a my.cnf in /etc.

/etc/my.cnf
[mysqld]
log_error               = /var/log/mysql/error.log

(If you're going the Debian way: identify which of the 10+ MySQL config files a mere admin is entitled to edit, then add the entry to that file.)

1.9. Service Grouping

Service groups are rarely used with Check_MK, but they allow you to easily build views.

Create a service group named "mysql-services" and add all the MySQL services to it. Check_MK matches services by the beginning of their names, i.e. you can specify "fs_/data/mysql" and this will match all filesystem services whose names start with that string.
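
One way to build the group in main.mk, assuming a Check_MK version that supports define_servicegroups and the service_groups ruleset (the patterns are prefix matches on the service names; verify the exact syntax against your version's documentation):

main.mk
define_servicegroups = True   # let Check_MK create the service group definition itself
service_groups += [
  ( "mysql-services", [], ALL_HOSTS, [ "MySQL", "fs_/data/mysql" ] ),
]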