percona / percona-monitoring-plugins

Percona Monitoring Plugins

License: GNU General Public License v2.0

Shell 31.86% Perl 14.80% PHP 36.66% Python 16.33% Makefile 0.34%

percona-monitoring-plugins's Introduction

⚠ Percona Monitoring Plugins is End of Life ⚠

Effective August 1, 2020, Percona moved the Percona Monitoring Plugins to end of life status. This means that open PRs and issues cannot be addressed. No new versions, enhancements, bug fixes, or security updates will be released. The software will continue to be available at our download site.

Call for Maintainers

If you are interested in continuing this project under your own name or organization and would like to become its core maintainer, please contact us at [email protected] or on our forums.

The Percona Monitoring Plugins are high-quality components to add enterprise-grade MySQL monitoring and graphing capabilities to your existing in-house, on-premises monitoring solutions. The components are designed to integrate seamlessly with widely deployed solutions such as Nagios, Cacti and Zabbix.

Project home page: http://www.percona.com/software/percona-monitoring-plugins

percona-monitoring-plugins's Issues

pmp-check-mysql-replication-delay shell hangs with syntax error

Description:

# su -l nagios -c "env -i HOME=/usr/local/nagios /usr/local/nagios/libexec/pmp-check-mysql-replication-delay -u -w 1 -T -s"
ERROR 1064 (42000) at line 1: You have an error in your SQL syntax; check the manual that corresponds to your MySQL server version for the right syntax to use near '-s WHERE (0 = 0 OR server_id = 0)' at line 2
^C
Session terminated, killing shell... ...killed.

pmp-check-mysql-ts-count shows error message directly

Based on the discussion linked below, printing the raw error message exposes information:
#58

How to reproduce:

mysql> create table deadlocks(id int not null);
Query OK, 0 rows affected (0.01 sec)

# su -l nagios -c "env -i HOME=/usr/local/nagios /usr/local/nagios/libexec/pmp-check-mysql-ts-count"
ERROR 1054 (42S22) at line 1: Unknown column 'ts' in 'where clause'
UNK could not count deadlocks

get_slave_status is called conditionally

get_slave_status outputs the current status for the slave to file, which is used to check for errors and replication delay.

When pt-heartbeat is used, the function is called only when -s is set to MASTER.

pmp-check-mongo.py: Could not connect or exec 'isMaster' command: 'No servers found yet'

I'm running a cronjob every 15 minutes that executes the pmp-check-mongo.py Nagios plugin to check a dozen MongoDB servers. I'm seeing intermittent errors like the following:

CRITICAL - Could not connect or exec 'isMaster' command: 'No servers found yet'

Here's a sample of errors over several days, including timestamps:

---- MONDAY ----

19:30
Check if the shards are balanced...
  dcamongodb2 failed | msg: non-zero return code | stdout: CRITICAL - Could not connect or exec 'isMaster' command: 'No servers found yet'

23:45
Check if there was a recent election...
  dcbmongodb2 failed | msg: non-zero return code | stdout: CRITICAL - Could not connect or exec 'isMaster' command: 'No servers found yet'

---- TUESDAY ----

03:45
Check connection...
  dcbmongodb2 failed | msg: non-zero return code | stdout: CRITICAL - Could not connect or exec 'isMaster' command: 'No servers found yet'

05:15
Check that the cluster has a primary server...
  dcbmongodb2 failed | msg: non-zero return code | stdout: CRITICAL - Could not connect or exec 'isMaster' command: 'No servers found yet'

07:00
Check that the cluster has a primary server...
  dcbmongodb2 failed | msg: non-zero return code | stdout: CRITICAL - Could not connect or exec 'isMaster' command: 'No servers found yet'

09:00
Check if the shards are balanced...
  dcbmongodb2 failed | msg: non-zero return code | stdout: CRITICAL - Could not connect or exec 'isMaster' command: 'No servers found yet'

09:30
Check if there was a recent election...
  dcbmongodb2 failed | msg: non-zero return code | stdout: CRITICAL - Could not connect or exec 'isMaster' command: 'No servers found yet'

I have no reason to suspect there is anything wrong with the MongoDB servers themselves. The errors go away if I add some exception handling and retry logic; see #112 for the code that fixes the problem.

The revision of pmp-check-mongo.py I was testing with is ca16cdc.

The server running the cronjob is CentOS 7.7 using the system /usr/bin/python. The pymongo library is version 3.7.2, and was provided by the CentOS repository:

$ rpm -qi python2-pymongo
Name        : python2-pymongo
Version     : 3.7.2
Release     : 1.el7
Architecture: x86_64
Install Date: Fri 17 Jul 2020 12:22:26 PM EDT
Group       : Unspecified
Size        : 1852358
License     : ASL 2.0 and MIT
Signature   : (none)
Source RPM  : python-pymongo-3.7.2-1.el7.src.rpm
Build Date  : Wed 06 Mar 2019 12:23:09 AM EST
Build Host  : c1bk.rdu2.centos.org
Relocations : (not relocatable)
Packager    : CBS <[email protected]>
Vendor      : CentOS

Script unaware of all regions

I have discovered that the RDS script doesn't show all DB instances in all regions. For example, I recently added an instance in Canada, and the script doesn't appear to know this is a valid region. Additionally, listing all instances shows only 9 regions:

/usr/lib64/nagios/plugins/check_rds -r all -l
List of all DB instances in all region(s):
{'ap-northeast-1': [],
 'ap-southeast-1': [],
 'ap-southeast-2': [DBInstance:xxx],
 'eu-central-1': [],
 'eu-west-1': [DBInstance:xxxx, DBInstance:xxx],
 'sa-east-1': [],
 'us-east-1': [],
 'us-west-1': [],
 'us-west-2': []}

Querying an instance in Canada, using either ca-central-1 or ca-central-1a, fails:

/usr/lib64/nagios/plugins/check_rds -r ca-central-1 -i xxxxx -m status

Traceback (most recent call last):
  File "/usr/lib64/nagios/plugins/check_rds", line 399, in <module>
    main()
  File "/usr/lib64/nagios/plugins/check_rds", line 220, in main
    rds = RDS(region=options.region, profile=options.profile, identifier=options.ident)
  File "/usr/lib64/nagios/plugins/check_rds", line 49, in __init__
    self.info = rds.get_all_dbinstances(self.identifier)
AttributeError: 'NoneType' object has no attribute 'get_all_dbinstances'
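The traceback shows that the connection object was None by the time get_all_dbinstances was called, which is what one would expect if the region lookup does not recognize ca-central-1. A guard along the following lines would turn the crash into a readable message. This is only a sketch: `open_region`, `fake_connect`, and the message wording are hypothetical, and `fake_connect` stands in for whatever connector the script actually uses.

```python
# The nine regions the script listed in the output above.
KNOWN_REGIONS = {
    "ap-northeast-1", "ap-southeast-1", "ap-southeast-2",
    "eu-central-1", "eu-west-1", "sa-east-1",
    "us-east-1", "us-west-1", "us-west-2",
}


def fake_connect(region):
    """Stand-in connector: returns None for regions it does not know."""
    return object() if region in KNOWN_REGIONS else None


def open_region(region, connect=fake_connect):
    """Return a connection, or raise a clear error instead of failing later."""
    conn = connect(region)
    if conn is None:
        raise ValueError(
            "UNKNOWN - region '%s' is not recognized by the client library" % region
        )
    return conn
```

With this check, an unsupported region is reported up front rather than surfacing as an AttributeError deep inside the constructor.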

pmp-check-mysql-status to have a 3rd option

Hi,

As a feature request, it would be nice to have a third option that transforms a value before any of the other checks run, so that a check such as Uptime can be expressed in minutes (Uptime/60). Output in minutes rather than seconds is a little more readable.

A simple example of a command that would benefit from minute conversion:

command[dba_check_mysql_status_uptime]=/usr/lib64/nagios/plugins/pmp-check-mysql-status -H localhost -x Uptime -C '<' -w 300

Thanks,
Anthony

Null in Canada region

Is it possible that the RDS plugin isn't yet compatible with RDS instances in the Central Canada region? I'm getting a null reading in Nagios.

pmp-check-mysql-replication-running shows sql error message directly

Description:

# su -l nagios -c "env -i HOME=/usr/local/nagios /usr/local/nagios/libexec/pmp-check-mysql-replication-running --master-conn master"
ERROR 1064 (42000) at line 1: You have an error in your SQL syntax; check the manual that corresponds to your MySQL server version for the right syntax to use near ''master' STATUS' at line 1
UNK could not determine replication status

Unknown DB instance class "db.t3.large" in Icinga

Hi, after reconfiguring an AWS RDS instance, we get an 'Unknown DB instance class "db.t3.large"' error in Icinga monitoring. Could you please fix this?

Thanks

error in jmx-monitor

Hi
When I run the command below, I get an error:
ant -Djmx.server.port=1099 -e -q -f jmx-monitor.xml

BUILD FAILED
/home/test/jmx/jmx-monitor.xml:29: Problem: failed to create task or type antlib:org.apache.catalina.ant.jmx:open
Cause: The name is undefined.
Action: Check the spelling.
Action: Check that any custom tasks/types have been declared.
Action: Check that any <presetdef>/<macrodef> declarations have taken place.
No types or tasks have been defined in this namespace yet

This appears to be an antlib declaration.
Action: Check that the implementing library exists in one of:
-/usr/share/ant/lib
-/home/test/.ant/lib
-a directory added on the command line with the -lib argument

Total time: 0 seconds

Please help me

pmp-check-mysql-ts-count -x option accepts any value passed

--help:

 -x TARGET       Metric monitored; default deadlocks.
                    Other options: kills, fkerrors.

Based on the --help output, the accepted values are clearly "deadlocks", "kills", and "fkerrors".

But I can pass any string here:

# su -l nagios -c "env -i HOME=/usr/local/nagios /usr/local/nagios/libexec/pmp-check-mysql-ts-count -x kjfgdkjfngdjfg"
OK 0 kjfgdkjfngdjfg in last 1 minutes | kjfgdkjfngdjfg=0;12;60;0;
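One way to reject unknown targets is to compare the -x value against the list from --help before running any query. The sketch below illustrates the idea in Python rather than in the plugin's shell; `validate_target` and the UNK message wording are assumptions, not the plugin's actual code.

```python
VALID_TARGETS = ("deadlocks", "kills", "fkerrors")  # values listed in --help


def validate_target(target):
    """Return (ok, message) for a -x value; hypothetical helper."""
    if target in VALID_TARGETS:
        return True, ""
    allowed = ", ".join(VALID_TARGETS)
    return False, "UNK invalid target '%s'; expected one of: %s" % (target, allowed)


ok, message = validate_target("kjfgdkjfngdjfg")
```

Failing early like this would also keep an arbitrary string from ending up in the perfdata label.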

Missing perl dependency

pmp-check-mysql-status requires perl to show help:

$ docker run --rm anchorfree/percona-nagios-plugins /usr/lib64/nagios/plugins/pmp-check-mysql-status
Error: you must specify either -c or -w. Try --help.

$ docker run --rm anchorfree/percona-nagios-plugins /usr/lib64/nagios/plugins/pmp-check-mysql-status --help
/usr/lib64/nagios/plugins/pmp-check-mysql-status: line 47: perl: command not found

I ran into this after installing the tools in a fresh CentOS-based container. After installing perl, it works:

$ docker run --rm anchorfree/percona-nagios-plugins /usr/lib64/nagios/plugins/pmp-check-mysql-status --help
  Usage: pmp-check-mysql-status [OPTIONS]
  Options:
    -c CRIT         Critical threshold.
    --defaults-file FILE Only read mysql options from the given file.
                    Defaults to /etc/nagios/mysql.cnf if it exists.
    -C COMPARE      Comparison operator to apply to -c and -w.
                    Possible values: == != >= > < <=. Default >=.
    -H HOST         MySQL hostname.
    -I INCR         Make SHOW STATUS incremental over this delay.
    -l USER         MySQL username.
    -L LOGIN-PATH   Use login-path to access MySQL (with MySQL client 5.6).
    -o OPERATOR     The operator to apply to -x and -y.
    -p PASS         MySQL password.
    -P PORT         MySQL port.
    -S SOCKET       MySQL socket file.
    -T TRANS        Transformation to apply before comparing to -c and -w.
                    Possible values: pct str.
    -w WARN         Warning threshold.
    -x VAR1         Required first status or configuration variable.
    -y VAR2         Optional second status or configuration variable.
    --help          Print help and exit.
    --version       Print version and exit.
  Options must be given as --option value, not --option=value or -Ovalue.
  Use perldoc to read embedded documentation with more details.

No documentation for building from source, or 17.04 support

I can't find any documentation on how to build the percona-monitoring-tool from the source tarball, and I need it because I am on Ubuntu Server 17.04. Is there any documentation available?

bashism in /bin/sh script in nagios/bin/

Running checkbashisms (https://anonscm.debian.org/cgit/collab-maint/devscripts.git/plain/scripts/checkbashisms.pl) against nagios/bin/* from the 1.1.5 release produces the warnings below.

When /bin/sh is not bash, these constructs are likely to lead to errors or unexpected behaviour. Please be aware that dash is the default /bin/sh in Debian (and its derivatives) and possibly in other distributions.

possible bashism in nagios/bin/pmp-check-lvm-snapshots line 103 ($_):
   [ "${0##*/}" = "pmp-check-lvm-snapshots" ] || [ "${0##*/}" = "bash" -a "$_" = "$0" ]
possible bashism in nagios/bin/pmp-check-mysql-deadlocks line 97 ($_):
   [ "${0##*/}" = "pmp-check-mysql-deadlocks" ] || [ "${0##*/}" = "bash" -a "$_" = "$0" ]
possible bashism in nagios/bin/pmp-check-mysql-deleted-files line 154 ($_):
   [ "${0##*/}" = "pmp-check-mysql-deleted-files" ] || [ "${0##*/}" = "bash" -a "$_" = "$0" ]
possible bashism in nagios/bin/pmp-check-mysql-file-privs line 147 ($_):
   [ "${0##*/}" = "pmp-check-mysql-file-privs" ] || [ "${0##*/}" = "bash" -a "$_" = "$0" ]
possible bashism in nagios/bin/pmp-check-mysql-innodb line 237 ($_):
   [ "${0##*/}" = "pmp-check-mysql-innodb" ] || [ "${0##*/}" = "bash" -a "$_" = "$0" ]
possible bashism in nagios/bin/pmp-check-mysql-pidfile line 163 ($_):
   [ "${0##*/}" = "pmp-check-mysql-pidfile" ] || [ "${0##*/}" = "bash" -a "$_" = "$0" ]
possible bashism in nagios/bin/pmp-check-mysql-processlist line 184 ($_):
   [ "${0##*/}" = "pmp-check-mysql-processlist" ] || [ "${0##*/}" = "bash" -a "$_" = "$0" ]
possible bashism in nagios/bin/pmp-check-mysql-replication-delay line 139 ($_):
   [ "${0##*/}" = "pmp-check-mysql-replication-delay" ] || [ "${0##*/}" = "bash" -a "$_" = "$0" ]
possible bashism in nagios/bin/pmp-check-mysql-replication-running line 126 ($_):
   [ "${0##*/}" = "pmp-check-mysql-replication-running" ] || [ "${0##*/}" = "bash" -a "$_" = "$0" ]
possible bashism in nagios/bin/pmp-check-mysql-status line 304 ($_):
   [ "${0##*/}" = "pmp-check-mysql-status" ] || [ "${0##*/}" = "bash" -a "$_" = "$0" ]
possible bashism in nagios/bin/pmp-check-mysql-ts-count line 64 (should be 'b = a'):
   if [ "${OPT_TARGET}" == "kills" ]; then
possible bashism in nagios/bin/pmp-check-mysql-ts-count line 67 (should be 'b = a'):
   elif [ "${OPT_TARGET}" == "fkerrors" ]; then
possible bashism in nagios/bin/pmp-check-mysql-ts-count line 109 ($_):
   [ "${0##*/}" = "pmp-check-mysql-ts-count" ] || [ "${0##*/}" = "bash" -a "$_" = "$0" ]
possible bashism in nagios/bin/pmp-check-unix-memory line 118 ($_):
   [ "${0##*/}" = "pmp-check-unix-memory" ] || [ "${0##*/}" = "bash" -a "$_" = "$0" ]

Hints about how to fix bashisms can be found at:
https://wiki.ubuntu.com/DashAsBinSh

pmp-check-mysql-replication-running unclear --master-conn option in --help output

# su -l nagios -c "env -i HOME=/usr/local/nagios /usr/local/nagios/libexec/pmp-check-mysql-replication-running --help"
  Usage: pmp-check-mysql-replication-running [OPTIONS]
  Options:
    -c CRIT         Report CRITICAL when replication is stopped with or w/o errors.
    --defaults-file FILE Only read mysql options from the given file.
                    Defaults to /etc/nagios/mysql.cnf if it exists.
    -d              Useful for slaves delayed by pt-slave-delay. It will not alert
                    when IO thread is running, SQL one is not and no errors.
    -H HOST         MySQL hostname.
    -l USER         MySQL username.
    -L LOGIN-PATH   Use login-path to access MySQL (with MySQL client 5.6).
    -p PASS         MySQL password.
    -P PORT         MySQL port.
    -S SOCKET       MySQL socket file.
    -w WARN         Report WARNING when SHOW SLAVE STATUS output is empty.
    --master-conn NAME  Master connection name for MariaDB multi-source replication.
    --help          Print help and exit.
    --version       Print version and exit.

As the output shows, --master-conn takes a NAME and applies to MariaDB, but the help text explains neither.

pmp-check-mysql-status wrong error message for -c if passed without value

Hi,

Running without any options:

# su -l nagios -c "env -i HOME=/usr/local/nagios /usr/local/nagios/libexec/pmp-check-mysql-status"
Error: you must specify either -c or -w. Try --help.

So the -c or -w is required:

# su -l nagios -c "env -i HOME=/usr/local/nagios /usr/local/nagios/libexec/pmp-check-mysql-status -c"
Error: you must specify either -c or -w. Try --help.

But I have specified -c, so from the user's point of view the message should be something like:
Error: -c requires a value. Try --help.

The same is true for -w as well:

# su -l nagios -c "env -i HOME=/usr/local/nagios /usr/local/nagios/libexec/pmp-check-mysql-status -w"
Error: you must specify either -c or -w. Try --help.
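The distinction being asked for is between "option absent" and "option present without a value". A minimal sketch of that check follows, written in Python for illustration rather than in the plugin's shell; `check_thresholds` is a hypothetical name, and the simplification that a value never starts with "-" is an assumption that would not hold for negative thresholds.

```python
def check_thresholds(argv):
    """Return an error string, or None if -c/-w usage is acceptable.

    Hypothetical sketch of the argument check, not the plugin's shell code.
    Simplification: a following token that starts with '-' is treated as
    another option, so negative thresholds are not handled here.
    """
    seen = {}
    i = 0
    while i < len(argv):
        arg = argv[i]
        if arg in ("-c", "-w"):
            has_value = i + 1 < len(argv) and not argv[i + 1].startswith("-")
            if not has_value:
                return "Error: %s requires a value. Try --help." % arg
            seen[arg] = argv[i + 1]
            i += 1  # skip the value just consumed
        i += 1
    if not seen:
        return "Error: you must specify either -c or -w. Try --help."
    return None
```

Called with `["-c"]` this returns the "-c requires a value" message proposed above, while an empty argument list still yields the original "you must specify either -c or -w" error.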

pmp-check-mysql-innodb Output Long Query

Hi,

When the pmp-check-mysql-innodb check alerts on the max_duration option, it gives good information such as the user, host, and duration, but I think including the SQL query would be helpful as well. That way it is easier to track down what needs optimizing, and you don't have to go looking for the query. I think this would be a very useful feature.

Thanks!

Redundant argument in sprintf at /usr/local/bin/pmp-cacti-template line 25.

Command that generates the above error:

pmp-cacti-template --script /usr/local/share/cacti/scripts/ss_get_mysql_stats.php definitions/mysql.def > mysql-template.xml

I would prefer to just use pre-made templates, but there seems to be no download link.

"UNK couldn't query the checksum table" - unknown explanation for pmp-check-pt-table-checksum

Running pmp-check-pt-table-checksum can produce three different types of output:

# su -l nagios -c "env -i HOME=/usr/local/nagios /usr/local/nagios/libexec/pmp-check-pt-table-checksum"
UNK table 'percona.checksums' doesn't exist

# su -l nagios -c "env -i HOME=/usr/local/nagios /usr/local/nagios/libexec/pmp-check-pt-table-checksum"
OK pt-table-checksum found no out-of-sync tables

And:

# su -l nagios -c "env -i HOME=/usr/local/nagios /usr/local/nagios/libexec/pmp-check-pt-table-checksum"
UNK couldn't query the checksum table

So it couldn't query the table, but why?

How to reproduce:

mysql> use percona;
mysql> create table checksums(id int not null);
Query OK, 0 rows affected (0.01 sec)

# su -l nagios -c "env -i HOME=/usr/local/nagios /usr/local/nagios/libexec/pmp-check-pt-table-checksum"

It seems to be related to pt-table-checksum:

PT-192

Maybe improving UNK message or directly giving output from pt-table-checksum will be more informative.

no input validations in ss_get_mysql_stats.php parse_cmdline

Have an installation of Cacti which upgraded from 0.8.8h to v1.0.4 with monitor plugins 1.1.7 that stopped pulling data after the upgrade. Traced the lack of data to the poller call to ss_get_mysql_stats.php. Results from the script logging:

2017-04-07 17:21:04 at /var/lib/cacti/scripts/ss_get_mysql_stats.php:71
'Found configuration file /etc/cacti/ss_get_mysql_stats.php.cnf'
2017-04-07 17:21:04 at /var/lib/cacti/scripts/ss_get_mysql_stats.php:126
array (
0 => '/usr/share/cacti/scripts/ss_get_mysql_stats.php',
1 => '--host',
2 => 'x.x.x.x', //redacted to protect the guilty
3 => '--items',
4 => 'ju,jv,jw,jx,jy,jz,kg,kh,ki,kj,kk',
5 => '--user',
6 => '',
7 => '--pass',
8 => '',
9 => '--port',
10 => '',
11 => '--server-id',
12 => '',
)
2017-04-07 17:21:04 parse_cmdline() at /var/lib/cacti/scripts/ss_get_mysql_stats.php:239
array (
'host' => 'x.x.x.x', //redacted to protect the guilty
'items' => 'ju,jv,jw,jx,jy,jz,kg,kh,ki,kj,kk',
'user' => '',
'pass' => '',
'port' => '',
'server-id' => '',
)
2017-04-07 17:21:04 ss_get_mysql_stats() at /var/lib/cacti/scripts/ss_get_mysql_stats.php:266
'Cache file is /tmp/x.x.x.x-mysql_cacti_stats.txt:' //redacted to protect the guilty
2017-04-07 17:21:04 ss_get_mysql_stats() at /var/lib/cacti/scripts/ss_get_mysql_stats.php:283
'The cache file seems too small or stale'
2017-04-07 17:21:04 ss_get_mysql_stats() at /var/lib/cacti/scripts/ss_get_mysql_stats.php:316
array (
0 => 'Connecting to',
1 => 'x.x.x.x', //redacted to protect the guilty
2 => '',
3 => '',
4 => '',
)

Looks like the post upgrade version is passing empty strings as arguments when they are not explicitly set in the interface where previous versions would pass NULL. The set empty strings then don't permit the variables from the script config file to take precedence. Propose adding input validation to parse_cmdline as long as there isn't any real expectation to ever need to actually use an empty string for one of the script parameters. Workaround code from my instance:

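function parse_cmdline( $args ) {
   $options = array();
   $count = count($args);
   # Walk the arguments by index; each() was removed in PHP 8.
   for ( $i = 0; $i < $count; $i++ ) {
      $p = $args[$i];
      if (strpos($p, '--') === 0) {
         $param = substr($p, 2);
         $value = null;
         # Take the next argument as the value unless it is another option.
         if ($i + 1 < $count && strpos($args[$i + 1], '--') !== 0) {
            $value = $args[++$i];
         }
         # Skip empty strings so values from the config file take precedence;
         # value-less flags such as --nocache are kept with a null value.
         if ($value !== '') {
            $options[$param] = $value;
         }
      }
   }
   if ( array_key_exists('host', $options) ) {
      # Strip an optional "tcp:" prefix from the host.
      $options['host'] = substr($options['host'], 0, 4) == 'tcp:' ? substr($options['host'], 4) : $options['host'];
   }
   debug($options);
   return $options;
}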

Results from logging after validation:

2017-04-07 17:51:04 at /var/lib/cacti/scripts/ss_get_mysql_stats.php:71
'Found configuration file /etc/cacti/ss_get_mysql_stats.php.cnf'
2017-04-07 17:51:04 at /var/lib/cacti/scripts/ss_get_mysql_stats.php:126
array (
0 => '/usr/share/cacti/scripts/ss_get_mysql_stats.php',
1 => '--host',
2 => 'x.x.x.x', //redacted to protect the guilty
3 => '--items',
4 => 'ju,jv,jw,jx,jy,jz,kg,kh,ki,kj,kk',
5 => '--user',
6 => '',
7 => '--pass',
8 => '',
9 => '--port',
10 => '',
11 => '--server-id',
12 => '',
)
2017-04-07 17:51:04 parse_cmdline() at /var/lib/cacti/scripts/ss_get_mysql_stats.php:241
array (
'host' => 'x.x.x.x', //redacted to protect the guilty
'items' => 'ju,jv,jw,jx,jy,jz,kg,kh,ki,kj,kk',
)
2017-04-07 17:51:04 ss_get_mysql_stats() at /var/lib/cacti/scripts/ss_get_mysql_stats.php:268
'Cache file is /tmp/x.x.x.x-mysql_cacti_stats.txt' //redacted to protect the guilty
2017-04-07 17:51:04 ss_get_mysql_stats() at /var/lib/cacti/scripts/ss_get_mysql_stats.php:285
'The cache file seems too small or stale'
2017-04-07 17:51:04 ss_get_mysql_stats() at /var/lib/cacti/scripts/ss_get_mysql_stats.php:318
array (
0 => 'Connecting to',
1 => 'x.x.x.x', //redacted to protect the guilty
2 => 3306,
3 => 'user', //redacted to protect the guilty
4 => 'password', //redacted to protect the guilty
)

I am not quite sure whether the change in how unset values are passed to the script is intentional in the release or a result of the Cacti database upgrade, but it may be a good idea to add the input validation anyway. I might test with a fresh install when I have more time to see whether it was the result of the upgrade.

No option for socket (zabbix script)

Recently I downloaded version 1.1.8 and built the scripts with make.sh.

In the script release/percona-monitoring-plugins-1.1.8/zabbix/script/ss_get_mysql_stats.php, the $opts array has no 'socket' entry.

I checked 1.1.7 as well, so this issue has been present for a while.

Please add 'socket' to $opts.

# ============================================================================
# Validate that the command-line options are here and correct
# ============================================================================
function validate_options($options) {
   $opts = array('host', 'items', 'user', 'pass', 'nocache', 'port', 'server-id');
   # Show help
   if ( array_key_exists('help', $options) ) {
      usage('');
   }

   # Required command-line options
   foreach ( array('host', 'items') as $option ) {
      if ( !isset($options[$option]) || !$options[$option] ) {
         usage("Required option --$option is missing");
      }
   }
   foreach ( $options as $key => $val ) {
      if ( !in_array($key, $opts) ) {
         usage("Unknown option --$key");
      }
   }
}

Issue with wrapper for mysql zabbix template

STEPS TO REPRODUCE:

root@localhost:~# grep yn /etc/zabbix/zabbix_agentd.d/galera.conf
UserParameter=MySQL.galera.wsrep_cert_index_size,/var/lib/zabbix/percona/scripts/get_mysql_stats_wrapper.sh yn

root@localhost:~# sh /var/lib/zabbix/percona/scripts/get_mysql_stats_wrapper.sh yn
Synced
48

That's because of this line:
cat $CACHEFILE | sed 's/ /\n/g; s/-1/0/g'| grep $ITEM | awk -F: '{print $2}'

FIX:
In this line:
cat $CACHEFILE | sed 's/ /\n/g; s/-1/0/g'| grep $ITEM | awk -F: '{print $2}'

change: $ITEM -> ^$ITEM

root@localhost:# CACHEFILE="/tmp/localhost-mysql_cacti_stats.txt"
root@localhost:# ITEM="yn"
root@localhost:~# cat $CACHEFILE | sed 's/ /\n/g; s/-1/0/g'| grep ^$ITEM | awk -F: '{print $2}'
36
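The underlying problem is that an unanchored substring match also hits entries whose value happens to contain the item name ("Synced" contains "yn"). Anchoring with ^ fixes that case, although a key that merely starts with the same letters would still match; comparing the whole key is stricter. The sketch below shows the difference on made-up cache content: the `ym` and `yo` keys and all values are hypothetical, and `lookup_item` is an illustrative helper, not part of the wrapper.

```python
def lookup_item(cache_text, item):
    """Return the value for an exact key in 'key:value key:value ...' text."""
    for pair in cache_text.split():
        key, _, value = pair.partition(":")
        if key == item:
            return value
    return None


# Hypothetical cache content in the space-separated key:value layout.
sample = "ym:Synced yn:48 yo:3"

# Substring match, as with an unanchored grep: two hits.
naive = [p.partition(":")[2] for p in sample.split() if "yn" in p]

# Exact key comparison: one hit.
exact = lookup_item(sample, "yn")
```

In the shell wrapper, the equivalent stricter pattern would be something like grep "^$ITEM:", which is a suggestion rather than what the issue proposes.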

Percona MySQL Template doesn't import in Zabbix 3.4

I was unable to import template in Zabbix 3.4 due to changed XML schema.
#91 fixes this

pmp-check-mysql-status fails if a status variable contains a space

To test, run pmp-check-mysql-status against a status variable containing a space:

Variable_name: wsrep_provider_vendor
        Value: Codership Oy <[email protected]>

-bash-4.2$ /usr/lib64/nagios/plugins/pmp-check-mysql-status -x wsrep_provider_vendor -C '!=' -T str -w Synced
awk: cmd. line:8:             if ( Codership Oy <[email protected]> != 0 ) {
awk: cmd. line:8:                                     ^ syntax error
awk: cmd. line:19:             if ( Codership Oy <[email protected]> != Synced ) {
awk: cmd. line:19:                                     ^ syntax error

The fix is to add escaped quotes around ${VAR} in the awk sections:

-bash-4.2$ diff -u /usr/lib64/nagios/plugins/pmp-check-mysql-status ./pmp-check-mysql-status
--- /usr/lib64/nagios/plugins/pmp-check-mysql-status    2016-12-09 18:22:03.000000000 +0000
+++ ./pmp-check-mysql-status    2020-02-06 09:40:15.643168709 +0000
@@ -172,7 +172,7 @@
                exit $STATE_CRITICAL
             }
          } else {
-            if ( ${VAR} ${CMP} ${CRIT:-0} ) {
+            if ( \"${VAR}\" ${CMP} \"${CRIT:-0}\" ) {
                exit $STATE_CRITICAL
             }
          }
@@ -183,7 +183,7 @@
                exit $STATE_WARNING
             }
          } else {
-            if ( ${VAR} ${CMP} ${WARN:-0} ) {
+            if ( \"${VAR}\" ${CMP} \"${WARN:-0}\" ) {
                exit $STATE_WARNING
             }
          }

Stretch version

Unlike jessie, percona-monitoring-plugins is no longer included in the repositories for stretch. See:

$ curl http://repo.percona.com/apt/dists/stretch/main/binary-amd64/Packages| grep nagios

$ curl http://repo.percona.com/apt/dists/jessie/main/binary-amd64/Packages| grep nagios
Filename: pool/main/p/percona-monitoring-plugins/percona-nagios-plugins_1.1.7-2.jessie_all.deb

pmp for Zabbix doesn't include Galera template

Hi,
I can monitor single MySQL instance using PMP for Zabbix.
Galera values are defined in Cacti Definitions but are not included in zabbix template.

Percona Monitoring doesn't work well on Cacti 1.X.X

Hi guys,

Following the creation of issue Cacti/cacti#890 on cacti repo regarding some errors with Percona Monitoring Plugins, one of the developers did an audit of the Percona templates and presented his conclusions:

Running those pmp-* commands on a Cacti 1.x system would not be advised without Percona doing some updates first as they are not compatible with Cacti 1.x.
Cacti no longer supports the rra table for example. So, running all those commands will likely, at a minimum, result in errors.
Please point them to the install_template.php file in the cli directory. I think the best route to get this template to Cacti 1.x is to import it into a 0.8.x version. Then export the various Device Templates. Those templates can be imported into Cacti 1.x using the import_template.php file. They should avoid all their Database magic and leave it to the Cacti API.

I believe that Percona Monitoring Plugins are excellent and maybe some of you can help out with this and the Cacti/cacti#890 issue.

Monitor Long Queries via pmp-check-mysql-processlist

Hi,

This is a feature request to allow monitoring of long queries that run for more than N seconds and to output the query and related info. It would be a good addition to the pmp-check-mysql-processlist monitor. I don't use the PT tools, so it would be good to have this work out of the box.

Anthony

Debian 10 (Buster) support

Hi there,
are there any plans, when the package will be available for Debian Buster?

Incorrect max_duration for long running trx's when --default-time-zone is used

When the server is using --default-time-zone= option (eg --default-time-zone="+00:00"), the computation for max_duration in pmp-check-mysql-innodb is inaccurate since trx.started is set with a different TZ from unix_timestamp().

pmp-check-mysql-innodb uses this query to compute max_duration:
https://github.com/percona/percona-monitoring-plugins/blob/master/nagios/bin/pmp-check-mysql-innodb#L106
SELECT UNIX_TIMESTAMP() - UNIX_TIMESTAMP(t.trx_started), p.id, CONCAT(p.user, '@', p.host) FROM INFORMATION_SCHEMA.INNODB_TRX AS t JOIN INFORMATION_SCHEMA.PROCESSLIST AS p ON p.id = t.trx_mysql_thread_id ORDER BY t.trx_started LIMIT 1;

Sample output if default timezone is "+00:00"

mysql> SELECT UNIX_TIMESTAMP() - UNIX_TIMESTAMP(t.trx_started), p.id, CONCAT(p.user, '@', p.host) FROM INFORMATION_SCHEMA.INNODB_TRX AS t JOIN INFORMATION_SCHEMA.PROCESSLIST AS p ON p.id = t.trx_mysql_thread_id ORDER BY t.trx_started LIMIT 1; show processlist;
+--------------------------------------------------+----+-----------------------------+
| UNIX_TIMESTAMP() - UNIX_TIMESTAMP(t.trx_started) | id | CONCAT(p.user, '@', p.host) |
+--------------------------------------------------+----+-----------------------------+
| 25233 | 4 | root@localhost |
+--------------------------------------------------+----+-----------------------------+
1 row in set (0.01 sec)

+----+------+-----------+------+---------+------+----------+------------------+-----------+---------------+
| Id | User | Host | db | Command | Time | State | Info | Rows_sent | Rows_examined |
+----+------+-----------+------+---------+------+----------+------------------+-----------+---------------+
| 4 | root | localhost | test | Sleep | 33 | | NULL | 0 | 0 |
| 13 | root | localhost | NULL | Query | 0 | starting | show processlist | 0 | 0 |
+----+------+-----------+------+---------+------+----------+------------------+-----------+---------------+
2 rows in set (0.00 sec)
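In the sample output, the reported duration (25233 seconds) exceeds the processlist Time (33 seconds) by 25200 seconds, which is exactly 7 hours, so the numbers are consistent with a fixed time zone offset being added to the real duration. The sketch below reproduces that arithmetic under two assumptions that are not confirmed by the issue: that trx_started is written in the server's system zone, and that the system zone is UTC-7. The dates are arbitrary.

```python
from datetime import datetime, timedelta, timezone

session_tz = timezone.utc                  # --default-time-zone="+00:00"
system_tz = timezone(timedelta(hours=-7))  # assumed system zone, for illustration

now = datetime(2020, 1, 1, 12, 0, 33, tzinfo=session_tz)
trx_started = now - timedelta(seconds=33)  # the transaction really began 33 s ago

# The start time rendered as a zone-less string in the system zone.
started_text = trx_started.astimezone(system_tz).strftime("%Y-%m-%d %H:%M:%S")

# Interpreting that string in the session zone shifts the instant by the offset.
misread = datetime.strptime(started_text, "%Y-%m-%d %H:%M:%S").replace(tzinfo=session_tz)

actual = int((now - trx_started).total_seconds())
reported = int((now - misread).total_seconds())
```

Here `actual` is the true age of the transaction and `reported` is what a subtraction across mismatched zones would yield; the gap between them equals the zone offset.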

pmp-check-mysql-pidfile unclear output

The sample run:

# su -l nagios -c "env -i HOME=/usr/local/nagios /usr/local/nagios/libexec/pmp-check-mysql-pidfile -w 1 -c 1"
OK all PID files exist.

I have only one MySQL instance, so the message should read "OK PID file exists."

https://www.percona.com/doc/percona-monitoring-plugins/LATEST/nagios/pmp-check-mysql-pidfile.html

By default, this plugin will attempt to detect all running instances of MySQL, and verify the PID file’s existence for each one

Not able to view RDS instances under mumbai region

Hi Team,
I am using this plugin to monitor my RDS instances under multiple accounts. I can successfully monitor the RDS instances hosted in the Singapore region, but when I try to do the same for the Mumbai region (ap-south-1), I am not able to view the region. Is there anything I am missing in the configuration?

BR
Venkatesh.P

No HTTPS support for nginx/apache with ss_get_by_ssh.php

I was trying to set up the Nginx plugin for monitoring when I realized there's no support for HTTPS. That seems reasonable, given that it usually wouldn't be too hard to expose server-status over an HTTP page, but our infrastructure is a bit wonky, so it would be somewhat painful for us.

I'm currently working on an enhancement for it. Do you guys have PR guidelines anywhere?

Can't run the pmp-check-aws-rds.py

No matter how I try I only get this:

Traceback (most recent call last):
File "./pmp-check-aws-rds.py", line 361, in <module>
main()
File "./pmp-check-aws-rds.py", line 192, in main
info = rds.get_list()
File "./pmp-check-aws-rds.py", line 69, in get_list
except (boto.provider.ProfileNotFoundError, boto.exception.BotoServerError) as msg:
AttributeError: 'module' object has no attribute 'ProfileNotFoundError'
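
The traceback suggests that the installed boto release does not define `boto.provider.ProfileNotFoundError`, presumably because it predates that exception. Upgrading boto is likely the simpler fix. As a minimal sketch of a defensive alternative, the `except` tuple can be built with `getattr` so that a missing class is skipped. The helper name `existing_exceptions` and the stand-in namespaces are hypothetical and only simulate an older boto; nothing here is taken from the plugin beyond the line shown in the traceback.

```python
from types import SimpleNamespace


def existing_exceptions(candidates):
    """Return a tuple of the exception classes that actually exist.

    `candidates` is a list of (module_like, attribute_name) pairs.
    Missing attributes are skipped instead of raising AttributeError.
    """
    found = []
    for module, name in candidates:
        exc = getattr(module, name, None)
        if isinstance(exc, type) and issubclass(exc, BaseException):
            found.append(exc)
    return tuple(found)


# Hypothetical stand-ins that mimic an old boto without ProfileNotFoundError.
class BotoServerError(Exception):
    pass


fake_provider = SimpleNamespace()  # no ProfileNotFoundError attribute
fake_exception = SimpleNamespace(BotoServerError=BotoServerError)

CATCHABLE = existing_exceptions([
    (fake_provider, "ProfileNotFoundError"),
    (fake_exception, "BotoServerError"),
])

try:
    raise BotoServerError("simulated API failure")
except CATCHABLE as msg:
    result = "caught: %s" % msg
```

With a real boto installation, the same helper would be called with `boto.provider` and `boto.exception` in place of the stand-ins.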

pmp-check-mysql-deleted-files has no support for sudo

In the code for pmp-check-mysql-deleted-files, there is this bit:

     # If lsof exists, but you run it as non-root, you'll get a file with a
     # bunch of this stuff:
     # mysqld 15287 ... /proc/15287/cwd (readlink: Permission denied)
     # We have to detect this and return UNK.
     if grep -v -e denied -e COMMAND "${TEMP}" >/dev/null 2>&1; then
        local FILES=$(check_deleted_files "${TEMP}" "${OPT_TMPDIR}")
        NOTE="open but deleted files: ${FILES}"
        if [ "${FILES}" -a -z "${OPT_WARN}" ]; then
           NOTE="CRIT $NOTE"
        elif [ "${FILES}" ]; then
           NOTE="WARN $NOTE"
        else
           NOTE="OK no deleted files"
        fi
     else
        NOTE="UNK You must execute lsof with root privileges"
     fi

So the script acknowledges that running lsof requires root privileges, to the point of giving that condition its own error message. But what then? There is no option to make the script run lsof via sudo.

NRPE should never be run as root, so I'm not even considering that option, although it seems this script is expected to run as root.

Invalid perfdata output for `pmp-check-mysql-status`

pmp-check-mysql-status seems to include invalid performance data in its output.
This causes errors when monitoring systems such as Icinga process the plugin's output, as shown below. According to the Nagios guidelines, performance data may not contain strings:

Ignoring invalid perfdata value: wsrep_local_state_comment=Synced;Synced;;0;

I found a pull-request that addresses this issue, but it seems to be inactive at the moment. #38

This issue could be circumvented by adding an option to omit performance data, or an option to emit only Nagios-compliant values in the performance data.
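
As a rough sketch of the second suggestion, the perfdata string could be filtered so that only entries with a numeric value remain. The function name `numeric_perfdata` is hypothetical and not part of the plugin. The sketch assumes entries are separated by whitespace and labels contain no spaces, and it does not handle the special value `U`.

```python
import re

# A value is an optional sign, digits, an optional fraction and an
# optional unit of measurement (seconds, percent, bytes, counter).
_NUMERIC_VALUE = re.compile(r"^-?\d+(\.\d+)?(us|ms|s|%|B|KB|MB|GB|TB|c)?$")


def numeric_perfdata(perfdata):
    """Keep only the perfdata entries whose value is numeric.

    Each entry looks like label=value;warn;crit;min;max and only the
    value part (before the first semicolon) is inspected.
    """
    kept = []
    for entry in perfdata.split():
        label, sep, rest = entry.partition("=")
        if not sep:
            continue  # not a label=value pair, drop it
        value = rest.split(";", 1)[0]
        if _NUMERIC_VALUE.match(value):
            kept.append(entry)
    return " ".join(kept)


if __name__ == "__main__":
    sample = "wsrep_local_state_comment=Synced;Synced;;0; wsrep_cluster_size=3;;;0;"
    print(numeric_perfdata(sample))
```

Applied to the line rejected by Icinga above, this would drop the `wsrep_local_state_comment` entry and keep any numeric entries next to it.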

Add max_duration to Cacti and Zabbix templates

It appears that the Nagios template has been updated to add max_duration, which just outputs the longest-running transaction. This is a good metric to have for alerting on long-running transactions. However, it appears that the Cacti and Zabbix templates were NOT updated.

Is there a reason this was only implemented in the Nagios template, and not the other templates? Can this be added?

Or, maybe this data point is just named something different? If so, can someone please point it out?

Thanks,
Kelly Shutt
Vargo Companies

Support for MongoDB Monitoring Template for zabbix

pmp-check-mongo check_cannary_test throws FAILED: 'float' object has no attribute 'total_seconds'

Hi!

I get an error from pmp-check-mongo when I try to use the check_cannary_test action.

Error Message

/usr/lib64/nagios/plugins/pmp-check-mongo.py -H localhost -P 27017 -u <user> -p <password> -A check_cannary_test -W 3 -C 5 -d test1 -c test -q "db.test.find()"

CRITICAL - Collection test1.test  query FAILED: 'float' object has no attribute 'total_seconds'

The user and password are fine: I can log in, and the other check methods work as well. Only the canary test fails.

The database and collection exist:
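
The message indicates that `total_seconds()` is being called on a `float`, whereas that method exists only on `datetime.timedelta`. How the plugin computes the elapsed query time is not shown here, so the following is only a sketch with a hypothetical helper name, `to_seconds`, that accepts either type.

```python
from datetime import timedelta


def to_seconds(elapsed):
    """Return an elapsed time in seconds for a timedelta or a plain number."""
    if isinstance(elapsed, timedelta):
        return elapsed.total_seconds()
    # Already a number of seconds (for example a difference of time.time() values).
    return float(elapsed)


if __name__ == "__main__":
    print(to_seconds(timedelta(milliseconds=1500)))
    print(to_seconds(0.25))
```

Comparing the result of such a helper against the `-W` and `-C` thresholds would avoid the `AttributeError` regardless of how the duration was measured.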

PRIMARY> db.test
PRIMARY> db.test.find()
{ "_id" : ObjectId("5abce2eda51d76dba2666695"), "name" : "test1" }

Server
OS: Debian 9 stretch 64 bit, python 2.7
pmp-nagios-plugins from percona repository:

Package: percona-nagios-plugins
Version: 1.1.8-1.stretch
New: yes
State: installed
...
Maintainer: Percona LLC

regards and thanks in advance,
david

The mysql plugin does not work with cacti 1.1.6

I used to run an old version of the plugin, and it stopped working after I updated the server a couple of months ago. I updated Cacti, removed everything Percona-related ("Data Input Methods", "Device Templates", "Graph Templates", "Data Source Templates", "CDEFs", "GPRINTs"), imported the template XML file from the latest release of your plugin (cacti_host_template_percona_mysql_server_ht_0.8.6i-sver1.1.7.xml), and updated the PHP scripts on the server.

I re-added everything to my devices, but nothing works: all Percona graphs are empty. However, when I run the plugin manually, it returns data, for example:
command:
/usr/bin/php -q /srv/www/cacti/scripts/ss_get_mysql_stats.php --host db-02.col.end --items iz,ir
output:
ir:4 iz:164219

Any idea what is going on?
Thanks

Can't specify database with pmp-check-mysql-replication-delay

With pt-heartbeat I am writing entries to the table "heartbeat" in the database "heartbeat" on our master server:

pt-heartbeat -D heartbeat -h mysql-master -u root -pbar --update --create-table

When trying to use pmp-check-mysql-replication-delay on the slave to check the delay I have trouble reading those values:

# /usr/lib64/nagios/plugins/pmp-check-mysql-replication-delay -H mysql-slave -l root -p bar -T heartbeat.heartbeat
/usr/lib64/nagios/plugins/pmp-check-mysql-replication-delay: 71: [: unexpected operator
Warning: Using a password on the command line interface can be insecure.
CRIT 4286 seconds of replication delay | replication_delay=4286;300;600;0;

The value I get is correct, but the error message confuses me.

Snippet from pmp-check-mysql-replication-delay:

     64    # Get replication delay from a heartbeat table or from SHOW SLAVE STATUS.
     65    if [ "${OPT_TABLE}" ]; then
     66       if [ -z "${OPT_UTC}" ]; then
     67          NOW_FUNC='UNIX_TIMESTAMP()'
     68       else
     69          NOW_FUNC='UNIX_TIMESTAMP(UTC_TIMESTAMP)'
     70       fi
     71       if [ "${OPT_SRVID}" == "MASTER" ]; then

Only specifying the table does not work either:

# /usr/lib64/nagios/plugins/pmp-check-mysql-replication-delay -H mysql-slave -l root -p bar -T heartbeat
/usr/lib64/nagios/plugins/pmp-check-mysql-replication-delay: 71: [: unexpected operator
Warning: Using a password on the command line interface can be insecure.
ERROR 1046 (3D000) at line 1: No database selected
UNK could not determine replication delay

I'm using the latest version:

# /usr/lib64/nagios/plugins/pmp-check-mysql-replication-delay --version
Percona Monitoring Plugins pmp-check-mysql-replication-delay 1.1.6

Support for multi-source MySQL replication

As of version 5.7, MySQL supports multi-source replication.

Example commands:

show slave status for channel 'mymaster1'\G
show slave status for channel 'mymaster2'\G

Is this something you are considering adding?

check_election() in pmp-check-mongo.py doesn't do anything useful

I think the intention with this function is to compare the previous state to the current state, but what it's actually doing is comparing the current state to itself, so it will always return OK.

https://github.com/percona/percona-monitoring-plugins/blob/master/nagios/bin/pmp-check-mongo.py#L411-L424
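
To illustrate what comparing the previous state with the current one might look like, here is a minimal sketch. It is not the plugin's code: the function signature, the JSON state file and the return convention (exit code plus message) are all assumptions made for the example.

```python
import json
import os
import tempfile


def check_election(current_primary, state_file):
    """Compare the current primary with the one recorded by the previous run.

    Returns (exit_code, message) using Nagios conventions: 0 = OK, 1 = WARNING.
    """
    previous = None
    if os.path.exists(state_file):
        with open(state_file) as fh:
            previous = json.load(fh).get("primary")

    # Record the current state only after the previous one has been read,
    # otherwise the check would compare the current state with itself.
    with open(state_file, "w") as fh:
        json.dump({"primary": current_primary}, fh)

    if previous is None or previous == current_primary:
        return 0, "OK - primary is %s" % current_primary
    return 1, "WARNING - primary changed from %s to %s" % (previous, current_primary)


if __name__ == "__main__":
    path = os.path.join(tempfile.mkdtemp(), "election_state.json")
    print(check_election("mongo1:27017", path))
    print(check_election("mongo2:27017", path))
```

The important detail is the ordering: the stored state is loaded before it is overwritten.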

unflushed_log handling in ss_get_mysql_stats.php

According to the comment in cacti/scripts/ss_get_mysql_stats.php

percona-monitoring-plugins/cacti/scripts/ss_get_mysql_stats.php

Lines 608 to 613 in d429df0

     # TODO: I'm not sure what the deal is here; need to debug this. But the
     # unflushed log bytes spikes a lot sometimes and it's impossible for it to
     # be more than the log buffer.
     debug("Unflushed log: $status[unflushed_log]");
     $status['unflushed_log']
        = max($status['unflushed_log'], $status['innodb_log_buffer_size']);

apparently unflushed_log:

spikes a lot sometimes and it's impossible for it to be more than the log buffer

but the code under this comment takes max(unflushed_log, innodb_log_buffer_size). This seems wrong: it makes innodb_log_buffer_size a lower bound for unflushed_log instead of capping it at that value, which would require min().
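
The difference is easy to see with concrete numbers. The sketch below uses Python's `max` and `min`, which behave like their PHP counterparts for integers; the 8 MiB buffer size and the helper names are assumed example values, not taken from the script.

```python
def floor_at(value, limit):
    """What the script does today: never report less than the limit."""
    return max(value, limit)


def cap_at(value, limit):
    """What the comment describes: never report more than the limit."""
    return min(value, limit)


LOG_BUFFER_SIZE = 8 * 1024 * 1024  # assumed example: 8 MiB log buffer

if __name__ == "__main__":
    for unflushed in (1024, 64 * 1024 * 1024):
        print(unflushed, floor_at(unflushed, LOG_BUFFER_SIZE), cap_at(unflushed, LOG_BUFFER_SIZE))
```

With `max`, a small unflushed value of 1024 bytes is reported as the full buffer size and a spike passes through unchanged; with `min`, small values are preserved and the spike is clipped to the buffer size.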

(I just happened to stumble upon this while looking at this code for no particular reason)

Ubuntu 18.04 support

Hello!
I'd like to ask whether percona-monitoring-plugins will be released for Ubuntu 18.04 LTS.

unknown db instance class in pmp-check-aws-rds.py

Getting Unknown DB instance class "db.r4.16xlarge" while running the script.

[root@localhost]$ /usr/lib64/nagios/plugins/pmp-check-aws-rds.py -i <instance> -m memory -w 10 -c 1
Unknown DB instance class "db.r4.16xlarge"
