Integrating PNP4Nagios with Icinga2

Once you've installed Icinga2 and got it performing some checks, you should install PNP4Nagios.

Without that, you have no check history, no trends, no way of seeing "what's normal for this check at this time of day / week / whatever?".

In my opinion PNP4Nagios should be a standard feature which can be installed alongside Icinga2.

Here's how to get it working under Debian / Devuan. The instructions below work as-is for Wheezy and Jessie. For Stretch / Ascii, you'll first need to configure both the Jessie and the Jessie-backports repositories, since pnp4nagios is only in Jessie-backports and depends on some PHP5 packages which are only in Jessie, plus the Jessie version of rrdtool:

  1. If you're installing on Stretch / Ascii, start by installing the rrdtool package from Jessie, otherwise you'll get one which no longer understands some of the options used by pnp4nagios:
    # aptitude install -t jessie rrdtool
    • You probably want to pin this package to stop it getting upgraded to the broken one later on, by creating this file:
    • /etc/apt/preferences.d/rrdtool
      Package: rrdtool
      Pin: release n=jessie
      Pin-Priority: 1001
  2. Install the pnp4nagios package without recommended packages:
    # aptitude install -R pnp4nagios
    • This should install ~30-40 packages. If you allow it to install recommended packages as well, you'll get more like twice that number, including Icinga1 and Samba!
  3. If you're installing on Jessie or later, also install the package pnp4nagios-web-config-icinga
  4. If you're installing on Stretch or Ascii, also install the package php-gd
  5. Turn on performance data reporting in Icinga2:
    # icinga2 feature enable perfdata
  6. Edit /etc/default/npcd and set RUN="yes"
  7. Edit /etc/pnp4nagios/config.php
    • Uncomment (remove the # at the start of) the line $views[] = array('title' => 'One Hour', 'start' => (60*60) );
  8. Edit /etc/pnp4nagios/npcd.cfg
    • Change perfdata_spool_dir = /var/spool/pnp4nagios/npcd/ to perfdata_spool_dir = /var/spool/icinga2/perfdata
  9. On Wheezy, edit /etc/pnp4nagios/apache.conf; on Jessie, Ascii or Stretch, edit /etc/apache2/conf-available/pnp4nagios.conf:
    • Comment-out (insert a # at the start) or delete the lines:
      AuthName "Icinga Access"
      AuthType Basic
      AuthUserFile /etc/icinga/htpasswd.users
      Require valid-user
  10. Restart icinga2 and npcd, and reload Apache:
    # /etc/init.d/icinga2 restart
    # /etc/init.d/npcd restart
    # /etc/init.d/apache2 reload
  11. Install the Icinga PNP module:
    • Download the ZIP file to /usr/share/icingaweb2/modules and unpack it:
      # cd /usr/share/icingaweb2/modules
      # wget https://github.com/Icinga/icingaweb2-module-pnp/archive/master.zip
      # unzip master.zip
    • Rename the folder to just pnp:
      # mv icingaweb2-module-pnp-master pnp
    • You can now remove the ZIP file if you wish
    • Enable the module in Icingaweb2:
      • Configuration (left hand menu)
      • Modules
      • Click on pnp
      • Click on enable in the right-hand pane
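
The file edits in steps 6, 8 and 9 can also be scripted rather than done by hand. What follows is only a minimal sketch, not a tested installer. The function names are made up for illustration, it assumes GNU sed (as shipped with Debian / Devuan), and it assumes the stock files contain a line starting with RUN=, a perfdata_spool_dir line, and the four auth directives exactly as listed in step 9.

```shell
# Hypothetical helpers; each takes the path of the file to edit as its argument.

# Step 6: set RUN="yes" (assumes the file already has a line starting with RUN=)
enable_npcd() {
    sed -i 's/^RUN=.*/RUN="yes"/' "$1"
}

# Step 8: point npcd at the spool directory Icinga2 writes perfdata to
set_spool_dir() {
    sed -i 's|^perfdata_spool_dir[[:space:]]*=.*|perfdata_spool_dir = /var/spool/icinga2/perfdata|' "$1"
}

# Step 9: comment out the four auth directives, keeping their indentation.
# Lines that are already commented out no longer match, so re-running is harmless.
disable_auth() {
    sed -i -E 's/^([[:space:]]*)(AuthName|AuthType|AuthUserFile|Require valid-user)/\1#\2/' "$1"
}
```

As root, you would call these as, for example, enable_npcd /etc/default/npcd, set_spool_dir /etc/pnp4nagios/npcd.cfg and disable_auth /etc/apache2/conf-available/pnp4nagios.conf, then carry on with the restarts in step 10. It is worth trying them on copies of the files first.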

Service checks should now start showing clickable PNP4Nagios graphs.

Note

The default graphing template for PNP4Nagios uses the "AVERAGE" consolidation function when calculating values over long time periods, and this can very often give highly misleading results. It works far better if you change this line in /usr/share/pnp4nagios/html/templates.dist/default.php:

$def[$KEY]  = rrd::def     ("var1", $VAL['RRDFILE'], $VAL['DS'], "AVERAGE");

to:

$def[$KEY]  = rrd::def     ("var1", $VAL['RRDFILE'], $VAL['DS'], "MAX");
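
If you would rather script that change than edit the file by hand, something like the following should do it. This is a sketch under assumptions: the function name is hypothetical, it relies on GNU sed, and it rewrites every rrd::def line in the file that uses "AVERAGE", not just the first one. It keeps the original as a .bak copy next to the file.

```shell
# Hypothetical helper: switch rrd::def lines from "AVERAGE" to "MAX" in the
# template file given as $1, leaving a backup in "$1.bak".
use_max_cf() {
    sed -i.bak '/rrd::def/ s/"AVERAGE"/"MAX"/' "$1"
}
```

For example, as root: use_max_cf /usr/share/pnp4nagios/html/templates.dist/default.php. Since that file lives under /usr/share, a package upgrade will probably put the original back, so you may need to reapply the change afterwards.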

The reason why "average" is not a good idea is neatly summed up in a quote from Stéphane Bortzmeyer:

Measuring average network latency is about as useful as measuring
the mean temperature of patients in a hospital.

Suppose you are measuring network latency (ping round-trip times), and over a ten-minute period you get nine results of 5ms and one result of 100ms.

If you look at a graph of this showing 1-minute intervals, you will see the 100ms spike quite clearly.

If you then look at a longer timescale with 10-minute intervals, though, "average" would show this interval as having a value of 14.5ms ((9 × 5ms + 100ms) / 10). You would probably still prefer to see that there was a 100ms peak sometime during that 10-minute window, and that is what "max" will show you.
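
The arithmetic in that example is easy to check for yourself. The snippet below is purely illustrative: it works on the ten made-up latency samples from the example, not on real RRD data, and the variable names are arbitrary.

```shell
# Ten one-minute ping results in ms: nine of 5ms and one spike of 100ms
samples="5 5 5 5 5 5 5 5 5 100"

# What an "average" consolidation of the 10-minute window would report
avg=$(printf '%s\n' $samples | awk '{ sum += $1 } END { printf "%.1f", sum / NR }')

# What a "max" consolidation of the same window would report
max=$(printf '%s\n' $samples | awk 'NR == 1 || $1 > m { m = $1 } END { print m }')

echo "average: ${avg}ms, max: ${max}ms"
# prints "average: 14.5ms, max: 100ms"
```

The average hides the spike among the nine normal samples, while the maximum preserves it.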

