Icinga2 checks over SSH

Normally, Icinga2 runs on every machine in the system: on the Master and Satellite nodes, where the configuration is managed (and where you probably have Icingaweb2 running as well), and also on the Endpoint nodes (the machines being monitored).

Sometimes, however, you can't, or don't want to, run Icinga2 on the monitored nodes, even though they can perfectly well run Icinga (Nagios) plugins (i.e. they can perform the checks, if only there's some way of telling them to and of getting the results back to Icinga).

Nagios and Icinga1 used NRPE (Nagios Remote Plugin Executor) for this, but the Icinga2 developers have decided it is insecure (and the insecurity isn't going to be fixed upstream), so NRPE is no longer an option with Icinga2 (or at least it is an explicitly deprecated one).

The alternative to running Icinga2 on the target machine is to use SSH as the transport: send the command to the target machine, run it there, and return the result to Icinga2.
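
To make the mechanism concrete, here is a minimal sketch of the kind of command line involved. Icinga2's "by_ssh" check command wraps the standard check_by_ssh plugin, which logs in to the target and runs a plugin there. The helper name ssh_check_cmdline, the load thresholds, and the assumption that plugins live in /usr/lib/nagios/plugins on both machines are illustrative only. The script prints the command rather than running it, so it is safe to try anywhere:

```shell
#!/bin/sh
# Print (without running) a check_by_ssh invocation for one remote plugin.
# Assumptions: plugins live in /usr/lib/nagios/plugins on both machines,
# the remote account is "nagios", and tar.get.ser.ver is a placeholder host.
PLUGIN_DIR=/usr/lib/nagios/plugins

ssh_check_cmdline() {
    # $1 = target host, $2 = SSH port, $3 = plugin command line to run remotely
    printf '%s/check_by_ssh -H %s -p %s -l nagios -C "%s"\n' \
        "$PLUGIN_DIR" "$1" "$2" "$3"
}

ssh_check_cmdline tar.get.ser.ver 22 "$PLUGIN_DIR/check_load -w 5,4,3 -c 10,6,4"
```

Running the printed command by hand as the nagios user on the Icinga machine is a quick way to confirm that the SSH transport works before involving Icinga2 at all.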

The official Icinga2 documentation outlines how to implement SSH-based checks. However, there's a neat and efficient way of creating an SSH version of any existing service check, as follows:

  1. Define a Service template which appends "-ssh" to the check command, and sets the SSH port number if required:
    template Service "ssh-too" {
            check_command = check_command + "-ssh"
            vars.by_ssh_logname = "nagios"
            if ( host.vars.ssh_port ) { vars.by_ssh_port = host.vars.ssh_port }
    }
  2. Define a list of all the service checks you may want to perform over SSH, and programmatically create "-ssh" versions of them (this is the neat bit):
    var ssh_checks = [ "disk", "interface_bytecount", "load", "procs", "processes", "swap", "users" ]
    
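    for (ssh_check in ssh_checks) {
            // use() makes the loop variable visible inside the object definition
            object CheckCommand ssh_check+"-ssh" use(ssh_check) {
                    // inherit the original command line, arguments and defaults
                    import ssh_check
                    // hand the original command line to by_ssh as the remote command
                    vars.by_ssh_command = command
                    vars.by_ssh_arguments = arguments
                    // by_ssh then replaces command/arguments with the SSH wrapper
                    import "by_ssh"
            }
    }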

    There's probably a way of doing this so that all service check names get detected automatically, but:

    1. I don't know how to do that yet, and
    2. you might not want to create SSHised versions of all your service checks anyway.
  3. Finally, import the "ssh-too" template into each of your service check definitions (Note: this must come after check_command and any service check variables have been set, because the template modifies the existing check_command - for simplicity, put it at the end of the check definition):
    apply Service "load" {
            import "generic-service"
            check_command = "load"
            vars.load_percpu = true
            import "ssh-too"
            assign where (host.address || host.address6) && host.vars.os == "Linux"
    }
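
After changing the configuration it is worth validating it before reloading. The following is a minimal sketch, assuming it is run as root on the Icinga node and that Icinga2 is managed by systemd under the unit name "icinga2"; the function name validate_and_reload and the ICINGA2_BIN override are hypothetical conveniences:

```shell
#!/bin/sh
# Validate the Icinga2 configuration and reload only if validation succeeds.
# Assumptions: run as root on the Icinga node; systemd unit is named "icinga2".
ICINGA2_BIN=${ICINGA2_BIN:-icinga2}

validate_and_reload() {
    if ! command -v "$ICINGA2_BIN" >/dev/null 2>&1; then
        echo "icinga2 CLI not found; run this on the Icinga node" >&2
        return 127
    fi
    # "daemon -C" only validates the configuration; it does not start Icinga2
    "$ICINGA2_BIN" daemon -C || { echo "configuration validation failed" >&2; return 1; }
    systemctl reload icinga2
}
```

Call validate_and_reload after editing the templates or the ssh_checks list; a typo in a check name in that list should show up at the validation stage rather than after the reload.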

Note: the following instructions were created to work with a specific type of Gentoo-based target system. For other systems, you might need to adjust things to suit…

To make the checks work, you need to:

  1. install the check plugins onto the target machine
    • these should go into the standard directory of /usr/lib/nagios/plugins, which will probably need creating first
  2. create an account on the target machine which can run the checks
    • this needs to be an account with a login shell, but preferably with no password (see the next item below)
      # useradd -m nagios
  3. set up public keys so that the user on the Icinga machine can log in to perform the checks on the target machines
    • the home directory under Debian for the 'nagios' user (which is what, for historical reasons, Icinga2 runs as) is /var/lib/nagios, so create a .ssh directory under it and set it to be owned by 'nagios:'
    • the nagios user on the Icinga machine has no login shell, so create an SSH key pair on the parent node as the root user and make it owned by the nagios user:
      # ssh-keygen
      Generating public/private rsa key pair.
      Enter file in which to save the key (/root/.ssh/id_rsa): /var/lib/nagios/.ssh/id_rsa
      Enter passphrase (empty for no passphrase): 
      Enter same passphrase again: 
      Your identification has been saved in /var/lib/nagios/.ssh/id_rsa.
      Your public key has been saved in /var/lib/nagios/.ssh/id_rsa.pub.
      The key fingerprint is:
      1e:59:c2:f6:81:f9:c0:02:74:92:14:c2:33:a4:ae:46 root@eye.eye.callswitch.net
      # chown nagios: /var/lib/nagios/.ssh/id_rsa*
    • /var/lib/nagios/.ssh/id_rsa.pub now needs copying into /home/nagios/.ssh/authorized_keys on the target system/s
    • finally, SSH from the parent Icinga server to each target machine as the nagios user and accept the host key, then copy the resulting known_hosts file to the nagios user's .ssh directory:
      # ssh -i /var/lib/nagios/.ssh/id_rsa nagios@tar.get.ser.ver ls
      The authenticity of host '[XXXXX] ([192.0.2.44])' can't be established.
      ECDSA key fingerprint is ec:b0:bb:02:13:91:2e:a9:66:26:22:f6:2c:a3:8d:65.
      Are you sure you want to continue connecting (yes/no)? yes
      Warning: Permanently added '[XXXXX],[192.0.2.44]' (ECDSA) to the list of known hosts.
      # cp ~/.ssh/known_hosts /var/lib/nagios/.ssh
      # chown nagios: /var/lib/nagios/.ssh/known_hosts

      The nagios user on the parent server should now be able to SSH to the target machine/s and run check plugins.
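
The step of copying the public key into authorized_keys on the target can be sketched as a small helper. This is an illustration rather than part of the original procedure: the function name install_pubkey is hypothetical, and it assumes the public key file has already been transferred to the target machine. It creates the .ssh directory with the restrictive permissions that sshd normally expects and avoids adding the same key twice:

```shell
#!/bin/sh
# Append a public key to an account's authorized_keys file on the target.
# Hypothetical helper; run it on the target machine.
install_pubkey() {
    # $1 = public key file, $2 = home directory of the target account
    pubkey="$1"
    home="$2"
    [ -r "$pubkey" ] || { echo "cannot read $pubkey" >&2; return 1; }
    mkdir -p "$home/.ssh" && chmod 700 "$home/.ssh" || return 1
    # Skip the append if this exact key line is already present
    if ! grep -qxF -f "$pubkey" "$home/.ssh/authorized_keys" 2>/dev/null; then
        cat "$pubkey" >> "$home/.ssh/authorized_keys"
    fi
    chmod 600 "$home/.ssh/authorized_keys"
}
```

For example, install_pubkey /tmp/id_rsa.pub /home/nagios (the /tmp location is an assumption). If it is run as root rather than as the nagios user, follow it with chown -R nagios: /home/nagios/.ssh so that the account owns its own key files.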

