Integrating Nagios with Bright Cluster Manager
[This is only valid for Bright 7.2, and not for later versions.]
The integration between Nagios and Bright is done via a Nagios-BCM interface. The Nagios-BCM interface allows sending notifications from BCM to a remote Nagios server, using the Nagios Service Check Acceptor (NSCA).
The BCM monitoring framework allows monitoring rules to be defined, along with the actions that they trigger. BCM users can therefore specify the following details for each notification sent to Nagios:
- service
- type (OK, WARNING, CRITICAL, UNKNOWN)
- content
The configurations for the server side (Nagios server) and the client side (BCM head node) are shown below.
Server Side Configurations
Install Nagios on the Nagios server
[root@ma-b72-c7-nagios ~]# yum install nagios nsca nrpe nagios-plugins nagios-plugins-{ping,disk,users,procs,load,swap,ssh,http}
Add nagios admin user
[root@ma-b72-c7-nagios ~]# htpasswd /etc/nagios/passwd nagiosadmin
Enable services
[root@ma-b72-c7-nagios ~]# systemctl start nagios
[root@ma-b72-c7-nagios ~]# systemctl enable nagios.service
Created symlink from /etc/systemd/system/multi-user.target.wants/nagios.service to /usr/lib/systemd/system/nagios.service.
[root@ma-b72-c7-nagios ~]#
[root@ma-b72-c7-nagios ~]# systemctl restart httpd
Access Nagios Admin web portal
http://b72-c7-nagios/nagios/
Authenticate with the nagiosadmin/system credentials
Add Nagios Command
[root@ma-b72-c7-nagios ~]# cat /etc/nagios/objects/commands.cfg
[...]
# 'check_cmd' command definition
define command{
        command_name    check_cmd
        command_line    $USER1$/check_cmd -H $HOSTADDRESS$ $ARG1$
        }
[...]
Add Nagios Host and Service
[root@ma-b72-c7-nagios ~]# cat /etc/nagios/objects/bright.cfg
define host{
        use                     linux-server    ; Inherit default values from a template
        host_name               b72-c7          ; The name we're giving to this host
        alias                   Bright          ; A longer name associated with the host
        address                 10.2.59.186     ; IP address of the host
        }
define service{
        use                     generic-service
        host_name               b72-c7
        service_description     CMDaemon
        check_command           check_cmd
        }
[root@ma-b72-c7-nagios ~]# cat /etc/nagios/nagios.cfg
[...]
cfg_file=/etc/nagios/objects/bright.cfg
[...]
[root@ma-b72-c7-nagios ~]# systemctl restart nagios.service
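The cfg_file entry can also be added by a script, in a way that is safe to run more than once. The following is a minimal sketch under that assumption; the function name ensure_cfg_file is hypothetical and is not part of Nagios or Bright.

```shell
# ensure_cfg_file is a hypothetical helper, not part of Nagios or Bright.
# It appends "cfg_file=<object file>" to a Nagios main configuration file
# only if that exact line is not already present.
ensure_cfg_file() {
    local main_cfg="$1" object_cfg="$2"
    local line="cfg_file=${object_cfg}"
    # -x matches the whole line, -F treats the pattern as a fixed string
    if ! grep -qxF -- "$line" "$main_cfg"; then
        printf '%s\n' "$line" >> "$main_cfg"
    fi
}

# Example, using the paths from the listing above:
# ensure_cfg_file /etc/nagios/nagios.cfg /etc/nagios/objects/bright.cfg
```

Running the function a second time leaves the file unchanged, so it can be kept in a provisioning script.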
Check the validity of the configurations
[root@ma-b72-c7-nagios ~]# /usr/sbin/nagios -v -d /etc/nagios/nagios.cfg
Reading configuration data...
Read main config file okay...
Read object config files okay...
Running pre-flight check on configuration data...
Checking objects...
Checked 9 services.
Checked 2 hosts.
Checked 1 host groups.
Checked 0 service groups.
Checked 1 contacts.
Checked 1 contact groups.
Checked 25 commands.
Checked 5 time periods.
Checked 0 host escalations.
Checked 0 service escalations.
Checking for circular paths...
Checked 2 hosts
Checked 0 service dependencies
Checked 0 host dependencies
Checked 5 timeperiods
Checking global event handlers...
Checking obsessive compulsive processor commands...
Checking misc settings...
Total Warnings: 0
Total Errors: 0
Things look okay - No serious problems were detected during the pre-flight check
[root@ma-b72-c7-nagios ~]#
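The two summary lines at the end of the pre-flight check ("Total Warnings" and "Total Errors") are the ones that matter when the check is run from a script. A minimal sketch of how they might be evaluated follows; the function name preflight_ok is hypothetical.

```shell
# preflight_ok is a hypothetical helper. It reads the output of
# "nagios -v" on stdin and succeeds only if both summary counters
# are present and equal to zero.
preflight_ok() {
    awk '
        /Total Warnings:/ { warnings = $NF + 0; seen_w = 1 }
        /Total Errors:/   { errors   = $NF + 0; seen_e = 1 }
        END { exit !(seen_w && seen_e && warnings == 0 && errors == 0) }
    '
}

# Example: restart Nagios only when the check is clean
# /usr/sbin/nagios -v /etc/nagios/nagios.cfg | preflight_ok && systemctl restart nagios.service
```

Output without the summary lines is treated as a failure, so a crash of the check itself does not pass silently.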
Enable and start NSCA server
[root@ma-b72-c7-nagios ~]# systemctl enable nsca.service
[root@ma-b72-c7-nagios ~]# systemctl start nsca.service
Client Side Configurations
Install the NSCA client on the head node to be monitored (Nagios Client)
[root@ma-b72-c7 ~]# yum install nsca-client.x86_64
Add the NSCA server to /etc/hosts
[root@ma-b72-c7 ~]# cat /etc/hosts
[...]
10.2.60.170 nagios-server
Test NSCA monitoring
[root@ma-b72-c7 ~]# echo "ma-b72-c7;CMDaemon;0;test-output" | send_nsca -H nagios-server -p 5667 -c /etc/nagios/send_nsca.cfg -d ";"
(logs on the server side)
Mar 25 12:59:58 ma-b72-c7-nagios nsca[30028]: Handling the connection...
Mar 25 12:59:58 ma-b72-c7-nagios nsca[30028]: Time difference in packet: 0 seconds for host ma-b72-c7
Mar 25 12:59:58 ma-b72-c7-nagios nsca[30028]: SERVICE CHECK -> Host Name: 'ma-b72-c7', Service Description: 'CMDaemon', Return Code: '0', Output: 'test-output'
Mar 25 12:59:58 ma-b72-c7-nagios nsca[30028]: Attempting to write to nagios command pipe
Mar 25 12:59:58 ma-b72-c7-nagios nsca[30028]: End of connection...
[root@ma-b72-c7 ~]# echo "ma-b72-c7;CMDaemon;2;test-output" | send_nsca -H nagios-server -p 5667 -c /etc/nagios/send_nsca.cfg -d ";"
1 data packet(s) sent to host successfully.
(logs from the NSCA server)
Mar 25 16:32:20 ma-b72-c7-nagios nsca[30028]: Handling the connection...
Mar 25 16:32:20 ma-b72-c7-nagios nsca[30028]: Time difference in packet: 0 seconds for host ma-b72-c7
Mar 25 16:32:20 ma-b72-c7-nagios nsca[30028]: SERVICE CHECK -> Host Name: 'ma-b72-c7', Service Description: 'CMDaemon', Return Code: '2', Output: 'test-output'
Mar 25 16:32:20 ma-b72-c7-nagios nsca[30028]: Attempting to write to nagios command pipe
Mar 25 16:32:20 ma-b72-c7-nagios nsca[30028]: End of connection...
Mar 25 16:32:20 ma-b72-c7-nagios nagios: EXTERNAL COMMAND: PROCESS_SERVICE_CHECK_RESULT;ma-b72-c7;CMDaemon;2;test-output
Mar 25 16:32:20 ma-b72-c7-nagios nagios: PASSIVE SERVICE CHECK: ma-b72-c7;CMDaemon;2;test-output
Mar 25 16:32:20 ma-b72-c7-nagios nagios: SERVICE ALERT: ma-b72-c7;CMDaemon;CRITICAL;SOFT;2;test-output
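The test lines above have the form host;service;return code;output. A minimal sketch of helpers that build such a line from a state name is shown next; the names nsca_state_code and nsca_line are hypothetical, and the ";" delimiter is assumed to match the -d ";" option used in the tests.

```shell
# nsca_state_code and nsca_line are hypothetical helpers.
# nsca_state_code maps a Nagios state name to its numeric return code.
nsca_state_code() {
    case "$1" in
        OK)       echo 0 ;;
        WARNING)  echo 1 ;;
        CRITICAL) echo 2 ;;
        UNKNOWN)  echo 3 ;;
        *)        echo "unknown state: $1" >&2; return 1 ;;
    esac
}

# nsca_line prints one passive service check result, with ";" as the
# field delimiter (this must match the -d option given to send_nsca).
nsca_line() {
    local host="$1" service="$2" state="$3" output="$4" code
    code=$(nsca_state_code "$state") || return 1
    printf '%s;%s;%s;%s\n' "$host" "$service" "$code" "$output"
}

# Example:
# nsca_line ma-b72-c7 CMDaemon CRITICAL test-output |
#     send_nsca -H nagios-server -p 5667 -c /etc/nagios/send_nsca.cfg -d ";"
```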
Add a Custom Action
Copy nagios_if.pl to a convenient location, e.g. /cm/local/apps/cmd/scripts/actions/
Important notes on NSCA
- It is strongly suggested to use the same version for both the NSCA daemon and the NSCA client. There is a known incompatibility between versions 2.7 and 2.9 of the client and server packages.
- Check the value returned by `hostname` on the head node. It should match the host definition in Nagios.
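The host name comparison can be checked mechanically if a copy of the Nagios object file is at hand. The following is a minimal sketch under that assumption; the function name host_is_defined is hypothetical.

```shell
# host_is_defined is a hypothetical helper. It succeeds if <name> appears
# as the value of any host_name directive in the given Nagios object file.
host_is_defined() {
    local name="$1" cfg="$2"
    awk -v want="$name" '
        $1 == "host_name" {
            value = $2
            sub(/;.*/, "", value)      # drop a trailing ";comment"
            if (value == want) found = 1
        }
        END { exit !found }
    ' "$cfg"
}

# Example, run where a copy of bright.cfg is available:
# host_is_defined "$(hostname)" bright.cfg || echo "host name mismatch"
```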
Configuring Bright Actions to send NSCA messages to the NSCA server
Example of usage in CMGUI
In "Monitoring Configuration -> Actions", define an action with:
Name: Nagios interface
Command: /cm/local/apps/cmd/scripts/actions/nagios_if.pl
In "Monitoring Configuration -> Overview", define a rule with:
Action: Nagios interface
Action Parameter: "BCM,0,CPUUser less than 20 %" (with quotation marks)
The action parameter is a single parameter (hence the use of quotation marks). It consists of three values, separated by commas:
- Name of the service: should match the "service_description" in the service definition in Nagios
- Type: a single digit representing the type of notification, according to the values used by Nagios
  0 OK
  1 Warning
  2 Critical
  3 Unknown
- Comment: a string representing status information and optional performance data, separated by "|" (pipe)
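To make the three-part format concrete, here is a minimal sketch that splits an action parameter into its fields. The function name split_action_param is hypothetical, and splitting at the first two commas (so that the comment may itself contain commas) is an assumption of this sketch.

```shell
# split_action_param is a hypothetical helper. It splits
# "<service>,<code>,<comment>" at the first two commas and prints the
# three fields on separate lines. It fails if the parameter has fewer
# than two commas or if the code is not a digit from 0 to 3.
split_action_param() {
    local param="$1" service rest code comment
    case "$param" in *,*,*) ;; *) return 1 ;; esac
    service=${param%%,*}     # text before the first comma
    rest=${param#*,}         # text after the first comma
    code=${rest%%,*}         # text between the first and second comma
    comment=${rest#*,}       # everything after the second comma
    case "$code" in [0-3]) ;; *) return 1 ;; esac
    printf '%s\n%s\n%s\n' "$service" "$code" "$comment"
}

# Example:
# split_action_param "BCM,0,CPUUser less than 20 %"
```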
File Listing
nagios_if.pl
##################################################
#!/usr/bin/perl
#
# Copyright (c) 2004-2013 Bright Computing Holding BV. All Rights Reserved.
#
# This software is the confidential and proprietary information of
# Bright Computing Holding BV ("Confidential Information"). You shall not
# disclose such Confidential Information and shall use it only in
# accordance with the terms of the license agreement you entered into
# with Bright Computing Holding BV or its subsidiaries.
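use strict;
use warnings;

# The single action parameter has the form "<service>,<code>,<comment>".
my $params = shift // '';
if ($params =~ /^([^,]+),([0-3]),(.*)$/) {
    my $host = `hostname`;
    chomp $host;
    my $service = $1;
    my $retcode = $2;
    my $comment = $3;
    # send_nsca expects tab-separated fields by default.
    my $tosend = "$host\t$service\t$retcode\t$comment";
    # Pipe the line to send_nsca without going through a shell, so that
    # quotes or "$" in the comment cannot break the command.
    open(my $nsca, '|-', 'send_nsca', '-H', 'nagios-server',
         '-c', '/etc/nagios/send_nsca.cfg')
        or die "cannot run send_nsca: $!\n";
    print {$nsca} "$tosend\n";
    close($nsca) or exit 1;
}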
##################################################