HOWTO Install Nagios with graphing services
From Gentoo Linux Wiki
Introduction
Nagios is a fairly complex tool for monitoring the status of IT infrastructure, everything from web servers to routers. Installation is a complex and lengthy procedure, and that's before you even start on the plugins required to actually make it useful! This article is designed to simplify the process: rather than a distribution-independent guide, it gives you the 'down and dirty' of how to get Nagios working on Gentoo.
Notice: comments and constructive criticism are welcome on the discussion and bugs page!
About Nagios
From the www.nagios.org homepage:
- Nagios® is a host and service monitor designed to inform you of network problems before your clients, end-users or managers do. It has been designed to run under the Linux operating system, but works fine under most *NIX variants as well. The monitoring daemon runs intermittent checks on hosts and services you specify using external "plugins" which return status information to Nagios. When problems are encountered, the daemon can send notifications out to administrative contacts in a variety of different ways (email, instant message, SMS, etc.). Current status information, historical logs, and reports can all be accessed via a web browser.
For a further impression take a look at the screenshots or read the Wikipedia article on it.
There's also an extended hack for Nagios that you should try (not in production, however) if you want to use the experimental NagiosGrapher. See bugs.gentoo.org and [1].
Nagios Server Installation
Installation Prerequisites
Since version 2.x, native database support for storing various types of data (status, retention, comment, downtime, etc.) in MySQL and PostgreSQL has been dropped. This guide assumes that:
- apache2 is installed
- mod_perl is not enabled in apache2; with it enabled, restarting Apache slows down considerably (minutes)
- nmap is installed, since some of my own examples below use it
For version < 2.x:
- mysql is installed
| Code: Now let's prove our system complies |
$ file /usr/sbin/apache2
/usr/sbin/apache2: ELF 32-bit LSB executable, Intel 80386, version 1 (SYSV), for GNU/Linux 2.4.1, dynamically linked (uses shared libs), stripped
$ # add some smart commands to check mod_perl not being loaded
$ grep -Ri 'mod_perl' /etc/apache2/*
$ which nmap
/usr/bin/nmap
|
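The checks above can also be wrapped in a small script. The following is a minimal sketch, not part of the original guide: the helper names have_cmd and mod_perl_mentioned are made up for illustration, and the grep test is only a rough heuristic, since a mention of mod_perl inside an inactive IfDefine block would still match.

```shell
#!/bin/sh
# Hypothetical helpers for illustration only.

# Succeeds if the named command can be found on PATH.
have_cmd() {
    command -v "$1" >/dev/null 2>&1
}

# Succeeds if any non-comment line below the given directory mentions mod_perl.
mod_perl_mentioned() {
    grep -Rih 'mod_perl' "$1" 2>/dev/null | grep -vq '^[[:space:]]*#'
}

# Example use (paths as used in this guide):
#   have_cmd nmap || echo "nmap is missing"
#   mod_perl_mentioned /etc/apache2 && echo "mod_perl is referenced; check that it is not loaded"
```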
Installing Nagios
Let's first take a look at the USE flags. Depending on your setup you may get more or fewer lines in your output. Type the following command:
# emerge -pv net-analyzer/nagios
These are the packages that I would merge, in order:
Calculating dependencies ...done!
[ebuild N ] net-analyzer/nagios-plugins-1.4.2 -ipv6 -ldap +mysql +nagios-dns -nagios-game -nagios-ntp +nagios-ping +nagios-ssh -postgres -radius +samba +snmp +ssl -ups 948 kB
[ebuild N ] net-analyzer/nagios-nsca-2.4 53 kB
[ebuild N ] mail-client/mailx-support-20030215 8 kB
[ebuild N ] net-libs/liblockfile-1.06 31 kB
[ebuild N ] mail-client/mailx-8.1.2.20040524-r1 126 kB
[ebuild N ] net-analyzer/nagios-core-1.2-r4 +apache2 -debug +mysql -noweb +perl -postgres 1,587 kB
[ebuild N ] net-analyzer/nagios-imagepack-1.0 1,610 kB
[ebuild N ] net-analyzer/nagios-nrpe-2.0-r1 -command-args +ssl 50 kB
[ebuild N ] net-analyzer/nagios-1.2 0 kB
Total size of downloads: 4.416 kB
Especially notice:
- your screen output will differ, as this output was captured for version 1.2, while this example builds 1.4
- the USE flags for net-analyzer/nagios-plugins. You may choose to update your /etc/portage/package.use to enable monitors. For my home server I enabled nagios-ping and nagios-ssh.
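Enabling the plugin USE flags mentioned above could be scripted roughly as follows. This is a sketch under assumptions: set_use_flags is a hypothetical helper name, and it assumes /etc/portage/package.use is a plain file (as in this guide) rather than a directory.

```shell
#!/bin/sh
# Hypothetical helper: append "ATOM FLAG..." to a package.use file
# unless a line for ATOM is already present.
set_use_flags() {
    file=$1
    atom=$2
    shift 2
    if grep -q "^$atom[[:space:]]" "$file" 2>/dev/null; then
        echo "entry for $atom already present in $file" >&2
        return 0
    fi
    echo "$atom $*" >> "$file"
}

# Example (as root), using the flags chosen for the home server above:
#   set_use_flags /etc/portage/package.use net-analyzer/nagios-plugins nagios-ping nagios-ssh
```

An existing entry is deliberately left untouched, so flags you have already chosen are not overwritten.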
Now let us build the package. Type the following command:
| Code: Install Nagios |
# emerge net-analyzer/nagios
# cd /usr/local
# ln -s ../nagios nagios
# mkdir /var/log/nagios
# chmod 750 /var/log/nagios
# chown nagios:apache /var/log/nagios
|
Now we will set up the sample configuration files. Nagios appears to look for its config files directly under /etc, so we need to create symlinks.
| Code: Setup configuration files |
# cp /usr/share/doc/nagios-core-1.4.1/sample-configs/* /etc/nagios/
# cd /etc/nagios
# bunzip2 *.bz2
# rename .cfg-sample .cfg *.cfg-sample
# touch serviceextinfo.cfg
# chmod 640 *.cfg
# chown nagios:apache *.cfg
# ln -s /etc/nagios/nagios.cfg /etc/nagios.cfg
# cd /usr/nagios
# ln -s /etc/nagios/ etc
|
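The rename call above uses the util-linux argument order; some systems ship a Perl-based rename with a different syntax. If the command does not behave as expected, a plain shell loop does the same job. This is a sketch, and rename_samples is a hypothetical helper name.

```shell
#!/bin/sh
# Hypothetical helper: rename every *.cfg-sample file in a directory to *.cfg.
rename_samples() {
    dir=$1
    for f in "$dir"/*.cfg-sample; do
        # Skip the literal pattern when nothing matches.
        [ -e "$f" ] || continue
        mv -- "$f" "${f%.cfg-sample}.cfg"
    done
}

# Example (as root):
#   rename_samples /etc/nagios
```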
For Nagios 2.5, you may define MULTIPLE serviceextinfo config files.
# mkdir /etc/nagios/serviceextinfo.d
# chown nagios:apache /etc/nagios/serviceextinfo.d/
Check installation
This paragraph is based on the Nagios documentation: Installing Nagios
Notice that the portage version of Nagios differs slightly from the Nagios documentation.
| Gentoo | Use | Nagios docs |
| /etc/nagios/ | Main, resource, object, and CGI configuration files should be put here | /usr/local/nagios/etc |
| /etc/conf.d/nagios | | |
| /usr/nagios/ | Nagios home directory | /usr/local/nagios |
| /usr/nagios/bin/ | Nagios core program | /usr/local/nagios/bin |
| /usr/nagios/libexec/ | Nagios plugins | /usr/local/nagios/libexec |
| /usr/nagios/sbin/ | Nagios CGI's | /usr/local/nagios/sbin |
| /usr/nagios/share/ | Nagios HTML files for the web interface | /usr/local/nagios/share |
| /var/log/nagios/ | Nagios log files | /usr/local/nagios/var |
| /var/nagios/ | Empty directory for the temporary files | /usr/local/nagios/var |
| /var/run/nagios.pid | Nagios pid lock file | /usr/local/var/nagios |
Make sure that the user your webserver runs as can navigate into the directory /etc/nagios or you will get an error as follows:
Error: Could not open CGI config file '/etc/nagios/cgi.cfg' for reading!
It is suggested that permissions similar to this are applied:
- drwxr-xr-x 2 nagios nagios 4096 Apr 10 06:35 /etc/nagios
- chmod 755 /etc/nagios
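To verify the suggested permissions, the mode string printed by ls can be inspected. The sketch below uses a hypothetical helper, world_can_enter, and only looks at the "other" execute bit; it does not account for the case where the webserver user is the owner or a group member of the directory.

```shell
#!/bin/sh
# Hypothetical helper: succeeds if the directory's mode grants
# search (execute) permission to "other" users.
world_can_enter() {
    perms=$(ls -ld "$1" | cut -c1-10)
    case $perms in
        d????????[xt]) return 0 ;;
        *) return 1 ;;
    esac
}

# Example:
#   world_can_enter /etc/nagios || echo "webserver may not be able to read cgi.cfg"
```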
Setting up Nagios
Setting Up The Web Interface
This paragraph is based on the Nagios documentation: Setting Up The Web Interface
The Nagios installation comes with an Apache module configuration in /etc/apache2/modules.d/. To load it automatically in Apache, open /etc/conf.d/apache2 in your favorite editor and add "-D NAGIOS" to APACHE2_OPTS:
| File: Edit the following line in /etc/conf.d/apache2 |
# Added module support for Nagios
APACHE2_OPTS="-D DEFAULT_VHOST -D PERL -D NAGIOS"
|
Notice that your APACHE2_OPTS line may include some different parameters.
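Whether a given define is present in APACHE2_OPTS can be checked from the shell. This is a minimal sketch; has_apache_define is a hypothetical helper name, and it assumes APACHE2_OPTS is set on a single uncommented line as in the file above.

```shell
#!/bin/sh
# Hypothetical helper: succeeds if an uncommented APACHE2_OPTS line
# in the given file contains "-D NAME".
has_apache_define() {
    grep -E '^[[:space:]]*APACHE2_OPTS=' "$1" |
        grep -Eq -- "-D[[:space:]]+$2([[:space:]\"']|\$)"
}

# Example:
#   has_apache_define /etc/conf.d/apache2 NAGIOS || echo "-D NAGIOS is not set"
#   has_apache_define /etc/conf.d/apache2 PERL && echo "mod_perl is enabled; expect slow restarts"
```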
Now open /etc/apache2/modules.d/99_nagios.conf in your favorite text editor and change the last "Allow from all" line as shown below. The replacement allows access to the Nagios pages only from the local machine and my local wired subnet; you probably do not want to share the Nagios pages with the whole Internet. Change the "192.168.0.128/255.255.255.128" part to whatever subnet you're on.
| File: Change /etc/apache2/modules.d/99_nagios.conf |
Allow from 127.0.0.1 192.168.0.128/255.255.255.128
|
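The netmask 255.255.255.128 in the example covers the addresses 192.168.0.128 through 192.168.0.255. To check whether a particular client address falls inside a network/netmask pair before putting it into the Allow line, the comparison can be sketched in shell arithmetic. The helper names ip_to_int and in_subnet are hypothetical, and the sketch assumes well-formed dotted-quad input without leading zeros.

```shell
#!/bin/sh
# Hypothetical helpers for illustration only.

# Convert a dotted-quad IPv4 address to a single integer.
ip_to_int() {
    old_ifs=$IFS
    IFS=.
    set -- $1
    IFS=$old_ifs
    echo $(( ($1 << 24) | ($2 << 16) | ($3 << 8) | $4 ))
}

# in_subnet IP NETWORK NETMASK: succeeds if IP lies inside NETWORK/NETMASK.
in_subnet() {
    ip=$(ip_to_int "$1")
    net=$(ip_to_int "$2")
    mask=$(ip_to_int "$3")
    [ $(( $ip & $mask )) -eq $(( $net & $mask )) ]
}

# Example:
#   in_subnet 192.168.0.200 192.168.0.128 255.255.255.128 && echo "allowed"
```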
Open /etc/apache2/modules.d/00_mod_mime.conf in your favorite editor and uncomment the line defining the cgi-handler:
| File: Edit /etc/apache2/modules.d/00_mod_mime.conf |
AddHandler cgi-script .cgi |
Add proper directory rights:
- Add the following lines directly to /etc/apache2/httpd.conf only if, for some reason, you did not set -D NAGIOS in /etc/conf.d/apache2 (for example, because you do not like the -D NAGIOS option).
| File: Edit /etc/apache2/httpd.conf |
ScriptAlias /nagios/cgi-bin /usr/nagios/sbin
<Directory "/usr/nagios/sbin">
Options ExecCGI
AllowOverride None
Order allow,deny
Allow from 127.0.0.1 192.168.0.128/255.255.255.128
AuthName "Nagios Access"
AuthType Basic
AuthUserFile /etc/nagios/htpasswd.users
Require valid-user
</Directory>
Alias /nagios /usr/nagios/share
<Directory "/usr/nagios/share">
Options None
AllowOverride None
Order allow,deny
Allow from 127.0.0.1 192.168.0.128/255.255.255.128
AuthName "Nagios Access"
AuthType Basic
AuthUserFile /etc/nagios/htpasswd.users
Require valid-user
</Directory>
|
Restart Apache.
On my ancient Intel PII with several webapps running, this takes several minutes (probably caused by mod_perl though).
| Code: Restart Apache2 |
# /etc/init.d/apache2 restart
 * Caching service dependencies ... [ ok ]
 * Stopping apache2 ... [ ok ]
 * Starting apache2 ... [ ok ]
|
Setting up CGI authentication
This paragraph is based on the Nagios documentation: Authentication And Authorization In The CGIs
Create the following identical files: /usr/nagios/sbin/.htaccess and /usr/nagios/share/.htaccess both with the following content:
| File: /usr/nagios/sbin/.htaccess and /usr/nagios/share/.htaccess |
AuthName "Nagios Access"
AuthType Basic
AuthUserFile /etc/nagios/htpasswd.users
AuthGroupFile /etc/nagios/htpasswd.group
Require group nagios
|
It is better to keep a single file, e.g. /etc/nagios/htaccess, create two symlinks to it, and check that Apache has permission to reach it:
| Code: Another way to make .htaccess files |
# ln -s /etc/nagios/htaccess /usr/nagios/sbin/.htaccess
# ln -s /etc/nagios/htaccess /usr/nagios/share/.htaccess
# ls -ld /etc/nagios    # should be drwxr-x---
# chmod +x /etc/nagios  # it should be drwxr-x--x now
|
We will also need to add the user names to the htpasswd.group file. Please note that the usernames are space-delimited. This file will need to be created with Nagios v1.6 and higher.
| File: /etc/nagios/htpasswd.group |
nagios: nagiosadmin <username> |
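Because Require group nagios only admits users listed in this file, it can be handy to confirm that a user really is a member. The following is a sketch; user_in_group is a hypothetical helper name and relies on the "group: user user ..." layout shown above.

```shell
#!/bin/sh
# Hypothetical helper: user_in_group FILE GROUP USER
# Succeeds if USER appears in the space-delimited member list of GROUP.
user_in_group() {
    members=$(sed -n "s/^$2:[[:space:]]*//p" "$1")
    for m in $members; do
        [ "$m" = "$3" ] && return 0
    done
    return 1
}

# Example:
#   user_in_group /etc/nagios/htpasswd.group nagios nagiosadmin || echo "nagiosadmin is not in group nagios"
```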
Now let's create the nagiosadmin user first:
| Code: Create nagiosadmin |
# htpasswd2 -c /etc/nagios/htpasswd.users nagiosadmin
New password:
Re-type new password:
Adding password for user nagiosadmin
# chown apache:apache /etc/nagios/htpasswd.users
# chmod 640 /etc/nagios/htpasswd.users
# chown apache:apache /etc/nagios/htpasswd.group
# chmod 640 /etc/nagios/htpasswd.group
# chmod 640 /usr/nagios/sbin/.htaccess
# chown apache:apache /usr/nagios/sbin/.htaccess
|
To create additional users use the following command (notice the missing "-c"):
| Code: Create additional user |
# htpasswd2 /etc/nagios/htpasswd.users <username> |
Before browsing to the newly created website you might need to do the following:
| Code: Setup symlink for website |
ln -s /usr/nagios/share/ /var/www/localhost/htdocs/nagios |
Now start up your favorite web browser and surf to http://yourmachine/nagios/ (don't forget to type in the last slash!). You should see the Nagios main page. Click on Tactical Overview in the side bar and log in with username nagiosadmin. If everything is correct so far, you should see the following text appear:
| Code: Browser output |
Whoops!
Error: Could not read host and service status information!
The most common cause of this error message (especially for new users), is the
fact that Nagios is not actually running. If Nagios is indeed not running, this
is a normal error message. It simply indicates that the CGIs could not obtain
the current status of hosts and services that are being monitored. If you've just
installed things, make sure you read the documentation on starting Nagios.
Some other things you should check in order to resolve this error include:
1. Check the Nagios log file for messages relating to startup or status data
errors.
2. Always verify configuration options using the -v command-line option before
starting or restarting Nagios!
3. Make sure you've compiled the main program and the CGIs to use the same status
data storage options (i.e. text file or database). If the main program is storing
status data in a text file and the CGIs are trying to read status data from a
database, you'll have problems.
Make sure you read the documentation on installing, configuring and running Nagios
thoroughly before continuing. If all else fails, try sending a message to one of the
mailing lists. More information can be found at http://www.nagios.org. |
Setting up MySQL (Nagios 1.x only)
I assume that you have already installed MySQL; if you have not, do so now. Please refer to an appropriate document for installing and initializing MySQL.
Create the Nagios database itself -- it will be devoid of tables; we'll get to that shortly:
| Code: Create the nagios db |
# mysqladmin -u root -p create nagios |
Having created the Nagios DB, we now need to create the various tables in it:
| Code: Create tables for Nagios DB |
# mysql -u root -p
mysql> use nagios
mysql> source /usr/share/doc/nagios-core-1.4.1/contrib/database/create_mysql
mysql> quit
|
Now we need to create a database user for Nagios to insert entries into the database, as well as a separate one for the Nagios CGIs. For security reasons the usernames and passwords should not be the same. In this example the username for Nagios is nagios-db and for the CGIs it is nagios-cgi.
We will now make a script to create these users:
| File: ./mknagios-sql.sh |
#!/bin/sh
# Name: mknagios-sql.sh
#
# Purpose: This file will ask the user for the Nagios DB user/pass and
# for the Nagios CGI user/pass, then generate a file with the SQL
# statements required to set up access for both users.
#
# How to run: This script takes no input arguments; it prompts for the info.
#
# Note: This assumes Nagios and MySQL will be running on the same
# server. If this assumption is false, then modify nagios_host variable to
# match the FQDN hostname of the Nagios server.
#
# License: Public domain with no redistribution or modification limitations.
# Clean up stuff in case we abort due to ctrl-c'ing out early
trap "stty echo ; rm -f /tmp/mknagios.sql >/dev/null 2>&1 ; exit 1" 2
# Define key variables
outfile="/tmp/mknagios.sql"
nagios_host="localhost"
dbusr_tables="hostdowntime servicedowntime hostcomments servicecomments"
dbusr_tables="${dbusr_tables} programstatus hoststatus"
dbusr_tables="${dbusr_tables} servicestatus programretention"
dbusr_tables="${dbusr_tables} hostretention serviceretention"
cgiusr_tables="hostdowntime servicedowntime hostcomments servicecomments"
cgiusr_tables="${cgiusr_tables} programstatus hoststatus servicestatus"
cgiusr_tables="${cgiusr_tables} hostextinfo serviceextinfo"
# If the output file already exists from a previous run, be nice and
# move it out of the way instead of nuking it.
mv -f ${outfile} ${outfile}.old >/dev/null 2>&1
# Inform the user they may accept default usernames but MUST enter a password
echo "Note: you may press enter for the username choices if you accept the"
echo "suggested default names."
echo
echo "You MUST enter a password, and the passwords SHOULD be different."
echo
# Ask for desired nagios DB username
echo -n "Please enter the Nagios DB username: [nagios-db] "
read dbusr
# Select the default username if user pressed return to accept it
if [ -z "${dbusr}" ]; then
dbusr="nagios-db"
fi
# Ask for desired nagios DB password
echo -n "Please enter the Nagios DB password: "
stty -echo
read dbpass
stty echo
echo
echo
# Did the user enter a password? If not, bail out
if [ -z "${dbpass}" ]; then
echo "Oops. You didn't enter a password. Exiting."
exit 1
fi
# Ask for desired nagios CGI username
echo -n "Please enter the Nagios CGI username: [nagios-cgi] "
read cgiusr
# Select the default username if user pressed return to accept it
if [ -z "${cgiusr}" ]; then
cgiusr="nagios-cgi"
fi
# Ask for desired nagios CGI password
echo -n "Please enter the Nagios CGI password: "
stty -echo
read cgipass
stty echo
echo
echo
# Did the user enter a password? If not, bail out
if [ -z "${cgipass}" ]; then
echo "Oops. You didn't enter a password. Exiting."
exit 1
fi
# Create output file with SQL statements
for table in ${dbusr_tables}
do
(echo -n "GRANT select,insert,update,delete"
echo -n " ON nagios.${table}"
echo -n " TO '${dbusr}'@'${nagios_host}'"
echo -n " IDENTIFIED BY '${dbpass}';"
echo ) >> ${outfile}
done
for table in ${cgiusr_tables}
do
(echo -n "GRANT select"
echo -n " ON nagios.${table}"
echo -n " TO '${cgiusr}'@'${nagios_host}'"
echo -n " IDENTIFIED BY '${cgipass}';"
echo ) >> ${outfile}
done
(echo -n "GRANT lock tables ON nagios.* TO '${dbusr}'@'${nagios_host}'"
echo " IDENTIFIED BY '${dbpass}';" ) >> ${outfile}
(echo -n "GRANT lock tables ON nagios.* TO '${cgiusr}'@'${nagios_host}'"
echo " IDENTIFIED BY '${cgipass}';" ) >> ${outfile}
echo "FLUSH PRIVILEGES;" >> ${outfile}
# Inform user the file is ready and where to pick it up.
echo "You may use ${outfile} for the next step of Nagios+MySQL installation."
# We're done, go in peace.
exit 0 |
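Note that the generated SQL file contains both passwords in clear text. One way to limit exposure, sketched below under the assumption that the file does not exist yet, is to create it with owner-only permissions before anything is written to it. The helper name make_private_file is hypothetical and not part of the script above.

```shell
#!/bin/sh
# Hypothetical helper: create (or truncate) a file so that a newly
# created file is readable and writable by its owner only.
make_private_file() {
    ( umask 077 && : > "$1" )
}

# Example: call this before the GRANT statements are appended,
# and remove the file once it has been sourced into MySQL:
#   make_private_file /tmp/mknagios.sql
#   rm -f /tmp/mknagios.sql /tmp/mknagios.sql.old
```

The umask only affects newly created files; an already existing file keeps its old mode.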
Now we need to run the script:
| Code: sh ./mknagios-sql.sh |
# chmod 700 mknagios-sql.sh
# ./mknagios-sql.sh
|
Now we will actually execute these SQL statements by sourcing the generated file:
| Code: sourcing /tmp/mknagios.sql |
# mysql -u root -p
mysql> source /tmp/mknagios.sql
mysql> quit
|
Now we need to edit resource.cfg. If you did not accept the default usernames in the mknagios-sql.sh script, change the usernames accordingly. The resource.cfg file should look similar to this:
| File: /etc/nagios/resource.cfg |
# Sets $USER1$ to be the path to the plugins
$USER1$=/usr/nagios/libexec
# Sets $USER2$ to be the path to event handlers
$USER2$=/usr/nagios/libexec/eventhandlers
# Store some usernames and passwords (hidden from the CGIs)
#$USER3$=nagios-db
#$USER4$=<password>
# DB STATUS DATA
# Note: These config directives are only used if you compiled
# in database support for status data!
# The user you specify here needs SELECT, INSERT, UPDATE, and
# DELETE privileges on the 'programstatus', 'hoststatus',
# and 'servicestatus' tables in the database.
xsddb_host=localhost
xsddb_port=3306
xsddb_database=nagios
xsddb_username=nagios-db
xsddb_password=<password>
xsddb_optimize_data=1
xsddb_optimize_interval=3600
# DB COMMENT DATA
# Note: These config directives are only used if you compiled
# in database support for comment data!
# The user you specify here needs SELECT, INSERT, UPDATE, and
# DELETE privileges on the 'hostcomments' and 'servicecomments'
# tables in the database.
xcddb_host=localhost
xcddb_port=3306
xcddb_database=nagios
xcddb_username=nagios-db
xcddb_password=<password>
xcddb_optimize_data=1
# DB DOWNTIME DATA
# Note: These config directives are only used if you compiled
# in database support for downtime data!
# The user you specify here needs SELECT, INSERT, UPDATE, and
# DELETE privileges on the 'hostdowntime' and 'servicedowntime'
# tables in the database.
xdddb_host=localhost
xdddb_port=3306
xdddb_database=nagios
xdddb_username=nagios-db
xdddb_password=<password>
xdddb_optimize_data=1
# DB RETENTION DATA
# Note: These config directives are only used if you compiled
# in database support for retention data!
# The user you specify here needs SELECT, INSERT, UPDATE, and
# DELETE privileges on the 'programretention', 'hostretention',
# and 'serviceretention' tables in the database.
xrddb_host=localhost
xrddb_port=3306
xrddb_database=nagios
xrddb_username=nagios-db
xrddb_password=<password>
xrddb_optimize_data=1
|
At this time we will also set up the cgi.cfg file, not only for the MySQL information but for the CGI authentication as well. I have clipped most of the comments out of the file; you may wish to read them at your leisure:
| File: /etc/nagios/cgi.cfg |
# MAIN CONFIGURATION FILE
main_config_file=/etc/nagios/nagios.cfg
# PHYSICAL HTML PATH
physical_html_path=/usr/nagios/share
# URL HTML PATH
url_html_path=/nagios
# CONTEXT-SENSITIVE HELP
show_context_help=0
# NAGIOS PROCESS CHECK COMMAND
# If you are using database backend
nagios_check_command=/usr/nagios/libexec/check_nagios_db.pl -e 5 -C '/usr/nagios/bin/nagios'
# Else
# nagios_check_command=/usr/nagios/libexec/check_nagios /var/nagios/status.log 5 '/usr/nagios/bin/nagios'
# AUTHENTICATION USAGE
use_authentication=1
# SYSTEM/PROCESS INFORMATION ACCESS
authorized_for_system_information=nagiosadmin,<username>
# CONFIGURATION INFORMATION ACCESS
authorized_for_configuration_information=nagiosadmin,<username>
# SYSTEM/PROCESS COMMAND ACCESS
authorized_for_system_commands=nagiosadmin,<username>
# GLOBAL HOST/SERVICE VIEW ACCESS
authorized_for_all_services=nagiosadmin,<username>
authorized_for_all_hosts=nagiosadmin,<username>
# GLOBAL HOST/SERVICE COMMAND ACCESS
authorized_for_all_service_commands=nagiosadmin,<username>
authorized_for_all_host_commands=nagiosadmin,<username>
# DEFAULT STATUSMAP LAYOUT METHOD
default_statusmap_layout=5
# DEFAULT STATUSWRL LAYOUT METHOD
default_statuswrl_layout=4
# PING SYNTAX
ping_syntax=/bin/ping -n -U -c 5 $HOSTADDRESS$
# REFRESH RATE
refresh_rate=90
# DB EXTENDED DATA
xeddb_host=localhost
xeddb_port=3306
xeddb_database=nagios
xeddb_username=nagios-cgi
xeddb_password=<password>
# DB STATUS DATA (Read-Only For CGIs)
xsddb_host=localhost
xsddb_port=3306
xsddb_database=nagios
xsddb_username=nagios-cgi
xsddb_password=<password>
# DB COMMENT DATA (Read-Only For CGIs)
xcddb_host=localhost
xcddb_port=3306
xcddb_database=nagios
xcddb_username=nagios-cgi
xcddb_password=<password>
# DB DOWNTIME DATA (Read-Only For CGIs)
xdddb_host=localhost
xdddb_port=3306
xdddb_database=nagios
xdddb_username=nagios-cgi
xdddb_password=<password>
|
Since we installed Nagios with MySQL support, I would suggest modifying the init startup script to ensure it will start the database before starting Nagios.
| File: /etc/init.d/nagios |
depend() {
need net mysql
use dns logger
after mysql
}
|
We should restart MySQL for good measure:
| Code: Restart MySQL |
# /etc/init.d/mysql restart |
Configuring Nagios
This paragraph is based on the Nagios documentation: Configuring Nagios. It should describe the Nagios configuration files.
Main Configuration
The main configuration file is /etc/nagios/nagios.cfg, which contains a load of sample settings. The documentation for this file can be found here. There are a lot of useful comments in the nagios.cfg file as well. This paragraph describes the changes I had to make to get my Nagios server up and running *and* monitored. One small addition: don't forget to create the log files in /var/log/nagios/, because otherwise the CGIs will not work.
| File: Edit: /etc/nagios/nagios.cfg |
# log_file=/var/nagios/nagios.log
log_file=/var/log/nagios/nagios.log
# status_file=/var/nagios/status.log
status_file=/var/log/nagios/status.log
# comment_file=/var/nagios/comment.log
comment_file=/var/log/nagios/comment.log
# downtime_file=/var/nagios/downtime.log
downtime_file=/var/log/nagios/downtime.log
# lock_file=/var/nagios/nagios.lock
lock_file=/var/run/nagios.pid
# use_syslog=1
use_syslog=0
# enable_flap_detection=0
enable_flap_detection=1
# date_format=us
date_format=euro
# admin_email=nagios
admin_email=email.address@my.provider.nl
|
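Creating the log files named in this configuration could look roughly like the sketch below. The helper name create_log_files is hypothetical, the file list simply mirrors the four paths configured above, and the ownership in the usage comment is an assumption copied from the directory ownership set earlier in this guide.

```shell
#!/bin/sh
# Hypothetical helper: create the log files referenced in nagios.cfg
# inside the given directory.
create_log_files() {
    dir=$1
    mkdir -p "$dir" || return 1
    for name in nagios.log status.log comment.log downtime.log; do
        touch "$dir/$name" || return 1
    done
}

# Example (as root):
#   create_log_files /var/log/nagios
#   chown nagios:apache /var/log/nagios/*.log
# Afterwards the configuration can be verified with the -v option:
#   /usr/nagios/bin/nagios -v /etc/nagios/nagios.cfg
```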
Nagios 1.x only
Hosts For Monitoring
This paragraph is based on the Nagios documentation: Template-Based Object Data Configuration File Options
Edit the /etc/nagios/hosts.cfg file according to the Nagios documentation. Below is the file I use.
| File: /etc/nagios/hosts.cfg |
# Generic host definition template
define host{
name generic-host ; The name of this host template - referenced in other host definitions, used for template recursion/resolution
notifications_enabled 1 ; Host notifications are enabled
event_handler_enabled 1 ; Host event handler is enabled
flap_detection_enabled 1 ; Flap detection is enabled
process_perf_data 1 ; Process performance data
retain_status_information 1 ; Retain status information across program restarts
retain_nonstatus_information 1 ; Retain non-status information across program restarts
register 0 ; DONT REGISTER THIS DEFINITION - ITS NOT A REAL HOST, JUST A TEMPLATE!
}
# 'localhost' host definition
define host{
use generic-host ; Name of host template to use
host_name localhost
alias nagios server
address 127.0.0.1
check_command check-host-alive
max_check_attempts 10
notification_interval 120
notification_period 24x7
notification_options d,u,r
}
# 'workstation' host definition
define host{
use generic-host ; Name of host template to use
host_name <another host> ; Make sure that this name is resolvable (test with ping); if it is not, add it to your /etc/hosts file
alias workstation
address ip.ad.dr.ess ; Host IP address
check_command check-host-alive
max_check_attempts 10
notification_interval 120
notification_period 24x7
notification_options d,u,r
}
|
Services To Monitor
This paragraph is based on the Nagios documentation: Template-Based Object Data Configuration File Options
| File: /etc/nagios/services.cfg |
# Generic service definition template
define service{
name generic-service ; The 'name' of this service template, referenced in other service definitions
active_checks_enabled 1 ; Active service checks are enabled
passive_checks_enabled 1 ; Passive service checks are enabled/accepted
parallelize_check 1 ; Active service checks should be parallelized (disabling this can lead to major performance problems)
obsess_over_service 1 ; We should obsess over this service (if necessary)
check_freshness 0 ; Default is to NOT check service 'freshness'
notifications_enabled 1 ; Service notifications are enabled
event_handler_enabled 1 ; Service event handler is enabled
flap_detection_enabled 1 ; Flap detection is enabled
process_perf_data 1 ; Process performance data
retain_status_information 1 ; Retain status information across program restarts
retain_nonstatus_information 1 ; Retain non-status information across program restarts
register 0 ; DONT REGISTER THIS DEFINITION - ITS NOT A REAL SERVICE, JUST A TEMPLATE!
}
# Service definition
define service{
use generic-service ; Name of service template to use
host_name localhost
service_description SSH
is_volatile 0
check_period 24x7
max_check_attempts 3
normal_check_interval 3
retry_check_interval 1
contact_groups <edit-this> ; Make sure that the value here is also located in the contactgroup.cfg
notification_interval 120
notification_period 24x7
notification_options w,u,c,r
check_command "/usr/bin/nmap -sT -p22 -P0 localhost| grep open 2> /dev/null"
}
# Service definition
define service{
use generic-service ; Name of service template to use
host_name *
service_description PING
is_volatile 0
check_period 24x7
max_check_attempts 3
normal_check_interval 5
retry_check_interval 1
contact_groups <edit-this> ; Make sure that the value here is also located in the contactgroup.cfg
notification_interval 120
notification_period 24x7
notification_options c,r
check_command check_ping!100.0,20%!500.0,60%
}
|
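Nagios normally expects check_command to name a command object whose command line runs a plugin, and a plugin reports its result through its exit code (0 = OK, 1 = WARNING, 2 = CRITICAL, 3 = UNKNOWN). The raw nmap pipeline in the SSH service above may therefore need to be wrapped in a small plugin script. The following is a sketch of the core logic only; check_port_from_nmap is a hypothetical name and the wiring into a command definition is left out.

```shell
#!/bin/sh
# Hypothetical plugin logic: read nmap output on stdin and report
# whether TCP port $1 is open, using Nagios plugin exit codes.
check_port_from_nmap() {
    port=$1
    if [ -z "$port" ]; then
        echo "UNKNOWN - no port given"
        return 3
    fi
    # nmap prints lines such as "22/tcp open  ssh" for open ports.
    if grep -Eq "^$port/tcp[[:space:]]+open([[:space:]]|\$)"; then
        echo "OK - port $port/tcp is open"
        return 0
    fi
    echo "CRITICAL - port $port/tcp is not open"
    return 2
}

# Example, reusing the nmap options from the service definition above:
#   /usr/bin/nmap -sT -p22 -P0 localhost | check_port_from_nmap 22
```

Matching on the state column rather than on the bare word "open" avoids counting states such as "open|filtered" as a success.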
Configuring Contact Definition
| File: /etc/nagios/contacts.cfg |
# 'nagios' contact definition
define contact{
contact_name nagios
alias Nagios Admin
service_notification_period 24x7
host_notification_period 24x7
service_notification_options w,u,c,r
host_notification_options d,u,r
service_notification_commands notify-by-email
host_notification_commands host-notify-by-email
email email.address@my.provider.nl
}
|
Configuring Host Group Definition
| File: /etc/nagios/hostgroups.cfg |
# 'linux-boxes' host group definition
define hostgroup{
hostgroup_name linux-boxes
alias Linux Servers
contact_groups <edit-this> ; This needs to be the same value as the value located in service.cfg file.
members localhost,<another>,<more>,<blah-blah-blah> ;make sure all hosts in host.cfg are attached to a hostgroup
}
|
Configuring Contact Group Definitions
| File: /etc/nagios/contactgroups.cfg |
# 'linux-admins' contact group definition
define contactgroup{
contactgroup_name linux-admins
alias Linux Administrators
members nagios
}
|
For now we can leave the rest of the files as they are, with the exception of dependencies.cfg: either comment out its lines, clear out its contents, or delete the file and remove its entry from the nagios.cfg file.
Nagios 2.x only
This paragraph is based on the Nagios documentation: Configuring Nagios. It should describe the Nagios configuration files.
- Services
- Service Groups
- Hosts
- Host Groups
- Contacts
- Contact Groups
- Commands
- Time Periods
- Service Escalations
- Service Dependencies
- Host Escalations
- Host Dependencies
- Extended Host Information
- Extended Service Information
Hosts
This paragraph is based on the Nagios documentation: Template-Based Object Data Configuration File Options
Edit the /etc/nagios/hosts.cfg file according to the Nagios documentation. Below is the file I use.
| File: /etc/nagios/hosts.cfg |
# Generic host definition template
define host{
name generic-host ; The name of this host template - referenced in other host definitions, used for template recursion/resolution
notifications_enabled 1 ; Host notifications are enabled
event_handler_enabled 1 ; Host event handler is enabled
flap_detection_enabled 1 ; Flap detection is enabled
process_perf_data 1 ; Process performance data
retain_status_information 1 ; Retain status information across program restarts
retain_nonstatus_information 1 ; Retain non-status information across program restarts
register 0 ; DONT REGISTER THIS DEFINITION - ITS NOT A REAL HOST, JUST A TEMPLATE!
}
# 'localhost' host definition
define host{
name localhost
use generic-host ; Name of host template to use
host_name localhost
alias nagios server
address 127.0.0.1
check_command check-host-alive
check_period 24x7 ; new
contact_groups linux-admins ; new
max_check_attempts 10
notification_interval 120
notification_period 24x7
notification_options d,u,r
register 1
}
|
Services
This paragraph is based on the Nagios documentation: Template-Based Object Data Configuration File Options
| File: /etc/nagios/services.cfg |
# myhost-service template
define service{
use generic-service
name myhost-service
hostgroups myhost
register 0
}
# nmap ssh
define service{
use generic-service ; Name of service template to use
host_name localhost
service_description nmap ssh
is_volatile 0
check_period 24x7
max_check_attempts 3
normal_check_interval 3
retry_check_interval 1
contact_groups <edit-this> ; Make sure that the value here is also located in the contactgroup.cfg
notification_interval 120
notification_period 24x7
notification_options w,u,c,r
check_command "/usr/bin/nmap -sT -p22 -P0 localhost| grep open 2> /dev/null"
}
# PING
define service{
use generic-service ; Name of service template to use
host_name *
service_description PING
is_volatile 0
check_period 24x7
max_check_attempts 3
normal_check_interval 5
retry_check_interval 1
contact_groups <edit-this> ; Make sure that the value here is also located in the contactgroup.cfg
notification_interval 120
notification_period 24x7
notification_options c,r
check_command check_ping!100.0,20%!500.0,60%
}
# SSH
define service{
use myhost-service
host_name *
service_description SSH
is_volatile 0
check_period 24x7
max_check_attempts 3
normal_check_interval 5
retry_check_interval 1
contact_groups <edit this>
notification_interval 120
notification_period 24x7
notification_options c,r
check_command check_ssh
}
|
Commands
Contacts
| File: /etc/nagios/contacts.cfg |
# 'nagios' contact definition
define contact{
contact_name nagios
alias Nagios Admin
service_notification_period 24x7
host_notification_period 24x7
service_notification_options w,u,c,r
host_notification_options d,u,r
service_notification_commands notify-by-email
host_notification_commands host-notify-by-email
email email.address@my.provider.nl
}
|
Host Groups
| File: /etc/nagios/hostgroups.cfg |
# 'linux-boxes' host group definition
define hostgroup{
hostgroup_name linux-boxes
alias Linux Servers
# contact_groups <edit-this> ; This needs to be the same value as the value located in service.cfg file. Nagios 2.5 produces an error if you define this.
members localhost,<another>,<more>,<blah-blah-blah> ;make sure all hosts in host.cfg are attached to a hostgroup
}
|
Alternatively, you don't need to specify the hosts here, if you define this hostgroup in the hosts.cfg template. The rest of the examples assume you do it this way.
| File: hostgroups.cfg |
define hostgroup{
hostgroup_name myhost
alias Myhost at the lab
}
|
Contact Groups
| File: /etc/nagios/contactgroups.cfg |
# 'linux-admins' contact group definition
define contactgroup{
contactgroup_name linux-admins
alias Linux Administrators
members nagios
}
|
For now we can leave the rest of the files as they are with the exception of the depenencies.cfg either comment out the lines, clear out its contents, or delete the file and remove its entry from the nagios.cfg file.
Timeperiods
| File: /etc/nagios/timeperiods.cfg |
###############################################################################
###############################################################################
#
# TIME PERIODS
#
###############################################################################
###############################################################################
# This defines a timeperiod where all times are valid for checks,
# notifications, etc. The classic "24x7" support nightmare. :-)
define timeperiod{
timeperiod_name 24x7
alias 24 Hours A Day, 7 Days A Week
sunday 00:00-24:00
monday 00:00-24:00
tuesday 00:00-24:00
wednesday 00:00-24:00
thursday 00:00-24:00
friday 00:00-24:00
saturday 00:00-24:00
}
|
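Each weekday line in a timeperiod holds one or more comma-separated HH:MM-HH:MM ranges. As a rough illustration, this Python sketch tests whether a given time of day falls inside such a line. The helper names are hypothetical, and treating the end of each range as exclusive is an assumption, not something taken from the Nagios source.

```python
def minutes(hhmm):
    """Convert 'HH:MM' to minutes since midnight ('24:00' gives 1440)."""
    hours, mins = hhmm.split(":")
    return int(hours) * 60 + int(mins)


def in_timeperiod(ranges, hhmm):
    """Return True if the time hhmm lies inside one of the given ranges.

    ranges is the right-hand side of a weekday line, for example
    '00:00-24:00' or '00:00-09:00,17:00-24:00'.
    """
    now = minutes(hhmm)
    for item in ranges.split(","):
        start, end = item.strip().split("-")
        # Assumption: the start is inclusive, the end is exclusive.
        if minutes(start) <= now < minutes(end):
            return True
    return False
```

For the 24x7 period above, in_timeperiod("00:00-24:00", "23:59") is true for every time of day, which is why checks and notifications that use it are never suppressed.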
Configuring CGI
If you are running Nagios 1.x, the CGI was already configured together with MySQL in the previous chapter. Otherwise, proceed here.
| File: /etc/nagios/cgi.cfg |
# MAIN CONFIGURATION FILE
main_config_file=/etc/nagios/nagios.cfg
# PHYSICAL HTML PATH
physical_html_path=/usr/nagios/share
# URL HTML PATH
url_html_path=/nagios
# CONTEXT-SENSITIVE HELP
show_context_help=0
# NAGIOS PROCESS CHECK COMMAND
nagios_check_command=/usr/nagios/libexec/check_nagios -e 5 -C '/usr/nagios/bin/nagios' -F '/var/log/nagios/status.log'
# AUTHENTICATION USAGE
use_authentication=1
# SYSTEM/PROCESS INFORMATION ACCESS
authorized_for_system_information=nagiosadmin,<username>
# CONFIGURATION INFORMATION ACCESS
authorized_for_configuration_information=nagiosadmin,<username>
# SYSTEM/PROCESS COMMAND ACCESS
authorized_for_system_commands=nagiosadmin,<username>
# GLOBAL HOST/SERVICE VIEW ACCESS
authorized_for_all_services=nagiosadmin,<username>
authorized_for_all_hosts=nagiosadmin,<username>
# GLOBAL HOST/SERVICE COMMAND ACCESS
authorized_for_all_service_commands=nagiosadmin,<username>
authorized_for_all_host_commands=nagiosadmin,<username>
# DEFAULT STATUSMAP LAYOUT METHOD
default_statusmap_layout=5
# DEFAULT STATUSWRL LAYOUT METHOD
default_statuswrl_layout=4
# PING SYNTAX
ping_syntax=/bin/ping -n -U -c 5 $HOSTADDRESS$
# REFRESH RATE
refresh_rate=90
|
Starting Nagios Server Processes
| Code: Start Nagios |
# /etc/init.d/nagios start
|
Monitoring hosts
hostgroups and services template
Ordinary computer
This is quite an ordinary box to monitor for PING and SSH.
| File: hosts.cfg |
# client
define host{
use generic-template
host_name client.myhost
alias client at the lab
address 192.168.0.100
check_command check-host-alive
register 1
}
|
Plugins
Nagios-Grapher
Installation
Get the ebuild from bugs.gentoo.org. Check out the experimental version for a nicer view of the graphs. If the ebuild does not pull in the dependencies for you, emerge them yourself:
# echo "media-gfx/imagemagick X bzip2 graphviz perl png tiff xml gs" >> /etc/portage/package.use
# emerge dev-perl/Image-Imlib2 dev-perl/XML-NamespaceSupport dev-perl/XML-SAX dev-perl/XML-Dumper imagemagick
Nagios configuration
After installation, make sure nagios and nagios-grapher understand each other.
| File: /etc/nagios/nagios.cfg |
process_performance_data=1
#service_perfdata_command=process-service-perfdata
service_perfdata_file=/var/nagios/rw/ngraph.pipe
service_perfdata_file_mode=w
service_perfdata_file_processing_interval=10
service_perfdata_file_template=$HOSTNAME$\t$SERVICEDESC$\t$SERVICEOUTPUT$\t$SERVICEPERFDATA$
|
| File: /etc/nagios/commands.cfg |
define command {
command_name process-service-perfdata
command_line echo -e '$HOSTNAME$\t$SERVICEDESC$\t$SERVICEOUTPUT$\t$SERVICEPERFDATA$' > /var/nagios/rw/ngraph.pipe
}
|
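With the template above, each line written to the pipe carries four tab-separated fields: host name, service description, plugin output and performance data. The Python sketch below splits such a line and picks the numeric values out of the performance data, assuming the usual label=value[unit];warn;crit;min;max plugin format. It only illustrates the line layout and is not how NagiosGrapher itself processes the data; the function name and the sample line are invented for the example, and quoted labels containing spaces are not handled.

```python
import re

# A leading number, followed by an optional unit such as "ms" or "%".
VALUE_RE = re.compile(r"^(-?\d+(?:\.\d+)?)(.*)$")


def parse_perfdata_line(line):
    """Split one perfdata line into host, service, output and metrics."""
    host, service, output, perfdata = line.rstrip("\n").split("\t", 3)
    metrics = {}
    for item in perfdata.split():
        label, _, rest = item.partition("=")
        # Only the first field (the measured value) is of interest here.
        match = VALUE_RE.match(rest.split(";")[0])
        if match:
            metrics[label] = (float(match.group(1)), match.group(2))
    return host, service, output, metrics


# Hypothetical sample line in the layout of the template above.
SAMPLE = ("client.myhost\tPING\tPING OK - Packet loss = 0%, RTA = 0.52 ms\t"
          "rta=0.52ms;100.0;500.0;0 pl=0%;20;60;0\n")

if __name__ == "__main__":
    print(parse_perfdata_line(SAMPLE))
```

Feeding a line like this through the sketch while the Grapher runs in verbose mode is a simple way to compare what Nagios writes with what you expect your regular expressions to match.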
Grapher configuration
All NagiosGrapher configuration is done in ngraph.ncfg. Only the first part, the "config" block, configures NagiosGrapher itself; everything else configures the graphs. These are the options you need to check; the other defaults are most likely fine for now.
| File: ngraph.ncfg |
define config{
pipe /var/nagios/rw/ngraph.pipe
interface pipe
pidfile /var/run/nagios-grapher/nagios-grapher.pid
user nagios
group nagios
rrdpath /var/nagios/ngraph.rrd.d/
tmppath /tmp/nagiosgrapher/
fontfile /usr/share/rrdtool/fonts/DejaVuSansMono-Roman.ttf
icon_image_src /nagios/images/graph.png
nagios_config /etc/nagios/nagios.cfg
cgi_config /etc/nagios/cgi.cfg
serviceext_type MULTIPLE
serviceext_path /etc/nagios/serviceextinfo.d
log_file /var/log/nagios/ngraph.log
}
define ngraph{
}
cfg_file=...
cfg_dir=...
|
Startup and debugging
Stop Nagios.
# /etc/init.d/nagios stop
For debugging, start the grapher in verbose mode (possibly in screen splitview)
# /usr/nagios/contrib/collect2.pl -v
Then start Nagios
# /usr/nagios/bin/nagios /etc/nagios/nagios.cfg
After you've seen that the pipe works, start creating the config files for the services you want to graph. See TIP graph arbitrary data for more information about this. Please add your regular expressions to the list.
After you've created a service ncfg, or loaded a template, reload Nagios so that it picks up the serviceextinfo data.
# /etc/init.d/nagios reload
You will see small icons appear next to the hosts. Click on the graph icon and a new window with the graphs should open. Congratulations!
Once the Grapher appears stable, start it with its init script and add nagios-grapher to your favorite runlevel. Note that the Grapher has to be started before Nagios; the pipe does not work the other way around.
The Apache error_log is helpful if the graphs do not show up.
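Because Nagios writes to the pipe and the Grapher reads from it, it can help to confirm that the path really is a named pipe before starting Nagios. A small Python sketch of such a pre-flight check follows. The path is the one from the configuration above; the function name is made up, and reading a missing pipe as a sign of a wrong start order is my assumption rather than documented behaviour.

```python
import os
import stat

PIPE_PATH = "/var/nagios/rw/ngraph.pipe"  # value of "pipe" in ngraph.ncfg


def is_fifo(path):
    """Return True if path exists and is a named pipe (FIFO)."""
    try:
        return stat.S_ISFIFO(os.stat(path).st_mode)
    except OSError:
        # Missing file or no permission to look at it.
        return False


if __name__ == "__main__":
    if is_fifo(PIPE_PATH):
        print("named pipe found, Nagios can be started")
    else:
        print("no named pipe at %s, start the Grapher first" % PIPE_PATH)
```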
Configure for client machine
NRPE client for linux
In order to run check commands on remote clients, you need to either use the check_by_ssh command, or use the NRPE module. The latter is slightly more elaborate, so this guide focuses on that.
NRPE uses a standard port (5666) to receive commands and send back their output; in this setup it relies on (x)inetd to listen for incoming connections. The files below should get you going, but do read the NRPE documentation as well.
# echo "nrpe 5666/tcp # NRPE" >> /etc/services
| File: /etc/xinetd.d/nrpe |
# default: on
# description: NRPE
service nrpe
{
flags = REUSE
socket_type = stream
wait = no
user = nagios
server = /usr/sbin/nrpe
server_args = -c /etc/nagios/nrpe.cfg --inetd
log_on_failure += USERID
disable = no
only_from = 192.168.0.0/24
}
|
| File: /etc/nagios/nrpe.cfg |
#############################################################################
# Sample NRPE Config File
# Written by: Ethan Galstad (nagios@nagios.org)
#
# Last Modified: 02-23-2006
#
# NOTES:
# This is a sample configuration file for the NRPE daemon. It needs to be
# located on the remote host that is running the NRPE daemon, not the host
# from which the check_nrpe client is being executed.
#############################################################################
# PID FILE
# The name of the file in which the NRPE daemon should write its process ID
# number. The file is only written if the NRPE daemon is started by the root
# user and is running in standalone mode.
pid_file=/var/run/nrpe.pid
# PORT NUMBER
# Port number we should wait for connections on.
# NOTE: This must be a non-privileged port (i.e. > 1024).
# NOTE: This option is ignored if NRPE is running under either inetd or xinetd
server_port=5666
# SERVER ADDRESS
# Address that nrpe should bind to in case there are more than one interface
# and you do not want nrpe to bind on all interfaces.
# NOTE: This option is ignored if NRPE is running under either inetd or xinetd
#server_address=192.168.1.1
# NRPE USER
# This determines the effective user that the NRPE daemon should run as.
# You can either supply a username or a UID.
#
# NOTE: This option is ignored if NRPE is running under either inetd or xinetd
nrpe_user=nagios
# NRPE GROUP
# This determines the effective group that the NRPE daemon should run as.
# You can either supply a group name or a GID.
#
# NOTE: This option is ignored if NRPE is running under either inetd or xinetd
nrpe_group=nagios
# ALLOWED HOST ADDRESSES
# This is an optional comma-delimited list of IP address or hostnames
# that are allowed to talk to the NRPE daemon.
#
# Note: The daemon only does rudimentary checking of the client's IP
# address. I would highly recommend adding entries in your /etc/hosts.allow
# file to allow only the specified host to connect to the port
# you are running this daemon on.
#
# NOTE: This option is ignored if NRPE is running under either inetd or xinetd
#allowed_hosts=127.0.0.1,192.168.0.2
# COMMAND ARGUMENT PROCESSING
# This option determines whether or not the NRPE daemon will allow clients
# to specify arguments to commands that are executed. This option only works
# if the daemon was configured with the --enable-command-args configure script
# option.
#
# *** ENABLING THIS OPTION IS A SECURITY RISK! ***
# Read the SECURITY file for information on some of the security implications
# of enabling this variable.
#
# Values: 0=do not allow arguments, 1=allow command arguments
dont_blame_nrpe=0
# COMMAND PREFIX
# This option allows you to prefix all commands with a user-defined string.
# A space is automatically added between the specified prefix string and the
# command line from the command definition.
#
# *** THIS EXAMPLE MAY POSE A POTENTIAL SECURITY RISK, SO USE WITH CAUTION! ***
# Usage scenario:
# Execute restricted commands using sudo. For this to work, you need to add
# the nagios user to your /etc/sudoers. An example entry for allowing
# execution of the plugins from might be:
#
# nagios ALL=(ALL) NOPASSWD: /usr/lib/nagios/plugins/
#
# This lets the nagios user run all commands in that directory (and only them)
# without asking for a password. If you do this, make sure you don't give
# random users write access to that directory or its contents!
# command_prefix=/usr/bin/sudo
# DEBUGGING OPTION
# This option determines whether or not debugging messages are logged to the
# syslog facility.
# Values: 0=debugging off, 1=debugging on
debug=0
# COMMAND TIMEOUT
# This specifies the maximum number of seconds that the NRPE daemon will
# allow plugins to finish executing before killing them off.
command_timeout=60
# WEAK RANDOM SEED OPTION
# This directive allows you to use SSL even if your system does not have
# a /dev/random or /dev/urandom (on purpose or because the necessary patches
# were not applied). The random number generator will be seeded from a file
# which is either a file pointed to by the environment variable $RANDFILE
# or $HOME/.rnd. If neither exists, the pseudo random number generator will
# be initialized and a warning will be issued.
# Values: 0=only seed from /dev/[u]random, 1=also seed from weak randomness
allow_weak_random_seed=1
# INCLUDE CONFIG FILE
# This directive allows you to include definitions from an external config file.
include=/etc/nagios/nrpe_commands.cfg
# INCLUDE CONFIG DIRECTORY
# This directive allows you to include definitions from config files (with a
# .cfg extension) in one or more directories (with recursion).
#include_dir=<somedirectory>
#include_dir=<someotherdirectory>
|
| File: /etc/nagios/nrpe_commands.cfg |
#########################################
#
# NRPE COMMAND DEFINITIONS
#
#########################################
# Command definitions that this daemon will run. Definitions
# are in the following format:
#
# command[<command_name>]=<command_line>
#
# When the daemon receives a request to return the results of <command_name>
# it will execute the command specified by the <command_line> argument.
#
# Unlike Nagios, the command line cannot contain macros - it must be
# typed exactly as it should be executed.
#
# Note: Any plugins that are used in the command lines must reside
# on the machine that this daemon is running on! The examples below
# assume that you have plugins installed in a /usr/local/nagios/libexec
# directory. Also note that you will have to modify the definitions below
# to match the argument format the plugins expect. Remember, these are
# examples only!
# The following examples use hardcoded command arguments...
command[check_users]=/usr/nagios/libexec/check_users -w 5 -c 10
command[check_load]=/usr/nagios/libexec/check_load -w 15,10,5 -c 30,25,20
command[check_root]=/usr/nagios/libexec/check_local_disk -w 20 -c 10
command[check_disk1]=/usr/nagios/libexec/check_disk -w 20 -c 10 -p /dev/hda1
command[check_disk2]=/usr/nagios/libexec/check_disk -w 20 -c 10 -p /dev/hdb1
command[check_zombie_procs]=/usr/nagios/libexec/check_procs -w 5 -c 10 -s Z
command[check_total_procs]=/usr/nagios/libexec/check_procs -w 150 -c 200
command[check_nscd]=/etc/init.d/nscd status
# The following examples allow user-supplied arguments and can
# only be used if the NRPE daemon was compiled with support for
# command arguments *AND* the dont_blame_nrpe directive in this
# config file is set to '1'...
#command[check_users]=/usr/nagios/libexec/check_users -w $ARG1$ -c $ARG2$
#command[check_load]=/usr/nagios/libexec/check_load -w $ARG1$ -c $ARG2$
#command[check_disk]=/usr/nagios/libexec/check_disk -w $ARG1$ -c $ARG2$ -p $ARG3$
#command[check_procs]=/usr/nagios/libexec/check_procs -w $ARG1$ -c $ARG2$ -s $ARG3$
|
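Each active line in nrpe_commands.cfg maps a command name, which the Nagios server sends, to the command line that is executed on the client. To see at a glance which names a client will answer to, a file in this format can be read with a few lines of Python. This is only a sketch: the function name is hypothetical, and the parser ignores include directives and anything else that is not a command[...] line.

```python
import re

# Matches lines of the form: command[<command_name>]=<command_line>
COMMAND_RE = re.compile(r"^command\[([^\]]+)\]=(.+)$")


def parse_nrpe_commands(text):
    """Return a dict that maps NRPE command names to their command lines."""
    commands = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and commented-out definitions
        match = COMMAND_RE.match(line)
        if match:
            commands[match.group(1)] = match.group(2)
    return commands


# Possible use on the client (path as in the box above):
#   with open("/etc/nagios/nrpe_commands.cfg") as handle:
#       for name in sorted(parse_nrpe_commands(handle.read())):
#           print(name)
```

The names printed this way are the ones that can be used after check_nrpe! in a service definition on the server.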
Start the services:
# /etc/init.d/xinetd start
# /etc/init.d/nrpe start
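Once the services are running, a quick way to check from the Nagios server that something is listening on the NRPE port is a plain TCP connect. The Python sketch below does just that; the function name is invented for the example. A successful connect only shows that the port is reachable. It does not prove that NRPE accepts the server (xinetd may still drop the connection because of only_from), so treat it as a first test only.

```python
import socket


def port_is_open(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port can be established."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Connection refused, host unreachable, timeout and similar errors.
        return False


# Example call from the Nagios server, using the client address from the
# hosts.cfg example in this guide:
#   port_is_open("192.168.0.100", 5666)
```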
On the server
Define the check_nrpe command:
| File: commands.cfg |
define command{
command_name check_nrpe
command_line $USER1$/check_nrpe -H $HOSTADDRESS$ -c $ARG1$
}
|
In any service definition that gets its results through the NRPE plugin and daemon, set the check_command to something like this:
define service{
use generic-service
host_name someremotehost
service_description NSCD
check_command check_nrpe!check_nscd
}
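In a check_command, everything after the first ! is passed to the command definition as $ARG1$, $ARG2$ and so on. The following Python sketch mimics that substitution for the service above. It is a simplified illustration, not Nagios code: the function name is made up, and the plugin directory behind $USER1$ and the client address are assumed values.

```python
def expand_check_command(check_command, command_lines, macros):
    """Build the final command line for a service's check_command.

    check_command  e.g. 'check_nrpe!check_nscd' (name!arg1!arg2...)
    command_lines  dict: command_name -> command_line from commands.cfg
    macros         dict with further macros such as '$HOSTADDRESS$'
    """
    name, *args = check_command.split("!")
    line = command_lines[name]
    values = dict(macros)
    for index, arg in enumerate(args, start=1):
        values["$ARG%d$" % index] = arg
    for macro, value in values.items():
        line = line.replace(macro, value)
    return line


if __name__ == "__main__":
    commands = {"check_nrpe": "$USER1$/check_nrpe -H $HOSTADDRESS$ -c $ARG1$"}
    macros = {
        "$USER1$": "/usr/nagios/libexec",   # assumed plugin directory
        "$HOSTADDRESS$": "192.168.0.100",   # assumed client address
    }
    print(expand_check_command("check_nrpe!check_nscd", commands, macros))
```

On the client, NRPE then looks up check_nscd in its own command list and runs the command line it finds there, which is why the name after the ! must match a command[...] entry in nrpe_commands.cfg.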
Using NSClient for monitoring Windows Systems
NOTE: A drop-in replacement for NSClient, called NSClient++ (or NSCP), can be found here. It supports 32- and 64-bit (as well as Itanium) Windows systems. It is highly recommended that this be used over the older NSClient, so please refer to the current NSCP Documentation for configuration.
There are several ways to set up a Windows machine to be monitored by Nagios; the most common are SNMP and NSClient (NetSaint Client). Since I have not used SNMP on a Windows machine, I will show you how to use NSClient instead. NSClient can be obtained from here. It hasn't been updated in a while but still works as needed. I have tested the client on Windows 2000 and Windows XP 32-bit, and I am sure that it will work just fine on Windows 2003 32-bit, although I have yet to try it. I am not sure what would happen if anyone attempted to load it on a 64-bit Windows system; however, the source code is provided with the client, even though it is written in Pascal.
Download the client and extract the files. From the Readme.html:
| File: readme.html |
Installation
On the Windows machine
1. Copy pNSClient.exe, pdh.dll, psapi.dll and counters.defs in any directory on the machine you want to monitor. ie. (c:\nsclient).
2. Open a dos prompt in the installation directory
3. Run the following command : >pNSClient.exe /install
4. Type 'net start nsclient' on the command line or start the service 'Nagios Agent' in the services applet of the control panel.
|
The readme.html is outdated with regard to the checkcommands.cfg entries; add the entries as follows instead:
| File: /etc/nagios/checkcommands.cfg |
# NSClient
define command{
command_name check_nt_disk
command_line $USER1$/check_nt -H $HOSTADDRESS$ -p 1248 -v USEDDISKSPACE -l $ARG1$ -w $ARG2$ -c $ARG3$
}
define command{
command_name check_nt_cpuload
command_line $USER1$/check_nt -H $HOSTADDRESS$ -p 1248 -v CPULOAD -l $ARG1$
}
define command{
command_name check_nt_uptime
command_line $USER1$/check_nt -H $HOSTADDRESS$ -p 1248 -v UPTIME
}
define command{
command_name check_nt_clientversion
command_line $USER1$/check_nt -H $HOSTADDRESS$ -p 1248 -v CLIENTVERSION
}
define command{
command_name check_nt_process
command_line $USER1$/check_nt -H $HOSTADDRESS$ -p 1248 -v PROCSTATE -l $ARG1$
}
define command{
command_name check_nt_service
command_line $USER1$/check_nt -H $HOSTADDRESS$ -p 1248 -v SERVICESTATE -d SHOWALL -l $ARG1$
}
define command{
command_name check_nt_memuse
command_line $USER1$/check_nt -H $HOSTADDRESS$ -p 1248 -v MEMUSE -w $ARG1$ -c $ARG2$
}
define command{
command_name check_nt_pagingfile
command_line $USER1$/check_nt -H $HOSTADDRESS$ -p 1248 -v COUNTER -l "\\Paging File(_Total)\\% Usage","Paging File usage is %.2f %%" -w $ARG1$ -c $ARG2$
}
|
Make sure you add an entry into your hosts.cfg file:
| File: /etc/nagios/hosts.cfg |
define host{
use generic-host
host_name ip.ad.dr.ess ; or add an entry to your /etc/hosts file and use that hostname here
alias workstation NSClient
address ip.ad.dr.ess
check_command check-host-alive
max_check_attempts 10
notification_interval 120
notification_period 24x7
notification_options d,u,r
}
|
Also add the host_name value to the hostgroups.cfg file:
| File: /etc/nagios/hostgroups.cfg |
define hostgroup{
hostgroup_name windows
alias Windows Boxen
contact_groups admins
members ip.ad.dr.ess ; or the hostname if you've added it to the hosts file
}
|
Acknowledgements
This article is loosely based on the German Nagios gentoo-wiki
Gentoo Forum post by dsf for mysql stuff
