Monday, December 6, 2010

Linux audit files to see who made changes to a file

This is one of the key questions many new sys admin ask:
How do I audit file events such as read / write etc? How can I use audit to see who changed a file in Linux?
The answer is to use 2.6 kernel’s audit system. Modern Linux kernel (2.6.x) comes with auditd daemon. It’s responsible for writing audit records to the disk. During startup, the rules in /etc/audit.rules are read by this daemon. You can open /etc/audit.rules file and make changes such as setup audit file log location and other option. The default file is good enough to get started with auditd.
In order to use audit facility you need to use following utilities
=> auditctl - a command to assist controlling the kernel’s audit system. You can get status, and add or delete rules into kernel audit system. Setting a watch on a file is accomplished using this command:
=> ausearch - a command that can query the audit daemon logs based for events based on different search criteria.
=> aureport - a tool that produces summary reports of the audit system logs.
Note that following all instructions are tested on CentOS 4.x and Fedora Core and RHEL 4/5 Linux.
Task: install audit package
The audit package contains the user space utilities for storing and searching the audit records generate by the audit subsystem in the Linux 2.6 kernel. CentOS/Red Hat and Fedora core includes audit rpm package. Use yum or up2date command to install package
# yum install audit
or
# up2date install audit
Auto start auditd service on boot
# ntsysv
OR
# chkconfig auditd on
Now start service:
# /etc/init.d/auditd start
How do I set a watch on a file for auditing?
Let us say you would like to audit a /etc/passwd file. You need to type command as follows:
# auditctl -w /etc/passwd -p war -k password-file
Where,
-w /etc/passwd : Insert a watch for the file system object at given path i.e. watch file called /etc/passwd
-p war : Set permissions filter for a file system watch. It can be r for read, w for write, x for execute, a for append.
-k password-file : Set a filter key on a /etc/passwd file (watch). The password-file is a filterkey (string of text that can be up to 31 bytes long). It can uniquely identify the audit records produced by the watch. You need to use password-file string or phrase while searching audit logs.
In short you are monitoring (read as watching) a /etc/passwd file for anyone (including syscall) that may perform a write, append or read operation on a file.
Wait for some time or as a normal user run command as follows:
$ grep 'something' /etc/passwd
$ vi /etc/passwd
Following are more examples:
File System audit rules
Add a watch on "/etc/shadow" with the arbitrary filterkey "shadow-file" that generates records for "reads, writes, executes, and appends" on "shadow"
# auditctl -w /etc/shadow -k shadow-file -p rwxa
syscall audit rule
The next rule suppresses auditing for mount syscall exits
# auditctl -a exit,never -S mount
File system audit rule
Add a watch "tmp" with a NULL filterkey that generates records "executes" on "/tmp" (good for a webserver)
# auditctl -w /tmp -p e -k webserver-watch-tmp
syscall audit rule using pid
To see all syscalls made by a program called sshd (pid - 1005):
# auditctl -a entry,always -S all -F pid=1005
How do I find out who changed or accessed a file /etc/passwd?
Use ausearch command as follows:
# ausearch -f /etc/passwd
OR
# ausearch -f /etc/passwd | less
OR
# ausearch -f /etc/passwd -i | less
Where,
-f /etc/passwd : Only search for this file
-i : Interpret numeric entities into text. For example, uid is converted to account name.
Output:
----
type=PATH msg=audit(03/16/2007 14:52:59.985:55) : name=/etc/passwd flags=follow,open inode=23087346 dev=08:02 mode=file,644 ouid=root ogid=root rdev=00:00
type=CWD msg=audit(03/16/2007 14:52:59.985:55) : cwd=/webroot/home/lighttpd
type=FS_INODE msg=audit(03/16/2007 14:52:59.985:55) : inode=23087346 inode_uid=root inode_gid=root inode_dev=08:02 inode_rdev=00:00
type=FS_WATCH msg=audit(03/16/2007 14:52:59.985:55) : watch_inode=23087346 watch=passwd filterkey=password-file perm=read,write,append perm_mask=read
type=SYSCALL msg=audit(03/16/2007 14:52:59.985:55) : arch=x86_64 syscall=open success=yes exit=3 a0=7fbffffcb4 a1=0 a2=2 a3=6171d0 items=1 pid=12551 auid=unknown(4294967295) uid=lighttpd gid=lighttpd euid=lighttpd suid=lighttpd fsuid=lighttpd egid=lighttpd sgid=lighttpd fsgid=lighttpd comm=grep exe=/bin/grep
Let us try to understand output
audit(03/16/2007 14:52:59.985:55) : Audit log time
uid=lighttpd gid=lighttpd : User ids in numerical format. By passing -i option to command you can convert most of numeric data to human readable format. In our example user is lighttpd used grep command to open a file
exe="/bin/grep" : Command grep used to access /etc/passwd file
perm_mask=read : File was open for read operation
So from log files you can clearly see who read file using grep or made changes to a file using vi/vim text editor. Log provides tons of other information. You need to read man pages and documentation to understand raw log format.
Other useful examples
Search for events with date and time stamps. if the date is omitted, today is assumed. If the time is omitted, now is assumed. Use 24 hour clock time rather than AM or PM to specify time. An example date is 10/24/05. An example of time is 18:00:00.
# ausearch -ts today -k password-file
# ausearch -ts 3/12/07 -k password-file
Search for an event matching the given executable name using -x option. For example find out who has accessed /etc/passwd using rm command:
# ausearch -ts today -k password-file -x rm
# ausearch -ts 3/12/07 -k password-file -x rm
Search for an event with the given user name (UID). For example find out if user vivek (uid 506) try to open /etc/passwd:
# ausearch -ts today -k password-file -x rm -ui 506
# ausearch -k password-file -ui 506

How to keep a detailed audit trail of what’s being done on your Linux systems

Intrusions can take place from both authorized (insiders) and unauthorized (outsiders) users. My personal experience shows that unhappy user can damage the system, especially when they have a shell access. Some users are little smart and removes history file (such as ~/.bash_history) but you can monitor all user executed commands.
It is recommended that you log user activity using process accounting. Process accounting allows you to view every command executed by a user including CPU and memory time. With process accounting sys admin always find out which command executed at what time :)
The psacct package contains several utilities for monitoring process activities, including ac, lastcomm, accton and sa.
The ac command displays statistics about how long users have been logged on.
The lastcomm command displays information about previous executed commands.
The accton command turns process accounting on or off.
The sa command summarizes information about previously executed commmands.
Task: Install psacct or acct package
Use up2date command if you are using RHEL ver 4.0 or less
# up2date psacct
Use yum command if you are using CentOS/Fedora Linux / RHEL 5:
# yum install psacct
Use apt-get command if you are using Ubuntu / Debian Linux:
$ sudo apt-get install acct OR # apt-get install acct
Task: Start psacct/acct service
By default service is started on Ubuntu / Debian Linux by creating /var/account/pacct file. But under Red Hat /Fedora Core/Cent OS you need to start psacct service manually. Type the following two commands to create /var/account/pacct file and start services:
# chkconfig psacct on
# /etc/init.d/psacct start

If you are using Suse Linux, the name of service is acct. Type the following commands:
# chkconfig acct on
# /etc/init.d/acct start
Now let us see how to utilize these utilities to monitor user commands and time.
Task: Display statistics about users' connect time
ac command prints out a report of connect time in hours based on the logins/logouts. A total is also printed out. If you type ac without any argument it will display total connect time:
$ acOutput:
total 95.08
Display totals for each day rather than just one big total at the end:
$ ac -dOutput:
Nov 1 total 8.65
Nov 2 total 5.70
Nov 3 total 13.43
Nov 4 total 6.24
Nov 5 total 10.70
Nov 6 total 6.70
Nov 7 total 10.30
.....
..
...
Nov 12 total 3.42
Nov 13 total 4.55
Today total 0.52
Display time totals for each user in addition to the usual everything-lumped-into-one value:
$ ac -pOutput:
vivek 87.49
root 7.63
total 95.11
Task: find out information about previously executed user commands
Use lastcomm command which print out information about previously executed commands. You can search command using usernames, tty names, or by command names itself.
Display command executed by vivek user:
$ lastcomm vivekOutput:
userhelper S X vivek pts/0 0.00 secs Mon Nov 13 23:58
userhelper S vivek pts/0 0.00 secs Mon Nov 13 23:45
rpmq vivek pts/0 0.01 secs Mon Nov 13 23:45
rpmq vivek pts/0 0.00 secs Mon Nov 13 23:45
rpmq vivek pts/0 0.01 secs Mon Nov 13 23:45
gcc vivek pts/0 0.00 secs Mon Nov 13 23:45
which vivek pts/0 0.00 secs Mon Nov 13 23:44
bash F vivek pts/0 0.00 secs Mon Nov 13 23:44
ls vivek pts/0 0.00 secs Mon Nov 13 23:43
rm vivek pts/0 0.00 secs Mon Nov 13 23:43
vi vivek pts/0 0.00 secs Mon Nov 13 23:43
ping S vivek pts/0 0.00 secs Mon Nov 13 23:42
ping S vivek pts/0 0.00 secs Mon Nov 13 23:42
ping S vivek pts/0 0.00 secs Mon Nov 13 23:42
cat vivek pts/0 0.00 secs Mon Nov 13 23:42
netstat vivek pts/0 0.07 secs Mon Nov 13 23:42
su S vivek pts/0 0.00 secs Mon Nov 13 23:38
For each entry the following information is printed. Take example of first output line:
userhelper S X vivek pts/0 0.00 secs Mon Nov 13 23:58
Where,
userhelper is command name of the process
S and X are flags, as recorded by the system accounting routines. Following is the meaning of each flag:
S -- command executed by super-user
F -- command executed after a fork but without a following exec
D -- command terminated with the generation of a core file
X -- command was terminated with the signal SIGTERM
vivek the name of the user who ran the process
prts/0 terminal name
0.00 secs - time the process exited
Search the accounting logs by command name:
$ lastcomm rm
$ lastcomm passwdOutput:
rm S root pts/0 0.00 secs Tue Nov 14 00:39
rm S root pts/0 0.00 secs Tue Nov 14 00:39
rm S root pts/0 0.00 secs Tue Nov 14 00:38
rm S root pts/0 0.00 secs Tue Nov 14 00:38
rm S root pts/0 0.00 secs Tue Nov 14 00:36
rm S root pts/0 0.00 secs Tue Nov 14 00:36
rm S root pts/0 0.00 secs Tue Nov 14 00:35
rm S root pts/0 0.00 secs Tue Nov 14 00:35
rm vivek pts/0 0.00 secs Tue Nov 14 00:30
rm vivek pts/1 0.00 secs Tue Nov 14 00:30
rm vivek pts/1 0.00 secs Tue Nov 14 00:29
rm vivek pts/1 0.00 secs Tue Nov 14 00:29
Search the accounting logs by terminal name pts/1
$ lastcomm pts/1
Task: summarizes accounting information
Use sa command to print summarizes information about previously executed commands. In addition, it condenses this data into a summary file named savacct which contains the number of times the command was called and the system resources used. The information can also be summarized on a per-user basis; sa will save this iinformation into a file named usracct.
# saOutput:
579 222.81re 0.16cp 7220k
4 0.36re 0.12cp 31156k up2date
8 0.02re 0.02cp 16976k rpmq
8 0.01re 0.01cp 2148k netstat
11 0.04re 0.00cp 8463k grep
18 100.71re 0.00cp 11111k ***other*
8 0.00re 0.00cp 14500k troff
5 12.32re 0.00cp 10696k smtpd
2 8.46re 0.00cp 13510k bash
8 9.52re 0.00cp 1018k less
Take example of first line:
4 0.36re 0.12cp 31156k up2date
Where,
0.36re "real time" in wall clock minutes
0.12cp sum of system and user time in cpu minutes
31156k cpu-time averaged core usage, in 1k units
up2date command name
Display output per-user:
# sa -uOutput:
root 0.00 cpu 595k mem accton
root 0.00 cpu 12488k mem initlog
root 0.00 cpu 12488k mem initlog
root 0.00 cpu 12482k mem touch
root 0.00 cpu 13226k mem psacct
root 0.00 cpu 595k mem consoletype
root 0.00 cpu 13192k mem psacct *
root 0.00 cpu 13226k mem psacct
root 0.00 cpu 12492k mem chkconfig
postfix 0.02 cpu 10696k mem smtpd
vivek 0.00 cpu 19328k mem userhelper
vivek 0.00 cpu 13018k mem id
vivek 0.00 cpu 13460k mem bash *
lighttpd 0.00 cpu 48240k mem php *
Display the number of processes and number of CPU minutes on a per-user basis
# sa -mOutput:
667 231.96re 0.17cp 7471k
root 544 51.61re 0.16cp 7174k
vivek 103 17.43re 0.01cp 8228k
postfix 18 162.92re 0.00cp 7529k
lighttpd 2 0.00re 0.00cp 48536k
Task: Find out who is eating CPU
By looking at re, k, cp/cpu (see above for output explanation) time you can find out suspicious activity or the name of user/command who is eating up all CPU. An increase in CPU/memory usage (command) is indication of problem.
Please note that above commands and packages also available on other UNIX like oses such as Sun Solaris and *BSD oses.

Asp.Net Under Suse Linux Sample Example

/srv/www/htdocs - is your web root.
/usr/share/mono/asp.net/apps – Mono sample applications and starter kits.
/etc/apache2/conf.d - place where Apache is looking for configuration files.
/etc/xsp/2.0/applications-available – Mono configuration files.
First things first – in Windows file explorer select blog engine web project (I renamed it to be131) and drag it into VMware straight to the web root (/srv/www/htdocs). Set write permission on App_Data folder – right click it and set permissions almost like you would in Windows. If you run into problem - let everybody do anything on that folder. This is what sandboxes for :) You might think you are ready to go straight to the browser, but that won’t work (it would be way too easy). You need to tell Apache and Mono about new application they have to take care of.
In Apache, unlike IIS, you don’t have UI to set applications, virtual directories etc. Again, that would be too easy and no fun at all. Instead, you have configuration files and text editor. Let’s go ahead and edit those files. You’ll need to use root (admin) account. Start terminal window and type in “su root”, hit enter and, when prompted, “mono” as a password. Congratulations – you’ve just been promoted to almighty “root”. Now you can run any application from command line as administrator. Start text editor by typing “gedit” followed by file name you want to open. If file does not exist, Linux will create one for you.
>gedit /etc/apache2/conf.d/be131.conf
We need a new configuration file to tell apache where to find our application. We want it to look at physical directory “/srv/www/htdocs/be131” and map it to virtual path “/be131”, so that we can type “http://localhost/be131” to get to the site. Because Apache is going to use Mono module to run application, we need to supply a path to Mono configuration file and tell what .NET version to use (mod-mono-server2 for ASP.NET 2.0).
Copy and paste into the editor:
// /etc/apache2/conf.d/be131.conf
Alias /be131 /srv/www/htdocs/be131
MonoApplicationsConfigFile be131 /etc/xsp/2.0/applications-available/be131.webapp
MonoServerPath be131 "/usr/bin/mod-mono-server2"
MonoSetEnv be131 MONO_IOMAP=all

Allow from all
Order allow,deny
SetHandler mono
MonoSetServerAlias be131

Save file and close editor window. Now create another configuration file that will ask Mono to take care about our application:
>gedit /etc/xsp/2.0/applications-available/be131.webapp
Mono also needs to know physical and virtual path to the application to work properly.
// /etc/xsp/2.0/applications-available/be131.webapp


be131
/be131
/srv/www/htdocs/be131
true

Save. Close. Having fun yet? Almost there, all we need to do is to restart Apache:
> rcapache2 restart
Now you can open browser and navigate to http://localhost/be131 - and you will be greeted with usual “welcome to BlogEngine” page.

Wasn’t all that hard, wasn’t it? Well, to be honest it is because image we used already had all hard work done for us – installing Apache and Mono and configuring them to play together nicely. This is probably the easiest way to get your feet wet with Mono and ASP.NET in general, not just BlogEngine. If you know better way or have some interesting input or run into a problem – don’t hesitate to leave a message.

Using netstat

Just typing netstat should display a long list of information that's usually more than you want to go through at any given time.
The trick to keeping the information useful is knowing what you're looking for and how to tell netstat to only display that information.
1) For example, if you only want to see TCP connections, use netstat --tcp.
This shows a list of TCP connections to and from your machine. The following example shows connections to our machine on ports 993 (imaps), 143 (imap), 110 (pop3), 25 (smtp), and 22 (ssh).It also shows a connection from our machine to a remote machine on port 389 (ldap).
Note: To speed things up you can use the --numeric option to avoid having to do name resolution on addresses and display the IP only.
Code Listing 1:# netstat --tcp
% netstat --tcp --numeric
Active Internet connections (w/o servers)
Proto Recv-Q Send-Q Local Address Foreign Address State
tcp 0 0 192.168.128.152:993 192.168.128.120:3853 ESTABLISHED
tcp 0 0 192.168.128.152:143 192.168.128.194:3076 ESTABLISHED
tcp 0 0 192.168.128.152:45771 192.168.128.34:389 TIME_WAIT
tcp 0 0 192.168.128.152:110 192.168.33.123:3521 TIME_WAIT
tcp 0 0 192.168.128.152:25 192.168.231.27:44221 TIME_WAIT
tcp 0 256 192.168.128.152:22 192.168.128.78:47258 ESTABLISHED
If you want to see what (TCP) ports your machine is listening on, use netstat --tcp --listening.

2). Another useful flag to add to this is --programs which indicates which process is listening on the specified port.
The following example shows a machine listening on ports 80 (www), 443 (https), 22 (ssh), and 25 (smtp);
Code Listing 2: # netstat --tcp --listening --programs
# sudo netstat --tcp --listening --programs

Active Internet connections (only servers)
Proto Recv-Q Send-Q Local Address Foreign Address State PID/Program name
tcp 0 0 *:www *:* LISTEN 28826/apache2
tcp 0 0 *:ssh *:* LISTEN 26604/sshd
tcp 0 0 *:smtp *:* LISTEN 6836/
tcp 0 0 *:https *:* LISTEN 28826/apache2
Note: Using --all displays both connections and listening ports.
3) The next example uses netstat --route to display the routing table. For most people, this will show one IP and and the gateway address but if you have more than one interface or have multiple IPs assigned to an interface, this command can help troubleshoot network routing problems.
Code Listing 3: # netstat --route
% netstat --route
Kernel IP routing table
Destination Gateway Genmask Flags Metric Ref Use Iface
192.168.1.0 0.0.0.0 255.255.255.0 U 0 0 0 eth0
0.0.0.0 192.168.1.1 0.0.0.0 UG 1 0 0 eth0
The last example of netstat uses the --statistics flag to display networking statistics. Using this flag by itself displays all IP, TCP, UDP, and ICMP connection statistics.

4) To just show some basic information. For example purposes, only the output from --raw is displayed here.
Combined with the uptime command, this can be used to get an overview of how much traffic your machine is handling on a daily basis.
Code Listing 4: # netstat --statistics --route
% netstat --statistics --raw
Ip:
620516640 total packets received
0 forwarded
0 incoming packets discarded
615716262 incoming packets delivered
699594782 requests sent out
5 fragments dropped after timeout
3463529 reassemblies required
636730 packets reassembled ok
5 packet reassembles failed
310797 fragments created
// ICMP statistics truncated
Note: For verbosity, the long names for the various flags were given. Most can be abbreviated to avoid excessive typing (e.g. netstat -tn, netstat -tlp, netstat -r, and netstat -sw).

Move linux to another hard drive (dump, restore, backup)

There are several methods to move running Linux to another hard drive at the same server. But I used Unix dump/restore utility to perform this…
First of all it’s necessary to partition new hard drive in the same way as it’s done with old drive (Linux is running at). I usually use ‘fdisk’ utility. Let’s assume that old drive is /dev/hda and new one is /dev/hdb. To view hda’s partition table please run ‘fdisk -l /dev/hda’ which should show something like this:
Disk /dev/hda: 60.0 GB, 60022480896 bytes
255 heads, 63 sectors/track, 7297 cylinders
Units = cylinders of 16065 * 512 = 8225280 bytes
Device Boot Start End Blocks Id System
/dev/hda1 * 1 15 120456 83 Linux
/dev/hda2 16 276 2096482+ 82 Linux swap
/dev/hda3 277 7297 56396182+ 83 Linux
After this run ‘fdisk /dev/hdb’ and make the same partitions at it. Interactive mode of fdisk utility is well documented and is very intuitive, so I don’t think it would be difficult to perform partitioning.
After this is done, we should make new filesystems at partitions we’ve created:
mkfs -t ext3 /dev/hdb1
mkfs -t ext3 /dev/hdb3
mkswap /dev/hdb2
When it’s done it’s NECESSARY to mark newly created filesystems as it’s done with old ones. To check filesystem volume name run command ‘tune2fs -l /dev/hda1 | grep volume’ and etc. You’ll see something like this:
Filesystem volume name: /boot
It means that we should mark new hdb1 with label /boot. It can be done by command:
tune2fs -L “/boot” /dev/hdb1
The same should be performed for all partitions except swap one. In my case I should label hdb3 by command:
tune2fs -L “/” /dev/hdb3
At this point new hard drive preparation is finished and we can proceed with moving Linux to it. Mount new filesystem and change directory to it:
mount /dev/hdb1 /mnt/hdb1
cd /mnt/hdb1
When it’s done we can perform moving by command:
dump -0uan -f – /boot | restore -r -f -
And the same with / partition:
mount /dev/hdb3 /mnt/hdb3
cd /mnt/hdb3
dump -0uan -f / | restore -r -f -
When dump/restore procedures are done we should install boot loader to new HDD. Run ‘grub’ utility and execute in it’s console:
root (hd1, 0)
setup (hd1)
quit
In case everything is done carefully and right (I’ve tested this method by myself) you can boot from new hard drive and have ‘old’ Linux running at new hard drive running.
Good luck!

Linux Remove All Partitions / Data And Create Empty Disk

How do I remove all partitions, data and create clean empty hard disk under Linux operating systems?

The simplest command to remove everything from Linux hard drive is as follows. Please note that this will remove all data so be careful:
# dd if=/dev/zero of=/dev/hdX bs=512 count=1
OR for sata disk, use the following syntax:
# dd if=/dev/zero of=/dev/sdX bs=512 count=1
In this example, empty sata disk /dev/sdb, enter (you must be login as the root user):
# fdisk /dev/sdb
# dd if=/dev/zero of=/dev/sdb bs=512 count=1
# fdisk -l /dev/sdb

Securely Wipe Hard Disk
You can use the shred command to securely remove everything so that no one recover any data:
# shred -n 5 -vz /dev/sda

Friday, December 3, 2010

The Ultimate Wget Download Guide With 15 Awesome Examples

wget utility is the best option to download files from internet. wget can pretty much handle all complex download situations including large file downloads, recursive downloads, non-interactive downloads, multiple file downloads etc.,
In this article let us review how to use wget for various download scenarios using 15 awesome wget examples.
1. Download Single File with wget
The following example downloads a single file from internet and stores in the current directory.
$ wget http://www.openss7.org/repos/tarballs/strx25-0.9.2.1.tar.bz2
While downloading it will show a progress bar with the following information:
%age of download completion (for e.g. 31% as shown below)
Total amount of bytes downloaded so far (for e.g. 1,213,592 bytes as shown below)
Current download speed (for e.g. 68.2K/s as shown below)
Remaining time to download (for e.g. eta 34 seconds as shown below)
Download in progress:
$ wget http://www.openss7.org/repos/tarballs/strx25-0.9.2.1.tar.bz2
Saving to: `strx25-0.9.2.1.tar.bz2.1'

31% [=================> 1,213,592 68.2K/s eta 34s
Download completed:
google_protectAndRun("ads_core.google_render_ad", google_handleError, google_render_ad);$ wget http://www.openss7.org/repos/tarballs/strx25-0.9.2.1.tar.bz2
Saving to: `strx25-0.9.2.1.tar.bz2'

100%[======================>] 3,852,374 76.8K/s in 55s

2009-09-25 11:15:30 (68.7 KB/s) - `strx25-0.9.2.1.tar.bz2' saved [3852374/3852374]
2. Download and Store With a Different File name Using wget -O
By default wget will pick the filename from the last word after last forward slash, which may not be appropriate always.
Wrong: Following example will download and store the file with name: download_script.php?src_id=7701
$ wget http://www.vim.org/scripts/download_script.php?src_id=7701
Even though the downloaded file is in zip format, it will get stored in the file as shown below.
$ ls
download_script.php?src_id=7701
Correct: To correct this issue, we can specify the output file name using the -O option as:
$ wget -O taglist.zip http://www.vim.org/scripts/download_script.php?src_id=7701
3. Specify Download Speed / Download Rate Using wget –limit-rate
While executing the wget, by default it will try to occupy full possible bandwidth. This might not be acceptable when you are downloading huge files on production servers. So, to avoid that we can limit the download speed using the –limit-rate as shown below.
In the following example, the download speed is limited to 200k
$ wget --limit-rate=200k http://www.openss7.org/repos/tarballs/strx25-0.9.2.1.tar.bz2
4. Continue the Incomplete Download Using wget -c
Restart a download which got stopped in the middle using wget -c option as shown below.
$ wget -c http://www.openss7.org/repos/tarballs/strx25-0.9.2.1.tar.bz2
This is very helpful when you have initiated a very big file download which got interrupted in the middle. Instead of starting the whole download again, you can start the download from where it got interrupted using option -c
Note: If a download is stopped in middle, when you restart the download again without the option -c, wget will append .1 to the filename automatically as a file with the previous name already exist. If a file with .1 already exist, it will download the file with .2 at the end.
5. Download in the Background Using wget -b
For a huge download, put the download in background using wget option -b as shown below.
$ wget -b http://www.openss7.org/repos/tarballs/strx25-0.9.2.1.tar.bz2
Continuing in background, pid 1984.
Output will be written to `wget-log'.
It will initiate the download and gives back the shell prompt to you. You can always check the status of the download using tail -f as shown below.
$ tail -f wget-log
Saving to: `strx25-0.9.2.1.tar.bz2.4'

0K .......... .......... .......... .......... .......... 1% 65.5K 57s
50K .......... .......... .......... .......... .......... 2% 85.9K 49s
100K .......... .......... .......... .......... .......... 3% 83.3K 47s
150K .......... .......... .......... .......... .......... 5% 86.6K 45s
200K .......... .......... .......... .......... .......... 6% 33.9K 56s
250K .......... .......... .......... .......... .......... 7% 182M 46s
300K .......... .......... .......... .......... .......... 9% 57.9K 47s
Also, make sure to review our previous multitail article on how to use tail command effectively to view multiple files.
6. Mask User Agent and Display wget like Browser Using wget –user-agent
Some websites can disallow you to download its page by identifying that the user agent is not a browser. So you can mask the user agent by using –user-agent options and show wget like a browser as shown below.
$ wget --user-agent="Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.9.0.3) Gecko/2008092416 Firefox/3.0.3" URL-TO-DOWNLOAD
7. Test Download URL Using wget –spider
When you are going to do scheduled download, you should check whether download will happen fine or not at scheduled time. To do so, copy the line exactly from the schedule, and then add –spider option to check.
$ wget --spider DOWNLOAD-URL
If the URL given is correct, it will say
$ wget --spider download-url
Spider mode enabled. Check if remote file exists.
HTTP request sent, awaiting response... 200 OK
Length: unspecified [text/html]
Remote file exists and could contain further links,
but recursion is disabled -- not retrieving.
This ensures that the downloading will get success at the scheduled time. But when you had give a wrong URL, you will get the following error.
$ wget --spider download-url
Spider mode enabled. Check if remote file exists.
HTTP request sent, awaiting response... 404 Not Found
Remote file does not exist -- broken link!!!
You can use the spider option under following scenarios:
Check before scheduling a download.
Monitoring whether a website is available or not at certain intervals.
Check a list of pages from your bookmark, and find out which pages are still exists.
8. Increase Total Number of Retry Attempts Using wget –tries
If the internet connection has problem, and if the download file is large there is a chance of failures in the download. By default wget retries 20 times to make the download successful.
If needed, you can increase retry attempts using –tries option as shown below.
$ wget --tries=75 DOWNLOAD-URL
9. Download Multiple Files / URLs Using Wget -i
First, store all the download files or URLs in a text file as:
$ cat > download-file-list.txt
URL1
URL2
URL3
URL4
Next, give the download-file-list.txt as argument to wget using -i option as shown below.
$ wget -i download-file-list.txt
10. Download a Full Website Using wget –mirror
Following is the command line which you want to execute when you want to download a full website and made available for local viewing.
$ wget --mirror -p --convert-links -P ./LOCAL-DIR WEBSITE-URL
–mirror : turn on options suitable for mirroring.
-p : download all files that are necessary to properly display a given HTML page.
–convert-links : after the download, convert the links in document for local viewing.
-P ./LOCAL-DIR : save all the files and directories to the specified directory.
11. Reject Certain File Types while Downloading Using wget –reject
You have found a website which is useful, but don’t want to download the images you can specify the following.
$ wget --reject=gif WEBSITE-TO-BE-DOWNLOADED
12. Log messages to a log file instead of stderr Using wget -o
When you wanted the log to be redirected to a log file instead of the terminal.
$ wget -o download.log DOWNLOAD-URL
13. Quit Downloading When it Exceeds Certain Size Using wget -Q
When you want to stop download when it crosses 5 MB you can use the following wget command line.
$ wget -Q5m -i FILE-WHICH-HAS-URLS
Note: This quota will not get effect when you do a download a single URL. That is irrespective of the quota size everything will get downloaded when you specify a single file. This quota is applicable only for recursive downloads.
14. Download Only Certain File Types Using wget -r -A
You can use this under following situations:
Download all images from a website
Download all videos from a website
Download all PDF files from a website
$ wget -r -A.pdf http://url-to-webpage-with-pdfs/
15. FTP Download With wget
You can use wget to perform FTP download as shown below.
Anonymous FTP download using Wget
$ wget ftp-url
FTP download using wget with username and password authentication.
$ wget --ftp-user=USERNAME --ftp-password=PASSWORD DOWNLOAD-URL

Steps to Perform SSH Login Without Password Using ssh-keygen & ssh-copy-id

</b>


You can login to a remote Linux server without entering password in 3 simple steps using ssky-keygen and ssh-copy-id as explained in this article.
ssh-keygen creates the public and private keys. ssh-copy-id copies the local-host’s public key to the remote-host’s authorized_keys file. ssh-copy-id also assigns proper permission to the remote-host’s home, ~/.ssh, and ~/.ssh/authorized_keys.

This article also explains 3 minor annoyances of using ssh-copy-id and how to use ssh-copy-id along with ssh-agent.


Step 1: Create public and private keys using ssh-key-gen on local-host

jsmith@local-host$ [Note: You are on local-host here]

jsmith@local-host$ ssh-keygen
Generating public/private rsa key pair.
Enter file in which to save the key (/home/jsmith/.ssh/id_rsa):[Enter key]
Enter passphrase (empty for no passphrase): [Press enter key]
Enter same passphrase again: [Pess enter key]
Your identification has been saved in /home/jsmith/.ssh/id_rsa.
Your public key has been saved in /home/jsmith/.ssh/id_rsa.pub.
The key fingerprint is:
33:b3:fe:af:95:95:18:11:31:d5:de:96:2f:f2:35:f9 jsmith@local-host

Step 2: Copy the public key to remote-host using ssh-copy-id

jsmith@local-host$ ssh-copy-id -i ~/.ssh/id_rsa.pub remote-host
jsmith@remote-host's password:
Now try logging into the machine, with "ssh 'remote-host'", and check in:

.ssh/authorized_keys

to make sure we haven't added extra keys that you weren't expecting.
Note: ssh-copy-id appends the keys to the remote-host’s .ssh/authorized_key.

Step 3: Login to remote-host without entering the password

jsmith@local-host$ ssh remote-host
Last login: Sun Nov 16 17:22:33 2008 from 192.168.1.2
[Note: SSH did not ask for password.]

jsmith@remote-host$ [Note: You are on remote-host here]

The above 3 simple steps should get the job done in most cases.

We also discussed earlier in detail about performing SSH and SCP from openSSH to openSSH without entering password.

If you are using SSH2, we discussed earlier about performing SSH and SCP without password from SSH2 to SSH2 , from OpenSSH to SSH2 and from SSH2 to OpenSSH.


Using ssh-copy-id along with the ssh-add/ssh-agent

When no value is passed for the option -i and If ~/.ssh/identity.pub is not available, ssh-copy-id will display the following error message.

jsmith@local-host$ ssh-copy-id -i remote-host
/usr/bin/ssh-copy-id: ERROR: No identities found

If you have loaded keys to the
ssh-agent using the ssh-add, then ssh-copy-id will get the keys from the ssh-agent to copy to the remote-host. i.e, it copies the keys provided by ssh-add -L command to the remote-host, when you don’t pass option -i to the ssh-copy-id.

jsmith@local-host$ ssh-agent $SHELL

jsmith@local-host$ ssh-add -L
The agent has no identities.

jsmith@local-host$ ssh-add
Identity added: /home/jsmith/.ssh/id_rsa (/home/jsmith/.ssh/id_rsa)

jsmith@local-host$ ssh-add -L
ssh-rsa AAAAB3NzaC1yc2EAAAABIwAAAQEAsJIEILxftj8aSxMa3d8t6JvM79DyBV
aHrtPhTYpq7kIEMUNzApnyxsHpH1tQ/Ow== /home/jsmith/.ssh/id_rsa

jsmith@local-host$ ssh-copy-id -i remote-host
jsmith@remote-host's password:
Now try logging into the machine, with "ssh 'remote-host'", and check in:

.ssh/authorized_keys

to make sure we haven't added extra keys that you weren't expecting.
[Note: This has added the key displayed by ssh-add -L]

Three Minor Annoyances of ssh-copy-id

Following are few minor annoyances of the ssh-copy-id.

  1. Default public key: ssh-copy-id uses ~/.ssh/identity.pub as the default public key file (i.e when no value is passed to option -i). Instead, I wish it uses id_dsa.pub, or id_rsa.pub, or identity.pub as default keys. i.e If any one of them exist, it should copy that to the remote-host. If two or three of them exist, it should copy identity.pub as default.
  2. The agent has no identities: When the ssh-agent is running and the ssh-add -L returns “The agent has no identities” (i.e no keys are added to the ssh-agent), the ssh-copy-id will still copy the message “The agent has no identities” to the remote-host’s authorized_keys entry.
  3. Duplicate entry in authorized_keys: I wish ssh-copy-id validates duplicate entry on the remote-host’s authorized_keys. If you execute ssh-copy-id multiple times on the local-host, it will keep appending the same key on the remote-host’s authorized_keys file without checking for duplicates. Even with duplicate entries everything works as expected. But, I would like to have my authorized_keys file clutter free.

Process of add new services to Nagios Monitor


Scenario / Question:

I followed the Tutorial on installing and configuring NRPE, but now I want to monitor more services on the remote server.

Solution / Answer:

You can add services by configuring nrpe.cfg on the remote host and adding a new service definition on the Nagios monitoring server.

Download Service Plugin on Remote Host Server

1. Download the plugin to folder /usr/local/nagios/libexec/
2. Change permissions on plugin to nagios
# cd /usr/local/nagios/libexec/
# chown nagios.nagios check_something
# chmod 775 check_something
3. Test the plugin works
# /usr/local/nagios/libexec/check_something

Edit nrpe.cfg File on Remote Host Server

Add a new command definition to the nrpe.cfg file on the remote host
# vi /usr/local/nagios/etc/nrpe.cfg
Add a new check_something command definition. (replace “check_something” with the actual plugin file name). Also you must define any arguments. Do a “check_something -h” for a plugins arguments.
command[check_something]=/usr/local/nagios/libexec/check_something -t 20 -c 30
Because we are running xinetd we do not need to restart the NRPE daemon. Otherwise you would have to restart the NRPE daemon for the changes to take effect.

Add Service Definition to Nagios Monitoring Server

On the monitoring host, you need to define a new service for check_something on the remote host. Add
the following entry to one of your object configuration files (linux-server-remote.cfg)
define service{
use generic-service
host_name remotehost
service_description Check Something
check_command check_nrpe!check_something
}

Next, verify your Nagios configuration files and restart Nagios.
# /usr/local/nagios/bin/nagios -v /usr/local/nagios/etc/nagios.cfg

# service nagios restart

Nagios NRPE APC UPS monitor



Scenario / Question:

How do I monitor APC UPS Battery via apcaccess on linux

Solution / Answer:

Use the check_apcupsd plugin and NRPE

Download check_apcupsd Plugin on Remote Host Server

The original check_apcupsd plugin can be found here.
check_apcupsd
I modified the check_apcupsd plugin to include  the STATUS (Online or Offline) and renamed it to check_apcups
Which can be downloaded from here.
check_apcups
1. Download the plugin to folder /usr/local/nagios/libexec/
2. Change permissions on plugin to nagios
# cd /usr/local/nagios/libexec/
# chown nagios.nagios check_raid
# chmod 775 check_raid
3. Test the plugin works
# /usr/local/nagios/libexec/check_apcups -w 80 -c 60 bcharge

Edit nrpe.cfg File on Remote Host Server

Add a new command definition to the nrpe.cfg file on the remote host
# vi /usr/local/nagios/etc/nrpe.cfg
Add new check_apcups command definitions.
command[check_apcups_bcharge]=/usr/local/nagios/libexec/check_apcups -w 95 -c 50 bcharge
command[check_apcups_itemp]=/usr/local/nagios/libexec/check_apcups -w 35 -c 40 itemp
command[check_apcups_loadpct]=/usr/local/nagios/libexec/check_apcups -w 75 -c 85 loadpct
command[check_apcups_status]=/usr/local/nagios/libexec/check_apcups status
Because we are running xinetd we do not need to restart the NRPE daemon. Otherwise you would have to restart the NRPE daemon for the changes to take effect.

Add Service Definition to Nagios Monitoring Server

On the monitoring host, you need to define a new service for check_something on the remote host. Add
the following entry to one of your object configuration files (linux-server-remote.cfg)

define service {
        use                     generic-service
        host_name               remotehost
        service_description     APC STATUS
        check_command           check_nrpe!check_apcups_status
        }

define service {
        use                     generic-service
        host_name               remotehost
        service_description     APC CHARGE
        check_command           check_nrpe!check_apcups_bcharge
        }

define service {
        use                     generic-service
        host_name               remotehost
        service_description     APC TEMP
        check_command           check_nrpe!check_apcups_itemp
        }

define service {
        use                     generic-service
        host_name               remotehost
        service_description     APC LOAD
        check_command           check_nrpe!check_apcups_loadpct
        }
Next, verify your Nagios configuration files and restart Nagios.
# /usr/local/nagios/bin/nagios -v /usr/local/nagios/etc/nagios.cfg

# service nagios restart

How To Monitor VPN Active Sessions and Temperature Using Nagios

Previously we discussed about how to use Nagios to monitor a Linux and Windows server. In this article, let us review how to monitor active sessions and temperature of VPN device using Nagios. You can monitor pretty much anything about a hardware using the nagios check_snmp plug-in.
1. Identify a cfg file to define host, hostgroup and services for VPN device
You can either create a new vpn.cfg file or re-use one of the existing .cfg file. In this article, I’ve added the VPN service and hostgroup definition to an existing switch.cfg file. Make sure the switch.cfg line in nagios.cfg file is not commented as shown below.
<!-- @page { margin: 0.79in } P { margin-bottom: 0.08in } PRE.cjk { font-family: "DejaVu LGC Sans Mono", monospace } -->
# grep switch.cfg /usr/local/nagios/etc/nagios.cfg
cfg_file=/usr/local/nagios/etc/objects/switch.cfg
2. Add new hostgroup for VPN device in switch.cfg
Add the following ciscovpn hostgroup to the /usr/local/nagios/etc/objects/switch.cfg file.
define hostgroup{
hostgroup_name  ciscovpn
alias           Cisco VPN Concentrator
}
3. Add new host for VPN device in switch.cfg
In this example, I’ve defined two hosts–one for primary and another for secondary Cisco VPN concentrator in the /usr/local/nagios/etc/objects/switch.cfg file. Change the address directive to your VPN device ip-address accordingly.
define host{
use                     generic-host
host_name               cisco-vpn-primary
alias                   Cisco VPN Concentrator Primary
address                 192.168.1.7
check_command           check-host-alive
max_check_attempts      10
notification_interval   120
notification_period     24x7
notification_options    d,r
contact_groups          admins
hostgroups              ciscovpn
}

define host{
use                     generic-host
host_name               cisco-vpn-secondary
alias                   Cisco VPN Concentrator Secondary
address                 192.168.1.9
check_command           check-host-alive
max_check_attempts      10
notification_interval   120
notification_period     24x7
notification_options    d,r
contact_groups          admins
hostgroups              ciscovpn
}
4. Add new services to monitor VPN active sessions and temperature in switch.cfg
Add the “Temperature” service and “Active VPN Sessions” service to the /usr/local/nagios/etc/objects/switch.cfg file.
define service{
use                             generic-service
hostgroup_name                  ciscovpn
service_description             Temperature
is_volatile                     0
check_period                    24x7
max_check_attempts              4
normal_check_interval           10
retry_check_interval            2
contact_groups                  admins
notification_interval           960
notification_period             24x7
check_command                   check_snmp!-l Temperature -o .1.3.6.1.4.1.3076.2.1.2.22.1.29.0,.1.3.6.1.4.1.3076.2.1.2.22.1.33.0 -w 37,:40 -c :40,:45
}

define service{
use                             generic-service
hostgroup_name                  ciscovpn
service_description             Active VPN Sessions
is_volatile                     0
check_period                    24x7
max_check_attempts              4
normal_check_interval           5
retry_check_interval            1
contact_groups                  admins
notification_interval           960
notification_period             24x7
check_command                   check_snmp!-l ActiveSessions -o 1.3.6.1.4.1.3076.2.1.2.17.1.7.0,1.3.6.1.4.1.3076.2.1.2.17.1.9.0 -w :70,:8 -c :75,:10
}
5. Validate the check_snmp from command line
Check_snmp plug-in uses the ‘snmpget’ command from the NET-SNMP package. Make sure the net-snmp is installed on your system as shown below. If not, download it from NET-SNMP website.
# rpm -qa | grep -i net-snmp
net-snmp-libs-5.1.2-11.el4_6.11.2
net-snmp-5.1.2-11.el4_6.11.2
net-snmp-utils-5.1.2-11.EL4.10
Make sure the check_snmp works from command line as shown below.
# /usr/local/nagios/libexec/check_snmp -H 192.168.1.7 \
-P 2c -l Temperature -w :35,:40 -c :40,:45 \
-o .1.3.6.1.4.1.3076.2.1.2.22.1.29.0,.1.3.6.1.4.1.3076.2.1.2.22.1.33.0

Temperature OK - 35 38 | iso.3.6.1.4.1.3076.2.1.2.22.1.29.0=35
                         iso.3.6.1.4.1.3076.2.1.2.22.1.33.0=38

# /usr/local/nagios/libexec/check_snmp -H 192.168.1.7 \
-P 2c -l ActiveSessions -w :80,:40 -c :100,:50 \
-o 1.3.6.1.4.1.3076.2.1.2.17.1.7.0,1.3.6.1.4.1.3076.2.1.2.17.1.9.0

ActiveSessions CRITICAL - *110* 20 | iso.3.6.1.4.1.3076.2.1.2.17.1.7.0=110
                                     iso.3.6.1.4.1.3076.2.1.2.17.1.9.0=20
In this example, following parameters are passed to the check_snmp:
  • -H, –hostname=ADDRESS Host name, IP Address, or unix socket (must be an absolute path)
  • -P, –protocol=[1|2c|3] SNMP protocol version
  • -l, –label=STRING Prefix label for output from plugin. i.e Temerature or ActiveSessions
  • -w, –warning=INTEGER_RANGE(s) Range(s) which will not result in a WARNING status
  • -c, –critical=INTEGER_RANGE(s) Range(s) which will not result in a CRITICAL status
  • -o, –oid=OID(s) Object identifier(s) or SNMP variables whose value you wish to query. Make sure to refer to the manual of your device to see all the supported and available oid’s for your equipment. If you have more than two oid’s, separate them with comma.
In the ActiveSessions example, two OID’s are getting monitored. i.e one for VPN LAN-2-LAN tunnels (iso.3.6.1.4.1.3076.2.1.2.17.1.7.0) and another for PPTP sessions (iso.3.6.1.4.1.3076.2.1.2.17.1.9.0). In the above example, VPN LAN-2-LAN active sessions has exceeded the critical limit of 100.
Object Identifier (OID) is arranged in a hierarchical Management Information Base (MIB) tree with roots and branches based on the internet standard.
6. Validate configuration and restart nagios
Verify the nagios configuration to make sure there are no warnings and errors.
# /usr/local/nagios/bin/nagios -v /usr/local/nagios/etc/nagios.cfg

Total Warnings: 0
Total Errors:   0
Things look okay - No serious problems were detected during the pre-flight check
Restart the nagios server to start monitoring the VPN device.
# /etc/rc.d/init.d/nagios stop
Stopping nagios: .done.

# /etc/rc.d/init.d/nagios start
Starting nagios: done.
Verify the status of the ActiveSessions and Temperature of the VPN device from the Nagios web UI (http://{nagios-server}/nagios) as shown below.

How To Monitor Network Switch and Ports Using Nagios

Nagios is hands-down the best monitoring tool to monitor host and network equipments. Using Nagios plugins you can monitor pretty much monitor anything.

I use Nagios intensively and it gives me peace of mind knowing that I will get an alert on my phone, when there is a problem. More than that, if warning levels are setup properly, Nagios will proactively alert you before a problem becomes critical.

Earlier I wrote about, how to setup Nagios to monitor Linux Host, Windows Host and VPN device.

In this article, I’ll explain how to configure Nagios to monitor network switch and it’s active ports.

1. Enable switch.cfg in nagios.cfg
Uncomment the switch.cfg line in /usr/local/nagios/etc/nagios.cfg as shown below.
<!-- @page { margin: 0.79in } P { margin-bottom: 0.08in } PRE.cjk { font-family: "DejaVu LGC Sans Mono", monospace } -->
[nagios-server]# grep switch.cfg /usr/local/nagios/etc/nagios.cfg
cfg_file=/usr/local/nagios/etc/objects/switch.cfg
2. Add new hostgroup for switches in switch.cfg
Add the following switches hostgroup to the /usr/local/nagios/etc/objects/switch.cfg file.
define hostgroup{
hostgroup_name  switches
alias           Network Switches
}
3. Add a new host for the switch to be monitered
In this example, I’ve defined a host to monitor the core switch in the /usr/local/nagios/etc/objects/switch.cfg file. Change the address directive to your switch ip-address accordingly.
define host{
use             generic-switch
host_name       core-switch
alias           Cisco Core Switch
address         192.168.1.50
hostgroups      switches
}
4. Add common services for all switches
Displaying the uptime of the switch and verifying whether switch is alive are common services for all switches. So, define these services under the switches hostgroup_name as shown below.
# Service definition to ping the switch using check_ping
define service{
use                     generic-service
hostgroup_name          switches
service_description     PING
check_command           check_ping!200.0,20%!600.0,60%
normal_check_interval   5
retry_check_interval    1
}

# Service definition to monitor switch uptime using check_snmp
define service{
use                     generic-service
hostgroup_name          switches
service_description     Uptime
check_command           check_snmp!-C public -o sysUpTime.0
}
5. Add service to monitor port bandwidth usage
check_local_mrtgtraf uses the Multil Router Traffic Grapher – MRTG. So, you need to install MRTG for this to work properly. The *.log file mentioned below should point to the MRTG log file on your system.
define service{
use                             generic-service
host_name                       core-switch
service_description     Port 1 Bandwidth Usage
check_command           check_local_mrtgtraf!/var/lib/mrtg/192.168.1.11_1.log!AVG!1000000,2000000!5000000,5000000!10
}
6. Add service to monitor an active switch port
Use check_snmp to monitor the specific port as shown below. The following two services monitors port#1 and port#5. To add additional ports, change the value ifOperStatus.n accordingly. i.e n defines the port#.
<!--
google_ad_client = "pub-8090601437064582";
/* TGS Inside Content Big */
google_ad_slot = "8006040097";
google_ad_width = 336;
google_ad_height = 280;
//-->google_protectAndRun("ads_core.google_render_ad", google_handleError, google_render_ad);# Monitor status of port number 1 on the Cisco core switch
define service{
use                  generic-service
host_name            core-switch
service_description  Port 1 Link Status
check_command        check_snmp!-C public -o ifOperStatus.1 -r 1 -m RFC1213-MIB
}

# Monitor status of port number 5 on the Cisco core switch
define service{
use                  generic-service
host_name            core-switch
service_description  Port 5 Link Status
check_command          check_snmp!-C public -o ifOperStatus.5 -r 1 -m RFC1213-MIB
}
7. Add services to monitor multiple switch ports together
Sometimes you may need to monitor the status of multiple ports combined together. i.e Nagios should send you an alert, even if one of the port is down. In this case, define the following service to monitor multiple ports.
# Monitor ports 1 - 6 on the Cisco core switch.
define service{
use                   generic-service
host_name             core-switch
service_description   Ports 1-6 Link Status
check_command         check_snmp!-C public -o ifOperStatus.1 -r 1 -m RFC1213-MIB, -o ifOperStatus.2 -r 1 -m RFC1213-MIB, -o ifOperStatus.3 -r 1 -m RFC1213-MIB, -o ifOperStatus.4 -r 1 -m RFC1213-MIB, -o ifOperStatus.5 -r 1 -m RFC1213-MIB, -o ifOperStatus.6 -r 1 -m RFC1213-MIB
}
8. Validate configuration and restart nagios
Verify the nagios configuration to make sure there are no warnings and errors.
# /usr/local/nagios/bin/nagios -v /usr/local/nagios/etc/nagios.cfg

Total Warnings: 0
Total Errors:   0
Things look okay - No serious problems were detected during the pre-flight check
Restart the nagios server to start monitoring the VPN device.
# /etc/rc.d/init.d/nagios stop
Stopping nagios: .done.

# /etc/rc.d/init.d/nagios start
Starting nagios: done.
Verify the status of the switch from the Nagios web UI: http://{nagios-server}/nagios as shown below:

Fig: Nagios GUI displaying status of a Network Switch

9. Troubleshooting
Issue1: Nagios GUI displays “check_mrtgtraf: Unable to open MRTG log file” error message for the Port bandwidth usage
Solution1: make sure the *.log file defined in the check_local_mrtgtraf service is pointing to the correct location.

Issue2: Nagios UI displays “Return code of 127 is out of bounds – plugin may be missing” error message for Port Link Status.

Solution2: Make sure both net-snmp and net-snmp-util packages are installed. In my case, I was missing the net-snmp-utils package and installing it resolved this issue as shown below.
[nagios-server]# rpm -qa | grep net-snmp
net-snmp-libs-5.1.2-11.el4_6.11.2
net-snmp-5.1.2-11.el4_6.11.2

[nagios-server]# rpm -ivh net-snmp-utils-5.1.2-11.EL4.10.i386.rpm
Preparing...       ########################################### [100%]
1:net-snmp-utils   ########################################### [100%]

[nagios-server]# rpm -qa | grep net-snmp
net-snmp-libs-5.1.2-11.el4_6.11.2
net-snmp-5.1.2-11.el4_6.11.2
net-snmp-utils-5.1.2-11.EL4.10

Nagios NRPE Software RAID Monitor



Scenario / Question:

How do I monitor linux software raid 1 array status on a remote linux server

Solution / Answer:

Use the check_raid plugin and NRPE
Download check_raid Plugin on Remote Host Server
Downlaod check_lsi_megaraid plugin to /usr/local/nagios/libexec/ from https://www.monitoringexchange.org/inventory/Check-Plugins/Hardware/Devices/RAID-Controller/check_raid
check_raid
1. Download the plugin to folder /usr/local/nagios/libexec/
2. Change permissions on plugin to nagios
# cd /usr/local/nagios/libexec/
# chown nagios.nagios check_raid
# chmod 775 check_raid
3. Test the plugin works
# /usr/local/nagios/libexec/check_raid
Add sudo alias for Nagios user
check_raid is a command that needs to be executed by root user. We need to create a sudo alias so that nagios user can execute check_raid with root privileges.
# visudo
Add the following:
# Allow nagios to run certain plugins as root
  nagios  ALL=(ALL) NOPASSWD: /usr/local/nagios/libexec/check_raid
Uncomment the following:
#Defaults requiretty
Edit nrpe.cfg File on Remote Host Server
Add a new command definition to the nrpe.cfg file on the remote host
# vi /usr/local/nagios/etc/nrpe.cfg
Add a new check_raid command definition.
command[check_raid]=/usr/bin/sudo /usr/local/nagios/libexec/check_raid
Because we are running xinetd we do not need to restart the NRPE daemon. Otherwise you would have to restart the NRPE daemon for the changes to take effect.
Add Service Definition to Nagios Monitoring Server
On the monitoring host, you need to define a new service for check_something on the remote host. Add
the following entry to one of your object configuration files (linux-server-remote.cfg)
define service{
use generic-service
host_name remotehost
service_description RAID STATUS
check_command check_nrpe!check_raid
}

Next, verify your Nagios configuration files and restart Nagios.
# /usr/local/nagios/bin/nagios -v /usr/local/nagios/etc/nagios.cfg

# service nagios restart