I wanted to check every 6 hours whether our business-critical web server is up.
At first I thought I would just run a ping every 6 hours via ‘cron’. But I want to ping more often once the server is detected down, and keep doing so until it comes back up.
So I came up with this self-registering ‘at’ script. Once the server is detected down, it pings the server at google:“incremental backoff” intervals: it starts at 1 minute, then 2 minutes, then 3, 4, 5… If you replace “+1” with “*2”, it does google:“exponential backoff” instead: 1, 2, 4, 8, 16…
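    $ cat ~/misc/myServerPing.at
    # THISFILE should be a full path or a path relative to $HOME
    # Run this in bash by ". {this file}"
    THISFILE=misc/myServerPing.at
    INTERVAL=1
    # Fetch the page and check that it contains the expected text
    curl --silent --connect-timeout 8 http://ourserver.sun.com | grep "Our critical page" > /dev/null
    if [ $? -ne 0 ]; then
        # Server looks down: send mail, then re-register a copy of this script
        # with INTERVAL increased ([0-9] so that values such as 10 also match)
        date | mailx -s "Failed ping to ourserver" my.mail.address@sun.com
        sed "s/^\(INTERVAL=\)[0-9]*$/\\1$(($INTERVAL+1))/" $THISFILE | at now + $INTERVAL minutes > /dev/null 2>&1
    else
        # Server is up: re-register the unmodified script (INTERVAL=1) in 6 hours
        at now + 360 minutes < $THISFILE > /dev/null 2>&1
    fi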
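To see how the two retry schedules differ, here is a small sketch that only prints the intervals and does not touch ‘at’ or the network. The function name backoff_schedule and its two arguments are made up for this illustration; they are not part of the script above. It assumes the same starting value as INTERVAL=1.

```shell
# Hypothetical helper, not part of the original script: prints the first
# COUNT retry intervals (in minutes) for the given backoff mode.
backoff_schedule() {
    mode=$1      # "incremental" (+1) or "exponential" (*2)
    count=$2     # how many intervals to print
    interval=1   # same starting value as INTERVAL=1 in the script
    out=""
    i=0
    while [ "$i" -lt "$count" ]; do
        # Append the current interval, space-separated
        out="$out${out:+ }$interval"
        if [ "$mode" = "exponential" ]; then
            interval=$((interval * 2))
        else
            interval=$((interval + 1))
        fi
        i=$((i + 1))
    done
    echo "$out"
}

backoff_schedule incremental 5   # prints: 1 2 3 4 5
backoff_schedule exponential 5   # prints: 1 2 4 8 16
```

With “+1” the fifth retry comes 15 minutes after the first failure (1+2+3+4+5); with “*2” it comes 31 minutes after (1+2+4+8+16), so the exponential variant backs off much faster during a long outage.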
2009/11/07 at 00:21 |
Isn’t this the role of a tool like Nagios? Which one is best?
2009/11/07 at 06:40 |
If the server could be down for 6 hours without you hearing about it, it’s not very mission critical. Typically, mission-critical stuff is checked every 1-5 minutes, at least everywhere I have worked.