Tuesday, October 16, 2007

Self-healing Domino

I was on the train yesterday going home when my BlackBerry (BB) vibrated, followed by the dreaded audible alarm. I have my BB set up to vibrate for most messages, but whenever it receives an alert from our GSX Monitor it makes noise. I looked at the alert, and one of our Domino servers wasn't responding. It used to be that this would initiate frantic phone calls to find someone near a computer who would know how to look into the problem with the server and get it back online. Not in today's world.

This particular Domino server is running on one of our iSeries (System i) machines, and I have Rove Mobile's (formerly Idokorro) Mobile SSH on my BB. I fired up a 5250 green screen from my BB and got direct access to the Domino server console. This server also has Auto Recovery enabled, so I could see from the console and the i5/OS jobs that the server had initiated an auto restart and was on its way back up. Within 5 minutes of receiving the server down notification e-mail, I received the server up e-mail. All while sitting on a train with just a BB. I didn't even need the BB, but it provided peace of mind that I could see the server taking care of itself.

The server faulted because of SPR# MIAS6VALFX, which is fixed in 7.0.2 FP2 and 7.0.3. This server is currently running 7.0.2. I'm hoping this was just an isolated incident, but as my earlier post stated, I have started the upgrade process to 7.0.3.

This isn't the first time we have had a Domino server fault, and I'm sure it won't be the last. It's nice to have a system that will attempt to take care of itself in an emergency. Thanks, IBM/Lotus, from all the Domino admins and the users who never know what happened behind the scenes.

4 comments:

Chris Whisonant said...

Good call. The BB is a great system analysis tool. If I woke up in the morning and had no emails, I knew something was wrong... :)

By the way, don't you like how CHGDOMSVR (and CFGDOMSVR too) allows you to specify that the System i Domino server can be enabled for DB2? Is this still in 7.0.3? I understand that the option is also available in 8.0.

Kevin Kanarski said...

I looked on our test system and I can see *DB2ACCESS for Option. It has both 7.0.3 and 8.0.0 installed so it should be using the 8.0 CFGDOMSVR. I did notice this little tidbit in the 7.0.3 release notes for System i...

The following features are not supported in Domino 7

* NSFDB2
* IPV6

Mark said...

Hey Kevin, as I always say, you can also thank transaction logging for the speedy restart. Now users only talk about a server "hiccup" instead of a server outage, which they could just as easily attribute to a "network" hiccup and put it on the network folks for us! ;)

Kevin Kanarski said...

Yes, this server also has trans logging. This saves a huge amount of time on restart, especially on an active mail server. Before trans logging, we used to wait hours for the consistency checks to finish on some of our mail servers.