Tuesday, September 29, 2009

Domino on IBM i - Omit /TMP from backups

It has been a while since my last post. I have been deep in ND 8.5 upgrade mode and working on some issues.

If you run Domino on IBM i (i5/OS), you need to omit the /TMP folder from online backups NOW! If you back up the IFS while Domino is running, you need to do this or you could run into a major database corruption issue. IBM is currently drafting an SPR for this and will add /TMP to the list of items to omit, which also includes *.id and notes.ini files.
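To make the omit list concrete, here is a rough Python sketch of the rule a backup selection would need to apply: skip /TMP and everything under it, plus any *.id and notes.ini files. This is only an illustration of the logic, not BRMS or SAV syntax. The names OMIT_DIRS, OMIT_NAMES, should_omit and backup_candidates are made up for this example, and it assumes path matching is case-insensitive.

```python
from fnmatch import fnmatch

# Illustrative omit list taken from the post: /TMP plus *.id and notes.ini.
# Assumption: matching is case-insensitive, so everything is lowercased.
OMIT_DIRS = ("/tmp",)
OMIT_NAMES = ("*.id", "notes.ini")


def should_omit(path: str) -> bool:
    """Return True if the path should be left out of an online backup."""
    p = path.lower()
    # Omit the directory itself and anything below it.
    for d in OMIT_DIRS:
        if p == d or p.startswith(d + "/"):
            return True
    # Otherwise check the file name against the omitted name patterns.
    name = p.rsplit("/", 1)[-1]
    return any(fnmatch(name, pattern) for pattern in OMIT_NAMES)


def backup_candidates(paths):
    """Keep only the paths that are not on the omit list."""
    return [p for p in paths if not should_omit(p)]
```

Note that a folder such as /tmpdata is not treated as being under /TMP, since the check requires an exact match or a "/" after the directory name.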

Here is what can happen. Domino runs hourly jobs (tasks) such as Chronos and, in our case, SMDUPD (ScanMail Update). When these jobs run, they look in the /TMP folder to see if Domino is already running. If BRMS has that folder locked, the job will think Domino isn’t running and will start a second LOGASIO job. This is the job that controls transaction logging. Running 2 LOGASIO jobs against the same transaction logs is bad and will end up corrupting databases, most likely the ones that had transactions waiting to be written. The second LOGASIO job tries to perform a recovery since it thinks Domino ended abnormally. So you have one LOGASIO job running normally, and another comes along and tries to replay logs.

Running fixup -F -J on a database will repair it; however, you don’t know which databases are corrupt until some action is performed on the corrupt area of the database. The action could be the router delivering a message with an attachment, designer running on the database, or some other action. When this issue hit us, it would start around 10pm (when backups were running), and we would be up all night and into the next day fixing databases as the server reported them. Auto fixup wouldn’t always catch them.

IBM is also working on changing the behavior so a second LOGASIO job can’t start.

You can see this happening in the OS history job logs. You will see CHRONOS start on an active Domino server, and then immediately afterward a LOGASIO job starts. You may also notice 2 Domino console logs that span the same time range if you have console logging enabled.
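If you want to check for this pattern after the fact, the idea can be sketched in Python: given a list of job start times pulled from the history log, flag any LOGASIO start that follows a CHRONOS start within a short window. This is a sketch under assumptions, not a parser for the real log. The (timestamp, job name) input shape, the function name find_suspect_logasio_starts and the one-minute window are all made up for illustration, and you would have to extract the records yourself.

```python
from datetime import datetime, timedelta


def find_suspect_logasio_starts(events, window=timedelta(minutes=1)):
    """Flag LOGASIO starts that closely follow a CHRONOS start.

    events: iterable of (datetime, job_name) job-start records, in any order.
    Returns a list of (chronos_start, logasio_start) pairs.
    """
    suspects = []
    last_chronos = None
    # Walk the records in time order.
    for ts, job in sorted(events):
        name = job.upper()
        if name == "CHRONOS":
            last_chronos = ts
        elif name == "LOGASIO" and last_chronos is not None:
            # A LOGASIO start right after CHRONOS is the pattern to look at.
            if ts - last_chronos <= window:
                suspects.append((last_chronos, ts))
            last_chronos = None
    return suspects
```

A match is only a hint. A LOGASIO start during a normal server startup could also be flagged, so confirm that the Domino server was already active at that time before treating it as a second LOGASIO job.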