This morning I went into work to perform a routine installation of patches on both our mail server and file server. Because only a few of the patches required a reboot, I expected it to take about an hour. I started the process at about 10:45am…
All went well for the first set of patches and the mail server rebooted fine. The file server initially rebooted OK but needed a couple of patches applied in single-user mode. Again, they SEEMED to go OK… until the final reboot. The system came back up but couldn’t fsck the UFS filesystem on a zvol device, so I rebooted with the “-r” option to reconfigure devices. Even more errors, and then lots of programs crashed on start. So I rolled back the patches and rebooted.
The system wouldn’t boot at all, other than in failsafe mode. Rebuilding the boot_cache made no difference, but I did notice a fleeting message saying that GRUB couldn’t mount the root partition!
Hmm… So I booted into failsafe mode again and ran fsck. OH-MY-GOD! Huge numbers of duplicate blocks, corrupted directory entries and corrupted directories. After about 30 fsck runs it wasn’t getting any better, so I had to cut my losses and do a full re-install.
Suffice it to say that I eventually got the system back up and running. The zpool import brought back the data filesystems, the UFS filesystem holding the user home directories checked out, and when I NFS-exported them the clients didn’t even have problems with stale file handles.
So, a job which should have taken one hour actually took 11. I’m off to bed in a minute without supper ’cos I’m so tired and, as I didn’t get to Sainsbury’s, I don’t have much to eat in the house anyway.