Solaris Live Upgrade: NOOOOOOOOO! 

I’ve been having a lot of fun with Solaris Live Upgrade at work lately. I’ve discovered a few interesting things that I thought I should share.

Live Upgrade can down your server if you’re not careful

I don’t know why, but I’ve managed to down my server twice in the last week trying to create live upgrade boot environments. One zone lost the ability to see any mounted directories, thereby scaring the crap outta the DBA and requiring a zone reboot to fix. Another abandoned a cpio process that kept copying data to the root file system. While it didn’t cause a crash, it could have broken some processes, since SMF requires free space in /etc to operate correctly (i.e. to save crashed processes).

Live Upgrade lucreate doesn’t fail cleanly

If Live Upgrade’s lucreate fails for any reason, it is very hard to recover. You can’t unconfigure the new boot environment and you can’t delete it. You can only complete it, and that doesn’t work if, say, there isn’t enough physical space or some other hardware problem emerges.

Live Upgrade ludelete doesn’t work most of the time

If you accidentally destroy the metadevices or zfs file systems that live upgrade expects to exist in a boot environment, you cannot delete it. If the boot environment is “incomplete”, you can’t delete it either. You pretty much can’t do anything with ludelete except remove pristine live upgrade environments, which is only about 10% of the boot environments you wanted to delete.

Live Upgrade is iffy at best

So far this week I’ve had live upgrade refuse to patch zones because a single temp file didn’t copy correctly during boot environment creation. I’ve had lucreate mangle zone names and then complain that the mangled name doesn’t exist. If you have a zone that mounts an additional file system, you have to include it in an exclude list file, or live upgrade will try to copy the contents of that file system onto your zone’s root drive.
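Here is a rough sketch of how I’d build that exclude list, one path per line. The function name make_exclude_list, the zone root /zones/db01/root and the mount points /u01 and /u02 are all made up for illustration. As far as I recall, lucreate takes the list with -f, but check lucreate(1M) on your release before trusting that.

```shell
#!/bin/sh
# Sketch only: print one exclude path per line for a zone's extra mounts.
# make_exclude_list is a hypothetical helper, not part of Live Upgrade.
make_exclude_list() {
    # $1 is the zone's root path as seen from the global zone;
    # the remaining arguments are mount points inside the zone.
    zroot=$1
    shift
    for mp in "$@"; do
        printf '%s%s\n' "$zroot" "$mp"
    done
}

# Example use (all paths and the BE name newBE are assumptions):
#   make_exclude_list /zones/db01/root /u01 /u02 > /var/tmp/lu-exclude.txt
#   lucreate -n newBE -f /var/tmp/lu-exclude.txt
```

Whether the paths need the zone root prefix or not is something I’d verify on a test box first, since getting it wrong means the data gets copied anyway.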

How I Live Upgrade

  1. tar up /etc/lu*
  2. ls -al /tmp for each zone including global-zone
  3. create an exclude file listing everything that you don’t want on the zone’s root filesystem
  4. create a new live upgrade boot environment
  5. luupgrade either to a new version of Solaris or to install patches
  6. luactivate your new boot environment
  7. reboot as instructed via init 6
  8. confirm that the correct disk/filesystem is booting
  9. delete /etc/lu*
  10. restore /etc/lu* from the tar file
  11. If at any time something doesn’t work right, blow away all of the live upgrade configuration and restore it from the tar file. Also remove any files under /tmp in the zones that may be associated with live upgrade
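The tar-and-restore part of the list (steps 1, 9 and 10) can be sketched as a couple of shell functions. This is a minimal sketch, not a tested procedure: LU_ROOT, LU_BACKUP, lu_backup and lu_restore are names I made up for the example, and the default backup location is just an assumption.

```shell
#!/bin/sh
# Sketch only: snapshot and restore the Live Upgrade state under /etc/lu*.
# LU_ROOT and LU_BACKUP are hypothetical variables for this example.
# LU_BACKUP should be an absolute path, because the functions cd first.
LU_ROOT="${LU_ROOT:-/}"
LU_BACKUP="${LU_BACKUP:-/var/tmp/lu-state.tar}"

lu_backup() {
    # Step 1: archive etc/lu* relative to LU_ROOT so it unpacks in place.
    ( cd "$LU_ROOT" && tar cf "$LU_BACKUP" etc/lu* )
}

lu_restore() {
    # Steps 9 and 10: delete the current state, then unpack the saved copy.
    ( cd "$LU_ROOT" && rm -rf etc/lu* && tar xf "$LU_BACKUP" )
}
```

Run lu_backup before touching anything and lu_restore after the reboot, or whenever things go sideways. Pointing LU_ROOT at a scratch directory is a cheap way to try it without risking the real /etc.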

This is probably quite bad advice, but I find live upgrade only seems to work the first time you use it. Every subsequent time you get stuck with missing file systems that you removed since the last upgrade, weird file access errors, mangled zone names, file systems filled to 100% with data you didn’t want copied, and leftover processes changing things you probably didn’t want changed.