Thursday, May 28, 2009

Smoke and Mirrors - Veritas Disaster Recovery: Part 1: An unbootable root disk

I know my Data is there! I just mirrored it! At least it said it did?

Sometimes, for whatever reason, Veritas Volume Manager will mirror successfully but the mirror disk won't boot.
Perhaps you forgot to run vxbootsetup, or it just didn't work correctly.

What if your primary root mirror has now died and you have to get the data off the alternate disk, but it won't boot?
Using some spare hardware, here's what I went through.

I moved the un-bootable (Solaris 8) SCSI disk (mirror) from my Sunfire 4800 (D240 tray) into a spare Sunfire V240 system in my lab. It has 4 SCSI disk slots so it's a perfect recovery platform in this case.
We installed Veritas Volume Manager 4.1 on Solaris 10 and encapsulated our root drive.
Here's the gotcha: to keep the new rootdg (the Veritas root disk group) separate from the disk group on the disk we are recovering, we changed the default name from rootdg to spamdg.

Once our Solaris 10 system disk was encapsulated and booted, we could plug our bad mirror into a spare disk slot, plug a new disk into another spare slot, and ufsdump the data to the new disk.
Here are the main steps.
  • The V240 running Solaris 10 was installed with Veritas VxVM 4.1
  • The root disk (disk0 slot) was encapsulated with rootdg renamed to spamdg (call it what you want, just NOT rootdg or bootdg)
  • The recovery source disk was installed in slot 3
  • A new target disk, the same size or larger, was put in another spare slot
  • The system was booted and rootdg was imported
  • The root (/), var and opt filesystems were mounted
  • New slices were created on the target disk
  • The target disk was newfs'ed and mounted
  • ufsdump was used to copy the data to the new disk
  • The new disk was manually un-encapsulated
  • The new disk was returned to the original Sunfire 4800 system and is bootable
  • The new disk was remirrored for redundancy
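
The copy steps in this list (newfs, mount, ufsdump) might look roughly like the following. This is only a sketch that prints the commands instead of running them. The target slice c1t2d0s0 and the mount point /target are hypothetical names, and the slices are assumed to have been created with format beforehand. It is safest to dump the source volume while it is unmounted or mounted read-only.

```shell
# Print the commands for copying one recovered volume to a new slice.
# Nothing is executed; review the output and run the lines by hand.
# TARGET_SLICE and TARGET_MNT are hypothetical, adjust them to your disk.
DG=${DG:-rootdg}
TARGET_SLICE=${TARGET_SLICE:-c1t2d0s0}
TARGET_MNT=${TARGET_MNT:-/target}

copy_plan() {
    vol=$1
    echo "newfs /dev/rdsk/$TARGET_SLICE"
    echo "mount /dev/dsk/$TARGET_SLICE $TARGET_MNT"
    # Level 0 dump of the VxVM volume, restored into the new filesystem
    echo "ufsdump 0f - /dev/vx/rdsk/$DG/$vol | (cd $TARGET_MNT && ufsrestore rf -)"
}

copy_plan rootvol
```

Repeat with a different TARGET_SLICE and volume name for var and opt.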
Recovery platform:
  • I used a spare Sunfire V240 which runs Solaris 10 (could have been Solaris 8 or 9).
  • It has 4 disk slots, numbered disk 0, 1, 2 and 3. I am using only disk0 as the boot disk.
  • I have the install CDROM for Veritas Storage Foundation 4.1

Detailed steps

1. Remove the un-bootable Solaris 8 disk from the F4800 system's D240 media tray.

2. Install Veritas Volume Manager (VxVM) on the V240 using the Veritas Storage Foundation install CD.

You will not be able to read the data on the disk unless you are on a system running Veritas Volume Manager.

3. Encapsulate the root disk using the vxdiskadm menu. When asked to create the rootdg (root disk group), change the default name to prevent confusion with the rootdg from your recovery source disk.

4. Plug the Solaris 8 disk (recovery source disk) into slot 3 of the V240.

5. If you have hot-plugged the disk, run this command to get Solaris and VxVM to see the disk.

# vxdiskconfig

VxVM INFO V-5-2-1401 This command may take a few minutes to complete execution
Executing Solaris command: devfsadm (part 1 of 2) at 11:14:15 EDT
Executing VxVM command: vxdctl enable (part 2 of 2) at 11:14:29 EDT
Command completed at 11:15:08 EDT

6. Import the rootdg of this disk
# vxdg -Cf import rootdg
VxVM vxdg WARNING V-5-1-560 Disk rootdisk1: Not found, last known location: c0t0d0s2

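The warning appears because the disk had formerly been one of a mirrored pair and its partner (rootdisk1) is not present. The -C flag clears the import locks left by the original host, and -f forces the import even though a disk of the group is missing.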

7. Do vxdisk list and vxprint to see the disks and volumes

# vxdisk list

DEVICE       TYPE            DISK         GROUP        STATUS
c1t0d0s2     auto:sliced     spamdg01     spamdg       online
c1t3d0s2     auto:sliced     rootdisk3    rootdg       online
-            -               rootdisk1    rootdg       failed was:c0t0d0s2


# vxprint -g rootdg

Disk group: rootdg     <<<<<<<~~~~~~~~~~~~here it is ~~~~~~~~
TY NAME ASSOC KSTATE LENGTH PLOFFS STATE TUTIL0 PUTIL0
dg rootdg rootdg - - - - - -
dm rootdisk1 - - - - NODEVICE - -
dm rootdisk3 c1t3d0s2 - 35363560 - - - -
sd rootdisk1Priv - DISABLED 4711 - NODEVICE - PRIVATE
v opt gen DISABLED 8528720 - ACTIVE - -
pl opt-01 opt DISABLED 8528720 - NODEVICE - -
sd rootdisk1-03 opt-01 DISABLED 8528720 0 NODEVICE - -
pl opt-02 opt DISABLED 8528720 - ACTIVE - -
sd rootdisk3-01 opt-02 ENABLED 8528720 0 - - -
v rootvol root DISABLED 14338616 - ACTIVE - -
pl rootvol-01 rootvol DISABLED 14338616 - NODEVICE - -
sd rootdisk1-B0 rootvol-01 DISABLED 1 0 NODEVICE - Block0
sd rootdisk1-02 rootvol-01 DISABLED 14338615 1 NODEVICE - -
pl rootvol-02 rootvol DISABLED 14338616 - ACTIVE - -
sd rootdisk3-03 rootvol-02 ENABLED 14338616 0 - - -
v swapvol swap DISABLED 4094728 - ACTIVE - -
pl swapvol-01 swapvol DISABLED 4094728 - NODEVICE - -
sd rootdisk1-01 swapvol-01 DISABLED 4094728 0 NODEVICE - -
pl swapvol-02 swapvol DISABLED 4094728 - ACTIVE - -
sd rootdisk3-04 swapvol-02 ENABLED 4094728 0 - - -
v var gen DISABLED 8194168 - ACTIVE - -
pl var-01 var DISABLED 8194168 - NODEVICE - -
sd rootdisk1-04 var-01 DISABLED 8194168 0 NODEVICE - -
pl var-02 var DISABLED 8194168 - ACTIVE - -
sd rootdisk3-05 var-02 ENABLED 8194168 0 - - -
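
With output this long it can help to pull out just the parts that matter: which device holds the imported group, and which plexes still have a disk behind them. The following is a minimal sketch using awk on the text of the two commands. The function names are my own, and the column positions assume the default vxdisk list and vxprint layouts shown above.

```shell
# Summarise `vxdisk list` and `vxprint -g <dg>` text with awk.
# Both functions read stdin, e.g.  vxprint -g rootdg | usable_plexes

# Devices that are present and belong to the given disk group
# (column 1 is DEVICE, column 4 is GROUP; missing disks show "-").
group_devices() {
    awk '$4 == "'"$1"'" && $1 != "-" { print $1 }'
}

# "volume plex" pairs whose plex STATE (column 7) is not NODEVICE
usable_plexes() {
    awk '$1 == "pl" && $7 != "NODEVICE" { print $3, $2 }'
}

# Sample lines copied from the output above
sample_vxdisk() {
cat <<'EOF'
DEVICE TYPE DISK GROUP STATUS
c1t0d0s2 auto:sliced spamdg01 spamdg online
c1t3d0s2 auto:sliced rootdisk3 rootdg online
- - rootdisk1 rootdg failed was:c0t0d0s2
EOF
}

sample_vxprint() {
cat <<'EOF'
pl opt-01 opt DISABLED 8528720 - NODEVICE - -
pl opt-02 opt DISABLED 8528720 - ACTIVE - -
pl rootvol-01 rootvol DISABLED 14338616 - NODEVICE - -
pl rootvol-02 rootvol DISABLED 14338616 - ACTIVE - -
EOF
}

sample_vxdisk | group_devices rootdg
sample_vxprint | usable_plexes
```

On the sample lines this prints c1t3d0s2, followed by opt opt-02 and rootvol rootvol-02, which are the plexes on the surviving disk.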


8. Start the volumes so you can mount them (or reboot, after which the volumes are enabled).

# vxvol -g rootdg start rootvol
# vxvol -g rootdg start swapvol
# vxvol -g rootdg start var
# vxvol -g rootdg start opt
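
The four commands above can also be written as a loop. This is a small sketch, not something from the original session: the DRY_RUN switch and the run helper are my own additions, and by default the script only prints what it would do.

```shell
# Start each volume of the imported disk group in turn.
# DRY_RUN=1 (the default here) only prints the commands.
DG=${DG:-rootdg}
VOLUMES=${VOLUMES:-"rootvol swapvol var opt"}
DRY_RUN=${DRY_RUN:-1}

run() {
    if [ "$DRY_RUN" = 1 ]; then
        echo "$*"
    else
        "$@"
    fi
}

start_volumes() {
    for vol in $VOLUMES; do
        # Report a failure but carry on with the remaining volumes
        run vxvol -g "$DG" start "$vol" || echo "failed to start $vol" >&2
    done
}

start_volumes
```

Set DRY_RUN=0 to actually run the commands. vxvol also accepts a startall keyword (vxvol -g rootdg startall), which may be simpler if you want every volume in the group started.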

9. Mount the volumes so you can now get at your data

# mount /dev/vx/dsk/rootdg/rootvol /mnt

# cd /mnt

# ls -l
You should now see your root disk data.
You can do the same for var and opt. Now back it up or begin rebuilding a new boot drive.
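
If you only want to copy data off, mounting all three filesystems read-only keeps the damaged disk from being modified. A sketch under my own assumptions: the /recover mount point is hypothetical, the filesystems are assumed to be UFS, and a filesystem that was not cleanly unmounted may need an fsck before it will mount. As written, the script prints the commands instead of running them.

```shell
# Print (or run) read-only mounts for the recovered volumes.
DG=${DG:-rootdg}
BASE=${BASE:-/recover}      # hypothetical mount point root
DRY_RUN=${DRY_RUN:-1}       # 1 = print only, 0 = execute

run() {
    if [ "$DRY_RUN" = 1 ]; then
        echo "$*"
    else
        "$@"
    fi
}

mount_recovered() {
    # swapvol holds no filesystem, so it is skipped
    for vol in rootvol var opt; do
        dir=$BASE/$vol
        run mkdir -p "$dir"
        run mount -F ufs -o ro "/dev/vx/dsk/$DG/$vol" "$dir"
    done
}

mount_recovered
```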


NEXT:
Veritas Disaster Recovery: Part 2: rebuilding the boot disk .... Stay tuned

1 comment:

  1. I also repeated this process with a recovery machine running Solaris 8 and Veritas 4.1.
    I still promise to Do part 2, I've just been busy lately.
