Mirroring the operating system
In the steps below, I'm using DiskSuite to mirror the active root disk (c0t0d0) to a second disk (c0t1d0). I'm assuming that partitions five and six of each disk have a couple of cylinders free for DiskSuite's state database replicas.
Introduction
First, we start with a filesystem layout that looks as follows:
# df -k
Filesystem kbytes used avail capacity Mounted on
/dev/dsk/c0t0d0s0 6607349 826881 5714395 13% /
/proc 0 0 0 0% /proc
fd 0 0 0 0% /dev/fd
mnttab 0 0 0 0% /etc/mnttab
/dev/dsk/c0t0d0s4 1016863 8106 947746 1% /var
swap 1443064 8 1443056 1% /var/run
swap 1443080 24 1443056 1% /tmp
We're going to be mirroring from c0t0d0 to c0t1d0. When the operating system was installed, we created unassigned slices five, six, and seven of roughly 10 MB each. We will use slices five and six for the DiskSuite state database replicas. The output from the "format" command is as follows:
# format
Searching for disks...done
AVAILABLE DISK SELECTIONS:
0. c0t0d0 <SEAGATE-ST19171W-0024 cyl 5266 alt 2 hd 20 sec 168>
/pci@1f,4000/scsi@3/sd@0,0
1. c0t1d0 <SEAGATE-ST19171W-0024 cyl 5266 alt 2 hd 20 sec 168>
/pci@1f,4000/scsi@3/sd@1,0
Specify disk (enter its number): 0
selecting c0t0d0
[disk formatted]
...
partition> p
Current partition table (original):
Total disk cylinders available: 5266 + 2 (reserved cylinders)
Part Tag Flag Cylinders Size Blocks
0 root wm 0 - 3994 6.40GB (3995/0/0) 13423200
1 swap wu 3995 - 4619 1.00GB (625/0/0) 2100000
2 backup wm 0 - 5265 8.44GB (5266/0/0) 17693760
3 unassigned wu 0 0 (0/0/0) 0
4 var wm 4620 - 5244 1.00GB (625/0/0) 2100000
5 unassigned wm 5245 - 5251 11.48MB (7/0/0) 23520
6 unassigned wm 5252 - 5258 11.48MB (7/0/0) 23520
7 unassigned wm 5259 - 5265 11.48MB (7/0/0) 23520
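The sizes in the partition table follow directly from the disk geometry shown by format (20 heads, 168 sectors per track). The following is only a small sketch of that arithmetic; the function names are made up for illustration and 512-byte blocks are assumed:

```shell
#!/bin/sh
# Blocks in a slice = cylinders x heads x sectors-per-track.
# $1 = cylinders, $2 = heads, $3 = sectors per track
slice_blocks() {
    echo $(( $1 * $2 * $3 ))
}

# Convert a count of 512-byte blocks to kilobytes (assumes 512-byte blocks).
blocks_to_kb() {
    echo $(( $1 * 512 / 1024 ))
}

slice_blocks 7 20 168     # slices 5, 6 and 7: prints 23520
blocks_to_kb 23520        # prints 11760 (about 11.48 MB)
```

The same arithmetic gives 13423200 blocks for the 3995-cylinder root slice, which matches the Blocks column above.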
DiskSuite Mirroring
Note that much of the process of mirroring the root disk has been automated with the sdsinstall script. With the exception of the creation of device aliases, all of the work done in the following steps can be achieved via the following:
# ./sdsinstall -p c0t0d0 -s c0t1d0 -m s5 -m s6
Ensure that the partition tables of both disks are identical:
# prtvtoc /dev/rdsk/c0t0d0s2 | fmthard -s - /dev/rdsk/c0t1d0s2
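To confirm that the copy worked, one option is to compare the slice entries of both disks. The sketch below is not part of the original procedure: the helper name vtoc_slices and the /tmp file names are hypothetical. It relies on prtvtoc prefixing its header lines with "*"; those lines include the device name and so always differ between the two disks.

```shell
#!/bin/sh
# Keep only the slice entries of prtvtoc output read from stdin:
# drop the "*" comment lines and normalise the column spacing.
vtoc_slices() {
    grep -v '^\*' | awk '{ $1 = $1; print }'
}

# Intended use on the Solaris host (left commented out here):
#   prtvtoc /dev/rdsk/c0t0d0s2 | vtoc_slices > /tmp/vtoc.primary
#   prtvtoc /dev/rdsk/c0t1d0s2 | vtoc_slices > /tmp/vtoc.mirror
#   cmp /tmp/vtoc.primary /tmp/vtoc.mirror && echo "partition tables match"
```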
Add the state database replicas. For redundancy, each disk has two state database replicas.
# metadb -a -f c0t0d0s5
# metadb -a c0t0d0s6
# metadb -a c0t1d0s5
# metadb -a c0t1d0s6
Note that there appears to be a lot of confusion regarding the recommended number and location of state database replicas. According to the DiskSuite reference manual:
State database replicas contain configuration and status information of all metadevices and hot spares. Multiple copies (replicas) are maintained to provide redundancy. Multiple copies also prevent the database from being corrupted during a system crash (at most, only one copy of the database will be corrupted).
State database replicas are also used for mirror resync regions. Too few state database replicas relative to the number of mirrors may cause replica I/O to impact mirror performance.
At least three replicas are recommended. DiskSuite allows a maximum of 50 replicas. The following guidelines are recommended:
For a system with only a single drive: put all 3 replicas in one slice.
For a system with two to four drives: put two replicas on each drive.
For a system with five or more drives: put one replica on each drive.
In general, it is best to distribute state database replicas across slices, drives, and controllers, to avoid single points-of-failure.
Each state database replica occupies 517 KB (1034 disk sectors) of disk storage by default. Replicas can be stored on: a dedicated disk partition, a partition which will be part of a metadevice, or a partition which will be part of a logging device.
Note - Replicas cannot be stored on the root (/), swap, or /usr slices, or on slices containing existing file systems or data.
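The replica-count guideline quoted above can be written down as a small lookup. This is only a sketch of the quoted thresholds; the function name replicas_per_drive is hypothetical and is not a DiskSuite command:

```shell
#!/bin/sh
# Suggested number of state database replicas per drive,
# following the guideline quoted from the reference manual.
# $1 = number of drives in the system
replicas_per_drive() {
    drives=$1
    if [ "$drives" -le 1 ]; then
        echo 3      # single drive: all three replicas in one slice
    elif [ "$drives" -le 4 ]; then
        echo 2      # two to four drives: two replicas on each
    else
        echo 1      # five or more drives: one replica on each
    fi
}

replicas_per_drive 2    # prints 2
```

For the two-disk setup in this article, that works out to the four replicas created with metadb above.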
Starting with DiskSuite 4.2.1, an optional /etc/system parameter exists which allows DiskSuite to boot with just 50% of the state database replicas online. For example, if one of the two boot disks were to fail, just two of the four state database replicas would be available. Without this /etc/system parameter (or with older versions of DiskSuite), the system would complain of "insufficient state database replicas", and manual intervention would be required on bootup. To enable the "50% boot" behaviour with DiskSuite 4.2.1, execute the following command:
# echo "set md:mirrored_root_flag=1" >> /etc/system
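The difference the flag makes is easiest to see as arithmetic. The sketch below simply encodes the behaviour described above (more than half of the replicas needed without the flag, exactly half being enough with it). The function name can_boot is made up for illustration, and the real quorum logic lives inside DiskSuite:

```shell
#!/bin/sh
# Succeed if enough replicas are online for an unattended boot,
# under the rule described in the text (an assumption, not DiskSuite code).
# $1 = total replicas, $2 = replicas online, $3 = mirrored_root_flag (0 or 1)
can_boot() {
    total=$1 online=$2 flag=${3:-0}
    if [ "$flag" -eq 1 ]; then
        [ $(( online * 2 )) -ge "$total" ]   # 50% is enough
    else
        [ $(( online * 2 )) -gt "$total" ]   # strict majority needed
    fi
}

# One of two boot disks lost: two of four replicas remain.
can_boot 4 2 0 || echo "without the flag: manual intervention needed"
can_boot 4 2 1 && echo "with the flag: boots unattended"
```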
Define the metadevices for c0t0d0s0 (/):
# metainit -f d10 1 1 c0t0d0s0
# metainit -f d20 1 1 c0t1d0s0
# metainit d0 -m d10
The metaroot command edits the /etc/vfstab and /etc/system files:
# metaroot d0
Define the metadevices for c0t0d0s1 (swap):
# metainit -f d11 1 1 c0t0d0s1
# metainit -f d21 1 1 c0t1d0s1
# metainit d1 -m d11
Define the metadevices for c0t0d0s4 (/var):
# metainit -f d14 1 1 c0t0d0s4
# metainit -f d24 1 1 c0t1d0s4
# metainit d4 -m d14
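The three groups of metainit commands above follow one naming pattern: d1N for the submirror on the primary disk, d2N for the submirror on the second disk, and dN for the mirror of slice N. As an illustration only, the hypothetical helper below prints those commands for a list of slice numbers without running anything. The metaroot step for the root filesystem is still done separately.

```shell
#!/bin/sh
# Print (do not run) the metainit commands for each slice number.
# $1 = primary disk, $2 = mirror disk, remaining args = slice numbers
gen_metainit() {
    primary=$1 mirror=$2
    shift 2
    for s in "$@"; do
        echo "metainit -f d1$s 1 1 ${primary}s$s"
        echo "metainit -f d2$s 1 1 ${mirror}s$s"
        echo "metainit d$s -m d1$s"
    done
}

gen_metainit c0t0d0 c0t1d0 0 1 4
```

Printing the commands first makes it easy to review them before pasting them into a root shell.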
Edit /etc/vfstab so that it references the DiskSuite metadevices instead of simple slices:
#device device mount FS fsck mount mount
#to mount to fsck point type pass at boot options
#
fd - /dev/fd fd - no -
/proc - /proc proc - no -
/dev/md/dsk/d1 - - swap - no -
/dev/md/dsk/d0 /dev/md/rdsk/d0 / ufs 1 no logging
/dev/md/dsk/d4 /dev/md/rdsk/d4 /var ufs 1 no logging
swap - /tmp tmpfs - yes -
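If you would rather not edit the file by hand, the device substitution can be sketched with sed. This assumes the dN naming used in this article (slice N becomes metadevice dN). The function name vfstab_to_md and the /tmp file name are hypothetical, and the sketch does not add the "logging" mount option:

```shell
#!/bin/sh
# Rewrite slice references for one disk into metadevice references.
# Reads vfstab text on stdin, writes the result to stdout.
# $1 = boot disk name, e.g. c0t0d0
vfstab_to_md() {
    sed -e "s|/dev/dsk/${1}s\([0-9]\)|/dev/md/dsk/d\1|g" \
        -e "s|/dev/rdsk/${1}s\([0-9]\)|/dev/md/rdsk/d\1|g"
}

# Intended use (left commented out; review the diff before installing):
#   vfstab_to_md c0t0d0 < /etc/vfstab > /tmp/vfstab.new
#   diff /etc/vfstab /tmp/vfstab.new
```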
Reboot the system:
# lockfs -fa
# sync;sync;sync;init 6
After the system reboots from the metadevices for /, /var, and swap, attach the second submirror to each mirror:
# metattach d0 d20
# metattach d1 d21
# metattach d4 d24
The process of synchronizing the data to the mirror disk will take a while. You can monitor its progress via the command:
# metastat|grep -i progress
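To wait for the resync to finish instead of checking by hand, the same grep can be wrapped in a polling loop. This is a minimal sketch: the function names and the 60-second default interval are arbitrary choices, and it looks for the word "progress" in the metastat output exactly as the command above does.

```shell
#!/bin/sh
# Succeed if the metastat output on stdin mentions a resync in progress.
# Output is discarded with a redirect because older greps lack -q.
resync_in_progress() {
    grep -i progress >/dev/null
}

# Poll metastat until no resync is reported.
# $1 = seconds between polls (default 60)
wait_for_resync() {
    while metastat | resync_in_progress; do
        sleep "${1:-60}"
    done
}

# Intended use on the Solaris host (left commented out here):
#   wait_for_resync 120 && echo "all mirrors are in sync"
```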
Capture the DiskSuite configuration in the text file md.tab. With Solaris 2.6 and Solaris 7, this text file resides in the directory /etc/opt/SUNWmd; however, more recent versions of Solaris place the file in the /etc/lvm directory. We'll assume that we're running Solaris 8 here:
# metastat -p | tee /etc/lvm/md.tab
In order for the system to be able to dump core in the event of a panic, the dump device needs to reference the DiskSuite metadevice:
# dumpadm -d /dev/md/dsk/d1
If the primary boot disk should fail, make it easy to boot from the mirror. Some sites choose to alter the OBP "boot-device" variable; in this case, we choose to simply define the device aliases "sds-root" and "sds-mirror". In the event that the primary boot device ("disk" or "sds-root") should fail, the administrator simply needs to type "boot sds-mirror" at the OBP prompt.
Determine the device path to the boot devices for both the primary and mirror:
# ls -l /dev/dsk/c0t0d0s0 /dev/dsk/c0t1d0s0
lrwxrwxrwx 1 root root 41 Oct 17 11:48 /dev/dsk/c0t0d0s0 -> ../../devices/pci@1f,4000/scsi@3/sd@0,0:a
lrwxrwxrwx 1 root root 41 Oct 17 11:48 /dev/dsk/c0t1d0s0 -> ../../devices/pci@1f,4000/scsi@3/sd@1,0:a
Use the device paths to define the sds-root and sds-mirror device aliases (note that we use the label "disk" instead of "sd" in the device alias path):
# eeprom "nvramrc=devalias sds-root /pci@1f,4000/scsi@3/disk@0,0
devalias sds-mirror /pci@1f,4000/scsi@3/disk@1,0"
# eeprom "use-nvramrc?=true"
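The conversion from the symlink target printed by ls -l to the path used in the device alias is mechanical: drop everything up to and including "devices", drop the trailing slice letter, and replace "sd" with "disk". A sketch of that, with a made-up function name:

```shell
#!/bin/sh
# Turn a /devices symlink target into the path used for the OBP alias.
# $1 = link target as printed by ls -l
obp_path() {
    printf '%s\n' "$1" | sed -e 's|^.*/devices/|/|' \
                             -e 's|:[a-z]*$||' \
                             -e 's|/sd@|/disk@|'
}

obp_path '../../devices/pci@1f,4000/scsi@3/sd@1,0:a'
# prints /pci@1f,4000/scsi@3/disk@1,0
```

The "sd" to "disk" substitution follows the note above and applies to the disks in this example; other controllers may use different node names, so check the result against the output of "show-disks" or "devalias" at the OBP prompt.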
Test the process of booting from either sds-root or sds-mirror.
Once the above sequence of steps has been completed, the system will look as follows:
# metadb
flags first blk block count
a m p luo 16 1034 /dev/dsk/c0t0d0s5
a p luo 16 1034 /dev/dsk/c0t0d0s6
a p luo 16 1034 /dev/dsk/c0t1d0s5
a p luo 16 1034 /dev/dsk/c0t1d0s6
# df -k
Filesystem kbytes used avail capacity Mounted on
/dev/md/dsk/d0 6607349 845208 5696068 13% /
/proc 0 0 0 0% /proc
fd 0 0 0 0% /dev/fd
mnttab 0 0 0 0% /etc/mnttab
/dev/md/dsk/d4 1016863 8414 947438 1% /var
swap 1443840 8 1443832 1% /var/run
swap 1443848 16 1443832 1% /tmp
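As a quick sanity check of the metadb listing above, the replicas can be counted per disk to confirm that they are spread evenly. This is just an awk sketch over the metadb output; the function name replicas_by_disk is hypothetical:

```shell
#!/bin/sh
# Count state database replicas per disk from metadb output on stdin.
# Prints one "disk count" pair per line, sorted by disk name.
replicas_by_disk() {
    awk '$NF ~ /^\/dev\/dsk\// {
             disk = $NF
             sub(/^\/dev\/dsk\//, "", disk)   # strip the directory
             sub(/s[0-9]+$/, "", disk)        # strip the slice suffix
             count[disk]++
         }
         END { for (d in count) print d, count[d] }' | sort
}

# Intended use on the Solaris host (left commented out here):
#   metadb | replicas_by_disk
```

For the configuration in this article, each of c0t0d0 and c0t1d0 should show a count of 2.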
Trans metadevices for logging
UFS filesystem logging was first supported with Solaris 7. Prior to that release, one could create trans metadevices with DiskSuite to achieve the same effect. For Solaris 7 and up, it's much easier to simply enable ufs logging by adding the word "logging" to the last field of the /etc/vfstab file. The following section is included for those increasingly rare Solaris 2.6 installations.
The following two steps assume that you have a slice 3 (<=64MB) available for logging.
Define the trans metadevice mirror (c0t0d0s3):
# metainit d13 1 1 c0t0d0s3
# metainit d23 1 1 c0t1d0s3
# metainit d3 -m d13
# metattach d3 d23
Make /var use the trans metadevice for logging:
# metainit -f d64 -t d4 d3
Edit vfstab as follows:
/dev/md/dsk/d64 /dev/md/rdsk/d64 /var ufs 1 no -
Ensure that no volumes are syncing before running the following:
# sync;sync;sync;init 6