Mirroring Disks with Solstice DiskSuite
Introduction
This paper presents a short introduction to mirroring two disks using Solstice DiskSuite. Although not as robust as Veritas Volume Manager (VxVM), which Sun also distributes as the "Sun Enterprise Volume Manager" (SEVM), DiskSuite remains a popular choice for basic disk mirroring. The tutorial follows an actual mirroring session, with comments and explanations interspersed.
Installation
The first step to setting up mirroring using DiskSuite is to install the DiskSuite packages and any necessary patches. The latest recommended version of DiskSuite is 4.2 for systems running Solaris 2.6 and higher. There are currently two packages and one patch necessary to install DiskSuite. They are:
- SUNWmd
- SUNWmdg
- 106627-04 (obtain latest revision)
The packages should be installed in the order listed above. Note that a reboot is necessary after the install, as new drivers will be added to the Solaris kernel.
Also, to make life easier, add DiskSuite's directories to your PATH and MANPATH variables. Executables reside in /usr/opt/SUNWmd/sbin and man pages in /usr/opt/SUNWmd/man.
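For a Bourne-style shell, the path update might look like the sketch below. The two DiskSuite directories come from the text above; the fallback of /usr/share/man for an unset MANPATH is my own assumption, so adjust it to suit your system.

```shell
# Append DiskSuite's directories to the search paths (Bourne-style shells).
PATH="$PATH:/usr/opt/SUNWmd/sbin"
# If MANPATH is unset, start from /usr/share/man (an assumed default location).
MANPATH="${MANPATH:-/usr/share/man}:/usr/opt/SUNWmd/man"
export PATH MANPATH
```

Placing these lines in root's .profile makes the change persist across logins.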
The Environment
In this example we will be mirroring two disks, both on the same controller. The first disk will be the primary disk and the second will be the mirror. The disks are:
Disk 1: c0t0d0
Disk 2: c0t1d0
The partitions on the disks are presented below. There are a few items of note here. Both disks are partitioned identically, which is necessary to implement the mirrors properly. Slice 2, commonly referred to as the 'backup' slice, represents the entire disk and must not be mirrored. There are situations where slice 2 is used as a normal slice; however, this author would not recommend doing so.
The three unassigned partitions on each disk are each configured to be 4MB. These 4MB slices will hold the DiskSuite state database replicas, or metadbs. More information on the state database replicas is presented below. Although a metadb only occupies 517KB of space, I prefer to create the slices slightly larger.
Disk 1:
c0t0d0s0: /
c0t0d0s1: swap
c0t0d0s2: backup
c0t0d0s3: unassigned
c0t0d0s4: /var
c0t0d0s5: unassigned
c0t0d0s6: unassigned
c0t0d0s7: /export
Disk 2:
c0t1d0s0: /
c0t1d0s1: swap
c0t1d0s2: backup
c0t1d0s3: unassigned
c0t1d0s4: /var
c0t1d0s5: unassigned
c0t1d0s6: unassigned
c0t1d0s7: /export
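One way to confirm that the two disks really are partitioned identically is to save the prtvtoc listing for each disk and compare the slice tables. The helper names below (vtoc_slices, same_vtoc) and the /tmp file names are hypothetical. The sketch assumes the usual prtvtoc layout, in which header lines start with '*' and the first six columns describe each slice; the trailing mount directory column is ignored because only the mounted disk will show it.

```shell
# vtoc_slices FILE
# Prints the first six columns (partition, tag, flags, first sector,
# sector count, last sector) of each slice line in a saved prtvtoc listing.
# Header lines, which start with '*', are skipped.
vtoc_slices() {
    awk '!/^\*/ && NF >= 6 { print $1, $2, $3, $4, $5, $6 }' "$1"
}

# same_vtoc FILE1 FILE2
# Succeeds when both listings describe the same slice layout.
same_vtoc() {
    [ "$(vtoc_slices "$1")" = "$(vtoc_slices "$2")" ]
}

# Intended use on the Solaris host:
#   prtvtoc /dev/rdsk/c0t0d0s2 > /tmp/disk1.vtoc
#   prtvtoc /dev/rdsk/c0t1d0s2 > /tmp/disk2.vtoc
#   same_vtoc /tmp/disk1.vtoc /tmp/disk2.vtoc && echo "layouts match"
```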
The State Database Replicas
The state database replicas serve a very important function in DiskSuite. They are the repositories of information on the state and configuration of each metadevice (a logical device created through DiskSuite is known as a metadevice). Having multiple replicas is critical to the proper operation of DiskSuite.
- There must be a minimum of three replicas. DiskSuite requires at least 51% of the replicas to be present in order to operate.
- Replicas should be spread across disks and controllers where possible.
- In a three drive configuration, at least one replica should be on each disk, thus allowing for a one disk failure.
- In a two drive configuration, such as the one we present here, there must be at least two replicas per disk. If there were only three and the disk which held two of them failed, there would not be enough information for DiskSuite to function.
Here we will create our state replicas using the metadb command:
# /usr/opt/SUNWmd/sbin/metadb -a -f /dev/dsk/c0t0d0s3
# /usr/opt/SUNWmd/sbin/metadb -a -f /dev/dsk/c0t0d0s5
# /usr/opt/SUNWmd/sbin/metadb -a -f /dev/dsk/c0t0d0s6
# /usr/opt/SUNWmd/sbin/metadb -a -f /dev/dsk/c0t1d0s3
# /usr/opt/SUNWmd/sbin/metadb -a -f /dev/dsk/c0t1d0s5
# /usr/opt/SUNWmd/sbin/metadb -a -f /dev/dsk/c0t1d0s6
The -a and -f options used together create the initial replicas. The -a option attaches a new database device and automatically edits the /etc/system file. The -f option forces the creation of the initial state database, which is needed when no replicas exist yet.
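The replica arithmetic is easy to check with a small helper. The function name replica_status is hypothetical. The three-way split it prints reflects my understanding of the DiskSuite documentation rather than anything shown in this session: with more than half of the replicas the system operates and reboots normally, with exactly half it keeps running but will not reboot cleanly until the bad replicas are deleted, and with fewer than half it cannot operate.

```shell
# replica_status TOTAL AVAILABLE
# Prints how DiskSuite is expected to behave when AVAILABLE of TOTAL
# state database replicas are still intact (an assumed model, see above).
replica_status() {
    total=$1
    avail=$2
    if [ $((avail * 2)) -gt "$total" ]; then
        echo "majority"       # more than half of the replicas are intact
    elif [ $((avail * 2)) -eq "$total" ]; then
        echo "half"           # exactly half of the replicas are intact
    else
        echo "insufficient"   # fewer than half of the replicas are intact
    fi
}

replica_status 3 1   # only three replicas, and the disk holding two of them fails
replica_status 6 3   # the layout used here: three per disk, one disk fails
```

The first call prints "insufficient" and the second prints "half", which is why an even split across two disks survives a disk failure but still needs attention before the next reboot.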
Initializing Submirrors
Each mirrored metadevice contains two or more submirrors. The operating system mounts the metadevice rather than the original logical device. Below we will walk through the steps involved in creating metadevices for our primary filesystems.
Here we create the two submirrors for the / (root) filesystem, as well as a one-way mirror between the metadevice and its first submirror.
# metainit -f d10 1 1 c0t0d0s0
# metainit -f d20 1 1 c0t1d0s0
# metainit d0 -m d10
The first two commands create the two submirrors. The -f option forces the creation of the submirror even though the specified slice is a mounted filesystem. The next two arguments, 1 1, specify the number of stripes on the metadevice and the number of slices that make up each stripe. In a mirroring situation, this should always be 1 1. The final argument is the slice that the submirror is built on. The third command creates the mirror d0 with d10 as its only submirror, which gives us the one-way mirror.
After mirroring the root partition, we need to run the metaroot command. This command will update the root entry in /etc/vfstab with the new metadevice as well as add the appropriate configuration information into /etc/system. Omitting this step is one of the most common mistakes made by those unfamiliar with DiskSuite. If you do not run the metaroot command before you reboot, you will not be able to boot the system!
# metaroot d0
Next, we continue to create the submirrors and initial one-way mirrors for the metadevices which will replace the swap, /var, and /export partitions.
# metainit -f d11 1 1 c0t0d0s1
# metainit -f d21 1 1 c0t1d0s1
# metainit d1 -m d11
# metainit -f d14 1 1 c0t0d0s4
# metainit -f d24 1 1 c0t1d0s4
# metainit d4 -m d14
# metainit -f d17 1 1 c0t0d0s7
# metainit -f d27 1 1 c0t1d0s7
# metainit d7 -m d17
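The commands above follow a simple numbering pattern: for slice N, the submirror on the first disk is d1N, the submirror on the second disk is d2N, and the mirror itself is dN. The sketch below prints the same command sequence from that pattern instead of running it, so it can be reviewed first. The function name submirror_cmds and the two variables are hypothetical, and the pattern is just the convention used in this session, not a DiskSuite requirement.

```shell
# Disk names from this example.
PRIMARY=c0t0d0
SECONDARY=c0t1d0

# submirror_cmds SLICE
# Prints (does not run) the three metainit commands for one slice,
# using the d1N / d2N / dN numbering from this session.
submirror_cmds() {
    n=$1
    echo "metainit -f d1$n 1 1 ${PRIMARY}s$n"
    echo "metainit -f d2$n 1 1 ${SECONDARY}s$n"
    echo "metainit d$n -m d1$n"
}

# Slices holding /, swap, /var and /export in this example.
for slice in 0 1 4 7; do
    submirror_cmds "$slice"
done
```

Remember that metaroot d0 must still be run by hand after the root mirror is created; the sketch only covers the metainit steps.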
Rebooting the System
The /etc/vfstab file must be updated at this point to reflect the changes made to the system. The / partition will have already been updated through the metaroot command run earlier, but the system needs to know about the new devices for swap, /var, and /export. The entries in the file will look something like the following:
/dev/md/dsk/d1 - - swap - no -
/dev/md/dsk/d4 /dev/md/rdsk/d4 /var ufs 1 yes -
/dev/md/dsk/d7 /dev/md/rdsk/d7 /export ufs 1 yes -
Notice that the device paths for the disks have changed from the normal style /dev/dsk/c#t#d#s# and /dev/rdsk/c#t#d#s# to the new metadevice paths, /dev/md/dsk/d# and /dev/md/rdsk/d#.
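If you would rather not edit the file by hand, the path change can be sketched as a sed filter. The function name to_md_paths and the scratch file names are hypothetical. The filter only touches slices 1, 4 and 7 of c0t0d0, matching the metadevices d1, d4 and d7 created in this session, and it writes to a scratch file so the result can be reviewed before it replaces /etc/vfstab.

```shell
# to_md_paths: filter that rewrites the primary disk's swap, /var and /export
# device paths (slices 1, 4 and 7) to the metadevices d1, d4 and d7.
to_md_paths() {
    sed -e 's#/dev/dsk/c0t0d0s\([147]\)#/dev/md/dsk/d\1#g' \
        -e 's#/dev/rdsk/c0t0d0s\([147]\)#/dev/md/rdsk/d\1#g'
}

# Intended use: write to a scratch file, review it, then install it by hand.
#   cp /etc/vfstab /etc/vfstab.orig
#   to_md_paths < /etc/vfstab.orig > /tmp/vfstab.new
#   diff /etc/vfstab.orig /tmp/vfstab.new
```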
In preparation for a reboot, run the lockfs command to flush all of the pending transactions out to disk.
# lockfs -fa
The system can now be rebooted. When it comes back up, it will be running off of the one-way mirror metadevices. In the next step we will attach the second half of the mirrors and wait for the mirrors to sync.
Attaching the Mirrors
Once the system is back up, we can attach the second half of the mirrors. Once a submirror is attached, DiskSuite begins an automatic synchronization process to ensure that both halves of the mirror are identical. The progress of the synchronization can be monitored using the metastat command. To attach the submirrors, issue the following commands:
# metattach d0 d20
# metattach d1 d21
# metattach d4 d24
# metattach d7 d27
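To watch the synchronization without reading the full metastat report, the output can be filtered down to its progress lines. This is only a sketch: the function name resync_lines is hypothetical, and it assumes that metastat reports progress on lines containing the text "Resync in progress", so check the wording against your own metastat output first.

```shell
# resync_lines: filter that keeps only the progress lines from metastat output.
# ASSUMPTION: progress is reported on lines containing "Resync in progress".
resync_lines() {
    grep 'Resync in progress'
}

# Intended use on the Solaris host, repeated until nothing is printed:
#   metastat | resync_lines
```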
Final Thoughts
With an eye towards recovery from a future disaster, it is a good idea to find the physical device path of the root partition on the second disk. That path can be used to create an Open Boot PROM (OBP) device alias, which eases booting the system if the primary disk fails. To find the physical device path, do the following:
# ls -l /dev/dsk/c0t1d0s0
The entry is a symbolic link into the devices tree. The link target, after the ../../devices prefix, should be something similar to the following:
/sbus@3,0/SUNW,fas@3,8800000/sd@1,0:a
Using this information, create a device alias using an easy to remember name such as altboot. To create this alias, do the following in the Open Boot PROM:
ok nvalias altboot /sbus@3,0/SUNW,fas@3,8800000/sd@1,0:a
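Because /dev/dsk entries are symbolic links into the devices tree, the physical path can be cut out of the ls -l output rather than copied by eye. The filter name phys_path is hypothetical, and the sketch assumes the link target begins with ../../devices.

```shell
# phys_path: filter that reduces an `ls -l` line for a /dev/dsk entry to the
# physical device path by dropping everything up to the ../../devices prefix.
phys_path() {
    sed -n 's#.*-> \.\./\.\./devices\(/.*\)$#\1#p'
}

# Intended use on the Solaris host:
#   ls -l /dev/dsk/c0t1d0s0 | phys_path
```

Note down the printed path before shutting the system down, since it has to be typed at the ok prompt.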
Replacing a Failed Disk Mirrored with Solstice DiskSuite
The following assumptions have been made for this example procedure:
A. Server name = bigserver.company.com
B. OS = Solaris 7, 8, or 9
C. DiskSuite version 4.2 or 4.2.1
D. The 'bad' disk is hot-swappable.
E. Metadevice numbers and state database replicas are taken from those used on the actual system.
F. Actual command line syntax is shown after a # prompt.
1. Collect output from the following:
# metastat
# metastat -p
# metadb -i
2. To identify the disk to be replaced:
Examine the "metadb -i" output. You should see a "W" in the flags field associated with slice 7 of the disk experiencing write errors. Another indication is the output from the "format" command: next to the device name, the text string "<drive type unknown>" will appear, indicating that the disk label cannot be read and therefore that the disk has very likely failed. For this example, we will assume the failed disk device is c0t0d0.
3. Delete any metadevice state database replicas that are on the 'bad' disk:
# metadb -d c0t0d0s7
# metadb -i (to make sure they have been deleted)
4. State of the submirrors:
The "metastat" command output reports that all submirrors on the bad disk are at a State of "Needs maintenance". This indicates that DiskSuite has automatically disabled the submirrors, so there is no need to "metadetach" the submirrors.
5. Physically replace the failed hot-swappable disk.
6. Partition the new disk:
The easiest way to do this is to copy the partition table from the root mirror (c0t1d0s2) to the new disk (c0t0d0s2) with the following dd command:
# dd if=/dev/rdsk/c0t1d0s2 of=/dev/rdsk/c0t0d0s2 count=16
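A commonly used alternative to dd, offered here as a sketch rather than as part of the original procedure, is to pipe the surviving disk's label from prtvtoc into fmthard, which reads the table from standard input when given "-s -". The function name copy_vtoc_cmd is hypothetical, and it only prints the pipeline so the device names can be double-checked before anything is written to the new disk.

```shell
# copy_vtoc_cmd SOURCE_DISK TARGET_DISK
# Prints (does not run) a prtvtoc/fmthard pipeline that copies the
# partition table from SOURCE_DISK to TARGET_DISK.
copy_vtoc_cmd() {
    echo "prtvtoc /dev/rdsk/${1}s2 | fmthard -s - /dev/rdsk/${2}s2"
}

# Surviving disk first, replacement disk second.
copy_vtoc_cmd c0t1d0 c0t0d0
```

Getting the order wrong would overwrite the good disk's label, which is why the sketch stops at printing the command.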