Solaris 10 u3 – SC 3.2 ZFS/NFS HA with shared disks
Conventions:
/dir1/dir2/: It’s the path where you mount your ZFS pool.
POOLNAME: It’s the ZFS pool’s name.
#: The superuser prompt. In Solaris you should execute these commands as root, or with a role that has the necessary privileges.
ZFS/NFS HA with shared disks
Operating System:
First off, you will need to download and install Solaris 10 u3. After that, apply the updates (patches). The installation of the Sun Cluster 3.2 software should be out of the box. It is important to download and install these patches, and to follow the extra procedure if you used the Secure by Default feature on Solaris 10.
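If the node was installed with Secure by Default (limited network services), the usual adjustment is to let rpcbind accept non-local clients again, since the cluster framework depends on it. A minimal sketch (double-check the Sun Cluster installation guide for your release):
# svccfg -s network/rpc/bind setprop config/local_only=false
# svcadm refresh network/rpc/bind
# svcprop -p config/local_only network/rpc/bind
The last command should print "false".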
Flash:
I think it is a good idea to use Solaris Flash to install the second node, but that procedure is site dependent, and you can use JumpStart or whatever install method you prefer (just remember that the resulting Solaris installation should be at the same patch level). Here is the command to generate the flash archive:
# mkdir /b; cd /b
# flarcreate -n sol10u3.flar -R / -c -x /b sol10u3.flar
p.s.: The -x option excludes the directory /b from the archive.
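If you go with JumpStart for the second node, a profile that installs from that archive could look like the sketch below (the install server name, archive path, and disk slices are just placeholders for your site):
install_type     flash_install
archive_location nfs installserver:/export/flash/sol10u3.flar
partitioning     explicit
filesys          c0t0d0s0 free /
filesys          c0t0d0s1 2048 swap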
ZFS:
Create your ZFS pool and filesystems… ZFS was designed to be simple, and it really is. The important part is to know the name of the pool you want to make highly available. Here is an example:
# zpool create -m /dir1/dir2/POOLNAME POOLNAME mirror <disk1> <disk2>
p.s.: where <disk1> and <disk2> are shared devices, so all nodes in the cluster can access them.
# zfs create POOLNAME/fs01
# zfs set mountpoint=/export/fs01 POOLNAME/fs01
# zfs create POOLNAME/fs02
# zfs set mountpoint=/export/fs02 POOLNAME/fs02
...
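Just a quick sanity check before handing the pool over to the cluster, and one detail: with SUNW.nfs the shares are defined in the cluster's dfstab file (we will get there), so leave the ZFS sharenfs property at its default (off):
# zpool status POOLNAME
# zfs list -r POOLNAME
# zfs get sharenfs POOLNAME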
Sun Cluster 3.2:
1) Register the resource types: Sun Cluster HA for NFS (SUNW.nfs) for the NFS services, and HAStoragePlus (SUNW.HAStoragePlus) for ZFS.
2) Create the resource group (for ZFS/NFS, I think using the pool name is a good convention).
3) Create the Logical Hostname resource (SUNW.LogicalHostname) for the external hostname of this service, e.g. servernfs (this name must be configured in the /etc/hosts file and in /etc/inet/ipnodes on every node – see the example after this list). The cluster will configure that IP address on the node where the resource group is running.
4) Create the resource HAStoragePlus
5) Bring the resource group online
6) Create the NFS resource and declare its dependencies
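For item 3, the entry is just an ordinary host entry on every cluster node, in both /etc/hosts and /etc/inet/ipnodes; the address below is only an example, use the real IP you reserved for the service:
192.168.10.50   servernfs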
So, let’s do it:
# clresourcetype register SUNW.HAStoragePlus SUNW.nfs
# clresourcegroup create -p PathPrefix=/dir1/dir2/POOLNAME/HA poolname-rg
# clreslogicalhostname create -g poolname-rg -h servernfs servernfs-lh-rs
# clresource create -g poolname-rg -t SUNW.HAStoragePlus -p Zpools=POOLNAME poolname-hastorageplus-rs
From now on, the ZFS pool is handled by Sun Cluster, and it is unmounted (exported). So we will need to bring the resource group online to see the pool again…
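Indeed, if you run zpool list on the node now (assuming this is its only pool), it reports nothing, because HAStoragePlus keeps the pool exported while the group is offline:
# zpool list
no pools available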
# clresourcegroup online -M poolname-rg
Now the resource group associated with the ZFS pool (POOLNAME) is online, and hence the ZFS pool too. Before configuring the resource for the NFS services, you will need to create a file named dfstab.poolname-nfs-rs in the directory /dir1/dir2/POOLNAME/HA/SUNW.nfs/.
p.s.: As you know, dfstab is the file containing the commands for sharing resources across the network (NFS shares – "man dfstab" for information about the syntax of that file).
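The SUNW.nfs directory has to exist under the resource group's PathPrefix, and the file holds ordinary share commands. A sketch for the two filesystems created earlier (the share options are just an example – adjust to your needs):
# mkdir -p /dir1/dir2/POOLNAME/HA/SUNW.nfs
# cat > /dir1/dir2/POOLNAME/HA/SUNW.nfs/dfstab.poolname-nfs-rs << EOF
share -F nfs -o rw -d "HA fs01" /export/fs01
share -F nfs -o rw -d "HA fs02" /export/fs02
EOF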
# clresource create -g poolname-rg -t SUNW.nfs -p Resource_dependencies=poolname-hastorageplus-rs poolname-nfs-rs
The last step is to bring all the resources online…
# clresourcegroup online -M poolname-rg
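To check that everything came up, and to do a quick test from any NFS client (the /mnt mount point on the client is just an example):
# clresourcegroup status poolname-rg
# clresource status -g poolname-rg
# mount -F nfs servernfs:/export/fs01 /mnt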
You can test the failover/switchback scenarios with these commands:
# clnode evacuate <nodename>
The command above will take all the resources off that node and bring them online on another cluster node. Or you can use the clresourcegroup command to switch the resource group to one specific node:
# clresourcegroup switch -n <nodename> poolname-rg
WARNING: There is a timer (60 seconds by default) that keeps resource groups from switching back onto a node right after the resources have been evacuated from it (see "man clresourcegroup").
If you need to undo the whole configuration (if something goes wrong), here is the step-by-step procedure:
# clresourcegroup offline poolname-rg
# clresource disable poolname-hastorageplus-rs
# clresource disable poolname-nfs-rs
# clresource delete poolname-nfs-rs
# clresource delete poolname-hastorageplus-rs
# clresource disable servernfs-lh-rs
# clresource delete servernfs-lh-rs
# clresourcegroup delete poolname-rg
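After that the pool is no longer under cluster control; if you want to use it again outside the cluster, just import it by hand on one node:
# zpool import POOLNAME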
That’s all..
Cleanly done and written. Very nice. Any reason behind not using OpenSolaris?
Hello Sri!
There was no OpenSolaris/OHAC at that time…
Leal