Installation et configuration du SUN Cluster
Contents
- 1 Introduction
- 2 Requirements
- 3 Installation
- 4 Configuration
- 5 Management
- 6 Maintenance
- 7 FAQ
- 7.1 Can't integrate cluster
- 7.2 The cluster is in installation mode
- 7.3 How to change Private Interconnect IP for cluster ?
- 7.4 Some commands cannot be executed on a cluster in Install mode
- 7.5 Disk path offline
- 7.6 Force uninstall
- 7.7 How to Change Sun Cluster Node Names
- 7.8 Can't switch an RG from one node to another
- 7.9 Cluster is unavailable when a node crashes in a two-node cluster
- 8 References
1 Introduction
Solaris Cluster (sometimes Sun Cluster or SunCluster) is a high-availability cluster software product for the Solaris Operating System, created by Sun Microsystems.
It is used to improve the availability of software services such as databases, file sharing on a network, electronic commerce websites, or other applications. Sun Cluster operates by having redundant computers or nodes where one or more computers continue to provide service if another fails. Nodes may be located in the same data center or on different continents.
This documentation was written and tested with:
- Solaris 10 update 7
- Sun Cluster 3.2u2
2 Requirements
All of these prerequisites are required before installing Sun Cluster. Complete every step below before starting the installation.
2.1 Hardware
To build a real cluster, here is the required hardware list:
- 2 nodes
- sun-node1
- sun-node2
- 4 network cards
- 2 for the public interface (with IPMP on it)
- 2 for the private interface (cluster heartbeat and node information exchange)
- 1 disk array with 1 spare disk
2.2 Partitioning
While you install Solaris, you should create a slice called /globaldevices of at least 512 MB. This slice must be formatted as UFS (ZFS does not work for the global devices file system at the moment).
If you did not create this slice during the Solaris installation, you can (a command sketch follows the vfstab example below):
- Use the format command to create a new slice
- Use newfs command to format filesystem as UFS
- Mount this filesystem with global option in /globaldevices
- Add it in /etc/vfstab, for example :
/etc/vfstab:
/dev/did/dsk/d6s3   /dev/did/rdsk/d6s3   /global/.devices/node@2   ufs   2   no   global
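The individual commands are not shown in the original; here is a minimal sketch of the manual steps, assuming c0t0d0s3 is the slice you dedicate to global devices (adapt the device name to your own layout):
# create/resize the slice interactively (partition menu)
format
# build a UFS file system on the new slice
newfs /dev/rdsk/c0t0d0s3
# create the mount point and mount it
mkdir -p /globaldevices
mount /dev/dsk/c0t0d0s3 /globaldevices
Then add the corresponding line to /etc/vfstab so it is mounted at boot.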
Note: since Sun Cluster 3.2 update 2, you no longer need /globaldevices and you can use ZFS as the default root file system.
2.3 Hostname Configuration
Change the hostname to match the cluster naming scheme you wish to use:
Changing Solaris hostname
Do not forget to apply the same /etc/hosts file to all cluster nodes, and whenever you make a change, propagate it to every node!
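As a reminder, a minimal sketch of a hostname change on Solaris 10, assuming e1000g0 is the public interface (see the linked documentation for the full procedure):
# set the node name used at boot
echo sun-node1 > /etc/nodename
# the name must also appear in the public interface file and in /etc/hosts
grep sun-node1 /etc/hostname.e1000g0 /etc/hosts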
2.4 Patches
Use Sun Update Manager if you have a graphical interface to install all the available updates. If you don't have a graphical interface, install all available patches anyway to avoid installation problems.
2.5 IPMP Configuration
You need to configure at least 2 interfaces for your public network. Follow this documentation :
IPMP Configuration
You don't have to do it for your private network because it will be automatically done by the cluster during installation.
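As an illustration only, one possible minimal link-based active/standby IPMP setup, assuming e1000g0 and e1000g1 are the public interfaces and ipmp0 is the group name (the linked documentation covers probe-based setups):
/etc/hostname.e1000g0 (active interface, carries the node address):
sun-node1 netmask + broadcast + group ipmp0 up
/etc/hostname.e1000g1 (standby interface in the same group):
group ipmp0 standby up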
2.6 Activate all network cards
With your 4 network cards, you should activate all of them so they are easily recognized during the installation. First, run ifconfig -a to check whether all your cards are plumbed. If not, create their hostname files and plumb them:
touch /etc/hostname.e1000g2
touch /etc/hostname.e1000g3
ifconfig e1000g2 plumb
ifconfig e1000g3 plumb
2.7 Remove RPC and Webconsole binding
If you have installed the latest Solaris version, you may encounter node integration problems due to RPC binding. This is a new Sun security feature. As we need to allow communication between nodes, we have to disable local-only binding for the RPC protocol (and can do the same for the webconsole). Do this operation on every node.
- Ensure that the local_only property of rpcbind is set to false:
svcprop network/rpc/bind:default
If local_only is set to true, run these commands and refresh the service:
$ svccfg
svc:> select network/rpc/bind
svc:/network/rpc/bind> setprop config/local_only=false
svc:/network/rpc/bind> quit
svcadm refresh network/rpc/bind:default
Now communication between nodes works.
- Ensure that the tcp_listen property of webconsole is set to true:
svcprop /system/webconsole:console
If tcp_listen is not set to true, run these commands and restart the service:
$ svccfg
svc:> select system/webconsole
svc:/system/webconsole> setprop options/tcp_listen=true
svc:/system/webconsole> quit
/usr/sbin/smcwebserver restart
This is needed for Sun Cluster Manager communication. To verify that the service is listening on *.6789, you can run:
netstat -a
If you want to make both of these changes in one shot, use these commands:
svccfg -s network/rpc/bind setprop config/local_only=false
svcadm refresh network/rpc/bind:default
svccfg -s system/webconsole setprop options/tcp_listen=true
/usr/sbin/smcwebserver restart
2.8 Profile
Configure the root profile (~/.profile), or /etc/profile for all users, by adding this line:
~/.profile:
PATH=$PATH:/usr/cluster/bin/
Now refresh your configuration :
source ~/.profile
or
source /etc/profile
2.9 Ending
Restart all your nodes when all is finished.
3 Installation
First of all, download the Sun Cluster package (normally a zip archive) and uncompress it on all nodes. You should end up with a "Solaris_x86" folder.
Now launch the installer on all the nodes :
cd /Solaris_x86
./installer
We'll need to install Sun Cluster Core and Quorum (if we want to add more than 2 nodes now or later).
4 Configuration
4.1 Wizard configuration
Before launching the installation, you should know there are two ways to configure all the nodes:
- One by one
- All in one shot
If you want to do it all in one shot, you first have to exchange the root SSH public keys between all nodes, as sketched below.
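A minimal sketch of the key exchange (run on each node; host names are the ones used in this guide, and root SSH login must be permitted in /etc/ssh/sshd_config):
# generate a root key pair if none exists yet
ssh-keygen -t rsa
# append this node's public key to the other node's authorized_keys
cat ~/.ssh/id_rsa.pub | ssh root@sun-node2 'cat >> ~/.ssh/authorized_keys'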
scinstall
*** Main Menu ***

Please select from one of the following (*) options:

  * 1) Create a new cluster or add a cluster node
    2) Configure a cluster to be JumpStarted from this install server
    3) Manage a dual-partition upgrade
    4) Upgrade this cluster node
    5) Print release information for this cluster node

  * ?) Help with menu options
  * q) Quit

Option:
Answer : 1
*** New Cluster and Cluster Node Menu ***

Please select from any one of the following options:

    1) Create a new cluster
    2) Create just the first node of a new cluster on this machine
    3) Add this machine as a node in an existing cluster

    ?) Help with menu options
    q) Return to the Main Menu

Option:
Answer : 1
*** Create a New Cluster *** This option creates and configures a new cluster. You must use the Java Enterprise System (JES) installer to install the Sun Cluster framework software on each machine in the new cluster before you select this option. If the "remote configuration" option is unselected from the JES installer when you install the Sun Cluster framework on any of the new nodes, then you must configure either the remote shell (see rsh(1)) or the secure shell (see ssh(1)) before you select this option. If rsh or ssh is used, you must enable root access to all of the new member nodes from this node. Press Control-d at any time to return to the Main Menu. Do you want to continue (yes/no)
Answer : yes
>>> Typical or Custom Mode <<<

This tool supports two modes of operation, Typical mode and Custom. For most clusters, you can use Typical mode. However, you might need to select the Custom mode option if not all of the Typical defaults can be applied to your cluster.

For more information about the differences between Typical and Custom modes, select the Help option from the menu.

Please select from one of the following options:

    1) Typical
    2) Custom

    ?) Help
    q) Return to the Main Menu

Option [1]:
Answer : 2
>>> Cluster Name <<< Each cluster has a name assigned to it. The name can be made up of any characters other than whitespace. Each cluster name should be unique within the namespace of your enterprise. What is the name of the cluster you want to establish ?
Answer : sun-cluster
>>> Cluster Nodes <<< This Sun Cluster release supports a total of up to 16 nodes. Please list the names of the other nodes planned for the initial cluster configuration. List one node name per line. When finished, type Control-D: Node name: sun-node1 Node name: sun-node2 Node name (Control-D to finish): ^D
Enter the two node names and finish with Ctrl+D.
This is the complete list of nodes: sun-node1 sun-node2 Is it correct (yes/no) [yes]?
Answer : yes
Attempting to contact "sun-node2" ... done Searching for a remote configuration method ... done The secure shell (see ssh(1)) will be used for remote execution. Press Enter to continue:
Answer : press Enter
>>> Authenticating Requests to Add Nodes <<< Once the first node establishes itself as a single node cluster, other nodes attempting to add themselves to the cluster configuration must be found on the list of nodes you just provided. You can modify this list by using claccess(1CL) or other tools once the cluster has been established. By default, nodes are not securely authenticated as they attempt to add themselves to the cluster configuration. This is generally considered adequate, since nodes which are not physically connected to the private cluster interconnect will never be able to actually join the cluster. However, DES authentication is available. If DES authentication is selected, you must configure all necessary encryption keys before any node will be allowed to join the cluster (see keyserv(1M), publickey(4)). Do you need to use DES authentication (yes/no) [no]?
Answer : no
>>> Network Address for the Cluster Transport <<< The cluster transport uses a default network address of 172.16.0.0. If this IP address is already in use elsewhere within your enterprise, specify another address from the range of recommended private addresses (see RFC 1918 for details). The default netmask is 255.255.248.0. You can select another netmask, as long as it minimally masks all bits that are given in the network address. The default private netmask and network address result in an IP address range that supports a cluster with a maximum of 64 nodes and 10 private networks. Is it okay to accept the default network address (yes/no) [yes]?
Answer : yes
Is it okay to accept the default netmask (yes/no) [yes]?
Answer : yes
>>> Minimum Number of Private Networks <<< Each cluster is typically configured with at least two private networks. Configuring a cluster with just one private interconnect provides less availability and will require the cluster to spend more time in automatic recovery if that private interconnect fails. Should this cluster use at least two private networks (yes/no) [yes]?
Answer : yes
>>> Point-to-Point Cables <<< The two nodes of a two-node cluster may use a directly-connected interconnect. That is, no cluster switches are configured. However, when there are greater than two nodes, this interactive form of scinstall assumes that there will be exactly one switch for each private network. Does this two-node cluster use switches (yes/no) [yes]?
Answer : no
>>> Cluster Transport Adapters and Cables <<<

You must configure the cluster transport adapters for each node in the cluster. These are the adapters which attach to the private cluster interconnect.

Select the first cluster transport adapter for "sun-node1":

    1) e1000g1
    2) e1000g2
    3) e1000g3
    4) Other

Option:
Answer : 3
Adapter "e1000g3" is an Ethernet adapter.

Searching for any unexpected network traffic on "e1000g3" ... done
Verification completed. No traffic was detected over a 10 second sample period.

The "dlpi" transport type will be set for this cluster.

Name of adapter on "sun-node2" to which "e1000g3" is connected? e1000g3

Select the second cluster transport adapter for "sun-node1":

    1) e1000g1
    2) e1000g2
    3) e1000g3
    4) Other

Option:
Answer : 2
Adapter "e1000g2" is an Ethernet adapter. Searching for any unexpected network traffic on "e1000g2" ... done Verification completed. No traffic was detected over a 10 second sample period. Name of adapter on "sun-node2" to which "e1000g2" is connected?
Answer : e1000g2
>>> Quorum Configuration <<< Every two-node cluster requires at least one quorum device. By default, scinstall will select and configure a shared SCSI quorum disk device for you. This screen allows you to disable the automatic selection and configuration of a quorum device. The only time that you must disable this feature is when ANY of the shared storage in your cluster is not qualified for use as a Sun Cluster quorum device. If your storage was purchased with your cluster, it is qualified. Otherwise, check with your storage vendor to determine whether your storage device is supported as Sun Cluster quorum device. If you disable automatic quorum device selection now, or if you intend to use a quorum device that is not a shared SCSI disk, you must instead use clsetup(1M) to manually configure quorum once both nodes have joined the cluster for the first time. Do you want to disable automatic quorum device selection (yes/no) [no]?
Answer : yes
>>> Global Devices File System <<< Each node in the cluster must have a local file system mounted on /global/.devices/node@<nodeID> before it can successfully participate as a cluster member. Since the "nodeID" is not assigned until scinstall is run, scinstall will set this up for you. You must supply the name of either an already-mounted file system or raw disk partition which scinstall can use to create the global devices file system. This file system or partition should be at least 512 MB in size. If an already-mounted file system is used, the file system must be empty. If a raw disk partition is used, a new file system will be created for you. The default is to use /globaldevices. Is it okay to use this default (yes/no) [yes]?
Answer : yes
Testing for "/globaldevices" on "sun-node1" ... done For node "sun-node2", Is it okay to use this default (yes/no) [yes]?
Answer : yes
Is it okay to create the new cluster (yes/no) [yes]?
Answer : yes
>>> Automatic Reboot <<< Once scinstall has successfully initialized the Sun Cluster software for this machine, the machine must be rebooted. After the reboot, this machine will be established as the first node in the new cluster. Do you want scinstall to reboot for you (yes/no) [yes]?
Answer : yes
During the cluster creation process, sccheck is run on each of the new cluster nodes. If sccheck detects problems, you can either interrupt the process or check the log files after the cluster has been established. Interrupt cluster creation for sccheck errors (yes/no) [no]?
Answer : no
The Sun Cluster software is installed on "sun-node2".

Started sccheck on "sun-node1".
Started sccheck on "sun-node2".

sccheck completed with no errors or warnings for "sun-node1".
sccheck completed with no errors or warnings for "sun-node2".

Configuring "sun-node2" ... done
Rebooting "sun-node2" ...

Waiting for "sun-node2" to become a cluster member ...
4.2 Manual configuration
Here is an example of establishing the first node and then allowing another node to join:
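The example itself is missing from the original; the following is only a sketch of what the non-interactive scinstall form could look like for this guide's cluster (names, adapters and option syntax are assumptions, so check scinstall(1M) on your release before running anything):
# On sun-node1: establish the cluster and authorize the second node
/usr/cluster/bin/scinstall -i \
    -C sun-cluster \
    -F \
    -T node=sun-node1,node=sun-node2,authtype=sys \
    -A trtype=dlpi,name=e1000g2 -A trtype=dlpi,name=e1000g3

# On sun-node2: join the existing cluster, sponsored by sun-node1
/usr/cluster/bin/scinstall -i \
    -C sun-cluster \
    -N sun-node1 \
    -A trtype=dlpi,name=e1000g2 -A trtype=dlpi,name=e1000g3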
4.3 Quorum
If you disabled automatic quorum selection during the installation, you'll need to set the quorum up with the web console or with these commands. First, you need to list all LUNs in DID format:
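The listing command is not shown in the original; either of the following should display the DID devices on Sun Cluster 3.2 (a sketch):
# legacy command: list every DID instance and its device paths
/usr/cluster/bin/scdidadm -L
# or, with the newer object-oriented CLI
/usr/cluster/bin/cldevice list -v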
Choose the LUN you wish to use for your quorum:
/usr/cluster/bin/clquorum add /dev/did/rdsk/d6
Then, activate it :
/usr/cluster/bin/clquorum enable /dev/did/rdsk/d6
To finish, you need to reset it :
/usr/cluster/bin/clquorum reset
Now you're able to configure your cluster.
4.4 Network
4.4.1 Cluster connections
To check cluster interconnect, please use this command :
clinterconnect status
To enable a network card interconnection :
clinterconnect enable <hostname>:<card>

example: clinterconnect enable localhost:e1000g0
4.4.2 Check network interconnect interfaces
To check that all interfaces are up, configure an IP address on each private (cluster) interface, then send a broadcast ping:
ping -s 10.255.255.255
Adapt this broadcast address to your own private network addressing (the default is 172.16.0.255).
4.4.3 Check traffic
Use the snoop command to watch incoming traffic, for example:
snoop -d <interface> <ip>

example: snoop -d nge0 192.168.76.2
4.4.4 Get Fiber Channel WWN
To get Fiber Channel identifiers, launch this command :
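The command is missing from the original; on Solaris 10 I would expect something like fcinfo to show the HBA port WWNs (a sketch, verify on your system):
# list Fibre Channel HBA ports and their WWNs
fcinfo hba-port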
5 Management
5.1 Get cluster state
To get the cluster state, simply launch the scstat command:
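The original does not show the command output; as a sketch, either the legacy command or its 3.2 equivalent can be used:
# overall cluster state (legacy command)
scstat
# same information with the newer CLI
cluster status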
5.2 Registering Resources
You can list the resource types that are already registered:
$ clrt list
SUNW.LogicalHostname:2
SUNW.SharedAddress:2
Here we need additional resource types, such as HA Storage (HAStoragePlus) and GDS, so register them:
clrt register SUNW.HAStoragePlus
clrt register SUNW.gds
Now we can verify :
$ clrt list
SUNW.LogicalHostname:2
SUNW.SharedAddress:2
SUNW.HAStoragePlus:6
SUNW.gds:6
5.3 Creating a Resource Group
An RG (Resource Group) is a container holding resources such as a VIP (Virtual IP, also called a Logical Host):
clrg create sun-rg
You can also restrict the RG to a specific node list:
clrg create -n sun-node1 sun-rg
5.4 Creating a Logical Host (VIP) Resource
All the VIPs you plan to use must be present in the /etc/hosts file on every node, for example:
/etc/hosts:
#
# Internet host table
#
::1             localhost
127.0.0.1       localhost
192.168.0.72    sun-node1
192.168.0.77    sun-node2
192.168.0.79    my_app1_vip
192.168.0.80    my_app2_vip
Now, activate it :
clrslh create -g sun-rg -h my_app1_vip my_app1_vip
- sun-rg : resource group (created before)
- my_app1_vip : name of the VIP in the hosts files
- my_app1_vip : name of the VIP resource in the cluster
To bind it to a specific IPMP group on a single node:
clrslh create -g sun-rg -h my_app1_vip -N ipmp0@sun-node1 my_app1_vip

Here ipmp0@sun-node1 designates the IPMP group to use on that node.
5.5 Creating a File System Resource
Once your LUNs have been created, make sure you can see all the available DIDs on every node:
scdidadm -l
and compare with the output of the 'format' command; both should report the same devices. If not, run these commands on all nodes:
scdidadm -C
scdidadm -r
This clears all deleted LUNs from, and adds all newly created LUNs to, the cluster DID configuration.
Now create, for example, a zpool for each of your services. Once done, declare each one as a file system resource:
clrs create -g sun-rg -t SUNW.HAStoragePlus -p zpools=my_app1 my_app1-fs
- sun-rg : name of the resource group
- my_app1 : zpool name
- my_app1-fs : file system cluster resource name
5.6 Creating a GDS Resource
A GDS (Generic Data Service) lets you use custom scripts to start, stop, and probe (check the status of) an application. To integrate a GDS into an RG:
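The command itself is missing from the original; as a sketch only, a SUNW.gds resource could be created along these lines, where the script path, the my_app1-gds name and the my_app1-fs dependency are assumptions to adapt to your setup:
clrs create -g sun-rg -t SUNW.gds \
    -p Start_command="/mnt/my_app1/bin/service_cluster.pl start" \
    -p Stop_command="/mnt/my_app1/bin/service_cluster.pl stop" \
    -p Probe_command="/mnt/my_app1/bin/service_cluster.pl status" \
    -p Network_aware=false \
    -p Resource_dependencies=my_app1-fs \
    my_app1-gds
Network_aware is set to false in this sketch because no TCP port is probed; if your service listens on the network, leave it at its default and set Port_list instead.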
This creates a GDS with your zpool resource as a dependency, which means the file system resource must be online before the GDS is started.
Note: you don't need to add the VIP to the resource dependencies because Sun Cluster does it for you by default.
5.7 Modify / view resource properties
You may need to change some properties or read their current values. To display the properties of a resource:
clrs show -v my_ressource
And to set a property on the resource you choose:
clrs set -p my_property=value my_ressource

examples:
clrs set -p START_TIMEOUT=60 ressource_gds
clrs set -p Probe_command="/mnt/test/bin/service_cluster.pl status my_rg" ressource_gds
5.8 Activating the Resource Group
To activate the RG :
clrg manage sun-rg
Now if you want to use it (this will activate all the resources in the RG):
clrg online sun-rg
You can also target a specific node by adding -n:
clrg online -n node1 sun-rg
6 Maintenance
6.1 Boot in non-cluster mode
6.1.1 Reboot with the command line
If you need to boot into non-cluster mode, use this command:
reboot -- -x
6.1.2 Boot from GRUB
Simply edit the kernel line by adding -x at the end during server boot:
kernel /platform/i86pc/multiboot -x
6.2 Remove node from cluster
Simply run this command
clnode remove
7 FAQ
7.1 Can't integrate cluster
7.1.1 Solution 1
During installation, if you get stuck with this kind of problem:
Waiting for "sun-node2" to become a cluster member ...
Please follow the steps in:
Remove RPC and Webconsole binding
7.1.2 Solution 2
Remove node configuration and retry.
7.2 The cluster is in installation mode
If at the end of the installation you encounter this kind of problem (a message like "The cluster is in installation mode" or "Le cluster est en mode installation"), it means something still needs to be configured before you can create your RGs or resources.
If you have the web UI (http://127.0.0.1:6789 for example), you can probably resolve the problem from there. But in this case you may have installed the quorum, so you need to configure it as well (see the Quorum section above).
7.3 How to change Private Interconnect IP for cluster ?
The cluster install wanted to use a .0.0 network as the private interconnect, but after installation one of the private interconnects ended up on 172.16.0 and the other on 172.16.1, and consequently one private interconnect faulted. I found an article indicating that you can edit the cluster configuration by first booting each machine in non-cluster mode (boot -x; I actually did a reboot, sent a Stop-A during the reboot, and then ran boot -x), editing /etc/cluster/ccr/infrastructure, and then incorporating your changes using:
/usr/cluster/lib/sc/ccradm -o -i /etc/cluster/ccr/infrastructure
After I modified the file to change both private interconnects to be on the 172.16.0 subnet the second private interconnect came on-line. Once the second private interconnect came up I was able to run scsetup, select an additional quorum drive and then set the cluster out of install mode.
7.4 Some commands cannot be executed on a cluster in Install mode
This is generally the case in a two-node cluster when the quorum has not yet been set, as described in the man page:
Specify the installation-mode setting for the cluster. You can specify either enabled or disabled for the installmode property. While the installmode property is enabled, nodes do not attempt to reset their quorum configurations at boot time. Also, while in this mode, many administrative functions are blocked. When you first install a cluster, the installmode property is enabled. After all nodes have joined the cluster for the first time, and shared quorum devices have been added to the configuration, you must explicitly disable the installmode property. When you disable the installmode property, the quorum vote counts are set to default values. If quorum is automatically configured during cluster creation, the installmode property is disabled as well after quorum has been configured.
Anyway, if you don't want to add a quorum device, or want to start using the cluster right away, simply run this command:
cluster set -p installmode=disabled
7.5 Disk path offline
DID number 3 corresponds to, and is reserved for, the disk array management LUN, which is visible to the cluster. Since the cluster cannot write to it (the disk array exposes it read-only), disk path errors are reported. They are not real errors, and you can keep using your cluster carefully.
7.5.1 Method 1
To clean up your DID configuration as much as possible, run this command on all the cluster nodes:
devfsadm
Then, if the output is identical on all the nodes, and only in that case, you can safely run this command:
scgdevs
If you get errors at this point, use Method 2.
7.5.2 Method 2
This second method is the manual one. You can recognize the management device in the 'format' output by its WWN ending in 31, for example in this DID mapping:
3 PA-TLH-SRV-PRD-1:/dev/rdsk/c4t600A0B800048A9B6000008304AC37A31d0 /dev/did/rdsk/d3
If you really want to get rid of these messages, connect to every node of the cluster and run this command:
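The command is missing from the original; with the Sun Cluster 3.2 CLI, I would expect disabling disk path monitoring on that DID to look like this (a sketch, see cldevice(1CL)):
# stop monitoring the disk path of DID d3
cldevice unmonitor d3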
Now we can verify that everything is OK:
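Again a sketch, since the original does not show the verification command:
# display disk path monitoring status; d3 should no longer be listed as monitored
cldevice status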
DID 3 is no longer present among the monitored paths. If you later want to re-enable monitoring on everything:
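A sketch of the reverse operation (an assumption on my part, check cldevice(1CL) for the exact operand syntax):
# re-enable monitoring, either on that DID only or on all devices
cldevice monitor d3
cldevice monitor +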
7.6 Force uninstall
This is not recommended, but if you can't uninstall normally and want to force it, here is the procedure:
- Stop all cluster nodes (scshutdown -y -g 0) and start them again in non-cluster mode
ok boot -x
- Remove the Sun Cluster packages
pkgrm SUNWscu SUNWscr SUNWscdev SUNWscvm SUNWscsam SUNWscman SUNWscsal SUNWmdm
- Remove the configurations
rm -r /var/cluster /usr/cluster /etc/cluster
rm /etc/inet/ntp.conf
rm -r /dev/did
rm -r /devices/pseudo/did*
rm /etc/path_to_inst      (make sure you have a backup copy of this file)
ATTENTION: if you create a new path_to_inst at boot time with 'boot -ra', you should be booted from the physical boot device; it may not be possible to write a path_to_inst on a boot mirror (SVM or VxVM).
- Edit configuration files
- edit /etc/vfstab to remove did and global entries
- edit /etc/nsswitch.conf to remove cluster references
- Reboot the node with the -a option (this is necessary to write a new path_to_inst file)
reboot -- -rav
reply "y" to "do you want to rebuild path_to_inst?"
- If you plan to reinstall, then:
mkdir /globaldevices; rmdir /global
- Uncomment /globaldevices entry from /etc/vfstab
- newfs /dev/rdsk/c?t?d?s? (wherever /globaldevices was mounted)
- mount /globaldevices
- scinstall
7.7 How to Change Sun Cluster Node Names
Make a copy of /etc/cluster/ccr/infrastructure:
cp /etc/cluster/ccr/infrastructure /etc/cluster/ccr/infrastructure.old
Edit /etc/cluster/ccr/infrastructure and change node names as you want. For example, change srv01 to server01 and srv02 to server02.
If necessary, change the Solaris node name:
echo server01 > /etc/nodename
Regenerate the checksum for the infrastructure file:
/usr/cluster/lib/sc/ccradm -i /etc/cluster/ccr/infrastructure -o
Shut down Sun Cluster and boot both nodes:
cluster shutdown -g 0 -y
ok boot
7.8 Can't switch an RG from one node to another
I had a problem switching an RG on Solaris 10u7 with Sun Cluster 3.2u2 (installed patches: 126107-33, 137104-02, 142293-01, 141445-09): the zpool resource refused to mount on the other node. In "/var/adm/messages" I saw this message while the RG was being brought online:
/var/adm/messages:
Dec 16 15:34:30 LD-TLH-SRV-PRD-3 zfs: [ID 427000 kern.warning] WARNING: pool 'ulprod-ld_mysql' could not be loaded as it was last accessed by another system (host: LD-TLH-SRV-PRD-2 hostid: 0x27812152). See: http://www.sun.com/msg/ZFS-8000-EY
In fact it is a bug, and it can be worked around by taking the RG offline:
clrg offline <RG_name>
Then manually import and export the zpool:
zpool import <zpool_name>
zpool export <zpool_name>
Now put the RG online :
clrg online -n <node_name> <rg_name>
If the problem still occurs, look in the log files for this kind of message.
If that is the case, it is apparently fixed in Sun Cluster 3.2u3.
To avoid installing this update, create the folder '/var/cluster/run/HAStoragePlus/zfs':
mkdir -p /var/cluster/run/HAStoragePlus/zfs
Check whether the file "/etc/cluster/eventlog/eventlog.conf" contains the line "EC_zfs - - - /usr/cluster/lib/sc/events/zpool_cachefile_plugin.so".
If it does not, add it so the file contains that line.
Now bring the RG online wherever you want; it should work.
7.9 Cluster is unavailable when a node crashes in a two-node cluster
Two types of problems can arise from cluster partitions: split brain and amnesia. Split brain occurs when the cluster interconnect between Solaris hosts is lost and the cluster becomes partitioned into subclusters, and each subcluster believes that it is the only partition. A subcluster that is not aware of the other subclusters could cause a conflict in shared resources, such as duplicate network addresses and data corruption.
Amnesia occurs if all the nodes leave the cluster in staggered groups. An example is a two-node cluster with nodes A and B. If node A goes down, the configuration data in the CCR is updated on node B only, and not node A. If node B goes down at a later time, and if node A is rebooted, node A will be running with old contents of the CCR. This state is called amnesia and might lead to running a cluster with stale configuration information.
You can avoid split brain and amnesia by giving each node one vote and mandating a majority of votes for an operational cluster. A partition with the majority of votes has a quorum and is enabled to operate. This majority vote mechanism works well if more than two nodes are in the cluster. In a two-node cluster, a majority is two. If such a cluster becomes partitioned, an external vote enables a partition to gain quorum. This external vote is provided by a quorum device. A quorum device can be any disk that is shared between the two nodes.
7.9.1 Recovering from amnesia
Scenario: Two node cluster (nodes A and B) with one Quorum Device, nodeA has gone bad, and amnesia protection is preventing nodeB from booting up.
Warning : this is a dangerous operation
- Boot nodeB in non-cluster mode :
reboot -- -x
- Edit nodeB's file /etc/cluster/ccr/global/infrastructure as follows :
- Change the value of "cluster.properties.installmode" from "disabled" to "enabled"
- Change the number of votes for nodeA from "1" to "0", in the following property line "cluster.nodes.<NodeA's id>.properties.quorum_vote".
- Delete all lines with "cluster.quorum_devices" to remove knowledge of the quorum device.
/etc/cluster/ccr/global/infrastructure:
...
cluster.properties.installmode enabled
...
cluster.nodes.1.properties.quorum_vote 1
...
- On the first node (the master one, the first to boot) launch :
/usr/cluster/lib/sc/ccradm -i /etc/cluster/ccr/infrastructure -o

or (depending on the version):

/usr/cluster/lib/sc/ccradm recover -o /etc/cluster/ccr/global/infrastructure
- Reboot nodeB in cluster mode :
reboot
If you have more than 2 nodes, run the same command but without "-o":
/usr/cluster/lib/sc/ccradm recover /etc/cluster/ccr/global/infrastructure
8 References
Installation of Sun Cluster (old)
http://en.wikipedia.org/wiki/Solaris_Cluster
http://opensolaris.org/os/community/ha-clusters/translations/french/relnote_fr/
Resource Properties
http://docs.sun.com/app/docs/doc/819-0177/cbbbgfij?l=ja&a=view
http://www.vigilanttechnologycorp.com/genasys/weblogRender.jsp?LogName=Sun%20Cluster
http://docs.sun.com/app/docs/doc/820-2558/gdrna?l=fr&a=view
http://wikis.sun.com/display/SunCluster/%28English%29+Sun+Cluster+3.2+1-09+Release+Notes#%28English%29SunCluster3.21-09ReleaseNotes-optgdfsinfo (a gold mine)
Deploying highly available zones with Solaris Cluster 3.2