Installation and Configuration of Sun Cluster


1 Introduction

Solaris Cluster (sometimes Sun Cluster or SunCluster) is a high-availability cluster software product for the Solaris Operating System, created by Sun Microsystems.

It is used to improve the availability of software services such as databases, file sharing on a network, electronic commerce websites, or other applications. Sun Cluster operates by having redundant computers or nodes where one or more computers continue to provide service if another fails. Nodes may be located in the same data center or on different continents.

This documentation was written with:

  • Solaris 10 update 7
  • Sun Cluster 3.2u2

2 Requirements

All of these prerequisites are required before installing Sun Cluster. Follow all of these steps before installing.

2.1 Hardware

To build a real cluster, here is the required hardware list:

  • 2 nodes
    • sun-node1
    • sun-node2
  • 4 network cards
    • 2 for the public interface (with IPMP)
    • 2 for the private interface (cluster heartbeat and node information exchange)
  • 1 disk array with 1 spare disk

2.2 Partitioning

While you install Solaris, you should create a slice mounted on /globaldevices with at least 512 MB. This slice should be formatted as UFS (ZFS does not work for the global devices file system at the moment).

If you did not create this slice during the Solaris installation, you can:

  • Use the format command to create a new slice
  • Use the newfs command to create a UFS file system on it
  • Mount this file system on /globaldevices (with the global option)
  • Add it to /etc/vfstab, for example:
Configuration File /etc/vfstab
/dev/did/dsk/d6s3 /dev/did/rdsk/d6s3 /global/.devices/node@2 ufs 2 no global
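
For reference, here is a minimal command sketch of these steps, assuming the new slice is c0t0d0s3 (a hypothetical device name, adapt it to your disk layout):

Command newfs
format                                  # create a slice of at least 512 MB (slice 3 of the boot disk here)
newfs /dev/rdsk/c0t0d0s3                # create a UFS file system on it
mkdir -p /globaldevices
mount /dev/dsk/c0t0d0s3 /globaldevices  # scinstall will later remount it as /global/.devices/node@<id>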

Note: since Sun Cluster 3.2 update 2, the /globaldevices slice is no longer required and ZFS can be used as the default root file system.

2.3 Hostname Configuration

Change the hostname to match the cluster naming scheme you wish to use:
Changing Solaris hostname

Do not forget to apply the same /etc/hosts file to all cluster nodes, and whenever you change it, propagate the change to every node!
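
As a reminder, on Solaris 10 the hostname is usually changed like this (sun-node1 is an example name, and a reboot is the simplest way to apply it everywhere):

Command
echo "sun-node1" > /etc/nodename        # new node name, applied at next boot
vi /etc/hosts                           # update the entry for the node (then copy the file to the other nodes)
vi /etc/hostname.e1000g0                # only if the public interface file references the old hostname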

2.4 Patches

Use Sun Update Manager if you have a graphical interface to apply all available updates. If you do not have a graphical interface, install all available patches anyway to avoid installation problems.
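
Without a graphical interface, and assuming the system is registered with the Sun update service, the patches can also be applied with smpatch:

Command smpatch
smpatch analyze          # list the patches the system needs
smpatch update           # download and apply them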

2.5 IPMP Configuration

You need to configure at least 2 interfaces for your public network. Follow this documentation :
IPMP Configuration

You don't have to do it for your private network because it will be automatically done by the cluster during installation.
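
For reference, a minimal link-based IPMP configuration (no test addresses) for the two public interfaces could look like this; the interface names, hostname and group name are only examples:

Configuration File /etc/hostname.e1000g0
sun-node1 group ipmp0 up

Configuration File /etc/hostname.e1000g1
group ipmp0 up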

2.6 Activate all network cards

With your 4 network cards, you should activate (plumb) all of them so that they are easily recognized during installation. First, run ifconfig -a to check whether all your cards are plumbed. If not, create their hostname files and plumb them:

Command touch
touch /etc/hostname.e1000g2
touch /etc/hostname.e1000g3
ifconfig e1000g2 plumb
ifconfig e1000g3 plumb

2.7 Remove RPC and Webconsole binding

If you have installed the latest Solaris release, you may encounter node integration problems due to RPC binding. These are new Sun security features. As we need to allow communication between the nodes, we need to disable the local-only binding for RPC (and can do the same for the web console). You should perform this operation on every node.

  • Ensure that the local_only property of rpcbind is set to false:
Command svcprop
svcprop network/rpc/bind:default 

If local_only is set to true, run these commands and refresh the service:

Command svccfg
$ svccfg
svc:> select network/rpc/bind
svc:/network/rpc/bind> setprop config/local_only=false
svc:/network/rpc/bind> quit
svcadm refresh network/rpc/bind:default

Now communication between nodes works.

  • Ensure that the tcp_listen property of webconsole is set to true:
Command svcprop
svcprop /system/webconsole:console 

If tcp_listen is not set to true, run these commands and restart the service:

Command svccfg
$ svccfg
svc:> select system/webconsole
svc:/system/webconsole> setprop options/tcp_listen=true
svc:/system/webconsole> quit
/usr/sbin/smcwebserver restart

This is needed for Sun Cluster Manager communication. To verify that the console is listening on *.6789, you can run:

Command netstat
netstat -an | grep 6789

If you want to handle both of these settings more quickly, use these commands:

Command
svccfg -s network/rpc/bind setprop config/local_only=false
svcadm refresh network/rpc/bind:default
svccfg -s system/webconsole setprop options/tcp_listen=true
/usr/sbin/smcwebserver restart

2.8 Profile

Configure the root profile (~/.profile), or /etc/profile for all users, by adding this line:

Configuration File ~/.profile
PATH=$PATH:/usr/cluster/bin/

Now refresh your configuration :

Command source
source ~/.profile

or

source /etc/profile

2.9 Ending

Restart all the nodes once everything above is done.

3 Installation

First of all, download the Sun Cluster package (normally a zip archive) and uncompress it on all nodes. You should end up with a "Solaris_x86" folder.

Now launch the installer on all the nodes :

Command
cd /Solaris_x86
./installer

We will need to install Sun Cluster Core, plus the Quorum component if we want to add more than 2 nodes now or later.

4 Configuration

4.1 Wizard configuration

Before launching the installation, you should know that there are two ways to configure the nodes:

  • One by one
  • All in one shot

If you want to do it all in one shot, you have to exchange the root SSH public keys between all nodes.
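
A minimal way to exchange the keys, shown from sun-node1 (repeat in the other direction, and make sure root logins are allowed in /etc/ssh/sshd_config), could be:

Command ssh-keygen
ssh-keygen -t rsa                      # accept the defaults and an empty passphrase
cat ~/.ssh/id_rsa.pub | ssh sun-node2 'mkdir -p ~/.ssh; cat >> ~/.ssh/authorized_keys'
ssh sun-node2 hostname                 # must now log in without a password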

Command scinstall
scinstall

  *** Main Menu ***
 
    Please select from one of the following (*) options:
 
      * 1) Create a new cluster or add a cluster node
        2) Configure a cluster to be JumpStarted from this install server
        3) Manage a dual-partition upgrade
        4) Upgrade this cluster node
        5) Print release information for this cluster node
 
      * ?) Help with menu options
      * q) Quit
 
    Option:

Answer : 1

  *** New Cluster and Cluster Node Menu ***
 
    Please select from any one of the following options:
 
        1) Create a new cluster
        2) Create just the first node of a new cluster on this machine
        3) Add this machine as a node in an existing cluster
 
        ?) Help with menu options
        q) Return to the Main Menu
 
    Option:

Answer : 1

  *** Create a New Cluster ***
 
 
    This option creates and configures a new cluster.
 
    You must use the Java Enterprise System (JES) installer to install the
    Sun Cluster framework software on each machine in the new cluster 
    before you select this option.
 
    If the "remote configuration" option is unselected from the JES 
    installer when you install the Sun Cluster framework on any of the new
    nodes, then you must configure either the remote shell (see rsh(1)) or
    the secure shell (see ssh(1)) before you select this option. If rsh or
    ssh is used, you must enable root access to all of the new member 
    nodes from this node.
 
    Press Control-d at any time to return to the Main Menu.
 
 
    Do you want to continue (yes/no)

Answer : yes

  >>> Typical or Custom Mode <<<
 
    This tool supports two modes of operation, Typical mode and Custom. 
    For most clusters, you can use Typical mode. However, you might need 
    to select the Custom mode option if not all of the Typical defaults 
    can be applied to your cluster.
 
    For more information about the differences between Typical and Custom 
    modes, select the Help option from the menu.
 
    Please select from one of the following options:
 
        1) Typical
        2) Custom
 
        ?) Help
        q) Return to the Main Menu
 
    Option [1]:

Answer : 2

  >>> Cluster Name <<<
 
    Each cluster has a name assigned to it. The name can be made up of any
    characters other than whitespace. Each cluster name should be unique 
    within the namespace of your enterprise.
 
    What is the name of the cluster you want to establish ?

Answer : sun-cluster

  >>> Cluster Nodes <<<
 
    This Sun Cluster release supports a total of up to 16 nodes.
 
    Please list the names of the other nodes planned for the initial 
    cluster configuration. List one node name per line. When finished, 
    type Control-D:
 
    Node name:  sun-node1
    Node name:  sun-node2
    Node name (Control-D to finish):  ^D

Enter the 2 nodes name and finish with Ctrl+D.

    This is the complete list of nodes:
 
        sun-node1
        sun-node2
 
    Is it correct (yes/no) [yes]?

Answer : yes

    Attempting to contact "sun-node2" ... done
 
    Searching for a remote configuration method ... done
 
    The secure shell (see ssh(1)) will be used for remote execution.
 
 
Press Enter to continue:

Answer : press Enter

  >>> Authenticating Requests to Add Nodes <<<
 
    Once the first node establishes itself as a single node cluster, other
    nodes attempting to add themselves to the cluster configuration must 
    be found on the list of nodes you just provided. You can modify this 
    list by using claccess(1CL) or other tools once the cluster has been 
    established.
 
    By default, nodes are not securely authenticated as they attempt to 
    add themselves to the cluster configuration. This is generally 
    considered adequate, since nodes which are not physically connected to
    the private cluster interconnect will never be able to actually join 
    the cluster. However, DES authentication is available. If DES 
    authentication is selected, you must configure all necessary 
    encryption keys before any node will be allowed to join the cluster 
    (see keyserv(1M), publickey(4)).
 
    Do you need to use DES authentication (yes/no) [no]?

Answer : no

  >>> Network Address for the Cluster Transport <<<
 
    The cluster transport uses a default network address of 172.16.0.0. If
    this IP address is already in use elsewhere within your enterprise, 
    specify another address from the range of recommended private 
    addresses (see RFC 1918 for details).
 
    The default netmask is 255.255.248.0. You can select another netmask, 
    as long as it minimally masks all bits that are given in the network 
    address.
 
    The default private netmask and network address result in an IP 
    address range that supports a cluster with a maximum of 64 nodes and 
    10 private networks.
 
    Is it okay to accept the default network address (yes/no) [yes]?

Answer : yes

    Is it okay to accept the default netmask (yes/no) [yes]?

Answer : yes

  >>> Minimum Number of Private Networks <<<
 
    Each cluster is typically configured with at least two private 
    networks. Configuring a cluster with just one private interconnect 
    provides less availability and will require the cluster to spend more 
    time in automatic recovery if that private interconnect fails.
 
    Should this cluster use at least two private networks (yes/no) [yes]?

Answer : yes

  >>> Point-to-Point Cables <<<
 
    The two nodes of a two-node cluster may use a directly-connected 
    interconnect. That is, no cluster switches are configured. However, 
    when there are greater than two nodes, this interactive form of 
    scinstall assumes that there will be exactly one switch for each 
    private network.
 
    Does this two-node cluster use switches (yes/no) [yes]?

Answer : no

  >>> Cluster Transport Adapters and Cables <<<
 
    You must configure the cluster transport adapters for each node in the
    cluster. These are the adapters which attach to the private cluster 
    interconnect.
 
    Select the first cluster transport adapter for "sun-node1":
 
        1) e1000g1
        2) e1000g2
        3) e1000g3
        4) Other
 
    Option:

Answer : 3

    Adapter "e1000g3" is an Ethernet adapter.
 
    Searching for any unexpected network traffic on "e1000g3" ... done
    Verification completed. No traffic was detected over a 10 second 
    sample period.
 
    The "dlpi" transport type will be set for this cluster.
 
    Name of adapter on "sun-node2" to which "e1000g3" is connected?  e1000g3
 
    Select the second cluster transport adapter for "sun-node1":
 
        1) e1000g1
        2) e1000g2
        3) e1000g3
        4) Other
 
    Option:

Answer : 2

    Adapter "e1000g2" is an Ethernet adapter.
 
    Searching for any unexpected network traffic on "e1000g2" ... done
    Verification completed. No traffic was detected over a 10 second 
    sample period.
 
    Name of adapter on "sun-node2" to which "e1000g2" is connected?

Answer : e1000g2

  >>> Quorum Configuration <<<
 
    Every two-node cluster requires at least one quorum device. By 
    default, scinstall will select and configure a shared SCSI quorum disk
    device for you.
 
    This screen allows you to disable the automatic selection and 
    configuration of a quorum device.
 
    The only time that you must disable this feature is when ANY of the 
    shared storage in your cluster is not qualified for use as a Sun 
    Cluster quorum device. If your storage was purchased with your 
    cluster, it is qualified. Otherwise, check with your storage vendor to
    determine whether your storage device is supported as Sun Cluster 
    quorum device.
 
    If you disable automatic quorum device selection now, or if you intend
    to use a quorum device that is not a shared SCSI disk, you must 
    instead use clsetup(1M) to manually configure quorum once both nodes 
    have joined the cluster for the first time.
 
    Do you want to disable automatic quorum device selection (yes/no) [no]?

Answer : yes

  >>> Global Devices File System <<<
 
    Each node in the cluster must have a local file system mounted on 
    /global/.devices/node@<nodeID> before it can successfully participate 
    as a cluster member. Since the "nodeID" is not assigned until 
    scinstall is run, scinstall will set this up for you.
 
    You must supply the name of either an already-mounted file system or 
    raw disk partition which scinstall can use to create the global 
    devices file system. This file system or partition should be at least 
    512 MB in size.
 
    If an already-mounted file system is used, the file system must be 
    empty. If a raw disk partition is used, a new file system will be 
    created for you.
 
    The default is to use /globaldevices.
 
    Is it okay to use this default (yes/no) [yes]?

Answer : yes

    Testing for "/globaldevices" on "sun-node1" ... done
 
    For node "sun-node2",
       Is it okay to use this default (yes/no) [yes]?

Answer : yes

    Is it okay to create the new cluster (yes/no) [yes]?

Answer : yes

  >>> Automatic Reboot <<<
 
    Once scinstall has successfully initialized the Sun Cluster software 
    for this machine, the machine must be rebooted. After the reboot, this
    machine will be established as the first node in the new cluster.
 
    Do you want scinstall to reboot for you (yes/no) [yes]?

Answer : yes

    During the cluster creation process, sccheck is run on each of the new
    cluster nodes. If sccheck detects problems, you can either interrupt 
    the process or check the log files after the cluster has been 
    established.
 
    Interrupt cluster creation for sccheck errors (yes/no) [no]?

Answer : no

    The Sun Cluster software is installed on "sun-node2".
 
    Started sccheck on "sun-node1".
    Started sccheck on "sun-node2".
 
    sccheck completed with no errors or warnings for "sun-node1".
    sccheck completed with no errors or warnings for "sun-node2".
 
 
    Configuring "sun-node2" ... done
    Rebooting "sun-node2" ... 
 
    Waiting for "sun-node2" to become a cluster member ...

4.2 Manual configuration

Here is an example that configures the first node and authorizes another node:

Command scinstall
scinstall -i \
-C PA-TLH-CLU-UAT-1 \
-F \
-T node=PA-TLH-SRV-UAT-1,node=PA-TLH-SRV-UAT-2,authtype=sys \
-w netaddr=10.255.255.0,netmask=255.255.255.0,maxnodes=16,maxprivatenets=2,numvirtualclusters=1 \
-A trtype=dlpi,name=nge2 -A trtype=dlpi,name=nge3 \
-B type=switch,name=PA-TLH-SWI-IN-1 -B type=switch,name=PA-TLH-SWI-IN-2 \
-m endpoint=:nge2,endpoint=PA-TLH-SWI-IN-1 \
-m endpoint=:nge3,endpoint=PA-TLH-SWI-IN-2 \
-P task=quorum,state=INIT

4.3 Quorum

If you performed the installation with a quorum, you will need to set it up with the web console or with these commands. First, list all LUNs in DID format:

Command scdidadm -l
1        LD-TLH-SRV-UAT-1:/dev/rdsk/c0t0d0 /dev/did/rdsk/d1     
2        LD-TLH-SRV-UAT-1:/dev/rdsk/c1t0d0 /dev/did/rdsk/d2     
5        LD-TLH-SRV-UAT-1:/dev/rdsk/c2t201500A0B856312Cd31 /dev/did/rdsk/d5     
5        LD-TLH-SRV-UAT-1:/dev/rdsk/c2t201400A0B856312Cd31 /dev/did/rdsk/d5     
5        LD-TLH-SRV-UAT-1:/dev/rdsk/c3t202400A0B856312Cd31 /dev/did/rdsk/d5     
5        LD-TLH-SRV-UAT-1:/dev/rdsk/c3t202500A0B856312Cd31 /dev/did/rdsk/d5     
6        LD-TLH-SRV-UAT-1:/dev/rdsk/c5t600A0B800056312C000009CB4AAA2A14d0 /dev/did/rdsk/d6     
7        LD-TLH-SRV-UAT-1:/dev/rdsk/c5t600A0B800056381A00000E0A4AAA2A14d0 /dev/did/rdsk/d7

Choose the LUN you wish to use for your quorum:

Command clquorum
/usr/cluster/bin/clquorum add /dev/did/rdsk/d6

Then, activate it :

Command clquorum
/usr/cluster/bin/clquorum enable /dev/did/rdsk/d6

To finish, you need to reset it :

Command clquorum
/usr/cluster/bin/clquorum reset
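
You can then check the quorum state:

Command clquorum
/usr/cluster/bin/clquorum status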

Now you are able to configure your cluster.

4.4 Network

4.4.1 Cluster connections

To check cluster interconnect, please use this command :

Command clinterconnect
clinterconnect status

To enable a network card in the cluster interconnect:

Command clinterconnect
clinterconnect enable hostname:card

example :

clinterconnect enable localhost:e1000g0

4.4.2 Check network interconnect interfaces

To check that all interfaces are working, configure a test IP address on each private (cluster) interface, then send a broadcast ping:

Command ping
ping -s 10.255.255.255

You can replace these example private IPs with your own (the default cluster broadcast address is 172.16.0.255).
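
For example, temporary test addresses can be assigned like this (the interface name and the addresses are only examples):

Command ifconfig
# on sun-node1
ifconfig e1000g2 10.255.255.1 netmask 255.255.255.0 up
# on sun-node2
ifconfig e1000g2 10.255.255.2 netmask 255.255.255.0 up
# then, from either node, every private interface should answer the broadcast ping
ping -s 10.255.255.255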


4.4.3 Check traffic

Use the snoop command to watch incoming traffic, for example:

Command snoop
snoop -d <interface> <ip>

example :

snoop -d nge0 192.168.76.2

4.4.4 Get Fibre Channel WWNs

To get the Fibre Channel identifiers (WWNs), run this command:

Command fcinfo hba-port
HBA Port WWN: 2100001b32892934
        OS Device Name: /dev/cfg/c2
        Manufacturer: QLogic Corp.
        Model: 375-3356-02
        Firmware Version: 4.04.01
        FCode/BIOS Version:  BIOS: 1.24; fcode: 1.24; EFI: 1.8;
        Serial Number: 0402R00-0906696990
        Driver Name: qlc
        Driver Version: 20081115-2.29
        Type: N-port
        State: online
        Supported Speeds: 1Gb 2Gb 4Gb 
        Current Speed: 4Gb 
        Node WWN: 2000001b32892934
HBA Port WWN: 2101001b32a92934
        OS Device Name: /dev/cfg/c3
        Manufacturer: QLogic Corp.
        Model: 375-3356-02
        Firmware Version: 4.04.01
        FCode/BIOS Version:  BIOS: 1.24; fcode: 1.24; EFI: 1.8;
        Serial Number: 0402R00-0906696990
        Driver Name: qlc
        Driver Version: 20081115-2.29
        Type: N-port
        State: online
        Supported Speeds: 1Gb 2Gb 4Gb 
        Current Speed: 4Gb 
        Node WWN: 2001001b32a92934

5 Manage

5.1 Get cluster state

To get the cluster state, simply run the scstat command:

Command scstat
------------------------------------------------------------------
 
-- Cluster Nodes --
 
                    Node name           Status
                    ---------           ------
  Cluster node:     PA-TLH-SRV-PRD-1    Online
  Cluster node:     PA-TLH-SRV-PRD-2    Online
  Cluster node:     PA-TLH-SRV-PRD-3    Online
  Cluster node:     PA-TLH-SRV-PRD-6    Online
  Cluster node:     PA-TLH-SRV-PRD-4    Online
  Cluster node:     PA-TLH-SRV-PRD-5    Online
 
------------------------------------------------------------------
 
-- Cluster Transport Paths --
 
                    Endpoint               Endpoint               Status
                    --------               --------               ------
  Transport path:   PA-TLH-SRV-PRD-1:nge3  PA-TLH-SRV-PRD-6:nge3  Path online
  Transport path:   PA-TLH-SRV-PRD-1:nge2  PA-TLH-SRV-PRD-6:nge2  Path online
  Transport path:   PA-TLH-SRV-PRD-1:nge3  PA-TLH-SRV-PRD-2:nge3  Path online
  Transport path:   PA-TLH-SRV-PRD-1:nge3  PA-TLH-SRV-PRD-5:nge3  Path online
  Transport path:   PA-TLH-SRV-PRD-1:nge2  PA-TLH-SRV-PRD-2:nge2  Path online
  Transport path:   PA-TLH-SRV-PRD-1:nge2  PA-TLH-SRV-PRD-5:nge2  Path online
  Transport path:   PA-TLH-SRV-PRD-1:nge3  PA-TLH-SRV-PRD-4:nge3  Path online
  Transport path:   PA-TLH-SRV-PRD-1:nge2  PA-TLH-SRV-PRD-4:nge2  Path online
  Transport path:   PA-TLH-SRV-PRD-1:nge3  PA-TLH-SRV-PRD-3:nge3  Path online
  Transport path:   PA-TLH-SRV-PRD-1:nge2  PA-TLH-SRV-PRD-3:nge2  Path online
  Transport path:   PA-TLH-SRV-PRD-2:nge2  PA-TLH-SRV-PRD-5:nge2  Path online
  Transport path:   PA-TLH-SRV-PRD-2:nge3  PA-TLH-SRV-PRD-3:nge3  Path online
  Transport path:   PA-TLH-SRV-PRD-2:nge2  PA-TLH-SRV-PRD-4:nge2  Path online
  Transport path:   PA-TLH-SRV-PRD-2:nge3  PA-TLH-SRV-PRD-6:nge3  Path online
  Transport path:   PA-TLH-SRV-PRD-2:nge2  PA-TLH-SRV-PRD-3:nge2  Path online
  Transport path:   PA-TLH-SRV-PRD-2:nge3  PA-TLH-SRV-PRD-5:nge3  Path online
  Transport path:   PA-TLH-SRV-PRD-2:nge2  PA-TLH-SRV-PRD-6:nge2  Path online
  Transport path:   PA-TLH-SRV-PRD-2:nge3  PA-TLH-SRV-PRD-4:nge3  Path online
  Transport path:   PA-TLH-SRV-PRD-3:nge2  PA-TLH-SRV-PRD-5:nge2  Path online
  Transport path:   PA-TLH-SRV-PRD-3:nge3  PA-TLH-SRV-PRD-6:nge3  Path online
  Transport path:   PA-TLH-SRV-PRD-3:nge2  PA-TLH-SRV-PRD-4:nge2  Path online
  Transport path:   PA-TLH-SRV-PRD-3:nge3  PA-TLH-SRV-PRD-5:nge3  Path online
  Transport path:   PA-TLH-SRV-PRD-3:nge3  PA-TLH-SRV-PRD-4:nge3  Path online
  Transport path:   PA-TLH-SRV-PRD-3:nge2  PA-TLH-SRV-PRD-6:nge2  Path online
  Transport path:   PA-TLH-SRV-PRD-6:nge3  PA-TLH-SRV-PRD-5:nge3  Path online
  Transport path:   PA-TLH-SRV-PRD-6:nge2  PA-TLH-SRV-PRD-5:nge2  Path online
  Transport path:   PA-TLH-SRV-PRD-6:nge3  PA-TLH-SRV-PRD-4:nge3  Path online
  Transport path:   PA-TLH-SRV-PRD-6:nge2  PA-TLH-SRV-PRD-4:nge2  Path online
  Transport path:   PA-TLH-SRV-PRD-4:nge2  PA-TLH-SRV-PRD-5:nge2  Path online
  Transport path:   PA-TLH-SRV-PRD-4:nge3  PA-TLH-SRV-PRD-5:nge3  Path online
 
------------------------------------------------------------------
 
-- Quorum Summary from latest node reconfiguration --
 
  Quorum votes possible:      11
  Quorum votes needed:        6
  Quorum votes present:       11
 
 
-- Quorum Votes by Node (current status) --
 
                    Node Name           Present Possible Status
                    ---------           ------- -------- ------
  Node votes:       PA-TLH-SRV-PRD-1    1        1       Online
  Node votes:       PA-TLH-SRV-PRD-2    1        1       Online
  Node votes:       PA-TLH-SRV-PRD-3    1        1       Online
  Node votes:       PA-TLH-SRV-PRD-6    1        1       Online
  Node votes:       PA-TLH-SRV-PRD-4    1        1       Online
  Node votes:       PA-TLH-SRV-PRD-5    1        1       Online
 
 
-- Quorum Votes by Device (current status) --
 
                    Device Name         Present Possible Status
                    -----------         ------- -------- ------
  Device votes:     /dev/did/rdsk/d28s2 5        5       Online
 
------------------------------------------------------------------
 
-- Device Group Servers --
 
                         Device Group        Primary             Secondary
                         ------------        -------             ---------
 
 
-- Device Group Status --
 
                              Device Group        Status              
                              ------------        ------              
 
 
-- Multi-owner Device Groups --
 
                              Device Group        Online Status
                              ------------        -------------
 
------------------------------------------------------------------
------------------------------------------------------------------
 
-- IPMP Groups --
 
              Node Name           Group   Status         Adapter   Status
              ---------           -----   ------         -------   ------
------------------------------------------------------------------

5.2 Registering Resources

You can list the resource types that are already registered:

Command clrt
$ clrt list
SUNW.LogicalHostname:2
SUNW.SharedAddress:2

Here we need additional resource types, such as HA Storage (HAStoragePlus) and GDS, so we register them:

Command clrt
clrt register SUNW.HAStoragePlus
clrt register SUNW.gds

Now we can verify :

Command clrt
$ clrt list
SUNW.LogicalHostname:2
SUNW.SharedAddress:2
SUNW.HAStoragePlus:6
SUNW.gds:6

5.3 Creating a Resource Group

An RG (Resource Group) is a container holding resources such as a VIP (Virtual IP, also called a Logical Host):

Command clrg
clrg create sun-rg

You can also restrict the RG to a specific node:

Command clrg
clrg create -n sun-node1 sun-rg

5.4 Creating a Logical Host (VIP) Resource

All the VIPs you need should be declared in the /etc/hosts file on each node, for example:

Configuration File /etc/hosts
#
# Internet host table
#
::1             localhost
127.0.0.1       localhost
192.168.0.72       sun-node1
192.168.0.77       sun-node2
192.168.0.79       my_app1_vip
192.168.0.80       my_app2_vip

Now, create the VIP resource:

Command clrslh
clrslh create -g sun-rg -h my_app1_vip my_app1_vip

  • sun-rg : the resource group (created before)
  • my_app1_vip : name of the VIP in the hosts file
  • my_app1_vip : name of the VIP resource in the cluster

To bind it to a specific IPMP group and node:

Command clrslh
clrslh create -g sun-rg -h my_app1_vip -N ipmp0@sun-node1 my_app1_vip

5.5 Creating a FileSystem Resource

Once your LUNs have been created, make sure you can see all the available DIDs on all nodes:

Command scdidadm
scdidadm -l

and compare with the output of the 'format' command; both should look similar. If not, run these commands on all nodes:

Command scdidadm
scdidadm -C
scdidadm -r

This will remove the deleted LUNs from, and add the newly created LUNs to, the cluster DID configuration.

Now create, for example, a zpool for each of your services (a creation sketch is given after the explanations below). Once done, use it as a file system resource:

Command clrs
clrs create -g sun-rg -t SUNW.HAStoragePlus -p zpools=my_app1 my_app1-fs

  • sun-rg : name of the resource group
  • my_app1 : zpool name
  • my_app1-fs : file system cluster resource name
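
For reference, the zpool itself can be created beforehand on a shared LUN. Here is a sketch, reusing a device name from the scdidadm output above purely as an illustration (use a LUN visible from every node):

Command zpool
zpool create my_app1 c5t600A0B800056381A00000E0A4AAA2A14d0
zpool export my_app1     # leave it exported; HAStoragePlus imports it on the node where the RG goes online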

5.6 Creating a GDS Resource

A GDS (Generic Data Service) lets you use custom scripts to start, stop, or probe (check the status of) an application. To integrate a GDS into an RG:

Command clrs
clrs create -g sun-rg -t SUNW.gds -p Start_command="/bin/myscript.sh start" -p Stop_command="/bin/myscript.sh stop" -p Probe_command="/bin/myscript.sh status" -p resource_dependencies=my_app1-fs -p Network_aware=false my_app1-script

This creates a GDS with your zpool resource as a dependency, which means the zpool must be up before the GDS is started.

Note: you do not need to add the VIP as a resource dependency because Sun Cluster does it for you by default.
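
As an illustration, /bin/myscript.sh could follow a skeleton like this one (an example only: adapt the start, stop and status logic to your application; the probe must exit 0 when the service is healthy and non-zero otherwise):

Configuration File /bin/myscript.sh
#!/bin/sh
# Minimal start/stop/probe wrapper for a GDS resource (example)
case "$1" in
  start)
        /opt/my_app/bin/my_app &                  # start the application (path is an example)
        ;;
  stop)
        pkill -f my_app                           # stop it
        ;;
  status)
        pgrep -f my_app > /dev/null || exit 100   # non-zero exit = probe failure
        ;;
esac
exit 0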


5.7 Modify / view resource properties

You may need to change some properties or retrieve information about them. To display a resource's properties:

Command clrs
clrs show -v my_ressource

And to set a property on the resource of your choice:

Command clrs
clrs set -p my_property=value my_ressource

ex :

clrs set -p START_TIMEOUT=60 ressource_gds
clrs set -p Probe_command="/mnt/test/bin/service_cluster.pl status my_rg" ressource_gds

5.8 Activating the Resource Group

To activate the RG :

Command clrg
clrg manage sun-rg

Now, if you want to bring it online (this will activate all the resources in the RG):

Command clrg
clrg online sun-rg

You can also specify a node by adding -n:

Command clrg
clrg online -n node1 sun-rg
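
You can then check on which node the resource group is online:

Command clrg
clrg status sun-rg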

6 Maintenance

6.1 Boot in non-cluster mode

6.1.1 Reboot with command line

If you need to boot in non-cluster mode, use this command:

Command reboot
reboot -- -x

6.1.2 Boot from grub

Simply edit this line by adding -x at the end during server boot :

kernel /platform/i86pc/multiboot -x

6.2 Remove node from cluster

Simply run this command

Command clnode
clnode remove

7 FAQ

7.1 Can't integrate cluster

7.1.1 Solution 1

During installation, if you get stuck on this kind of message:

Waiting for "sun-node2" to become a cluster member ...

Please follow this step :
Remove RPC and Webconsole binding

7.1.2 Solution 2

Remove node configuration and retry.

7.2 The cluster is in installation mode

If at the end of the installation you encounter this kind of problem (a message like "The cluster is in installation mode", or "Le cluster est en mode installation" in French), it means you still need to configure something before creating your RGs or RSs.

If you have the web UI (http://127.0.0.1:6789 for example), you can probably solve the problem from there. In this case you most likely have not configured the quorum yet, so you need to configure it as well.

7.3 How to change Private Interconnect IP for cluster ?

The cluster installation wanted to use a .0.0 network as the private interconnect, but once installed one of the private interconnects ended up on 172.16.0 and the other on 172.16.1, so one private interconnect faulted. I found an article indicating that you can edit the cluster configuration by first booting each machine in non-cluster mode (boot -x; I actually did a reboot, then a Stop-A during the reboot, and then a boot -x), then editing /etc/cluster/ccr/infrastructure and incorporating the changes using:

Command ccradm
/usr/cluster/lib/sc/ccradm -o -i /etc/cluster/ccr/infrastructure

After I modified the file so that both private interconnects were on the 172.16.0 subnet, the second private interconnect came online. Once it came up, I was able to run scsetup, select an additional quorum drive and take the cluster out of install mode.

7.4 Some commands cannot be executed on a cluster in Install mode

This is generally the case in a two-node cluster when the quorum has not been set yet, as described in the man page:

   Specify the installation-mode setting for the cluster. You can
   specify either enabled or disabled for the installmode property.
 
   While the installmode property is enabled, nodes do not attempt to
   reset their quorum configurations at boot time. Also, while in this
   mode, many administrative functions are blocked. When you first
   install a cluster, the installmode property is enabled.
 
   After all nodes have joined the cluster for the first time, and
   shared quorum devices have been added to the configuration, you must
   explicitly disable the installmode property. When you disable the
   installmode property, the quorum vote counts are set to default
   values. If quorum is automatically configured during cluster
   creation, the installmode property is disabled as well after quorum
   has been configured.

Anyway, if you do not want to add a quorum device, or if you want to start using the cluster right away, simply run this command:

Command cluster
cluster set -p installmode=disabled
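
To check the current value afterwards (the installmode property appears among the global cluster properties; the grep is just a convenience):

Command cluster
cluster show -t global | grep -i installmode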

7.5 Disk path offline

DID number 3 corresponds to the disk array's management LUN, which may be seen by the cluster. As the cluster cannot write to it (the disk array exposes it read-only), it reports errors. These are not real errors and you can still use your cluster.

7.5.1 Method 1

To clean up your DIDs as much as possible, run this command on all the cluster nodes:

Command devfsadm
devfsadm

Then, if the output is the same on all nodes (and only in that case), you can safely run this command:

Command
scgdevs

If you get this kind of error, use Method 2:

Command scgdevs
$ scgdevs
Configuring DID devices
/usr/cluster/bin/scdidadm: Could not open "/dev/rdsk/c0t0d0s2" to verfiy device ID - Device busy.
/usr/cluster/bin/scdidadm: Could not stat "../../devices/scsi_vhci/disk@g600a0b80005634b400005a334accd4d9:c,raw" - No such file or directory.
Warning: Path node loaded - "../../devices/scsi_vhci/disk@g600a0b80005634b400005a334accd4d9:c,raw".
Configuring the /dev/global directory (global devices)
obtaining access to all attached disks

7.5.2 Method 2

This second method is the manual one. In the format output, you can recognize the device as the one whose WWN ends in 31, for example:

3       PA-TLH-SRV-PRD-1:/dev/rdsk/c4t600A0B800048A9B6000008304AC37A31d0 /dev/did/rdsk/d3

If you really want to get rid of these messages, connect to every node in the cluster and run this command:

Command scdidadm
$ scdidadm -C
scdidadm:  Unable to remove driver instance "3" - No such device or address.
Updating shared devices on node 1
Updating shared devices on node 2
Updating shared devices on node 3
Updating shared devices on node 4
Updating shared devices on node 5
Updating shared devices on node 6

Now we can verify that everything is OK:

Command scdidadm
$ scdidadm -l
1        PA-TLH-SRV-PRD-1:/dev/rdsk/c0t0d0 /dev/did/rdsk/d1     
2        PA-TLH-SRV-PRD-1:/dev/rdsk/c1t0d0 /dev/did/rdsk/d2     
14       PA-TLH-SRV-PRD-1:/dev/rdsk/c4t600A0B800048A9B6000008304AC37AA3d0 /dev/did/rdsk/d14    
15       PA-TLH-SRV-PRD-1:/dev/rdsk/c4t600A0B80005634B40000594B4AC37A8Fd0 /dev/did/rdsk/d15    
16       PA-TLH-SRV-PRD-1:/dev/rdsk/c4t600A0B800048A9B60000082E4AC37A6Ad0 /dev/did/rdsk/d16    
17       PA-TLH-SRV-PRD-1:/dev/rdsk/c4t600A0B80005634B4000059494AC37A5Dd0 /dev/did/rdsk/d17    
18       PA-TLH-SRV-PRD-1:/dev/rdsk/c4t600A0B800048A9B60000082C4AC37A3Dd0 /dev/did/rdsk/d18    
19       PA-TLH-SRV-PRD-1:/dev/rdsk/c4t600A0B80005634B4000059474AC37A2Bd0 /dev/did/rdsk/d19    
20       PA-TLH-SRV-PRD-1:/dev/rdsk/c4t600A0B800048A9B60000082A4AC37A07d0 /dev/did/rdsk/d20    
21       PA-TLH-SRV-PRD-1:/dev/rdsk/c4t600A0B80005634B4000059454AC379F8d0 /dev/did/rdsk/d21    
22       PA-TLH-SRV-PRD-1:/dev/rdsk/c4t600A0B800048A9B6000008284AC379B4d0 /dev/did/rdsk/d22    
23       PA-TLH-SRV-PRD-1:/dev/rdsk/c4t600A0B80005634B4000059434AC3799Ad0 /dev/did/rdsk/d23    
24       PA-TLH-SRV-PRD-1:/dev/rdsk/c4t600A0B80005634B4000058BF4AC1DF47d0 /dev/did/rdsk/d24    
25       PA-TLH-SRV-PRD-1:/dev/rdsk/c4t600A0B80005634B4000058BD4AC1DF32d0 /dev/did/rdsk/d25    
26       PA-TLH-SRV-PRD-1:/dev/rdsk/c4t600A0B80005634B4000058BB4AC1DF20d0 /dev/did/rdsk/d26    
27       PA-TLH-SRV-PRD-1:/dev/rdsk/c4t600A0B80005634B4000058B94AC1DF0Dd0 /dev/did/rdsk/d27    
28       PA-TLH-SRV-PRD-1:/dev/rdsk/c4t600A0B80005634B4000007104A9D6B81d0 /dev/did/rdsk/d28

DID 3 is not present anymore. If you want to refresh everything:

Command scdidadm
$ scdidadm -r
Warning: DID instance "3" has been detected to support SCSI2 Reserve/Release protocol only. Adding path "PA-TLH-SRV-PRD-3:/dev/rdsk/c3t202300A0B85634B4d31" creates more than 2 paths to this device and can lead to unexpected node panics.
DID subpath "/dev/rdsk/c3t202300A0B85634B4d31s2" created for instance "3".
Warning: DID instance "3" has been detected to support SCSI2 Reserve/Release protocol only. Adding path "PA-TLH-SRV-PRD-3:/dev/rdsk/c2t201200A0B85634B4d31" creates more than 2 paths to this device and can lead to unexpected node panics.
DID subpath "/dev/rdsk/c2t201200A0B85634B4d31s2" created for instance "3".
Warning: DID instance "3" has been detected to support SCSI2 Reserve/Release protocol only. Adding path "PA-TLH-SRV-PRD-3:/dev/rdsk/c3t202200A0B85634B4d31" creates more than 2 paths to this device and can lead to unexpected node panics.
DID subpath "/dev/rdsk/c3t202200A0B85634B4d31s2" created for instance "3".
Warning: DID instance "3" has been detected to support SCSI2 Reserve/Release protocol only. Adding path "PA-TLH-SRV-PRD-3:/dev/rdsk/c2t201300A0B85634B4d31" creates more than 2 paths to this device and can lead to unexpected node panics.
DID subpath "/dev/rdsk/c2t201300A0B85634B4d31s2" created for instance "3".

7.6 Force uninstall

This is not recommended, but if you cannot uninstall normally and want to force it, here is the procedure:

Command
ok boot -x

  • Remove the Sun Cluster packages
Command pkgrm
pkgrm SUNWscu SUNWscr SUNWscdev SUNWscvm SUNWscsam SUNWscman SUNWscsal SUNWmdm

  • Remove the configurations
Command
rm -r /var/cluster /usr/cluster /etc/cluster
rm /etc/inet/ntp.conf
rm -r /dev/did
rm -r /devices/pseudo/did*
rm /etc/path_to_inst (make sure you have backup copy of this file)

ATTENTION: if you create a new path_to_inst at boot time with 'boot -ra', you should be booted from the physical boot device. It may not be possible to write a new path_to_inst on a boot mirror (SVM or VxVM).

  • Edit configuration files
    • edit /etc/vfstab to remove did and global entries
    • edit /etc/nsswitch.conf to remove cluster references
  • Reboot the node with the -a option (this is necessary to write a new path_to_inst file)
Command
reboot -- -rav

reply "y" to "do you want to rebuild path_to_inst?"

  • If you plan to reinstall, then:
Command
mkdir /globaldevices; rmdir /global

  • Uncomment /globaldevices entry from /etc/vfstab
  • newfs /dev/rdsk/c?t?d?s? (wherever /globaldevices was mounted)
  • mount /globaldevices
  • scinstall

7.7 How to Change Sun Cluster Node Names

Make a copy of /etc/cluster/ccr/infrastructure:

Command cp
cp /etc/cluster/ccr/infrastructure /etc/cluster/ccr/infrastructure.old 

Edit /etc/cluster/ccr/infrastructure and change node names as you want. For example, change srv01 to server01 and srv02 to server02.

If necessary, change the Solaris node name:

Command echo
echo server01 > /etc/nodename 

Regenerate the checksum for the infrastructure file:

Command ccradm
/usr/cluster/lib/sc/ccradm -i /etc/cluster/ccr/infrastructure -o 

Shut down Sun Cluster and boot both nodes:

Command cluster
cluster shutdown -g 0 -y
ok boot

7.8 Can't switch an RG from one node to another

I had a problem switching an RG on Solaris 10u7 with Sun Cluster 3.2u2 (installed patches: 126107-33, 137104-02, 142293-01, 141445-09): the zpool resource did not want to mount on the other node. When I looked at the logs, I saw this message in /var/adm/messages while the RG was being brought online:

Configuration File /var/adm/messages
Dec 16 15:34:30 LD-TLH-SRV-PRD-3 zfs: [ID 427000 kern.warning] WARNING: pool 'ulprod-ld_mysql' could not be loaded as it was last accessed by another system (host: LD-TLH-SRV-PRD-2 hostid: 0x27812152). See: http://www.sun.com/msg/ZFS-8000-EY

In fact, it is a bug, and it can be worked around by taking the RG offline:

Command clrg
clrg offline <RG_name>

Then manually import and export the zpool:

Command zpool
zpool import <zpool_name>
zpool export <zpool_name>

Now put the RG online :

Command clrg
clrg online -n <node_name> <rg_name>

If the problem still occurs, look in the log files for something like this:

Configuration File /var/adm/messages
Aug 17 22:23:28 minipardus SC[,SUNW.HAStoragePlus:8,clstorage,zfspool,hastorageplus_prenet_start]: [ID 148650 daemon.notice] Started searching for devices in '/dev/dsk' to find the importable pools.
Aug 17 22:23:35 minipardus SC[,SUNW.HAStoragePlus:8,clstorage,zfspool,hastorageplus_prenet_start]: [ID 547433 daemon.notice] Completed searching the devices in '/dev/dsk' to find the importable pools.
Aug 17 22:23:35 minipardus SC[,SUNW.HAStoragePlus:8,clstorage,zfspool,hastorageplus_prenet_start]: [ID 471757 daemon.error] cannot import pool 'qnap' : ''''/var/cluster/run/HAStoragePlus/zfs' is not a valid directory'''
Aug 17 22:23:35 minipardus SC[,SUNW.HAStoragePlus:8,clstorage,zfspool,hastorageplus_prenet_start]: [ID 117328 daemon.error] The pool 'qnap' failed to import and populate cachefile.
Aug 17 22:23:35 minipardus SC[,SUNW.HAStoragePlus:8,clstorage,zfspool,hastorageplus_prenet_start]: [ID 292307 daemon.error] Failed to import:qnap

If that is the case, it is apparently fixed in Sun Cluster 3.2u3.

To avoid installing this update, create the folder '/var/cluster/run/HAStoragePlus/zfs':

Command mkdir
mkdir -p /var/cluster/run/HAStoragePlus/zfs

Check whether the file "/etc/cluster/eventlog/eventlog.conf" contains the line "EC_zfs - - - /usr/cluster/lib/sc/events/zpool_cachefile_plugin.so".

If it does not, add it; the content should look like this:

Configuration File /etc/cluster/eventlog/eventlog.conf
 
# Class         Subclass        Vendor  Publisher       Plugin location                         Plugin parameters
 
EC_Cluster      -               -       -               /usr/cluster/lib/sc/events/default_plugin.so
EC_Cluster      -               -       gds             /usr/cluster/lib/sc/events/gds_plugin.so
EC_Cluster      -               -       -               /usr/cluster/lib/sc/events/commandlog_plugin.so
EC_zfs          -               -       -               /usr/cluster/lib/sc/events/zpool_cachefile_plugin.so

Now bring the RG online wherever you want; it should work.

7.9 Cluster is unavailable when a node crashes in a two-node cluster

Two types of problems can arise from cluster partitions: split brain and amnesia. Split brain occurs when the cluster interconnect between Solaris hosts is lost and the cluster becomes partitioned into subclusters, and each subcluster believes that it is the only partition. A subcluster that is not aware of the other subclusters could cause a conflict in shared resources, such as duplicate network addresses and data corruption.

Amnesia occurs if all the nodes leave the cluster in staggered groups. An example is a two-node cluster with nodes A and B. If node A goes down, the configuration data in the CCR is updated on node B only, and not node A. If node B goes down at a later time, and if node A is rebooted, node A will be running with old contents of the CCR. This state is called amnesia and might lead to running a cluster with stale configuration information.

You can avoid split brain and amnesia by giving each node one vote and mandating a majority of votes for an operational cluster. A partition with the majority of votes has a quorum and is enabled to operate. This majority vote mechanism works well if more than two nodes are in the cluster. In a two-node cluster, a majority is two. If such a cluster becomes partitioned, an external vote enables a partition to gain quorum. This external vote is provided by a quorum device. A quorum device can be any disk that is shared between the two nodes.

7.9.1 Recovering from amnesia

Scenario: Two node cluster (nodes A and B) with one Quorum Device, nodeA has gone bad, and amnesia protection is preventing nodeB from booting up.

Warning : this is a dangerous operation

  • Boot nodeB in non-cluster mode :
Command
reboot -- -x

  • Edit nodeB's file /etc/cluster/ccr/global/infrastructure as follows :
    • Change the value of "cluster.properties.installmode" from "disabled" to "enabled"
    • Change the number of votes for nodeA from "1" to "0", in the following property line "cluster.nodes.<NodeA's id>.properties.quorum_vote".
    • Delete all lines with "cluster.quorum_devices" to remove knowledge of the quorum device.
Configuration File /etc/cluster/ccr/global/infrastructure
...
cluster.properties.installmode  enabled
...
cluster.nodes.1.properties.quorum_vote  1
...

  • On the first node (the master one, the first to boot) launch :
Command ccradm
/usr/cluster/lib/sc/ccradm -i /etc/cluster/ccr/infrastructure -o

or (depending on version)

/usr/cluster/lib/sc/ccradm recover -o /etc/cluster/ccr/global/infrastructure

  • Reboot nodeB in cluster mode :
Command reboot
reboot

If you have more than 2 nodes, do the same command but without "-o" :

Command ccradm
/usr/cluster/lib/sc/ccradm recover /etc/cluster/ccr/global/infrastructure

8 References

  • Installation of Sun Cluster (old)
  • http://en.wikipedia.org/wiki/Solaris_Cluster
  • http://opensolaris.org/os/community/ha-clusters/translations/french/relnote_fr/
  • Ressources Properties
  • http://docs.sun.com/app/docs/doc/819-0177/cbbbgfij?l=ja&a=view
  • http://www.vigilanttechnologycorp.com/genasys/weblogRender.jsp?LogName=Sun%20Cluster
  • http://docs.sun.com/app/docs/doc/820-2558/gdrna?l=fr&a=view
  • http://wikis.sun.com/display/SunCluster/%28English%29+Sun+Cluster+3.2+1-09+Release+Notes#%28English%29SunCluster3.21-09ReleaseNotes-optgdfsinfo (gold mine)
  • Deploying highly available zones with Solaris Cluster 3.2