cluster cpu panic

I've performed the following:

1. created a single node cluster, node1

2. via scsetup, added a second node, node2, to the invite list

3. via scsetup, added 2 cluster interconnect interfaces

*** NOTE: there is no transport junction - the cables are directly connected,

yet when I select "add cable" there is no option given for this

4. ran scinstall on node2 and rebooted

5. node2 joined

6. init 6 on node2

The following cpu panic then occurs on node1:

Notifying cluster that this node is panicking

panic[cpu28]/thread=300028806a0: CMM: Cluster lost operational quorum; aborting.

000002a1042ef350 cl_runtime:__1cZsc_syslog_msg_log_no_args6Fpviipkc0_nZsc_syslog_msg_status_enum __+34 (600066b6800, 3, 43, 3, 2a1042ef580, 703e7e7f)

%l0-3: 0000000000000000 00000600060c81b8 0000000000000001 0000000000000000

%l4-7: 0000000000000001 00000600060c84a0 00000000000002e8 000000000000005d

000002a1042ef410 cl_runtime:__1cCosNsc_syslog_msgDlog6MiipkcE_nZsc_syslog_msg_status_enum__+1c (60006650b48, 3, 0, 703e7e7f, 703e6c00, 2a1042ef570)

%l0-3: 0000000000000000 00000600060c81b8 0000000000000001 0000000000000000

%l4-7: 0000000000000001 00000600060c84a0 00000000000002e8 000000000000005d

000002a1042ef4e0 cl_comm:__1cOautomaton_implMqcheck_state6M_n0APqcheck_return_t__+3f4 (600060c8008, 1f, 703e7dc5, 2a1042ef728, 7c, 2)

%l0-3: 0000000000000000 00000600060c81b8 0000000000000001 0000000000000000

%l4-7: 0000000000000001 00000600060c84a0 00000000000002e8 000000000000005d

000002a1042ef750 cl_comm:__1cIcmm_implStransitions_thread6M_v_+2d8 (600060c8008, 78e7eb5730, ec10037b652c, 6000165d408, ec10037e0768, 0)

%l0-3: 0000000000000006 0000000000000012 0000000000000009 00000000703e6c66

%l4-7: 000000000000bfc8 00000600060d3ea0 000000000000bfd4 0000ec113c50c138

000002a1042ef9e0 cl_comm:cllwpwrapper+f8 (7b2d3318, 703be400, 703be400, 18ab210, 0, 703be7dc)

%l0-3: 0000000000000000 0000000000000000 0000000000000007 fffffffffffffffd

%l4-7: 000000000000003a 0000000000000000 0000000000000000 0000000000000002

000002a1042efac0 unix:const_seg_900002801+2018 (2a1042efb70, 18, 0, 0, 0, 0)

%l0-3: 0000000000000000 0000000000000000 0000000000000000 0000000000000000

%l4-7: 0000000000000000 0000000000000000 0000000000000000 0000000000000000

There are more things to sort out here, including an FC-attached shared quorum device that has disappeared from node1.

WARNING - Unable to mount one or more of the following filesystem(s):

/global/.devices/node@1

If this is not repaired, global devices will be unavailable.

... and the following message is looping every few seconds on both nodes:

Sep 11 16:27:53 node1 cl_runtime: NOTICE: clcomm: Path node1:ipge3 - node2:ipge3 being cleaned up

Sep 11 16:27:53 node1 cl_runtime: NOTICE: clcomm: Path node1:ipge3 - node2:ipge3 being drained

All up, seems like quite a mess. What could I be doing wrong?

[3063 byte] By [robindixon] at [2007-11-26 10:03:31]
# 1

Caveat: I've not tried this so this is going to be somewhat of a guess as to what is going on.

I would guess that the single node cluster doesn't have a quorum device (QD) defined? If so, that would explain what was going on. For a single node cluster, once out of install mode, it will just need to boot to reach quorum, i.e. 1 vote from a total of 1. When node 2 joins the cluster, you now have 2 possible votes. When you reboot node 2, you lose that second vote, and, node 1 without an additional QD panics because it now only has 1 vote out of a possible 2, i.e. no majority.

So, to fix this, configure a quorum device.
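
To make the vote arithmetic concrete, here is a minimal Python sketch of the majority rule described above. The function name `has_quorum` and the scenario list are illustrative assumptions, not part of any Sun Cluster tool; the sketch only models "more than half of the configured votes must be present".

```python
def has_quorum(votes_present, votes_configured):
    """Return True when strictly more than half of the configured votes are present."""
    return 2 * votes_present > votes_configured


# (description, votes present, votes configured) -- illustrative scenarios only
scenarios = [
    ("single node cluster, node up", 1, 1),
    ("two nodes, no QD, node 2 rebooting", 1, 2),
    ("two nodes plus QD, node 2 rebooting", 2, 3),
]

for description, present, configured in scenarios:
    state = "quorum" if has_quorum(present, configured) else "NO quorum"
    print(f"{description}: {present}/{configured} votes -> {state}")
```

The second scenario is the one from the original post: 1 vote out of a possible 2 is not a majority, so the surviving node aborts. With a quorum device contributing a third vote, the surviving node would hold 2 of 3.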

Not sure about the ipge messages. If these are Tx000s, then try configuring real switches in between these boxes and/or ensuring that all the network patches are up to date.

Tim

TimRead at 2007-7-7 1:37:24 > top of Java-index,Solaris Operating System,Solaris Essentials - General Technical Questions...
# 2

Thanks Tim. Indeed, this is on T2000's. I've eliminated the ipge messages by changing to crossover cables. As for quorum device - you're right! I did not have a quorum device set. So I went ahead and set this on node2. Seemed better, although I still got a cpu panic. Sorry for my vagueness here. I'm reverting both nodes back to my last flar and trying this again, as there were a bunch of cluster services on node1 which disabled themselves and I could not bring them back online again. As a consequence, node1 was unable to operate properly enough to bring the public address online. Also node2 was unable to achieve this, as node1 was the primary and node2 the failover, for this resource? I'll keep plugging away at this and try to get it working. Have read all the documentation several times over. A lot to get my head around. Hopefully get this working eventually.

robindixon at 2007-7-7 1:37:24
# 3

Okay now I'm back to having a single node cluster with 1 service running: httpd

I want to add + join a second node without bringing httpd down.

As I now understand, I need to add a quorum device before proceeding with this, or a cpu panic may result.

I have already issued the following commands from node1 to prepare for node2 joining:

scconf -a -T node=node2 && \

scconf -a -A trtype=dlpi,name=ipge2,node=node1 && \

scconf -a -A trtype=dlpi,name=ipge3,node=node1

After I get quorum device configured, I will run scinstall on node2 and get it to join, then on node1 I will run the following to enable Cluster Interconnect. The interfaces ipge2 and ipge3 are connected to their corresponding ones on node1 and node2, with cat5e crossover cables.

scconf -c -m endpoint=node2:ipge2,state=enabled && \

scconf -c -m endpoint=node2:ipge3,state=enabled

Does this procedure seem correct so far?

I am currently at a loss as to how I go about adding the quorum device.

I already have the intended quorum device mounted on node1 (from vfstab):

/dev/global/dsk/d6s2 /dev/global/rdsk/d6s2 /global/u1 ufs 2 yes global,logging

It is a SAN volume (4 in the below list).

AVAILABLE DISK SELECTIONS:

0. c1t0d0 <SUN72G cyl 14087 alt 2 hd 24 sec 424>

/pci@7c0/pci@0/pci@1/pci@0,2/LSILogic,sas@2/sd@0,0

1. c1t1d0 <SUN72G cyl 14087 alt 2 hd 24 sec 424>

/pci@7c0/pci@0/pci@1/pci@0,2/LSILogic,sas@2/sd@1,0

2. c1t2d0 <SUN72G cyl 14087 alt 2 hd 24 sec 424>

/pci@7c0/pci@0/pci@1/pci@0,2/LSILogic,sas@2/sd@2,0

3. c1t3d0 <SUN72G cyl 14087 alt 2 hd 24 sec 424>

/pci@7c0/pci@0/pci@1/pci@0,2/LSILogic,sas@2/sd@3,0

4. c2t50060E80035B0001d0 <HITACHI-OPEN-9 cyl 10014 alt 2 hd 15 sec 96>

/pci@7c0/pci@0/pci@1/pci@0,2/SUNW,qlc@1/fp@0,0/ssd@w50060e80035b0001,0

The scsetup menu interface doesn't seem to have an appropriate option for adding this as quorum.

robindixon at 2007-7-7 1:37:24
# 4

Quorum devices are set up from scsetup via menu option 1!

To see which disks can be used as quorum devices use:

# scdidadm -L

Pick one of the disks that is connected to both machines, i.e. has two entries, e.g. d6
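
If the listing is long, a small script can pick out the dual-hosted candidates. This is only a sketch in Python: the sample text below is hypothetical (in particular the node2 line is made up to show what a dual-hosted disk would look like), and the helper name `shared_did_devices` is my own, not a cluster command.

```python
from collections import defaultdict

# Hypothetical `scdidadm -L` style listing: instance, node:path, DID path.
SAMPLE = """\
1        node1:/dev/rdsk/c0t0d0                 /dev/did/rdsk/d1
2        node1:/dev/rdsk/c1t0d0                 /dev/did/rdsk/d2
6        node1:/dev/rdsk/c2t50060E80035B0001d0  /dev/did/rdsk/d6
6        node2:/dev/rdsk/c2t50060E80035B0001d0  /dev/did/rdsk/d6
"""


def shared_did_devices(listing):
    """Map each DID name to its nodes, keeping only devices seen from 2+ nodes."""
    nodes_by_did = defaultdict(set)
    for line in listing.splitlines():
        fields = line.split()
        # Skip anything that is not "<instance> <node>:<path> <did path>".
        if len(fields) != 3 or ":" not in fields[1]:
            continue
        node = fields[1].split(":", 1)[0]
        did = fields[2].rsplit("/", 1)[-1]
        nodes_by_did[did].add(node)
    return {did: sorted(nodes) for did, nodes in nodes_by_did.items() if len(nodes) > 1}


print(shared_did_devices(SAMPLE))
```

Any device that shows up in the result has an entry from more than one node, which is the property a quorum disk needs.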

Note - I'd do all this via scsetup and scinstall rather than via the command line, just because I'm too lazy to remember the commands :-)

Tim

TimRead at 2007-7-7 1:37:24
# 5

Ahhhh menu option 1! In my blind spot (right in front of me) .. .just kidding ;-)

I actually did see it, and when I tried it, the following error resulted:

scconf -a -q globaldev=d6

scconf: Failed to add quorum device (d6) - global device is not found.

Thanks for reminding me of scdidadm though:

# scdidadm -L

1        beaker:/dev/rdsk/c0t0d0        /dev/did/rdsk/d1

2        beaker:/dev/rdsk/c1t0d0        /dev/did/rdsk/d2

3        beaker:/dev/rdsk/c1t1d0        /dev/did/rdsk/d3

4        beaker:/dev/rdsk/c1t2d0        /dev/did/rdsk/d4

5        beaker:/dev/rdsk/c1t3d0        /dev/did/rdsk/d5

6        beaker:/dev/rdsk/c2t50060E80035B0001d0 /dev/did/rdsk/d6

So does this mean I will need to make /dev/did/rdsk/d6 a global device before adding "d6" as a quorum device?

robindixon at 2007-7-7 1:37:24
# 6
Ah, so you don't have any dual hosted storage! Therefore you cannot configure a quorum disk. Once you've dual hosted the storage, i.e. physically connected it to both machines, then you can configure a quorum device.

Tim
TimRead at 2007-7-7 1:37:24
# 7
6 beaker:/dev/rdsk/c2t50060E80035B0001d0 /dev/did/rdsk/d6 -- is dual hosted FC / SAN (beaker is the name for node1)
robindixon at 2007-7-7 1:37:24
# 8

The cpu panic on shutting down 1 node was resulting when there was no quorum disk, and "split brain" was occurring. I'm currently having issues with adding the quorum disk. This is an interesting question. Quorum disk prerequisite is to be visible to at least 2 nodes, so am I being prevented from adding a quorum disk because it is a single node cluster?

There *must* be a way around this surely? Or have I done something wrong? I hope this isn't a "what comes first, the chicken or the egg".

beaker# scconf -a -q autoconfig

scconf:Did not find any suitable device to add as quorum device.

beaker# scdidadm -l

1        beaker:/dev/rdsk/c0t0d0        /dev/did/rdsk/d1

2        beaker:/dev/rdsk/c1t0d0        /dev/did/rdsk/d2

3        beaker:/dev/rdsk/c1t1d0        /dev/did/rdsk/d3

4        beaker:/dev/rdsk/c1t2d0        /dev/did/rdsk/d4

5        beaker:/dev/rdsk/c1t3d0        /dev/did/rdsk/d5

6        beaker:/dev/rdsk/c2t50060E80035B0001d0 /dev/did/rdsk/d6

beaker# scconf -a -q globaldev=d6

scconf: Failed to add quorum device (d6) - global device is not found.

robindixon at 2007-7-7 1:37:24
# 9

After joining node2, the following worked:

beaker# scconf -a -q autoconfig

scconf:Will attempt to add the following devices as quorum devices:

scconf:/dev/did/rdsk/d6s2

scconf:Attempt to add device /dev/did/rdsk/d6s2 as quorum device succeeded.

Will see where we get to this time. Hopefully no more cpu panic :-)

robindixon at 2007-7-7 1:37:24
# 10

Mmm, well, I'm definitely getting somewhere; however, on node2 the following services don't come online after rebooting and joining the cluster.

offline        12:01:25 svc:/system/cluster/clusterdata:default

offline        12:01:25 svc:/system/cluster/cl-svc-enable:default

offline        12:01:26 svc:/application/print/ipp-listener:default

offline        12:01:35 svc:/system/cluster/rpc-pmf:default

offline        12:01:35 svc:/application/management/cacao:default

offline        12:01:36 svc:/system/cluster/cl-ccra:default

offline        12:01:36 svc:/system/cluster/cl-event:default

offline        12:01:36 svc:/system/cluster/cl-eventlog:default

offline        12:01:36 svc:/system/cluster/spm:default

offline        12:01:36 svc:/system/cluster/scdpm:default

offline        12:01:36 svc:/system/cluster/rpc-fed:default

offline        12:01:36 svc:/system/cluster/pnm:default

offline        12:01:36 svc:/system/cluster/rgm:default

offline        12:01:36 svc:/system/cluster/cl-svc-cluster-milestone:default

offline        12:01:36 svc:/system/cluster/scsymon-srv:default

maintenance    12:02:40 svc:/system/cluster/mountgfsys:default

and in scstat, it freezes at "resource groups" section, and spits out the following error:

libsecurity: create of rpc handle to program rgmd_receptionist (100141) failed, will not retry

Sep 13 13:34:39 bunsen Cluster.CCR: libsecurity: create of rpc handle to program rgmd_receptionist (100141) failed, will not retry

Sep 13 13:34:39 bunsen Cluster.CCR: libsecurity: program rgmd_receptionist (100141) rpc_createerror: : RPC: Program not registered

scstat: unexpected error.

What have I done wrong?

robindixon at 2007-7-7 1:37:24
# 11

So I rebooted node2, and then the following happens on node2 after booting back up.

Notifying cluster that this node is panicking

panic[cpu13]/thread=2a100f01cc0: Reservation Conflict

000002a100f01730 ssd:ssd_mhd_watch_cb+208 (7600000002, 60001678000, 70384000, 7bbd0800, 0, 600016c3780)

%l0-3: 0000060001833850 0000060001833850 0000060001833830 0000000000000000

%l4-7: 0000000000000018 0000000000000280 0000000000000000 0000000000000001

000002a100f017e0 scsi:scsi_watch_request_intr+160 (60001833850, 18, 0, 600019b0fe0, 0, 60001833850)

%l0-3: 0000060001833850 0000060001735800 0000060001833830 0000000000000018

%l4-7: 0000000000000018 000000000000000c 00000000703597a8 000000007bf80400

000002a100f018b0 fcp:ssfcp_cmd_callback+64 (600018338a8, 0, 1, 60001833a48, 600018336b8, 600016fea40)

%l0-3: 0000000000000002 0000060001735800 00000600016a28d0 00000600016a2528

%l4-7: 00000600016a2518 00000000fbffffff 00000000703597a8 000000007bf80400

000002a100f01960 qlc:ql_task_thread+660 (600016a24c0, 60001833a48, 703597b8, 703597c8, fffffffffffff7ff, 4000000)

%l0-3: fffffffffffffeff 00000600016a24d8 00000600016a28d0 00000600016a2528

%l4-7: 00000600016a2518 00000000fbffffff 00000000703597a8 0000000070359798

000002a100f01a20 qlc:ql_task_daemon+70 (600016a24c0, 2000000, 600016a2518, 600016a24f8, 600016a2510, 600016a2512)

%l0-3: fffffffffffffffd 00000600016a2522 0000000000000000 0000000000000000

%l4-7: 0000000000000000 0000000000a20004 0000000000000002 0000000000000002

robindixon at 2007-7-7 1:37:24
# 12

Hi,

Looking at this error : RPC: Program not registered

I'd say that rpcbind is not running, is this correct ?

Judging by the errors, I'd say the cluster requires rpcbind to be running, and the error suggests that it is not, since the program hasn't been registered.

Nico

NicoB at 2007-7-7 1:37:24
# 13

Boy, it shouldn't be this hard - I'm sorry but I don't really know what is going on...

It looks like something somewhere prevented the new node from completing its initial joining phase. The details would probably be in /var/adm/messages or in the log for the mountgfsys service. I would guess that you forgot to make the global mount directory on the new node.

I'm not sure what caused the panic though. If the QD was working properly then this shouldn't happen. Maybe there are stray SCSI keys on there. There is a command in /usr/cluster/lib/sc called scsi; it can check for and remove stray keys. Don't use it lightly though!

I will add that I've only ever expanded two-node clusters, which is why I can't really speak with much experience here.

HTH,

Tim

TimRead at 2007-7-7 1:37:24
# 14

Yeah, tell me about it! I'm looking forward to closing this chapter so to speak :-)

Node2, "bunsen", had about 5 reboot cycles until it found a state where it would not panic. There were 2 different types of panics. This is the other one.

Note: I'm still having issues with the cluster interconnects, apparently because of the crossover cables. I will try going through a switch in my next attempt.

NOTICE: clcomm: Adapter ipge3 constructed

NOTICE: clcomm: Path bunsen:ipge3 - beaker:ipge3 being constructed

NOTICE: clcomm: Adapter ipge2 constructed

NOTICE: clcomm: Path bunsen:ipge2 - beaker:ipge2 being constructed

NOTICE: CMM: Node bunsen: attempting to join cluster.

NOTICE: clcomm: Path bunsen:ipge2 - beaker:ipge2 being initiated

WARNING: Received non interrupt heartbeat on bunsen:ipge2 - beaker:ipge2 - path timeouts are likely.

NOTICE: clcomm: Path bunsen:ipge2 - beaker:ipge2 being cleaned up

NOTICE: clcomm: Path bunsen:ipge2 - beaker:ipge2 being drained

NOTICE: clcomm: Path bunsen:ipge2 - beaker:ipge2 being constructed

NOTICE: clcomm: Path bunsen:ipge3 - beaker:ipge3 being initiated

WARNING: Received non interrupt heartbeat on bunsen:ipge3 - beaker:ipge3 - path timeouts are likely.

NOTICE: clcomm: Path bunsen:ipge2 - beaker:ipge2 being initiated

NOTICE: CMM: Quorum device 1 (gdevname /dev/did/rdsk/d6s2) can not be acquired by the current cluster members. This quorum device is held by node 1.

NOTICE: CMM: Cluster doesn't have operational quorum yet; waiting for quorum.

NOTICE: clcomm: Path bunsen:ipge2 - beaker:ipge2 being cleaned up

NOTICE: clcomm: Path bunsen:ipge2 - beaker:ipge2 being drained

NOTICE: clcomm: Path bunsen:ipge2 - beaker:ipge2 being constructed

NOTICE: clcomm: Path bunsen:ipge3 - beaker:ipge3 being cleaned up

NOTICE: clcomm: Path bunsen:ipge3 - beaker:ipge3 being drained

NOTICE: clcomm: Path bunsen:ipge3 - beaker:ipge3 being constructed

NOTICE: clcomm: Path bunsen:ipge3 - beaker:ipge3 being initiated

NOTICE: clcomm: Path bunsen:ipge2 - beaker:ipge2 errors during initiation

WARNING: Path bunsen:ipge2 - beaker:ipge2 initiation encountered errors, errno = 62. Remote node may be down or unreachable through this path.

NOTICE: clcomm: Path bunsen:ipge3 - beaker:ipge3 being cleaned up

NOTICE: clcomm: Path bunsen:ipge3 - beaker:ipge3 being drained

NOTICE: clcomm: Path bunsen:ipge3 - beaker:ipge3 being constructed

NOTICE: clcomm: Path bunsen:ipge2 - beaker:ipge2 being initiated

NOTICE: clcomm: Path bunsen:ipge3 - beaker:ipge3 being initiated

NOTICE: CMM: Node beaker (nodeid: 1, incarnation #: 1158038113) has become reachable.

NOTICE: clcomm: Path bunsen:ipge2 - beaker:ipge2 online

NOTICE: CMM: Cluster has reached quorum.

NOTICE: CMM: Node beaker (nodeid = 1) is up; new incarnation number = 1158038113.

NOTICE: CMM: Node bunsen (nodeid = 2) is up; new incarnation number = 1158133340.

NOTICE: CMM: Cluster members: beaker bunsen.

NOTICENotifying cluster that this node is panicking

panic[cpu19]/thread=30001ac1600: CMM: Cluster lost operational quorum; aborting.

000002a1040b3350 cl_runtime:__1cZsc_syslog_msg_log_no_args6Fpviipkc0_nZsc_syslog_msg_status_enum __+34 (600048f1800, 3, 43, 3, 2a1040b3580, 703e7e7f)

%l0-3: 0000000000000000 00000600062301b8 0000000000000001 0000000000000000

%l4-7: 0000000000000002 0000060006230788 00000000000005d0 00000000000000ba

000002a1040b3410 cl_runtime:__1cCosNsc_syslog_msgDlog6MiipkcE_nZsc_syslog_msg_status_enum__+1c (60001a26a70, 3, 0, 703e7e7f, 703e6c00, 2a1040b3570)

%l0-3: 0000000000000000 00000600062301b8 0000000000000001 0000000000000000

%l4-7: 0000000000000002 0000060006230788 00000000000005d0 00000000000000ba

000002a1040b34e0 cl_comm:__1cOautomaton_implMqcheck_state6M_n0APqcheck_return_t__+3f4 (60006230008, 3e, 703e7dc5, 2a1040b3728, f8, 2)

%l0-3: 0000000000000000 00000600062301b8 0000000000000001 0000000000000000

%l4-7: 0000000000000002 0000060006230788 00000000000005d0 00000000000000ba

000002a1040b3750 cl_comm:__1cIcmm_implStransitions_thread6M_v_+2d8 (60006230008, 703e6d6e, 5ee9da5f94, 600067815c8, 5ee9dca500, 0)

%l0-3: 0000000000000006 0000000000000012 0000000000000009 00000000703e6c66

%l4-7: 000000000000bfc8 000006000623bea0 000000000000bfd4 0000005eeade60b4

000002a1040b39e0 cl_comm:cllwpwrapper+f8 (7b2d3318, 703be400, 703be400, 18ab210, 0, 703be7dc)

%l0-3: 0000000000000000 0000000000000000 0000000000000007 fffffffffffffffd

%l4-7: 000000000000003a 0000000000000000 0000000000000000 0000000000000002

000002a1040b3ac0 unix:const_seg_900002801+2018 (2a1040b3b70, 18, 0, 0, 0, 0)

%l0-3: 0000000000000000 0000000000000000 0000000000000000 0000000000000000

%l4-7: 0000000000000000 0000000000000000 0000000000000000 0000000000000000

robindixon at 2007-7-7 1:37:24
# 15
Well here's a clue: http://blogs.sun.com/kristien/entry/scsi_reservations_in_sun_cluster

Still working on it..... grrrrr.
robindixona at 2007-7-21 15:23:50
# 16
I would very much like to know why there are a bunch of cluster services that don't come online on node2 after it joins a cluster.
robindixona at 2007-7-21 15:23:50
# 17

So did you manage to get the cluster booting properly and still retain the ability to take nodes down? If so, I guess you must have fixed the quorum problem.

Are the services you have defined configured to come online on node 2? If you created the original service on node 1 and defined the resource group with a node list containing only node 1, then it will only come online on that node. What you need to do is extend the node list for that service.

Something like:

scrgadm -c -g <name> -h <new node list> -y maximum_primaries=<new maximum_primaries> \

-y desired_primaries=<new desired_primaries>

Regards

Tim

Tim.Reada at 2007-7-21 15:23:50
# 18

The quorum problem is not yet resolved.

There is also a problem with transport paths that I think needs to be solved first:

WARNING: Received non interrupt heartbeat on beaker:ipge2 - bunsen:ipge2 - path timeouts are likely.

WARNING: Received non interrupt heartbeat on beaker:ipge0 - bunsen:ipge0 - path timeouts are likely.

NOTICE: clcomm: Path beaker:ipge2 - bunsen:ipge2 being cleaned up

NOTICE: clcomm: Path beaker:ipge0 - bunsen:ipge0 being cleaned up

NOTICE: clcomm: Path beaker:ipge0 - bunsen:ipge0 being drained

NOTICE: clcomm: Path beaker:ipge2 - bunsen:ipge2 being drained

NOTICE: clcomm: Path beaker:ipge2 - bunsen:ipge2 being constructed

NOTICE: clcomm: Path beaker:ipge0 - bunsen:ipge0 being constructed

NOTICE: clcomm: Path beaker:ipge0 - bunsen:ipge0 being initiated

NOTICE: clcomm: Path beaker:ipge2 - bunsen:ipge2 being initiated

NOTICE: clcomm: Path beaker:ipge2 - bunsen:ipge2 being cleaned up

NOTICE: clcomm: Path beaker:ipge2 - bunsen:ipge2 being drained

NOTICE: clcomm: Path beaker:ipge2 - bunsen:ipge2 being constructed

NOTICE: clcomm: Path beaker:ipge2 - bunsen:ipge2 being initiated

NOTICE: clcomm: Path beaker:ipge2 - bunsen:ipge2 being cleaned up

NOTICE: clcomm: Path beaker:ipge2 - bunsen:ipge2 being drained

I have now tried all permutations of interfaces for the Cluster Interconnect and the problem remains.

robindixona at 2007-7-21 15:23:50
# 19

Recurring problem with node2 after joining cluster:

* Does not establish reliable transport path for cluster interconnect.

* Does not register its IPMP Group for public interface.

* Does not bring all the cluster SVC's online.

* Intermittent cpu panic reboots.

This is the case regardless of whether I initially define both nodes during setup of primary node, or add node2 after setting up a single node cluster.

I need to know if:

a) I am doing something wrong.

b) There's a software bug.

c) There's a hardware fault.

Perhaps I should log a call with Sun and try to get this resolved. This is taking a lot longer than it should to set up.

robindixona at 2007-7-21 15:23:50
# 20

Can you confirm that you are connecting the heartbeat networks via switches? If not, this is a current requirement (this is a T2000 right?). Also, the problem may be helped by putting:

set ipge_taskq_disable=1

in /etc/system. That should fix the panic issues. Let's address the svcs issues separately.

Tim

Tim.Reada at 2007-7-21 15:23:50
# 21

I have been using crossover cables, and they are both T2000's. I only have 1 switch available currently, so I will need to check whether this model of switch can isolate the 2 interconnects from each other. Whereabouts in the documentation is this requirement specified? I don't remember reading this.

robindixona at 2007-7-21 15:23:50
# 22

Rebooting with command: boot

Boot device: /pci@7c0/pci@0/pci@1/pci@0,2/LSILogic,sas@2/disk@0,0:a File and args:

sorry, variable 'ipge_taskq_disable' is not defined in the 'kernel'

SunOS Release 5.10 Version Generic_118833-22 64-bit

I realise you must have meant:

set ipge:ipge_taskq_disable=1

:-)

Message was edited by:

robindixon

robindixona at 2007-7-21 15:23:50
# 23

I now have 2 separate VLANs configured in the switch, one for each of the cluster interconnects. This appears to have fixed that issue.

Now when I type scstat on both nodes, node2 gets the following:

-- IPMP Groups --

Node Name    Group    Status    Adapter    Status

---

libsecurity: create of rpc handle to program rgmd_receptionist (100141) failed, will not retry

scstat: unexpected error.

Whereas node1 displays its IPMP group as online, and node2's IPMP group is not in the list.

The following services are offline on node1:

offline        11:19:04 svc:/system/cluster/rpc-pmf:default

offline        11:19:05 svc:/system/cluster/rpc-fed:default

The following commands produce no effect:

# svcadm enable svc:/system/cluster/rpc-pmf:default

# svcadm enable svc:/system/cluster/rpc-fed:default

Message was edited by:

robindixon

robindixona at 2007-7-21 15:23:50
# 24

Alternately rebooting both nodes worked for bringing both IPMP groups online in the cluster, and the remaining "offline" cluster related SVC's also came online.

I then proceeded to set up a shared address resource for the cluster, set to "arbitrary" node selection.

scrgadm -a -g soraya-public

scrgadm -a -S -g soraya-public -l soraya

scrgadm -c -j soraya -y R_description="SharedAddress resource for soraya"

scswitch -Z -g soraya-public

scstat initially showed:

Resource: soraya bunsen Online Online - SharedAddress online.

Resource: soraya beaker Offline Offline

I unplugged the public interfaces on bunsen, then scstat showed this on beaker:

Resource: soraya bunsen Offline Offline - SharedAddress offline.

Resource: soraya beaker Online Online - SharedAddress online.

I then plugged bunsen back in, and unplugged beaker:

Resource: soraya bunsen Offline Offline - SharedAddress offline.

Resource: soraya beaker Online Degraded - IPMP Failure.

So it failed over one way but not back the other way.

At this point, from an external host, I was able to ping beaker's public interface but the shared address resource was in the "degraded" state, so I was unable to ping soraya.

Plugging bunsen's public interfaces back in again, scstat once again shows:

Resource: soraya bunsen Offline Offline - SharedAddress offline.

Resource: soraya beaker Online Online - SharedAddress online.

robindixona at 2007-7-21 15:23:50
# 25
Evidently I did not give it enough time to failover because I just tried again and it worked both ways.
robindixona at 2007-7-21 15:23:50
# 26

Went through a second round of alternate rebooting, and the following occurred on node2

obtaining access to all attached disks

mount: /dev/md/dsk/d30 is already mounted or /global/.devices/node@2 is busy

Trying to remount /global/.devices/node@2

mount: /dev/md/dsk/d30 is already mounted or /global/.devices/node@2 is busy

WARNING - Unable to mount one or more of the following filesystem(s):

/global/.devices/node@2

If this is not repaired, global devices will be unavailable.

Typing df on node2 confirmed that it had mounted /global/.devices/node@1

In this state, the cluster rpc SVC's did not go online on node2.

I rebooted node2 again after this, and it mounted correctly:

/global/.devices/node@2 and now the cluster rpc SVC's are online on node2.

Why does it sometimes mount /global/.devices/node@1 instead of /global/.devices/node@2, and sometimes does not? The vfstab still has /global/.devices/node@2 entry as was automatically generated by scinstall.

robindixona at 2007-7-21 15:23:50
# 27
> I realise you must have meant:

> set ipge:ipge_taskq_disable=1

Yes, unfortunately I just copied it from a bug log which didn't have the ipge: prefix. :-(

Tim
Tim.Reada at 2007-7-21 15:23:50
# 28

> I have been using crossover cables, and they are both T2000's. I only have 1 switch available

> currently, so will need to check if this model of switch can isolate the 2 interconnects from each other.

> Where abouts in the documentation is this requirement specified? I don't remember reading this.

This is documented in Sun's internal configuration guide. Also see the release notes:

http://docs.sun.com/app/docs/doc/816-3381/6m9lratq9?a=view

under "Cluster Problems When Using the ipge3 Port"

Tim

Tim.Reada at 2007-7-21 15:23:50
# 29

> Evidently I did not give it enough time to failover because I just tried again and it worked both ways.

Do you have a default router that will respond to ping? If not, that may be why IPMP isn't working. If there is no default router, then IPMP on each node tries to find other hosts to ping to check its own health. The nodes may end up with nothing to use except each other. That will cause a cascading problem: one fails, and the other will too a few seconds later.
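
As a rough illustration of that cascade, here is a toy Python model. It is not how in.mpathd is implemented; the function name `failed_groups` and the target maps are assumptions made up for this sketch. A node's group is treated as failed once none of its probe targets can answer.

```python
def failed_groups(probe_targets, unplugged):
    """Return the hosts whose IPMP group ends up failed.

    probe_targets maps each node to the hosts it pings (at least one each
    in this sketch). unplugged is the set of hosts that stop answering.
    """
    down = set(unplugged)
    changed = True
    while changed:
        changed = False
        for node, targets in probe_targets.items():
            # A node fails when every one of its probe targets is down.
            if node not in down and all(t in down for t in targets):
                down.add(node)
                changed = True
    return down


# No default router: the two nodes can only probe each other.
peers_only = {"beaker": ["bunsen"], "bunsen": ["beaker"]}
# A pingable router (hypothetical host name) is available as a target.
with_router = {"beaker": ["router"], "bunsen": ["router"]}

print(sorted(failed_groups(peers_only, {"bunsen"})))
print(sorted(failed_groups(with_router, {"bunsen"})))
```

In the peers-only case, unplugging one node takes the other node's group down as well; with an independent target, only the unplugged node is affected.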

Regards,

Tim

Tim.Reada at 2007-7-21 15:23:50
# 30

> Went through a second round of alternate rebooting, and the following occurred on node2

> obtaining access to all attached disks

> mount: /dev/md/dsk/d30 is already mounted or /global/.devices/node@2 is busy

> Trying to remount /global/.devices/node@2

> mount: /dev/md/dsk/d30 is already mounted or /global/.devices/node@2 is busy

Odd - try the following:

Run scgdevs one node at a time. Remount /global/.devices/node@x. Do this one node at a time on all nodes. After doing this run devfsadm and reboot cluster to ensure all changes are working properly.

Tim.Reada at 2007-7-21 15:23:55
# 31

> This is documented in Sun's internal configuration

> guide. Also see the release notes:

> http://docs.sun.com/app/docs/doc/816-3381/6m9lratq9?a=view

>

> under "Cluster Problems When Using the ipge3 Port"

I was actually referring to the requirement of switches. The cluster interconnect configuration step in scinstall implies that directly attached interconnects are an option, and this is why I was trying to make it work that way.

robindixona at 2007-7-21 15:23:55
# 32

Anyway, here's my current stumbling block.

bunsen# scrgadm -c -g soraya-public -h beaker,bunsen

bunsen - No IPMP group for node bunsen.

bunsen - No IPMP group for this node.

VALIDATE on resource soraya, resource group soraya-public, exited with non-zero exit status.

Validation of resource soraya in resource group soraya-public on node bunsen failed.

bunsen# scstat |fgrep IPMP

-- IPMP Groups --

IPMP Group: beaker beaker_public Online ipge1 Online

IPMP Group: beaker beaker_public Online ipge0 Online

IPMP Group: bunsen bunsen_public Online ipge1 Online

IPMP Group: bunsen bunsen_public Online ipge0 Online

bunsen# scrgadm -c -g soraya-public -h beaker,bunsen

bunsen - No IPMP group for node bunsen.

bunsen - No IPMP group for this node.

VALIDATE on resource soraya, resource group soraya-public, exited with non-zero exit status.

Validation of resource soraya in resource group soraya-public on node bunsen failed.
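
One way to cross-check the validation error against the status output is to group the `IPMP Group:` lines by node and confirm that every node in the `-h` list has at least one group. The following is only a minimal Python sketch, not part of any Sun Cluster tooling. It assumes each line carries five whitespace-separated columns (node, group, group status, adapter, adapter status), and the function name `ipmp_groups_by_node` is hypothetical.

```python
def ipmp_groups_by_node(scstat_text):
    """Group 'IPMP Group:' lines by node name.

    Assumes five whitespace-separated columns after the prefix:
    node, group, group status, adapter, adapter status.
    """
    prefix = "IPMP Group:"
    groups = {}
    for line in scstat_text.splitlines():
        line = line.strip()
        if not line.startswith(prefix):
            continue  # banner lines and blanks
        fields = line[len(prefix):].split()
        if len(fields) != 5:
            continue  # unexpected layout, skip rather than guess
        node, group, _group_status, adapter, adapter_status = fields
        groups.setdefault(node, {}).setdefault(group, []).append(
            (adapter, adapter_status)
        )
    return groups


def nodes_without_group(scstat_text, nodes):
    """Return the nodes from `nodes` that have no IPMP group listed."""
    known = ipmp_groups_by_node(scstat_text)
    return [n for n in nodes if n not in known]


if __name__ == "__main__":
    sample = (
        "-- IPMP Groups --\n"
        "IPMP Group: beaker beaker_public Online ipge1 Online\n"
        "IPMP Group: bunsen bunsen_public Online ipge1 Online\n"
    )
    print(nodes_without_group(sample, ["beaker", "bunsen"]))
```

If this reports no missing nodes, as the output above suggests it would, the "No IPMP group" message is presumably caused by something other than a missing group.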

robindixona at 2007-7-21 15:23:55
# 33

> Odd - try the following:

> Run scgdevs one node at a time. Remount

> /global/.devices/node@x. Do this one node at a time

> on all nodes. After doing this run devfsadm and

> reboot cluster to ensure all changes are working

> properly.

I tried that, but the symptom still remains.

The sequence below best describes the symptom.

beaker# mount -g /dev/md/dsk/d30 /global/.devices/node\@1

bunsen# df -h |tail -1;umount /dev/md/dsk/d30

/dev/md/dsk/d30 486M 3.6M 434M 1% /global/.devices/node@1

beaker# mount -g /dev/md/dsk/d30 /global/.devices/node\@2

bunsen# df -h |tail -1;umount /dev/md/dsk/d30

/dev/md/dsk/d30 486M 3.6M 434M 1% /global/.devices/node@2

Are the global devices mount points supposed to behave this way across a cluster? scinstall puts the node ID corresponding to each node in that node's vfstab, so I find this behavior confusing.

robindixona at 2007-7-21 15:23:55
# 34

> Anyway, here's my current stumbling block.

>

> bunsen# scrgadm -c -g soraya-public -h beaker,bunsen

> bunsen - No IPMP group for node bunsen.

> bunsen - No IPMP group for this node.

>

> VALIDATE on resource soraya, resource group

> soraya-public, exited with non-zero exit status.

> Validation of resource soraya in resource group

> soraya-public on node bunsen failed.

> bunsen# scstat |fgrep IPMP

> -- IPMP Groups --

> IPMP Group: beaker beaker_public Online ipge1 Online

> IPMP Group: beaker beaker_public Online ipge0 Online

> IPMP Group: bunsen bunsen_public Online ipge1 Online

> IPMP Group: bunsen bunsen_public Online ipge0 Online

> bunsen# scrgadm -c -g soraya-public -h beaker,bunsen

> bunsen - No IPMP group for node bunsen.

> bunsen - No IPMP group for this node.

>

> VALIDATE on resource soraya, resource group

> soraya-public, exited with non-zero exit status.

> Validation of resource soraya in resource group

> soraya-public on node bunsen failed.

However, when I removed soraya-public and created it again, it registered on both nodes, and Apache now fails over successfully when I reboot the primary.

Naturally it would still be preferable to achieve this without the downtime of recreating the resource group.

We're almost there!

Just need to resolve these intermittencies with the global device directory not always mounting after bootup.

There is also a more occasional but serious intermittency where the ssd driver is sometimes not loading in the boot sequence.

robindixona at 2007-7-21 15:23:55
# 35

> > This is documented in Sun's internal configuration

> > guide. Also see the release notes:

> >

> > http://docs.sun.com/app/docs/doc/816-3381/6m9lratq9?a=view

> >

> > under "Cluster Problems When Using the ipge3 Port"

>

> I was actually referring to the requirement of

> switches. The cluster interconnect configuration

> step in scinstall implies that directly attached

> interconnects are an option, and this is why I was

> trying to make it work that way.

See bullet 2

Use an Ethernet switch with your cluster interconnect cables for all ipge onboard interfaces. Direct-connect onboard interfaces are not supported by Sun Cluster software at this time.

Tim.Reada at 2007-7-21 15:23:55
# 36

> Anyway, here's my current stumbling block.

>

> bunsen# scrgadm -c -g soraya-public -h beaker,bunsen

> bunsen - No IPMP group for node bunsen.

> bunsen - No IPMP group for this node.

>

> VALIDATE on resource soraya, resource group

> soraya-public, exited with non-zero exit status.

> Validation of resource soraya in resource group

> soraya-public on node bunsen failed.

> bunsen# scstat |fgrep IPMP

> -- IPMP Groups --

> IPMP Group: beaker beaker_public Online ipge1 Online

> IPMP Group: beaker beaker_public Online ipge0 Online

> IPMP Group: bunsen bunsen_public Online ipge1 Online

> IPMP Group: bunsen bunsen_public Online ipge0 Online

> bunsen# scrgadm -c -g soraya-public -h beaker,bunsen

> bunsen - No IPMP group for node bunsen.

> bunsen - No IPMP group for this node.

>

> VALIDATE on resource soraya, resource group

> soraya-public, exited with non-zero exit status.

> Validation of resource soraya in resource group

> soraya-public on node bunsen failed.

This is probably because the current IPMP groups are on a different subnet to the one soraya-public needs to be on. You will need to have IPMP groups on that subnet.
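
The subnet theory above can be checked with a few lines of Python before touching the cluster configuration. This is only a sketch using the standard `ipaddress` module. The addresses and prefix lengths are made-up placeholders (the thread does not give the real ones), and `same_subnet` is a hypothetical helper name.

```python
import ipaddress


def same_subnet(logical_ip, adapter_cidrs):
    """Report, per adapter, whether logical_ip falls in that adapter's subnet.

    adapter_cidrs maps an adapter name to its address in CIDR form,
    for example {"ipge0": "192.168.10.11/24"}.
    """
    addr = ipaddress.ip_address(logical_ip)
    return {
        name: addr in ipaddress.ip_interface(cidr).network
        for name, cidr in adapter_cidrs.items()
    }


if __name__ == "__main__":
    # Placeholder values only; substitute the real adapter addresses
    # and the address that the logical hostname resolves to.
    adapters = {"ipge0": "192.168.10.11/24", "ipge1": "192.168.10.12/24"}
    print(same_subnet("192.168.10.50", adapters))
```

If every adapter in the IPMP group reports `False`, the logical hostname is on a different subnet and the validation failure would be expected.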

Tim.Reada at 2007-7-21 15:23:55
# 37

> Naturally it would still be preferable to achieve

> this without the downtime of recreating the resource

> group.

This should be achievable. In fact, all of this should have been achievable without any of these issues!

BTW: I would really recommend the Sun Cluster training courses if you are going to run any production clusters. This goes for anyone reading this thread. I attended them when the product first came out and they helped me get over the initial learning curve.

> We're almost there!

> Just need to resolve these intermittencies with the

> global device directory not always mounting after

> bootup.

I'm at a loss to explain this. If the system has been properly patched with all the Solaris and Sun Cluster patches, this shouldn't happen. Given the ssd problem I would hazard a guess that the system isn't patched properly. Check out Sun Update Connection.

> There is also a more occasional but serious

> intermittency where the ssd driver is sometimes not

> loading in the boot sequence.

Tim.Reada at 2007-7-21 15:23:55
# 38

Re: docs; my bad.. I missed bullet 2.

Re: subnets; the IPMP groups are both on the same subnet as the public address.

Re: patching; last patch update was performed on 15th Sep 2006.

Re: training; I would certainly like to receive training. Admittedly, the learning curve has been challenging with just the documentation to work with, and I have made mistakes along the way. Although, I'm not convinced that all of these issues I have experienced can be entirely attributed to lack of training. Probably more has to do with not having read the information on http://docs.sun.com/app/docs/doc/816-3381/6m9lratq9?a=view - I wish I had found this earlier.

I've appreciated all your help.

robindixona at 2007-7-21 15:23:55
# 39

Good luck with your testing. I'd reiterate that it shouldn't be this difficult. Occasionally, when I've messed up the initial configuration, I've found that it is easier to start again than to try to fix the problem. Of course, that's all very well if it isn't a production cluster, but then I only ever really work on test clusters :-)

If you have any more questions - just post.

Tim

Tim.Reada at 2007-7-21 15:23:55
# 40

Hi,

I have a 2xT2000 cluster (S10, SC3.1 8/05) which is suffering from the same panic. When I reboot one node, the other one panics (see below). The quorum disk resides on an EMC Symm box. The "PER" volume flag is set on the Symm LUNs. The whole solution is installed according to EIS installation standards.

We have another SC3.1 cluster on S9 which works correctly with the same Symm quorum device.

It seems that SC has a hard time managing the quorum device. Any ideas where the problem might reside?

panic[cpu30]/thread=2a100bcbcc0: Reservation Conflict

000002a100bcb730 ssd:ssd_mhd_watch_cb+208 (7600000002, 600028eba40, 703b0000, 7b65c800, 0, 60001e56440)

%l0-3: 0000060010c6aad8 0000060010c6aad8 0000060010c6aab8 0000000000000000

%l4-7: 0000000000000018 0000000000000081 0000000000000000 0000000000000001

000002a100bcb7e0 scsi:scsi_watch_request_intr+160 (60010c6aad8, 18, 0, 6000fe7eda0, 0, 60010c6aad8)

%l0-3: 0000060010c6aad8 0000060001f0c800 0000060010c6aab8 0000000000000018

%l4-7: 0000000000000018 000000000000000c 00000000703817a8 000000007bfd2400

000002a100bcb8b0 fcp:ssfcp_cmd_callback+64 (60010c6ab30, 0, 1, 60010c6acd0, 60010c6a940, 60001f4c7c0)

%l0-3: 0000000000000002 0000060001f0c800 0000060001ef0d50 0000060001ef09a8

%l4-7: 0000060001ef0998 00000000fbffffff 00000000703817a8 000000007bfd2400

000002a100bcb960 qlc:ql_task_thread+660 (60001ef0940, 60010c6acd0, 703817b8, 703817c8, fffffffffffff7ff, 4000000)

%l0-3: fffffffffffffeff 0000060001ef0958 0000060001ef0d50 0000060001ef09a8

%l4-7: 0000060001ef0998 00000000fbffffff 00000000703817a8 0000000070381798

000002a100bcba20 qlc:ql_task_daemon+70 (60001ef0940, 2000000, 60001ef0998, 60001ef0978, 60001ef0990, 60001ef0992)

%l0-3: fffffffffffffffd 0000060001ef09a2 6144088c9210001d d0262008da5e6310

%l4-7: 9fc340009007a79f 0000000000a20004 0000000000000002 0000000000000002

syncing file systems... 59 4 done

konsulteksa at 2007-7-21 15:23:55
# 41

Can you confirm that you have a separate quorum device for each cluster? I realise this sounds obvious, but I can't think offhand why you would get a reservation conflict otherwise.

Sun Cluster is very robust on quorum. Unfortunately, the SCSI ioctls required to support quorum are often broken by vendor firmware upgrades :-( This is why you have to be very careful in both the setup of the arrays and using the right firmware. This applies to all vendors.

Tim

Tim.Reada at 2007-7-21 15:23:55
# 42

Yep, both clusters are using their own LUNs for quorum devices.

I'm using Sun's qlc FC HBAs. These had old firmware, which I updated to the latest. No change: the second host still panics when the first is rebooted.

I checked the EMC microcode as well: the minimum required is 5670.72 and the installed version is 5670.91 (according to our EMC support).

There is software called Optimizer installed on the EMC. Is it possible that this screws things up?

konsulteksa at 2007-7-21 15:23:55
# 43

Are the EMC array(s) and associated SAN switches configured in such a way that each cluster can only see the LUNs assigned to it? Have you tried:

Adding a new QD

Removing the old one

Re-adding the old one

Removing the new one

This will clear any old reservations that are lying about on the LUN.
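
The order of the four steps matters because a two-node cluster should not be left without a quorum device at any point. The Python sketch below replays the steps and records which devices are configured after each one. It is illustrative only: the DID names `d4` and `d5` are hypothetical, and the rendered `scconf -a -q globaldev=...` / `scconf -r -q globaldev=...` strings reflect my understanding of the Sun Cluster 3.1 syntax, which should be checked against the scconf(1M) man page before use.

```python
def quorum_swap_steps(old_qd, temp_qd):
    """Steps from the post: add new, remove old, re-add old, remove new."""
    return [
        ("add", temp_qd),
        ("remove", old_qd),
        ("add", old_qd),
        ("remove", temp_qd),
    ]


def devices_after_each_step(steps, initial):
    """Replay the steps and return the configured devices after each one."""
    configured = set(initial)
    history = []
    for action, device in steps:
        if action == "add":
            configured.add(device)
        else:
            configured.discard(device)
        history.append(sorted(configured))
    return history


def as_scconf(steps):
    """Render the steps as scconf command lines (syntax is an assumption)."""
    flag = {"add": "-a", "remove": "-r"}
    return [f"scconf {flag[a]} -q globaldev={d}" for a, d in steps]


if __name__ == "__main__":
    steps = quorum_swap_steps("d4", "d5")  # d4/d5 are placeholder DID names
    for command, devices in zip(as_scconf(steps),
                                devices_after_each_step(steps, ["d4"])):
        print(command, "->", devices)
```

Every entry in the history is non-empty, which is the property the add-before-remove ordering is meant to preserve.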

Not sure about the Optimizer product. That would be up to EMC to certify in the OSP entry.

Tim

Tim.Reada at 2007-7-21 15:23:56
# 44

Yes, zoning exists in the SAN. Each cluster can see only its own LUNs.

I have now cleared the QD again; I hadn't cleared it after the HBA firmware upgrades. Since the QD was renewed, the cluster has been stable.

Before the HBA firmware upgrade I had cleared the keys from the QD with "pgre", and also by removing and re-adding the QD, but it didn't help then.

I will keep trying to reproduce the reservation conflict for a few days, and will then report whether it stays stable or not.

regards,

Andres

konsulteksa at 2007-7-21 15:23:56
# 45

A reservation conflict again, but it's different from the previous ones. This time both hosts got reservation conflicts at exactly the same time, by themselves. The cluster was running, all RGs were up, and no users were accessing the cluster; it was just up and running, and suddenly both hosts got a reservation conflict.

And it seems that there is a LUN that is visible to at least both clusters, if not to even more hosts in the SAN. It is the mighty EMC control LUN, which is read-only.

We found a way to reproduce the problem!

As I mentioned, we are deploying two clusters at the same time (2xV240 with S9, SC3.1 8/05, and 2xT2000 with S10, SC3.1 8/05).

Both clusters are attached to a SAN containing DMX800 and DMX1000 arrays. The V240 cluster uses LUNs from both arrays. The T2000 cluster uses LUNs only from the DMX800 array.

Now, when we send one node of the V240 cluster to the ok prompt with a break and give the "sync" command to generate a panic, or just halt or reboot one node in this cluster, both nodes in the T2000 cluster get a reservation conflict and panic.

The two clusters are on completely separate TCP/IP networks, and the only thing that connects these clusters in any way is the SAN, and only through the EMC control LUNs.

Might the problem come from fencing that control LUN? If the LUN data area is read-only, does that mean the part of the LUN where the keys are written is read-only as well? I will try EMC support.

Regards,

Andres

konsulteksa at 2007-7-21 15:24:00
# 46
I would log a support call with Sun. This seems to be a little too complex to be resolved on this forum.

Tim

Tim.Reada at 2007-7-21 15:24:00
# 47

We found the solution. We thought the problem most likely resided in the EMC, and that turned out to be true.

The problem was that read-only EMC LUN. The only reason this LUN is shown to hosts is that it is the best EMC can do to map more than one lun0 to one host port. A lun0 is needed according to the SCSI spec.

In fact there is only one lun0, which is mapped to different hosts, i.e. several hosts can see the one LUN. Of course this LUN is read-only, but when Sun Cluster tries to fence it, any other Sun Cluster in the same SAN that sees the same LUN will panic with a reservation conflict.
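
The failure mode described here can be pictured with a toy model. The Python sketch below is a deliberate simplification and not how the ssd driver or EMC firmware is implemented: fencing is modeled as one host taking a reservation on a LUN, and every other host watching that LUN is assumed to panic when it sees the conflict (the `ssd_mhd_watch_cb` frame in the panic stack suggests such a watch was active). Host names such as `v240-a` and `t2000-a` are hypothetical.

```python
class ReservationConflict(Exception):
    """Raised when a host touches a LUN reserved by a different host."""


class Lun:
    def __init__(self, name):
        self.name = name
        self.holder = None  # host currently holding the reservation, if any

    def reserve(self, host):
        # Simplification: fencing is modeled as this host taking the
        # reservation, replacing any earlier holder.
        self.holder = host

    def probe(self, host):
        # Any access by a host other than the holder sees a conflict.
        if self.holder is not None and self.holder != host:
            raise ReservationConflict(
                f"{host}: {self.name} is reserved by {self.holder}"
            )


def hosts_that_would_panic(lun, watchers):
    """Return the watching hosts that would see a reservation conflict."""
    panicking = []
    for host in watchers:
        try:
            lun.probe(host)
        except ReservationConflict:
            panicking.append(host)
    return panicking


if __name__ == "__main__":
    shared_lun0 = Lun("lun0")
    # One V240 node leaves its cluster; the survivor fences every shared
    # LUN it can see, including the lun0 that the T2000 nodes also see.
    shared_lun0.reserve("v240-a")
    print(hosts_that_would_panic(shared_lun0, ["t2000-a", "t2000-b"]))
```

In this model the conflict disappears as soon as each cluster sees its own lun0, or the shared one is hidden from the hosts, which matches the fix reported above.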

konsulteksa at 2007-7-21 15:24:00
# 48
Thanks for the update.

Tim

Tim.Reada at 2007-7-21 15:24:00