Network failure & error in message file

Dear All,

I have found some network failures and errors in the messages file.

Errors in the messages file:

Mar 7 05:59:15 QZMain in.mpathd[44]: [ID 594170 daemon.error] NIC failure detected on ce2 of group OAM_NE

Mar 7 05:59:15 QZMain in.mpathd[44]: [ID 832587 daemon.error] Successfully failed over from NIC ce2 to NIC ce6

Mar 7 05:59:59 QZMain in.mpathd[44]: [ID 299542 daemon.error] NIC repair detected on ce2 of group OAM_NE

Mar 7 05:59:59 QZMain in.mpathd[44]: [ID 620804 daemon.error] Successfully failed back to NIC ce2

Mar 7 06:00:14 QZMain in.rdiscd[159]: [ID 801593 daemon.error] setsockopt (IP_ADD_MEMBERSHIP): Address already in use

Mar 7 07:56:44 QZMain in.mpathd[44]: [ID 594170 daemon.error] NIC failure detected on ce2 of group OAM_NE

Mar 7 07:56:44 QZMain in.mpathd[44]: [ID 832587 daemon.error] Successfully failed over from NIC ce2 to NIC ce6

Mar 7 07:57:26 QZMain in.mpathd[44]: [ID 299542 daemon.error] NIC repair detected on ce2 of group OAM_NE

Mar 7 07:57:26 QZMain in.mpathd[44]: [ID 620804 daemon.error] Successfully failed back to NIC ce2

Mar 7 07:57:50 QZMain in.rdiscd[159]: [ID 801593 daemon.error] setsockopt (IP_ADD_MEMBERSHIP): Address already in use

Mar 7 08:00:58 QZMain in.mpathd[44]: [ID 594170 daemon.error] NIC failure detected on ce2 of group OAM_NE

Mar 7 08:00:58 QZMain in.mpathd[44]: [ID 832587 daemon.error] Successfully failed over from NIC ce2 to NIC ce6

Mar 7 08:01:41 QZMain in.mpathd[44]: [ID 299542 daemon.error] NIC repair detected on ce2 of group OAM_NE

Mar 7 08:01:41 QZMain in.mpathd[44]: [ID 620804 daemon.error] Successfully failed back to NIC ce2

Mar 7 08:02:02 QZMain in.rdiscd[159]: [ID 801593 daemon.error] setsockopt (IP_ADD_MEMBERSHIP): Address already in use

Mar 7 14:15:26 QZMain in.mpathd[44]: [ID 398532 daemon.error] Cannot meet requested failure detection time of 10000 ms on (inet ce2) new failure detection time is 20580 ms

Mar 7 14:16:49 QZMain in.mpathd[44]: [ID 398532 daemon.error] Cannot meet requested failure detection time of 10000 ms on (inet ce2) new failure detection time is 47824 ms

Mar 7 14:32:31 QZMain in.mpathd[44]: [ID 122137 daemon.error] Improved failure detection time 23912 ms

Mar 7 14:33:56 QZMain in.mpathd[44]: [ID 398532 daemon.error] Cannot meet requested failure detection time of 10000 ms on (inet ce2) new failure detection time is 54264 ms

Mar 7 14:36:33 QZMain in.mpathd[44]: [ID 594170 daemon.error] NIC failure detected on ce2 of group OAM_NE

Mar 7 14:36:33 QZMain in.mpathd[44]: [ID 832587 daemon.error] Successfully failed over from NIC ce2 to NIC ce6

Mar 7 14:36:34 QZMain in.mpathd[44]: [ID 122137 daemon.error] Improved failure detection time 27132 ms

Mar 7 14:36:34 QZMain in.mpathd[44]: [ID 122137 daemon.error] Improved failure detection time 13566 ms

Mar 7 14:36:34 QZMain in.mpathd[44]: [ID 122137 daemon.error] Improved failure detection time 10000 ms

Mar 7 14:36:48 QZMain in.mpathd[44]: [ID 299542 daemon.error] NIC repair detected on ce2 of group OAM_NE

Mar 7 14:36:48 QZMain in.mpathd[44]: [ID 620804 daemon.error] Successfully failed back to NIC ce2

Mar 7 14:38:42 QZMain in.mpathd[44]: [ID 594170 daemon.error] NIC failure detected on ce2 of group OAM_NE

Mar 7 14:38:42 QZMain in.mpathd[44]: [ID 832587 daemon.error] Successfully failed over from NIC ce2 to NIC ce6

Mar 7 14:39:22 QZMain in.mpathd[44]: [ID 299542 daemon.error] NIC repair detected on ce2 of group OAM_NE

Mar 7 14:39:22 QZMain in.mpathd[44]: [ID 620804 daemon.error] Successfully failed back to NIC ce2

Mar 7 14:39:50 QZMain in.rdiscd[159]: [ID 801593 daemon.error] setsockopt (IP_ADD_MEMBERSHIP): Address already in use

IPMP configuration:

QZMain>grep -v '\#' /etc/default/mpathd

FAILURE_DETECTION_TIME=10000

FAILBACK=yes

TRACK_INTERFACES_ONLY_WITH_GROUPS=yes

QZMain>

QZMain>more hostname.ce0

10.15.184.2 netmask + broadcast + group OAM_OAM failover up addif 10.15.184.3 netmask + broadcast + deprecated -failover up

QZMain>more hostname.ce2

192.168.1.30 netmask + broadcast + group OAM_NE failover up addif 192.168.1.31 netmask + broadcast + deprecated -failover up

QZMain>more hostname.ce4

10.15.184.4 netmask + broadcast + group OAM_OAM deprecated -failover standby up

QZMain>more hostname.ce6

192.168.1.32 netmask + broadcast + group OAM_NE deprecated -failover standby up

QZMain>

Any help would be appreciated, thanks in advance.

[4642 byte] By [gahwaha] at [2007-11-26 20:43:27]
# 1

I can't find much on this - it appears there may be a timing issue with IPMP, but as far as I can tell from the other calls logged on this, the message is harmless.

Does it cause any specific problems? If it does, you'll probably need to log a support call with Sun to get it fully investigated. All I can suggest is that you ensure that the system is sufficiently patched.

Tim

Tim.Reada at 2007-7-10 2:03:51 > top of Java-index,Solaris Operating System,Solaris Essentials - General Technical Questions...
# 2

This is our customer's private network, and an application is installed on it to monitor the system's networking. When this error message appears, the server loses its connection to the system.

Can I change the IPMP configuration to fix this issue?

Can you give me the command to check whether the system is sufficiently patched?

gahwaha at 2007-7-10 2:03:51
# 3

I'm surprised it causes a loss of communication. TCP is pretty robust at continuing a connection across network interruptions unless they exceed a certain timeout.
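To judge whether the interruptions are long enough to matter, it may help to measure them. The following is a minimal Python sketch, not a Sun tool, that pairs each "NIC failure detected" line with the next "NIC repair detected" line for the same interface and reports the gap. The function name `outage_seconds` and the `year` parameter are assumptions made for illustration (syslog timestamps carry no year), and the month parsing assumes an English locale.

```python
import re
from datetime import datetime

# Sample lines copied from the messages file quoted in the original post.
LOG = """\
Mar 7 05:59:15 QZMain in.mpathd[44]: [ID 594170 daemon.error] NIC failure detected on ce2 of group OAM_NE
Mar 7 05:59:59 QZMain in.mpathd[44]: [ID 299542 daemon.error] NIC repair detected on ce2 of group OAM_NE
Mar 7 14:36:33 QZMain in.mpathd[44]: [ID 594170 daemon.error] NIC failure detected on ce2 of group OAM_NE
Mar 7 14:36:48 QZMain in.mpathd[44]: [ID 299542 daemon.error] NIC repair detected on ce2 of group OAM_NE
"""

# Captures the timestamp, the event kind, and the interface name.
EVENT = re.compile(
    r"^(\w{3}\s+\d+\s+\d\d:\d\d:\d\d)\s+\S+\s+in\.mpathd\[\d+\]:.*"
    r"NIC (failure|repair) detected on (\w+)"
)


def outage_seconds(text, year=2007):
    """Return (nic, seconds) for each failure/repair pair, in log order.

    `year` is an assumption: syslog lines do not record it.
    """
    down = {}      # nic -> time of the failure still waiting for a repair
    outages = []
    for line in text.splitlines():
        m = EVENT.match(line)
        if not m:
            continue
        stamp, kind, nic = m.groups()
        when = datetime.strptime(f"{year} {stamp}", "%Y %b %d %H:%M:%S")
        if kind == "failure":
            down[nic] = when
        elif nic in down:
            gap = when - down.pop(nic)
            outages.append((nic, int(gap.total_seconds())))
    return outages


if __name__ == "__main__":
    for nic, secs in outage_seconds(LOG):
        print(f"{nic} was down for {secs} s")
```

Applied to the full log in the first post, the five failure/repair pairs on ce2 work out to 44, 42, 43, 15 and 40 seconds, which can then be compared against whatever timeout the monitoring application uses.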

If you definitely have the IPMP configuration correct, then I would log a support call with Sun.

As for checking the patches, you can run updatemanager on Solaris 10 and it will analyse your system and download the patches, etc.

One other thing - have you checked that you have the right netmasks configured in /etc/netmasks and the right /etc/nsswitch.conf file entries for netmasks?
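As a rough way to sanity-check the netmask question, the sketch below tests whether all addresses in each IPMP group land in the same subnet. It is only an illustration: the addresses come from the hostname.ce* files in the first post, but the /24 prefix is an assumption, since the contents of /etc/netmasks were not posted, and the helper names `subnets_for` and `check_groups` are hypothetical.

```python
import ipaddress

# Addresses taken from the hostname.ce* files quoted in the original post.
GROUPS = {
    "OAM_OAM": ["10.15.184.2", "10.15.184.3", "10.15.184.4"],
    "OAM_NE": ["192.168.1.30", "192.168.1.31", "192.168.1.32"],
}

# ASSUMPTION: the real masks live in /etc/netmasks, which was not posted.
# /24 is used here purely for illustration.
ASSUMED_PREFIX = 24


def subnets_for(addresses, prefix):
    """Return the set of networks that the given addresses fall into."""
    return {
        ipaddress.ip_interface(f"{addr}/{prefix}").network
        for addr in addresses
    }


def check_groups(groups, prefix):
    """Map each group name to True if all its addresses share one subnet."""
    return {
        name: len(subnets_for(addrs, prefix)) == 1
        for name, addrs in groups.items()
    }


if __name__ == "__main__":
    for name, ok in check_groups(GROUPS, ASSUMED_PREFIX).items():
        print(name, "same subnet" if ok else "addresses span several subnets")
```

Substituting the prefix that actually corresponds to the entries in /etc/netmasks would show whether the data and test addresses of a group end up in different subnets, which is one way a wrong netmask could interfere with probing.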

Tim

Message was edited by:

Tim.Read

Tim.Reada at 2007-7-10 2:03:51
# 4
Hi, I have faced the same problem. Please update the IPMP patches and the recommended cluster patches, and read the README file before patching the system.
relax_relax81a at 2007-7-10 2:03:51
# 5
Hi, has your problem been fixed? Can you please let me know where I can download the IPMP patches and recommended cluster patches? Thanks,
gah_waha at 2007-7-10 2:03:51
# 6
You can get them via updatemanager in Solaris 10 or from SunSolve (I believe).

Tim
Tim.Reada at 2007-7-10 2:03:51
# 7

This server runs Solaris 8. Does Solaris 8 have "updatemanager"?

If yes, can you please send me more information on how to launch this application?

If not, can you please let me know where I can download the IPMP patches and recommended cluster patches?

Thanks,

gahwaha at 2007-7-10 2:03:51
# 8

Generating a list of patches is virtually impossible without knowing what you have on your system, so I'm not even going to attempt it.

You are right, Solaris 8 does not have updatemanager - that's for Solaris 10 only. What you need is the Solaris 8 'equivalent'. Have a look at:

http://patchpro.sun.com/servlet/com.sun.patchpro.servlet.PatchProServlet

See the patch manager part. That should be able to analyse your system still.

Tim

Tim.Reada at 2007-7-10 2:03:51