IPMP failures on bge Interface
We've been testing IPMP on Solaris SPARC hosts that also have the Apani IPSec Agent installed. It works fine on older hosts that have 'qfe' and 'le' interfaces, but our V210s and T1000s with 'bge' interfaces have a problem. If we configure an IPMP group to use, say, bge0 and bge1 (with bge0 as the primary interface), it works fine at first. Disconnecting bge0 causes a failover to bge1, which is also fine. Disconnecting bge1, however, causes the following errors:
-
Nov 2 10:32:29 cs22 in.mpathd[146]: NIC failure detected on bge1 of group test
Nov 2 10:32:29 cs22 in.mpathd[146]: Successfully failed over from NIC bge1 to NIC bge0
Nov 2 10:32:37 cs2 in.mpathd[146]: All Interfaces in group test have failed
-
All interfaces fail, even though bge0 is still connected and was active before disconnecting bge1. The system recovers once bge0 is reconnected. The two interfaces are physically connected to the same switch, and the hostname.bgeX files are:
-- hostname.bge0
cs22 netmask + broadcast + group test up \
addif cs21 deprecated -failover netmask + broadcast + up
-- hostname.bge1
sp12 netmask + broadcast + group test up \
addif sp16 deprecated -failover netmask + broadcast + up
Any help would be appreciated, thanks in advance.
By CS@apani at 2007-11-26 11:13:23
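One detail in the excerpt above is the delay between the failover that is reported as successful and the later "All Interfaces ... have failed" message. The following is a minimal sketch, not part of the original post, that measures that delay from the three quoted log lines. The function names are hypothetical, and the year is an assumption taken from the post date, because syslog timestamps carry no year.

```python
from datetime import datetime

# The three in.mpathd lines quoted in the post, copied as they appear.
LOG = """\
Nov 2 10:32:29 cs22 in.mpathd[146]: NIC failure detected on bge1 of group test
Nov 2 10:32:29 cs22 in.mpathd[146]: Successfully failed over from NIC bge1 to NIC bge0
Nov 2 10:32:37 cs2 in.mpathd[146]: All Interfaces in group test have failed
"""


def parse(line, year=2007):
    """Split one syslog line into (timestamp, message). The year is assumed."""
    month, day, clock, _host, rest = line.split(None, 4)
    stamp = datetime.strptime(f"{year} {month} {day} {clock}", "%Y %b %d %H:%M:%S")
    message = rest.split(": ", 1)[1]
    return stamp, message


def gap_to_group_failure(log_text):
    """Seconds between the last reported failover and the group failure."""
    failover = None
    for line in log_text.splitlines():
        if not line.strip():
            continue
        stamp, message = parse(line)
        if message.startswith("Successfully failed over"):
            failover = stamp
        elif message.startswith("All Interfaces") and failover is not None:
            return (stamp - failover).total_seconds()
    return None


print(gap_to_group_failure(LOG))
```

The same 8-second gap can be checked against the later, fuller logs in this thread by pasting them into `LOG`.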

# 1
Could you post:
+ showrev
+ netstat -nr
+ /etc/hosts file
+ ifconfig -a (when bge0 and bge1 are connected)
+ ifconfig -a (after removing bge1)
+ ifconfig -a (after inserting bge1)
+ ifconfig -a (after removing bge0)
+ ifconfig -a (after inserting bge0)
+ /var/adm/messages file
# 2
Thanks for replying. Here's the requested information:
-> showrev
Hostname: cstoc77022
Hostid: 842a9b82
Release: 5.10
Kernel architecture: sun4v
Application architecture: sparc
Hardware provider: Sun_Microsystems
Domain: nis.nl.com
Kernel version: SunOS 5.10 Generic_118833-03
-> netstat -rn
Routing Table: IPv4
Destination     Gateway         Flags Ref Use   Interface
--------------- --------------- ----- --- ----- ---------
63.192.85.64    63.192.77.9     UG    1   0     bge0
63.192.78.0     63.192.77.9     UG    1   0     bge0
63.192.77.0     63.192.77.22    U     1   162   bge0
63.192.77.0     63.192.77.12    U     1   12    bge1
63.192.77.0     63.192.77.12    U     1   0     bge0:1
63.192.77.0     63.192.77.12    U     1   0     bge1:1
63.192.76.0     63.192.77.9     UG    1   0     bge0
10.3.0.0        63.192.77.92    UG    1   0     bge0
172.20.0.0      63.192.77.4     UG    1   0     bge0
172.16.0.0      63.192.77.9     UG    1   0     bge0
10.0.0.0        63.192.77.9     UG    1   0     bge0
224.0.0.0       63.192.77.22    U     1   0     bge0
127.0.0.1       127.0.0.1       UH    7   328   lo0
-> more /etc/hosts
#
# Internet host table
#
127.0.0.1 localhost
63.192.77.22 cstoc77022 loghost
63.192.77.1 mls1
- BOTH CONNECTED: bge0, bge1
-> ifconfig -a
lo0: flags=2001000849<UP,LOOPBACK,RUNNING,MULTICAST,IPv4,VIRTUAL> mtu 8232 index 1
inet 127.0.0.1 netmask ff000000
bge0: flags=1001000843<UP,BROADCAST,RUNNING,MULTICAST,IPv4,FIXEDMTU> mtu 1442 index 2
inet 63.192.77.22 netmask ffffff00 broadcast 63.192.77.255
groupname test
ether 0:14:4f:2a:9b:82
bge0:1: flags=9040843<UP,BROADCAST,RUNNING,MULTICAST,DEPRECATED,IPv4,NOFAILOVER> mtu 1500 index 2
inet 63.192.77.21 netmask ffffff00 broadcast 63.192.77.255
bge1: flags=1001000843<UP,BROADCAST,RUNNING,MULTICAST,IPv4,FIXEDMTU> mtu 1442 index 3
inet 63.192.77.12 netmask ffffff00 broadcast 63.192.77.255
groupname test
ether 0:14:4f:2a:9b:83
bge1:1: flags=9040843<UP,BROADCAST,RUNNING,MULTICAST,DEPRECATED,IPv4,NOFAILOVER> mtu 1500 index 3
inet 63.192.77.16 netmask ffffff00 broadcast 63.192.77.255
- REMOVING bge1
lo0: flags=2001000849<UP,LOOPBACK,RUNNING,MULTICAST,IPv4,VIRTUAL> mtu 8232 index 1
inet 127.0.0.1 netmask ff000000
bge0: flags=1011000843<UP,BROADCAST,RUNNING,MULTICAST,IPv4,FAILED,FIXEDMTU> mtu 1442 index 2
inet 63.192.77.22 netmask ffffff00 broadcast 63.192.77.255
groupname test
ether 0:14:4f:2a:9b:82
bge0:1: flags=19040843<UP,BROADCAST,RUNNING,MULTICAST,DEPRECATED,IPv4,NOFAILOVER,FAILED> mtu 1500 index 2
inet 63.192.77.21 netmask ffffff00 broadcast 63.192.77.255
bge0:2: flags=1011000843<UP,BROADCAST,RUNNING,MULTICAST,IPv4,FAILED,FIXEDMTU> mtu 1442 index 2
inet 63.192.77.12 netmask ffffff00 broadcast 63.192.77.255
bge1: flags=1019000802<BROADCAST,MULTICAST,IPv4,NOFAILOVER,FAILED,FIXEDMTU> mtu 0 index 3
inet 0.0.0.0 netmask 0
groupname test
ether 0:14:4f:2a:9b:83
bge1:1: flags=19040803<UP,BROADCAST,MULTICAST,DEPRECATED,IPv4,NOFAILOVER,FAILED> mtu 1500 index 3
inet 63.192.77.16 netmask ffffff00 broadcast 63.192.77.255
Nov 2 13:00:22 cstoc77022 bge: NOTICE: bge1: link down
Nov 2 13:00:22 cstoc77022 in.mpathd[146]: The link has gone down on bge1
Nov 2 13:00:22 cstoc77022 in.mpathd[146]: NIC failure detected on bge1 of group test
Nov 2 13:00:22 cstoc77022 in.mpathd[146]: Successfully failed over from NIC bge1 to NIC bge0
Nov 2 13:00:30 cstoc77022 in.mpathd[146]: All Interfaces in group test have failed
- INSERTING bge1
lo0: flags=2001000849<UP,LOOPBACK,RUNNING,MULTICAST,IPv4,VIRTUAL> mtu 8232 index 1
inet 127.0.0.1 netmask ff000000
bge0: flags=1001000843<UP,BROADCAST,RUNNING,MULTICAST,IPv4,FIXEDMTU> mtu 1442 index 2
inet 63.192.77.22 netmask ffffff00 broadcast 63.192.77.255
groupname test
ether 0:14:4f:2a:9b:82
bge0:1: flags=9040843<UP,BROADCAST,RUNNING,MULTICAST,DEPRECATED,IPv4,NOFAILOVER> mtu 1500 index 2
inet 63.192.77.21 netmask ffffff00 broadcast 63.192.77.255
bge1: flags=1001000843<UP,BROADCAST,RUNNING,MULTICAST,IPv4,FIXEDMTU> mtu 1442 index 3
inet 63.192.77.12 netmask ffffff00 broadcast 63.192.77.255
groupname test
ether 0:14:4f:2a:9b:83
bge1:1: flags=9040843<UP,BROADCAST,RUNNING,MULTICAST,DEPRECATED,IPv4,NOFAILOVER> mtu 1500 index 3
inet 63.192.77.16 netmask ffffff00 broadcast 63.192.77.255
Nov 2 13:01:59 cstoc77022 bge: NOTICE: bge1: link up 100Mbps Full-Duplex
Nov 2 13:01:59 cstoc77022 in.mpathd[146]: The link has come up on bge1
Nov 2 13:02:14 cstoc77022 in.mpathd[146]: NIC repair detected on bge1 of group test
Nov 2 13:02:14 cstoc77022 in.mpathd[146]: Successfully failed back to NIC bge1
Nov 2 13:02:14 cstoc77022 in.mpathd[146]: At least 1 interface (bge1) of group test has repaired
Nov 2 13:02:14 cstoc77022 in.mpathd[146]: NIC repair detected on bge0 of group test
Nov 2 13:02:14 cstoc77022 in.mpathd[146]: Successfully failed back to NIC bge0
- REMOVING bge0
lo0: flags=2001000849<UP,LOOPBACK,RUNNING,MULTICAST,IPv4,VIRTUAL> mtu 8232 index 1
inet 127.0.0.1 netmask ff000000
bge0: flags=1019000802<BROADCAST,MULTICAST,IPv4,NOFAILOVER,FAILED,FIXEDMTU> mtu 0 index 2
inet 0.0.0.0 netmask 0
groupname test
ether 0:14:4f:2a:9b:82
bge0:1: flags=19040803<UP,BROADCAST,MULTICAST,DEPRECATED,IPv4,NOFAILOVER,FAILED> mtu 1500 index 2
inet 63.192.77.21 netmask ffffff00 broadcast 63.192.77.255
bge1: flags=1001000843<UP,BROADCAST,RUNNING,MULTICAST,IPv4,FIXEDMTU> mtu 1442 index 3
inet 63.192.77.12 netmask ffffff00 broadcast 63.192.77.255
groupname test
ether 0:14:4f:2a:9b:83
bge1:1: flags=9040843<UP,BROADCAST,RUNNING,MULTICAST,DEPRECATED,IPv4,NOFAILOVER> mtu 1500 index 3
inet 63.192.77.16 netmask ffffff00 broadcast 63.192.77.255
bge1:2: flags=1001000843<UP,BROADCAST,RUNNING,MULTICAST,IPv4,FIXEDMTU> mtu 1442 index 3
inet 63.192.77.22 netmask ffffff00 broadcast 63.192.77.255
Nov 2 13:03:20 cstoc77022 in.mpathd[146]: The link has gone down on bge0
Nov 2 13:03:20 cstoc77022 in.mpathd[146]: NIC failure detected on bge0 of group test
Nov 2 13:03:20 cstoc77022 in.mpathd[146]: Successfully failed over from NIC bge0 to NIC bge1
- INSERTING bge0
lo0: flags=2001000849<UP,LOOPBACK,RUNNING,MULTICAST,IPv4,VIRTUAL> mtu 8232 index 1
inet 127.0.0.1 netmask ff000000
bge0: flags=1001000843<UP,BROADCAST,RUNNING,MULTICAST,IPv4,FIXEDMTU> mtu 1442 index 2
inet 63.192.77.22 netmask ffffff00 broadcast 63.192.77.255
groupname test
ether 0:14:4f:2a:9b:82
bge0:1: flags=9040843<UP,BROADCAST,RUNNING,MULTICAST,DEPRECATED,IPv4,NOFAILOVER> mtu 1500 index 2
inet 63.192.77.21 netmask ffffff00 broadcast 63.192.77.255
bge1: flags=1001000843<UP,BROADCAST,RUNNING,MULTICAST,IPv4,FIXEDMTU> mtu 1442 index 3
inet 63.192.77.12 netmask ffffff00 broadcast 63.192.77.255
groupname test
ether 0:14:4f:2a:9b:83
bge1:1: flags=9040843<UP,BROADCAST,RUNNING,MULTICAST,DEPRECATED,IPv4,NOFAILOVER> mtu 1500 index 3
inet 63.192.77.16 netmask ffffff00 broadcast 63.192.77.255
Nov 2 13:04:20 cstoc77022 bge: NOTICE: bge0: link up 100Mbps Full-Duplex
Nov 2 13:04:20 cstoc77022 in.mpathd[146]: The link has come up on bge0
Nov 2 13:04:34 cstoc77022 in.mpathd[146]: NIC repair detected on bge0 of group test
Nov 2 13:04:34 cstoc77022 ip: WARNING: IP: Proxy ARP problem? Hardware address '00:14:4f:2a:9b:82' thinks it is 063.192.077.022
Nov 2 13:04:34 cstoc77022 in.mpathd[146]: Successfully failed back to NIC bge0
/var/adm/messages
Nov 2 12:55:54 cstoc77022 nfs: [ID 664466 kern.notice] NFS getattr failed for server mls1: error 7 (RPC: Authentication error)
Nov 2 12:57:23 cstoc77022 last message repeated 5 times
Nov 2 13:00:22 cstoc77022 bge: [ID 801593 kern.notice] NOTICE: bge1: link down
Nov 2 13:00:22 cstoc77022 in.mpathd[146]: [ID 215189 daemon.error] The link has gone down on bge1
Nov 2 13:00:22 cstoc77022 in.mpathd[146]: [ID 594170 daemon.error] NIC failure detected on bge1 of group test
Nov 2 13:00:22 cstoc77022 in.mpathd[146]: [ID 832587 daemon.error] Successfully failed over from NIC bge1 to NIC bge0
Nov 2 13:00:30 cstoc77022 in.mpathd[146]: [ID 168056 daemon.error] All Interfaces in group test have failed
Nov 2 13:01:59 cstoc77022 bge: [ID 801593 kern.notice] NOTICE: bge1: link up 100Mbps Full-Duplex
Nov 2 13:01:59 cstoc77022 in.mpathd[146]: [ID 820239 daemon.error] The link has come up on bge1
Nov 2 13:02:14 cstoc77022 in.mpathd[146]: [ID 299542 daemon.error] NIC repair detected on bge1 of group test
Nov 2 13:02:14 cstoc77022 in.mpathd[146]: [ID 620804 daemon.error] Successfully failed back to NIC bge1
Nov 2 13:02:14 cstoc77022 in.mpathd[146]: [ID 237757 daemon.error] At least 1 interface (bge1) of group test has repaired
Nov 2 13:02:14 cstoc77022 in.mpathd[146]: [ID 299542 daemon.error] NIC repair detected on bge0 of group test
Nov 2 13:02:14 cstoc77022 in.mpathd[146]: [ID 620804 daemon.error] Successfully failed back to NIC bge0
Nov 2 13:02:55 cstoc77022 nfs: [ID 664466 kern.notice] NFS getattr failed for server mls1: error 7 (RPC: Authentication error)
Nov 2 13:02:55 cstoc77022 last message repeated 1 time
Nov 2 13:03:20 cstoc77022 bge: [ID 801593 kern.notice] NOTICE: bge0: link down
Nov 2 13:03:20 cstoc77022 in.mpathd[146]: [ID 215189 daemon.error] The link has gone down on bge0
Nov 2 13:03:20 cstoc77022 in.mpathd[146]: [ID 594170 daemon.error] NIC failure detected on bge0 of group test
Nov 2 13:03:20 cstoc77022 in.mpathd[146]: [ID 832587 daemon.error] Successfully failed over from NIC bge0 to NIC bge1
Nov 2 13:04:20 cstoc77022 bge: [ID 801593 kern.notice] NOTICE: bge0: link up 100Mbps Full-Duplex
Nov 2 13:04:20 cstoc77022 in.mpathd[146]: [ID 820239 daemon.error] The link has come up on bge0
Nov 2 13:04:34 cstoc77022 in.mpathd[146]: [ID 299542 daemon.error] NIC repair detected on bge0 of group test
Nov 2 13:04:34 cstoc77022 ip: [ID 388441 kern.warning] WARNING: IP: Proxy ARP problem? Hardware address '00:14:4f:2a:9b:82' thinks it is 063.192.077.022
Nov 2 13:04:34 cstoc77022 in.mpathd[146]: [ID 620804 daemon.error] Successfully failed back to NIC bge0
Nov 2 13:04:44 cstoc77022 in.routed[158]: [ID 559541 daemon.warning] 10.0.0.0 --> 63.192.77.9 disappeared from kernel
Nov 2 13:04:44 cstoc77022 in.routed[158]: [ID 559541 daemon.warning] 63.192.78.0/24 --> 63.192.77.9 disappeared from kernel
Nov 2 13:04:44 cstoc77022 in.routed[158]: [ID 559541 daemon.warning] 63.192.85.64/27 --> 63.192.77.9 disappeared from kernel
Nov 2 13:04:44 cstoc77022 in.routed[158]: [ID 559541 daemon.warning] 172.20.0.0 --> 63.192.77.4 disappeared from kernel
Nov 2 13:04:44 cstoc77022 in.routed[158]: [ID 559541 daemon.warning] 10.3.0.0/16 --> 63.192.77.92 disappeared from kernel
Nov 2 13:04:44 cstoc77022 in.routed[158]: [ID 559541 daemon.warning] 172.16.0.0 --> 63.192.77.9 disappeared from kernel
Nov 2 13:04:44 cstoc77022 in.routed[158]: [ID 559541 daemon.warning] 63.192.76.0/24 --> 63.192.77.9 disappeared from kernel
Nov 2 13:05:31 cstoc77022 nfs: [ID 664466 kern.notice] NFS getattr failed for server mls1: error 7 (RPC: Authentication error)
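The hex `flags=` values in the ifconfig listings above are easier to compare once decoded. The following is a minimal sketch that decodes them and shows which flags changed between two states. The bit values are an assumption inferred from the hex/name pairs that appear in this output, not taken from a Solaris header, and the function names are hypothetical.

```python
# Bit values inferred from the flags=<hex><NAMES> pairs in the listings above.
FLAG_BITS = {
    0x1: "UP",
    0x2: "BROADCAST",
    0x8: "LOOPBACK",
    0x40: "RUNNING",
    0x800: "MULTICAST",
    0x40000: "DEPRECATED",
    0x1000000: "IPv4",
    0x8000000: "NOFAILOVER",
    0x10000000: "FAILED",
    0x1000000000: "FIXEDMTU",
    0x2000000000: "VIRTUAL",
}


def decode_flags(hex_text):
    """Return flag names for a hex string such as '1011000843'."""
    value = int(hex_text, 16)
    names = [name for bit, name in sorted(FLAG_BITS.items()) if value & bit]
    unknown = value & ~sum(FLAG_BITS)  # bits this table does not cover
    if unknown:
        names.append(f"unknown:{unknown:#x}")
    return names


def diff_flags(before, after):
    """Return (flags gained, flags lost) between two hex flag strings."""
    old, new = set(decode_flags(before)), set(decode_flags(after))
    return sorted(new - old), sorted(old - new)


# bge0 with both links connected vs. bge0 after bge1 was unplugged.
print(decode_flags("1011000843"))
print(diff_flags("1001000843", "1011000843"))
```

Applied to bge0, the only change after unplugging bge1 is the added FAILED flag, even though bge0 still reports RUNNING.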
# 3
Could you post:
+ routeadm
+ arp -an
+ ps -aef
+ /etc/defaultrouter file
Check that your box is not acting as a router.
Your box does not have a default router. Is that configuration intentional?
The documentation states:
<< Routers that are connected to the IP link are automatically selected as targets for probing. If no routers exist on the link, in.mpathd sends probes to neighbor hosts on the link. A multicast packet that is sent to the all host multicast address.
...snip...
If in.mpathd cannot find routers or hosts that responded to the ICMP echo packets, in.mpathd cannot detect probe-based failures.>>
Are ICMP echo packets allowed to reach the routers on the other networks?
# 4
Hello again,
When gathering data for the previous reply, I also noticed that the default route had not been set. We usually specify one, so I added it to the configuration. The host had already found the correct router on its own, though (it's 63.192.77.9), and specifying it explicitly did not change the symptoms. Here's the other requested info:
-> netstat -rn
Routing Table: IPv4
Destination     Gateway         Flags Ref Use   Interface
--------------- --------------- ----- --- ----- ---------
63.192.77.0     63.192.77.12    U     1   5     bge1
63.192.77.0     63.192.77.22    U     1   1     bge0
63.192.77.0     63.192.77.22    U     1   0     bge0:1
63.192.77.0     63.192.77.12    U     1   0     bge1:1
224.0.0.0       63.192.77.22    U     1   0     bge0
default         63.192.77.9     UG    1   0
127.0.0.1       127.0.0.1       UH    7   93    lo0
-> routeadm
Configuration       Current              Current
Option              Configuration        System State
IPv4 forwarding     disabled             disabled
IPv4 routing        default (disabled)   disabled
IPv6 forwarding     disabled             disabled
IPv6 routing        disabled             disabled
IPv4 routing daemon       "/usr/sbin/in.routed"
IPv4 routing daemon args  ""
IPv4 routing daemon stop  "kill -TERM `cat /var/tmp/in.routed.pid`"
IPv6 routing daemon       "/usr/lib/inet/in.ripngd"
IPv6 routing daemon args  "-s"
IPv6 routing daemon stop  "kill -TERM `cat /var/tmp/in.ripngd.pid`"
-> arp -an
Net to Media Table: IPv4
Device IP Address      Mask            Flags Phys Addr
------ --------------- --------------- ----- -----------------
bge1   63.192.77.1     255.255.255.255       00:03:ba:c0:77:75
bge0   63.192.77.9     255.255.255.255       00:16:46:f1:b5:c2
bge1   63.192.77.9     255.255.255.255       00:16:46:f1:b5:c2
bge1   63.192.77.186   255.255.255.255       00:c0:4f:60:6a:ab
bge0   63.192.77.186   255.255.255.255       00:c0:4f:60:6a:ab
bge1   63.192.77.191   255.255.255.255       00:0c:f1:bf:1d:01
bge0   63.192.77.191   255.255.255.255       00:0c:f1:bf:1d:01
bge1   63.192.77.169   255.255.255.255       00:0c:f1:bf:1c:92
bge0   63.192.77.169   255.255.255.255       00:0c:f1:bf:1c:92
bge1   63.192.77.175   255.255.255.255       00:c0:4f:60:68:64
bge0   63.192.77.175   255.255.255.255       00:c0:4f:60:68:64
bge1   63.192.77.144   255.255.255.255       00:c0:4f:60:68:94
bge0   63.192.77.144   255.255.255.255       00:c0:4f:60:68:94
bge1   63.192.77.150   255.255.255.255       00:c0:4f:60:6a:70
bge0   63.192.77.150   255.255.255.255       00:c0:4f:60:6a:70
bge0   63.192.77.130   255.255.255.255       00:0c:f1:bf:1d:1f
bge1   63.192.77.130   255.255.255.255       00:0c:f1:bf:1d:1f
bge1   63.192.77.128   255.255.255.255       00:0c:f1:bf:1c:65
bge0   63.192.77.128   255.255.255.255       00:0c:f1:bf:1c:65
bge1   63.192.77.242   255.255.255.255       00:0d:56:0b:eb:2a
bge0   63.192.77.242   255.255.255.255       00:0d:56:0b:eb:2a
bge1   63.192.77.243   255.255.255.255       00:0f:1f:91:c1:9b
bge0   63.192.77.243   255.255.255.255       00:0f:1f:91:c1:9b
bge1   63.192.77.240   255.255.255.255       00:13:72:17:cb:13
bge0   63.192.77.240   255.255.255.255       00:13:72:17:cb:13
bge1   63.192.77.247   255.255.255.255       00:c0:4f:60:6a:e6
bge0   63.192.77.247   255.255.255.255       00:c0:4f:60:6a:e6
bge1   63.192.77.224   255.255.255.255       00:09:6b:2e:61:dd
bge0   63.192.77.224   255.255.255.255       00:09:6b:2e:61:dd
bge1   63.192.77.225   255.255.255.255       00:11:11:c4:9c:eb
bge0   63.192.77.225   255.255.255.255       00:11:11:c4:9c:eb
bge1   63.192.77.236   255.255.255.255       00:03:ba:eb:17:6d
bge0   63.192.77.236   255.255.255.255       00:03:ba:eb:17:6d
bge1   63.192.77.210   255.255.255.255       00:11:11:b1:2b:6e
bge0   63.192.77.210   255.255.255.255       00:11:11:b1:2b:6e
bge1   63.192.77.222   255.255.255.255       00:30:6e:08:ed:3a
bge0   63.192.77.222   255.255.255.255       00:30:6e:08:ed:3a
bge1   63.192.77.193   255.255.255.255       00:13:72:23:32:aa
bge0   63.192.77.193   255.255.255.255       00:13:72:23:32:aa
bge1   63.192.77.207   255.255.255.255       00:0c:f1:b6:26:aa
bge0   63.192.77.207   255.255.255.255       00:0c:f1:b6:26:aa
bge1   63.192.77.204   255.255.255.255       00:c0:4f:60:68:5b
bge0   63.192.77.204   255.255.255.255       00:c0:4f:60:68:5b
bge1   63.192.77.48    255.255.255.255       00:0a:95:99:e4:40
bge0   63.192.77.48    255.255.255.255       00:0a:95:99:e4:40
bge0   63.192.77.49    255.255.255.255       00:03:93:90:52:f6
bge1   63.192.77.61    255.255.255.255       00:c0:4f:60:6a:75
bge0   63.192.77.61    255.255.255.255       00:c0:4f:60:6a:75
bge1   63.192.77.35    255.255.255.255       00:30:6e:49:41:50
bge0   63.192.77.35    255.255.255.255       00:30:6e:49:41:50
bge1   63.192.77.36    255.255.255.255       00:16:35:3e:7d:0a
bge0   63.192.77.36    255.255.255.255       00:16:35:3e:7d:0a
bge0   63.192.77.42    255.255.255.255       00:11:11:c4:9d:05
bge1   63.192.77.42    255.255.255.255       00:11:11:c4:9d:05
bge1   63.192.77.40    255.255.255.255       00:0c:f1:bf:1f:8d
bge0   63.192.77.40    255.255.255.255       00:0c:f1:bf:1f:8d
bge1   63.192.77.41    255.255.255.255       00:0c:f1:bf:1d:10
bge0   63.192.77.41    255.255.255.255       00:0c:f1:bf:1d:10
bge0   63.192.77.19    255.255.255.255       08:00:20:f0:ea:e4
bge1   63.192.77.19    255.255.255.255       08:00:20:f0:ea:e4
bge1   63.192.77.16    255.255.255.255 SP    00:14:4f:2a:9b:83
bge0   63.192.77.22    255.255.255.255 SP    00:14:4f:2a:9b:82
bge0   63.192.77.23    255.255.255.255       00:09:6b:3e:2b:82
bge1   63.192.77.23    255.255.255.255       00:09:6b:3e:2b:82
bge0   63.192.77.21    255.255.255.255 SP    00:14:4f:2a:9b:82
bge1   63.192.77.29    255.255.255.255       00:09:6b:2e:46:51
bge0   63.192.77.29    255.255.255.255       00:09:6b:2e:46:51
bge0   63.192.77.1     255.255.255.255       00:03:ba:c0:77:75
bge1   63.192.77.12    255.255.255.255 SP    00:14:4f:2a:9b:83
bge0   63.192.77.115   255.255.255.255       00:0c:f1:bf:1c:e6
bge1   63.192.77.115   255.255.255.255       00:0c:f1:bf:1c:e6
bge1   63.192.77.122   255.255.255.255       00:10:83:f9:34:d4
bge0   63.192.77.122   255.255.255.255       00:10:83:f9:34:d4
bge1   63.192.77.125   255.255.255.255       00:0f:1f:91:bf:7d
bge0   63.192.77.125   255.255.255.255       00:0f:1f:91:bf:7d
bge1   63.192.77.99    255.255.255.255       00:0c:f1:bf:1a:52
bge0   63.192.77.99    255.255.255.255       00:0c:f1:bf:1a:52
bge1   63.192.77.100   255.255.255.255       00:0c:f1:b6:26:b4
bge0   63.192.77.100   255.255.255.255       00:0c:f1:b6:26:b4
bge1   63.192.77.101   255.255.255.255       00:0c:f1:bf:1c:fe
bge0   63.192.77.101   255.255.255.255       00:0c:f1:bf:1c:fe
bge1   63.192.77.107   255.255.255.255       00:0d:56:14:48:4d
bge0   63.192.77.107   255.255.255.255       00:0d:56:14:48:4d
bge1   63.192.77.110   255.255.255.255       00:c0:4f:60:6a:44
bge0   63.192.77.110   255.255.255.255       00:c0:4f:60:6a:44
bge1   63.192.77.108   255.255.255.255       00:14:bf:31:ec:e2
bge0   63.192.77.108   255.255.255.255       00:14:bf:31:ec:e2
bge0   63.192.77.80    255.255.255.255       00:16:cb:a6:5e:3d
bge1   63.192.77.80    255.255.255.255       00:16:cb:a6:5e:3d
bge1   63.192.77.92    255.255.255.255       00:40:63:d3:8c:46
bge0   63.192.77.92    255.255.255.255       00:40:63:d3:8c:46
bge1   63.192.77.68    255.255.255.255       00:0c:f1:b6:27:10
bge0   63.192.77.68    255.255.255.255       00:0c:f1:b6:27:10
bge1   63.192.77.69    255.255.255.255       00:13:72:17:ca:4a
bge0   63.192.77.69    255.255.255.255       00:13:72:17:ca:4a
bge1   63.192.77.73    255.255.255.255       00:03:93:d1:db:cc
bge0   63.192.77.73    255.255.255.255       00:03:93:d1:db:cc
bge1   63.192.77.77    255.255.255.255       00:30:65:a8:22:bc
bge0   63.192.77.77    255.255.255.255       00:30:65:a8:22:bc
bge1   224.0.0.0       240.0.0.0       SM    01:00:5e:00:00:00
bge0   224.0.0.0       240.0.0.0       SM    01:00:5e:00:00:00
-> ps -aef
UID      PID   PPID  C STIME    TTY     TIME CMD
root     0     0     0 15:11:12 ?       0:11 sched
root     1     0     0 15:11:13 ?       0:00 /sbin/init
root     2     0     0 15:11:13 ?       0:00 pageout
root     3     0     0 15:11:13 ?       0:00 fsflush
daemon   196   1     0 15:11:37 ?       0:00 /usr/sbin/rpcbind
root     7     1     0 15:11:15 ?       0:10 /lib/svc/bin/svc.startd
root     9     1     0 15:11:16 ?       0:16 /lib/svc/bin/svc.configd
root     256   1     0 15:11:40 ?       0:00 /usr/sbin/cron
root     335   1     0 15:11:49 ?       0:00 /usr/sbin/syslogd
root     113   1     0 15:11:33 ?       0:00 /usr/sbin/nscd -S passwd,yes
root     726   691   0 15:16:16 pts/1   0:00 ps -aef
daemon   201   1     0 15:11:37 ?       0:00 /usr/lib/nfs/statd
root     200   1     0 15:11:37 ?       0:00 /usr/sbin/keyserv
root     192   1     0 15:11:36 ?       0:01 /opt/apani/uagent/nlagent
daemon   86    1     0 15:11:26 ?       0:00 /usr/lib/crypto/kcfd
root     152   1     0 15:11:35 ?       0:00 /usr/lib/inet/in.mpathd -a
root     212   7     0 15:11:38 ?       0:00 /usr/lib/saf/sac -t 300
root     89    1     0 15:11:26 ?       0:00 /usr/lib/picl/picld
daemon   247   1     0 15:11:40 ?       0:00 /usr/lib/nfs/nfs4cbd
root     102   1     0 15:11:28 ?       0:00 /usr/lib/power/powerd
root     98    1     0 15:11:27 ?       0:00 /usr/lib/sysevent/syseventd
root     215   1     0 15:11:38 ?       0:00 /usr/sbin/nis_cachemgr
daemon   214   1     0 15:11:38 ?       0:00 /usr/lib/nfs/lockd
root     213   1     0 15:11:38 ?       0:00 /usr/lib/utmpd
root     217   7     0 15:11:38 console 0:00 -sh
root     223   192   0 15:11:39 ?       0:00 inm -p9165
root     222   212   0 15:11:39 ?       0:00 /usr/lib/saf/ttymon
daemon   255   1     0 15:11:40 ?       0:00 /usr/lib/nfs/nfsmapid
root     399   397   0 15:11:52 ?       0:00 /usr/sadm/lib/smc/bin/smcboot
root     252   1     0 15:11:40 ?       0:04 /usr/lib/inet/inetd start
root     398   397   0 15:11:52 ?       0:00 /usr/sadm/lib/smc/bin/smcboot
root     317   1     0 15:11:48 ?       0:00 /usr/lib/autofs/automountd
root     359   1     0 15:11:50 ?       0:00 /usr/lib/sendmail -bd -q15m
root     448   447   0 15:11:53 ?       0:00 /usr/lib/locale/ja/wnn/jserver_m
root     351   1     0 15:11:50 ?       0:02 /usr/lib/fm/fmd/fmd
root     674   252   0 15:12:14 ?       0:00 /usr/sbin/in.telnetd
root     347   1     0 15:11:50 ?       0:00 /usr/lib/ssh/sshd
smmsp    360   1     0 15:11:50 ?       0:00 /usr/lib/sendmail -Ac -q15m
root     461   1     0 15:11:53 ?       0:00 /usr/lib/locale/ja/atokserver/atokmngdaemon
root     397   1     0 15:11:52 ?       0:00 /usr/sadm/lib/smc/bin/smcboot
root     468   459   0 15:11:53 ?       0:00 htt_server -port 9010 -syslog -message_locale C
root     441   1     0 15:11:53 ?       0:00 /usr/lib/locale/ja/wnn/dpkeyserv
root     447   1     0 15:11:53 ?       0:00 /usr/lib/locale/ja/wnn/jserver
root     459   1     0 15:11:53 ?       0:00 /usr/lib/im/htt -port 9010 -syslog -message_locale C
root     512   1     0 15:11:55 ?       0:00 /usr/lib/snmp/snmpdx -y -c /etc/snmp/conf
root     520   1     0 15:11:56 ?       0:00 /usr/lib/dmi/dmispd
root     528   1     0 15:11:56 ?       0:00 /usr/sbin/vold
root     521   1     0 15:11:56 ?       0:00 /usr/lib/dmi/snmpXdmid -s cstoc77022
root     511   1     0 15:11:55 ?       0:00 /usr/dt/bin/dtlogin -daemon
root     691   677   0 15:12:18 pts/1   0:00 bash
root     677   674   0 15:12:14 pts/1   0:00 -sh
root     585   1     0 15:11:57 ?       0:00 /usr/sfw/sbin/snmpd
# 5
Be sure your changes are permanent, reboot the box and try your tests again. IPMP finds targets at boot time. If it does not work, post all the information requested in replies 1 and 3.
# 6
OK. Actually, I had made the changes permanent and rebooted before the previous reply, but I had not rechecked all the ifconfig settings. Here they are again, this time with a configured default router:
-> netstat -rn
Routing Table: IPv4
Destination     Gateway         Flags Ref Use   Interface
--------------- --------------- ----- --- ----- ---------
63.192.77.0     63.192.77.22    U     1   21    bge0
63.192.77.0     63.192.77.12    U     1   1     bge1
63.192.77.0     63.192.77.22    U     1   0     bge0:1
63.192.77.0     63.192.77.22    U     1   0     bge1:1
224.0.0.0       63.192.77.22    U     1   0     bge0
default         63.192.77.9     UG    1   1
127.0.0.1       127.0.0.1       UH    7   99    lo0
- BOTH CONNECTED bge0, bge1
-> ifconfig -a
lo0: flags=2001000849<UP,LOOPBACK,RUNNING,MULTICAST,IPv4,VIRTUAL> mtu 8232 index 1
inet 127.0.0.1 netmask ff000000
bge0: flags=1001000843<UP,BROADCAST,RUNNING,MULTICAST,IPv4,FIXEDMTU> mtu 1442 index 2
inet 63.192.77.22 netmask ffffff00 broadcast 63.192.77.255
groupname test
ether 0:14:4f:2a:9b:82
bge0:1: flags=9040843<UP,BROADCAST,RUNNING,MULTICAST,DEPRECATED,IPv4,NOFAILOVER> mtu 1500 index 2
inet 63.192.77.21 netmask ffffff00 broadcast 63.192.77.255
bge1: flags=1001000843<UP,BROADCAST,RUNNING,MULTICAST,IPv4,FIXEDMTU> mtu 1442 index 3
inet 63.192.77.12 netmask ffffff00 broadcast 63.192.77.255
groupname test
ether 0:14:4f:2a:9b:83
bge1:1: flags=9040843<UP,BROADCAST,RUNNING,MULTICAST,DEPRECATED,IPv4,NOFAILOVER> mtu 1500 index 3
inet 63.192.77.16 netmask ffffff00 broadcast 63.192.77.255
- REMOVED bge1
-> ifconfig -a
lo0: flags=2001000849<UP,LOOPBACK,RUNNING,MULTICAST,IPv4,VIRTUAL> mtu 8232 index 1
inet 127.0.0.1 netmask ff000000
bge0: flags=1001000843<UP,BROADCAST,RUNNING,MULTICAST,IPv4,FIXEDMTU> mtu 1442 index 2
inet 63.192.77.22 netmask ffffff00 broadcast 63.192.77.255
groupname test
ether 0:14:4f:2a:9b:82
bge0:1: flags=9040843<UP,BROADCAST,RUNNING,MULTICAST,DEPRECATED,IPv4,NOFAILOVER> mtu 1500 index 2
inet 63.192.77.21 netmask ffffff00 broadcast 63.192.77.255
bge0:2: flags=1001000843<UP,BROADCAST,RUNNING,MULTICAST,IPv4,FIXEDMTU> mtu 1442 index 2
inet 63.192.77.12 netmask ffffff00 broadcast 63.192.77.255
bge1: flags=1019000802<BROADCAST,MULTICAST,IPv4,NOFAILOVER,FAILED,FIXEDMTU> mtu 0 index 3
inet 0.0.0.0 netmask 0
groupname test
ether 0:14:4f:2a:9b:83
bge1:1: flags=19040803<UP,BROADCAST,MULTICAST,DEPRECATED,IPv4,NOFAILOVER,FAILED> mtu 1500 index 3
inet 63.192.77.16 netmask ffffff00 broadcast 63.192.77.255
Nov 2 16:47:59 cstoc77022 bge: NOTICE: bge1: link down
Nov 2 16:47:59 cstoc77022 in.mpathd[153]: The link has gone down on bge1
Nov 2 16:47:59 cstoc77022 in.mpathd[153]: NIC failure detected on bge1 of group test
Nov 2 16:47:59 cstoc77022 in.mpathd[153]: Successfully failed over from NIC bge1 to NIC bge0
Nov 2 16:48:07 cstoc77022 in.mpathd[153]: All Interfaces in group test have failed
- RESTORED bge1
-> ifconfig -a
lo0: flags=2001000849<UP,LOOPBACK,RUNNING,MULTICAST,IPv4,VIRTUAL> mtu 8232 index 1
inet 127.0.0.1 netmask ff000000
bge0: flags=1001000843<UP,BROADCAST,RUNNING,MULTICAST,IPv4,FIXEDMTU> mtu 1442 index 2
inet 63.192.77.22 netmask ffffff00 broadcast 63.192.77.255
groupname test
ether 0:14:4f:2a:9b:82
bge0:1: flags=9040843<UP,BROADCAST,RUNNING,MULTICAST,DEPRECATED,IPv4,NOFAILOVER> mtu 1500 index 2
inet 63.192.77.21 netmask ffffff00 broadcast 63.192.77.255
bge1: flags=1001000843<UP,BROADCAST,RUNNING,MULTICAST,IPv4,FIXEDMTU> mtu 1442 index 3
inet 63.192.77.12 netmask ffffff00 broadcast 63.192.77.255
groupname test
ether 0:14:4f:2a:9b:83
bge1:1: flags=9040843<UP,BROADCAST,RUNNING,MULTICAST,DEPRECATED,IPv4,NOFAILOVER> mtu 1500 index 3
inet 63.192.77.16 netmask ffffff00 broadcast 63.192.77.255
Nov 2 16:48:51 cstoc77022 bge: NOTICE: bge1: link up 100Mbps Full-Duplex
Nov 2 16:48:51 cstoc77022 in.mpathd[153]: The link has come up on bge1
Nov 2 16:49:06 cstoc77022 in.mpathd[153]: NIC repair detected on bge0 of group test
Nov 2 16:49:06 cstoc77022 in.mpathd[153]: Successfully failed back to NIC bge0
Nov 2 16:49:06 cstoc77022 in.mpathd[153]: At least 1 interface (bge0) of group test has repaired
Nov 2 16:49:06 cstoc77022 in.mpathd[153]: NIC repair detected on bge1 of group test
Nov 2 16:49:06 cstoc77022 in.mpathd[153]: Successfully failed back to NIC bge1
- REMOVED bge0
-> ifconfig -a
lo0: flags=2001000849<UP,LOOPBACK,RUNNING,MULTICAST,IPv4,VIRTUAL> mtu 8232 index 1
inet 127.0.0.1 netmask ff000000
bge0: flags=1019000802<BROADCAST,MULTICAST,IPv4,NOFAILOVER,FAILED,FIXEDMTU> mtu 0 index 2
inet 0.0.0.0 netmask 0
groupname test
ether 0:14:4f:2a:9b:82
bge0:1: flags=19040803<UP,BROADCAST,MULTICAST,DEPRECATED,IPv4,NOFAILOVER,FAILED> mtu 1500 index 2
inet 63.192.77.21 netmask ffffff00 broadcast 63.192.77.255
bge1: flags=1001000843<UP,BROADCAST,RUNNING,MULTICAST,IPv4,FIXEDMTU> mtu 1442 index 3
inet 63.192.77.12 netmask ffffff00 broadcast 63.192.77.255
groupname test
ether 0:14:4f:2a:9b:83
bge1:1: flags=9040843<UP,BROADCAST,RUNNING,MULTICAST,DEPRECATED,IPv4,NOFAILOVER> mtu 1500 index 3
inet 63.192.77.16 netmask ffffff00 broadcast 63.192.77.255
bge1:2: flags=1001000843<UP,BROADCAST,RUNNING,MULTICAST,IPv4,FIXEDMTU> mtu 1442 index 3
inet 63.192.77.22 netmask ffffff00 broadcast 63.192.77.255
Nov 2 16:50:02 cstoc77022 bge: NOTICE: bge0: link down
Nov 2 16:50:02 cstoc77022 in.mpathd[153]: The link has gone down on bge0
Nov 2 16:50:02 cstoc77022 in.mpathd[153]: NIC failure detected on bge0 of group test
Nov 2 16:50:02 cstoc77022 in.mpathd[153]: Successfully failed over from NIC bge0 to NIC bge1
- RESTORED bge0
-> ifconfig -a
lo0: flags=2001000849<UP,LOOPBACK,RUNNING,MULTICAST,IPv4,VIRTUAL> mtu 8232 index 1
inet 127.0.0.1 netmask ff000000
bge0: flags=1001000843<UP,BROADCAST,RUNNING,MULTICAST,IPv4,FIXEDMTU> mtu 1442 index 2
inet 63.192.77.22 netmask ffffff00 broadcast 63.192.77.255
groupname test
ether 0:14:4f:2a:9b:82
bge0:1: flags=9040843<UP,BROADCAST,RUNNING,MULTICAST,DEPRECATED,IPv4,NOFAILOVER> mtu 1500 index 2
inet 63.192.77.21 netmask ffffff00 broadcast 63.192.77.255
bge1: flags=1001000843<UP,BROADCAST,RUNNING,MULTICAST,IPv4,FIXEDMTU> mtu 1442 index 3
inet 63.192.77.12 netmask ffffff00 broadcast 63.192.77.255
groupname test
ether 0:14:4f:2a:9b:83
bge1:1: flags=9040843<UP,BROADCAST,RUNNING,MULTICAST,DEPRECATED,IPv4,NOFAILOVER> mtu 1500 index 3
inet 63.192.77.16 netmask ffffff00 broadcast 63.192.77.255
Nov 2 16:51:12 cstoc77022 bge: NOTICE: bge0: link up 100Mbps Full-Duplex
Nov 2 16:51:12 cstoc77022 in.mpathd[153]: The link has come up on bge0
Nov 2 16:51:12 cstoc77022 ip: WARNING: IP: Hardware address '00:14:4f:2a:9b:82' trying to be our address 063.192.077.021!
Nov 2 16:51:26 cstoc77022 in.mpathd[153]: NIC repair detected on bge0 of group test
Nov 2 16:51:26 cstoc77022 in.mpathd[153]: Successfully failed back to NIC bge0
Nov 2 16:51:34 cstoc77022 ip: WARNING: IP: Hardware address '00:14:4f:2a:9b:82' trying to be our address 063.192.077.022!
# 7
1. Test your default router:
ping 63.192.77.9
2. Test other Sun boxes, 63.192.77.1, 63.192.77.236 and 63.192.77.19:
ping 63.192.77.1 ; ping 63.192.77.236 ; ping 63.192.77.19
3. If that works, add static host routes and put them in a boot script:
route add -host 63.192.77.1 63.192.77.1 -static
route add -host 63.192.77.236 63.192.77.236 -static
route add -host 63.192.77.19 63.192.77.19 -static
4. Try your tests again.
5. If it does not work, install the Recommended patches and the bge patch (122027-08).
By the way, does your software use ARP to publish MAC-IP addresses?
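The host routes in step 3 could be generated for a boot script rather than typed by hand. The following is a minimal sketch, assuming the three neighbour addresses above and the 63.192.77.0/24 netmask shown in the ifconfig output. The function name is hypothetical, and the code only prints the commands without running them.

```python
import ipaddress

# The subnet is taken from the ifconfig output in this thread (netmask ffffff00).
LOCAL_NET = ipaddress.ip_network("63.192.77.0/24")


def probe_target_routes(targets, network=LOCAL_NET):
    """Build 'route add -host' lines in the form used in step 3."""
    commands = []
    for target in targets:
        addr = ipaddress.ip_address(target)  # raises ValueError on a typo
        if addr not in network:
            raise ValueError(f"{addr} is not on the local link {network}")
        commands.append(f"route add -host {addr} {addr} -static")
    return commands


for line in probe_target_routes(["63.192.77.1", "63.192.77.236", "63.192.77.19"]):
    print(line)
```

The on-link check is there because a target on another subnet would not be a neighbour on the same IP link.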
# 8
I don't see how the explicit routes will change the results, but it's worth a try. Our software doesn't do anything with ARPs. The only thing we do is reduce the MTU size to make room for all the ESP headers. We've only had problems with the 'bge' interface, which is the mystery to us. Thanks for your help so far!
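The MTU reduction mentioned here is visible in the ifconfig output earlier in the thread: the physical interfaces report mtu 1442, while the test addresses bge0:1 and bge1:1 still report mtu 1500. The following is a minimal sketch that pulls those values out and lists NICs whose logical interfaces disagree. The function names are hypothetical, and whether this mismatch has anything to do with the failure is not established in the thread.

```python
import re

# Four lines copied from the "both connected" ifconfig -a output.
IFCONFIG = """\
bge0: flags=1001000843<UP,BROADCAST,RUNNING,MULTICAST,IPv4,FIXEDMTU> mtu 1442 index 2
bge0:1: flags=9040843<UP,BROADCAST,RUNNING,MULTICAST,DEPRECATED,IPv4,NOFAILOVER> mtu 1500 index 2
bge1: flags=1001000843<UP,BROADCAST,RUNNING,MULTICAST,IPv4,FIXEDMTU> mtu 1442 index 3
bge1:1: flags=9040843<UP,BROADCAST,RUNNING,MULTICAST,DEPRECATED,IPv4,NOFAILOVER> mtu 1500 index 3
"""


def mtu_by_interface(text):
    """Map each logical interface name to its reported MTU."""
    pattern = re.compile(r"^(\S+): flags=\S+ mtu (\d+)", re.MULTILINE)
    return {name: int(mtu) for name, mtu in pattern.findall(text)}


def mixed_mtu(mtus):
    """Return physical NICs whose logical interfaces report different MTUs."""
    by_nic = {}
    for name, mtu in mtus.items():
        by_nic.setdefault(name.split(":")[0], set()).add(mtu)
    return {nic: sorted(values) for nic, values in by_nic.items() if len(values) > 1}


mtus = mtu_by_interface(IFCONFIG)
print(mixed_mtu(mtus))
print("bytes reserved on bge0:", mtus["bge0:1"] - mtus["bge0"])
```

The difference between the two values is 58 bytes per packet.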
# 9
The static routes didn't help, so I installed the bge patch. It made my host unbootable, and since I'm using a Try&Buy T1000 there's no optical drive or external SCSI port, leaving a net install as my only option. I'm currently creating a JumpStart server, so hopefully my host will be back up for more testing later today.
# 10
My host is back online, with a newer version of Solaris 10. It already includes versions of the bge patch, so I reran the IPMP tests, and it now works normally for me. We'll also test the corresponding patch for Solaris 8. Thanks!