Solaris 10 + Oracle 10gR2 RAC question
Hello everyone
Has anyone come across the case where the CRS services of Oracle cause
the public interface to get turned off and then restored at random
time intervals? To elaborate, we have a two-node cluster database:
Solaris 10 and Oracle 10gR2 RAC with patch 10.2.0.3 applied. No Sun
clustering is involved. When the cluster software is down (nodeapps,
ASM, database instances all down), /var/adm/messages shows nothing. When
we start nodeapps on the two nodes (thus initiating some form of
communication between the nodes), at random time intervals we get
"interface ce0 turned off" and "interface ce0 restored" in
/var/adm/messages. When we check the status of the RAC, we see that one
node's VIP has been reassigned to the other. This on/off behaviour of
the NIC can be eliminated only if we continuously ping it from another
client on the network.
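In case it helps anyone compare notes, this is roughly how the events can be counted on each node. It is only a sketch: the helper name count_iface_events is made up for illustration, and the grep pattern assumes the log wording matches the messages quoted above.

```shell
# Hypothetical helper: count log lines for one interface and one event.
# Assumption: the log text contains "interface <name> <event>" as quoted above.
count_iface_events() {
  # $1 = log file, $2 = interface name, $3 = event ("turned off" or "restored")
  grep -c "interface $2 $3" "$1"
}

# Example on a node:
#   count_iface_events /var/adm/messages ce0 "turned off"
#   count_iface_events /var/adm/messages ce0 "restored"
```

The VIP placement at the same moment can then be compared against the output of crs_stat -t from the CRS home.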
As a matter of fact, the RAC and the RDBMS work perfectly when we keep
pinging the two nodes from another client on the network. We even
managed to run a long batch job, distributed across cluster-managed
services on the two instances, and it completed after 9 hours without
any problems.
Does anyone have a hint about this behaviour? Is there some sort of
timeout for the network cards? Some power-saving feature? Googling
around, I came across the new Containers feature available on Solaris
10. Is there a way that I can verify whether either RAC or the RDBMS is
running in "container" mode (since the Solaris and Oracle
installation was not performed by me)? Any other ideas?
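The closest thing I have found so far for the container question is the check below. It is a sketch, not something I have confirmed against this setup: the function name zone_check is made up, and it assumes a ksh or bash session on each node. As far as I understand, zonename prints "global" when the shell is not inside a non-global zone.

```shell
# Hypothetical helper: report which zone this shell is running in.
# Assumption: ksh or bash on a Solaris 10 node; degrades gracefully elsewhere.
zone_check() {
  if command -v zonename >/dev/null 2>&1; then
    # Prints "global" outside a container, otherwise the zone's name.
    zonename
    # Lists configured zones; the full list is only visible from the global zone.
    /usr/sbin/zoneadm list -cv
  else
    echo "zonename not available on this host"
  fi
}

zone_check
```

If this prints only "global" plus a single global entry on both nodes, I would assume that no containers are configured, but corrections are welcome.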
Thank you for reading

