V240 server pauses 30s, good for 50s, repeats
Load averages reported by "uptime" show 5-12, versus the .01 we
normally see. This happens over and over for 24+ hrs and then suddenly goes back to normal for 2 or more hrs.
While the server is hung for 30s, I can still ping both Ethernet addresses from an external machine. I have a background "sar -o sar.out 5 9999999" command running all the time to record CPU utilization every 5s. To view the results, run "sar -f sar.out".
There are 30s periods where no updates happen, followed by updates every 5s showing activity. Then it repeats.
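The 30s holes in the sar data can be picked out mechanically rather than by eye. Below is a minimal sketch, assuming each sample line from "sar -f sar.out" starts with an HH:MM:SS timestamp followed by numeric columns. The function name find_gaps and the 10s threshold in the usage line are hypothetical choices, not anything from the original setup. On older Solaris releases the default awk may not accept -v, in which case nawk would be the thing to try.

```shell
# find_gaps LIMIT: read sar-style lines on stdin and report every pair of
# consecutive samples whose timestamps are more than LIMIT seconds apart.
# (find_gaps is a hypothetical helper name used only for this sketch.)
find_gaps() {
    awk -v limit="$1" '
        # Only consider lines that look like "HH:MM:SS <number> ...";
        # this skips banner, column-header and Average lines.
        $1 ~ /^[0-9][0-9]:[0-9][0-9]:[0-9][0-9]$/ && $2 ~ /^[0-9]/ {
            split($1, t, ":")
            now = t[1] * 3600 + t[2] * 60 + t[3]
            if (seen) {
                delta = now - prev
                if (delta < 0) delta += 86400   # sample crossed midnight
                if (delta > limit) print prevstamp, "->", $1, "gap", delta "s"
            }
            prev = now; prevstamp = $1; seen = 1
        }
    '
}
```

Usage would be something like "sar -f sar.out | find_gaps 10", which prints one line per hole longer than 10s between two consecutive samples.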
/var/adm/messages shows nothing unusual.
Sybase is running on the machine. 2.5 GB of memory. swap -s shows only
about 500 MB in use. Have requested that the local site folks look at the
console for adverse messages but haven't heard back. Am pretty sure the ALOM isn't hooked up, so the console cannot be viewed remotely.
prtdiag doesn't appear to show anything unusual. Help!
# 1
Figured this out partially. Turns out that every 50 seconds a program was
writing 8000 chars to /dev/console. The 8000 chars consisted of a 32
char message followed by a null and \n, repeated roughly 256 times.
Was able to duplicate the problem by just running something like
"cat bigfile >/dev/console". If bigfile is small, the outage lasts a few
seconds. As the file gets bigger, the outage gets longer.
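For anyone wanting to repeat the test, here is a rough sketch of how a similar payload could be built and the write timed. It assumes a POSIX shell and a date command that understands +%s (older Solaris date may not, in which case "time cat bigfile >/dev/console" is an alternative). The helper names make_payload and time_write and the filler message are made up for illustration; the real program's message text is not known.

```shell
# make_payload COUNT: emit COUNT records, each a 32-character filler message
# followed by a NUL byte and a newline (34 bytes per record).
# The message text is a placeholder, not the real program's output.
make_payload() {
    i=0
    while [ "$i" -lt "$1" ]; do
        printf '0123456789abcdef0123456789abcdef\0\n'
        i=$((i + 1))
    done
}

# time_write FILE TARGET: copy FILE to TARGET and print the elapsed time
# in whole seconds. Assumes "date +%s" is available.
time_write() {
    start=$(date +%s)
    cat "$1" > "$2"
    end=$(date +%s)
    echo $((end - start))
}
```

With 256 records the payload is 8704 bytes, close to the roughly 8000 chars seen here. Something like "make_payload 256 > bigfile; time_write bigfile /dev/console" as root would then report how long the write blocked. Only try that on a box you can afford to stall.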
If the superuser runs an xconsole window, there are no problems. It seems that
at some very low level of the OS or the hardware, there is congestion
or a timeout when a write is done to the console but nothing is "connected".
Server is at remote location and getting those folks to do things is hard
so I'm still unsure if this is a problem induced by the KVM that is actually
attached to the monitor port, or if it is instead a Solaris configuration
problem. Also noticed that a command like "stty -a </dev/console" fails
with an "Invalid argument" error. Since it's not my application software causing
the problem, I've handed the task of fixing it to the local site SysAdmin.