uninspectable processes on Solaris 10
All -
We are running Solaris 10, patch level Generic_118833-20, in a fairly complex system that includes MySQL 5.0.22 and several other applications (Perl, Java) that access the MySQL server. Recently, on a high-powered Sun machine (4 to 8 cores, lots of memory), we have regularly hit situations where one of these processes becomes unkillable and also impossible to inspect: invoking "ps" or "prstat" on the unkillable process, or even trying to list /proc/<pid> (where <pid> is the unkillable process), hangs and does not respond to Ctrl-C. All you can do is kill the shell.
In many of the cases, mysqld has been the process that hung; in other cases, it has been a process that connects to the MySQL server (but is not necessarily connected to the server at the time). My intuition is that this has to be a problem in Solaris 10 itself, since I have never seen userland code cause a process to become uninspectable, but obviously I could be wrong about this. The folks who are supporting us at Sun are pushing for this to be a MySQL problem. Has anyone ever seen anything like this, and could it possibly be a MySQL problem alone?
This problem appeared about two weeks after we originally configured the machine, and can reliably be reproduced after the machine has run for somewhere between a couple of hours and one day.
Thanks in advance -
Sam Bayer
The MITRE Corporation
sam@mitre.org

