Apps freeze after 2 days
Problem:
Applications run nicely and consistently for a few days (at least 2) then just seem to freeze for a reason we can't figure out. Error logs are not indicating any problems.
Needed:
Debugging tool that can attach itself to a running process and let us know what is going on. (I tried jdb, but can't seem to get it to attach itself to a process... get some kind of socket error)
Application info:
We have a couple of applications running that use similar classes. They run with the wrapper startStopApp so they can run as daemons and keep running until told to stop (at which time they finish what they are doing, clean up, and exit nicely). Each application acts as some kind of relay, either from a data server to our database server or vice versa. The applications report any lost connections, mention reconnect attempts every so often, and are coded to dump any Java exception stack traces to an error log file.
Possible problems(?):
1. Something wrong with the JVM
2. Something going wrong with the Garbage collector (although the only issues I've read related to this only involve a few minutes' delay)
3. Memory (although system diagnostics don't indicate anything wrong in this department, and we haven't had any OutOfMemory exceptions)
4. network glitches
System info:
Redhat advanced server 4.0
kernel 2.6.9
java 1.4.2
> Problem:
> Applications run nicely and consistently for a few
> days (at least 2) then just seem to freeze for a
> reason we can't figure out. Error logs are not
> indicating any problems.
Whatever the reason for the freeze is (even a JVM bug), do you have functional input/output logs, and do they show a reproducible pattern before the freezes?
I remember facing such a freeze issue; although the freeze itself was due to a JVM bug, that JVM bug was only triggered by a bug in our own application, which we identified after analyzing the reproducible traffic pattern we got before each freeze occurrence.
We corrected our bug and got rid of the freezes long before the JVM itself was patched - but we did spend a few months sweating, with little confidence in the JVM.
> Needed:
> Debugging tool that can attach itself to a running
> process and let us know what is going on.
If the whole application really freezes, and you don't know when it will freeze, you won't get much by attaching a debugger: before it freezes, you may not notice anything special; after it has frozen, well, you will probably not be able to step into the frozen part.
> Possible problems(?):
> 1. Something wrong with the JVM
> 2. Something going wrong with the Garbage collector
> (although the only issues I've read related to this
> only involve a few minutes' delay)
> 3. Memory (although system diagnostics don't indicate
> anything wrong in this department, and we haven't had
> any OutOfMemory exceptions)
> 4. network glitches
What about:
5. application bug (long-running process, thread starvation, deadlock, whatever)?
What makes you dismiss it?
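One cheap way to tell "stuck in our own code" apart from "the whole JVM is gone" is to have each relay thread record a heartbeat and let a separate monitor thread log when a worker goes silent. The following is only a sketch under assumptions: the class name Heartbeat, the thread name "relay" and the 300 ms threshold are made up for illustration and are not taken from your applications. It is written to avoid language features newer than Java 1.4.

```java
/**
 * Hypothetical helper (not part of the original applications).
 * A worker calls beat() after each unit of work; a monitor thread
 * polls isStalled() and logs when the worker has been silent too long.
 */
class Heartbeat {
    private final String name;
    private long lastBeatMillis;

    Heartbeat(String name, long nowMillis) {
        this.name = name;
        this.lastBeatMillis = nowMillis;
    }

    /** Called by the worker thread whenever it makes progress. */
    synchronized void beat(long nowMillis) {
        lastBeatMillis = nowMillis;
    }

    /** Time elapsed since the last recorded progress. */
    synchronized long silenceMillis(long nowMillis) {
        return nowMillis - lastBeatMillis;
    }

    /** True when the worker has been silent for longer than the threshold. */
    boolean isStalled(long nowMillis, long maxSilenceMillis) {
        return silenceMillis(nowMillis) > maxSilenceMillis;
    }

    String getName() {
        return name;
    }

    public static void main(String[] args) throws InterruptedException {
        final Heartbeat relay = new Heartbeat("relay", System.currentTimeMillis());

        // Simulated worker: beats three times, then stops making progress.
        Thread worker = new Thread(new Runnable() {
            public void run() {
                for (int i = 0; i < 3; i++) {
                    relay.beat(System.currentTimeMillis());
                    try {
                        Thread.sleep(100);
                    } catch (InterruptedException e) {
                        return;
                    }
                }
            }
        });
        worker.setDaemon(true);
        worker.start();

        // Monitor loop: in a real daemon this would run in its own thread
        // and write to the application's log file instead of System.err.
        for (int i = 0; i < 10; i++) {
            Thread.sleep(100);
            long now = System.currentTimeMillis();
            if (relay.isStalled(now, 300)) {
                System.err.println(relay.getName() + " silent for "
                        + relay.silenceMillis(now) + " ms");
            }
        }
    }
}
```

If the monitor keeps logging while a relay thread is silent, the JVM as a whole is still alive and the problem is more likely in the application (or in whatever that thread is blocked on). If the monitor goes quiet too, that points back toward the JVM, the GC, or the machine.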
My reflex is the following:
App freeze -> thread dumps (several consecutive thread dumps, to see if threads evolve).
The thread dumps will tell you where (class, method, and line number) the app threads are. If they do not evolve over a few seconds, or minutes, then the threads are really "frozen".
Note that if the problem is 1) a JVM bug, you may not even be able to issue a thread dump...
But if you do succeed in getting thread dumps, and see nothing wrong with your code, keep in mind that the JVM vendor may be able to tell, based on the thread pattern, if it's a known JVM bug.
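To make "consecutive thread dumps" concrete: on Sun's JVM on Linux, sending SIGQUIT to the process (kill -QUIT followed by the pid) prints a thread dump to the JVM's standard output, wherever your wrapper sends that. The comparison between two dumps can also be sketched in code. Note the assumption: Thread.getAllStackTraces() only exists from Java 5 onward, so this will not run on 1.4.2 as-is, and the class name ThreadDumpDiff and the 2-second interval are mine, purely for illustration.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical helper: take two in-process thread snapshots and compare them. */
class ThreadDumpDiff {

    /** Renders every live thread's name, state and stack into a string. */
    static Map<Thread, String> snapshot() {
        Map<Thread, String> dump = new HashMap<Thread, String>();
        for (Map.Entry<Thread, StackTraceElement[]> e
                : Thread.getAllStackTraces().entrySet()) {
            Thread t = e.getKey();
            StringBuilder sb = new StringBuilder();
            sb.append('"').append(t.getName()).append("\" ")
              .append(t.getState()).append('\n');
            for (StackTraceElement frame : e.getValue()) {
                sb.append("    at ").append(frame).append('\n');
            }
            dump.put(t, sb.toString());
        }
        return dump;
    }

    /** Returns the threads whose rendered stack is identical in both snapshots. */
    static Map<Thread, String> unchanged(Map<Thread, String> first,
                                         Map<Thread, String> second) {
        Map<Thread, String> same = new HashMap<Thread, String>();
        for (Map.Entry<Thread, String> e : first.entrySet()) {
            if (e.getValue().equals(second.get(e.getKey()))) {
                same.put(e.getKey(), e.getValue());
            }
        }
        return same;
    }

    public static void main(String[] args) throws InterruptedException {
        Map<Thread, String> first = snapshot();
        Thread.sleep(2000); // illustrative interval between the two dumps
        Map<Thread, String> second = snapshot();

        // Threads that did not move between the two samples.
        for (String stack : unchanged(first, second).values()) {
            System.out.println(stack);
        }
    }
}
```

An unchanged stack is only a hint, not proof: a thread legitimately blocked in a socket read or waiting on a queue looks exactly the same in both samples. What you are looking for is application threads sitting at the same line of your own code across several dumps when they should have been making progress.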