CMS very long pauses

We are using JRE1.4._10 on Windows with large heaps

min=max=1400M

We are also using CMS in order to reduce pauses.

Two problems are experienced:

On startup we see a GC pause which takes more than an hour (it happens once in every four restarts).

Under heavy load, the application pauses for 2-3 minutes from time to time.

Memory is not depleted; we have at least 200M free.

(Young generation is 80MB)

Any ideas why this happens and how it can be solved?

[512 byte] By [haimya] at [2007-10-3 11:45:16]
# 1

> We are using JRE1.4._10 on Windows with large heaps

> min=max=1400M

> We are also using CMS in order to reduce pauses.

> (Young generation is 80MB)

So, you're running with something like: -server -XX:+UseConcMarkSweepGC -XX:+UseParNewGC -Xms1400m -Xmx1400m -Xmn80m

> On startup we see a GC pause which takes more than an hour (it happens once in every four restarts)

This makes no sense at all.

How can there be anything to GC when the JVM is only just starting up?

> Under heavy load, the application pauses for 2-3 minutes from time to time.

http://java.sun.com/performance/reference/whitepapers/tuning.html#section4.2.6

http://blogs.sun.com/jonthecollector/entry/what_the_heck_s_a

tschodta at 2007-7-15 14:17:35 > top of Java-index,Java HotSpot Virtual Machine,HotSpot Internals...
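
A quick way to confirm that heap settings such as -Xms1400m -Xmx1400m actually took effect is to ask the running JVM. The sketch below is only an illustration and not something posted in this thread: the class name HeapReport and its helper methods are made up. It uses only java.lang.Runtime, which is available on 1.4. The reported maximum can come out slightly lower than the -Xmx value.

```java
public class HeapReport {
    // Converts a byte count to whole megabytes (1 MB = 1024 * 1024 bytes).
    public static long toMegabytes(long bytes) {
        return bytes / (1024L * 1024L);
    }

    // Builds a one-line summary of the heap as the running JVM sees it.
    public static String describeHeap() {
        Runtime rt = Runtime.getRuntime();
        long max = rt.maxMemory();     // upper bound the heap may grow to
        long total = rt.totalMemory(); // heap currently committed
        long free = rt.freeMemory();   // unused part of the committed heap
        return "max=" + toMegabytes(max) + "M"
             + " committed=" + toMegabytes(total) + "M"
             + " free=" + toMegabytes(free) + "M";
    }

    public static void main(String[] args) {
        System.out.println(describeHeap());
    }
}
```

With min=max, the "max" and "committed" figures should be close to each other right after startup.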
# 2

How much physical memory do you have on your machine? If you reboot your machine and then start up the application, do you ever see the 1 hour long pause? Can you post some of the output with -XX:+PrintGCDetails turned on? Is there anything else running on the machine or is it just this application?

jon999a at 2007-7-15 14:17:35
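
Once a log from -XX:+PrintGCDetails exists, the long pauses can be picked out mechanically. Here is a minimal sketch, assuming the usual HotSpot line shape that ends in "N.NNN secs]". The class name GcLogScan, the 60-second threshold and the sample lines in main are all hypothetical; they are not output from the poster's server. Lines containing "CMS-concurrent" are skipped on the assumption that those phases run alongside the application rather than stopping it. The code uses generics, so it needs Java 5 or later.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class GcLogScan {
    // Matches a duration such as "0.2300771 secs]".
    private static final Pattern SECS = Pattern.compile("(\\d+\\.\\d+) secs\\]");

    // Returns the last duration found on the line, or -1.0 if there is none.
    // With -XX:+PrintGCDetails the last figure is taken to be the total.
    public static double durationSeconds(String line) {
        Matcher m = SECS.matcher(line);
        double last = -1.0;
        while (m.find()) {
            last = Double.parseDouble(m.group(1));
        }
        return last;
    }

    // Keeps only the lines whose duration is at least thresholdSeconds.
    public static List<String> longPauses(List<String> lines, double thresholdSeconds) {
        List<String> hits = new ArrayList<String>();
        for (String line : lines) {
            if (line.contains("CMS-concurrent")) {
                continue; // assumed to be a concurrent phase, not a pause
            }
            if (durationSeconds(line) >= thresholdSeconds) {
                hits.add(line);
            }
        }
        return hits;
    }

    public static void main(String[] args) {
        // Made-up sample lines, for illustration only.
        List<String> sample = Arrays.asList(
            "[GC 325407K->83000K(776768K), 0.2300771 secs]",
            "[CMS-concurrent-mark: 3.250/3.410 secs]",
            "[Full GC 1400000K->1200000K(1433600K), 152.4031000 secs]");
        for (String hit : longPauses(sample, 60.0)) {
            System.out.println(hit);
        }
    }
}
```

In real use the lines would be read from the log file rather than hard-coded, and the threshold lowered until the problem pauses show up.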
# 3

Some more info

1. We have 4GB of physical memory (Windows 2003 Enterprise Edition, with two Xeon CPUs).

2. I am familiar with these articles and I know that we should expect full stop-the-world GCs from time to time.

I do not understand why it takes so long to finish. Without CMS, a GC takes at most 20 seconds. If CMS fails, I would expect about the same time.

3. We are loading a big cache and performing a lot of computation at startup, so maybe this is why it can happen.

4. When not using CMS we do not experience the 1-hour freeze at startup.

5. When using -XX:+CMSParallelRemarkEnabled we do not see the freeze anymore, but the server crashes from time to time.

haimya at 2007-7-15 14:17:35
# 4

There have been a lot of bug fixes for GCs that take too long. Furthermore, the stability of all the parallel collectors was vastly improved in Java 5/6.

Are you using large reference arrays (e.g. String[])? There have also been bug fixes in that area, and it would explain your problem.

1.) Java 1.4 is really old. Since Java 6 was just released and is mostly backward compatible, I would recommend giving it a try.

You'll definitely see much better stability (because some long-standing implementation problems have been fixed) and higher performance (in the range of 10-20% on our servers). We've not seen any incompatibility when migrating from 1.4.2 to 5.0 and just recently to 6.0. I would just give it a try.

2.) We've moved all our high-load Tomcat/Weblogic servers onto Linux-2.6.18 (Opteron boxes), from Windows 2000 Advanced Server. For good reasons ;)

Regards, Clemens

linuxhippya at 2007-7-15 14:17:35
# 5
It's not clear to me what type of GC pauses are the long ones. Can you turn on -XX:+PrintGCDetails and post the gc log for the problem pauses? You say that you don't see the problem without cms. Which collector are you using in that case?
jon999a at 2007-7-15 14:17:35
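
To help tell which pauses are the long ones, it can be useful to timestamp stalls from inside the application and match them against the GC log. What follows is a rough sketch, not anything posted in this thread: PauseProbe and its method names are hypothetical, and it assumes Java 5 or later for System.nanoTime. It sleeps for a fixed interval and measures how late each wake-up is. A large overshoot suggests the whole JVM (or the machine) was stalled, though not necessarily by GC.

```java
public class PauseProbe {
    // How many milliseconds later than expected a sleep returned.
    // Never negative: an early or on-time wake-up counts as zero.
    public static long overshootMillis(long intervalMillis, long startNanos, long endNanos) {
        long elapsedMillis = (endNanos - startNanos) / 1000000L;
        return Math.max(0L, elapsedMillis - intervalMillis);
    }

    // Sleeps 'rounds' times for 'intervalMillis' and returns the worst overshoot.
    public static long worstOvershootMillis(long intervalMillis, int rounds)
            throws InterruptedException {
        long worst = 0L;
        for (int i = 0; i < rounds; i++) {
            long start = System.nanoTime();
            Thread.sleep(intervalMillis);
            long end = System.nanoTime();
            worst = Math.max(worst, overshootMillis(intervalMillis, start, end));
        }
        return worst;
    }

    public static void main(String[] args) throws InterruptedException {
        // 50 rounds of 100 ms: roughly a five-second sample.
        long worst = worstOvershootMillis(100L, 50);
        System.out.println("worst overshoot: " + worst + " ms");
    }
}
```

Run in a background thread of the server, the same loop could log the wall-clock time of every overshoot above a few seconds, which gives a list of stall times to compare with the gc log.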