JDK1.5.0_06 -XX:SurvivorRatio setting is ignored?
We upgraded the JVM from 1.4.2_08 to 1.5.0_06, and we are using the traditional copying collector. I found that the -XX:SurvivorRatio setting is ignored by the new VM. Here are my settings and the output from jstat:
JAVA_OPTIONS : -server -Xms1536m -Xmx1536m -XX:SurvivorRatio=2 -XX:NewSize=512m -XX:MaxNewSize=512m -XX:PermSize=80m -verbose:gc -XX:+PrintGCTimeStamps -XX:+PrintGCDetails -XX:+PrintHeapAtGC -XX:+DisableExplicitGC
$ jstat -gccapacity 22700
NGCMN NGCMX NGC S0C S1C EC OGCMN OGCMX OGC OC PGCMN PGCMX PGC PC YGC FGC
524288.0 524288.0 524288.0 31936.0 32064.0 460096.0 1048576.0 1048576.0 1048576.0 1048576.0 81920.0 81920.0 81920.0 81920.0 5591
I searched Sun's website and Google for a while but didn't find a similar report. Does anyone know anything about this? Is it a bug, or part of the new ergonomics in the 1.5 VM? Is there a new setting that replaces this one? I don't like the current sizing because it promotes objects to the tenured space prematurely, which results in frequent full GCs. Also, because the survivor spaces are far too small, the tenured space gets to 99% before a full GC kicks in, and for some reason that causes a long pause of 200+ seconds. If I drop the max heap size to 1GB, a full GC takes only 5 seconds. I believe swapping caused the long pause, since the tenured space is too full and the system doesn't have room for the memory a full GC requires.
Many thanks,
- Sophia
[1509 byte] By [sophiaxa] at [2007-10-2 22:54:24]

Hello Sophia!
This is interesting... If you use the CMS collector (-XX:+UseConcMarkSweepGC), everything works fine with 1.5.0_06: the survivor spaces are 128m each. But with the default collector, SurvivorRatio is ignored.
If you use the default collector, you can set -XX:-UseAdaptiveSizePolicy. This disables the adaptive sizing of the survivor spaces, so one would expect them to be 128m now. In fact they are only 64m, no matter whether SurvivorRatio is 1, 2 or 4. This looks like a bug. Or is 64m, for some reason, the maximum survivor size for the default collector? I don't know...
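For reference, the 128m figure follows from the usual reading of SurvivorRatio as the eden-to-survivor ratio: the young generation holds eden plus two survivor spaces, so each survivor space is NewSize / (SurvivorRatio + 2). The sketch below only does that arithmetic and compares it with the jstat figures posted above (EC=460096.0, S0C=31936.0); the class and method names are made up for illustration, and this is not HotSpot's actual sizing code.

```java
// Sketch only: classic young-generation layout arithmetic, not HotSpot's sizing code.
final class SurvivorMath {
    /** Size of ONE survivor space in KB when eden : survivor = ratio : 1. */
    static long survivorKb(long youngKb, int survivorRatio) {
        // Young gen = eden + 2 survivors = (ratio + 2) survivor-sized units.
        return youngKb / (survivorRatio + 2);
    }

    /** Eden-to-survivor ratio implied by observed jstat capacities. */
    static double impliedRatio(double edenKb, double survivorKb) {
        return edenKb / survivorKb;
    }

    public static void main(String[] args) {
        long youngKb = 512L * 1024; // -XX:NewSize=512m
        System.out.println(survivorKb(youngKb, 2) / 1024 + " MB expected per survivor space");
        // EC and S0C taken from the jstat -gccapacity output in the first post.
        System.out.printf("observed eden/survivor ratio: %.1f%n",
                impliedRatio(460096.0, 31936.0));
    }
}
```

With SurvivorRatio=2 this gives 128 MB per survivor space, while the observed capacities correspond to a ratio of roughly 14, which is why the setting appears to be ignored.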
Nick.
Hi Nick,
Thanks for the reply. I think it's a bug, and I'm surprised there's no report of it yet. Anyhow, it forces me to revisit the alternative collectors. I'm using a Sun Fire V210 with 2 CPUs and 2GB of RAM. We didn't have any luck with the newer collectors and the new tuning parameters in the previous release (JDK 1.4.2_08). I found that over a long run, the copying collector gives the most stable and reliable throughput as long as you know exactly how to tune it. At least that's true on a 2-CPU box and for the type of applications we are developing (Tomcat/Struts). Have you used the new collector in 1.5.0? What's your experience with it?
Thanks and Best Regards,
- Sophia
I think the key here is that you are running on a 2 CPU box with 2GB of memory. That means you qualify as a "server class machine", and the default collector (i.e., the one you get because you didn't specify one) changes from -XX:+UseSerialGC in JDK-1.4.2 to -XX:+UseParallelGC in JDK-1.5.0. (This is clearly explained in the JDK-1.5.0 release notes, but no one reads those. :-) You can confirm that the collector has changed by running with -XX:+PrintGCDetails: you should see "DefNew" in JDK-1.4.2 and "PSYoungGen" in JDK-1.5.0. The parallel scavenge collector is almost certainly a better choice on your box. Please let us know how it works for you.
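Besides reading the GC log, the collector in use can also be checked from inside the running VM through the java.lang.management API that arrived with JDK 1.5. The sketch below just lists the collector names the VM reports; the class name is hypothetical, and the exact name strings are implementation-specific (on HotSpot the parallel collector typically reports names beginning with "PS", and the serial young collector reports "Copy").

```java
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;
import java.util.ArrayList;
import java.util.List;

// Sketch only: prints whatever collector names this JVM reports.
final class WhichCollector {
    /** Returns the names of all garbage collectors registered in this VM. */
    static List<String> collectorNames() {
        List<String> names = new ArrayList<String>();
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            names.add(gc.getName());
        }
        return names;
    }

    public static void main(String[] args) {
        // The strings are implementation-specific, so treat them as a hint,
        // not as a stable identifier.
        System.out.println(collectorNames());
    }
}
```

Running it once with -XX:+UseSerialGC and once without any collector flag should make the change of default visible on a server-class machine.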
The issue is that the -XX:+UseParallelGC collector uses -XX:InitialSurvivorRatio= and -XX:MinSurvivorRatio= rather than -XX:SurvivorRatio= to control the size of the survivor spaces. On the other hand, the -XX:+UseParallelGC collector includes "ergonomics" to adjust the size of the survivor spaces to give you better throughput and shorter pauses. Ergonomics is also described in the JDK-1.5.0 release notes, and in the garbage collector white paper at http://java.sun.com/j2se/reference/whitepapers/memorymanagement_whitepaper.pdf.
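Whichever flags end up controlling the survivor spaces, their effect can be checked from inside the VM rather than inferred from the command line. The following is a minimal sketch using the standard MemoryPoolMXBean interface; the class name is made up, and the pool names it prints (for example a survivor pool prefixed with "PS" under the parallel collector) depend on the VM and collector.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.MemoryPoolMXBean;
import java.lang.management.MemoryUsage;
import java.util.LinkedHashMap;
import java.util.Map;

// Sketch only: reports the committed size of every memory pool in this VM.
final class PoolSizes {
    /** Maps each memory pool name to its currently committed size in KB. */
    static Map<String, Long> committedKb() {
        Map<String, Long> sizes = new LinkedHashMap<String, Long>();
        for (MemoryPoolMXBean pool : ManagementFactory.getMemoryPoolMXBeans()) {
            MemoryUsage usage = pool.getUsage();
            if (usage != null) { // getUsage() returns null for an invalid pool
                sizes.put(pool.getName(), usage.getCommitted() / 1024);
            }
        }
        return sizes;
    }

    public static void main(String[] args) {
        for (Map.Entry<String, Long> e : committedKb().entrySet()) {
            System.out.println(e.getKey() + ": " + e.getValue() + " KB committed");
        }
    }
}
```

Comparing the survivor and eden figures across runs with different survivor settings shows which flags the collector actually honours; with adaptive sizing enabled the numbers may also change while the application runs.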
Thanks very much for shedding light on this. I didn't give the "PS" in the logs a second thought; I assumed it was something new in the 1.5 logging format. I remember that when the parallel GC was enabled, there were more differences in the GC detail printout, but I don't see anything else in the log that looks different from the serial GC. In the performance graphs I collected from the test, the GCs were still stop-the-world pauses. And for some reason, with a max heap of 1.5GB a full GC took around 200 seconds, while with a max heap of 1GB it took around 5 seconds. We didn't experience this with the serial collector in the previous releases. Could it be that when the survivor spaces were big, a full GC was triggered before the tenured space was totally full, whereas now, by the time the garbage in the tenured space reaches the threshold for a full GC, the system runs out of RAM to handle it? Or is it because the parallel GC threads are fighting with the application threads for CPU cycles to do the cleanup? Any ideas? If I can lower the pause times and achieve better throughput, I will certainly stick with the ergonomics offered by default in the 1.5 VM.
I will run more tests to compare the two collectors and let you know the results. Thanks again for your help!
- Sophia
Hello Sophia!
If GCs suddenly take 200 seconds instead of only a few seconds when you increase your heap, the reason might be that physical memory is running short and part of the Java heap has been paged out. Since you only have 2GB of memory, setting Xmx to 1.5GB means you're using most of it for the JVM. When memory gets short, the OS pages out some of the Java heap (mostly the regions that have already become garbage). When the GC then kicks in, it iterates over all objects, so the OS has to page the heap back in, which means reading it from the hard disk. And this of course takes some time.
To verify that paging is the cause, you can use vmstat to monitor memory usage while your application is running. On Solaris, use e.g. "vmstat -p 2". When memory becomes short you will see a lot of "apo" (anonymous page-outs), and when the GC kicks in you will see a lot of "api" (anonymous page-ins).
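As a rough back-of-the-envelope check of the memory pressure described above, the sketch below adds the heap and permanent generation sizes from the original post and expresses them as a share of the 2GB of RAM. It is a lower bound only: it assumes the perm gen stays at its 80m initial size and ignores thread stacks, the code cache, native allocations and everything else running on the box. The class and method names are hypothetical.

```java
// Sketch only: lower-bound estimate of JVM memory demand versus physical RAM.
final class FootprintEstimate {
    /** Percentage of physical RAM taken by heap plus permanent generation. */
    static long percentOfRam(long heapMb, long permMb, long ramMb) {
        return Math.round(100.0 * (heapMb + permMb) / ramMb);
    }

    public static void main(String[] args) {
        long ramMb = 2048; // 2GB machine from the thread
        long permMb = 80;  // -XX:PermSize=80m, assumed not to grow
        System.out.println("1536m heap: " + percentOfRam(1536, permMb, ramMb) + "% of RAM");
        System.out.println("1024m heap: " + percentOfRam(1024, permMb, ramMb) + "% of RAM");
    }
}
```

Under these assumptions the 1.5GB heap alone claims about 79% of physical memory against about 54% for the 1GB heap, which is consistent with paging showing up only in the larger configuration.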
Nick.