Incremental Garbage Collector is halting my server for MINUTES at a time.

I have a Java server which services hundreds (currently 300-700 on average, target of 2000+) of simultaneous client connections, with a staggering number of long lived (but not permanent) objects.

The docs for incremental garbage collection state: "The incremental garbage collector, which is off by default, will eliminate occasional garbage-collection pauses during program execution." This is NOT true in my case. During peak load, the Server halts occasionally (once an hour or more) and entire MINUTES tick by. The average wait is 2.5 to 3.5 minutes; the longest I have seen is 4 minutes, 10 seconds. This is entirely unacceptable.

The server is on Red Hat Linux 6.2, kernel 2.2.14-5.0, with a gig of RAM. My current command line options are

java -server -Xincgc -Xms256M -Xmx900M

And I have just added -verbose:gc to help analyze the gc performance. I have read the gc tuning guide at http://java.sun.com/docs/hotspot/gc/index.html but still feel rather clueless about what is the optimum setup for my particular application.
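To make the -verbose:gc output easier to read, here is a minimal sketch that scans it for the longest pause. It assumes lines shaped like `[GC 325407K->83000K(776768K), 0.2300771 secs]` and `[Full GC 267628K->83769K(776768K), 1.8479984 secs]`; the exact format varies between JVM versions and collectors (output under -Xincgc may look different), so the pattern will likely need adjusting. The class name `GcLogLine` and its methods are hypothetical, and `java.util.regex` needs Java 1.4 or later.

```java
import java.io.BufferedReader;
import java.io.InputStreamReader;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class GcLogLine {
    // Assumed line shape: [GC <before>K-><after>K(<total>K), <seconds> secs]
    private static final Pattern LINE = Pattern.compile(
        "\\[(Full GC|GC) (\\d+)K->(\\d+)K\\((\\d+)K\\), (\\d+\\.\\d+) secs\\]");

    /** Pause length in seconds, or -1 if the line does not match the assumed format. */
    public static double pauseSeconds(String line) {
        Matcher m = LINE.matcher(line);
        return m.find() ? Double.parseDouble(m.group(5)) : -1.0;
    }

    /** True if the line records a full collection rather than a minor one. */
    public static boolean isFullGc(String line) {
        Matcher m = LINE.matcher(line);
        return m.find() && m.group(1).equals("Full GC");
    }

    /** Reads GC log lines from standard input and reports the longest pause found. */
    public static void main(String[] args) throws Exception {
        BufferedReader in = new BufferedReader(new InputStreamReader(System.in));
        double worst = 0.0;
        String line;
        while ((line = in.readLine()) != null) {
            double pause = pauseSeconds(line);
            if (pause > worst) {
                worst = pause;
            }
        }
        System.out.println("longest pause: " + worst + " secs");
    }
}
```

Redirecting the server's standard output to a file and feeding that file to this class would show whether the multi-minute halts line up with full collections.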

I will of course start experimenting, but I was hoping to find a "wise old elf" who might give some useful pointers to accelerate the process. The Server is already running in production, so time is critical.

[1307 byte] By [Warith] at [2007-9-26 13:04:32]
# 1

You are allowing the JVM to use 900 MB of memory. Is this really necessary? If your application really needs several hundred megabytes less than this maximum, then garbage will slowly fill up the gap. When garbage collection finally starts in order to free up memory, it will have to go through these several hundred megabytes of garbage. That takes time.
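One way to see how large that gap is would be to log heap usage from inside the server. This is only a minimal sketch: the class name `HeapSnapshot` and its helper methods are hypothetical, and `Runtime.maxMemory()` requires Java 1.4 or later. Note that the "used" figure includes garbage that has not been collected yet, so the value right after a full collection is the best estimate of what the application really needs.

```java
public class HeapSnapshot {
    /** Bytes of the current heap occupied by objects, live or not yet collected. */
    public static long usedBytes(long totalBytes, long freeBytes) {
        return totalBytes - freeBytes;
    }

    /** Bytes the heap can still grow by before reaching the -Xmx limit. */
    public static long headroomBytes(long maxBytes, long usedBytes) {
        return maxBytes - usedBytes;
    }

    public static void main(String[] args) {
        Runtime rt = Runtime.getRuntime();
        long mb = 1024L * 1024L;
        // totalMemory() is the heap currently reserved; freeMemory() is the unused part of it.
        long used = usedBytes(rt.totalMemory(), rt.freeMemory());
        long headroom = headroomBytes(rt.maxMemory(), used);
        System.out.println("used=" + (used / mb) + "M"
            + " headroom=" + (headroom / mb) + "M"
            + " max=" + (rt.maxMemory() / mb) + "M");
    }
}
```

If the figure after a full collection stays far below the maximum, lowering -Xmx may be worth testing.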

tychoS at 2007-7-2 12:58:36 > top of Java-index,Java HotSpot Virtual Machine,Specifications...
# 2
I'm afraid it is necessary. :( And if the machine had more physical RAM, I would be allocating even more. This Server has to scale to thousands of users, each with possibly hundreds of kilobytes of data.
Warith at 2007-7-2 12:58:36
# 3

Are you using a Java application server like Tomcat, Dynamo, WebLogic etc. ?

In that case, consider running several server instances on the machine behind the application server's software load balancer. Find the heap size per server instance at which a garbage collection run takes only a couple of seconds, and don't allocate more than that to each instance. Then start as many instances as you have available RAM for.

This is the approach recommended by application server vendors such as BEA (http://edocs.bea.com/wls/docs61/perform/JVMTuning.html) and ATG.

A side benefit of this approach is that in case you get more concurrent users than your computer can handle, you already have the setup for spreading the load over several computers.
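The sizing arithmetic can be sketched as follows. All numbers here are made-up placeholders, not measurements: the RAM reserved for the operating system and the per-instance footprint (heap plus JVM overhead) have to be measured on the real machine, and the class name `InstancePlanner` is hypothetical.

```java
public class InstancePlanner {
    /**
     * Number of server instances that fit in memory.
     *
     * @param physicalMb    total RAM on the machine, in MB
     * @param reservedMb    RAM left for the OS and other processes (assumed value)
     * @param perInstanceMb heap plus JVM overhead for one instance (assumed value)
     */
    public static int instanceCount(int physicalMb, int reservedMb, int perInstanceMb) {
        if (perInstanceMb <= 0) {
            throw new IllegalArgumentException("perInstanceMb must be positive");
        }
        int availableMb = physicalMb - reservedMb;
        // Integer division rounds down, so a partial instance is never counted.
        return availableMb <= 0 ? 0 : availableMb / perInstanceMb;
    }

    public static void main(String[] args) {
        // Hypothetical example: 1024 MB machine, 224 MB reserved, 200 MB per instance.
        System.out.println(instanceCount(1024, 224, 200) + " instances");
    }
}
```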

tychoS at 2007-7-2 12:58:36
# 4
I'm not using an application server, but thanks for that link; there is some helpful advice on tuning there! :)
Warith at 2007-7-2 12:58:36