Thanks for the information about the HotSpotDiagnosticMXBean.
However, for this set of processes I'd prefer not to start the whole JMX machinery.
1) Is there a supported (or unsupported) way to get a dump through the Attach API?
2) Are you working on making dumping quicker?
3) Is there some information on the built-in dumper in general? It operates without the JVMTI overhead, correct? Is it a stop-the-world operation (does it freeze all JVM threads)? Is it I/O bound in current implementations? Can we expect any future speedups, perhaps through parallel reference graph analysis and/or concurrent dumping (while the application is running)?
Apart from these questions, I'd like to say that I'm quite impressed by the progress of the whole "observability" aspect of the platform. Hats off to you, KO'H, A. Sundararajan and others.
The supported way to trigger a heap dump is to use the HotSpotDiagnosticMXBean, so you will need the management agent running if you want to do it remotely. The management agent can be started at VM startup, or you can use the attach API to start it. If you want to start the agent yourself then you will find an example here:
http://blogs.sun.com/alanb/resource/MemViewer.java
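For reference, here is a minimal sketch of the HotSpotDiagnosticMXBean call itself, run inside the JVM that is to be dumped (so no remote agent is involved). The class name HeapDumpSketch, the helper method and the temp-file location are made up for illustration. For a remote process you would pass the MBeanServerConnection obtained from a JMX connector instead of the platform MBean server. Note that dumpHeap fails if the target file already exists, and newer JDKs also require the ".hprof" extension.

```java
import com.sun.management.HotSpotDiagnosticMXBean;
import java.io.IOException;
import java.lang.management.ManagementFactory;
import java.nio.file.Files;
import java.nio.file.Path;

public class HeapDumpSketch {
    // Object name under which HotSpot registers its diagnostic MXBean.
    private static final String HOTSPOT_BEAN_NAME =
            "com.sun.management:type=HotSpotDiagnostic";

    // Hypothetical helper: dumps the heap of the current JVM to 'target'.
    // liveOnly = true dumps only objects that are still reachable.
    static void dumpHeap(Path target, boolean liveOnly) throws IOException {
        HotSpotDiagnosticMXBean bean = ManagementFactory.newPlatformMXBeanProxy(
                ManagementFactory.getPlatformMBeanServer(),
                HOTSPOT_BEAN_NAME,
                HotSpotDiagnosticMXBean.class);
        bean.dumpHeap(target.toString(), liveOnly);
    }

    public static void main(String[] args) throws IOException {
        // A fresh temp directory guarantees the target file does not exist yet.
        Path dir = Files.createTempDirectory("heapdump");
        Path target = dir.resolve("sketch.hprof");
        dumpHeap(target, true);
        System.out.println("Wrote " + target + " (" + Files.size(target) + " bytes)");
    }
}
```

The resulting file is in the usual HPROF binary format, so it can be opened with the same tools you would use on a jmap dump.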
The jmap -dump option uses a private interface to trigger the dump.
The heap dumper is built into the VM (it doesn't use a JVM TI agent). It is "stop-the-world" and is mostly intended for the out-of-memory-error case, where it's not going to interfere with a running application. The heap dump should already be quite fast. It's very I/O bound, so the benefit from a parallel implementation might not be huge.