If you've working on a medium to large Java project then one of the aspects of application performance you're probably (should be!) monitoring is garbage collection.
If you're using the Sun JVM or compatible JVM then I recommend you enable verbose garbage collection output with at least the following command-line options:
java -XX:+PrintGCDetails -XX:+PrintGCTimeStamps -verbose:gc ...
This will print something like the following to stdout :
83.304: [GC 83.304: [DefNew: 890K->64K(960K), 0.0044947 secs] 890K->217K(5056K), 0.0046062 secs]
83.390: [GC 83.390: [DefNew: 960K->64K(960K), 0.0043936 secs] 1113K->376K(5056K), 0.0044796 secs]
83.534: [GC 83.534: [DefNew: 951K->64K(960K), 0.0046934 secs] 1263K->467K(5056K), 0.0048648 secs]
83.638: [GC 83.638: [DefNew: 960K->64K(960K), 0.0041580 secs] 1363K->658K(5056K), 0.0042530 secs]
83.783: [GC 83.784: [DefNew: 960K->64K(960K), 0.0030418 secs] 1554K->748K(5056K), 0.0031353 secs]
83.921: [GC 83.921: [DefNew: 960K->55K(960K), 0.0019457 secs] 1644K->794K(5056K), 0.0020575 secs]
91.655: [GC 91.655: [DefNew: 951K->19K(960K), 0.0012843 secs] 1690K->808K(5056K), 0.0013694 secs]
94.198: [Full GC (System) 94.198: [Tenured: 788K->759K(4096K), 0.0428238 secs] 883K->759K(5056K), [Perm : 2095K->2095K(12288K)], 0.0429641 secs]
As an alternative to writing to stdout - e.g. gc logging may get mixed up with a lot of non-gc related output - you can have the JVM log gc to a file with timestamps automatically enabled by using the -Xloggc option:
java -XX:+PrintGCDetails -Xloggc:logs/gc.log ...
One small annoyance I have with this logging is that the timestamp - which is of the format SSSS.MMMM where SSSS it the elapsed seconds and MMMM is the elapsed millisecs - is relative to when the JVM started. I would find it more useful if the timestamp told me the date and time of day that the logging occurred.
So here is a short list of requirements for more "human-friendly" timestamping:
- Log a timestamp that contains the date and time of day of the logging event, preferably in a configurable format. e.g. see java.text.SimpleDateFormat
- Work in a JVM-neutral and platform-neutral way
Diving into the openjdk source code I found that the verbose gc logging is implemented in native C code.
From Java 5 onwards some of the garbage collection information is exposed via the GarbageCollectorMXBean JMX bean, but what is not exposed is any type of notification/callback when the gc actually occurs or more collector-specific information. e.g. for the parallel garbage collector how much time was spent in stop-the-world phase and how much time was spent in other phases.
Given this info I found two solutions:
1. Use a native wrapper around the JVM that timestamps messages logged to stdout
For this option I could use something like the very mature and useful utility Java Service Wrapper that provides this sort of wrapper around stdout (among other useful features), or possibly roll my own native code that redirects stdout (nightmare!).
A disadvantage of this approach is that you will have to deploy native libraries along with your application, which may not be easy to integrate into application servers, clustered environments, etc.
2. Create a thread that occasionally polls the GarbageCollectorMXBeans and logs the number of collections and collection time since the last poll
The code for this approach looks something like this:
package robertmaldon.gc;
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;
import java.text.SimpleDateFormat;
import java.util.Date;
import java.util.List;
public class GCTimeStamper implements Runnable {
private SimpleDateFormat dateFormat = new SimpleDateFormat("yyyy-MM-dd HH:mm:ss,SSS");
private GarbageCollectorMXBean minorGCBean;
private long lastMinorCollectionCount;
private long lastMinorCollectionTime;
private GarbageCollectorMXBean fullGCBean;
private long lastFullCollectionCount;
private long lastFullCollectionTime;
public GCTimeStamper() {
}
public void run() {
List<GarbageCollectorMXBean> gcMBeans = ManagementFactory.getGarbageCollectorMXBeans();
for (GarbageCollectorMXBean gcBean : gcMBeans) {
if ("Copy".equals(gcBean.getName())) {
minorGCBean = gcBean;
} else if ("MarkSweepCompact".equals(gcBean.getName())) {
fullGCBean = gcBean;
} else if ("ParNew".equals(gcBean.getName())) {
minorGCBean = gcBean;
} else if ("ConcurrentMarkSweep".equals(gcBean.getName())) {
fullGCBean = gcBean;
} else {
System.err.println("Unable to classify GarbageCollectorMXBean [" + gcBean.getName() + "]");
}
}
try {
while (true) {
if (minorGCBean.getCollectionCount() != lastMinorCollectionCount) {
long diffCount = minorGCBean.getCollectionCount() - lastMinorCollectionCount;
long diffTime = minorGCBean.getCollectionTime() - lastMinorCollectionTime;
System.out.println(getTimeStamp() + "Minor GC x " + diffCount + ", " + diffTime + " millisecs");
lastMinorCollectionCount = minorGCBean.getCollectionCount();
lastMinorCollectionTime = minorGCBean.getCollectionTime();
}
if (fullGCBean.getCollectionCount() != lastFullCollectionCount) {
long diffCount = fullGCBean.getCollectionCount() - lastFullCollectionCount;
long diffTime = fullGCBean.getCollectionTime() - lastFullCollectionTime;
System.out.println(getTimeStamp() + "Full GC x " + diffCount + ", " + diffTime + " millisecs");
lastFullCollectionCount = fullGCBean.getCollectionCount();
lastFullCollectionTime = fullGCBean.getCollectionTime();
}
// Sleep a little bit before the next poll
Thread.sleep(1000);
}
} catch (Exception ex) {
// Do nothing except exit this thread
}
}
private String getTimeStamp() {
Date now = new Date();
long upTime = ManagementFactory.getRuntimeMXBean().getUptime();
long upTimeSecs = upTime / 1000;
long upTimeMillis = upTime % 1000;
return dateFormat.format(now) + " [" + upTimeSecs + "." + upTimeMillis + "] ";
}
}
The above class would be started with a new Thread(new GCTimeStamper()).start() or a TaskExecutor, and you can replace the System.out statements with calls to log4j or whatever.
The output of this class mixed with the gc output looks something like this:
83.304: [GC 83.304: [DefNew: 890K->64K(960K), 0.0044947 secs] 890K->217K(5056K), 0.0046062 secs]
83.390: [GC 83.390: [DefNew: 960K->64K(960K), 0.0043936 secs] 1113K->376K(5056K), 0.0044796 secs]
83.534: [GC 83.534: [DefNew: 951K->64K(960K), 0.0046934 secs] 1263K->467K(5056K), 0.0048648 secs]
83.638: [GC 83.638: [DefNew: 960K->64K(960K), 0.0041580 secs] 1363K->658K(5056K), 0.0042530 secs]
83.783: [GC 83.784: [DefNew: 960K->64K(960K), 0.0030418 secs] 1554K->748K(5056K), 0.0031353 secs]
83.921: [GC 83.921: [DefNew: 960K->55K(960K), 0.0019457 secs] 1644K->794K(5056K), 0.0020575 secs]
2007-09-28 17:03:59,037 [84.80] Minor GC x 6, 22 millisecs
91.655: [GC 91.655: [DefNew: 951K->19K(960K), 0.0012843 secs] 1690K->808K(5056K), 0.0013694 secs]
2007-09-28 17:04:07,036 [92.79] Minor GC x 1, 1 millisecs
94.198: [Full GC (System) 94.198: [Tenured: 788K->759K(4096K), 0.0428238 secs] 883K->759K(5056K), [Perm : 2095K->2095K(12288K)], 0.0429641 secs]
2007-09-28 17:04:10,036 [95.79] Full GC x 1, 42 millisecs
This class will print the approximate date/time a gc occurred as well as the time relative to the start of the JVM, so you can then use that to correlate with the more verbose output in the gc.log file.
Another interesting problem is determining which gc was minor and which was full (major). If you're using the Sun JVM and the default gc settings then a collector named "Copy" does minor collections and a collector named "MarkSweepCompact" does full collections. If you're using the Sun JVM with the -XX:+UseConcMarkSweepGC option then a collector named "ParNew" seems to do minor collections and a collector named "ConcurrentMarkSweep" seems to do full collections. (I say "seems to" because the behavior changes depending on if you're running on a one cpu or multi-cpu machine.)
Not an ideal solution, but if anybody knows a better way I would be interested in hearing from you!