Saturday, September 29, 2007

More human-friendly Java GC timestamps

Update: If you are using the Sun (Oracle) JVM version 6u4 or newer then using the command line option -XX:+PrintGCDateStamps instead of -XX:+PrintGCTimeStamps will output a current date rather than a relative time in the GC logs.

If you've working on a medium to large Java project then one of the aspects of application performance you're probably (should be!) monitoring is garbage collection.

If you're using the Sun JVM or compatible JVM then I recommend you enable verbose garbage collection output with at least the following command-line options:


java -XX:+PrintGCDetails -XX:+PrintGCTimeStamps -verbose:gc ...


This will print something like the following to stdout :


83.304: [GC 83.304: [DefNew: 890K->64K(960K), 0.0044947 secs] 890K->217K(5056K), 0.0046062 secs]
83.390: [GC 83.390: [DefNew: 960K->64K(960K), 0.0043936 secs] 1113K->376K(5056K), 0.0044796 secs]
83.534: [GC 83.534: [DefNew: 951K->64K(960K), 0.0046934 secs] 1263K->467K(5056K), 0.0048648 secs]
83.638: [GC 83.638: [DefNew: 960K->64K(960K), 0.0041580 secs] 1363K->658K(5056K), 0.0042530 secs]
83.783: [GC 83.784: [DefNew: 960K->64K(960K), 0.0030418 secs] 1554K->748K(5056K), 0.0031353 secs]
83.921: [GC 83.921: [DefNew: 960K->55K(960K), 0.0019457 secs] 1644K->794K(5056K), 0.0020575 secs]
91.655: [GC 91.655: [DefNew: 951K->19K(960K), 0.0012843 secs] 1690K->808K(5056K), 0.0013694 secs]
94.198: [Full GC (System) 94.198: [Tenured: 788K->759K(4096K), 0.0428238 secs] 883K->759K(5056K), [Perm : 2095K->2095K(12288K)], 0.0429641 secs]


As an alternative to writing to stdout - e.g. gc logging may get mixed up with a lot of non-gc related output - you can have the JVM log gc to a file with timestamps automatically enabled by using the -Xloggc option:


java -XX:+PrintGCDetails -Xloggc:logs/gc.log ...


One small annoyance I have with this logging is that the timestamp - which is of the format SSSS.MMMM where SSSS it the elapsed seconds and MMMM is the elapsed millisecs - is relative to when the JVM started. I would find it more useful if the timestamp told me the date and time of day that the logging occurred.

So here is a short list of requirements for more "human-friendly" timestamping:

  • Log a timestamp that contains the date and time of day of the logging event, preferably in a configurable format. e.g. see java.text.SimpleDateFormat
  • Work in a JVM-neutral and platform-neutral way
Given the above requirements what are my options?

Diving into the openjdk source code I found that the verbose gc logging is implemented in native C code.

From Java 5 onwards some of the garbage collection information is exposed via the GarbageCollectorMXBean JMX bean, but what is not exposed is any type of notification/callback when the gc actually occurs or more collector-specific information. e.g. for the parallel garbage collector how much time was spent in stop-the-world phase and how much time was spent in other phases.

Given this info I found two solutions:

1. Use a native wrapper around the JVM that timestamps messages logged to stdout

For this option I could use something like the very mature and useful utility Java Service Wrapper that provides this sort of wrapper around stdout (among other useful features), or possibly roll my own native code that redirects stdout (nightmare!).

A disadvantage of this approach is that you will have to deploy native libraries along with your application, which may not be easy to integrate into application servers, clustered environments, etc.

2. Create a thread that occasionally polls the GarbageCollectorMXBeans and logs the number of collections and collection time since the last poll

The code for this approach looks something like this:


package robertmaldon.gc;

import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;
import java.text.SimpleDateFormat;
import java.util.Date;
import java.util.List;

public class GCTimeStamper implements Runnable {

private SimpleDateFormat dateFormat = new SimpleDateFormat("yyyy-MM-dd HH:mm:ss,SSS");

private GarbageCollectorMXBean minorGCBean;
private long lastMinorCollectionCount;
private long lastMinorCollectionTime;

private GarbageCollectorMXBean fullGCBean;
private long lastFullCollectionCount;
private long lastFullCollectionTime;

public GCTimeStamper() {
}

public void run() {
List<GarbageCollectorMXBean> gcMBeans = ManagementFactory.getGarbageCollectorMXBeans();

for (GarbageCollectorMXBean gcBean : gcMBeans) {
if ("Copy".equals(gcBean.getName())) {
minorGCBean = gcBean;
} else if ("MarkSweepCompact".equals(gcBean.getName())) {
fullGCBean = gcBean;
} else if ("ParNew".equals(gcBean.getName())) {
minorGCBean = gcBean;
} else if ("ConcurrentMarkSweep".equals(gcBean.getName())) {
fullGCBean = gcBean;
} else {
System.err.println("Unable to classify GarbageCollectorMXBean [" + gcBean.getName() + "]");
}
}

try {
while (true) {
if (minorGCBean.getCollectionCount() != lastMinorCollectionCount) {
long diffCount = minorGCBean.getCollectionCount() - lastMinorCollectionCount;
long diffTime = minorGCBean.getCollectionTime() - lastMinorCollectionTime;
System.out.println(getTimeStamp() + "Minor GC x " + diffCount + ", " + diffTime + " millisecs");

lastMinorCollectionCount = minorGCBean.getCollectionCount();
lastMinorCollectionTime = minorGCBean.getCollectionTime();
}

if (fullGCBean.getCollectionCount() != lastFullCollectionCount) {
long diffCount = fullGCBean.getCollectionCount() - lastFullCollectionCount;
long diffTime = fullGCBean.getCollectionTime() - lastFullCollectionTime;
System.out.println(getTimeStamp() + "Full GC x " + diffCount + ", " + diffTime + " millisecs");

lastFullCollectionCount = fullGCBean.getCollectionCount();
lastFullCollectionTime = fullGCBean.getCollectionTime();
}

// Sleep a little bit before the next poll
Thread.sleep(1000);
}
} catch (Exception ex) {
// Do nothing except exit this thread
}
}

private String getTimeStamp() {
Date now = new Date();

long upTime = ManagementFactory.getRuntimeMXBean().getUptime();
long upTimeSecs = upTime / 1000;
long upTimeMillis = upTime % 1000;

return dateFormat.format(now) + " [" + upTimeSecs + "." + upTimeMillis + "] ";
}
}


The above class would be started with a new Thread(new GCTimeStamper()).start() or a TaskExecutor, and you can replace the System.out statements with calls to log4j or whatever.

The output of this class mixed with the gc output looks something like this:


83.304: [GC 83.304: [DefNew: 890K->64K(960K), 0.0044947 secs] 890K->217K(5056K), 0.0046062 secs]
83.390: [GC 83.390: [DefNew: 960K->64K(960K), 0.0043936 secs] 1113K->376K(5056K), 0.0044796 secs]
83.534: [GC 83.534: [DefNew: 951K->64K(960K), 0.0046934 secs] 1263K->467K(5056K), 0.0048648 secs]
83.638: [GC 83.638: [DefNew: 960K->64K(960K), 0.0041580 secs] 1363K->658K(5056K), 0.0042530 secs]
83.783: [GC 83.784: [DefNew: 960K->64K(960K), 0.0030418 secs] 1554K->748K(5056K), 0.0031353 secs]
83.921: [GC 83.921: [DefNew: 960K->55K(960K), 0.0019457 secs] 1644K->794K(5056K), 0.0020575 secs]
2007-09-28 17:03:59,037 [84.80] Minor GC x 6, 22 millisecs
91.655: [GC 91.655: [DefNew: 951K->19K(960K), 0.0012843 secs] 1690K->808K(5056K), 0.0013694 secs]
2007-09-28 17:04:07,036 [92.79] Minor GC x 1, 1 millisecs
94.198: [Full GC (System) 94.198: [Tenured: 788K->759K(4096K), 0.0428238 secs] 883K->759K(5056K), [Perm : 2095K->2095K(12288K)], 0.0429641 secs]
2007-09-28 17:04:10,036 [95.79] Full GC x 1, 42 millisecs


This class will print the approximate date/time a gc occurred as well as the time relative to the start of the JVM, so you can then use that to correlate with the more verbose output in the gc.log file.

Another interesting problem is determining which gc was minor and which was full (major). If you're using the Sun JVM and the default gc settings then a collector named "Copy" does minor collections and a collector named "MarkSweepCompact" does full collections. If you're using the Sun JVM with the -XX:+UseConcMarkSweepGC option then a collector named "ParNew" seems to do minor collections and a collector named "ConcurrentMarkSweep" seems to do full collections. (I say "seems to" because the behavior changes depending on if you're running on a one cpu or multi-cpu machine.)

Not an ideal solution, but if anybody knows a better way I would be interested in hearing from you!

Thursday, September 27, 2007

The most complete list of -XX options for Java 6 JVM

The Sun JVM has many, many, many non-standard options (i.e. may not be supported by other JVMs), and most are not documented on the official Sun website.

Here is a pretty comprehensive listing of the -XX options for the Java 6 JVM.

Google, Yahoo Sued For Stealing Names From Tanzanian Tribes

Another frivolous lawsuit?

...He claims that both Google and Yahoo stole their names from Tanzanian tribes -- and now they should pay up. Specifically, he claims that Google took its name from the Gogo tribe and Yahoo took its name from the Yao tribe. Conveniently, this guy happens to be a descendant of both tribes. He's merely asking for both companies to pay $10,000 each to every member of both tribes, going back three generations...

How do you say "MSN", "Ask.com" or "Slashdot" in Tanzanian?

Tuesday, September 11, 2007

Bush Cheney '08

A trend that appears to be gaining momentum on both sides of politics...





Update: US election hots up: Palin in a bikini, Battlestar Galactica stars enter the race

How to Launder Money


The clearest explanation of money laundering that I've ever seen :)

(Picture from this blog post)

Saturday, September 08, 2007

Programmatically Configuring log4j - and the Future of Java Logging

log4j is probably the most common logging framework in use in the Java world. Even if a project/framework uses commons-logging most deployments configure log4j to be the underlying logger.

log4j is usually configured via a properties file (most common, in my experience) or an XML file (more powerful). There are a few situations, however, where I've found it useful to programmatically configure log4j, and since there is very little info on the web on how to do this I thought I'd write a short blog entry on the subject.

Why configure log4j programmatically?

Two situations I've found it useful to configure log4j in code are:
  • During running of Unit Tests. Since unit tests are fine grained it helps to enable more verbose logging for just the area a particular test targets instead of for the whole suite. Also, if you are running single test inside your IDE you may want to output to the console, but if you run your whole suite from the command line you may want to output to a file.
  • Changing logging dynamically via JMX. log4j has some basic JMX support, but if you want to do anything mildly complicated then you'll have to roll your own.
log4j initialization

Whenever a Logger is first used log4j goes through a default initialization procedure to find a log4j.properties or a log4j.xml configuration file. If the initialization procedure fails you typically get an error message like this:


log4j:WARN No appenders could be found for logger (net.sf.ehcache.CacheManager).
log4j:WARN Please initialize the log4j system properly.


when a Logger is used.

The simplest way to detect if log4j was initialized or not is to ask the root logger if it has any appenders attached to it:


import org.apache.log4j.Logger;

Logger root = Logger.getRootLogger();
if (!root.getAllAppenders().hasMoreElements()) {
// No appenders means log4j is not initialized!
}


A minimal default log4j configuration might be to add a single console appender to the root logger; this is in fact what the convenience method BasicConfigurator.configure() does:


import org.apache.log4j.ConsoleAppender;
import org.apache.log4j.Logger;
import org.apache.log4j.PatternLayout;

Logger root = Logger.getRootLogger();
root.addAppender(new ConsoleAppender(
new PatternLayout(PatternLayout.TTCC_CONVERSION_PATTERN)));


If not explicitly set the default the log level will be DEBUG.

Example

So let's put these things together in a scenario: If I run my unit test suite from Ant then I add a log4j.properties to the classpath that directs log4j logging to a file. If I run a unit test from inside Eclipse then I deliberately want log4j initialization to fail, allowing me to initialize it programatically; log everything from the robertmaldon.moneymachine package to the console at DEBUG level, and log everything else to the console at INFO level.


package robertmaldon.moneymachine;

import org.junit.BeforeClass;

import org.apache.log4j.ConsoleAppender;
import org.apache.log4j.Logger;
import org.apache.log4j.PatternLayout;

public class OrderTest {

@BeforeClass
public static void classSetUp() {
Logger rootLogger = Logger.getRootLogger();
if (!rootLogger.getAllAppenders().hasMoreElements()) {
rootLogger.setLevel(Level.INFO);
rootLogger.addAppender(new ConsoleAppender(
new PatternLayout("%-5p [%t]: %m%n")));

// The TTCC_CONVERSION_PATTERN contains more info than
// the pattern we used for the root logger
Logger pkgLogger = rootLogger.getLoggerRepository().getLogger("robertmaldon.moneymachine");
pkgLogger.setLevel(Level.DEBUG);
pkgLogger.addAppender(new ConsoleAppender(
new PatternLayout(PatternLayout.TTCC_CONVERSION_PATTERN)));
}
}
}

The above code creates a new Logger and chains it to the root logger. Instances of Logger can not be instantiated directly, but have to be created from a LoggerRepository (a factory pattern).

The equivalent log4j.properties would look something like:


log4j.rootLogger=INFO, A

log4j.appender.A=org.apache.log4j.ConsoleAppender log4j.appender.A.layout=org.apache.log4j.PatternLayout
log4j.appender.A.layout.ConversionPattern=%-5p [%t]: %m%n

log4j.logger.robertmaldon.moneymachine=DEBUG, B

log4j.appender.B=org.apache.log4j.ConsoleAppender log4j.appender.B.layout=org.apache.log4j.PatternLayout
log4j.appender.B.layout.ConversionPattern=%r [%t] %p %c %x - %m%n


So not too difficult to do after all :)

The Future of Java Logging

In recent weeks I've noticed some discussion and dissension in the Java logging world.

Some implementation issues with commons-logging spurred a number of the Apache projects to switch to a replacement called SLF4j (Simple Logging Framework).

The original creator of log4j has been hard at work on a successor called LOGBack, which is an evolution of log4j rather than a revolution. e.g. smaller memory footprint, faster, better filtering and evaluation capabilities.

One feature of LOGBack that really caught my eye was that of parameterized logging. In log4j you might write logging code like:


if( logger.isDebugEnabled() ) {
logger.debug( "User with account "
+ user.getAccount() + " failed authentication; "
+ "supplied crypted password " + user.crypt(password)
+ " does not match." );
}


The isDebugEnabled() check is usually a performance thing - there is no need to call expensive methods like the crypt() method above if the current logging level is higher than DEBUG. The equivalent code in LOGBack would be:


logger.debug( "User with account {} failed authentication; "
+ "supplied crypted password {} does not match.",
user.getAccount(), user.crypt(password) );


which allows LOGBack to defer calling getAccount() and crypt() until it has determined that the log level will cause the logging to be generated. The less "if"s the better!

LOGBack natively implements the SLF4j APIs, but provides a bridge for applications currently using the log4j APIs. SLF4j provides a bridge for applications currently using the commons-logging APIs.

Thursday, September 06, 2007

Graphing website relationships with TouchGraph Google Browser

TouchGraph Google Browser is a cute Java applet that, given a Google keyword search, graphs the relationships between websites from Google search results.

Below is the result for a search on my company, Finetix. Most of the relationships look right (employees, press releases, alliances, memberships, etc) but I'm most curious about the link between Finetix blogs and guinness.com :)

Tuesday, September 04, 2007

Chill out with air-conditioned clothing

The worst of the summer heat is over for this year, but I'll have to remember to try out some of this air-conditioned clothing next year, like the USB-powered neck tie. Ah, those crazy Japanese...