Tuesday, July 18, 2006

Is C really that fast anymore?

An interesting article exploring the possability C is not necessarily the fastest language on block anymore because processor architectures have changed so much that C no longer maps directly to the processor instruction set. What is better? Runtimes that dynamically adapt, like Java.

The closer to the metal you can get while programming, the faster your program will compile — or so conventional wisdom would have you believe. In this article, I will show you how high-level languages like Java aren't slow by nature, and in fact low level languages may compile less efficiently...

Early versions of UNIX were written in Assembler, but this proved to be a hindrance when porting it to new platforms, because assembly languages are machine-specific. Existing high-level languages, such as LISP, provided too much abstraction for implementing an operating system, so a new language was created. This language, C, was a very slightly abstracted form of PDP-11 assembly language. There is almost a 1:1 mapping between C semantics and PDP-11 machine code, making it very easy to compile C for the PDP-11 (the target machine of UNIX at the time)...

When C was created, it was very fast because it was almost trivial to turn C code into equivalent machine code. But this was only a short-term benefit; in the 30 years since C was created, processors have changed a lot. The task of mapping C code to a modern microprocessor has gradually become increasingly difficult...

As a simple example, consider the vector unit found in most desktop processors. One instruction might take four integers, add them to four other integers, and provide the result as a set of four more integers. Now imagine some code that could be adapted to take advantage of this capability.

In C, the usual representation of a vector is as an array. Unfortunately, you can’t define operations on arrays, so adding the values in two arrays would be done as a loop in C, with scalar operations in the loop body...

A just-in-time compiler doesn’t have these limitations.

No comments: