Matt's Mind

Wednesday, August 18, 2004

Static vs Runtime Typing

In this post Bruce Eckel, who's written a lot of wisdom on software development, describes how he went from advocating C++-style static typing to Python-style dynamic "implicit" typing. What he means by this is the removal of things like Java interfaces and classes when describing an object. For example, a Python method to calculate a factorial takes an argument "x", but doesn't specify what x is: integer, floating point, BankAccount?
def factorial(x):
    if x == 0:
        return 1
    else:
        return x * factorial(x-1)
Bruce argues that static typing makes things harder for developers:
The extra verbiage that must be added in order to make the compiler happy doesn't really accomplish that much and has a severe impact on both productivity and code maintenance. We have to write unit tests anyway, so the illusion of safety comes at far too high a cost. Ned Batchelder put it another way: "Static typing prevents certain kinds of failures. Unfortunately, it also prevents certain kinds of successes."
The interesting thing here is that Bruce seems to be going further than many dynamic language advocates, who often say that dynamic typing is better because it lets you get small things done more quickly, without the syntactic boilerplate. His claim is that, since we're going to have 100% test coverage anyway, there is no reason to pay the price of static typing just to catch things like missing methods. A number of responses to this come to mind from day-to-day development experience.

Point 1: the testing requirement may be too onerous for a medium-size project, where 100% test coverage might be infeasible. For example, the 14KLOC GUI messaging application I'm developing would need a very large number of tests to cover all the code paths. Apart from the perennial problems involved in testing GUIs and network communication, I simply don't wish to pay the price of 100% test coverage. I have tests for the core components, and test the rest by simply using the application day-to-day. Static typing ensures there are no trivial faults like typos in method calls, the basic unit tests ensure there are no serious faults in the core components, and day-to-day usage exercises the rest.

Point 2: by not stating type info, you reduce the information available to readers of a program and to IDEs helping you develop the program. For example, say you come across some code that contains a "toolbar" object that seems to have an "add (item)" method. What sorts of objects can I add () and what behavioural contract do those objects need to satisfy? If there's type info like "Toolbar.add (ToolItem item)", then I (and an IDE) can immediately see what behaviour an item on the toolbar needs to provide and what classes provide it. I'd rate this as such important information that I'd argue it is the lack of type information that hurts code maintenance, not its presence (as Bruce claims in the article).
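
To make the toolbar example concrete, here is a minimal sketch of what the typed version might look like. Only the "Toolbar.add (ToolItem item)" signature comes from the text above; the ToolItem methods (getLabel, isEnabled), the CutButton class and the Toolbar internals are hypothetical names made up for illustration.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical contract: these method names are assumptions, not a real toolkit API.
interface ToolItem
{
  String getLabel ();
  boolean isEnabled ();
}

class Toolbar
{
  private final List<ToolItem> items = new ArrayList<> ();

  // The signature alone tells a reader (and an IDE) what may be added.
  void add (ToolItem item)
  {
    items.add (item);
  }

  int size ()
  {
    return items.size ();
  }
}

// One hypothetical implementation of the contract.
class CutButton implements ToolItem
{
  public String getLabel () { return "Cut"; }
  public boolean isEnabled () { return true; }
}

class ToolbarDemo
{
  public static void main (String[] args)
  {
    Toolbar toolbar = new Toolbar ();
    toolbar.add (new CutButton ());
    // toolbar.add ("Cut");  // would not compile: a String is not a ToolItem
    System.out.println (toolbar.size ());
  }
}
```

The commented-out line is the kind of mistake the compiler rejects before any test is run.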

Point 3: dynamic typing excludes conventional optimization options. For my messaging application I can use a static compiler (Excelsior JET) that generates a highly optimized Windows .EXE from my Java code. A dynamic language does not currently have this option.

Point 4: when you need some dynamic typing in Java, you can get it. Reflection, although it's messier than Python's elegant syntax, gives you the ability to ignore type when you need to. For example, a common GUI paradigm is the Command pattern: a command object packages the functionality of an application command (eg "Edit -> Cut") and is embedded into toolbars, main menu, context menu etc which invoke its execute () method when the user clicks the various items. A key issue with this pattern is that many commands, such as "Cut", are context-sensitive: the Cut command obviously needs to know what to cut.

This "object-to-operate-on" property of a command is very often bound to the selected item in a view such as a list or a tree, and it would be nice if we could have a simple way to just bind a command's property to the selected item in a view instead of writing endless tiny event handlers. So I have a class that listens to selection changes on any GUI viewer control and feeds the selected item(s) to a given property X of an object (usually a command), which is done by calling its setX () method. The type of the target object is irrelevant, as long as it has a setX () method. In fact this binding approach ends up being more flexible than the equivalent in Python because the binding object can find the type of X and automatically decide whether the currently selected item is compatible: if not it does not proceed with the binding and instead disables the command.
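
The binding idea can be sketched with plain reflection. This is not the class from the application described above, just a minimal illustration under assumed names: PropertyBinder, selectionChanged and CutCommand are all hypothetical, and the sketch only handles setters that take an object (not a primitive) argument.

```java
import java.lang.reflect.Method;

// Hypothetical command with a "target" property.
class CutCommand
{
  private String target;

  public void setTarget (String target) { this.target = target; }
  public String getTarget () { return target; }
}

class PropertyBinder
{
  private final Object target;
  private final Method setter;

  PropertyBinder (Object target, String property)
  {
    this.target = target;
    this.setter = findSetter (target.getClass (), property);
  }

  // Look up a public one-argument setX () method by name only.
  private static Method findSetter (Class<?> type, String property)
  {
    String name = "set" + Character.toUpperCase (property.charAt (0))
                        + property.substring (1);
    for (Method method : type.getMethods ())
    {
      if (method.getName ().equals (name)
          && method.getParameterTypes ().length == 1)
      {
        return method;
      }
    }
    throw new IllegalArgumentException ("No setter for " + property);
  }

  // Returns true if the selection was compatible and the setter was called.
  boolean selectionChanged (Object selected)
  {
    Class<?> wanted = setter.getParameterTypes () [0];
    if (!wanted.isInstance (selected))
    {
      return false; // a real binder would disable the command here
    }
    try
    {
      setter.invoke (target, selected);
      return true;
    } catch (ReflectiveOperationException ex)
    {
      throw new IllegalStateException (ex);
    }
  }
}

class BindingDemo
{
  public static void main (String[] args)
  {
    CutCommand cut = new CutCommand ();
    PropertyBinder binder = new PropertyBinder (cut, "target");
    System.out.println (binder.selectionChanged ("report.txt"));         // compatible selection
    System.out.println (binder.selectionChanged (Integer.valueOf (42))); // incompatible selection
  }
}
```

The isInstance () check is the part that uses the declared type of X: the binder never names CutCommand, yet it can still refuse a selection the setter could not accept.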

Wednesday, August 11, 2004

Funny insight on religion in Slashdot

I just had to post this excerpt of a quote on Slashdot from yesterday which arose, of all places, in the middle of a Q&A session with Larry Wall (of Perl fame). He had a quite odd description of how he reconciled his Christian beliefs in god with actual religion (IMHO).

And then, as is standard practice on Slashdot, things went off on a tangent and resulted in the following quote that I think actually deserved its "Insightful" moderation:

(Slashdot 10/08/2004: http://interviews.slashdot.org/comments.pl?sid=39406)

Then again, my two favourite books in the Bible are Ecclesiastes and
Job. Both of which...

    So have fun, be smart and if there is a God - let's hope this is some sort of practical joke. Making us in the form of giant hairless monkeys is rather amusing when you think on it.

...provide a sound Scriptural basis for your conclusion. So welcome to the True Faith, Brother :-)

On a more enlightening angle, allow me to quote Carl Sagan (atheist):
    How is it that hardly any major religion has looked at science and concluded, 'This is better than we thought! The Universe is much bigger than our prophets said, grander, more subtle, more elegant'? Instead they say, 'No, no, no! My god is a little god, and I want him to stay that way.' A religion, old or new, that stressed the magnificence of the Universe as revealed by modern science might be able to draw forth reserves of reverence and awe hardly tapped by the conventional faiths."

    - Carl Sagan
The God worshipped by Islamokazi nutbags and Creationist fundies is of little interest in me. He may have created us in His image - but some of his less-than-clued followers have unfortunately returned the favour :(

The Islamokazis (and to a lesser extent the Roman Catholic Church) re-created him in the form of a celestial slot machine ("insert [ 767 | money ], pull lever, get [ 72 virgins | indulgences ]"), and the Christian fundie hucksters re-created him in the form of a carnival barker ("I whipped it up in a week six thousand years ago and hid different ratios of potassium and argon isotopes in the dinosaur bones just to confuse their scientists 6000 years later! Suckerrrrz!").

But a God who can come up with an entire universe based on a few fundamental constants and some deep mathematics, such that out of that universe, a few bits of carbon compounds might emerge into sentient life capable of looking around at the universe and trying to unravel the math for themselves... That's the kind of God that might be worth getting to know more about.

Thursday, August 05, 2004

Unsigned integer types

OK, so I've been programming in Java since the 1.0 days and never needed to worry about the lack of unsigned types. Java kind of sneakily works its way around the issue, when unsigned values are needed, by just using the "next size up": for example the call to read a byte from a stream, InputStream.read (), returns an int rather than a byte. I used to wonder why until I realised the result is effectively an unsigned byte (0 to 255, with -1 reserved for end of stream), and using byte would break for values > 127. Another example is the java.util.zip.CRC32 class, whose getValue () returns a long (64 bits), even though you might expect an int (32 bits) for a 32-bit checksum.
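
A small sketch of the "next size up" behaviour, using only standard library calls (InputStream.read () and CRC32.getValue ()). The class and method names (NextSizeUpDemo, readOne, checksum) are made up for the example.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;
import java.util.zip.CRC32;

class NextSizeUpDemo
{
  // Reads the first byte of the data: 0..255, or -1 if there is nothing to read.
  static int readOne (byte[] data)
  {
    try (InputStream input = new ByteArrayInputStream (data))
    {
      return input.read ();
    } catch (IOException ex)
    {
      throw new UncheckedIOException (ex);
    }
  }

  // CRC32 is a 32-bit checksum, handed back in a long so it is never negative.
  static long checksum (byte[] data)
  {
    CRC32 crc = new CRC32 ();
    crc.update (data);
    return crc.getValue ();
  }

  public static void main (String[] args)
  {
    byte[] data = { (byte) 0xFF };
    System.out.println (readOne (data));         // prints 255
    System.out.println (readOne (new byte [0])); // prints -1 (end of stream)
    System.out.println (checksum (data) >= 0);   // prints true
  }
}
```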

A lot of the time this trick works, but a real problem occurs when you want to promote, say, an "unsigned" byte to an unsigned int. That is, a byte that you are treating as unsigned (eg one read from an input stream into a byte array) - it's only a bit pattern after all and most of the bitwise operators don't care about the sign. If you just do something like:
byte [] bytes = new byte [10];
input.read (bytes);
int uint = bytes [0];
Then you'll find that, for negative values, Java helpfully sign-extends the byte on assignment: if bytes [0] had the hex value FF (binary 11111111), then uint ends up as hex FFFFFFFF (all 32 bits set, ie -1) rather than 255. This can cause some subtle problems, especially since things still work for values less than 128.
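
The sign extension is easy to see by printing the widened value in hex. A minimal sketch; the class and method names (SignExtensionDemo, widen) are made up for the example.

```java
class SignExtensionDemo
{
  // Plain widening conversion, the same thing "int uint = bytes [0]" does.
  static int widen (byte value)
  {
    return value;
  }

  public static void main (String[] args)
  {
    byte high = (byte) 0xFF; // top bit set
    byte low = (byte) 0x7F;  // top bit clear

    System.out.println (Integer.toHexString (widen (high))); // prints ffffffff
    System.out.println (Integer.toHexString (widen (low)));  // prints 7f
  }
}
```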

The way to work around this? The code below promotes an "unsigned" byte to an "unsigned" int (there may be a more elegant way to do this, but I know this works):
public static int promote (byte value)
{
  if ((value & (byte)0x80) != 0)
  {
    // create int without sign extension and then re-add high bit
    int uintValue = (value & (byte)0x7F);
    uintValue |= 0x80;

    return uintValue;
  } else
  {
    return value;
  }
}
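
A more compact alternative is to mask with 0xFF: the byte is still sign-extended to an int first, but the mask then clears everything above the low 8 bits. The sketch below checks the mask against the branch-based promote () for every possible byte value. The names UnsignedByteDemo and promoteWithMask are made up for the example; on Java 8 and later, Byte.toUnsignedInt () in the standard library does the same job.

```java
class UnsignedByteDemo
{
  // The branch-based version, with the same logic as promote () above.
  static int promote (byte value)
  {
    if ((value & (byte)0x80) != 0)
    {
      int uintValue = (value & (byte)0x7F);
      uintValue |= 0x80;
      return uintValue;
    } else
    {
      return value;
    }
  }

  // Alternative: widen (with sign extension), then keep only the low 8 bits.
  static int promoteWithMask (byte value)
  {
    return value & 0xFF;
  }

  public static void main (String[] args)
  {
    // Compare the two versions over all 256 byte values.
    for (int i = Byte.MIN_VALUE; i <= Byte.MAX_VALUE; i++)
    {
      byte b = (byte) i;
      if (promote (b) != promoteWithMask (b))
      {
        throw new AssertionError ("mismatch for " + i);
      }
    }
    System.out.println (promoteWithMask ((byte) 0xFF)); // prints 255
  }
}
```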

Even though this is a PITA and a trap for new players, I can understand why the Java designers might have decided to leave unsigned types out. For a start, you immediately increase complexity a lot: you double the number of primitives and need to define how to handle things like assigning a signed value to an unsigned one safely. Each of the current integer types entirely encloses the smaller ones (a long can hold any int, an int can hold any short etc), whereas a signed and an unsigned type of the same size overlap without either enclosing the other. Also, the JVM encodes the operand type into its bytecode instructions, and there are only a limited number of those (256 IIRC).