Error handling and elegance
I was analing1 some code today, and seriously reduced the size of some knotty configuration handling logic, which made me happy2.
Here's the core Java logic as a snippet:
The code just assigns all configuration properties read from a file stream, where the file name is read from the next command line option. I suppose some might argue that it's too highly nested, but I'd say that if you read it from left to right you get close to reading out its meaning: "set all properties from file stream ...".
All the actual logic in the snippet is contained in functions, but the sheer number of failure modes in this code is huge. The options file could be missing, unreadable, unparseable. There could be unknown options, options in the wrong format, options out of range: setAll () has about a dozen option-related failures alone. The command line parameter could be missing. A malformed UTF-8 character could appear in the bowels of the config file. Anything.
This is not even considering all the usual runtime errors such as out of memory exceptions. And yet the code above handles all of these without distraction from its core purpose, which is after all what it actually does 99.99% of its life. And generates good error messages, if you'll take my word for it.
The reason the code is so simple is that all functions either return a valid result, or throw an exception indicating why they couldn't3. No function ever returns a value that could cause an error later on simply by being invalid.
Now the reason I was pondering this is that Joel Spolsky, someone who I have a great deal of respect for, doesn't like this exception-oriented approach at all. The two main reasons he cites are:
1. They are invisible in the source code.
2. They create too many possible exit points for a function.
These, he argues are bad things. Which puzzles me, because, to me, these are good things. I have to assume *any* line could cause an error and any line of code could except out.
This argument has been done to death by the way, not least of all on Joel's own forum. But I'm going to wade in anyway because I don' have anything better to do, and no one reads this blog anyway.
C is a language that follows Joel's prescription for error handling. It doesn't have exceptions, and instead usually uses the function return code to indicate success/failure (and the global errno variable, which is the work of the devil ... don't get me started). You can alternately pass a pointer to an error info variable which gets filled in if there's an error, which I've seen done on occasion. As Spolsky puts it, best practice in C is to "[...] have your functions return error values when things go wrong, and to deal with these explicitly, no matter how verbose it might be."
Both ways require error checks spattered throughout the code. Which, and here is where I wade into the debate swinging wildly, leads to ugly code that doesn't compose nicely - for instance you can't pass one function call as a parameter to another. However, in C this sort of chaining is often confounded anyway by the fact that allocated memory returned by a function must be deallocated.
Do we really like longer, more complex code, where the flow is broken by all the error checking? And this repetition leads programmers to be lazy: "why bother check the result of this malloc()? It's a tiny amount of memory, it'll never fail" ... SEGV.
Is this just my opinion? Quite possibly, but I decided to do my best to code the important logic above in C to see what it would be like as an alternative. The C version tries to keep the important properties of the Java code above, including trying to provide useful error messages (something that many C programs utterly fail at, leading to hilarious "Error: there was an error"-style messages).
Here's my best attempt:
I should note that I cut my programming teeth on C, and got to what probably was a pretty reasonable level. But nowadays that may not be the case and the below might be considered bad style. If so, I'd be really interested to know how it could be improved.
1. A new verb I've invented for obsessively tweaking perfectly good code.
2. And simultaneously sad in the eyes of most people reading this. Irony.
3. The fact that GC handles cleanup for us is also a big reason why this code is so simple.
Here's the core Java logic as a snippet:
try
{
...
config.setAll (propertiesFrom (fileStream (stringArg (args, ++i))));
...
System.exit (0);
} catch (Exception ex)
{
System.err.println ("Error starting app: " + ex.getMessage ());
System.exit (1);
}
The code just assigns all configuration properties read from a file stream, where the file name is read from the next command line option. I suppose some might argue that it's too highly nested, but I'd say that if you read it from left to right you get close to reading out its meaning: "set all properties from file stream ...".
All the actual logic in the snippet is contained in functions, but the sheer number of failure modes in this code is huge. The options file could be missing, unreadable, unparseable. There could be unknown options, options in the wrong format, options out of range: setAll () has about a dozen option-related failures alone. The command line parameter could be missing. A malformed UTF-8 character could appear in the bowels of the config file. Anything.
This is not even considering all the usual runtime errors such as out of memory exceptions. And yet the code above handles all of these without distraction from its core purpose, which is after all what it actually does 99.99% of its life. And generates good error messages, if you'll take my word for it.
The reason the code is so simple is that all functions either return a valid result, or throw an exception indicating why they couldn't3. No function ever returns a value that could cause an error later on simply by being invalid.
Now the reason I was pondering this is that Joel Spolsky, someone who I have a great deal of respect for, doesn't like this exception-oriented approach at all. The two main reasons he cites are:
1. They are invisible in the source code.
2. They create too many possible exit points for a function.
These, he argues are bad things. Which puzzles me, because, to me, these are good things. I have to assume *any* line could cause an error and any line of code could except out.
This argument has been done to death by the way, not least of all on Joel's own forum. But I'm going to wade in anyway because I don' have anything better to do, and no one reads this blog anyway.
C is a language that follows Joel's prescription for error handling. It doesn't have exceptions, and instead usually uses the function return code to indicate success/failure (and the global errno variable, which is the work of the devil ... don't get me started). You can alternately pass a pointer to an error info variable which gets filled in if there's an error, which I've seen done on occasion. As Spolsky puts it, best practice in C is to "[...] have your functions return error values when things go wrong, and to deal with these explicitly, no matter how verbose it might be."
Both ways require error checks spattered throughout the code. Which, and here is where I wade into the debate swinging wildly, leads to ugly code that doesn't compose nicely - for instance you can't pass one function call as a parameter to another. However, in C this sort of chaining is often confounded anyway by the fact that allocated memory returned by a function must be deallocated.
Do we really like longer, more complex code, where the flow is broken by all the error checking? And this repetition leads programmers to be lazy: "why bother check the result of this malloc()? It's a tiny amount of memory, it'll never fail" ... SEGV.
Is this just my opinion? Quite possibly, but I decided to do my best to code the important logic above in C to see what it would be like as an alternative. The C version tries to keep the important properties of the Java code above, including trying to provide useful error messages (something that many C programs utterly fail at, leading to hilarious "Error: there was an error"-style messages).
Here's my best attempt:
/* better hope message doesn't overflow this */
char *message = char [255];
char *error = NULL;
properties_t *properties = NULL;
char *file_name;
FILE *file_stream = NULL;
if (!(file_name = string_arg (args, i)))
{
sprintf (message, "Missing argument: %s", args [i]);
goto error_handler;
}
if (!(file_stream = fopen (file, "r")))
{
sprintf (message, "Failed to open file %s: %s",
file_name, strerror (errno));
goto errorHandler;
}
error = NULL;
if (!(properties = properties_from (file_stream, &error))
{
sprintf (message, "Error reading properties file: %s", error);
free (error);
goto error_handler;
}
if (error = config->set_all (properties))
{
sprintf (message, "Error setting properties: %s", error);
free (error);
goto error_handler;
}
/* done! */
exit (0);
error_handler:
fprintf (stderr, "Error starting app: %s", result);
if (properties)
free (properties);
exit (1);
I should note that I cut my programming teeth on C, and got to what probably was a pretty reasonable level. But nowadays that may not be the case and the below might be considered bad style. If so, I'd be really interested to know how it could be improved.
Footnotes
1. A new verb I've invented for obsessively tweaking perfectly good code.
2. And simultaneously sad in the eyes of most people reading this. Irony.
3. The fact that GC handles cleanup for us is also a big reason why this code is so simple.