Matt's Mind

Sunday, January 22, 2006

Heatwave hacking

Well, it's officially a heatwave here in Adelaide: four consecutive days over 40° Celsius. And now finally a cool change is moving in, along with smoke from bushfires in various places around the state.

After the last few days a mere 30° feels nice and cool ;) 40° is at the level of heat where everything just feels hot. Put on a t-shirt and it feels like it's just been ironed. And even if you just sit still in the shade you still sweat.

So, a good day for staying inside. And sorting out my photo collection. Yes, I definitely know how to get down and party.

The Problem

For historical reasons I currently have two sets of photos, one old set manually ordered into a "logical" set of folders and the new one in an iPhoto library. These overlap to a great extent, but I'm not totally confident that every photo in the older directory is now in iPhoto. Since there are about 2,500 photos I'm not too keen to match them all up myself. But I need disk space on my laptop, and removing 1.8GB of redundant photos would be handy.

The Solution

The solution is obvious: write a utility in a language I've never used before to find files not in iPhoto and then plug it into a tool I've never used before to browse them. The recipe for a happy day of hacking :/

The Language

The chosen language: Ruby. For various reasons the other obvious candidate, Python, doesn't appeal and Ruby has a number of things to recommend it from my perspective. One, it seems to be Perl Done Right, and I'm an old-time Perl hacker who loves the power of it but hates the actual language. Ruby has proper OO, clean syntax, Perl regular expression goodness, and blocks up the wazoo.

The second reason is Rails – at some point I really want to try this out, and knowing Ruby would obviously be a good idea for that.

Ruby Experience

The Ruby experience was very positive. Apart from the usual thrashing needed to work out the nuts and bolts of a new language, the script came together quite easily, although I really missed the autocomplete and API browsing I get with Java in Eclipse. The ability to jump around the library without messing with a separate web browser is a godsend when learning a new API.

I needed to build a table of JPEG files in the two directory trees and then essentially do a "diff" on them (script is here if you care). For this task, Ruby's libraries were quite sufficient, although somewhat inconsistent. Strange things popped up, like the Array.nitems() method, which actually gives you the number of non-nil elements, whereas Array.size() gives you the total number of elements. That can confuse you if you just browse the method names and don't read the spec fully.
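For the curious, here is a minimal sketch of that kind of "diff". It is not the actual script: it assumes photos can be matched purely by file name (ignoring case), and the method names jpeg_index and missing_photos are made up for illustration.

```ruby
require "find"

# Build a table of JPEG files under root, keyed by lower-cased file name.
# Assumption: the file name alone identifies a photo. If two files in the
# tree share a name, the later one found replaces the earlier one.
def jpeg_index(root)
  index = {}
  Find.find(root) do |path|
    next unless File.file?(path) && path =~ /\.jpe?g\z/i
    index[File.basename(path).downcase] = path
  end
  index
end

# Return the paths of JPEGs under old_root whose names do not appear
# anywhere under library_root.
def missing_photos(old_root, library_root)
  in_library = jpeg_index(library_root)
  jpeg_index(old_root).reject { |name, _path| in_library.key?(name) }.values.sort
end

# Print one path per line when run with two directory arguments.
if __FILE__ == $PROGRAM_NAME && ARGV.size == 2
  puts missing_photos(ARGV[0], ARGV[1])
end
```

Matching on names alone is crude: a photo that was renamed shows up as missing, and two different photos with the same name hide each other. Comparing file sizes or checksums as well would tighten it up.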

Other people have noted that Ruby's libraries can be a little lacking in coherence too. Another example: the Dir class used to access directories is quite good, except the pattern used to scan directories is gratuitously different from a regex. But nothing's perfect, and on the whole I think Ruby is now my language of choice for these sorts of hacking tasks.
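To show what I mean about the Dir pattern, here is a small sketch that finds the same files twice: once with a Dir.glob pattern and once by filtering with a regular expression. The method names jpegs_by_glob and jpegs_by_regex are just made up for the comparison.

```ruby
# Glob syntax: "**" recurses into subdirectories, "*" matches any run of
# characters within a name, and "." is just a literal dot.
def jpegs_by_glob(root)
  Dir.glob(File.join(root, "**", "*.jpg")).sort
end

# Regex syntax for the same idea: list everything, then keep the paths
# that end in ".jpg". Here "." must be escaped to be a literal dot.
def jpegs_by_regex(root)
  Dir.glob(File.join(root, "**", "*")).grep(/\.jpg\z/).sort
end
```

Both methods should return the same list for a given directory. The point is only that "*.jpg" and /\.jpg\z/ say the same thing in two different little languages.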


My script spat out a list of 600 files that appeared not to be in the iPhoto library. This was more than expected, and I needed a way to quickly inspect these and add any missing ones to iPhoto. The obvious route was to get them into Preview, from where I could then drag photos to iPhoto as needed. The obvious way to do this: connect my script and Preview with Apple's Automator.

Now, Automator is pretty cool and at first this looked like a no-brainer. I dropped a "Run Shell Script" action in, told it to run my Ruby utility, then connected an "Open Images in Preview" action. Automator seemed happy with the match. But hitting "Run" made Automator do nothing but fire up an empty Preview and then beep happily.

Of course hardened Mac heads are chuckling at this point since they know my mistake (the smug bastards). For some reason, even though the Mac has gone partly POSIX underneath and uses file/paths/with/slashes sometimes and eschews resource forks (in theory), it still seems to use HFS file:paths:with:colons for Cocoa (GUI) apps. My script was generating POSIX paths, and I needed to feed Preview HFS ones.
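To make the difference between the two path styles concrete, here is a rough sketch of the mapping in Ruby. It is only an illustration under some loud assumptions: posix_to_hfs is a made-up helper, it expects an absolute path, and the startup disk name ("Macintosh HD" by default) has to be supplied because it cannot be read off the POSIX path.

```ruby
# Convert an absolute POSIX path to a colon-separated HFS-style path.
# boot_volume is a hypothetical parameter naming the startup disk.
def posix_to_hfs(posix_path, boot_volume = "Macintosh HD")
  parts = posix_path.split("/").reject(&:empty?)
  if parts.first == "Volumes" && parts.size > 1
    parts.shift                 # drop "Volumes"; the next part is the volume name
  else
    parts.unshift(boot_volume)  # everything else lives on the startup disk
  end
  # A ":" inside a POSIX file name corresponds to "/" on the HFS side.
  parts.map { |part| part.tr(":", "/") }.join(":")
end

puts posix_to_hfs("/Users/matt/Pictures/beach.jpg")
# prints "Macintosh HD:Users:matt:Pictures:beach.jpg"
puts posix_to_hfs("/Volumes/Backup/old/beach.jpg")
# prints "Backup:old:beach.jpg"
```

In practice it is safer to let the system do this conversion than to rebuild paths by hand like this.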

I won't bore you with the Googling that followed, but some time later I had worked out that I needed to plug the AppleScript below in between the script and the Preview task (screenshot here (160K)):

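on run {input, parameters}
    -- input is the list of POSIX path strings from the shell script action
    set output to {}
    repeat with i from 1 to length of input
        set x to item i of input
        -- turn each POSIX path into a file reference the next action understands
        set output to output & {POSIX file x}
    end repeat
    return output
end run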

Once I had grokked this, Automator hummed away and fairly quickly I had Preview showing 600 photos. It turned out most of the "missing" photos were ones I had rotated or otherwise edited in iPhoto. But I did catch a couple that had slipped through the net. Well worth the hours of hacking ;)