Friday, February 01, 2013

Crafting Code in Clojure

The other day, I was working on a little bit of code in Clojure, just touching up some exception reporting, when I was suddenly struck by one of the fundamental reasons that Clojure is so enjoyable to code in. Clojure is craftable: that is, in Clojure you have the option to craft your code to make it more concise, easier to read, and easier to maintain. That is not the case for all, or perhaps even most, programming languages.

In my case, I was constructing an error message where I needed to convert the keys of two maps into a comma-separated string (I don't like to say "you guessed wrong" without saying "here's what you could have said").

What I want my code to do is easily expressed as an informal recipe:

  • Extract all the keys from both maps
  • Remove any duplicates
  • Convert the keys to strings
  • Sort the strings into ascending order
  • Build and return one big string, by concatenating all the key strings, using ", " as a separator
  • Return "<none>" if both maps are empty

If I were writing this in Java, it would look something like this:

There's enough looping and conditional logic in this code (along with tip-toeing around Java Generics) that it's easier to look at its test specification (written in Spock) to see what it is supposed to do:

The first pass at a Clojure version is already simpler than the Java version ...
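A sketch of what such a first pass might look like, following the recipe step by step. The function name `summarize-keys` is hypothetical, and the `s` alias for `clojure.string` is an assumption:

```clojure
(require '[clojure.string :as s])

;; Hypothetical name; a step-by-step translation of the recipe.
(defn summarize-keys
  [map1 map2]
  (let [all-keys (set (concat (keys map1) (keys map2)))]
    (if (empty? all-keys)
      "<none>"
      (let [key-names    (map str all-keys)
            sorted-names (sort key-names)]
        (s/join ", " sorted-names)))))

;; Keyword keys keep their leading colon when passed through str.
(summarize-keys {:b 1} {:a 2, :b 3}) ; => ":a, :b"
(summarize-keys {} {})               ; => "<none>"
```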

I couldn't resist using the clojure.string/join function, rather than building the string directly (which would be slightly tedious in Clojure). In many ways, this is a lot like the Java version; we're using let to create local symbols for each step in the process in just the same way that the Java version defines local variables for each step.

However, there's room for improvement here. Let's start to craft.

For example, let's assume that both maps being empty is rare, or at least, that the cost of sorting an empty list is low (it is!). Our code becomes much more readable if we merge it into one big let:
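Roughly, the merged version might read as follows. This is a sketch, with `summarize-keys` as a hypothetical function name and `s` as an assumed alias for `clojure.string`:

```clojure
(require '[clojure.string :as s])

;; Hypothetical name; every step now lives in a single let.
(defn summarize-keys
  [map1 map2]
  (let [all-keys     (set (concat (keys map1) (keys map2)))
        key-names    (map str all-keys)
        sorted-names (sort key-names)]
    ;; The empty check moves to the end; sorting an empty list is cheap.
    (if (empty? sorted-names)
      "<none>"
      (s/join ", " sorted-names))))

(summarize-keys {:b 1} {:a 2, :b 3}) ; => ":a, :b"
(summarize-keys {} {})               ; => "<none>"
```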

Now we're getting somewhere. I think this version makes it much clearer what is going on than the prior Clojure version, or the Java version.

However, if you've written enough code, you know one of the basic rules of all programming: names are hard. Anything that frees you from having to come up with names is generally a Good Thing. In Java, we have endless names: not just for methods and variables, but for classes and interfaces ... even packages. Long years of coding Java have made me dread naming things, because names never quite encompass what a thing does, and often become outdated as code evolves.

So, what names can we get rid of, and how? Well, if we look at the structure of our code, we can see that each step creates a value that is passed to the next expression as the final parameter. So all-keys is passed as the last parameter of the (map) expression, resulting in key-names, and then key-names is passed as the last parameter of the (sort) expression. In fact, ignoring the empty check for a moment, the sorted-names value is passed to the (s/join) expression as the last parameter as well.

This is a very important concept in Clojure; you may have heard people trying to express that you code in Clojure in terms of a "flow" of data through a series of expressions. Well, you've just seen a very small example of this.

In fact, it is no simple coincidence that the last parameter is so important; this represents a careful and reasoned alignment of the parameters of many different functions in clojure.core and elsewhere, to ensure that flow can be passed as that final parameter, because it becomes central to the ability to combine functions and expressions together with minimal fuss.
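A few calls side by side show the convention; the `names` vector is made up purely for illustration:

```clojure
(require '[clojure.string :as s])

;; Illustrative data only.
(def names ["pear" "apple" "fig"])

;; In each call, the collection being worked on is the final argument.
(map count names)               ; => (4 5 3)
(filter #(> (count %) 3) names) ; => ("pear" "apple")
(sort-by count names)           ; => ("fig" "pear" "apple")
(s/join ", " names)             ; => "pear, apple, fig"
```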

We can use the ->> macro (pronounced "thread last") to rebuild our flow without having to come up with names for each step:
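A sketch of how that might look, with `summarize-keys` as a hypothetical function name and `s` as an assumed alias for `clojure.string`. The intermediate names `all-keys` and `key-names` disappear; only `sorted-names` is left:

```clojure
(require '[clojure.string :as s])

;; Hypothetical name; ->> feeds each result in as the last argument
;; of the next expression.
(defn summarize-keys
  [map1 map2]
  (let [sorted-names (->> (set (concat (keys map1) (keys map2)))
                          (map str)
                          sort)]
    (if (empty? sorted-names)
      "<none>"
      (s/join ", " sorted-names))))

(summarize-keys {:b 1} {:a 2, :b 3}) ; => ":a, :b"
(summarize-keys {} {})               ; => "<none>"
```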

The ->> macro juggles our expressions into an appropriate order; without it we'd have to deeply nest our expressions in an unreadable way: (sort (map str (set (concat (keys map1) (keys map2))))). Even with a short flow of expressions, that's hard to parse and interpret, so ->> is an invaluable and frequently used tool in the Clojure toolbox.
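To see that the two forms really are the same thing, here is a small check; the sample maps are made up for illustration, and `clojure.walk/macroexpand-all` is used only to peek at what the macro writes for us:

```clojure
(require '[clojure.walk :as walk])

;; Illustrative data only.
(def map1 {:b 1})
(def map2 {:a 2, :b 3})

;; Nested form: read from the inside out.
(def nested
  (sort (map str (set (concat (keys map1) (keys map2))))))

;; Threaded form: read from top to bottom.
(def threaded
  (->> (set (concat (keys map1) (keys map2)))
       (map str)
       sort))

(= nested threaded) ; => true

;; Expanding the macro reveals the nesting it produces.
(walk/macroexpand-all '(->> x (map str) sort)) ; => (sort (map str x))
```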

We can continue to craft; the first expression (that builds the set from the keys), can itself be broken apart into a few smaller steps. This is really to get us ready to do something a bit more dramatic:
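One possible way to split that first expression, sketched with `summarize-keys` as a hypothetical function name and `s` as an assumed alias for `clojure.string`; the exact choice of steps here is a guess at the shape, not a definitive listing:

```clojure
(require '[clojure.string :as s])

;; Hypothetical name; the set-building expression becomes three steps.
(defn summarize-keys
  [map1 map2]
  (let [sorted-names (->> [map1 map2]
                          (map keys)     ; one key sequence per map
                          (apply concat) ; flatten into a single sequence
                          set            ; drop duplicates
                          (map str)
                          sort)]
    (if (empty? sorted-names)
      "<none>"
      (s/join ", " sorted-names))))

(summarize-keys {:b 1} {:a 2, :b 3}) ; => ":a, :b"
(summarize-keys {} {})               ; => "<none>"
```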

This is getting ever closer to our original recipe; you can more clearly see the extraction of keys from the maps before building the set (which is only used to ensure key uniqueness), before continuing on to convert keys from objects to strings, sort them, and combine the final result.

In fact, we're going to go beyond our original brief, and support any number of input maps, not just two:
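A sketch of that variadic version, again with `summarize-keys` as a hypothetical function name and `s` as an assumed alias for `clojure.string`:

```clojure
(require '[clojure.string :as s])

;; Hypothetical name; & maps collects any number of input maps.
(defn summarize-keys
  [& maps]
  (let [sorted-names (->> maps
                          (mapcat keys) ; all keys of all maps, in one sequence
                          set
                          (map str)
                          sort)]
    (if (empty? sorted-names)
      "<none>"
      (s/join ", " sorted-names))))

(summarize-keys {:c 1} {:a 2} {:b 3, :a 4}) ; => ":a, :b, :c"
(summarize-keys)                            ; => "<none>"
```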

The mapcat function is like map, but expects that each invocation will create a collection; mapcat concatenates all those collections together ... just what we want to assemble a collection of all the keys of all the input maps.

At this point, we don't have much more to go ... but can we get rid of the sorted-names symbol? In fact, we can: what if part of our flow replaced the empty list with a list containing just the string "<none>"? It would look like this:
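Here is a sketch of that idea. Both `summarize-keys` and the small helper `default-if-empty` are hypothetical names, and `s` is an assumed alias for `clojure.string`; the helper takes the collection as its last parameter so that it slots into the flow:

```clojure
(require '[clojure.string :as s])

;; Hypothetical helper: swap an empty collection for a default one.
(defn default-if-empty
  [default coll]
  (if (empty? coll) default coll))

;; Hypothetical name; no local symbols remain, just the flow.
(defn summarize-keys
  [& maps]
  (->> maps
       (mapcat keys)
       set
       (map str)
       sort
       (default-if-empty ["<none>"])
       (s/join ", ")))

(summarize-keys {:c 1} {:a 2} {:b 3, :a 4}) ; => ":a, :b, :c"
(summarize-keys {} {})                      ; => "<none>"
```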

... and that's about as far as I care to take it; a clean flow starting with the maps, and going through a series of expressions to transform those input maps into a final result. But what's really important here is just how fast and easy it is to start with an idea in Clojure and refine it from something clumsy (such as the initial too-much-like-Java version) into something elegant and surgically precise, such as the final version.

That's simply not something you can do in less expressive languages such as Java. For example, Tapestry certainly does quite a number of wonderful things, and supports some very concise and elegant code (especially in green code) ... but that is the result of organizing large amounts of code in service of specific goals. We're talking tons of interfaces, a complete Inversion-Of-Control container, and runtime bytecode manipulation to support that level of conciseness. That's the hallmark of a quite consequential framework.

That isn't crafting code; that's a big engineering effort. It isn't local and invisible, it tends to be global and intrusive.

In Java, your only approach to simplifying code in one place is to build up a lot of complexity somewhere else.

That is simply not the case in Clojure; by adopting, leveraging, and extending the wonderful patterns already present in the language and its carefully designed standard library, you can reach a high level of readability. You are no longer coding to make the compiler happy; you are in control, because the Clojure language gives you the tools you need to be in control. And that can be intoxicating.

The source code for this blog post is available on GitHub.