OTP: a Functional Approach (or Three)

I intially started the OTP challenge as a fun way to write some OCaml. It was, so much so that I wrote solutions in two other functional languages, Haskell and Elixir. I structured all three sets of programs the same so that I could easily see their similarities and differences. Check out the encrypt program in all three languages and then I’ll share some of my favorite parts. Go ahead, I’ll wait.

Don’t Cross the Streams

One tricky part of the OTP challenge is that you have to cycle over the key if it’s shorter than the plaintext. My initial approaches involved passing around an offset and using the modulo operator, like this:

 <code class="ocaml">let get_mask key index =
 let c1 = List.nth key (index mod (List.length key))
 and c2 = List.nth key ((index + 1) mod (List.length key)) in
 int_from_hex_chars c1 c2

Pretty gross, huh? Fortunately, both Haskell and Elixir have built-in functionality for lazy, cyclical lists, and OCaml (with the Batteries library) has the Dllist (doubly-linked list) data structure. The OCaml code above becomes simply:

 <code class="ocaml">let get_mask key =
 let c1 = Dllist.get key
 and c2 = Dllist.get (Dllist.next key) in
 int_of_hex_chars c1 c2

No more passing around indexes or using mod to stay within the bounds of the array – the Dllist handles that for us.

Similarly, a naïve Elixir approach:

 <code class="elixir">def get_mask(key, index) do
 c1 = Enum.at(key, rem(index, length(key)))
 c2 = Enum.at(key, rem(index + 1, length(key)))
 int_of_hex_chars(c1, c2)

And with streams activated:

 <code class="elixir">def get_mask(key) do
 Enum.take(key, 2) |> int_of_hex_chars

Check out the source code (OCaml, Haskell, Elixir) to get a better sense of cyclical data structures in action.

Partial Function Application

Most programming languages have a clear distinction between function arguments (input) and return values (output). The line is less clear in ML-derived languages like Haskell and OCaml. Check this out (from Haskell’s ghci interactive shell):

 <code class="haskell">Prelude> let add x y = x + y
Prelude> add 5 7

We create a function, add, that (seemingly) takes two arguments and returns their sum.

 <code class="haskell">Prelude> let add5 = add 5
Prelude> add5 7

But what’s this? Using our existing add function, we’ve created another function, add5, that takes a single argument and adds five to it. So while add appears to take two arguments and sum them, it actually takes one argument and returns a function that takes one argument and adds it to the argument passed to the initial function.

When you inspect the type of add, you can see this lack of distinction between input and output:

 <code class="haskell">Prelude> :type add
add :: Num a => a -> a -> a

Haskell and OCaml use a concept called currying or partial function application. It’s a pretty big departure from the C-derived languages most of us are used to. Other languages may offer currying as an option, but this is just how these languages work, out of the box, all of the time.

Let’s see this concept in action. To convert a number to its hex representation, you call printf "%x" num. To convert a whole list of numbers, pass the partially applied function printf "%x" to map, like so:

 <code class="haskell">hexStringOfInts nums = concat $ map (printf "%x") nums

For more info on currying/partial function application, check out Learn You a Haskell for Great Good.

A Friendly Compiler

I learned to program with C++ and Java, where gcc and javac weren’t my friends – they were jerks, making me jump through a bunch of hoops without catching any actual issues (or so teenage Dave thought). I’ve worked almost exclusively with interpreted languages in the intervening 10+ years, so it was fascinating to work with Haskell and OCaml, languages with compilers that catch real issues. Here’s my original decrypt function in Haskell:

 <code class="haskell">decrypt ciphertext key = case ciphertext of
 [] -> []
 c1:c2:cs -> xor (intOfHexChars [c1, c2]) (getMask key) : decrypt cs (drop 2 key)

Using pattern matching, I pull off the first two characters of the ciphertext and decrypt them against they key, and then recurse on the rest of the ciphertext. If the list is empty, we’re done. When I compiled the code, I received the following:

decrypt.hs:16:26: Warning:
 Pattern match(es) are non-exhaustive
 In a case alternative: Patterns not matched: [_]

The Haskell compiler is telling me that I haven’t accounted for a list consisting of a single character. And sure enough, this is invalid input that a user could nevertheless use to call the program. Adding the following handles the failure and fixes the warning:

 <code class="haskell">decrypt ciphertext key = case ciphertext of
 [] -> []
 [_] -> error "Invalid ciphertext"
 c1:c2:cs -> xor (intOfHexChars [c1, c2]) (getMask key) : decrypt cs (drop 2 key)

Elixir’s |> operator

According to Programming Elixir, the pipe operator (|>)

takes the result of the expression to its left and inserts it as the first parameter of the function invocation to its right.

It’s borrowed from F#, so it’s not an entirely novel concept, but it’s certainly new to me. To build our key, we want to take the first argument passed into the program, convert it to a list of characters, and then turn it to a cyclical stream. My initial approach looked something like this:

 <code class="elixir">key = Stream.cycle(to_char_list(List.first(System.argv)))

Using the pipe operator, we can flip that around into something much more readable:

 <code class="elixir">key = System.argv |> List.first |> to_char_list |> Stream.cycle

I like it. Reminds me of Unix pipes or any Western written language. Here’s how I use the pipe operator in my encrypt solution.

* * *

At the end of this process, I think Haskell offers the most elegant code and Elixir the most potential for us at Viget to use professionally. OCaml offers a good middle ground between theory and practice, though the lack of a robust standard library is a bummer, man.

I had a great time writing and refactoring these solutions. I encourage you to check out the code, fork the repo, and take the challenge yourself.

David is Viget's managing development director. From our Durham, NC, office, he builds high-quality, forward-thinking software for PUMA, the World Wildlife Fund, ID.me, and many others.

More posts by David