Parametricity in Go

2013-10-17 (Last Modified: 2013-10-23)

One of my objections to Erlang is that despite paying the price of being a functional language, it often fails to reap the advantages. An example of this is in testability; nominally, a purely functional bit of code ought to be easier to test than the imperative equivalent, because it is just a matter of setting up your parameters and checking the results, with no IO or state in between.

Erlang doesn't make this impossible, but it's less convenient than the brochure promises. The core of your application is generally locked up in the various gen_* handlers. These handlers have very stereotyped ways of being called, which include the full state of the thing being tested. I find this very tedious to test, for two reasons:

1. Every test assertion must define some sort of "complete state" for the handler, which is probably full of real-world concerns. In particular, if the handler has further messages to send, the destinations are often relatively hard-coded somehow: an inconvenient-to-mock Mnesia entry, an atom-registered process name, etc. (Erlang programs end up having a surprising amount of global state like that.)

2. If you want to test some sort of sequence of events, you are responsible for threading the state through the code yourself, or manually invoking the proper gen_* startup functions, or something.

It's possible to refactor your way out of this mess, but in practice it's a lot of work for the reward. So many of the tools you could use in other languages aren't available.

Go, in theory, ought to be harder to test than Erlang, being an imperative programming language. In practice, I'm finding it much easier, and I'm doing a lot more testing in it.

First, the use of explicit channels makes it easier to control the input and output communication of a given piece of code than it is in Erlang, by taking control of the channel values in your test code.
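As a minimal sketch of what "taking control of the channel values" can look like: the worker below only ever sees a receive-only and a send-only channel, so a test can own both ends. The names upperCaser and runUpperCaser are hypothetical, invented for this illustration.

```go
package main

import (
	"fmt"
	"strings"
)

// upperCaser reads messages from in, upper-cases them, and sends them to out.
// It closes out once in has been closed and drained.
func upperCaser(in <-chan string, out chan<- string) {
	for msg := range in {
		out <- strings.ToUpper(msg)
	}
	close(out)
}

// runUpperCaser drives upperCaser the way a test would: it creates both
// channels, feeds the inputs, and collects everything that comes back.
func runUpperCaser(inputs []string) []string {
	in := make(chan string)
	out := make(chan string)
	go upperCaser(in, out)

	// Feed the inputs from a separate goroutine so that we can
	// receive results concurrently on the unbuffered channels.
	go func() {
		for _, s := range inputs {
			in <- s
		}
		close(in)
	}()

	var results []string
	for r := range out {
		results = append(results, r)
	}
	return results
}

func main() {
	fmt.Println(runUpperCaser([]string{"hello", "world"})) // prints [HELLO WORLD]
}
```

No process registry, no mocking of a mailbox; the test simply holds the channels.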

That one is the obvious benefit. More subtly, despite being an entirely non-functional language (in the modern sense), Go has (probably accidentally) picked up a concept from the functional languages called parametricity. The linked introduction is decent even if you're not fluent in Haskell, but the core idea is that there is a duality to the "power" of a data type: every operation the data type supports may make it "more useful" to a user, but correspondingly, when you hand that data type off to a function, you know less about what that function is going to do, because it can do all those things too. Every manipulation that the data type does not permit is dually a manipulation that you, the argument passer, can be guaranteed isn't going to happen.

As I've asserted before (in slightly different words), most mainstream programming languages obsess over offering as much power as possible to the user of a value, and neglect the fact that this leaves the user vulnerable to whatever crazy thing a bit of called code may decide to do to their value.

Go... is still firmly in this mainstream. Trying to treat it as a functional language is merely a path to pain. But it turns out Go does have this particular trick, usually associated with functional languages, in its repertoire. The io hierarchy, with the Reader, ReadWriter, ReadWriteCloser, and related interfaces, provides particularly vivid examples in the standard library, but the technique is generally useful with other interfaces as well. If your function only takes an io.Reader, you are guaranteeing, albeit with an asterisk, that you will not close the stream in the function. (Or write to it, or do anything else that the concrete type being passed in may have permitted you to do.) The asterisk is that type assertions still allow you to penetrate down to the original type, but the solution in your code is of course Don't Do That. Penetrating the interface should be considered a code smell. I say only a smell because there may be legitimate reasons (optimizations, for instance), but it's certainly something to be used sparingly.
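To make the guarantee and its asterisk concrete, here is a small sketch. The trackedStream type and the functions countLines, sneakyClose, and countAndReport are hypothetical names made up for this illustration; only the io, bufio, and strings calls come from the standard library.

```go
package main

import (
	"bufio"
	"fmt"
	"io"
	"strings"
)

// trackedStream is a stand-in for a file or socket that records
// whether anybody has closed it.
type trackedStream struct {
	io.Reader
	closed bool
}

func (s *trackedStream) Close() error {
	s.closed = true
	return nil
}

// countLines only asks for an io.Reader, so the signature promises
// that it reads and does nothing else to the stream.
func countLines(r io.Reader) (int, error) {
	scanner := bufio.NewScanner(r)
	n := 0
	for scanner.Scan() {
		n++
	}
	return n, scanner.Err()
}

// sneakyClose is the asterisk: a type assertion can recover the
// extra power the signature claimed to give up. Don't Do That.
func sneakyClose(r io.Reader) {
	if c, ok := r.(io.Closer); ok {
		c.Close()
	}
}

// countAndReport runs both functions against the same stream and
// reports whether the stream was closed after each one.
func countAndReport(text string) (lines int, closedAfterCount, closedAfterSneak bool) {
	s := &trackedStream{Reader: strings.NewReader(text)}
	lines, _ = countLines(s)
	closedAfterCount = s.closed
	sneakyClose(s)
	closedAfterSneak = s.closed
	return
}

func main() {
	fmt.Println(countAndReport("one\ntwo\nthree\n")) // prints 3 false true
}
```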

Testing a network server in Go is easier than in Erlang because you can follow this pattern:

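func Listen() {
    listener, err := net.ListenTCP(...) // whatever
    if err != nil {
        log.Fatal(err)
    }
    for {
        // AcceptTCP returns a *net.TCPConn, so the TCP-specific
        // methods are available without a type assertion.
        conn, err := listener.AcceptTCP()
        if err != nil {
            continue
        }
        go handleConnection(conn)
    }
}

// watch the type signatures here!
func handleConnection(conn *net.TCPConn) {
    conn.SetNoDelay(false)
    // ... whatever other specific socket stuff you may need...

    // *net.TCPConn satisfies io.ReadWriteCloser implicitly.
    handleService(conn)
}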

// and here:
func handleService(conn io.ReadWriteCloser) {
    for {
        // read input...
        // handle input...
    }
}

Passing conn to handleService basically "drops privs" on the network connection, turning it into just an io.ReadWriteCloser, which is in turn much easier to test with. If you do what I imagine is a common pattern, and spawn a goroutine to drain the socket and send the messages over a channel, it gets even easier to test; you have a chan []byte (or whatever you may convert that into) and an io.ReadCloser, which are trivially drivable in a test suite by direct manipulation and by a bytes.Buffer (wrapped to give it a Close method), respectively. The fiddling with the socket is often very easy code to test directly. In many cases it is the sort of code that simply can't fail, such as the example above; if it type checks at all, SetNoDelay isn't going to fail. It's the code in handleService that actually needs testing.
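Here is one way such a test harness might look. This is a sketch under assumptions: echoService is a hypothetical stand-in for whatever handleService really does, and fakeConn is an invented in-memory type, not anything from the standard library.

```go
package main

import (
	"bufio"
	"bytes"
	"fmt"
	"io"
	"strings"
)

// fakeConn is an in-memory io.ReadWriteCloser for tests. Reads come
// from a canned input; writes are captured so they can be inspected.
type fakeConn struct {
	in     *strings.Reader
	out    bytes.Buffer
	closed bool
}

func (f *fakeConn) Read(p []byte) (int, error)  { return f.in.Read(p) }
func (f *fakeConn) Write(p []byte) (int, error) { return f.out.Write(p) }
func (f *fakeConn) Close() error                { f.closed = true; return nil }

// echoService plays the role of handleService: it answers each input
// line with "echo: <line>" and closes the connection at end of input.
func echoService(conn io.ReadWriteCloser) {
	defer conn.Close()
	scanner := bufio.NewScanner(conn)
	for scanner.Scan() {
		fmt.Fprintf(conn, "echo: %s\n", scanner.Text())
	}
}

// runEcho feeds input to the service and returns what it wrote back,
// along with whether it closed the connection.
func runEcho(input string) (string, bool) {
	conn := &fakeConn{in: strings.NewReader(input)}
	echoService(conn)
	return conn.out.String(), conn.closed
}

func main() {
	out, closed := runEcho("hi\nthere\n")
	fmt.Printf("%q %v\n", out, closed) // prints "echo: hi\necho: there\n" true
}
```

No socket is opened at any point; the service code cannot tell the difference, because its signature never let it ask.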

Specify as little power as possible in your Go function signatures by carefully choosing your interfaces. The fewer operations the Go code can use on an object, the easier it is to test, the easier it is to reuse, and the easier it is for people to use your code with confidence. If you are disciplined, there are even times when it is useful to declare an interface for the sole purpose of "dropping privs" on an object, and consequently reaping those benefits.
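A small illustration of declaring an interface purely to drop privs. Everything here (Account, balanceViewer, summary, fixedBalance) is a hypothetical example, not code from any real project.

```go
package main

import "fmt"

// Account is the full-powered type: it can be mutated.
type Account struct{ cents int }

func (a *Account) Deposit(c int)  { a.cents += c }
func (a *Account) Withdraw(c int) { a.cents -= c }
func (a *Account) Balance() int   { return a.cents }

// balanceViewer exists only to drop privileges: code that receives
// one can look at the balance but has no way to change it.
type balanceViewer interface {
	Balance() int
}

// summary formats a balance. Its signature says it cannot deposit
// or withdraw, short of a type assertion.
func summary(v balanceViewer) string {
	return fmt.Sprintf("balance: %d.%02d", v.Balance()/100, v.Balance()%100)
}

// fixedBalance is a trivial test double; one method is all it needs.
type fixedBalance int

func (f fixedBalance) Balance() int { return int(f) }

func main() {
	acct := &Account{}
	acct.Deposit(1250)
	fmt.Println(summary(acct))            // prints balance: 12.50
	fmt.Println(summary(fixedBalance(5))) // prints balance: 0.05
}
```

The narrow interface pays off twice: the caller knows summary won't touch the money, and the test double shrinks to a one-line type.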

This is not impossible in other languages with interfaces, but because Go's interfaces are satisfied implicitly, it is the second easiest language I know of to do this in. (Haskell still beats it; Go's interfaces are a bit more convenient than Haskell's typeclasses, though I believe no more powerful, but Go's slight advantage there is then lost by the fact that Haskell has pervasive parametricity in all type signatures, not just in its typeclasses, from which it then offers further guarantees.) Anything that implements io.ReadWriteCloser is already an io.Reader, no separate declarations or casting required. I'm pretty sure that most languages that could theoretically support this idea would require the user to typecast down on the interface (possibly unsafely), making it something you have to think about instead of just doing, and therefore making it something that doesn't happen. Making it so easy to refuse power is a very powerful technique for making correct and testable code.
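For the record, this is all the "narrowing" amounts to in practice. The sketch below uses hypothetical names (memStream, firstWord, demo); the point is only that the io.ReadWriteCloser value is passed where an io.Reader is expected with no conversion written anywhere.

```go
package main

import (
	"bytes"
	"fmt"
	"io"
)

// memStream is an in-memory stream; embedding bytes.Buffer gives it
// Read and Write, and Close makes it an io.ReadWriteCloser.
type memStream struct{ bytes.Buffer }

func (m *memStream) Close() error { return nil }

// firstWord needs nothing more than the ability to read.
func firstWord(r io.Reader) string {
	var w string
	fmt.Fscan(r, &w)
	return w
}

// demo writes text into a full-powered stream and then hands the
// same value to a function that only accepts the narrower interface.
func demo(text string) string {
	var rwc io.ReadWriteCloser = &memStream{}
	io.WriteString(rwc, text)
	return firstWord(rwc) // no cast, no adapter, no declaration
}

func main() {
	fmt.Println(demo("hello world")) // prints hello
}
```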