The Environment Object Pattern in Go

2014-01-23 (Last Modified: 2014-02-25)

One of the things I've been really enjoying about Go is how easy testing is. The pervasive use of interfaces and composition-instead-of-inheritance synergize nicely for testing. But as I've expressed this online on reddit and Hacker News a couple of times, I've found that this does not seem to be a universally-shared opinion. Some have even commented on how hard it is to test in Go.

Since we are all obviously using the same language, the difference must lie in coding behavior. I've internalized a lot of testing methodology over the years, and I find some of it works even better in Go than in most other imperative languages. Let me share one of my core tricks today, which I will call the Environment Object pattern, and why Go makes it incrementally easier to use than other similar (imperative) environments.

So... global variables. All programmers will verbally agree they are bad, yet mysteriously, despite the fact that nobody ever writes one, they continue to show up in code bases and libraries. (I blame evil code fairies.) In Go, they are still bad, and there is even less reason than ever to use them. Instead, I create an "environment" object that contains everything I was tempted to put in a global variable, and pass it around.

Suppose I have a logger, and some form of "registry" which will track resources by some string identifier:

package logging

type Logger interface {
    Error(s string)
    Info(s string)
}

type RealLogger struct {}

func (rl RealLogger) Error(s string) {
    print("ERROR: " + s + "\n")
}

func (rl RealLogger) Info(s string) {
    print("INFO: " + s + "\n")
}

type NullLogger struct {}

func (nl NullLogger) Error(s string) { }
func (nl NullLogger) Info(s string) { }

package registrar

import "sync"

type Registrar struct {
    values map[string]interface{}
    sync.Mutex
}

func (r *Registrar) Register(k string, v interface{}) {
    if r == nil {
        return
    }

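    r.Lock()
    defer r.Unlock()
    if r.values == nil {
        // Allocate on first use so a zero-value Registrar does not panic.
        r.values = make(map[string]interface{})
    }
    r.values[k] = v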
}

func (r *Registrar) Lookup(k string) interface{} {
    if r == nil {
        return nil
    }

    r.Lock()
    defer r.Unlock()
    return r.values[k]
}

Many are tempted to create a global Logger and a global Registrar instance, and in one fell swoop, they have made all their code that much harder to test. Now test code must manage these globals, potentially stomping on each other, potentially ruining the ability to run in parallel, and worst of all, potentially combining with dozens of other global variables which tests must manage, combining to form an essentially-untested code environment. (Or hundreds, or thousands....)

Instead of doing that, I've been creating a $PROJECT/environment package, which gathers all these things together:

package environment

import (
    "WHATEVER/logging"
    "WHATEVER/registrar"
)

type Env struct {
    logging.Logger
    *registrar.Registrar
}

// The parameter is named reg so it does not shadow the registrar package.
func Environment(logger logging.Logger, reg *registrar.Registrar) *Env {
    if logger == nil {
        logger = logging.NullLogger{}
    }

    return &Env{Logger: logger, Registrar: reg}
}

func Null() *Env {
    return Environment(nil, nil)
}

There are two subtle keys to making this approach powerful:

1. Every element of the environment has a "null" implementation: a logger that doesn't log (NullLogger), a registrar that doesn't register (here a nil pointer of type *Registrar, whose methods you can still call as shown above), and so on. If the caller fails to pass in a Logger, Environment substitutes the null logger, and a nil *Registrar already behaves sensibly by doing nothing.
2. The environment.Null() function tracks the environment as it grows, which makes it trivial to always obtain a "fully null" environment.
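
To make the "null" behavior concrete, here is a rough single-file sketch. It assumes the three packages above are collapsed into package main purely so it runs on its own, and the registrar allocates its map lazily so a zero-value Registrar is usable; the variable name live is mine. Treat it as an illustration, not the layout of a real project.

```go
package main

import (
    "fmt"
    "sync"
)

// Logger and NullLogger mirror the logging package above.
type Logger interface {
    Error(s string)
    Info(s string)
}

type NullLogger struct{}

func (NullLogger) Error(string) {}
func (NullLogger) Info(string)  {}

// Registrar mirrors the registrar package; a nil *Registrar is the null implementation.
type Registrar struct {
    sync.Mutex
    values map[string]interface{}
}

func (r *Registrar) Register(k string, v interface{}) {
    if r == nil {
        return // null registrar: silently drop the value
    }
    r.Lock()
    defer r.Unlock()
    if r.values == nil {
        r.values = make(map[string]interface{})
    }
    r.values[k] = v
}

func (r *Registrar) Lookup(k string) interface{} {
    if r == nil {
        return nil // null registrar: nothing is ever found
    }
    r.Lock()
    defer r.Unlock()
    return r.values[k]
}

type Env struct {
    Logger
    *Registrar
}

func Environment(logger Logger, reg *Registrar) *Env {
    if logger == nil {
        logger = NullLogger{}
    }
    return &Env{Logger: logger, Registrar: reg}
}

func Null() *Env { return Environment(nil, nil) }

func main() {
    env := Null()
    env.Info("goes nowhere")              // NullLogger discards the message
    env.Register("db", 42)                // nil *Registrar ignores the call
    fmt.Println(env.Lookup("db") == nil)  // prints "true"

    live := Environment(nil, &Registrar{})
    live.Register("db", 42)
    fmt.Println(live.Lookup("db")) // prints "42"
}
```

The promoted calls on the null environment work because Register and Lookup have pointer receivers, so calling them through the nil embedded pointer never dereferences it.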

By no means is this anything like a novel idea. But Go does have two attributes that make it easier than many other imperative languages. First, the way method resolution works with struct embedding means that you will often simply embed the environment itself into some other object, which then means you can directly call all "environment" methods on the local reference to the object itself. If you have an s *Server that you are currently writing a method for, logging isn't even s.Env.Info("..."), it's just s.Info("...") if you embed the *Env into the Server itself. That can take fewer characters than dereferencing a global object would! I find that in practice, while an Env is constructed in the main function and does generally have to be passed down a level or two, you do not end up threading it throughout your entire application by hand; usually there's somewhere you can ride the object composition (and you should be anyhow).
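
A small sketch of that embedding, under some assumptions: Env is trimmed to just the logger, PrintLogger is a stand-in that writes to stdout, and the Server type with its Addr field and Start method is hypothetical.

```go
package main

import "fmt"

type Logger interface {
    Error(s string)
    Info(s string)
}

// PrintLogger is a hypothetical stand-in for RealLogger that writes to stdout.
type PrintLogger struct{}

func (PrintLogger) Error(s string) { fmt.Println("ERROR: " + s) }
func (PrintLogger) Info(s string)  { fmt.Println("INFO: " + s) }

// Env is trimmed to the logger for this example.
type Env struct {
    Logger
}

// Server is a hypothetical consumer that embeds the environment.
type Server struct {
    *Env
    Addr string
}

// Start logs through the promoted Info method and returns the address.
func (s *Server) Start() string {
    s.Info("starting on " + s.Addr) // promoted through *Env, then Logger
    return s.Addr
}

func main() {
    s := &Server{Env: &Env{Logger: PrintLogger{}}, Addr: ":8080"}
    s.Start()
    s.Env.Info("same method, spelled out in full")
}
```

Both calls reach the same logger; the short form just relies on Go promoting methods through two levels of embedding.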

Second, it is trivial to compose an Environment like this, because multiple composition is a great deal easier to work with than multiple inheritance would be if you tried to do the equivalent thing; indeed, multiply inheriting from what can easily become a dozen or two classes would simply be considered insane in the inheritance-based OO world. Each subcomponent of the environment remains fully isolated from the others, despite the convenient packaging. Yes, if two subcomponents have methods with the same name the convenience is a bit lessened, but you can still reach both with a fully-qualified name if you need to, and the compiler rejects an ambiguous unqualified call if you make one accidentally.
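
Here is roughly what the name-collision case looks like. Cache and Metrics are made-up subcomponents that both happen to have a Name method; they are embedded at the same depth, which is the situation where Go treats the unqualified selector as ambiguous.

```go
package main

import "fmt"

// Cache and Metrics are hypothetical subcomponents with a clashing method name.
type Cache struct{}

func (Cache) Name() string { return "cache" }

type Metrics struct{}

func (Metrics) Name() string { return "metrics" }

// Env embeds both; the clash is not an error until someone calls Name unqualified.
type Env struct {
    Cache
    Metrics
}

func main() {
    env := Env{}

    // env.Name() would fail to compile with an "ambiguous selector" error.

    fmt.Println(env.Cache.Name())   // qualified through the embedded field
    fmt.Println(env.Metrics.Name()) // the other component is still reachable
}
```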

This often gets an object about 90% decoupled from its "environment", leaving only a socket simulation or something equally simple to write in order to create a "testing environment" for your object. Should your test code need a particular bit of the environment, you can call environment.Null(), then change the one or two elements you care about, since they are all public anyhow. Even as the environment struct evolves and grows, only changes to what the test itself cares about should affect your test.
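
As a sketch of that testing style: start from the null environment and swap in only the piece the test cares about. RecordingLogger, Worker, and Process are all hypothetical names invented for the example, and Env is again trimmed to the logger.

```go
package main

import "fmt"

type Logger interface {
    Error(s string)
    Info(s string)
}

type NullLogger struct{}

func (NullLogger) Error(string) {}
func (NullLogger) Info(string)  {}

// Env is trimmed to the logger for this example.
type Env struct {
    Logger
}

func Null() *Env { return &Env{Logger: NullLogger{}} }

// RecordingLogger is a hypothetical test double that remembers what was logged.
type RecordingLogger struct {
    Errors []string
    Infos  []string
}

func (r *RecordingLogger) Error(s string) { r.Errors = append(r.Errors, s) }
func (r *RecordingLogger) Info(s string)  { r.Infos = append(r.Infos, s) }

// Worker is hypothetical code under test.
type Worker struct {
    *Env
}

// Process rejects empty input and logs what it did either way.
func (w *Worker) Process(input string) bool {
    if input == "" {
        w.Error("empty input")
        return false
    }
    w.Info("processed " + input)
    return true
}

func main() {
    rec := &RecordingLogger{}
    env := Null()    // everything null...
    env.Logger = rec // ...except the one element this test inspects

    w := &Worker{Env: env}
    w.Process("")
    w.Process("job-1")
    fmt.Println(rec.Errors, rec.Infos) // prints "[empty input] [processed job-1]"
}
```

In a real test file the body of main would live in a Test function and the final Println would become assertions; nothing else about the setup needs to change when new fields are added to Env, as long as Null() keeps filling them in.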

I played a bit with decomposing the environment object into smaller pieces itself, but because of the convenience of the null environment and the fact that all the client code tended to just compose the pieces right back together again, in my experience it's just fine to have a dozen subcomponents in your environment, since they don't interact.

There is almost never a reason to have a "global variable" in a Go program.

Further, if you are writing a library in which you are tempted to offer a global variable to the user, remember just how darned easy it is for me to embed a struct into one of my data types and get "direct access" to your methods. Even what little convenience you think you are offering may be illusory! If nothing else, remember that I can easily get a struct from you and make it a global myself, but if you make something global and hard-code your library to make use of it, it is essentially impossible for me as a user to "deglobalize" it.

Update, Feb 25, 2014: I'm currently playing with whether all consumers of an environment should instead define an interface containing the functions they actually need, to further decouple the environment from the user. In my current app the "environment" is just the one app, but as I move out into a framework situation, hard-coding the Env type may become inconvenient. If an environment consumer declares an interface instead, multiple environments could be trivially passed in by different users.
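
One way that experiment might look, as a sketch rather than a settled design: the consumer declares the small interface it needs, and any environment with matching methods can be passed in. Greeter, greeterEnv, and otherEnv are hypothetical names, and Env is trimmed to the logger.

```go
package main

import "fmt"

type Logger interface {
    Error(s string)
    Info(s string)
}

// PrintLogger is a hypothetical stand-in for RealLogger that writes to stdout.
type PrintLogger struct{}

func (PrintLogger) Error(s string) { fmt.Println("ERROR: " + s) }
func (PrintLogger) Info(s string)  { fmt.Println("INFO: " + s) }

// Env is the application's environment, trimmed to the logger.
type Env struct {
    Logger
}

// greeterEnv lists only what Greeter actually uses from its environment.
type greeterEnv interface {
    Info(s string)
}

// Greeter embeds the interface instead of the concrete *Env.
type Greeter struct {
    greeterEnv
}

func (g Greeter) Greet(name string) string {
    g.Info("greeting " + name)
    return "hello, " + name
}

// otherEnv is a second, unrelated environment that happens to fit greeterEnv.
type otherEnv struct {
    lines []string
}

func (o *otherEnv) Info(s string) { o.lines = append(o.lines, s) }

func main() {
    app := Greeter{greeterEnv: &Env{Logger: PrintLogger{}}}
    fmt.Println(app.Greet("gopher"))

    other := &otherEnv{}
    framework := Greeter{greeterEnv: other}
    fmt.Println(framework.Greet("reader"), len(other.lines))
}
```

The *Env from the app satisfies greeterEnv through its promoted Info method, so existing callers would not need to change; the cost is one small interface declaration per consumer.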