Functors and Monads For People Who Have Read Too Many "Tutorials"

Title is literally true. This may not be the best place to learn about these concepts for the first time, because I'm going to focus on knocking down the misconceptions about them.

Then again, it may not be the worst place, for the same reason.

I had promised myself I would not add to the pile of functor or monad "tutorials", but I've been worn down. I gave up when I saw a reddit comment complaining about how Functor was "too hard to understand", which made me sad, because the correct response to the Functor interface is, "That's it?". And while Monad is legitimately a bit more interesting and complex, the correct response to that is not that different.

I am aware of the notorious effect where people "get" monads and then post their own idiosyncratic takes on them. In my defense, this isn't something I'm writing just after my "ah ha!" moment; I've understood them in Haskell's context for many years now, and actually... this isn't even about that "ah ha!" moment at all. This is only about what they are. Even if you completely understand everything I write in this post, the real "ah ha!", where you realize just how useful the libraries built up around the monad interface are, the first time you search for a type on Hoogle thinking "this should exist" and it turns out it does in fact exist already, is still in your future. In fact I'm quite deliberately not trying to convey that feeling, in the interests of getting at simply what the monad interface is. Which isn't, strictly speaking, a prerequisite to that experience, but it does help.

What Is Functor?

Fact: If you have a source of Blobs, and a function that can turn a Blob into a Thing, you can put them together to create a source of Things.

That is not profound. It is mundane. Examples:

  • If I have A: a list of integers, and B: a function to convert integers to strings, I can easily create C: a list of strings.
  • If I have A: a hash table of integers to strings, and B: a function to convert strings to booleans, I can have C: a hash table of integers to booleans.
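To underline how mundane this is, here are those two examples as a Python sketch (Python purely for illustration; all the names are mine):

```python
# A: a list of integers, B: a function from integers to strings,
# C: a list of strings.
ints = [1, 2, 3]
to_string = str
strings = [to_string(n) for n in ints]
# strings == ["1", "2", "3"]

# A: a hash table of integers to strings, B: a function from strings
# to booleans, C: a hash table of integers to booleans (same keys).
table = {1: "yes", 2: "no"}
def is_yes(s):
    return s == "yes"
bools = {k: is_yes(v) for k, v in table.items()}
# bools == {1: True, 2: False}
```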

Duh. Obviously. The only reason it sounds even slightly complicated is just that English doesn't have a good equivalent for a "variable", so it always sounds complicated when you have a sentence with them. The concept itself is, if not ELI5, well within the ELI12 range.

I don't say how I get from A and B to C. Each case is different, but simple every time. Takes more English words to say it than implementing it in your favorite language would.

What is functor then? Functor is an interface (or "trait", "concept", "typeclass", or whatever your language favors; "interface" from now on, as I am deliberately trying to use more common terminology) that allows you to refer to the fact above as a first-class language element. In Haskell's case, that interface is called Functor, and a type implements it by providing an fmap implementation. For the examples above, plus one more, that function will:

  • List implementation: Take the function in B and run it on all my elements and return a new list, commonly called map.
  • Hash table implementation: Take the function in B and run it on all my values, return a new hash table with the same keys and new values.
  • Function implementation: Return a function that runs the function in A, then runs the function in B on the result. This is commonly called function composition.

The last one is slightly tricky, but if you think of functions in their metaphor as black boxes that take input on the left and provide their output on the right, that's just taking the second function and slamming it on the right side of the first to create a new one. This is why I say "source of" something; the principle here is very general and extends beyond functions, containers, and all the other convenient words I have... it applies to anything you might obtain a value from.
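That "slamming on the right side" can be sketched in Python too (illustrative only; fmap_function is a name I made up for the functor operation on plain functions):

```python
# The functor operation for a function-as-source is just composition:
# given f: X -> A (the source) and g: A -> B (the converter),
# produce a new source X -> B.
def fmap_function(g, f):
    return lambda x: g(f(x))

length = len                          # a "source" of ints, given a string
def describe(n):
    return f"{n} chars"               # int -> string

source_of_descriptions = fmap_function(describe, length)
# source_of_descriptions("hello") == "5 chars"
```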

That's It?

Yes. That's it. That's Functor.

That is literally all it is: a way of discussing that simple operation coherently across a whole bunch of types.

Why Is It So Confusing Then?

Because even setting aside the occasional outright writer confusion, it's easy to get too excited in a blog post and end up conflating what a functor is with what we can build on it.

I'm going to repeatedly analogize to a more commonly-understood interface throughout the rest of this post: Iterator. Why is Functor "important" or "interesting"? For the same reason the iterator interface is. Iterators existed since the beginning of programming, just as "functors" did. Individual iterator implementations are typically quite boring, sometimes even outright trivial, like iterating along an array. But if you have a whole bunch of ad-hoc implementations of "iterators" that don't fit together, you can't create an itertools package. Once you have a common contract and a coherent name, you can start building on top of what used to be a chaotic mess of almost-but-not-quite-the-same implementations.

As a modern example of a language missing this, Go lacks any sort of language or library support for "iterators", and consequently there are at least three iteration patterns in common use and a modestly common antipattern (using range over channels, which has decent semantics but horrible performance for most use cases). Lacking this concept, Go also lacks anything like itertools or a similarly-useful library.

It is exactly the same with functor. Just as a whole variety of useful functions can now be written that only know they are taking "an iterator", having a named, coherent "functor" concept in the language means functions can be written that take advantage of that common functionality.

It is important not to credit the complexity of particular implementations to the complexity of the interface. Behind an interface of function () string could lie a constant return of some particular string, or a series of events, fetching a web page and returning its text, that would take nearly a full book to fully explain... but the latter's complexity should be accounted solely to that particular implementation, not the function () string interface it conforms to.

Similarly, a functor may be as simple as mapping the incoming function over a list, or it may require an arbitrarily-complicated traversal over a data structure like a skip list and construction of a new one. But that is not a complicated operation because "functors are complicated", it is a complicated operation because skip lists are complicated. The functor interface itself does not account for the complication.

It is important not to credit the complexity of what we can do with iterators to the interface itself. It is possible to take three iterators, reverse one of them, combine the other two with a reduction function, and take only the prime-indexed values from the result, but that doesn't affect the fact that an iterator is just a function you call to get "the next value".

Please allow me to target some other specific misconceptions:

"Functor" Is Not A Noun

For their own good reasons, mathematicians use "functor" as a concrete noun. Haskell copied this, hence how often you hear about "a functor".

However, interfaces are adjectives that describe a data structure, which is why in many languages they are often named with -able suffixes (Iterable, Serializable, etc.) Programmers are better off thinking of it as Functable or Fmapable or something like that.

Unfortunately, we do not have a pre-existing verb form for "using the Functor interface" the way we have "iterate on" for Iterable or "serialize" for Serializable. The closest we have is "map on", and that is deceptive because that term usually evokes the mental image of some concrete data structure, possibly only and exactly a list, being mapped on, despite the operation being much more general. I recommend taking a nonsense term like Functable and owning it, rather than jamming a square concept into a round concept hole.

If you find yourself confused, I recommend literally saying "Functable" to yourself. Eventually you will naturally start to hear Functor as just a synonym for how you say it to yourself. You may also consider "Functionable", which is perilously close to "functional", but carries the idea of "we can apply a function to it", albeit a very particular function with a certain signature.

In the context of the post you are reading right now, this may feel like nitpicking. However, I think it is actually a very significant contributor to the problem people have with understanding the functor and monad interfaces. I've tried to carefully use those words in this post consistently as adjectives, because when you use them as nouns it becomes very confusing as you try to treat them as you treat most nouns. A lot of people speak as if you can have "a functor" or "a monad" in your hand, and start trying to talk about what they "do" stripped away from the context of the specific implementation in question. But you can't do that, any more than you can speak about the specific implementation of "an iterator" stripped away from what it is a specific implementation for. Treating "monad" and "functor" as nouns leads to a lot of confusion.

"Functors" Are Not Always Containers

Functor can be applied to a number of things that aren't containers. We've seen that already with functions. It is a common mistake in non-Haskell implementations to assume that functor implementations can only go on containers.

When I bring this up online, inevitably someone says something like "Well, if you squint hard enough, functions are containers, and those other things are containers, and so on, so they do always work on containers." To that I say, if everything is a container, then the word container has no meaning. When you're rendering useful words meaningless just to conform to your misconceptions, you're stepping into a maze of mirrors that is only going to confuse you. The most sensible approach is to discard the misconception and understand that functors are not always containers.

Plenty of things are "functionable" that are not containers.

Haskell Confuses OO Programmers Trying To Read Naively

It doesn't help that

fmap (\x -> x + 1) [1, 2, 3]

looks like you are calling the function "fmap" with two parameters, a function to add one to integers and a list of integers. This can compound confusion when functor is thought of as a noun, because, what is fmap? Maybe it's a part of "functor"? In OO languages, functions do not have any sort of interface resolution on them.

But if we represent this in traditional OO notation, it actually looks like:

[1, 2, 3].fmap(x => x + 1)

that is, a method call on a list. In fact it isn't even particularly relevant at this point that "fmap" participates in an interface, because this is all statically-resolvable at compile time, just as it would be in many other languages. Laid out this way, an OO programmer has no problem imagining that resolving "fmap" may involve looking at the definition of the list itself.

This is not how Haskell resolves names; it isn't an object-oriented language. But if you don't want to learn Haskell and merely want to learn about what all this "monad" and "functor" hype is, this is a better metaphor for what Haskell is doing than thinking of fmap as some sort of weird magical function. It's closer to a method name resolving to an "fmap" implementation on a particular data structure. (Which, again, isn't quite what's going on; it's just close enough for now.)
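To make that OO reading concrete, here is a hypothetical Python equivalent; MyList and its fmap method are invented for illustration, not taken from any real library:

```python
# A list-like structure carrying its own fmap "method", the way an OO
# programmer would naturally read [1, 2, 3].fmap(x => x + 1).
class MyList:
    def __init__(self, items):
        self.items = list(items)

    def fmap(self, fn):
        # The list's own functor implementation: apply fn to every
        # element and build a new MyList from the results.
        return MyList(fn(x) for x in self.items)

result = MyList([1, 2, 3]).fmap(lambda x: x + 1)
# result.items == [2, 3, 4]
```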

The Functor Definition In Haskell

Let's wrap up Functor by showing the definition in Haskell:

class Functor f where
    fmap :: (a -> b) -> f a -> f b

This just says, if you have a way of taking an a and producing a b, and you have a source of as, you can have a source of bs.

This should sound familiar.

This is the full definition. There is no additional hidden machinery that Haskell is somehow automatically invoking when you "use a Functor": no conversions, no weird stuff wrapped around the resolved method. Just like an OO language resolving an overridden method, all the magic is in picking the correct implementation; once it is resolved, the language does nothing else.

What makes a data structure "functorish"? Having a sensible implementation of fmap, whatever that may be for the data structure. If it has a sensible implementation we can give it, that makes it a Functor, not the other way around. Theoretically this is how it ought to work in all languages with a concept of an interface of some sort, but sometimes the paperwork can get in the way and make it feel like it's the other way around, that something isn't Iterable until I deliberately make it one. But I think it's a more mature programming view to think of a data structure as always having a set of possible sensible implementations for a given interface, possibly empty or more than one, and you simply choose whether or not to explain it to the language. You usually aren't "creating" these implementations, at least for these simpler, more mathematical types of interfaces.

Haskell Syntax Sidebar

The functor definition above has f a, which is Haskell notation for "a type f parameterized by a". That's abstract with type variables, but easy in examples; List Int is a "List" parameterized by "Int", meaning it can hold Ints.

While in full Haskell this can get more complicated, for the simple types I'm using in this post you can do the match visually. To match f a to List Int, we set f = List and a = Int.

Unfortunately, two of Haskell's most important types get special-cased and visually matching them is hard. Rather than List Int, Haskell uses [Int], which is confusing at the worst possible time for a novice.

Fortunately, we can write this as [] Int. It is already valid Haskell syntax. Then we can visually match: f = [] and a = Int. I don't write a lot of Haskell code nowadays but I've been adopting that as how I represent lists everywhere in my types, because I don't think the special case is very useful.

The other confusing type is the function type a -> b. For didactic purposes, it is better spelled (->) a b. Again, legal Haskell syntax today, though you unfortunately need the parentheses. In this case, for the functor implementation on (->), f = (->) a and the remainder is b. That's still a bit abstract, so let's concretize it by observing that you can convert the output of both a (->) String Int and a (->) Int Int if you have some function (->) Int String, because what matters is that the output of the function you are "functionableing" matches the input of the function you are applying to it, which is Int in this case.

I do not use this in my code, because it mangles type signatures something fierce to do this everywhere. Still, it is important to bear in mind what it really means.

In this case, the f in the pattern matches "the front piece" instead of just the first element. If you do not intend to program in Haskell, just take my word for it that this is a legal match and let this wash over you; if you do intend to program in Haskell you'll find this won't take much to understand. Either way it's not terribly important for understanding what the monad interface is so I will be moving merrily along now.

The Big Bad Mo Nad

So, what makes a data structure monadish?

A data structure is monadish when, given some source of a, and a way of taking an a and turning it into the same sort of source of b, you can combine the two into that same sort of source of b. As with "functor", I am using the word "source" as the closest English word I can think of for the maximally generic idea of a type; you have a list of ints, or a function that yields an int, or a way of getting an int from user input, etc.

class Monad m where
    (>>=) :: m a -> (a -> m b) -> m b

English is really tripping over itself here, but between the English description and the definition I hope the basic idea comes across. It probably also doesn't help that we have a function literally called >>=, which is hard to pronounce. It's pronounced "bind", which is only slightly more helpful, since that still fails to evoke any useful imagery. Let's take a specific example:

List Int -> (Int -> List String) -> List String

On its own, that's not a confusing function. In this particular case, there aren't that many possible sensible implementations. Clearly, we're going to iterate along the list of ints, pass them one by one to the thing that produces a list of strings, and then concatenate all those results together into the final list. (We have to do that, because the final result can't be a List (List String); that's a separate type.)
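A minimal Python sketch of that implementation (bind_list is my own name; this is the "concatenate the per-element results" behavior described above):

```python
# List Int -> (Int -> List String) -> List String, sketched in Python:
# run the function on each element, then splice the resulting lists
# together flat, so no nested list survives.
def bind_list(xs, fn):
    result = []
    for x in xs:
        result.extend(fn(x))   # fn returns a list; concatenate it in
    return result

out = bind_list([1, 2, 3], lambda n: [str(n)] * n)
# out == ["1", "2", "2", "3", "3", "3"]
```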

Once again....

That's It

That's all Monad is. It is an implementation of some "method" that conforms to that interface specification.

Understanding this is not what people mean by "understanding monad". This is the "what" of the monad interface, but not the "why". But I think it is a step often glossed over too quickly in other works (if not at times outright skipped), so let's stay here and chew on that "what" for a bit.

Once again, I want to emphasize that just like in OO languages and just like fmap, there is no special magic that Haskell invokes because "it's a monad". All Haskell does is look up the interface's implementation and pass the arguments in. There is "do notation" but that is only a syntax-sugar respelling of this function call; it still does no further manipulation of arguments, no coercions, no modification of input, no modification of output, nothing. Since this article is about understanding "monad" without going deeply into Haskell-specific details, I'm going to ignore it because it's just syntax sugar.

Bind is simply a function that takes some wrapped value and a function that takes the inner value and returns a wrapped value, and produces a final wrapped value of that type.

Maybe Int -> (Int -> Maybe String) -> Maybe String

Again, no mystery to this implementation. It takes something that may or may not hold an Int, passes the Int (if there is one) to the function, and returns something that either holds a String or nothing.

Blorp Int -> (Int -> Blorp String) -> Blorp String

Obviously, you don't know what Blorp does. I named it and I don't even know what it does. But you can see this would be an implementation of "bind" for Blorp, on particular types Int and String.

That's all Monad is. Just as there is no magic around the Iterator's Next method, just as there is no magic around functors, this is all the "what" of Monad there is. In fact, it literally can't be more complicated, because this is all an interface is. All an interface can do is name a particular method. Nothing else happens. If you find your current conception of monad does not match that and is insisting there's a big ball o' mystery here, throw it out. It is nothing more than declaring a particular function as the implementation of this interface, plus the subsequent resolution machinery when you use it. Nothing. More.

No, I do not mean this in a "this hard concept is really easy if you look at it right" way. I do not mean it as an attempt to simplify some explanation at all, because this isn't a simplified explanation. This is what a monadic implementation is. It's not a simplification at all. It's a completely accurate description. It is no more and no less than a specification of an implementation for the particular interface. It is the same as Iterator and Functor and Stringable and all the other well-known interfaces many languages have.

I have belabored this point in text mostly in the hopes that you'll slow down and process this before continuing on. A lot of people have really built this particular interface up in their head as something fundamentally mysterious. But this is all the what of monad is.

Is it a burrito? Is it a space suit? Is it some other bizarre metaphor? No. You don't need a bizarre metaphor to understand this part. In fact you rather need a lack of bizarre metaphors, because they mythologize a very simple truth. The mythology has done a fantastic job of obscuring the simple truth that monad is just a particular interface.

If you need to, re-read this section one more time, slowly. Take deep breaths and don't let previous conceptions come in. This is not the hard part, but it makes understanding the hard part virtually impossible if you've got too much mythology about the basics floating around in your head.

Why Is That Interesting?

Anyone programming for a while shouldn't have much trouble at all understanding why iterators might be a useful thing to have a definition for. Iteration is often something you'll cover in the first hour of programming, after all. A programmer could probably rattle off five iterators off the top of their head and five useful things to do with iterators generically (e.g., "take the first X", "skip the first X", "reverse it", etc.).

Anyone programming for a while shouldn't have much trouble understanding why it's kinda useful to have a generic "Stringable" interface, even if it only emits debugging output not suitable for any other purpose, because it's a useful thing to be able to turn lots of things into strings easily.

Anyone programming for a while shouldn't have much trouble understanding why functor is interesting, once you penetrate the fog of mystery to see how plain it is underneath. You don't have to get far in programming before you have a list of something and need a list of something else based on it. It's slightly more abstract in that you may not be able to just rattle off examples but you've done this thousands of times.

Anyone programming for a while looks at the bind definition and probably can't come up with one decent example off the top of their head, if they haven't already been exposed to it before.

This is one of the fundamental difficulties in "understanding (the why of) monad". Iterator and functor are natural things we can just find lying around. Monad implementations frequently need to be something we do in fact construct, with deliberation, to fit the abstraction. I'll actually disagree with the "monad tutorials" here a bit and say this is not a function type that we use a lot of, all the time, as if you could have discovered it yourself by just abstracting enough. It is true that if you deliberately put some effort into fitting things into it, a lot of things can be fit into it, but it is not an operation the average programmer is doing all the time.

So... good news! In contrast to all the tutorials trying to present this as some trivially obvious concept, it is in fact a bit weird! It's perfectly normal to find it so.

If I have been successful in this writing, at this point, the particular operation that constitutes bind shouldn't be mysterious, because that is not a very complicated function signature. What should still be weird is:

Why Give That Particular Signature An Interface?

And finally we've arrived at the core problem. This is the thing that really grinds people's gears, throws the dust up in the air, and brings the fog of confusion. I am going to try to answer this, but I'll be honest: If this makes no sense to you, but everything else in this post does, I'll consider that a job satisfactorily done on my part.

And, as is generally the case with interfaces, the answer is that there isn't really "an answer". To "understand iterators" in the sense I think most people mean, you just need to know the small amount of detail around how to use instances, but the bulk of the value of the interface lies in the hundreds of individual implementations of it. Some of them are trivial, but some of them can be PhD-level research to implement.

Likewise the monad interface. There is some knowledge involved in using it, but the real value of the interface lies in the implementations. Some of them are trivial and some of them are complicated. But the complicated ones aren't complicated "because they're monads"; they're complicated things that happen to be able to implement the monad interface, and the complication is fundamental to the implementing data type, not the monad interface.

Why The Monad Interface Admits More Complicated Functionality

But there is something sensible to say about why the set of implementations of bind can involve so much more complication than implementations of fmap. Compare the two typeclass specifications side-by-side (with a bit of massaging):

class Functor f where
    flip fmap :: f a -> (a ->   b) -> f b
class Monad m where
    (>>=) ::     m a -> (a -> m b) -> m b

I've sloppily created some notation here for the fmap function getting its two arguments flipped in order. (That is all flip does.) Note the one and only meaningful difference, which is that bind takes a function that yields a full wrapped object of some sort, not just a single translated value. Especially in Haskell, with its lack of mutable state or other places to hide stuff, this means we have a place to thread additional information that a functor implementation can't.

You know how you're reading some documentation for some awesome new library you need, and it's got all sorts of examples for it and you think "Hey, great, that's awesome", and you start using the library, and you copy and paste one of the examples, and it just doesn't quite do what you want, so you start trying to tweak it and discover that while the example looked simple it's almost a cheat? Like, the example is using a ton of default functionality and looks awesome but as soon as you try to customize it at all, suddenly you have to instantiate a few hundred lines of code to configure all the things that were previously defaults, just so you can reach in to some inner bit and tweak something?

Monad examples kind of end up like that. It's a rare example of it being a bad idea to start out too simple. Most of the standard monads are so simple that they either don't take much advantage of the monad idea, or they do it in a really funky way that is only possible because they just happen to be simple enough for it to work. You're actually better off with a more complicated example of a monad, to build up an idea of what's going on. It turns out it still doesn't have to be that complicated.

Imagine that you've got a data type that contains a value, and an audit record of how you obtained that value. For instance, suppose you're going to post some content to the web and you need a rock-solid audit log trail of who saw it, who signed off on it, etc., and it is vitally important not to get the data separated from the audit trail because then you won't be allowed to use it. You could define a generic class that looks something like:

type AuditableData struct {
    public: DATATYPE data
    private: AuditEvent[] audit_log
}

DATATYPE is a generic template variable here; I'm expanding it because the standard T can get confusing, right at the worst time here. This is a generic data type where DATATYPE can be int, string, Webpage, SalaryRecords, whatever.

Now, we can provide a function on AuditableData<type DATATYPE> that allows us to manipulate the contained data like so:

func (self AuditableData) Manipulate(
      manip func (DATATYPE) AuditableData
    ) AuditableData {
    var result = manip(self.data)
    return AuditableData{
        data: result.data,
        audit_log: self.audit_log.append_list(result.audit_log)
    }
}

In other words, given a function that can, say, take a URL you got from an audited source and fetch the resulting Web page, with an audit log of how that went, it will return the final contents of the URL with the full audit log attached to it by extracting it out of the AuditableData returned. That means the function that retrieves the content of the URL doesn't have to deal with any previous audit logs; it is only responsible for generating its own events.
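Here is a runnable Python translation of that pseudocode, plus a small usage example; the class and method names simply mirror the sketch above, and the URL and log strings are made up:

```python
# A value bundled with the audit trail of how it was obtained. The
# manipulate method is the "bind" of the text: the manip function only
# produces its own events, and manipulate splices logs together.
class AuditableData:
    def __init__(self, data, audit_log):
        self.data = data
        self.audit_log = list(audit_log)

    def manipulate(self, manip):
        # manip: data -> AuditableData, carrying only the events it
        # generated itself; we append them to the log we already have.
        result = manip(self.data)
        return AuditableData(result.data, self.audit_log + result.audit_log)

start = AuditableData("http://example.com/page", ["url approved by alice"])
fetched = start.manipulate(
    lambda url: AuditableData("<html>...</html>", [f"fetched {url}"]))
# fetched.data == "<html>...</html>"
# fetched.audit_log ==
#   ["url approved by alice", "fetched http://example.com/page"]
```

Note that the lambda never sees or touches the pre-existing log; the combining is entirely the job of manipulate, which is exactly the division of labor the text describes.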

This is, of course, a "bind" function based on the AuditableData data type. But as I've been saying, the question isn't whether or not it conforms to the monad interface, the question is, why is it useful to care?

Suppose the closure passed in as manip only returned the direct result of the manipulation, with no audit log. Then there would be no way to know how that value was produced, because it would come back without an audit trail. That's the difference between the functor and the monad interface. A functor implementation would be able to return the contents of the URL, but it would have no way to pass an audit log back with it, because the type would have no room for it.

Looking at the definition for monad again:

class Monad m where
    (>>=) :: m a -> (a -> m b) -> m b

What this interface happens to give us is a way we can program with whatever our "values" are (ints, Salaries, etc.) without worrying about the audit record. All the functions we use (a -> m b) take care of generating their audit records in themselves. The implementation of >>= takes care of putting them all together transparently, without us having to constantly manage them. In this case, that is the "append" portion of the Manipulate function above. This is a powerful pattern for abstracting away a lot of incidental considerations and getting at the underlying logic.

One of the bizarre things about "understanding monads" in Haskell is how rarely monad implementations look like this. There are a few that do; Software Transactional Memory uses this approach to keep a log of the variables you read and write. When you read a variable with readTVar, it yields an STM value that records which TVar you just read. STM's bind method simply set-unions all those records together as you do your operation. Then, when it's time to "commit", the STM machinery can read that out and figure out whether your reads and writes interfere with any other transactions in progress, and either commit or roll back your work depending on the result. You get to program in STM oblivious to all that machinery under the hood, while the bind function works away maintaining the data structures the commit function will eventually need to do its work.

In my opinion, this is the key element of understanding them, this ability to return "more stuff" than just what you asked for, and for this machinery under the hood to combine those "more stuffs" for you in a way that is meaningful for the particular data type, thus yielding an STM type where you are not responsible for managing what variables have been used, performing operations on values without worrying about the audit log, etc.

But most monad implementations in Haskell, and pretty much all of the ones used in "tutorials", are degenerate, and thus, in my opinion, bad examples. This is not a criticism of those implementations or those data structures, which are the way they are for good reasons, but I think it's better to start here and then understand the standard things like List as degenerate implementations (and perhaps ponder the mathematical reasons for that degeneracy) than to try to build up to them from the degenerate case.

Let me show you what I mean:

Too Simple To Be Educational: The Maybe Implementation

Maybe is an increasingly popular data type that is now fairly frequently discussed. It has one of two possibilities: Either it is a distinguished Nothing value indicating it has no value, or it is Just value, where "Just" indicates it contains "just" the value in question. If you are not already used to it, it may be better to think of the cases as Empty and Contains x. (I think the choice of "Just" was a fairly quirky internal model someone had; even after some years of using it it still doesn't really scan for me, either.)

Inserting the type concretely into the signature for bind, we get Maybe a -> (a -> Maybe b) -> Maybe b. What this does is fairly obvious; it receives the function that will take the first type and produce a Maybe of the second. If this is called on a Nothing, it simply returns Nothing again; there is no value to call the function with. If there is a value, it is unwrapped, passed to the function, and the result of the function is simply returned as the value.

func (self Maybe[T]) Bind(manip func (T) Maybe[U]) Maybe[U] {
    if self.IsNothing() {
        return Nothing
    }
    return manip(self.Value)
}

This is, frankly, a terrible place to start discussing the monad interface, because it uses none of the monad typeclass' powers.

This is a great example of the power of simplifying things by keeping them in a common interface, even when it involves a degenerate implementation. This is just the "functor" on Maybe, "lifted" into being a monad precisely by ignoring the additional bit monad allows, but does not require. You can have this separate channel of data flowing around through the monad interface, but you're welcome to just ignore it.
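
For reference, the same bind written out in Haskell is tiny. This is a standalone sketch (bindMaybe is my name for it, not a standard one); the real instance in the Prelude is equivalent:

```haskell
-- A standalone sketch of Maybe's bind.
bindMaybe :: Maybe a -> (a -> Maybe b) -> Maybe b
bindMaybe Nothing  _ = Nothing   -- no value, so the function is never called
bindMaybe (Just x) f = f x       -- unwrap, transform, return the result as-is

main :: IO ()
main = do
  print (bindMaybe (Just 4) (\x -> Just (x + 1)))  -- Just 5
  print (bindMaybe Nothing  (\x -> Just (x + 1)))  -- Nothing
```

Two lines of actual logic: propagate Nothing, or apply the function. That's the whole "monad instance".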

Starting here is like trying to explain iterators by starting with the empty iterator, the iterator that immediately terminates. While it can be useful in practice, it's a terrible first example because it ignores everything that makes iterators interesting in the first place. Beginners who don't have a strong grasp on what's going on become confused as to why the first example of "the thing that yields values in some order" is "a thing that never yields values", and in my opinion, quite justifiably so. Similarly, even though Maybe does have a monad instance, it's too simple to be useful in an explanation.

Strange Mathematical Implications: List Monad

List is still simple, but has some real meat to it and has one of the simplest non-trivial bind functions.

Suppose I have a function that takes an integer and returns a list with that many "X"s in it, i.e., myF 4 = ["X", "X", "X", "X"], myF 0 = [], myF 2 = ["X", "X"]. Now I have a list [4, 0, 1]. I can call the function named >>= on it: [4, 0, 1] >>= myF.

What can a monad implementation do?

What does it have to do?

First, let me again remind you that there is no special "monad machinery" in Haskell. All the runtime does is find the function attached to the list data type. So "Haskell" isn't going to do anything just because I'm "using a monad" or anything. All it is doing is resolving a function.

By the definition above, the return type must be [] String; [] because that is the original data structure, String because it's the parameterization of [] that our transform outputs.

It is up to the >>= function as defined on lists to decide what to do using the original list values and the transformation function. It could choose to never call it and return an unconditionally empty list of strings, but that's pretty worthless.

More sensible is to use the transformation function on the three elements that it has, resulting in having ["X", "X", "X", "X"], [], and ["X"] respectively. How can we produce a list of strings from these lists? The easiest way is to concatenate the lists in order, resulting in ["X", "X", "X", "X", "X"].

Prelude> [4, 0, 1] >>= \x -> replicate x "X"
["X","X","X","X","X"]

This is often called "flatmap" because it looks a lot like mapping the transformation function over the original values:

Prelude> fmap (\x -> replicate x "X") [4, 0, 1]
[["X","X","X","X"],[],["X"]]
Prelude> concat [["X","X","X","X"],[],["X"]]
["X","X","X","X","X"]

It is a common misconception that "monad is the same as flatmap", which has even made its way into some "monad libraries" in non-Haskell languages. flatmap is not "monad". flatmap is the particular implementation of the monad interface on lists. If you understand flatmap, you understand the monad implementation for the list datatype... you do not understand "monads" in general.

Trying to understand monad through "flatmap" is like trying to understand "Iterator" through "a binary tree node iterator"; you end up confused when trying to write an Iterator on an array because, how exactly do you take the "left" node in an array? A sufficiently confused person will end up building a binary tree just to implement "iterator" or something. (Seen the equivalent plenty of times. Done it myself a few times.) A sufficiently clever programmer may even produce an iterator over an array that successfully turns it into a binary tree and iterates over that "correctly", in accordance with the erroneous definition of "correct" the programmer is operating under. But it is still an overcomplicated implementation of iteration over an array, even if it yields the correct values in the correct order.

List's monad implementation does at least demonstrate how the bind implementation has a lot of freedom in how it calls the passed-in a -> m b function. A common mistake in implementing "monad libraries" is to set things up so the function will be called exactly once; the monad interface in general cannot be implemented that way, only certain implementations happen to work like that. The list's implementation calls the passed-in function once per value in the list, which could even be zero times for an empty list.
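
To see that "called exactly once" is not something you can assume, here is a small demonstration (the helper names are mine, for illustration only): the functions passed to bind below would blow up if they were ever called, and on Nothing and the empty list they never are, while the list implementation calls its function once per element:

```haskell
-- These would crash the program if bind ever called them.
boom :: Int -> Maybe String
boom _ = error "never called"

boomL :: Int -> [String]
boomL _ = error "never called"

main :: IO ()
main = do
  print (Nothing >>= boom)   -- Nothing: zero calls
  print ([] >>= boomL)       -- []: zero elements, zero calls
  print ([1, 2, 3] >>= \n -> [show n, show n])  -- one call per element
```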

Let me highlight that there's nothing about "monad" that is somehow "requiring" the implementation on list to be flatmap. Instead, what matters is the logic I went through above; given the specification of the monad interface, given how that is reflected in the specific type for "list", what could an implementation possibly be? If we ignore the degenerate case of not using the transformation function at all, there aren't a whole lot of other sensible choices. There are other choices... we could walk the list backwards, but... why? The pieces we have don't go together too many other ways.

What strange effect does this produce? Bear with me for a second if the rest of this paragraph makes no sense: This results in making it easy to write a sequence of function calls that returns more than one result. This can make for some very slick-looking graph algorithms, where you can write code that asks "what is the next node?", and then it receives a list of answers, and it can easily operate on that list of possibilities again, recursively. Combined with laziness and a bit of filtering on the various lists it makes it easy to write code that explores multiple possibilities at once.

What? How does it do that? Well, I'd say here again is where things get so confusing for people. It's not particularly clear how or why "returning multiple values" leads to that. It has a rather mathematical flavor to it. It's just that as you run the transform function over your list to get more lists of results (possibly empty), you end up with a depth-first traversal of all the possibilities rather than a single path.
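
A small taste of that flavor, using the classic right-triangle search (there is no special machinery here, just the list implementation of bind driven by do notation): each <- line offers every candidate at once, and bind explores all the combinations, keeping only those that survive the filter. The names are mine.

```haskell
-- Find all right triangles with integer sides up to 20. Each <- line
-- "returns" every candidate value; bind tries all combinations.
triples :: [(Int, Int, Int)]
triples = do
  a <- [1 .. 20]
  b <- [a .. 20]
  c <- [b .. 20]
  if a * a + b * b == c * c then [(a, b, c)] else []

main :: IO ()
main = print triples
-- [(3,4,5),(5,12,13),(6,8,10),(8,15,17),(9,12,15),(12,16,20)]
```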

But while cute... this isn't actually that useful. concatMap (as Haskell calls flatMap) is a useful operation on its own terms, and is used quite a bit. However, using the monad implementation on List in Haskell with the rest of its monad machinery is rarely done; when it is done, it is generally only done a little bit. The problem is that a "do" block in the list monad chains together several of these flatMap calls. It is very easy for the transformation function to result in the computation growing exponentially as each one expands one item from before into one or more items. You have to have something else in the problem that prevents that from happening. You may see a lot of cute little algorithms like solving the n-queens problem, but in practice, this is not something you see used a lot in Haskell. The mathematical transform is cute, but hard to control because it's a bit too simple. In particular, this has no ability to "notice" that it is revisiting possibilities it visited before. That would take more machinery.

A Common Misunderstanding Of The Monad Interface

There's another common misunderstanding of the monad interface which has led to much head-scratching confusion: The idea that it isn't possible to extract values from "monads", and that this is somehow an important element of what they are.

This is completely false, and not just false, irrelevant to the monad interface.

It is true that the monad interface itself does not specify a way to extract the values from its originating data structure. It only has a way to pass a function in to >>= and get back something else wrapped in the same type as before. That means that using >>= alone, it is not possible to "extract" values from the data type implementing the monad interface because no matter what you do, the result is still wrapped in the outer data type.

You can see it right in the definition of the monadic interface. The return value is specified as m b; that is, there is no way to escape the data type wrapper using just bind.

But monad is just like any other interface... just because you implement something from that typeclass doesn't mean you can't implement other things! There is no exclusivity with the interface, any more than with any other interface. Just because list implements a monadic interface doesn't stop you from taking something you know is a list and reaching into it as you normally would. Just because Maybe implements a monadic interface doesn't stop you from looking at it and seeing if it has a value or not. It is true you can't use the Monad interface itself to do that, but that doesn't mean you can't do it some other way.

Whether you can extract values from a data structure has nothing to do with whether it implements the monad interface... what controls that is the data structure itself.
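
For instance, nothing stops you from reaching into a Maybe with plain pattern matching (describe is just a name I made up for illustration); the monad interface is nowhere in sight:

```haskell
-- Extracting the value from a Maybe with ordinary pattern matching.
describe :: Maybe Int -> String
describe Nothing  = "empty"
describe (Just n) = "contains " ++ show n

main :: IO ()
main = do
  putStrLn (describe (Just 3))  -- contains 3
  putStrLn (describe Nothing)   -- empty
```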

It happens to be convenient that the monad interface allows you to manipulate values contained in them without ever having them "directly", because it gives certain important data types a way of manipulating values without letting them escape out where the data type may not be able to provide guarantees about what's going on, which brings us finally to...

A Super-Magical Data Structure: The IO Data Type

Not "the IO Monad"... the "IO data type that implements a monad interface".

The problem with "understanding IO" actually has nothing to do with monad. The problem with "understanding IO" is that IO is a freaking weird data type! The bad news is that because it is so weird, it's a very bad data type to try to understand the monad interface with, because as a confused newbie, it is very easy to attribute confusion to "monad" when in fact you're confused about this super-magical IO type.

The good news is that it's very easy to just mechanically follow directions online, use the do notation with the IO data type, and things will just work. This is not a good reason not to try to learn Haskell.

But the IO data type is straight-up magic. It is, conceptually, a data type that represents the external universe, and this portion is actually called RealWorld internally. You can ask for it to produce a string from the user's typing, and poof, the string is present, regardless of "purity". You can ask for it to produce a set of bytes from a socket, and poof the bytes are here. You can ask for it to create a socket to a remote system, and shazam the socket exists! Or maybe an error. Who knows!

The purpose of the IO data type is to hide this magic from the purity of the rest of the Haskell type system. Haskell can't allow you to have "an Int that the user will type in in the future", because that isn't an Int. An Int is only and profoundly "exactly one of the set of things that inhabit the type Int", specifically, a concrete number from a certain range. It isn't allowed to be anything else; not indeterminate until the future (you might be tempted to shout "laziness!" here, but "not yet computed, but fully determined" is profoundly different from "not yet determinable"; this is not just laziness), not an error, not accidentally some letters, etc. IO Int however is that complicated. It may be a number, or a failure, or it may take arbitrary amounts of time to appear, and so on.

It so happens that the method defined by the monad interface provides a useful way to manipulate such values without letting them escape into the rest of the pure system and wreak havoc. It so happens that the desirable function for doing that matched the idea of monad. It so happened that people noticed it was a useful idea in general and factored out that function into the "monad typeclass", and then noticed that the functions they were defining were useful for other things. It so happens that for this one particular data type's case, it is true that you can't reach into it and pull values out of it. (Except with unsafePerformIO, which is pragmatically necessary, but we can pretend it doesn't exist most of the time.)

But contra a very popular opinion, it is not the case that Haskell "needs monads to do IO". You could take away the monad typeclass entirely, take away typeclasses entirely, take away everything in Control.Monad and do notation and everything else, and what was left would still be completely capable of doing IO. It would be uglier. There would be specialized, haphazard analogs to various Control.Monad constructs that only work on IO, and a bunch of other specialized constructs scattered throughout the rest of the language that don't quite match up that do those things on other data types... just like in languages like Go that lack an iteration interface!

What the monad typeclass brings to Haskell is not IO. What it brings is a highly-shared and standardized set of functions to operate on that IO that have wide applicability to many other data types as well, rather than a fractured universe of specialized functions for IO, and specialized functions for STM, and specialized functions for lists, and specialized functions for... etc.

What is necessary for IO in Haskell is the IO data type, its private representation of the RealWorld, and the way it offers only certain blessed functions for operating on it that only allow you to penetrate its abstraction in carefully-defined manners that don't break the guarantees the rest of the language uses. Every function on the IO datatype itself would have exactly the same signature and do the same thing, even with no "monad interface" to be seen. Just as in Java, removing an Iterator interface would still leave all the Iterator implementations on the various classes that have them... they would just no longer have a shared way to speak about all of them.

What the monad interface brings is not capability; interfaces can't do that, because interfaces don't do anything on their own.

If you understand this all properly, I think "needs monads to do IO" is not even a particularly sensible thing to say. It's the ability to hide implementation and provide functions for operating on that hidden implementation that is needed for IO, and that's a perfectly boring thing object-oriented programmers should understand just fine. It is not materially different than any other object that provides public methods that implement some interesting functionality via private members you can't directly access, be it through strictly-enforced language controls or community convention. The IO datatype simply hides "the code and state used to speak to the real world" in much the same way a standard OO database access library hides "the TCP socket I use to speak to the database" from you.

(In fact, I see people frequently accuse Haskell of having a "hack" for IO here, but if Haskell's IO type is a "hack" so is every other language's IO support. It is completely normal to hide implementation details, which is all IO is really doing, and it's completely normal to tie the language's IO support into whatever other facilities the language may offer, which is all the monad interface on the IO type is doing. There's nothing "hacky" about this in the slightest. It's just a respelling of perfectly normal programming language policies.)

It is also commonly thought that "monad" is about containers just as it is with functors, but IO really disproves that. There is no meaningful sense in which IO is a "container". If you stretch the definition that far, there is literally nothing that is not a container. I use the term "literally" literally; with IO representing the Real World, if IO is a container, you're treating the entire universe external to your program as a "container". Might be kinda cool in the "Whoa, what if we're all just, like, butterflies who think we're human?" sort of way, but it's not a very useful way to use the term for engineering purposes.

Being so abstract is precisely why it's so hard to deal with and understand. Instead, what IO provides are "things that may or may not exist of this data type in the future, if they don't error out instead". It is as far from a "container" as you can get, in the sense of concrete values manifested somewhere in memory. When working with IO, you're often working with values that do not exist yet, in the strongest sense of the term, as in, literally do not exist in the universe as a whole yet. When I referred to "source of Things" above, IO is what drove that terminology choice; often that's all the IO value represents, just a source of things that may someday appear.

So Why "Monad"?

I say there isn't an "understanding monad" the way most people believe it exists. Even if you understand the interface, you still have to understand the specific data type implementing it to know what's going on.

It is much like the much more common Iterator interface this way. Even if you "understand iterator", no amount of studying the Iterator interface itself will explain to you why a hash table's iterator yields the elements in what seems to be a random order. That does not come from the hash table's "iteratorness"; it comes from the nature of the internal implementation of the hash table and how it can efficiently yield keys or values given that implementation.

Similarly, for the various exotic monads, it is not an understanding of "monad" that will reveal what they are doing, but an understanding of the specific data structure and implementation.

To the extent that there is a sense in which you can "understand monad", it comes from understanding things like the Control.Monad package, and the tooling around monad, and how they can be exploited with the properties of individual monads to achieve various powerful effects concisely, without sacrificing the essential properties of Haskell. If you are not learning Haskell, none of these are relevant; if you are learning Haskell, you don't need to stare at the Control.Monad documentation and try to memorize it all. Just... every time you find yourself saying, "Say, shouldn't {this simple thing to manipulate things through their monad interface} exist?", check the package, because odds are it does.

If They're So Wonderful Why Aren't They In My Favorite Language?

A combination of three things:

  • While I think it's an error to conflate the monad interface with the problems that can be addressed with it, the biggest problems it can be used to address are solved in other ways in other languages, which reduces the benefits using it outside of Haskell can bring. If you don't need IO sequencing, or purity because your language doesn't care about purity, or the list implementation's nondeterminism because that's a parlor trick, or STM because your ecosystem doesn't use it... there's not much point to worrying about it. Monad implementations in languages other than Haskell fail to take off because there generally isn't even a single thing that is improved by working with it through a monad interface, let alone a family of such things.
  • Even the popular languages with stronger type systems are not able to correctly express the monad interface (it requires abstracting over type constructors, i.e., higher-kinded types), making it very difficult to work with. Without the collection of functions that use the monad typeclass to do generic operations, it's pointless to declare the typeclass. The whole point is to get a powerful toolbox out of it; if all you get is the ball-peen hammer, when you've already got a toolbox with several other hammer types in it, who cares?
  • Working with the monad typeclass involves pervasive use of closures, and most languages make it too much work to essentially have a closure per line of code. Even Haskell found it a bit much, hence the "do" notation; languages in which the minimal function closure is something like function (List xs, int x) List result { ... } just in the type signature, before you're doing real work, are just insane to work with. Even the dynamic languages using more recent trends like (x => 1 + x) tend to stack up the parentheses or braces and require a lot of temporary variables. Getting the monad interface correct in not-Haskell is full of landmines.
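
To make that closure-per-line point concrete, here is a do block and its desugaring into explicit >>= calls and lambdas (withDo and withBind are my names; both evaluate to the same value):

```haskell
-- What "do" notation writes for you: one closure per line.
withDo :: Maybe Int
withDo = do
  x <- Just 3
  y <- Just 4
  return (x + y)

-- The same computation desugared into explicit binds and lambdas.
withBind :: Maybe Int
withBind = Just 3 >>= \x -> Just 4 >>= \y -> return (x + y)

main :: IO ()
main = print (withDo, withBind)  -- (Just 7,Just 7)
```

In a language without lightweight closures or any do-notation equivalent, every line of the second form is what you'd be writing by hand.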

In general, it's an error to try to import the monad typeclass into your language. The costs are magnified and the benefits reduced. As has been a bit of a theme on my blog lately, when considering importing features from other languages, you can't just copy over the costs & benefits analysis from the remote language into yours. You need to do a fresh analysis. In most, but not necessarily all, languages, monads are not useful to you in general, if not outright inexpressible.

However, the ideas are still useful. Sometimes, the verbosity can be outweighed by the utility of the guarantees you can obtain; for instance, throughout my ~25 years of programming professionally, all of which has been in Not-Haskell, I have a handful of things that look like that audit log example above scattered here and there, because sometimes it matters that much to ensure some particular security guarantee is maintained (often around auditing, hence my example), and the fact the resulting API is frankly hideous in the language I'm using is briefly and locally outweighed by the need to do my best to ensure that there is no way to perform some operations without getting the audit log correctly maintained, which is simply too easy to forget to do if you have a "nice" API. But even then, since it's only this one implementation, there was no utility in declaring an interface, as there was no useful implementation to share with anything else in the code base.

It's definitely an interesting idea, and if you are a curious programmer, it's worth learning about this particular interface and what it can do. But it's not something you generally directly take back to your language; instead of saying "I'm going to jam Monads into my language because Monads are Good and Goodness is Monad and my language must be Good enough to have Monad in it", you should idiomatically translate rather than transliterate it into your particular local needs.

Haskell-specific Addendum

I have deliberately chosen to approach this with a minimum of Haskell-specific details. Glossed over with malice aforethought: lawfulness of implementations, automatic derivation of functor instances, how laziness turns data structures into control flow and especially how that interacts with the IO monadic implementation and how a Haskell program is really one big generally-infinite pure data structure lazily traversed such that every Haskell program really is just a pure value run through a particularly complicated interpreter, the trivial upgrade of any Functor to Monad, applicative & ApplicativeDo, a deep discussion of "do" (which I think isn't too hard once you get the monad interface itself) (and yes, technically do does do a little bit of stuff with MonadFail but it's minor and mostly deprecated last I knew anyhow), join (which I kinda think might be a better way to introduce the monadic interface on its own terms, but it's so offbeat relative to the rest of the web I decided to go with the usual >>=), and some details on the Haskell type system (I had some more in here but they got in the way). This is intended to be a description of the Monad typeclass for non-Haskell programmers, and a non-trivial component of what that entails is precisely stripping away a lot of Haskellisms that are not fundamental to that task.