Bias In The News

Over the past few decades, psychology has taken great strides towards becoming a real, hard science instead of an "observational science" like history. Cognitive psychology has helped lead the way, since it has long been able to measure concrete things like response times to stimuli, but testing theories of how minds work at the lowest level has proven more difficult.

Enter computer science and the field of machine learning, which can implement certain theories of mind and see how they perform under controlled conditions. As you've probably noticed, they have not created a completely artificial human intelligence. But they have been able to learn a lot of things and, equally important, unlearn a lot of things and disprove a number of theories.

One of the results of this work has been the creation of a powerful theory of learning that sets effective upper limits on what an entity can learn from a given sensory input. While we have nothing that even approaches this upper limit (even human minds, the best learners we know of, rarely come close to extracting 100% of the information available in a given sensory input; indeed, we dedicate a lot of neural machinery to filtering it out!), the theory can still teach us much about learning.

My title was "Bias In The News"... why is this relevant? We can only communicate what we have learned; even lies are permutations of things we have learned. A four-year-old who just broke a vase might claim the dog did it, but they aren't going to claim the vase jumped off the stand because random quantum fluctuations imparted momentum to it. (Even if you convinced a child to say those words, they still wouldn't mean what I mean.) Thus, biases built into how we learn directly affect and constrain our communication.

Taking the broadest definition of bias I could find in a quick search of the literature, bias is "anything that affects an entity's choice of one concept over another". An obvious source of bias is biased input data, but that one tends to be so obvious that it isn't worth pondering too deeply, except inasmuch as some types of input bias interact with others.

A much more theoretically interesting bias arises from the domain of concepts a learner is able to represent. Suppose we are training a simple learner to recognize the difference between a bowling ball and a feather, based on two inputs: the weight and the roundness of the object. (There are ways to meaningfully quantify roundness in one number; consider the "eccentricity" of ellipses as one example.) We, as humans, chose those two inputs because it is obvious to us that they matter.

Even the simplest of learning algorithms from the 1960s will rapidly learn that bowling balls are round and heavy, and feathers are light and not round. If you imagine a two-dimensional graph of "weight" vs. "roundness", there are two clear clumps for the two objects in opposing corners.
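To make this concrete, here is a minimal sketch of such a learner, using a 1960s-style perceptron. The feature values, labels, and learning rate are my own illustrative assumptions, not data from anywhere in particular.

```python
# A minimal sketch of the bowling-ball-vs-feather learner described above,
# using a simple perceptron. All numbers are illustrative assumptions.

# Training data: (weight in kg, roundness from 0.0 to 1.0), label
# label +1 = bowling ball, -1 = feather
examples = [
    ((7.0, 0.99), +1),    # bowling ball: heavy and round
    ((6.5, 0.98), +1),
    ((0.005, 0.10), -1),  # feather: light and not round
    ((0.003, 0.15), -1),
]

# Perceptron: one weight per input feature, plus a bias term.
w = [0.0, 0.0]
b = 0.0
learning_rate = 0.1

def predict(x):
    score = w[0] * x[0] + w[1] * x[1] + b
    return +1 if score >= 0 else -1

# The classic perceptron update: nudge the weights whenever we misclassify.
for _ in range(20):
    for x, label in examples:
        if predict(x) != label:
            w[0] += learning_rate * label * x[0]
            w[1] += learning_rate * label * x[1]
            b += learning_rate * label

print(predict((7.2, 0.99)))   # -> +1, a bowling ball
print(predict((0.004, 0.2)))  # -> -1, a feather
```

On two well-separated clumps like this, the update rule settles on a separating line within a few passes over the data.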

But the very representation we've chosen constitutes a "bias": the program prefers concepts expressed in terms of "weight" and "roundness" for the simple reason that it is completely incapable of understanding or representing anything else, which is a very strong bias indeed. It so happens that this bias serves it well in this domain, and it ends up learning extremely rapidly because the bias matches the problem.

But now let's add some more objects into the mix, like a ping-pong ball (light as a feather but round), all kinds of other balls (all mostly round but of varying weights), and a few random non-round things from around your house, and the program is going to have much more trouble. If I tell you that something is 1 kg and 25% round, you have no idea what object I mean. The bias inherent in the representation leaves you unable to discriminate.
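Continuing the sketch (again with invented numbers), the trouble shows up as soon as different objects land on essentially the same point in the weight/roundness plane:

```python
# The same two-number representation applied to a wider set of objects.
# All feature values are invented for illustration.
objects = [
    # (name, weight in kg, roundness 0..1)
    ("feather",        0.004, 0.10),
    ("bowling ball",   7.0,   0.99),
    ("ping-pong ball", 0.003, 0.99),  # as light as a feather, as round as a bowling ball
    ("baseball",       0.145, 0.98),
    ("orange",         0.150, 0.95),  # lands almost on top of the baseball
    ("stapler",        1.0,   0.25),
    ("running shoe",   1.0,   0.25),  # "1 kg and 25% round", the same point as the stapler
]

# Any two different objects that land on (nearly) the same point are, to this
# representation, the same thing; no learning algorithm can pull them apart.
for i, (name_a, w_a, r_a) in enumerate(objects):
    for name_b, w_b, r_b in objects[i + 1:]:
        if abs(w_a - w_b) < 0.05 and abs(r_a - r_b) < 0.05:
            print(f"indistinguishable: {name_a} vs {name_b}")
# -> indistinguishable: baseball vs orange
# -> indistinguishable: stapler vs running shoe
```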

On the other hand, if I showed a person with normal vision a photograph of the object, they'd be able to recognize it immediately. They have a much richer representational ability in their head than our poor program does. Then again, the human visual system is biased too, and not just at the final cognitive level where something is identified as a "chair". The biases of the lower-level processing systems can be experienced directly in common optical illusions, where the bias built into human visual processing manifests as anomalous motion where no real motion exists.

Here we see two examples in quick succession, one where bias works for a learner, and one where it works against them. As long as our program above is only trying to identify feathers and bowling balls, it will learn extremely quickly. It may even outperform a human, which is something indeed. On the other hand, the bias in our visual systems sometimes causes us to see things as other than they are.

With that understanding of bias, you can move towards understanding one of the results of the general theory of learning: No learning can take place without bias.

Being "free of bias" is not only impossible, it is not a desirable goal. When one is free of bias, that implies no ability to categorize, no ability to decide, no ability to learn from the past. Bias is inevitable.

At first, this quasi-mathematical definition of bias may seem very different from the more conventional definition you started reading this with, but careful thought will reveal that the definition I'm giving here fully encompasses the conventional idea of bias, and in many ways it is the more useful one.

The problems in computer vision can be succinctly summarized as "the sensors are almost entirely free of bias." The cameras take some light from a particular spectrum at a particular location, but the former is known to be sufficient for computer vision tasks by the demonstration of a system that can do it with that data (the human visual system), and the latter isn't much of a bias at all, especially as the machine generally cannot make use of knowledge of where it is to any great degree. (Although that is one way to make the problem easier: constrain in advance the things the computer might see. Many industrial systems use this to great effect. I am discussing the general case here.) The problem is that the cameras just produce a series of numbers, and the program must start from there.

This is in stark contrast to the human visual system, which applies many, many biases through pre-processing before the signal ever gets to the cognitive part of the brain. The retina actually performs edge enhancement, and the picture is gradually transformed from pixel-like cone and rod input, to edges, to lines and curves, motion, discrete objects, and finally identification, with each layer receiving a highly biased representation from the one before.
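As a rough analogy for what that pre-processing does (and emphatically not a model of the retina), here is a one-dimensional center-surround filter: uniform regions are discarded and only the edges survive, which is exactly the kind of highly biased representation a downstream layer would receive.

```python
# A rough sketch of edge enhancement as a 1-D center-surround filter;
# an analogy for retinal processing, not a model of it.
def edge_enhance(signal):
    """Each output is the center value minus the average of its neighbours."""
    out = []
    for i in range(1, len(signal) - 1):
        surround = (signal[i - 1] + signal[i + 1]) / 2
        out.append(signal[i] - surround)
    return out

# A step from dark (0) to bright (10): the uniform regions vanish,
# and only the edge stands out in the output.
brightness = [0, 0, 0, 0, 10, 10, 10, 10]
print(edge_enhance(brightness))
# -> [0.0, 0.0, -5.0, 5.0, 0.0, 0.0]
```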

(The optical illusions mentioned earlier are failures of the edge detection and motion systems. You can actually "see" your edge detection at work if you try. The classic black boxes with thick white lines show this; I think, though I've never been able to confirm it with a real source, that the black "spots" are basically your edge enhancement flaking out on a malformed input. Strong headaches sometimes cause the edge enhancement to go haywire, at least for me. While I've never taken any recreational drugs, based on artistic impressions of the experience I think some of them partially work their magic by amplifying the edge enhancement, particularly the drugs associated with being fascinated by moving things like rippling fabric, which has a lot of moving edges. Finally, I think the undulating images you see when you close your eyes in the dark are the edge detectors firing; again, I can't prove it, but they behave exactly as I'd expect such things to act, particularly in the way they are not static like you'd see on a TV tuned to a dead station, but look like, well, the essence of moving, if you know what I mean.)

You, and everybody else, are a walking bundle of biases, without which you'd be useless and ineffective. I am using the term "bias" in a general way here; it applies just as well to any domain of concepts. The only person completely free of bias on a given topic is someone who is completely ignorant of the topic, and even then you may find they have generated one anyhow.

The practical conclusion of all this? Nobody is, or can be, free of bias. Don't pretend you are, and don't accept others' claims that they are. Humans, unlike the computer program I mentioned near the beginning, have the ability to change their concept biases (though just how flexible we can ultimately be is an open problem), so it is important to read as much as possible from as many different people with different biases as possible. And even then, you'll be biased, just hopefully with better biases than before.

Being free of bias is a mathematical impossibility; the sooner we all recognize that, especially the American press establishment, which has adopted rules that are supposed to make it free of bias (rules which themselves create a bias, and not one that is very true to reality), the better off we all will be.