Spam Filtering's Last Stand

Note: If you're arriving from one of the many links posted by people who think I'm underestimating the power of personalization, please see my rebuttal (now with a working link!). Personalization won't work either, and it's nowhere near as powerful as people seem to think.

Recently, a relatively new idea for filtering spam has surfaced: Bayesian classification of e-mail, or at least Bayesian-inspired analysis. It seems to have been brought to the Internet community's attention by Paul Graham in his essay A Plan for Spam, though I know he's not the first to think of it: For instance, here's a programming assignment given at the University of California, Irvine's Information and Computer Science department in Dec. 1999. That the idea was not popular until recently is probably a direct consequence of the fact that since 1999, the war on e-mail spam has been a victory for the spammers at every turn. The need for bigger guns is now more acutely felt than in 1999.

Bayesian classification rests on the application of Bayes' Rule. Bayes' Rule is reasonably easy to summarize and explain, but it is surprisingly deep for such a simple mathematical formula, and very powerful. It provides a mathematically rigorous way to update one's beliefs based on incoming evidence. In this case, the filter examines a message, decides how strongly it believes the message is spam, and then can use the human user's classification of the message (spam or not spam) to powerfully and easily update how it determines whether a message is spam or not.
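To make the update step concrete, here is a minimal sketch of Bayes' Rule applied to a single word. The function name and all of the numbers are made up for illustration; real filters estimate these probabilities from a user's own mail and combine many words, not one.

```python
def posterior_spam(prior_spam, p_word_given_spam, p_word_given_ham):
    """Return P(spam | word) using Bayes' Rule.

    prior_spam        -- belief that a message is spam before seeing the word
    p_word_given_spam -- how often the word shows up in known spam
    p_word_given_ham  -- how often the word shows up in known good mail
    """
    # Total probability of seeing the word at all (spam or not).
    evidence = (p_word_given_spam * prior_spam
                + p_word_given_ham * (1.0 - prior_spam))
    return p_word_given_spam * prior_spam / evidence


# Hypothetical numbers: the word appears in 20% of spam, 1% of good mail,
# and we start out with no opinion either way (prior of 0.5).
belief = posterior_spam(0.5, 0.20, 0.01)
print(round(belief, 3))  # 0.952
```

Each time the user marks a message as spam or not spam, the word frequencies change, and so does the belief the filter computes for the next message.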

The proposed solutions are often not strictly speaking Bayesian, but as the author of that post points out, "we have bigger fish to fry" than whether or not a given filter is 'pure' Bayesian. It is the effectiveness that matters. I'll refer to the approach as Bayes-type or probabilistic filtering.

Several implementations exist or are being developed. A couple of the more interesting ones are spambayes, a filter written in Python that can be used as a component in larger systems, and the Mozilla Mail Bayes filter slated for 1.3. Paul Graham seems to be keeping a list of implementations, which as of this writing lists 27 implementations of the idea.

From my understanding of the technology, I have every reason to believe that it will eventually (after some development and tweaking) work as advertised. It will probably be feasible to block 90%+ of current spam, with only the odd false positive here and there. I doubt we'll need more than a few months for this stuff to mature, and given its obvious effectiveness, and the huge demand for spam blocking capabilities, I expect a significant percentage of us will be using these filters within that time frame. It would seem, based on the first-order effects, that Victory Over Spam Day should occur in the near future.

So why is the title of this post "Spam Filtering's Last Stand" instead of "Filters Finally to Conquer Spam"?

Ironically, because Bayesian filtering is too good... but not good enough.

Probability-based classification is currently on the cutting edge of Artificial Intelligence. It is one of the best known techniques for this sort of classification. At the top end of AI, the various techniques all perform about the same, with limited strengths and weaknesses, and Bayes-type classification is the best available for text classification of this kind.

If probability-based filtering fails, there is nowhere else to go in the realm of automated filtering. There is no next step in the automated-filtering arms race. This is it.

So, the obvious question is, will it fail? The equally obvious answer is an emphatic Yes!

Spammers will defeat the Bayesian-type filters the same way they defeat SpamAssassin, which looks for certain common traits in spam emails (such as using ALL CAPS, or certain key words not generally used in normal email), scores them, and marks messages as spam based on the total score of the markers it detects. Spammers are of course aware of this software, download it, and craft their messages to get past it. SpamAssassin is polite enough to explain to the user why a given message was marked spam, which is nice for the user, but pure gold for the potential spammer. The SpamAssassin maintainers detect this, find some new markers, all the real SpamAssassin users update, and around the cycle we go again. A never-ending arms race. Spammers will create their own Bayesian filters, create messages that reliably get past them, and send those. Yeah, filter creation will take a little bit of effort, but it's not like there aren't millions of examples of email and spam messages freely available for the taking to use to create very robust and generally powerful filters. (That may not all be email, but it's close enough.)
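The score-and-threshold loop described above can be sketched in a few lines. Everything here is hypothetical: the rule names, point values, and threshold are invented for illustration and are not SpamAssassin's actual rules. The point is only that a filter which reports which rules fired also tells a spammer exactly what to change.

```python
import re

# Hypothetical rules: (name, points, test function). Not real SpamAssassin rules.
RULES = [
    ("MOSTLY_CAPS", 2.5,
     lambda t: sum(c.isupper() for c in t)
               > 0.5 * max(1, sum(c.isalpha() for c in t))),
    ("SPAMMY_WORDS", 3.0,
     lambda t: re.search(r"\b(free|viagra|winner)\b", t, re.I) is not None),
    ("MANY_EXCLAIMS", 1.5, lambda t: t.count("!") >= 3),
]
THRESHOLD = 5.0  # assumed cutoff for this sketch


def score(text):
    """Return (total score, list of (rule name, points) that fired)."""
    hits = [(name, pts) for name, pts, test in RULES if test(text)]
    return sum(pts for _, pts in hits), hits


def is_spam(text):
    total, _ = score(text)
    return total >= THRESHOLD


loud = "BUY EMAIL LISTS NOW!!! FREE!!!"
quiet = "That's a nice point, but I think you should consider the information at the link."

for message in (loud, quiet):
    total, hits = score(message)
    print(total, [name for name, _ in hits], is_spam(message))
```

A spammer with a copy of the filter can run this same loop on draft after draft until nothing fires.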

Unfortunately, there's more to it than that. The various filtering techniques are not independent of one another. If a Bayesian filter catches 90% of current spam, SpamAssassin catches 99% of current spam, and my class project catches 50% of current spam, that does not mean you can string the three of them together and catch 99.95% of current spam (even discounting false positives). If it were that easy, the war would already have been won by the anti-spammers. In reality, the act of crafting a message to bypass one filter means that same message will probably make it past all the others, too.
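For the curious, here is where the 99.95% figure comes from, as a small sketch. It assumes the filters miss messages independently, which is exactly the assumption I'm arguing is false. The "fully correlated" line is the opposite extreme, also an assumption: every message that slips past the best filter slips past the others too.

```python
def combined_catch_rate(catch_rates):
    """Catch rate of several filters chained together, IF their misses
    were statistically independent (an assumption, not a fact)."""
    miss = 1.0
    for rate in catch_rates:
        miss *= 1.0 - rate  # a spam gets through only if every filter misses it
    return 1.0 - miss


rates = [0.90, 0.99, 0.50]  # Bayes filter, SpamAssassin, class project

independent = combined_catch_rate(rates)
fully_correlated = max(rates)  # chaining adds nothing beyond the best filter

print(round(independent, 4))  # 0.9995
print(fully_correlated)       # 0.99
```

Real filters land somewhere between the two numbers, and spam that has been deliberately crafted to pass one filter pushes things toward the pessimistic end.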

This is where the unparalleled strength of the Bayes-type filtering becomes the very makings of unmitigated disaster, as the second-order effects come into play. Consider the question: "What does it mean for a spam email to make it past a Bayes filter?"

The proper answer to that is a mathematical one, but in English, it means that there are no obvious cues that the message is spam. Nothing obvious in the title, nothing obvious in the text, no key words used only in spam, nothing. Contrast this with what it means for a spam email to get past SpamAssassin, which is that it passes all the tests SpamAssassin applies, which if you look at them, don't really eliminate a whole lot (which is why spammers keep getting around it). Or what it means for an email to get past the Realtime Blackhole List, which is merely that the spam came from a not-currently-blacklisted machine, which again, doesn't affect the spam itself much.

Because Bayes is so powerful, "obvious" in this case actually reasonably matches the human concept; that's why Bayesian analysis is so cool. In order to find out it's spam, you will actually have to read it. See, this is the problem. As good as probability classification is, it is not the holy grail of "Natural Language Comprehension", where the computer actually understands the meaning of the text. The computer is still looking at the external trappings of the text, not meaning. You might liken it loosely to a human skimming the text for keywords, though the human will still get more out of the text than a computer would.

Consider your current mailbox. By now, you've probably gotten pretty adept at identifying spam based on just the title, with the garbage chars and phrasings and stuff, although the spammers are starting to get around that, too. (Got a spam today titled "Won't you please write me back?!??!!?!". Don't know what was in it, 'cause I deleted it.) Well, consider this spam of the future:

Subject: Re: Re: the proposal

That's a nice point, but I think you should consider the information at http://www.somewebsite.com/info.html before going with that approach. I found that information to be really pertinent.

Of course, anything could be at that link. Let us assume for the moment this gets past your Bayes-based filters, which is quite reasonable. That also means you have twenty or thirty of these types of messages, with varying levels of creativity applied to the task of getting past your filters. Consider the consequences of all of your spam looking like that sample spam:

  • First, you've effectively lost the ability to scan through your email and detect spam better than any filters, because all the obvious cues are gone (you know, like BUY EMAIL LISTS FROM A GUY WHO TYPES IN ALL CAPS!!!!!!!). This is a necessary skill no matter what, because all current filters still require you to scan through the spam pile to find false positives. If you are a business person who deals in proposals, you'll need to examine the from address to determine it's not relevant, and you may not be sure, because it could be a co-worker emailing from a personal account. Sure, if you're not a business person, you can delete it, but you probably have an Achilles' heel you can't ignore too, which the spammers will find, along with enough near misses to make life interesting... besides, isn't that what all the profiling information everyone is collecting is for? Expect a ton more messages about "your website" or "Remember me from high school?" or any number of other mundane things. (Already this is getting to be a real problem, and the Bayes filters aren't even here yet!)
  • Second, since this spam was sophisticated enough to get past the Bayes filter, what else are you going to filter this with? SpamAssassin won't work, nor will anything else. Bayes is the King; once a message gets past that, it will effectively get past everything based on automated detection. (You might say "Aha, I'll filter on the contents of any linked web pages in the mail message", but you'll find that you're just pushing the arms race back one level. The exact same techniques can be used by the spammers to effectively combat that, too, unless you're willing to forgo all messages that ever link to a webpage or something similarly drastic.)
  • Third, consider what it means to tell your Bayes filter that this message is spam. Are you going to say that messages with the phrase "I think you should consider" are more likely to be spam? Or "found that information"? Or "nice point, but"? And I'm not even talking about the spams that contain a small spam blurb at the top, followed by 20KB of perfectly non-spammy text, forcing your spam filter to either pass it through, or become hyper-sensitive to certain very common words or phrases.
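The padding trick in the third point can be illustrated with a toy calculation. This is a sketch under loud assumptions: it combines every token with the simple product rule used in naive Bayes-style filters, and the per-token probabilities are invented. Real filters choose which tokens to count in different ways, so treat the numbers as an illustration of the pressure involved, not a measurement.

```python
def combine(token_probs):
    """Combine per-token spam probabilities into one message-level
    probability, treating the tokens as independent (naive Bayes style)."""
    spam = 1.0
    ham = 1.0
    for p in token_probs:
        spam *= p        # evidence for spam
        ham *= 1.0 - p   # evidence for legitimate mail
    return spam / (spam + ham)


# Hypothetical token probabilities, P(spam | token):
blurb = [0.99, 0.95, 0.90]   # a short, obviously spammy pitch
padding = [0.30] * 20        # ordinary words seen slightly more in good mail

blurb_only = combine(blurb)
padded = combine(blurb + padding)

print(blurb_only > 0.99)  # True
print(padded < 0.5)       # True
```

The only way to keep catching the padded message in this toy model is to raise the spam probability of the ordinary words, which is the hyper-sensitivity problem described in the third point.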

All three of these boil down to one point: As spam becomes more like traditional email, with tools to help the spammer deliberately make it like traditional email, it will become impossible for any automated technique at all to pick out which ones are spam and which are normal e-mails, without extremely high false positive rates. (Note that the current Bayes projects are already fighting with false positives as their biggest problem, and the spammers aren't even trying to deliberately mess them up yet!)

Paul Graham's A Plan for Spam starts right off in the first paragraph with the assertion "If we can write software that recognizes their messages, there is no way they can get around that", and that's right where the whole plan falls down. The whole automated spam fighting community is built around the incorrect assumption that they are fighting spam... that all that has to be done is "recognize their messages" and the fight is done. But we're not fighting spam, we're fighting spammers, and over and over again the failure to recognize this has rendered one anti-spam technique after another obsolete as the human spammers adapt.

If you look back at the part of this post where I was singing the praises of Bayes classifiers, you'll note I was careful to specify that these classifiers are excellent at recognizing current spam. Once these filters are widely deployed, spammers will write spams that will be impossible to recognize as spam without triggering countless other false positives. In a year, you can thank the Bayes bunch for the improved stealth of the spam filling your inbox. Spammers change, improve, and adapt as only humans can. The new filters will fall down on spam crafted to get past them just as simple word filters have failed, repetition filters have failed, community filters have failed, and every other analysis technique has failed. Sadly, when those other techniques failed, the spam became easier to identify by visual inspection, with the random letters appended to the end and such. Bayes-type filters are the first I know of that will push the spam in the opposite direction, towards total stealth.

The only solutions to the spam problem are still either a web-of-trust approach (which attacks the human problem itself by labeling a spammer as a not-good human), or creating a new email system where it is not free to send email (which directly attacks the economic incentive). Mechanical filtering systems are currently against the wall; this is their last hurrah, their last stand, their final chance for vindication. When probabilistic filtering falls down, mechanical filtering will be dead, and its only legacy will be "improved" spam.

I think there's a market opportunity here; nobody will accept the for-pay or web-of-trust email systems until all other hope in the current status quo (mechanical filtering + killing spam accounts) is crushed. I don't think we can be more than a year from that. Start developing a for-pay email system, or a good web-of-trust email system, right now, and you would probably be in a good position to profit.

To make this scientific, I offer a prediction: Less than one year after Bayes filters are widely deployed, they will no longer work, but the collateral damage to all of our inboxes due to improved spammers will be significant. To anybody who would disagree with this post, I offer to meet you one year after significant deployment (a day agreed on by us both), and evaluate how I did.

PS: Sorry about the negativity over the last few posts; it's just a coincidence. I'll try to find something positive or humorous for the next post.