First, you'll note a cranky tone in the Slashdot postings. Would
you believe that Bayesian filtering has fanboys?
A lot of people seem congenitally incapable of reading something about
Bayesian filters once they get the faint idea that I may be a little critical
of them. (Completely over their heads is the distinction that I'm not
critical of Bayesian filtering per se, but of the idea that it will solve
the spam problem once and for all.) Instead of reading the words, they seem to suffer some sort of strange vision ailment that renders them incapable of seeing anything but the phrase "Bayesian Filters are bad."
I used to blame my writing for any misunderstanding the reader may
have had, but at some point, you just have to hold the reader
accountable, you know? When they can't seem to read the plain English
in front of them because they're too busy jumping to conclusions?
Speaking of "can't read plain English", I am both regretful and
pleased to announce that I was probably wrong. Assuming you are
willing to call Bayesian filters "widely deployed" (which I'm willing
to stipulate, though it's a semantic issue and I can't claim to have
deployment statistics; it is clearly having an effect on spammer
countermeasures so they must be feeling it), it is the case that they
are still working, even after six months. So I was wrong on how
quickly they would sink.
But I would note that I have yet to see the attacks I outlined used;
instead I see just random word attacks, which really won't work. Now,
I know at least a few spammers have read that piece, and I get
a hit for "bypassing Bayesian filters" from Google at least once a
week; surely at least a few of those are spammers with ill
intent. Fortunately, to date, none of them have been bright
enough to figure out what I was saying, in what I thought was
plain English, and to implement this attack.
I freely admit that I have seriously overestimated the intelligence
of the spamming community. If there is an upper limit to spammers'
intelligence, the anti-spam war may have some hope after all.
I need to update the piece with some of this information but it may
be a while; in the meantime I think I'll just link to this post from
Snopes is required reading for people on the Internet. If it sounds too good to be true, if it's a little too conveniently in favor of (or against) your favorite ideological position, or if it's a little too horrifying to be true, check it on Snopes before you get upset, or worse, spread the claims further. Because you'll meet someone who has nearly the entire site indexed in their head, and there's little that's more damaging to your point than to have it conclusively rebutted on Snopes.
I'd just like to take this opportunity to thank Barbara and David Mikkelson (FAQ link substantiating the names) for providing such a fine resource to the Internet.
And it's darn fun stuff, too.
Pass it along.
There's just no nice way to say this: Anyone who can't make a syndication feed that's well-formed XML is an incompetent fool. - Timothy Bray on ongoing
This is the first public posting regarding my next major project, Iron Lute (links to
a screenshot). Iron Lute is an outliner, written in Python and using Tk as its toolkit; as a
result it should run on Mac, Linux, Windows, and anything else that supports Python and Tk.
(Offhand, I don't know how many platforms that is, but it is plausible
this would work on palmtops and some other obscure ones with few
changes.) The license is currently undetermined, but will initially
be open barring surprises; more details in a forthcoming post.
This is easily my proudest Google moment.