Bayes spam: More promises
Dec 09, 2002

Still getting sporadic emails about my Bayes spam predictions. Thanks to those who emailed for staying civil; I expected a bit more flamage.

People are still understandably skeptical (can't blame them, I'd be skeptical too). So, here's another thing I'll do: When the SpamBayes project actually releases files officially, I'll take that and implement the attack I mentioned, to see if I can make it work.

Exactly what I'll do if I can remains to be seen; I sure as *#&$ won't just release the code, though! And if I can't, I'll say so.

I chose that project because it's in Python and that's the language I'm most familiar with. If that project takes "too long" to release, I may look into a C++ Bayes solution... but I'd really prefer Python. (If you know of an existing Python solution, please feel free to let me know. I sampled likely-looking candidates in Paul Graham's list, but I may have missed one.)

Court makes landmark ruling in web defamation case
Dec 09, 2002

[Google Top Stories]

I generally stay away from international jurisdiction stories, considering them relatively uninteresting problems that must be solved through diplomacy, like any other conflict between nations. This ruling seems wrong, though; even if the case is heard in Melbourne, it should be judged under U.S. law, since the article was written in America.

OK, I admit I just wanted to post an article from the Google feed to see how it looks. (Been waiting a while, too.)

Homeland Security from Doc
Dec 03, 2002

Washington Post:

...under authority it already has or is asserting in court cases, the administration, with approval of the special Foreign Intelligence Surveillance Court, could order a clandestine search of a U.S. citizen's home and, based on the information gathered, secretly declare the citizen an enemy combatant, to be held indefinitely at a U.S. military base. Courts would have very limited authority to second-guess the detention, to the extent that they were aware of it.

[The Doc Searls Weblog]

Of all the things to come out of the so-called "Patriot" act, this is by far the worst. Who'd have thought we'd have secret, un-overseeable courts in this country? I think the Supreme Court should find an opportunity to rule this unconstitutional. How on Earth this could be constitutional is beyond me. All courts are clearly supposed to be inferior to the Supreme Court.

Cops Bust Massive ID Theft Ring
Nov 25, 2002

Federal prosecutors have arrested three men involved in what officials are calling the largest identity fraud case in American history.... Cummings would then use the ruse of "helping" the customers work through software and hardware problems to obtain the customer code that allowed the company to request credit records.

This is a textbook case of why privacy issues are so important. There is no such thing as "a company" or "a government" having access to private information; only "people in a company" or "people in a government" can have access to privacy-sensitive information. Now the people who had their credit records stolen, who have committed no crime, have to go through the effort of checking their records, potentially changing their Social Security numbers, and putting a fraud lock on their records, making credit transactions that much more difficult for the duration of the lock.

Better privacy policies probably couldn't have prevented this case, but they can prevent similar incidents from occurring elsewhere, either by making it difficult to get things like credit records or by preventing information from being assembled in the first place. Consider things like TIA in light of stories like this, and consider how much mischief just one rogue TIA database operator could do, even aside from the other privacy implications. Does that much power belong in anybody's hands, for so little potential payoff from the scheme?

Arms Race
Nov 21, 2002

Matt Haughey: "The [SpamAssassin] arms race has officially begun." [Scripting News]

I'm reading between the lines here, based on scanty hints (chiefly the remarkable uniformity of spammers' arguments that they are doing a good thing), because I'm not intimately familiar with the world of spammers, but the biggest spammers seem to talk to each other fairly regularly. If one of them has figured out how to get around a filter, rest assured that it won't be long before they all know how.

There is definitely a cycle to all of these things:

  1. A new technique is developed.
  2. Early adopters use it, and achieve amazing, fantastic spam reduction rates.
  3. It starts to increase in popularity.
  4. The critical turning point is when a major ISP uses it as part of its spam filters, keeping the spam away from the few people who would actually respond to it.
  5. The spammers counter-attack and get around the filter, and rapidly spread how to do this amongst themselves.
  6. The resulting arms race, in which only a fraction of the spam is blocked over time, drags on until the spam fighters give up on the technique and move to a new one.

Every technique up to this point has followed this "arms race" pattern. Looks like SpamAssassin is now moving from step four to step five.

In light of this further analysis, I'll refine my challenge. Three months after a major ISP affecting hundreds of thousands (or more) of mailboxes starts using Bayesian filtering server-side, I predict that the filters will be largely useless. Takers?
