Mark expands on comment spam

2002-10-29

Mark expands on a couple of comments I made about people recently starting to spam the comment sections of websites. Apparently the weblog community recently passed some sort of critical mass that makes it worth spamming.

Mark, if you read this: for now, I think the only "Lojack" solution that will be feasible in the short-to-medium term is the one I proposed in my second comment, which is to let the web site owner easily review all recently posted comments and easily delete offending ones, in combination with a generalized rate-constraining scheme to ensure the owner never has to filter through 3000 messages at a time. If enough comment tool authors do this, and enough of the comment tool users are proactive in deleting the spam (which is easily imaginable), it may (emphasis on may) deter the spammers from working too hard to deface the comment sections. Unlike email spam, the spammer's desired result here is that these spams stay there indefinitely, so that people (or search engines!) can see them.
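
To make that proposal concrete, here is a minimal sketch in Python of the two pieces working together: a site-wide cap on how many comments are accepted per time window, plus a "review recent, delete in bulk" interface for the owner. Everything in it is my own assumption for illustration (the names CommentStore, post, recent and delete, the in-memory storage, and the default limits); it is not how any existing comment tool works.

```python
import time
from collections import deque
from dataclasses import dataclass
from itertools import count


@dataclass
class Comment:
    id: int
    author: str
    body: str
    posted_at: float


class CommentStore:
    """Hypothetical in-memory comment store with a site-wide rate cap."""

    def __init__(self, max_per_window=20, window_seconds=3600.0):
        self.max_per_window = max_per_window
        self.window_seconds = window_seconds
        self._comments = {}       # id -> Comment
        self._accepted = deque()  # timestamps of recently accepted posts
        self._ids = count(1)

    def post(self, author, body, now=None):
        """Accept a comment, or return None if the rate cap is reached."""
        now = time.time() if now is None else now
        # Forget acceptances that have aged out of the window.
        while self._accepted and now - self._accepted[0] >= self.window_seconds:
            self._accepted.popleft()
        if len(self._accepted) >= self.max_per_window:
            return None
        comment = Comment(next(self._ids), author, body, now)
        self._comments[comment.id] = comment
        self._accepted.append(now)
        return comment

    def recent(self, limit=50):
        """Newest comments first, for the owner's review page."""
        newest = sorted(self._comments.values(),
                        key=lambda c: c.posted_at, reverse=True)
        return newest[:limit]

    def delete(self, ids):
        """Delete several comments in one call; return how many were removed."""
        return sum(1 for i in ids if self._comments.pop(i, None) is not None)
```

Note that in this sketch deleting a comment does not give back its slot in the window, so a flood cannot refill the review page faster than the cap allows. The cap bounds how much the owner ever has to look at; the actual spam-or-not decision stays with the human.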

Speaking to those who don't believe this, and giving these comments a place on my own site: The fundamental problem is that computers don't understand English, and until you solve that, a human must be in the loop. If you design for that, instead of trying to deny it, you can make this easy to deal with. Try to fight with automated solutions, and you'll only A: engage the spam tool makers, who enjoy a good techno-arms-race as much as the rest of us, and B: by so engaging them, encourage them; it's human nature to try to get around things.

Remember, your "technological solution" must jump over all of the following hurdles, in the end:

It must not dissuade normal, casual posters, who are the point of all of this.
It must be able to withstand people embedding IE directly into their spam tools (not hard!) and manipulating the web page to post comments directly. (Kiss goodbye your clever hidden field schemes, your weird Javascript schemes, anything else like that, because the browser is fully embeddable!)
It must still repel spammers who really are willing to invest some amount of effort per site; they may not bother with one site with extremely personalized defenses, but if, for example, all Movable Type sites went to using the same login scheme where you have to type a number that appears in a graphic on the screen, the spammers will happily let a human do that. The software will be written to present these all to the user quickly and conveniently, as long as enough sites are doing the same thing.

There's just no way to leap all of these hurdles at once, esp. because of the last one, which implies you're not fighting program vs. program, you're fighting spamming human vs. program. The only way to win that fight is to add human intelligence to your side so it's spamming human vs. website-owning human. Nothing else will do.