Weblog Theory and [ping,track,link]back

Note to self: Read the pingback spec. Form opinion. [Scripting News]

I cry the cry of the academic: "Let no good thing go unsullied with theory!"

The problem with [ping,track,link]back is that you end up sacrificing one of the very attributes of the weblog community that make it so desirable, namely, the lack of bi-directional links. The unidirectional nature of linking is both a liability and an asset, and it's a good idea to understand the dual nature of that before you go wrecking it because you only see the liability part.

Right now, none of the *Back mechanisms has reached the critical mass necessary to see the negative effects experienced in all other community models I know about. But that is an accident of history, not a permanent fact. Eventually, the same things will happen to the [track,ping]back systems (considered as a whole), because it's just the way things are. The spammers will come. (This is every spammer's dream. It beats the snot out of email spam.) The trolls will come. The loud-mouthed idiots will come. Those who see an opportunity to swiftly increase their PageRank will come. It flows from human nature and the nature of anything resembling a commons.

This is not to say such projects are a bad thing. In fact, I'd say they are a good thing. But some intelligence should be put in their design as early as possible to try to strike a happy medium between a "pure", trusting "pingback" implementation and the current near-total lack of bidirectionality. It's not impossible, but it takes some thought.

I have a now-ancient-by-weblog-standards implementation of a *Back system myself, called LinkBack. It predates the era where we expect everyone to be running either a server on their desktop or a very smart weblog system on the server, back when Manila was new and Blogger was just a blip on the screen, so there are some design elements to it I don't think I'd keep today. (Not to mention the three more years of experience I have since then in designing and implementing such things.) In particular, it was completely centralized, which I don't think would turn any heads today. (Perhaps I should say "is". It's not running, but it was technically fully functional before I decided I didn't want to deal with the bandwidth issues a successful centralized service would have needed. One of my mottos is "Never engage in an activity where the worst-case scenario is complete success." That's also the reason you should never play Russian Roulette. ;-) )

LinkBack was fully voluntary. LinkBack only scanned weblogs that chose to participate in the system. Especially after the precedent set by any number of subsequent weblog-based systems such as DayPop and that search dealy I can't recall the name of, it seems that I could have scanned any weblogs I felt like, but I chose this approach deliberately. The idea was to not have one LinkBack, but to spawn off several LinkBacks, each implicitly forming a community around them, and each weblog possibly joining several LinkBack hubs.

This gains some of the advantages of the bi-directional communication, without sacrificing all of the advantages of uni-directional linking. In order to be seen in LinkBack results, you must yourself be part of the system. Random yahoos can't get on your results list just by pinging your system and feeding it a faked page. In addition, LinkBack did not have to live on the home page of the weblog. By default, it had its own results pages that showed the results of the system scans. (Also prevents the PageRank attack implicitly; you don't need to have the links on your site, and I completely blocked Google from the LinkBack results, so they wouldn't have figured in at all.)
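A minimal sketch of the "you must be in the system to be seen" rule, not the actual LinkBack code. It assumes the scan has already produced, for each participating weblog, the set of URLs it links to, and that a weblog is identified by a single exact URL (a real scanner would need to match posts under that URL too). The name `linkback_results` is hypothetical.

```python
def linkback_results(outbound):
    """Build per-weblog results for one hub.

    outbound: dict mapping each participating weblog's URL to the set
    of URLs found on it during the scan (assumed input format).
    Returns a dict mapping each participant to the set of other
    participants that link to it.
    """
    members = set(outbound)
    results = {member: set() for member in members}
    for source, links in outbound.items():
        for target in links:
            # Only links between participants count; links to outsiders
            # are ignored, and outsiders are never scanned at all.
            if target in members and target != source:
                results[target].add(source)
    return results


scan = {
    "http://a.example/": {"http://b.example/", "http://outsider.example/"},
    "http://b.example/": {"http://a.example/"},
    "http://c.example/": {"http://a.example/"},
}
results = linkback_results(scan)
```

Because only participants are ever scanned, an outsider has no way to push a link into anyone's results, which is the whole point.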

(A few ideas for other implementors to pluck: Grab the text around the link and consider displaying it somewhere. The context can be useful. There are some easy algorithms for making this reasonable; I'd be happy to share them.)
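One possible way to grab the text around a link, offered as a sketch and not as the algorithm LinkBack used. It assumes the page is reasonably well-formed HTML and that the link's `href` matches the target URL exactly. The names `LinkContext`, `link_contexts` and the `width` parameter are made up for illustration; only Python's standard `html.parser` is used.

```python
from html.parser import HTMLParser


class LinkContext(HTMLParser):
    """Collects page text and remembers where links to `target` occur."""

    def __init__(self, target):
        super().__init__()
        self.target = target
        self.chunks = []    # text pieces, in document order
        self.length = 0     # characters collected so far
        self.hits = []      # (start, end) offsets of matching link text
        self._start = None  # offset where the current matching link began

    def handle_starttag(self, tag, attrs):
        if tag == "a" and dict(attrs).get("href") == self.target:
            self._start = self.length

    def handle_endtag(self, tag):
        if tag == "a" and self._start is not None:
            self.hits.append((self._start, self.length))
            self._start = None

    def handle_data(self, data):
        self.chunks.append(data)
        self.length += len(data)


def link_contexts(html, target, width=60):
    """Return one snippet per link to `target`: the link text plus up to
    `width` characters of surrounding text on each side."""
    parser = LinkContext(target)
    parser.feed(html)
    parser.close()
    text = "".join(parser.chunks)
    snippets = []
    for start, end in parser.hits:
        raw = text[max(0, start - width):end + width]
        snippets.append(" ".join(raw.split()))  # collapse whitespace
    return snippets


page = '<p>I read <a href="http://example.com/post">this post</a> today and liked it.</p>'
snippets = link_contexts(page, "http://example.com/post", width=10)
```

Cutting on a fixed character count can chop words in half; trimming to the nearest sentence or word boundary would be an obvious refinement.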

The real point of all of this is that the act of hosting a link is not an unmitigated good, and no successful system will be able to treat it as such. For every link you accept, there is a risk factor. The *Back system must compensate for this somehow, whether with my (eventual) multiple independent hub-style design, with some sort of white-list/black-list mechanism (scaling problems)*, or with a Web of Trust of some kind (ideal, and of course, hardest). Otherwise, the *Back system will inevitably be a victim of its own success and self-marginalize as the users become disillusioned by how the system is actually put to use.

*: tech note: At the very least, you could do a multi-whitelist type of thing that emulates the eventual design I wanted for LinkBack. Create many groups of weblogs, let people set up these groups however they want, let them join as many as they want, and accept only verified pings from those groups. Also let the user black-list as they see fit (very important!). This should be a tolerably good approximation of a web of trust for this application.
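The multi-whitelist idea in the tech note could look roughly like this. It is a sketch under stated assumptions, not a spec: "verified" is reduced to a boolean the caller supplies (for example, after fetching the pinging page and confirming the link is really there), and the `Hub` class, `accept_ping` function and group names are all hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class Hub:
    """Tracks which weblogs belong to which groups."""
    groups: dict = field(default_factory=dict)  # group name -> set of weblog URLs

    def join(self, group, weblog):
        # Anyone can create a group simply by being the first to join it.
        self.groups.setdefault(group, set()).add(weblog)

    def groups_of(self, weblog):
        return {name for name, members in self.groups.items() if weblog in members}


def accept_ping(hub, receiver, sender, blacklist, verified):
    """Accept a ping only if it is verified, the sender is not on the
    receiver's personal blacklist, and both share at least one group."""
    if not verified:
        return False
    if sender in blacklist:
        return False
    return bool(hub.groups_of(receiver) & hub.groups_of(sender))


hub = Hub()
hub.join("python-bloggers", "http://a.example/")
hub.join("python-bloggers", "http://b.example/")
hub.join("knitting", "http://c.example/")

ok = accept_ping(hub, "http://a.example/", "http://b.example/", set(), True)
stranger = accept_ping(hub, "http://a.example/", "http://c.example/", set(), True)
blocked = accept_ping(hub, "http://a.example/", "http://b.example/", {"http://b.example/"}, True)
unverified = accept_ping(hub, "http://a.example/", "http://b.example/", set(), False)
```

The blacklist check comes before the group check on purpose: the user's own judgment overrides whatever groups the sender has managed to join.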