Wednesday, January 30, 2008

Browser as information broker

The phrase "browser as information broker" has been around for a year or so, and finally some of the vision is becoming reality.

My interpretation of that vision is that the browser will link data on a web page to services that consume it. For example, if a date appears on a page you're browsing, you could drag it to your calendar application; if a location appears, you could open it in a map.

It's a powerful vision, and it's part of the Semantic Web mission. As Tim Berners-Lee explains, we've already gone from linking computers to linking web pages. Now we need to take the next step: linking data.

There are three parts to solving this problem - semantics, services, and connections.


Firstly, web developers have to mark sections of a page with meaning (this is a location, that is a date, etc.) in a standard way that computers can understand.

There are several active approaches:

  • HTML itself has existing elements (e.g. the "unordered list" element, <ul>) which indicate meaning, and more are being added in HTML5 - e.g. the <time> element.
  • HTML has several attributes, especially "class", that can be used in a semantic way. People are trying to standardise class names and HTML structure to indicate data like dates and locations; this is called Microformats.
  • Groups can register protocols (strictly speaking, URI schemes) for various content types. For example, the widely used "mailto:" protocol in HTML links typically opens an email application. Protocols are established via internet standards.
  • The W3C is pushing an ambitious language, RDF, as the foundation of its Semantic Web vision.

In the wild, the first three approaches have good momentum, perhaps because they work well with existing technologies, though they seem to compete with each other. If they hit limitations, RDF will be the obvious choice!
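
To make the first three approaches concrete, here is a minimal sketch of how a script might spot each kind of markup in a page. The names (SemanticItem, captures, extractSemanticItems) are made up for illustration, and the regular expressions assume double-quoted attributes in a fixed order; a real implementation would walk the DOM instead.

```typescript
// One piece of semantic data found in a page, and the approach that supplied it.
interface SemanticItem {
  kind: "date" | "email";
  value: string;
  source: "html5-time" | "microformat" | "protocol";
}

// Collect the first capture group of every match of a global regex.
function captures(pattern: RegExp, html: string): string[] {
  const found: string[] = [];
  let match: RegExpExecArray | null;
  while ((match = pattern.exec(html)) !== null) {
    found.push(match[1]);
  }
  return found;
}

// Hypothetical helper: scan an HTML string for the three kinds of markup.
function extractSemanticItems(html: string): SemanticItem[] {
  const items: SemanticItem[] = [];

  // HTML5 element: <time datetime="2008-01-30">
  for (const value of captures(/<time\b[^>]*\bdatetime="([^"]*)"/g, html)) {
    items.push({ kind: "date", value, source: "html5-time" });
  }

  // Microformat (hCalendar style): <abbr class="dtstart" title="...">
  // Assumes the class attribute comes before the title attribute.
  const dtstart = /<abbr\b[^>]*\bclass="[^"]*\bdtstart\b[^"]*"[^>]*\btitle="([^"]*)"/g;
  for (const value of captures(dtstart, html)) {
    items.push({ kind: "date", value, source: "microformat" });
  }

  // Protocol: <a href="mailto:someone@example.com">
  for (const value of captures(/<a\b[^>]*\bhref="mailto:([^"?]*)/g, html)) {
    items.push({ kind: "email", value, source: "protocol" });
  }

  return items;
}
```

All three carry much the same information; they differ only in where the page author puts it, which is partly why they end up competing.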


Secondly, people have to develop web services to manipulate data. Actually, a lot of this has already been done - what is Google Maps except a service to manipulate location data, or Windows Live Calendar except a service to manipulate dates and times?


Finally, the browser has to connect the user to relevant services when it spots data. For example, when it spots a location, it should present a simple interface that lets the user, if they wish, view it in Google Maps.
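
As a rough sketch of what that connection could look like inside the browser, the snippet below maps a kind of data to the services that can consume it and builds a link for each. Everything here is an assumption for illustration: the serviceLinks function, the {value} placeholder, the calendar.example.com address, and the particular Google Maps query URL.

```typescript
type DataKind = "location" | "date";

interface Service {
  name: string;
  // URL template; "{value}" is replaced with the URL-encoded data.
  template: string;
}

// Hypothetical registry of services per kind of data.
const services: Record<DataKind, Service[]> = {
  location: [
    { name: "Google Maps", template: "https://maps.google.com/maps?q={value}" },
  ],
  date: [
    { name: "Example Calendar", template: "https://calendar.example.com/new?start={value}" },
  ],
};

// Build the links the browser could offer when it spots a piece of data.
function serviceLinks(kind: DataKind, value: string): { name: string; url: string }[] {
  return services[kind].map((service) => ({
    name: service.name,
    // A replacer function avoids special handling of "$" in the replacement.
    url: service.template.replace("{value}", () => encodeURIComponent(value)),
  }));
}
```

Calling serviceLinks("location", "10 Downing Street, London") would give the browser one entry, labelled Google Maps, to show in whatever menu or popup it chooses.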

The forthcoming Firefox 3 enables these connections for protocols. It implements the HTML5 API for registering protocol handlers (navigator.registerProtocolHandler), which lets a website tell the browser to use it (for example, Yahoo! Mail) whenever the user follows a "mailto:" link.
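
Here is a minimal sketch of how a webmail site might use that API. The handler address (mail.example.com) and the helper names are hypothetical; the only real API involved is navigator.registerProtocolHandler, which takes a scheme and a URL template containing "%s". The second function imitates, in simplified form, what the browser does when the user later follows a link.

```typescript
// Hypothetical handler URL; the browser replaces "%s" with the escaped link.
const HANDLER_URL = "https://mail.example.com/compose?url=%s";

// The one method used here, so the sketch can be tried outside a browser.
interface ProtocolRegistrar {
  registerProtocolHandler(scheme: string, url: string): void;
}

// Ask the browser to send "mailto:" links to this site.
// Returns false when the API is missing (older browsers, non-browser runtimes).
function registerMailHandler(nav: ProtocolRegistrar | undefined): boolean {
  if (!nav || typeof nav.registerProtocolHandler !== "function") {
    return false;
  }
  nav.registerProtocolHandler("mailto", HANDLER_URL);
  return true;
}

// Simplified imitation of the browser side: substitute the link for "%s".
function resolveHandlerUrl(template: string, link: string): string {
  return template.replace("%s", () => encodeURIComponent(link));
}
```

In a page, the site would call registerMailHandler(navigator), and the browser would typically ask the user for permission before switching handlers.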

It will be fascinating to see how this evolves. For a protocol to become popular, web developers will have to be confident that high-quality services exist around it. Which protocols will make the grade?

There are no, or only very limited, automated browser connections for any of the other semantic approaches (HTML elements, microformats, and RDF). I would therefore predict that protocols will become the favoured approach to marking up HTML with extra meaning, with the possible exception of the HTML5 <time> element, which will work well with web forms.
