Thursday, August 02, 2007

Data mismatches

Over the years, each of the traditional system tiers - database, web server, browser - has grown more powerful, reliable and manageable.

The problem is, they still don't work with each other well!

I still haven't seen an elegant way to turn queries of relational tables into object-oriented code (although LINQ comes closest).

And using objects to manage semi-structured, hierarchical HTML doesn't work nicely either - that's why the DOM is so ugly.

Finally, placing (X)HTML neatly in a relational database is very awkward, although that hasn't stopped vendors from attempting it.

Different forms of data

That's because they all use different approaches to model data - table, object, and document.

Each approach has many advantages, each requires a different technical skill and personality type to use, and each works best in different circumstances. Unfortunately, they don't work together particularly well.

Can this last?

At the moment there are a few creaks, but no cracks. The creaks are:
  • The success of scripting languages like PHP in managing documents, rather than formal object-oriented code
  • Buzz around REST, which uses URLs and HTTP to retrieve (and even edit) data, hiding the underlying relational database
  • Relational databases increasingly outputting XML, rather than proprietary data

Another approach - REST, XQuery

It is now (just) possible to use the document approach throughout every tier. It makes code enormously easy to write and maintain, and it fits perfectly into the web.

You wouldn't want to do this for data-intensive applications, such as handling financial market data. But for document-intensive web applications, such as social networking, blogging and photo-sharing, it's perfect.

The idea is to follow the REST approach:

  • carefully construct a URL for every resource important to your site
  • decide which resources require create, read, update, and delete (CRUD) permissions
  • enable HTTP PUT, GET, POST, and DELETE commands against these URLs

Even if these resources are eventually stored in a relational database, this approach totally shields the relational viewpoint in favour of the document.
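As a rough illustration of the steps above, here is a minimal Python sketch of HTTP methods applied to resource URLs. Everything in it is an assumption made for the example: the `handle` function, the in-memory `_store`, the `/photos` URLs, and the choice to let POST create a child under a collection while PUT writes to a known URL. A real site would sit behind a web server and a proper data store.

```python
import itertools
from typing import Dict, Optional, Tuple

# Hypothetical in-memory store keyed by URL. It stands in for whatever
# really holds the data, which could just as well be a relational database.
_store: Dict[str, str] = {}
_next_id = itertools.count(1)


def handle(method: str, url: str, body: Optional[str] = None) -> Tuple[int, str]:
    """Apply one HTTP method to the resource at `url`; return (status, payload)."""
    if method == "GET":  # read
        return (200, _store[url]) if url in _store else (404, "")
    if method == "PUT":  # create or replace the document at a known URL
        status = 200 if url in _store else 201
        _store[url] = body or ""
        return status, url
    if method == "POST":  # create a new document under a collection URL
        child = f"{url.rstrip('/')}/{next(_next_id)}"
        _store[child] = body or ""
        return 201, child
    if method == "DELETE":  # delete
        return (204, "") if _store.pop(url, None) is not None else (404, "")
    return 405, ""  # method not allowed
```

The caller only ever sees URLs and documents. Swapping the dictionary for a relational database would not change a single call site, which is the shielding described above.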

You can even write most of the server-side code in XQuery. The advantage of doing this is that it fits perfectly with (X)HTML and REST - you can GET documents, extract the relevant parts using XPath, and insert them into page markup using straightforward inline code. No object orientation in sight!
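To make the "GET, extract, insert" flow concrete, here is a small sketch. It is written in Python rather than XQuery purely as a stand-in, and it uses the limited XPath subset in the standard library's `xml.etree.ElementTree`. The `POST_XML` document, its element names, and the `render` function are all invented for the example.

```python
import xml.etree.ElementTree as ET

# Hypothetical document, as it might come back from a GET on a post's URL.
POST_XML = """<post>
  <title>Data mismatches</title>
  <tags><tag>rest</tag><tag>xquery</tag></tags>
</post>"""


def render(post_xml: str) -> str:
    """Pull the title and tags out of a post document and wrap them in markup."""
    doc = ET.fromstring(post_xml)
    title = doc.findtext("title")                       # path to a child element
    tags = [t.text for t in doc.findall("./tags/tag")]  # path to repeated elements

    # Build the page fragment and drop the extracted values into it.
    page = ET.Element("div", {"class": "post"})
    ET.SubElement(page, "h1").text = title
    tag_list = ET.SubElement(page, "ul")
    for tag in tags:
        ET.SubElement(tag_list, "li").text = tag
    return ET.tostring(page, encoding="unicode")
```

In XQuery the path expressions would sit inline inside the markup itself, so the version above is wordier than the real thing would need to be.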

After all, server-side code does four things:

Action                      Technology
read / update a database    REST, HTTP
get / set HTTP headers      XQuery functions
manage sessions             XQuery functions
construct HTML output       XQuery, HTML

This approach is very useful if you've got a huge number of URIs (as per REST) - you just have a central application that parses the URI and returns the appropriate mashed-up resources. XQuery is good at this parsing and returning.
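A central application of this kind might look something like the sketch below, again in Python as a stand-in for XQuery. The `RESOURCES` table, the `/users/alice/...` paths, the `resolve` function, and the `<collection>` wrapper element are assumptions for illustration only.

```python
from urllib.parse import urlsplit

# Hypothetical resource documents keyed by URI path.
RESOURCES = {
    "/users/alice": "<user>alice</user>",
    "/users/alice/photos/1": "<photo>sunset</photo>",
    "/users/alice/photos/2": "<photo>harbour</photo>",
}


def resolve(uri: str) -> str:
    """Return the resource at `uri`, or a mash-up of everything beneath it."""
    # Keep only the path; scheme, host and query string are ignored here.
    path = urlsplit(uri).path.rstrip("/")
    if path in RESOURCES:
        return RESOURCES[path]
    # No exact match: gather every resource stored below this path.
    children = [doc for key, doc in sorted(RESOURCES.items())
                if key.startswith(path + "/")]
    if not children:
        raise KeyError(path)
    return "<collection>" + "".join(children) + "</collection>"
```

One function covers every URI on the site: an exact match returns a single document, and a collection path returns its children combined into one.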

One for the future

Unfortunately, the technologies behind REST and XQuery are still very immature and there isn't much support from libraries, documentation, or tools.

And given the immense installed base of relational databases and object-oriented code, and their use in so many different areas, I can't see their support diminishing soon.

That's ok - the point is that new ideas are still bubbling forward for improving developer productivity. SQL and OOP both pre-date the web; they have survived well, but it's always worth taking a step back and asking if there's a better approach.
