Tuesday, October 20, 2009

Who plays the part of transformation in mashups?

In the last week, two people have independently told me about an Australian government sponsored conference to create interesting mashup applications from government data.  I love the idea, and I'm really glad that the government believes that its data should be freely available.  I think most app providers are now realising the power of providing open access to their data to drive adoption.  In my opinion, however, independent transformation of data between web applications is still missing as a generic tool for mashup creators.

Generally, in enterprise as in mash-up applications, the source data is not in the correct format to be directly consumed by the final application.  As an example, I am writing an iPhone application which takes heart rate monitor recordings of your exercise and stores them as a Google spreadsheet.  The reason I chose to store the information in an online spreadsheet instead of a bespoke database service is that Google already provides all of the tools to make the data in the spreadsheet easily available as XML for others to consume.  It does this using the Atom protocol, which is great, but hardly easy to consume.
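To give a feel for why Atom is "hardly easy to consume", here is a minimal Python sketch of reading entries out of a feed.  The sample feed is made up for illustration (it is not real output from the spreadsheet service), and the `read_entries` name is my own.  Note that every lookup has to carry the Atom namespace, which is the sort of detail a consumer would rather not deal with.

```python
import xml.etree.ElementTree as ET

# The Atom namespace; every element lookup below has to carry it.
ATOM_NS = {"atom": "http://www.w3.org/2005/Atom"}

# Made-up sample feed for illustration, not real service output.
SAMPLE_FEED = """<feed xmlns="http://www.w3.org/2005/Atom">
  <title>Heart rate log</title>
  <entry>
    <title>2009-10-18 07:30</title>
    <content type="text">bpm: 142</content>
  </entry>
  <entry>
    <title>2009-10-18 07:31</title>
    <content type="text">bpm: 147</content>
  </entry>
</feed>"""


def read_entries(feed_xml):
    """Return a (title, content) pair for each Atom entry in the feed."""
    root = ET.fromstring(feed_xml)
    entries = []
    for entry in root.findall("atom:entry", ATOM_NS):
        title = entry.findtext("atom:title", default="", namespaces=ATOM_NS)
        content = entry.findtext("atom:content", default="", namespaces=ATOM_NS)
        entries.append((title, content))
    return entries
```

Calling `read_entries(SAMPLE_FEED)` gives back plain pairs of strings, which still need further unpacking before an application can use the numbers.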

Traditionally, a mash-up is seen as the combination of data from one or more external sources with a JavaScript-driven user interface.  The data flow looks something like this.



This is great; however, it introduces a great deal of coupling between the components.  The mashup provider needs to communicate directly with the data sources, transform their data into its own native format, then consume it.  There's no opportunity to substitute a different data source if one becomes available, or to easily fix things if the source data format changes slightly.  In enterprise applications, this has long been recognised as an issue, and ESBs were developed as a way of handling it.  When an ESB is used correctly, the source data (or application) is abstracted from the destination by a transformation process, usually performed by XSLT.

I think that the same approach should be used for mash-up style applications.  The big advantage this brings is that it releases the data from the application (and the user interface).  More importantly, it allows the application itself to fetch data from different sources.  It is no longer limited to the sources that the programmers put in.  A sufficiently talented user can take any data source, transform it into the format that the application expects, and then get the application to point at that transformed source.
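As a rough sketch of that decoupling, the Python below keeps the "application" ignorant of where its data came from.  It uses plain Python functions as the transform step rather than XSLT, purely to keep the example self-contained.  All of the names, the record layout and the "bpm: <number>" content format are assumptions of mine for illustration.

```python
import xml.etree.ElementTree as ET

# Atom namespace in ElementTree's "{uri}tag" form.
ATOM = "{http://www.w3.org/2005/Atom}"


def atom_to_readings(feed_xml):
    """Transform step: turn Atom entries into the flat records the app expects."""
    root = ET.fromstring(feed_xml)
    readings = []
    for entry in root.iter(ATOM + "entry"):
        time = entry.findtext(ATOM + "title", default="")
        content = entry.findtext(ATOM + "content", default="")
        # Assumed content layout: "bpm: <number>"
        bpm = int(content.split(":", 1)[1])
        readings.append({"time": time, "bpm": bpm})
    return readings


def csv_to_readings(csv_text):
    """A second, user-supplied transform for a completely different source."""
    readings = []
    for line in csv_text.strip().splitlines():
        time, bpm = line.split(",")
        readings.append({"time": time.strip(), "bpm": int(bpm)})
    return readings


def average_bpm(readings):
    """The 'application': it only knows the target format, never the source."""
    return sum(r["bpm"] for r in readings) / len(readings)
```

Here `average_bpm` works unchanged whether the readings came through `atom_to_readings` or `csv_to_readings`, so adding a new source only means writing a new transform.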



For this to work, there is a need for a generic XSLT service that can take a data feed and an XSL style sheet and produce the desired output.  W3C provide a service which does exactly this.  Unfortunately, the bandwidth requirements of any large enterprise use would crush their server, so they have put limitations on the service to restrict it to personal use.  This is a shame.  I've written a very similar service for Bleep, but it is run as a free Google App Engine app, which has quite severe resource limitations of its own.  I reckon Google should release a transformation service of its own.  It would be very useful in many of its apps.  There's no way to make advertising revenue off it though :-/
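From the application's point of view, using such a service could be as simple as pointing at a different URL.  The sketch below builds that URL in Python.  The endpoint `transform.example.org` and the `xslfile` / `xmlfile` parameter names are hypothetical, chosen only for illustration, and are not the interface of any particular real service.

```python
from urllib.parse import urlencode

# Hypothetical endpoint and parameter names, for illustration only.
SERVICE_URL = "https://transform.example.org/xslt"


def transformed_feed_url(stylesheet_url, feed_url, service_url=SERVICE_URL):
    """Build the URL an application would fetch instead of the raw feed."""
    # urlencode percent-escapes both URLs so they survive as query values.
    query = urlencode({"xslfile": stylesheet_url, "xmlfile": feed_url})
    return f"{service_url}?{query}"
```

The application then fetches the returned URL exactly as it would any other data source, and swapping the feed or the style sheet is just a matter of changing one argument.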

It's not really within the average person's skill set to write a transform.  Many software engineers do not know how to do it properly.  In the future, I'd like to think a web app will be written which brings it more within the reach of normal internet consumers.

To bring this back to the govhack conference I mentioned at the beginning of this post, I think that it's good that the government wants people to make mashups, but in some ways they are a little misguided.  It's not just about the applications.  Most people will just put some data up on a Google map, which is hardly innovative.  Instead, what I would like to see is people taking the source data, transforming it and correlating it against other data sources to produce new data sets.  Then it's possible for any number of people to take that data and visualise it in cool and impressive ways.

For Bleep, some of my random thoughts on data transformation and visualisation have been gathered at this page
