Blogger: Lyn Robison
In Burton Group’s recent “2010 Planning Guide: Data Management Strategies” paper, I said, “the data management foundation that [relational] DBMSs provide is not adequate for all of the needs of modern enterprises.” Clearly, I believe the era in which enterprises use relational database servers to store all of their data is nearing an end.
Today I listened to a briefing from a company that is betting its future on the “internet-scale data processing” that Hadoop offers. I imagine that this company will do fine, but it appears to me that Hadoop and NoSQL databases will not supplant relational database servers in enterprises until there are clear standards for client data access and processing. The widespread adoption of Structured Query Language is largely what made relational databases so popular. SQL made it possible for clients (both humans and software applications) to easily process and retrieve data in relational data stores. Until we see the widespread adoption of some programming metaphor or language for client data access and data processing in the NoSQL world, NoSQL databases are likely to remain niche products.
Does this mean that enterprises are stuck forever in the relational-only world of DBMSs? Not necessarily. The NoSQL world has no equivalent to SQL … except, of course, XQuery.
XQuery is poised to do for XML databases what SQL did for relational databases. Using XQuery, clients (both humans and software applications) can easily process and retrieve data in XML data stores. This means that developers can use XQuery with their favorite object-oriented programming language to build XML-based applications. Application developers now have a choice for their data storage needs that enables them to select the meta-model that best fits their data. The relational meta-model is best for tabular, structured data, while the XML meta-model is best for document-centric, semi-structured data.
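To make the idea of querying an XML store from a host language concrete, here is a minimal sketch. It rests on several assumptions: Python's standard library has no XQuery engine, so the XPath subset in xml.etree.ElementTree stands in for XQuery, and the catalog document, its element names, and the titles_with_status helper are all invented for illustration. A real XML database would evaluate the query on the server rather than in application code.

```python
import xml.etree.ElementTree as ET

# Hypothetical document-centric data; element and attribute names are invented.
CATALOG = """
<catalog>
  <report id="r1" status="draft"><title>Q3 Forecast</title></report>
  <report id="r2" status="final"><title>Data Management Plan</title></report>
  <report id="r3" status="final"><title>Storage Review</title></report>
</catalog>
"""

def titles_with_status(xml_text, status):
    """Return the titles of <report> elements whose status attribute matches."""
    root = ET.fromstring(xml_text)
    # Roughly the XQuery expression:
    #   for $r in /catalog/report[@status = 'final'] return $r/title/text()
    # expressed here with ElementTree's XPath subset (a stand-in, not XQuery).
    matches = root.findall(f"./report[@status='{status}']")
    return [report.findtext("title") for report in matches]

final_titles = titles_with_status(CATALOG, "final")
print(final_titles)
```

The query selects by the shape of the document instead of by table and column, which is the fit with semi-structured data described above.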
As I said, developers can use object-oriented programming languages with XQuery to build XML-based applications -- that is, if they want to build those XML-based applications the wrong way and write a bunch of unnecessary software. In one vital sense, XML data is fundamentally more powerful than relational data: XML data does not need to be instantiated as objects in OO code in order to be processed and presented to users. Instead, XML data can be decorated and enriched with tags, and these tags can be processed and understood by software that enterprise IT groups do not have to write. If an enterprise’s data is in XML, lots of off-the-shelf, well-tested, well-proven, enterprise-ready software packages can process, display, and edit that data.
The software applications that can understand XML tags include Microsoft Word, Excel, and InfoPath, as well as the OpenOffice productivity applications. In other words, enterprises can now use Office apps, including Word, Excel, and InfoPath, as the user interface for presenting and editing enterprise data, provided that data is in XML -- more specifically, provided that data is in the native XML file format of one of the Office applications. But how do you get enterprise data into the native XML file format of Word, Excel, or InfoPath? You use an XML database such as MarkLogic Server that understands those Office XML file formats and can transform enterprise data into and out of them.
The vendors of relational database servers have not yet made their DBMSs capable of transforming enterprise data into and out of those Office XML file formats, which shows that they are somewhat out of touch with the power of XML-based enterprise data. At least the relational database vendors have been smart enough to make their DBMSs expose relational data as XML, which means that application developers can choose the right meta-model (relational for tabular data and XML for semi-structured data) and then treat all of that data as if it were XML. By treating all data as if it were XML, developers can finally say goodbye to the ever-troublesome object-relational impedance mismatch.
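As a rough illustration of what "exposing relational data as XML" means, the sketch below turns the rows of a table into XML elements. It is an assumption-laden stand-in: the conversion happens in application code, not inside the DBMS as the vendors' features do, SQLite is used only because it ships with Python, and the customer table and the rows_as_xml helper are hypothetical.

```python
import sqlite3
import xml.etree.ElementTree as ET

def rows_as_xml(conn, table):
    """Serialize every row of `table` as a <row> element under a <table> root.

    Assumes column names are valid XML element names.
    """
    cur = conn.execute(f'SELECT * FROM "{table}"')
    columns = [desc[0] for desc in cur.description]
    root = ET.Element("table", name=table)
    for record in cur:
        row = ET.SubElement(root, "row")
        for col, value in zip(columns, record):
            # NULL becomes an empty element; everything else is stringified.
            ET.SubElement(row, col).text = "" if value is None else str(value)
    return root

# Hypothetical tabular data in an in-memory database.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE customer (id INTEGER PRIMARY KEY, name TEXT)")
conn.executemany(
    "INSERT INTO customer (id, name) VALUES (?, ?)",
    [(1, "Acme"), (2, "Globex")],
)

customers = rows_as_xml(conn, "customer")
names = [row.findtext("name") for row in customers.findall("row")]
print(ET.tostring(customers, encoding="unicode"))
```

Once the rows are XML, the same path-based queries used for document data apply to the tabular data as well, with no intermediate object model.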
These are radical ideas that can take solution delivery in enterprise IT to new heights of effectiveness. These ideas will no doubt be viewed as heresy in certain circles. I have published my heretical ideas in a document entitled “The Methodology for Overcoming Data Silos (MODS): Using the New XQuery Development Stack”. In this document, I specify an application development stack that is built around XML- and XQuery-based development tools. Give it a read and let me know what you think.

Lyn,
I don't think your discussion is heresy. In fact, I've worked for 11 years to build an object-oriented data store (OODS) that does exactly what you say. Instead of using XQuery, though, I've built OSQL. In addition, the OODS can import and export XML documents appropriately tagged with classes already defined in the OODS.
The OODS also provides for data extensibility and helps to promote reusable-repository objects (RROs). First, the OODS captures how the objects are linked together, and removes the need for business logic to be embedded in hard code. Second, the structure of the data doesn't need to change as new data is added to the system.
This has additional benefits: Existing applications do not need to be modified as new data is added. New application development projects don't have to redevelop the business logic needed to recover the objects from the OODS. And, depending on the type of object explorer implemented, for example a Form Builder and Work Flow Modeler, the data can be collected and disseminated to anyone that needs it.
I do think, though, that it will take time for people to fully understand the power of such a system.
Mike,
Posted by: Michael J. Fuhrman | October 08, 2009 at 01:14 PM
I am afraid that I am not a big fan of object databases. OO programming is intended to optimize the programmer, not the value or usefulness of the data. XML, on the other hand, is intended to optimize the data. If you use XQuery with XML for all data representations, and never write any OO code, you optimize the data's usefulness and accessibility, because data in XML can be accessed and processed by almost any software on the market. Data in an object database can't.
Posted by: Lyn Robison | October 14, 2009 at 04:36 PM
Michael,
How do InterSystems' Cache features compare to your OODS?
Posted by: Bulat Ashimov | November 05, 2009 at 06:19 AM