Blogger: Lyn Robison
In Burton Group’s recent “2010 Planning Guide: Data Management Strategies” paper, I said, “the data management foundation that [relational] DBMSs provide is not adequate for all of the needs of modern enterprises.” Clearly, I believe the era in which enterprises use relational database servers to store all of their data is nearing an end.
Today I listened to a briefing from a company that is betting its future on the “internet-scale data processing” that Hadoop offers. I imagine that this company will do fine, but it appears to me that Hadoop and NoSQL databases will not supplant relational database servers in enterprises until there are clear standards for client data access and processing. The widespread adoption of Structured Query Language is largely what made relational databases so popular. SQL made it possible for clients (both humans and software applications) to easily process and retrieve data in relational data stores. Until we see the widespread adoption of some programming metaphor or language for client data access and data processing in the NoSQL world, NoSQL databases are likely to remain niche products.
Does this mean that enterprises are stuck forever in the relational-only world of DBMSs? Not necessarily. In the NoSQL world there is no equivalent to SQL … except, of course, XQuery.
XQuery is poised to do for XML databases what SQL did for relational databases. Using XQuery, clients (both humans and software applications) can easily process and retrieve data in XML data stores. This means that developers can use XQuery with their favorite object-oriented programming language to build XML-based applications. Application developers now have a choice for their data storage needs that enables them to select the meta-model that best fits their data. The relational meta-model is best for tabular, structured data, while the XML meta-model is best for document-centric, semi-structured data.
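To make the idea of querying semi-structured data concrete, here is a minimal sketch in Python. It is not XQuery: it uses the limited XPath subset in Python's standard `xml.etree.ElementTree` module as a stand-in, and the roughly equivalent XQuery expression appears only as a comment. The contract data, element names, and the function `active_parties_with_renewal` are all hypothetical, invented for illustration.

```python
import xml.etree.ElementTree as ET

# Hypothetical semi-structured data: each contract may carry zero or more
# clauses, which is awkward to model as fixed relational columns.
DOC = """
<contracts>
  <contract id="c1" status="active">
    <party>Acme</party>
    <clause type="renewal">Renews annually.</clause>
  </contract>
  <contract id="c2" status="expired">
    <party>Globex</party>
  </contract>
  <contract id="c3" status="active">
    <party>Initech</party>
    <clause type="renewal">Renews every two years.</clause>
    <clause type="liability">Capped at fees paid.</clause>
  </contract>
</contracts>
"""

root = ET.fromstring(DOC)


def active_parties_with_renewal(contracts):
    """Return party names of active contracts that have a renewal clause.

    Roughly what this XQuery expression would return on an XQuery processor:
        for $c in /contracts/contract[@status = "active"]
        where exists($c/clause[@type = "renewal"])
        return $c/party/text()
    """
    parties = []
    for contract in contracts.findall("./contract[@status='active']"):
        if contract.find("./clause[@type='renewal']") is not None:
            parties.append(contract.findtext("party"))
    return parties


print(active_parties_with_renewal(root))
```

The point of the sketch is that the query addresses the data by its tags and structure; nothing requires the contracts to share the same set of child elements.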
As I said, developers can use object-oriented programming languages with XQuery to build XML-based applications -- that is, if they want to build those XML-based applications the wrong way and write a bunch of unnecessary software. In one vital sense, XML data is fundamentally more powerful than relational data: XML data does not need to be instantiated as objects in OO code in order to be processed and presented to users. Instead, XML data can be decorated and enriched with tags, and these tags can be processed and understood by software that enterprise IT groups do not have to write. If an enterprise’s data is in XML, lots of off-the-shelf, well-tested, well-proven, enterprise-ready software packages can process, display, and edit that data.
The software applications that can understand XML tags include Microsoft Word, Excel, and InfoPath, as well as the OpenOffice productivity applications. In other words, enterprises can now use Office apps, including Word, Excel, and InfoPath, as the user interface for presenting and editing enterprise data, provided that data is in XML -- more specifically, provided that data is in the native XML file format of one of the Office applications. But how do you get enterprise data into the native XML file format of Word, Excel, or InfoPath? You use an XML database such as MarkLogic Server that understands those Office XML file formats and can transform enterprise data into and out of them.
The vendors of relational database servers have not yet made their DBMSs capable of transforming enterprise data into and out of those Office XML file formats, which shows that they are somewhat out of touch with the power of XML-based enterprise data. At least the relational database vendors have been smart enough to make their DBMSs expose relational data as XML, which means that application developers can choose the right meta-model (relational for tabular data and XML for semi-structured data) and then treat all of that data as if it were XML. By treating all data as if it were XML, developers can finally say goodbye to the ever-troublesome object-relational impedance mismatch.
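As a rough illustration of what "exposing relational data as XML" means, the sketch below wraps the rows of a table in XML elements so that client code can walk them by tag name. It is hand-rolled in Python with the standard `sqlite3` and `xml.etree.ElementTree` modules. It does not use any vendor's built-in SQL/XML feature, and the `orders` table, its columns, and the helper `rows_to_xml` are assumptions made up for this example.

```python
import sqlite3
import xml.etree.ElementTree as ET


def rows_to_xml(conn, table, row_tag):
    """Wrap every row of `table` in a `row_tag` element, one child per column.

    Hypothetical helper: `table` is interpolated into the SQL text, so it
    must come from trusted code, never from user input.
    """
    cur = conn.execute(f"SELECT * FROM {table}")
    columns = [desc[0] for desc in cur.description]
    root = ET.Element(table)
    for row in cur:
        elem = ET.SubElement(root, row_tag)
        for name, value in zip(columns, row):
            # NULL becomes an empty element; everything else is stringified.
            ET.SubElement(elem, name).text = "" if value is None else str(value)
    return root


# An in-memory database with a small, made-up tabular data set.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (id INTEGER, customer TEXT, total REAL)")
conn.executemany(
    "INSERT INTO orders VALUES (?, ?, ?)",
    [(1, "Acme", 120.0), (2, "Globex", 75.5)],
)

orders = rows_to_xml(conn, "orders", "order")
print(ET.tostring(orders, encoding="unicode"))

# Once the rows are XML, they are addressed by tag, like any other XML data.
customers = [o.findtext("customer") for o in orders.findall("order")]
print(customers)
```

In a real deployment the DBMS would produce the XML itself; the sketch only shows that, once tabular rows are serialized this way, the consuming code no longer needs a per-table object model.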
These are radical ideas that can take solution delivery in enterprise IT to new heights of effectiveness. These ideas will no doubt be viewed as heresy in certain circles. I have published my heretical ideas in a document entitled “The Methodology for Overcoming Data Silos (MODS): Using the New XQuery Development Stack”. In this document, I specify an application development stack that is built around XML- and XQuery-based development tools. Give it a read and let me know what you think.
