Blogger: Lyn Robison
When I talk about innovative ways to use XML in enterprise IT, people often object: “You can’t use XML to do that! XML is only …” They then go on to describe, dogmatically, their particular view of XML’s narrow applicability. Their view might be that XML is only for configuration files. Or they might say that XML is only helpful for exchanging data between applications. Or they (the forward-thinking folks, anyway) might claim that XML is only useful for content. The common theme is “XML is only …”
At the opposite extreme from the view that XML is good only for narrow applications is the notion that XML is magic pixie dust that can make information float effortlessly across the enterprise to any businessperson who needs it.
From my perspective, XML is not quite pixie dust, but it is a nice compromise between structured and unstructured data. Of course, XML handles neither kind of data in an ideal way (compromises are never ideal), but it handles both well enough to be a compelling platform for information management.
My colleague Joe Maguire once explained structured/unstructured data this way: when people go into a restaurant and look at the menu, they intuitively understand that the list of entrées is structured data: the name of the dish, its description, its price. When people read the paragraph or two of text on the back of the menu about the history of the restaurant, they intuitively understand that they are looking at unstructured data in the form of flowing text. Humans can easily understand these two “meta-models” (structured data and unstructured data) and we can easily switch between them. Computers, on the other hand, have more difficulty.
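To make the menu analogy concrete, here is a minimal sketch in Python of a single XML document that holds both kinds of data. The menu contents and the element names (`menu`, `entree`, `history`, and so on) are invented for illustration; they are not taken from any real schema or product. The entrée list reads like rows in a table, while the history paragraph is flowing text with a little inline markup.

```python
import xml.etree.ElementTree as ET

# Hypothetical menu document: element names and contents are made up
# for this illustration.
MENU_XML = """
<menu>
  <entrees>
    <entree>
      <name>Grilled Salmon</name>
      <description>Served with lemon butter and rice.</description>
      <price currency="USD">18.50</price>
    </entree>
    <entree>
      <name>Mushroom Risotto</name>
      <description>Arborio rice with wild mushrooms.</description>
      <price currency="USD">14.00</price>
    </entree>
  </entrees>
  <history>
    <p>Our restaurant opened in <year>1962</year> as a small
    family diner, and it has been run by the <em>same family</em>
    ever since.</p>
  </history>
</menu>
""".strip()


def read_entrees(root):
    """Treat the entree list as structured data: one (name, price) per dish."""
    return [
        (entree.findtext("name"), float(entree.findtext("price")))
        for entree in root.iter("entree")
    ]


def read_history(root):
    """Treat the history as unstructured data: flatten the mixed content
    (text interleaved with inline elements) into one run of plain text."""
    history = root.find("history")
    text = "".join(history.itertext())
    return " ".join(text.split())  # collapse line breaks and indentation


root = ET.fromstring(MENU_XML)
entrees = read_entrees(root)
history_text = read_history(root)

for name, price in entrees:
    print(f"{name}: {price:.2f}")
print(history_text)
```

The same parser reads both parts of the document; only the way we walk the tree changes, which is the sense in which XML sits between the two meta-models.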
In enterprise IT, we use DBMSs to handle structured data. We (the forward-thinking folks amongst us, anyway) have only recently figured out that we should use XML to handle unstructured data. Before that, we stored unstructured, document-centric data in proprietary document repositories, or we just parked documents as files on network file shares. Now it is apparent that an XML database, such as MarkLogic Server, is the best way to handle unstructured, document-centric data. MarkLogic Server holds the position in the XML database market that Oracle held in the early relational database market: it is the first commercially viable product.
Productivity applications, such as Microsoft Office, use XML (OOXML) as their native file format. The XML capabilities of Oracle, SQL Server, and DB2 are not as robust as MarkLogic Server’s. (MarkLogic natively understands OOXML and can process enterprise data directly within Office documents, but inexplicably, the big three DBMSs cannot.) However, the big three DBMSs at least can render their structured data as XML. Cloud-based applications tend to favor data that is in XML. We are reaching the point where all of an enterprise’s data can be stored and processed as XML. Stay tuned. XML is going to become a powerful platform for information management in enterprise IT.