Blogger: Lyn Robison
I was not trying to be controversial with my previous post, but apparently I struck a nerve. Perhaps it will be helpful for me to explain the background logic that went into the ideas I presented.
First, let me acknowledge that there are areas outside of data and process that are vital to the success of enterprise IT. By mentioning the data vs. process debate, I am not advocating the removal of project management, security, collaboration, identity management, or data center management from the list of vital concerns for enterprise IT. We in IT must do all of those things, and more. (If anyone has something to add regarding one of these other vital concerns that is related to anything that I say about data, I am all ears.)
Data management needs to appear on the list of vital concerns for enterprise IT as well. In my work in Burton Group’s Data Management Strategies area, I talk to lots of IT professionals about data management practices such as MDM, ILM, MODS, IQ, and BI. When it comes to the more basic data management practices that we advocate, sometimes IT professionals ask me, “Why would we need to do those things? What’s the ROI on that?” Clearly, from these folks’ perspective, data management isn’t really on the list of vital concerns for IT.
The way that people think about data largely determines their innate desire to perform basic data management practices. If someone thinks about data only in a limited way, they will not perceive the need to carry out the basic practices of data management. Let me give three broad characterizations of how I’ve seen IT professionals think about data.
1. Input/Output: IT professionals view data in the context of the application or the system or the business process that created it. Data is viewed merely as the input/output for business processes. In this way of thinking, if a process needs some data as input, it is perfectly okay to blithely replicate some data so that the process can get the data input it needs. The processes are the priority, and the data is simply the juice to get the processes started and merely the byproduct after the processes are completed. This is the way that SOAP seems to have been designed: for feeding input to and for obtaining output from processes. This way of thinking is a hold-over from the batch data processing mainframe days, when the computer systems made no real attempt to preserve data, and instead merely processed it. Data produced through this way of thinking is at best a distorted, inconsistent, incomplete picture of reality, because the computer systems are not designed to preserve accurate data, and instead are designed simply to execute processes.
2. Fungible Commodity: IT professionals view data in the context of the type or topic of the data. Computer systems are designed to preserve data, but the data is viewed merely as a fungible commodity that can be easily aggregated. A large corporation’s financial statements are a prime example of this. Numbers of dollars for revenue, expenses, assets, and liabilities are totaled up for each division. There is no need to keep track of individual dollars, because dollars are fungible assets: one dollar is just as good as another and all that is needed are the totals. In this way of thinking, databases are viewed as containers that hold data the way that steel barrels hold crude oil or railroad cars hold lumber. IT professionals create databases and then the business processes fill ’em up with data. Data warehouses and BI tools are used to aggregate this data. Data produced through this perspective makes it possible for corporations to manage fungible assets such as money, but they cannot manage non-fungible assets such as customers, because the computer systems are not designed to preserve unique instance data or to identify individual assets across silos. (Hence the huge amounts of money that were spent on ineffective CRM systems over the past couple of decades.)
3. Picture of Reality: IT professionals view data in the context of its real-world counterparts – as representations of individual things in the real world. In this way of thinking, data is a picture that must accurately represent reality, otherwise the data is useless. The computer systems are designed to manage non-fungible assets, in that there is a 1-to-1 mapping between a data entity and its corresponding real-life object. If a corporation has 3,592 customers, its computer systems will contain exactly 3,592 customer identifiers. Data about customers will of course be spread across any number of silo-ed systems, but those systems will all use the appropriate customer identifiers for the data that relates to each one of those 3,592 customers. This is the way that REST and resource-oriented architectures seem to be designed: every entity is given a unique identifier. REST has built-in methods to manipulate the data that represents individual assets, even across multiple silos. MODS is a data management methodology for reconciling and then maintaining those unique identifiers for each asset. The XQuery Dev Stack is the set of technologies that is best-suited to implementing MODS. Data produced through this perspective makes it possible for enterprises to act on accurate intelligence and to take actions that are based on the truth. Data such as this makes it possible to accurately evaluate systemic risk on a macro level, as well as to manage individual assets on the micro level. In short, data created through this way of thinking is an accurate picture of the realities that are important to the business.
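To make the contrast between the second and third views concrete, here is a minimal Python sketch. Everything in it is an assumption made up for illustration: the two silos, their field names, the customer identifiers, and the crosswalk table that maps each silo's local key to a shared identifier. It is not MODS itself, only a rough picture of what "one identifier per real-world customer, used across silos" might look like.

```python
from collections import defaultdict

# Hypothetical records from two siloed systems; each silo uses its own local key.
billing = [
    {"account_no": "B-100", "name": "Acme Corp", "revenue": 1200},
    {"account_no": "B-101", "name": "Globex", "revenue": 800},
]
support = [
    {"ticket_owner": "S-7", "name": "ACME Corporation", "open_tickets": 3},
    {"ticket_owner": "S-9", "name": "Initech", "open_tickets": 1},
]

# Hypothetical crosswalk: (silo, local key) -> shared customer identifier.
# Maintaining a table like this is the reconciliation work the text describes.
crosswalk = {
    ("billing", "B-100"): "cust-0001",
    ("support", "S-7"): "cust-0001",  # same real-world customer as B-100
    ("billing", "B-101"): "cust-0002",
    ("support", "S-9"): "cust-0003",
}


def fungible_view(rows):
    """'Fungible Commodity' view: only the total matters, not who it came from."""
    return sum(row["revenue"] for row in rows)


def picture_of_reality(silos, crosswalk):
    """'Picture of Reality' view: group every silo's rows under one identifier."""
    customers = defaultdict(dict)
    for silo_name, (key_field, rows) in silos.items():
        for row in rows:
            customer_id = crosswalk[(silo_name, row[key_field])]
            customers[customer_id][silo_name] = row
    return dict(customers)


silos = {
    "billing": ("account_no", billing),
    "support": ("ticket_owner", support),
}
customers = picture_of_reality(silos, crosswalk)
total_revenue = fungible_view(billing)
```

With the crosswalk in place, the four silo records resolve to three distinct customers, and the billing and support data for "Acme Corp" and "ACME Corporation" land under the same identifier. The aggregate view, by contrast, can report total revenue but cannot say how many customers there are.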
Enterprise data is a mess because too many IT implementers and IT leaders view data only as “Input/Output” or as a “Fungible Commodity”. Those IT implementers and IT leaders who view data as a “Picture of Reality” are naturally eager to perform the basics of data management that we advocate in Burton Group’s Data Management Strategies area.
My previous blog post on the Data vs. Process Debate, along with my other blog posts, is intended to persuade IT professionals to think of data as a picture of reality. This concept is at the heart of what we do in Burton Group’s Data Management Strategies area.

Lyn, I totally agree on the picture of reality approach.
According to Wikipedia, data may be of high quality in two alternative ways:
• Either they are fit for their intended uses
• Or they correctly represent the real-world construct to which they refer
When data quality becomes enterprise-wide, we have to take several different purposes into consideration, not least in master data management.
My thesis is that, as more and more purposes are included, there is a break-even point beyond which it is less cumbersome to reflect the real-world object than to try to align all known purposes.
More on this here:
http://liliendahl.wordpress.com/2009/11/12/sharing-data-is-key-to-a-single-version-of-the-truth
Posted by: Henrik Liliendahl Sørensen | December 07, 2009 at 06:18 AM