Blogger: Lyn Robison
Business intelligence (BI) system implementations follow a cycle. The cycle starts when businesspeople begin seeing the need for better intelligence. The decision is eventually made to implement a new BI system. The enterprise spends hundreds of thousands of dollars on BI, ETL (extract, transform, load), and data warehouse (DW) tooling. It spins up a project or series of projects to load the data into the DWs and to create the queries and reports that deliver the intelligence to the businesspeople. With the cost of time and labor, development tools, software licenses, and computer hardware, it is quite easy for an enterprise to spend a million dollars or more on a new BI system.
The sophisticated tooling, the complex BI systems, and the cutting-edge technology are impressive. It is easy to think of the BI system as a large, highly developed, powerful engine. All that is needed to fire this engine up is some fuel. That fuel, the data, comes from the enterprise’s operational systems. Powerful ETL tools extract the data from the operational systems and transform the data so that its shape conforms to the mold in the DW. The ETL processes load the data the way that barrels of oil or stacks of lumber or bushels of wheat are loaded into warehouses (after all, data is nothing but a fungible commodity that can be easily replicated, aggregated, summarized, sliced and diced, right?). Once the data is loaded, the BI engine does its work beautifully. The engine plows through huge quantities of data, aggregating it, summarizing it, analyzing it, and allowing businesspeople to mine it for gems. Sure, the data is not perfectly accurate, but everyone assumes that the numbers will be close enough. After all, smart people are accustomed to working with uncertainty. And look at that BI system do its work! The complex, powerful BI system is truly a sight to behold, a thing of beauty, and the IT people are justifiably proud of their hard work.
Then the questions start. A few businesspeople ask to see where the BI system’s numbers are coming from. Are the operational systems from which the BI data is pulled the authoritative sources for information on the topic at hand? Did data from all of the relevant data sources get included? Is there any documentation on the BI system’s data handling, such as data models or data flow diagrams that show the data’s lineage? What calculations and processes does the BI system use to produce this information? Do those calculations and processes reflect the corporation’s policies and business rules? Does the information reflect the changes that occurred when that new manufacturing plant came online or when we switched to that other supplier? Were the identifiers reconciled properly between systems? Can anyone say for sure whether or not customer XYZ and customer 47 are actually the same customer? If not, how do we know that the BI system is summarizing the data about each of these customers correctly? If the numbers from the BI system are off, how far off are they? How can we know?
These questions about the reliability of the BI information eventually become a problem. No one can give a satisfactory ROI calculation for any data quality efforts. So data quality issues are never addressed in a systematic way, and as a result, data quality is consistently neglected and continually deteriorates. Eventually, businesspeople aren’t sure how much they can trust the numbers coming from the new BI system. And so the cycle restarts: businesspeople begin seeing the need for better intelligence, and the decision is made to implement a new BI system. Here we go again. Let’s build another BI system!
In this cycle, the enterprise tries in vain to address problems of information reliability and usefulness by creating yet another system. Unfortunately, that does not solve the enterprise’s basic problem. The enterprise does not lack systems. The enterprise lacks reliable, useful business information.
Eventually, the business leadership sees that they are spending millions of dollars on IT, and they begin to look for cheaper ways to get business intelligence, through approaches like IT outsourcing, cloud, and SaaS. If they could peer into the future, they would see that IT externalization will at best give the business the same mess for less.
The fundamental problem that makes BI systems ineffective is the mistaken perception of data harbored by most IT professionals. In truth, data is not merely a fungible commodity that can be moved around and aggregated and summarized to fill up data warehouses the way that crude oil can fill up oil tanks or steel barrels. To be useful for any purpose, data must be an accurate picture of reality.
To be an accurate picture of reality, data must accurately represent individual, non-fungible assets. Barrels of oil might be fungible (interchangeable) – one barrel of oil might be just as good as another, but one customer is not just as good as another. Customers are not interchangeable. Neither are buildings, factories, documents, contracts, decisions, or any number of vital business assets. If the data representing those non-fungible assets is treated merely as a fungible commodity, the data will not accurately represent reality, and the enterprise’s data will be expensive but unreliable for business decisions.
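To make the point concrete, here is a minimal Python sketch of the "customer XYZ and customer 47" question raised earlier. Everything in it is hypothetical and invented for illustration: the sample rows, the system names `crm` and `billing`, the cross-reference table, and the master ID `CUST-001`. It is not taken from any particular BI or ETL product. It simply contrasts summing sales by whatever key each source system happens to use with summing by a reconciled customer identity, while setting aside rows whose identity is unknown.

```python
from collections import defaultdict

# Hypothetical sales rows from two operational systems.
# Each system identifies customers with its own key.
sales = [
    {"source": "crm", "customer_key": "XYZ", "amount": 1200.0},
    {"source": "billing", "customer_key": "47", "amount": 800.0},
    {"source": "billing", "customer_key": "52", "amount": 300.0},
]

# Hypothetical cross-reference, assumed to be maintained by people who
# know that crm "XYZ" and billing "47" are the same real-world customer.
crosswalk = {
    ("crm", "XYZ"): "CUST-001",
    ("billing", "47"): "CUST-001",
}


def total_by_raw_key(rows):
    """Treat data as fungible: sum amounts by whatever key each row carries."""
    totals = defaultdict(float)
    for row in rows:
        totals[row["customer_key"]] += row["amount"]
    return dict(totals)


def total_by_customer(rows, xref):
    """Sum amounts by reconciled customer identity.

    Rows with no entry in the cross-reference are returned separately
    instead of being silently folded into a total.
    """
    totals = defaultdict(float)
    unresolved = []
    for row in rows:
        master_id = xref.get((row["source"], row["customer_key"]))
        if master_id is None:
            unresolved.append(row)
        else:
            totals[master_id] += row["amount"]
    return dict(totals), unresolved


raw_totals = total_by_raw_key(sales)
customer_totals, unresolved_rows = total_by_customer(sales, crosswalk)
print(raw_totals)
print(customer_totals)
print(unresolved_rows)
```

In the first summary, one customer appears as two smaller customers. In the second, that customer's sales are combined, and the row that cannot be tied to a known customer is surfaced as an open question rather than hidden inside a number that merely looks precise.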

