Rating Your Dimensional Data Warehouse

The article titled Rating Your Dimensional Data Warehouse is a concise piece of scholarship that outlines the key metrics for evaluating a Data Warehouse. The Data Warehouse, as a data storage concept and decision-making aid, has evolved considerably over the past two decades. Yet no robust set of metrics had been developed to evaluate how well a Data Warehouse supports the dimensional approach. Author Ralph Kimball sets out to fill this gap by proposing a list of 20 criteria for what makes a system ‘dimensional’. Each criterion is assigned a value of ‘0’ (bad) or ‘1’ (good), and the values are added up to arrive at the final rating. A total score of 0 indicates a system that does not support a dimensional approach at all, while a score of 20 indicates a system that supports it fully.
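The scoring scheme amounts to summing binary values. A minimal sketch in Python, assuming made-up scores for four of the criteria named in this review (the function name and the scores are illustrative and do not come from Kimball's article):

```python
# Hypothetical scores: the criterion names appear in the review,
# but the 0/1 values are invented for illustration.
scores = {
    "Explicit Declaration": 1,
    "Conformed Dimensions and Facts": 1,
    "Dimensional Integrity": 0,
    "Open Aggregate Navigation": 1,
}


def dimensional_rating(scores):
    """Sum the binary scores, rejecting any value other than 0 or 1."""
    for name, value in scores.items():
        if value not in (0, 1):
            raise ValueError(f"{name}: score must be 0 or 1, got {value!r}")
    return sum(scores.values())


rating = dimensional_rating(scores)
print(f"{rating} of {len(scores)} criteria met")
```

With all 20 criteria scored, the same function would return a rating between 0 and 20.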

The author lays out 12 of the 20 criteria in the article. The criteria that pertain to the Architecture of the Data Warehouse include Explicit Declaration, Conformed Dimensions and Facts, Dimensional Integrity, Open Aggregate Navigation, Dimensional Symmetry, Dimensional Scalability, and Sparsity Tolerance. Kimball explains Open Aggregate Navigation as follows:

“The system uses physically stored aggregates as a way to enhance performance of common queries. These aggregates, like indexes, are chosen silently by the database if they are physically present. End users and application developers do not need to know what aggregates are available at any point in time, and applications are not required to explicitly code the name of an aggregate. All query processes accessing the data, even those from different application vendors, realize the full benefit of aggregate navigation.” (Kimball, 2000)
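To make the idea concrete, here is a rough sketch of how an aggregate navigator might pick a table without the application naming it. The catalog, table names, row counts, and the `choose_table` function are all hypothetical; a real database would do this inside its query optimizer rather than in application code.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Table:
    name: str
    grain: frozenset  # dimension attributes the table can group or filter by
    rows: int         # approximate row count, used as a cost estimate


# Hypothetical catalog: one base fact table and two stored aggregates.
CATALOG = [
    Table("sales_fact",
          frozenset({"day", "month", "product", "category", "store"}),
          10_000_000),
    Table("sales_by_month_category",
          frozenset({"month", "category"}),
          2_400),
    Table("sales_by_month_product",
          frozenset({"month", "product", "category"}),
          120_000),
]


def choose_table(needed, catalog=CATALOG):
    """Return the smallest table whose grain covers every needed attribute."""
    candidates = [t for t in catalog if set(needed) <= t.grain]
    if not candidates:
        raise LookupError(f"no table can answer a query on {sorted(needed)}")
    return min(candidates, key=lambda t: t.rows)


# The caller states only which attributes the query uses.
print(choose_table({"month", "category"}).name)
print(choose_table({"day", "store"}).name)
```

Because the caller never names an aggregate, adding or dropping one changes only the catalog, which is the property the quoted criterion asks for.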

Similarly, the criteria that fall under the Administration category include Graceful Modification, Dimensional Replication, Dimension Notification, Surrogate Key Administration, and International Consistency. Kimball explains Dimensional Replication as follows:

“The system supports the explicit replication of a conformed dimension outward from a dimension authority to all the client data marts, in such a way that we can only perform drill-across queries on data marts if they have consistent versions of the dimensions. Aggregates that are affected by changes to the content of a dimension are automatically taken offline in each client data mart until we can make them consistent with the revised dimension and the base fact table.” (Kimball, 2000)
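The consistency condition in this passage can be illustrated with a small check. This is only a sketch under assumed names: the data mart records, the version numbers, and `can_drill_across` are invented for illustration and are not part of Kimball's article.

```python
def can_drill_across(marts, dimension):
    """True only if every mart holds the same, known version of the dimension."""
    versions = {mart["dimension_versions"].get(dimension) for mart in marts}
    return None not in versions and len(versions) == 1


# Hypothetical data marts, each recording the version of every
# conformed dimension it last received from the dimension authority.
sales_mart = {"name": "sales", "dimension_versions": {"customer": 7, "product": 3}}
billing_mart = {"name": "billing", "dimension_versions": {"customer": 7, "product": 2}}

# Both marts hold the same version of the customer dimension.
print(can_drill_across([sales_mart, billing_mart], "customer"))
# The marts hold different versions of the product dimension.
print(can_drill_across([sales_mart, billing_mart], "product"))
```

In this sketch, a drill-across query on the product dimension would be refused until the billing mart receives the current version.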

One of the impressive aspects of the article is its compact presentation. Kimball succeeds in giving concise yet clear definitions of each of the 12 criteria he has addressed. Since this was the first of a two-part series, the remaining eight criteria are presumably covered in the concluding article. Moreover, although Data Warehouse metrics is a technical subject, the author manages to give the gist of the underlying mathematics and statistics without employing esoteric language or complex formulae. Toward the end of the article Kimball notes that the criteria were deliberately set to high standards and that it would be unreasonable to expect an enterprise to meet all of them. In other words, the 12 criteria presented here, and the full set of 20, serve as an ideal template against which enterprise Data Warehouse systems can be rated for how dimensional they are. The article is an overall success, in that it accomplishes its limited and clearly stated objectives with scholarly excellence and professional presentation.

Reference:

Ralph Kimball, Rating Your Dimensional Data Warehouse, Intelligent Enterprise, April 28, 2000, Volume 3 – Number 7.