Optimizing Microsoft SQL Server Analysis Services: MDX Optimization Techniques: Optimizing NON EMPTY

Overview

In this article, we will explore the use of a calculated member property to optimize query performance. Almost anyone who has worked with MSAS for any length of time has become aware of the NON EMPTY keyword in MDX. Empty cells are the reason that most of us are thankful for the NON EMPTY option. Empty cells are a direct result of no data existing at the intersection of two or more dimensions, and our MDX statements display them alongside the populated cells in the datasets rendered.

Empty cells have an impact on the evaluation of search conditions and value expressions but, perhaps more significantly from an information consumer's perspective, they can adversely affect the usability and compactness of the data returned from a query. If the data in a cube is sparse, for example, a query might deliver a dataset with a significant number of empty cells.

A simple illustration of the effect of empty cells on the displayed data might be seen by running the following query against the Warehouse cube of the FoodMart 2000 sample database that ships with MSAS.

SELECT

{[Product].[Pearl Imported Beer]} ON COLUMNS,

{[Store].[Store Name].Members} ON ROWS

FROM

[Warehouse]

WHERE

([Measures].[Warehouse Sales])

A view of the dataset returned by the query, as shown in Figure 1, reveals the presence of many empty cells.

Figure 1: The Presence of Empty Cells in a Dataset

The empties occur because there is no data at most of the Product / Store intersections for Pearl Imported Beer: no transactions have taken place between the warehouses and many of the stores for this particular product within the time periods captured in the Warehouse cube. In cases like this, an easy method of “suppressing” the empties presents itself in the form of the popular NON EMPTY keyword. In short, NON EMPTY is used on a given axis to remove empty tuples prior to delivery of the result dataset.

Applying it to our previous example, we use the NON EMPTY keyword in the following fashion:

SELECT

   {[Product].[Pearl Imported Beer]} ON COLUMNS,

   NON EMPTY {[Store].[Store Name].Members} ON ROWS

FROM

   [Warehouse]

WHERE

   ([Measures].[Warehouse Sales])

All the Stores on the second axis (“rows”) that have no values for Pearl Imported Beer are screened out of the result dataset before it is delivered to the client, as depicted in Figure 2.

Figure 2: NON EMPTY Screens Out the Empties

Strictly speaking, the NON EMPTY keyword in MDX screens out empty tuples on an axis, not individual empty cells. As many of us have found (and as I see in a recurring question / complaint, both in web forums and in e-mails I receive from readers), empty cells can still appear in a dataset when the NON EMPTY keyword is employed in the query that generates it. This condition is best remedied by knowledgeable use of the keyword in writing the query, and will perhaps be the subject of a future article. Our focus in this article is another facet of the use of the NON EMPTY keyword: its effect upon query performance.
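As a brief illustration of the tuple-versus-cell distinction noted above, consider a variation of our earlier query that places two products on the columns axis. This is only a sketch: the second member, [Product].[Pearl Light Beer], is an assumed name used for illustration and may need to be replaced with a product that exists in your copy of the sample cube.

```mdx
-- Sketch only: [Product].[Pearl Light Beer] is an assumed member name,
-- chosen for illustration; substitute any second product in the cube.
SELECT
   { [Product].[Pearl Imported Beer],
     [Product].[Pearl Light Beer] } ON COLUMNS,
   -- NON EMPTY removes a Store row only when EVERY cell in that row is empty
   NON EMPTY { [Store].[Store Name].Members } ON ROWS
FROM
   [Warehouse]
WHERE
   ( [Measures].[Warehouse Sales] )
```

Because NON EMPTY evaluates each row tuple against all of the columns, a Store is retained if it has a value for either product. A retained row can therefore still display an empty cell for the other product, which is the behavior behind the recurring question mentioned above.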
