Optimizing Microsoft SQL Server Analysis Services: MDX Optimization Techniques: Optimizing the CROSSJOIN() Function

We will begin creating our query with a focus on returning results efficiently. We request the specific requirements for the query from the information consumers. They explain that they have requested to see Warehouse Profit, based upon individual Store (denoted in the cube as Store Name), Product (denoted as Product Name), and Store Type (“Store Type”) combinations, for Supermarket Store Types specifically, for operating year 1997. They want to see the top fifteen combinations only, in descending order, as the final presentation.

We obtain the original query, to which we have referred above, from the MSAS instance on the computer abandoned by the developer upon his lay off. The query appears as follows:

SELECT
{[Measures].[Warehouse Profit]} ON COLUMNS,
TOPCOUNT( {CROSSJOIN([Warehouse].[Warehouse Name].Members,
CROSSJOIN([Store].[Store Name].Members,
[Product].[Product Name].Members))}, 15, ([Store Type].
[All Store Type].[Supermarket], [Measures].[Warehouse Profit]))
ON ROWS
FROM
[SSP014_OPT_CROSSJOIN]
WHERE
([Time].[1997])

The consumers with whom we are interacting tell us that the query does, indeed, give them the results they want, in the appropriate layout. We determine that we will create an identical query in the Sample Application, upon which we will apply enhancements to tune its performance. We will save each step as a separate query to allow us to “fall back,” if necessary, to a previous step, as we incrementally modify the query.

6. Create the following new query (identical, except for comment line, to the original):

–SSP014-1: Original Query (Suboptimal)
SELECT
{[Measures].[Warehouse Profit]} ON COLUMNS,
TOPCOUNT( {CROSSJOIN([Warehouse].[Warehouse Name].Members,
CROSSJOIN([Store].[Store Name].Members,
[Product].[Product Name].Members))}, 15, ([Store Type].
[All Store Type].[Supermarket], [Measures].[Warehouse Profit]))
ON ROWS
FROM
[SSP014_OPT_CROSSJOIN]
WHERE
([Time].[1997])

7. Execute the query using the Run Query button.

After running for a few seconds, the results dataset appears as depicted in Figure 9.

Figure 9: The Results Dataset – Original Approach

8. Save the query as SSP014-1.

Based upon what we know about the CROSSJOIN() function, we can readily see that the query above can be optimized. First, we note that the query as originally written creates every Warehouse Name (24), Store Name (24), and Product Name (1,560) combination. This means, once again, that the query generates 898,560 combinations, many of which (as we mentioned above) are empty, a frequent occurrence in CROSSJOIN() results. Furthermore, in addition to taking the time and resources to generate all combinations, we are next performing our TOPCOUNT() function on all combinations, most of which, again, are empty in the first place.

Let’s take the first step to make the query more efficient. We will substitute the NONEMPTYCROSSJOIN() function for the CROSSJOIN(). Our objective will be to filter out the empties. This alone will give us a significant boost in processing speed.

9. Create the following new query:

-SSP014-2: Introduce NONEMPTYCROSSJOIN() as initial improvement
SELECT
{[Measures].[Warehouse Profit]} ON COLUMNS,
TOPCOUNT( {NONEMPTYCROSSJOIN([Warehouse].[Warehouse Name].Members,
[Store].[Store Name].Members,
[Product].[Product Name].Members)} , 15,
( [Store Type].[All Store Type].[Supermarket],
[Measures].[Warehouse Profit])) ON ROWS
FROM
[SSP014_OPT_CROSSJOIN]
WHERE
([Time].[1997])

In this version of the query, we employ the formidable NONEMPTYCROSSJOIN() function to improve the query speed dramatically. As many of us are aware, NONEMPTYCROSSJOIN() returns the “non-empty” crossjoin of multiple sets. In accepting two or more sets for its input, nesting of the function is not required, unlike CROSSJOIN(), which can handle only two sets per function, and thus does require nesting (as we saw in the original query above). NONEMPTYCROSSJOIN() filters out calculated members in this case, as well, and so it might not be the best approach in a scenario where calculated members are to be returned (we will examine the optimization of CROSSJOIN() under such a scenario in another article).

Continues…

Leave a comment

Your email address will not be published.