Optimizing Microsoft SQL Server Analysis Services: Optimize Query Performance with a Derived Measure
Because a significant part of my practice consists of being called upon to tune MSAS implementations performed by others, I come across the less-than-optimal use of calculated members quite often. As most of us know, calculated members are dimensions or measures (depending upon the designated parent dimension) that are constructed, using a formula, from other dimensions or measures in our cubes. A typical example of a calculated member that is designed for a measure, to which we will refer in this article as a calculated measure, is a Profit calculated measure that is created by subtracting a cost / expense measure from a sales / revenue measure. Another common calculated measure is a variance measure, which is created by taking a difference between an actual and a budgeted value (or similar kinds of values), among other approaches.
If the calculation / formula that we use in creating the calculated measure consists of simple match between two measures, we can often use a derived measure instead. In this article, we will discuss the advantages and disadvantages involved, and compare and contrast the methods of adding these sorts of measures to our cubes. In examining the use of derived measures to enhance cube response times, we will:
Discuss the drawbacks in using calculated members in cases where a derived measure might be substituted;
Discuss benefits and disadvantages that might accrue through the use of derived measures;
Describe an illustrative scenario, upon which we will determine that a derived measure can offer a tuning solution for a group of hypothetical information consumers;
Implement a simple solution through creation of a derived measure to replace an existing calculated measure;
Explain the results we obtain from the steps we take to accomplish the solution.
Create a Derived Measure
In this article, we will examine the use of a derived measure, as a substitute for, and in contrast with, an existing calculated measure that represents a performance bottleneck. For purposes of our practice procedure, we will assume that we have been contacted by a client to investigate slowdowns in query response time when accessing an MSAS cube. We have been informed that reporting response times, in general, have disappointed information consumers throughout the organization. They specifically indicate that using some of the calculated members in the current cube structures lead to report performance issues, although they have no idea as to the root of the problem.
Discussion of the issues with various consumers has revealed that the client, as is often the case these days, failed to become involved in the implementation while it was underway. As the “go live” commitments of the responsible department loomed, the attention of its manager was suddenly “redirected” by upper management. The manager allowed a very brief turnover from the implementation team to a designated employee, whose original position had been recently off shored. Having assigned the employee the administration of the MSAS cubes, the manager dismissed the implementation team without further ado.
While the new support selectee had some experience in a competitor OLAP application, she had spent only a little “cram” time with MSAS, barely progressing from a popular “Baby Steps” book that was once popular on the market. She was confident that she could “figure it out,” however, and expressed optimistic confidence that managing the MSAS applications would be a “piece of cake.” For that matter, she allowed management to assume that she was an MSAS “guru,” and promised to take over where the implementation team left off, and meet the information consumers’ evolving needs as they arose.
The “Baby Steps” book, while an excellent introduction to MSAS functionality, failed to equip the new MSAS Administrator to deal with the involved troubleshooting, processing and performance tuning considerations of the real world. Understandably, its focus was teaching MSAS quickly, and in a way that would appeal to non-technical people of varied backgrounds. After the manner of most “populist” technical publications, imparting general concepts, versus optimization and best practices, was, unsurprisingly, the objective.
Upon initially examining the cubes in place, we have noticed many things that could be changed, including the use of lengthy description fields versus (surrogate) id’s for member key columns within the cubes, among a host of other items that could be adjusted to optimize both processing and querying performance. One specific item that we have identified, and which is the focus of this article, was the use of various calculated measures when a derived measure might have been constructed. While we will take a single example of this for purposes of our practice exercise, we have actually found several instances where the same procedures might be applied to build and substitute derived measures within the MSAS cubes we have examined at the client location.
The calculated measure upon which we will concentrate in our present article was constructed within the Sales sample cube to generate a store cost value, with a fixed allocation of marketing costs added. The adjusted cost value, constructed using the calculated measure under consideration, applied an added fifteen percent to the store cost for products that were subjected to given marketing promotions. Products that were not marketed under specific promotions were assigned only the original store cost amounts to their adjusted cost values. Again, this calculated measure is only one example of several such constructions that, as we shall see, can be managed in a way that promotes more efficient query processing.
Considerations and Comments
For purposes of this exercise, we will create the calculated measure as constructed by the erstwhile guru, using the Sales sample cube that accompanies the installation of MSAS. We will then construct a derived measure that will generate the same values, but with enhanced query processing time.
Unlike calculated members (including, of course, calculated measures), whose values are created at runtime, based upon the MDX expression(s) they contain, a derived measure, just as any other cube measure, is calculated during the cube processing cycle, and is stored within Analysis Services. In the case of the calculated member, only its definition is stored, and the values it renders are calculated in memory when specified in a query. Because only the definition is stored, cube processing time (“build time”) is not materially increased, and, indeed, depending upon the environment, this might be an appropriate tradeoff. In our present scenario, however, query processing is the more important concern, so we lose the primary benefit behind the choice of a calculated measure to provide the needed values.
Derived measures differ from “ordinary” measures because they take advantage of the flexibility that Analysis Services offers us in modifying the source column property for a given measure. Because they are stored in the cube file, as we have mentioned, they typically mean more efficient query processing. Derived measures, by their nature, are calculated prior to the creation of aggregations. (In contrast, calculated measures are calculated after aggregations are created.) In general, derived measures make sense if they will be called upon frequently, as in reporting scenarios such as that of our hypothetical information consumers. Calculated measures might be a better choice for values that we expect to be used infrequently, again because the tradeoff is tipped more in favor of a lower overhead on the cube processing side, where query response time might not be as high a priority.