In this article, we will explore another tool that MSAS offers for the enhancement of cube processing, the Optimize Schema option. This option can assist us in significantly reducing a cube’s processing time. Optimize Schema attempts to identify unnecessary joins between our fact and dimension tables, and then to remove them. In many cases, the tool works effectively to accomplish this. Elimination of the joins means more rapid resolution of MSAS’ queries to the relational database upon which our cube is dependent as a data source, which means that data is pulled into Analysis Services in less time, contributing to a more rapid cube build overall. The operation of the Optimize Schema option centers on taking advantage of a common concept within the construction of many star or snowflake schemas: the foreign key that serves as the basis of a join between the fact table and a given dimension table is identical to the member key itself. When this is the case, MSAS can eliminate the join, and source the member key directly from the fact table, instead of relying upon a join to the dimension table to obtain the key.
In this article, we will examine the use of Optimize Schema in making our cubes process faster. To accomplish this objective, we will undertake the following steps in this article:
Create a copy of the Warehouse sample cube for use in our practice exercise;
Prepare the cube further by processing;
Discuss the Member Key Column property of our cube, and examine existing settings;
Discuss possible considerations in determining the appropriateness of the use of the Optimize Schema option in our respective business environments;
Perform a practice exercise whereby we institute the Optimize Schema option;
Examine some of the immediate effects of using Optimize Schema;
Discuss the impact of using Optimize Schema within an MSAS cube.
Enhance Cube Processing Performance with Optimize Schema
Considerations and Comments
As many of us are acutely aware, cube processing performance is often a high visibility subject. There are many means and considerations in optimizing build times, but few are as straightforward as the use of Optimize Schema, which quickly and “automatically” makes modifications that can provide palpable improvements, in many cases. Manual intervention can become necessary, in some instances, to achieve the intended objectives of the option for all dimensions, but the concepts behind the steps are identical. The way this all works is, simply, that the Member Key Column property, aligned with the lowest level of a shared dimension in the cube, is pointed to a column in the fact table instead of a column in a dimension table. The result is that the need for joins is eliminated in the SQL query that MSAS executes against the relational data source to populate the cube. As we might expect, this happy circumstance imparts multiple benefits, including reductions in query complexity, reductions in the amount of data accessed in the relational data source, and reductions in network traffic between the Analysis Server and the relational data source.
Several conditions must exist for schema optimization (via the Optimize Schema or manual approach) to be effective. For any dimension under consideration, these conditions are as follows:
The dimension must be a shared dimension;
The dimension must have been processed at least once prior to selecting the Optimize Schema option;
The lowest dimension level must be physically represented in the cube (it can be hidden, but not disabled);
The Member Key Column property for the lowest dimension level must contain the key that relates the fact table to the dimension table;
The Member Key Column property for the lowest dimension level must be the only key that is needed to relate the fact and dimension tables;
The Member Key Column property for the lowest dimension level must be unique.
When manually optimizing a cube schema, it is important to remember that no warning or error dialogs will appear if the incorrect column is selected in the respective Member Key Column property. If incorrect choices are made, incorrect values may be returned, or cube processing may fail altogether. If the optimized dimension is used as a slicer within a cube partitioning strategy, processing failures are likely, as well. Within the “manual” scenario, consideration needs to be given to examining several variables before attempting to perform schema optimization. Consult the Microsoft Developer Network (MSDN), the Books Online, and other reliable documentation before attempting this on any but development systems – and with a backup in place before beginning in every case.
Other considerations can arise in altering the schema of a cube through the Optimize Schema option, as well. These include scenarios where our cube design is dependent upon joins between the dimension and fact tables, as a mechanism for forcing the exclusion of rows of the fact table from the cube that is to be generated. As is readily apparent to most of us, removal of all joins through Optimize Schema, or any other process will mean a corresponding removal of the filtering mechanism, and the resulting selection of all rows in the fact table.
Finally, it is important to keep in mind that the Optimize Schema process, in its elimination of existing joins, can alter our selection of available tables in subsequent attempts to specify drill through options. This, like all the other considerations, needs to be evaluated as a part of planning to use the Optimize Schema option, or to pursue other steps to replicate its action, to enhance cube processing performance.
To confirm our understanding of the steps we will undertake, we will perform “before and after” examinations of our cube, particularly focusing on the settings of the Member Key Column property at the lowest level that we have discussed. The changes that occur to this property as a result of our Optimize Schema efforts will make the process more transparent. For purposes of this exercise, we will create a copy of the targeted cube, as we have in various articles of this and other series. We will then process the clone cube to “register” it with Analysis Services, before beginning our examination of Optimize Schema within our practice exercise.