Breaking Down Complex Execution Plans

You can see that the first query took more than eight times longer to run on my machine and resulted in two scans and 10963 reads against a work table as opposed to 1 scan and 17 reads from the Tally table. Clearly, the more costly statement was the first, even though the “Query Cost…” tried to tell us it was virtually free compared to the second statement. The precise details for exactly why this occurred are length, but a quick summary is worth laying out. The optimizer provides estimated costs based on statistics and the query in hand. Since the first query was working from a CTE recursively, the optimizer, not knowing precisely how much data was in the CTE chose a value of 1, doubled because it knew that the CTE was being called more than once. So it used 2 rows to estimate the rest of the query. In fact there were 1875 rows returned. The other query, working from a table, with statistics, made a slightly better estimate of 100 rows. That difference between 100 rows & 2 rows resulted in huge differences in all the estimated costs of each operation in the two execution plans. This lead to the disparity observed between the estimated “Query Cost (relative to batch)” and the actual I/O and time it took to execute the query.

“Query cost (relative to the batch):” is a quick and easy way to determine which query costs more. It will get you started. If you are dealing with table variables or CTE’s or your statistics are out of date, it might lead you down the wrong path. To be sure of exactly how much a given statement costs, you need to look at the I/O and execution time.

I/O and Time
In Management Studio in the Query window, you can have the actual I/O and Time that a given statement cost be displayed in the messages. It’s very easy to do. First, open the Query Options window by either right clicking in the query and selecting “Query Options” or by clicking on the “Query” menu item and selecting “Query Options.” This opens the Query Options window. Any changes made in this window affect the query you are opening it from only. They won’t affect any other query windows behavior. To get the I/O or the Time, click on the “Advanced” choice on the left in the tree view. This screen will now be open:

Figure 5

Click in the two check boxes, “SET STATISTICS TIME” and “SET STATISTICS IO,” so that they are “checked.” Click OK.

Now when you run the query, you’ll see precisely the number Scans, the logical reads, physical reads and the rest. These figures will show you precisely that a given process used more scans or had more reads so that you’ll know, for sure, that process cost more I/O and is therefore worthy of your time. The results will also show CPU and elapsed time. Again, letting you know that this statement within the query took longer than those around it. Once again this gives you a precise measurement for which statements to begin to work with first. Here’s the output:

Figure 6

The only problem is, except for comparing the tables accessed by a given statement, there’s no immediate way to determine which statement is connected with the I/O or time measured. There are a couple of tricks that you can use to increase your chances of using these statements in large and complex execution plans. You can put PRINT statements into the middle of the query so that they serve as markers between various statements. While a useful trick, it would be tedious to maintain over time. Another trick is to use the TSQL commands to turn I/O or time capture on with individual statements to isolate them from their fellows and only return their data:

SET STATISTICS IO ON
SET STATISTICS TIME ON

SELECT…

SET STATISTICS IO OFF
SET STATISTICS TIME OFF

Again, this isn’t something you’d want to maintain within your code, but it is a useful way of gathering more isolated data. But to really find exactly which statement cost exactly how much in terms of I/O and time, SQL Server Profiler is the best tool.

Continues…

Leave a comment

Your email address will not be published.