Understanding SQL Server Query Optimization – Part 2

In the first of this four part series I gave an overview of the tools and techniques of query optimization in SQL Server. In this second part, I will talk about the process of displaying and interpreting query execution plans. I will also talk about how to generate query execution plans using SQL Server Management Studio, SHOWPLAN_TEXT and SHOWPLAN_ALL.

An execution plan is the sequence of operations SQL Server query optimizer performs to run the statements. The SQL Server query optimizer creates an execution plan before it actually runs a query. As per Microsoft Book Online, “Execution plans graphically display the data retrieval methods chosen by the SQL Server query optimizer. Execution plans represent the execution costs for specific statements and queries in SQL Server using icons rather than the tabular representation produced by the SET SHOWPLAN_ALL or SET SHOWPLAN_TEXT statements”. This graphical approach is very useful that for understanding the performance characteristics of a query, and help you to determine whether the query is written as efficiently as possible. You can use the execution plan information to quickly troubleshoot poorly performing queries.

You can use SQL Server Management Studio to view the execution plan by clicking the Display Estimated Execution Plan or the Include Actual Execution Plan button on the query editor toolbar. They show the individual execution steps for a query. The following shows the icons for estimated and actual execution plan buttons:

$Description: C:\Users\PMNSER~1\AppData\Local\Temp\SNAGHTML1f09a16.PNG$

&nbsp

Estimated vs. Actual Execution Plan

Estimate execution plan shows the expected steps and required resources. You can view the estimated execution plan without running the query. This can be done by clicking the Display Estimated Execution Plan icon button on the query editor toolbar.

Actual execution plan shows the actual steps and required resources used during query execution. You can only display the actual execution plan after executing the query.

Differences between the estimated and actual execution plans are usually caused by out-of-date statistics or extreme fragmentation. One use of the execution plan is to verify whether or not the query optimizer is choosing the indexes you expect. To read and evaluate the contents of an execution plan, you must have an understanding of the graphic icons used in the execution plan. SQL Server has over 50 different execution plan icons. A detailed discussion of all execution plan icons is beyond the scope of this article, but a few icons deserve special mention. The following table identifies icons for operations relating to table and index scans:

Image	Operator	Description
	Table Scan	Table scanned to retrieve all rows; it can be an expensive operation if the table has a large number of rows.
	Clustered Index Seek	Index seek used to retrieve qualifying rows. The Index seek only touches the records that are necessary to satisfy the query. The index seek is usually the most efficient way of accessing a small number of rows.
	Clustered Index Scan	They are same as the table scan, occur when the database engine determines there is a missing or improper indexes, with cardinality < 5%. When this happens, database engine scans through all rows from the specified table.
	RID Lookup	RID Lookup is a bookmark lookup on a heap using a supplied row identifier (RID) to return the corresponding rows. Usually an indicator that indicates that cluster index is missing on underlying table.
	Key Lookup	Key Lookup is also a bookmark lookup on a table with a clustered index. Occurs when data is not at the leaf level and the database engine has to use the index key to retrieve the corresponding row.
	Index Seek	Seek used to retrieve rows from a non-clustered index. The index seek is usually the most efficient way of accessing a small number of rows.
	Index Spool	Index scanned and input rows placed in hidden spool file, where they are indexed and rows returned through seek operations.
	Compute Scalar	When SQL Server calculates the aggregate, it adds a Compute Scalar step and a Stream Aggregate step to the execution plan.
	Hash Match	A flexible operation. It builds a hash table and groups the results using this hash table. In addition to processing joins, it is used with unions, differences, finding distinct values, and grouping. A hash join requires at least one equality in the join. The join table is initially built in memory. If it can’t fit in memory, it is processed as a set of files on the hard disk.
	Merge Join	Requires that both inputs into the join are sorted on the join columns. The join must be defined by an equality operator (=). Merge join operations are efficient when the table rows are already sorted in join order but become less efficient when the data must be sorted before it can be joined.
	Nested Loop	Also called nested interactions. They designate the join tables as inner and outer tables. Each row in the table is processed sequentially. The rows in the inner table are searched for matching rows for each row in the outer table. Nested loops are efficient when the outer table is relatively small.

Important: The SQL Server query optimizer automatically selects a join operation. You can manually select the join operation through query hints. If you frequently run queries that use the join, it is usually more efficient to create an indexed view based on the join.

Refer to the article “Graphical Execution Plan Icons (SQL Server Management Studio)” in SQL Server Books Online for detailed descriptions of all execution plan icons.

You can view detailed information about a scan or seek operation by placing the pointer over its icon. Information typically returned includes:

• Row count – Number of rows returned by the operator.

• Estimated row size – Size of the row that is returned by the operator.

• I/O cost – The estimated I/O activity for the operation.

• CPU cost – The estimated CPU activity for the operation.

• Number of executes – Number of times the operator is executed during query processing.

• Cost – Operation cost, reported to the query optimizer and used in selecting the best execution plan.

• Subtree cost – Cost of this operator plus the costs of all operators preceding it in the same subtree.

• Argument – Includes any parameters used by the query.

One of your goals is minimum I/O cost values. I/O operations are expensive in terms of time and resources.

SHOWPLAN_TEXT and SHOWPLAN_ALL

You can also view information about an execution plan through SHOWPLAN_TEXT and SHOWPLAN_ALL. SHOWPLAN_TEXT and SHOWPLAN_ALL are connection specified settings configured through the SET statement. After it is enabled (set ON), the statement will continue to be active until disabled (set OFF) or until the end of the current session.

The syntax for SHOWPLAN_TEXT is:

SET SHOWPLAN_TEXT ON|OFF

The syntax for SHOWPLAN_ALL is:

SET SHOWPLAN_ALL ON|OFF

The information generated is session specific. SHOWPLAN_ALL returns more detailed information than SHOWPLAN_TEXT. For example, you might execute:

SELECT name, customerid FROM Sales.BusinessCustomer

This would return a relatively simple execution plan, as follows:

StmtText
----------------------------------------------------
|--Clustered Index
Scan(OBJECT:([OUTLANDER].[Sales]
---------------------------------------------
[BusinessCustomer].[PK_BusinessCustomer]))
(1 row(s) affected)

The same statement with SHOWPLAN_ALL turned on returns more information than can easily be represented in print. Executing a query with SHOWPLAN_ALL option turned on returns a great deal more information than SHOWPLAN_TEXT.

]]>

Estimated vs. Actual Execution Plan

Leave a comment Cancel reply