Understanding SQL Server Query Optimization – Part 1

In the first of this four part series I will give an overview of the tools and techniques of query optimization in SQL Server

Most of the
activity of the database server is related to queries. Because of this, query
optimization is often a major concern. SQL Server tries to optimize query
performance itself, but its effectiveness in this may be influenced by several
problems. Indexes and poorly designed queries can lead to unacceptable
performance, no matter how well you design your database. Key issues related to
query optimization include designing indexes, statistics, and index
fragmentation. Statistics describe index key values, are maintained for index
columns, and are used by SQL Server to decide on the most appropriate index to
use when running queries.

SQL Server
Query optimizer

The query
optimizer of SQL Server is a component that is responsible for optimizing the
execution plan of the query. Key concerns in selecting an optimal execution
plan are I/O processes needed and CPU resources required for query processing,
and execution time required for query to return the results to the user. The
result of the analysis performed by the query optimizer is an execution plan
that acts as a step-by-step guide to process the query.


During the
optimization process, the query optimizer considers several execution plans and
selects the optimal plan of this set. The plan it selects is often neither the
fastest nor the less intense, but it provides good performance while reducing
resource requirements.

SQL Server
stores the execution plan in the procedure cache. If
appropriate, SQL Server caches both a serial execution plan and parallel
execution plan. This means that the query optimizer can select an existing
execution plan in the cache, if appropriate, instead of generating a new
execution plan if the query is executed again. If the underlying tables,
indexes or statistics change between each execution, the execution plan is
recompiled before being reused. The steps of the query processing process are:

1. Parsing
– Checks query for syntax errors. If it passes the syntax check, the syntax is
broken into component parts that are recognizable to the database engine.

2. Standardization
– Transforms query format, preparing it for optimization. Redundant clauses are
removed as part of this process.

3. Query
– Query optimizer selects the most appropriate
execution plan.

4. Compilation
– After selecting an execution plan, SQL Server compiles the query into
executable code based on that plan.


To determine
the best execution plan, the query optimizer takes into account several
factors, including:

– The query optimizer first determines if there are any indexes on the source tables
or views. If not, the database engine retrieves data by scanning the entire
table. The database engine begins scanning at the physical beginning of the
table, and scans through the full table page-by-page and row-by-row to look for
qualifying data. Then extracts and returns rows meeting the selection criteria.
The process is improved when indexes are present. However, If there is an
appropriate index available, SQL Server uses it to locate the data. If source
tables have multiple indexes then query optimiser determines which indexes or
columns might minimize I/O activity by reducing the number of rows that must be
examined while processing the query.


– If statistic updates are enabled, the query optimizer creates or updates
column statistics as necessary.


– The query optimizer chooses the best join strategy (how to process a
union/join), and the order of conducting join operations.

The query
optimizer considers the data structures and data characteristics. This includes
issues such as the size of the database file and its placement and index
density. Index density refers to the percentage of duplicate rows in an index.
Density is not necessarily constant along index, and because of this, the query
optimizer may decide to use a table scan when retrieving some values, but use
an index scan for other records. If query search conditions include a dense
(duplicate) column value, it uses a table scan.


The query
governor sets the query cost limit, which prevents long-running queries consume
an excessive amount of query resources. You can set the cost limit the query to
a value in seconds. SQL Server determines the limits of cost based on the value
entered. Set the query governor to 0 disables the query governor. You can set
the query governor through:




You can set
the limit for all connections using the sp_configure system-stored
procedure and server properties. You can set the limit for the current
connection only using the following:


the query optimizer

You can
fine-tune query performance in this situation by specifying optimizer hints in
your query. You should avoid using optimizer hints unless you believe that the
query optimizer is to choose less efficient execution plans and have no other
solution available. You can enter optimizer hints as:


– Specify indexes used, force a table scan, and set locking options.


– Mutually exclusive choices that set the join type.


– Specified as part of the OPTION clause and affecting all query operators.

If using
optimizer hints, verify that you are, in fact, getting an improved performance
by testing the query. Do not be surprised if performance does not improve.
Often it is found that the query optimizer selects appropriate execution plans
and that the problem is more in how the query is written and the resources
(such as indexes) that are available to the query optimizer. If you use hints,
you should check query performance periodically because the optimizer hints
prevent the query optimizer dynamically adjust to changing conditions.

Key points
to remember for indexes and queries

You support
queries and query optimization by creating indexes to reduce input/output (I/O)
requirements. Key points include:

non-clustered indexes to support queries. This gives the query optimizer the information
it needs without having to access data pages.

included columns for columns that cannot be specified as key columns. This
provides a more direct access to the data in included columns.

must include all of the columns on which the query depends, as either key or
non-key columns. This includes columns in the result set as well as those used
for sorting, aggregation, and limiting the result set. It’s often appropriate
to have all columns in some tables indexed through composite indexes.

aggregated data values ​​for queries
over indexed views. This provides better performance than the aggregate
calculation when processing the query.

You should
monitor the relationship between row sizes and index key sizes. If
the index key size approaches the size of the row, the index offers little or
no performance benefit. A clustered index, which includes a wide selection of
columns in the table, could be an appropriate choice.


No comments yet... Be the first to leave a reply!