Physical Join Operators in SQL Server – Nested Loops of the ‘Estimation Error Devil’

One of the most common pitfalls of the query optimizer is
under-estimating the number of required iterations for a nested loops operator
due to (partial list):

  • Statistical deviations
  • Outdated statistical data
  • ‘Hard to estimate’ predicates
    such as compound expressions, sub queries, functions or type conversions
    on filter columns
  • Multiple predicates estimation

When analyzing a poorly
performing query, one of the first things I do is to look at the ‘Estimated
number of rows’ vs. the ‘Actual number of rows’ figures of the outer input of
the nested loop operator. The outer input is the top one in the graphical query
execution plan. See for example Execution Plan 4 . I’ve
seen many cases where the optimizer estimated that only a few dozen iterations
will be required, making nested loops operator a very good choice for the join,
but in fact tens (or even hundreds) of thousands of rows satisfied the filters,
causing the query to perform very badly.

For Query 4 below, the optimizer
estimated that ~1.3 products will satisfy the query filters when in fact, 44
products did. This seemingly small error led the optimizer to believe that only
616 rows will be matched from the Order Detail table and that estimation
made it choose the same plan as the previous example, using an index seek and additional
lookups. The optimizer estimated that it will require only 616 key lookups when
in fact, 20,574 were required. This might seem at first like a flaw in the
optimizer, but only to the degree that mind reading is hard to program. You’ll
see why in the answer to Challenge
#1 at the end of this section.

*Challenge #1: Can you guess what misled the
optimizer to make the estimation error? See the answer at the end of this section.

Text Box: SELECT	SOD.SalesOrderID,
            FROM		Production.Product AS P
            INNER JOIN
            Sales.SalesOrderDetail AS SOD
            ON P.ProductID = SOD.ProductID
            WHERE		P.DaysToManufacture > 2
            ListPrice > 1350
            StandardCost > 750


Query 4


Execution Plan 4

*Exercise: The demo code contains Query 4 as shown here, plus hints that force the optimizer to use
nested loops, hash match or merge joins respectively (Queries 4b, 4c and 4d).
Use the demo code to execute these query variations.
Trace the executions with profiler to benchmark these alternatives and see if
the optimizer was correct to prefer nested loops for this query. Remember that logical reads might be misleading so pay close attention to both CPU
and duration when evaluating the queries’ true efficiency.

Wicked a-sequential page cache flushes

One more issue we need to consider with nested loops is the
issue of sequential vs. a-sequential page access patterns. You might recall
from my previous
that nested
loops are characterized with a high number of logical reads, as the same data
page might be accessed multiple times. For small row sets, this is usually not
an issue as the pages are cached once and subsequent reads are performed in
memory. But, if the system experiences memory pressure and in the common case
where the outer loop consists of a large number of rows where the distribution
of the data (in respect to the order of the join columns retrieved for the
outer loop) within the pages is more or less random, a real performance
nightmare might occur when a page is repeatedly flushed from the buffer
cache only to be physically retrieved a few seconds later for retrieval of
another row, for the same join! If you were the query optimizer – go with the
flow for a moment – how would you avoid such a performance ‘nightmare’?

Feel free to discuss your ideas for potential solutions in
the comments section.

*Challenge #1 answer: The filter consists of three
predicates. If you inspected each predicate by itself (armed with the
statistical distribution histograms of the values in each column), you would
see that the predicates are highly selective, meaning that they filter out
a large percentage of the rows as the values used are very close to the
maximum values for each of the columns. Using AND logical operators for
these three highly selective predicates leads the optimizer to estimate
that only a few rows will satisfy all three predicates. This is correct
from a statistical perspective under the assumption that there is no
correlation between the predicates. But… unbeknownst to the optimizer, the
values in these three predicates are actually closely related. The query
filters for the products that take the longest to manufacture, making them
the products with the highest cost, and naturally the ones with the highest
list price. So it’s more or less the same group of products that satisfy
each predicate. These and other conditions which make selectivity extremely
hard to estimate are much more common in production systems than most
people would think.

want to use this opportunity to congratulate the Microsoft SQL Server
optimizer team for producing an unbelievably intelligent optimization
engine which is (IMHO) by far the best of its kind on the market, and it
just keeps getting better with every new version. It is getting harder and
harder for me to come up with such examples where I manage to make it err.

If you are interested to dive deeper and learn more
on joins and their implementation in SQL Server, I highly recommend Craig Freedman’s SQL Server
Blog on MSDN. Craig has published a series of
excellent, in depth articles
regarding many more aspects and types of

Part II of this series continues with a look at the Merge Operator


Pages: 1 2 3


No comments yet... Be the first to leave a reply!

Software Reviews | Book Reviews | FAQs | Tips | Articles | Performance Tuning | Audit | BI | Clustering | Developer | Reporting | DBA | ASP.NET Ado | Views tips | | Developer FAQs | Replication Tips | OS Tips | Misc Tips | Index Tuning Tips | Hints Tips | High Availability Tips | Hardware Tips | ETL Tips | Components Tips | Configuration Tips | App Dev Tips | OLAP Tips | Admin Tips | Software Reviews | Error | Clustering FAQs | Performance Tuning FAQs | DBA FAQs |