Physical Join Operators in SQL Server – Nested Loops

Nested loop joins with table scans

If we just add the Order Quantity column to the select list as in Query 2 , the index on Product ID (which does not contain the Order Quantity column) is now not enough to ‘cover’ the query, meaning that it does not include all the data required to satisfy the query. The optimizer can either perform a ‘lookup’ (Use the pointer from the index to fetch the full row from the table itself) for each Order Detail row to get it (nearly 5,000 lookups…) or alternatively it can simply scan the Order Detail table, retrieve both Product ID and Order Quantity. In this case, since there is just one Product row to join hence the table only needs to be scanned once, and since the table is not very large (which would make the scan expensive), the single scan option is probably better than the alternative of performing thousands of lookups. Indeed, this is exactly what the optimizer chooses to do in this situation as you can see in Execution Plan 2 .

Text Box: SELECT	SOD.SalesOrderID,
P.Name,
SOD.OrderQty
FROM		Production.Product AS P
INNER JOIN
Sales.SalesOrderDetail AS SOD
ON P.ProductID = SOD.ProductID
WHERE		P.ProductID = 870

Query 2

Execution Plan 2

*Exercise: Try to change the predicate to ‘P.ProductID IN (870,871)’. This would mean that now, to use the same plan as above will require two full scans instead of just one. Do you think that this will still be the most efficient plan? Try it and see what plan the optimizer chooses. You can find the code for this exercise (Code sample #2b) in the demo code.

Nested loop joins with lookups

Now, let’s play around a bit with the parameters. In Query 3, I have changed the predicate to filter for three products instead of one, but I used products that are far less commonly ordered. Think of three colors of ’White-Out’ (‘Tipp-Ex’ for you Europeans) for the computer screen – not the hottest seller… Now, a slightly different use of nested loops emerges. Note that product 870 exists in nearly 5,000 Order Detail rows, but only 13 Order Detail rows contain one or more of Query 3’s three products (897,942,943). The optimizer always consults the column statistical histograms so it is very aware of this fact. To join only product 870, the optimizer chose to perform a single scan instead of an index seek + 5,000 additional lookups. For the three products in Query 3, it could either perform three full scans or revert to using an index seek + 13 additional lookups. What do you think is the right choice?

The answer is graphically portrayed in Execution Plan 3 . Note that the execution plan incorporates two nested loop physical operators, and that both constitute logical inner joins. The one on the right is our actual table join, and the one on the left denotes the full-record lookup from the Order Detail table (required in order to retrieve the Order Quantity) which, in a sense, is also a join between the Order Detail table and the result of the first join.

Text Box: SELECT	SOD.SalesOrderID,
            P.Name,
            SOD.OrderQty
            FROM		Production.Product AS P
            INNER JOIN
            Sales.SalesOrderDetail AS SOD
            ON P.ProductID = SOD.ProductID
            WHERE		P.ProductID IN (897,942,943)

Query 3

Execution Plan 3

Continues…

Leave a comment

Your email address will not be published.