Merge join with Non Clustered Indexes
Merge will be very efficient when one or both tables have a non clustered index that sorts the join column. This is twice as true if the index covers the query, as in Query 6 .
*Exercise: In the demo code I’ve added two more examples (Queries 6b and 6c) where a small change to the query changes the physical operator chosen by the SQL Server optimizer. Try to play around with the parameters and see how the optimizer changes its decisions.
ORDER BY clause may coax a Merge join
As I mentioned earlier, in many cases you will see that the optimizer decides to sort one or both inputs just to use the merge operator. This will usually happen when the inputs are not very large and the alternatives are worse. This decision becomes even easier if the optimizer sees that performing a sort provides an additional benefit, allowing the optimizer to ‘kill two birds with one stone’ (see disclaimer below) For example, the pre sorting may help facilitate the highly efficient stream aggregate for GROUP BY or DISTINCT clauses , UNION operators, analytical rankings or for delivering the result set in the order of the ORDER BY clause, as is the case in Query 7 .
DISCLAIMER: No birds (nor any other animals) were harmed for the making of this script. As a matter of fact, three of my cats actually assisted the writing of this article. One is purring in my lap as I’m writing these lines…
In Part III we will conclude this series with a look at the Hash operator.
If you are interested to dive deeper and learn more on joins and their implementation in SQL Server, I highly recommend Craig Freedman’s SQL Server Blog on MSDN. Craig has published a series of excellent, in depth articles regarding many more aspects and types of joins.