SQL Server Join Nastiness Tamed: Finding Joy In The INNER LOOP
This is fine for one-time administrative “fact finding” queries where you don’t have or want the indices needed to optimize the query, and you’re willing to wait the seconds or minutes it takes to get results.
For day-in-and-day-out application queries, however, you don’t want your database engine to be hashing or sorting hundreds of thousands or millions of rows, especially when the end result is only a small number of rows.
Give SQL A Hint
The way around the performance problems inherent with hash and merge joins is to use the “INNER LOOP” join hint to politely force SQL to use nested loop joins, like so:
— List all files in “Michael’s” account.
FROM Accounts WITH (NOLOCK)
INNER LOOP JOIN Files WITH (NOLOCK) ON Files.AccountID =
WHERE Name = ‘Michael’
— NOTE: This is not an actual Streamload (Streamload is the company the author works for) query, nor is it representative of any of our actual schema.
Real World Examples
Here are some concrete examples culled from actual Streamload query tuning. Each of these examples have screen shots of before-and-after execution plans, so you may need to bone up on how to read these diagrams (here’s an MSDN link to help you out: http://msdn.microsoft.com/library/en-us/optimsql/odp_tun_1_1pfd.asp.
Here’s a crash course on reading execution plans: thick lines are bad; they represent lots of data being retrieved and processed). Besides the superior execution paths shown in each example, each of the queries went from being a 1+ second speed bump to an <0.1 second sprint with the INNER LOOP join hint in place.
The first example is a query of the form:
JOIN Table2 — ON conditions omitted.
Table1 is a short table and Table2 is a long one. Table1 is so short that it should power the whole query, but notice the merge join at the far right involving Table1 and Table2 in the execution plan, and how it requires those full index scans and sorting to power the join. View execution plan.
With the INNER LOOP join hint in place like so.
FROM Table1 INNER LOOP JOIN Table2 INNER LOOP JOIN Table3 INNER LOOP JOIN
This monstrosity is tamed such that an initial index scan powers parallel loop execution to retrieve the results. View execution plan.
The next example is a query of the form:
SELECT Table3.Column0, COUNT(Table2.Column1)
FROM Table1 JOIN Table2 JOIN Table3
GROUP BY Table3.Column0
Again, Table1 is short enough to power the whole query, but notice in the execution plan the three big full index scans powering costly merge and hash joins. The optimizer must think that looking through the Files.file0 index will cost enough to justify dredging up all of Table2’s rows. Big mistake. View execution plan.
With the hint, there’s only one big index scan and the optimizer thinks that hitting into the Files.file0 index will cost the most, so the main point of contention has stayed the same, but the volume of data being accessed is much lower, and so is the execution time. View execution plan.