SQL Server Join Nastiness Tamed: Finding Joy In The INNER LOOP

Microsoft SQL Server sometimes uses hash and merge joins when querying large tables when uncomplicated nested loop joins would result in better performance and less server impact. In many such cases, query times go from many milliseconds to many seconds because hash table joins require that large amounts of data be processed and temporarily stored, and merge joins require sorting and then processing similarly large amounts of data.

This is fine for one-time administrative “fact finding” queries where you don’t have or want the indices needed to optimize the query, and you’re willing to wait the seconds or minutes it takes to get results.

For day-in-and-day-out application queries, however, you don’t want your database engine to be hashing or sorting hundreds of thousands or millions of rows, especially when the end result is only a small number of rows.

Give SQL A Hint

The way around the performance problems inherent with hash and merge joins is to use the “INNER LOOP” join hint to politely force SQL to use nested loop joins, like so:

– List all files in “Michael’s” account.

SELECT Filename
FROM Accounts WITH (NOLOCK)
INNER LOOP JOIN Files WITH (NOLOCK) ON Files.AccountID =
      Accounts.AccountID
WHERE Name = ‘Michael’

– NOTE: This is not an actual Streamload (Streamload is the company the author works for) query, nor is it representative of any of our actual schema.

Real World Examples

Here are some concrete examples culled from actual Streamload query tuning. Each of these examples have screen shots of before-and-after execution plans, so you may need to bone up on how to read these diagrams (here’s an MSDN link to help you out: http://msdn.microsoft.com/library/en-us/optimsql/odp_tun_1_1pfd.asp.

Here’s a crash course on reading execution plans: thick lines are bad; they represent lots of data being retrieved and processed). Besides the superior execution paths shown in each example, each of the queries went from being a 1+ second speed bump to an <0.1 second sprint with the INNER LOOP join hint in place.


Example 1

The first example is a query of the form:

SELECT Table4.Column0
FROM Table1
JOIN Table2 — ON conditions omitted.
JOIN Table3
JOIN Table4

Table1 is a short table and Table2 is a long one. Table1 is so short that it should power the whole query, but notice the merge join at the far right involving Table1 and Table2 in the execution plan, and how it requires those full index scans and sorting to power the join. View execution plan.

With the INNER LOOP join hint in place like so.

SELECT Table4.Column0
FROM Table1 INNER LOOP JOIN Table2 INNER LOOP JOIN Table3 INNER LOOP JOIN
Table4

This monstrosity is tamed such that an initial index scan powers parallel loop execution to retrieve the results. View execution plan.


Example 2

The next example is a query of the form:

SELECT Table3.Column0, COUNT(Table2.Column1)
FROM Table1 JOIN Table2 JOIN Table3
GROUP BY Table3.Column0

Again, Table1 is short enough to power the whole query, but notice in the execution plan the three big full index scans powering costly merge and hash joins. The optimizer must think that looking through the Files.file0 index will cost enough to justify dredging up all of Table2′s rows. Big mistake. View execution plan.

With the hint, there’s only one big index scan and the optimizer thinks that hitting into the Files.file0 index will cost the most, so the main point of contention has stayed the same, but the volume of data being accessed is much lower, and so is the execution time. View execution plan.

Continues…

Pages: 1 2 3




Related Articles :

  • No Related Articles Found

No comments yet... Be the first to leave a reply!

Software Reviews | Book Reviews | FAQs | Tips | Articles | Performance Tuning | Audit | BI | Clustering | Developer | Reporting | DBA | ASP.NET Ado | Views tips | | Developer FAQs | Replication Tips | OS Tips | Misc Tips | Index Tuning Tips | Hints Tips | High Availability Tips | Hardware Tips | ETL Tips | Components Tips | Configuration Tips | App Dev Tips | OLAP Tips | Admin Tips | Software Reviews | Error | Clustering FAQs | Performance Tuning FAQs | DBA FAQs |