SQL Server Join Nastiness Tamed: Finding Joy In The INNER LOOP

4. IN Queries Are Not In With This Hint

I have not found a way to use this technique for IN queries (like “SELECT * FROM Foo WHERE FooID IN (SELECT FooID FROM Bar)”), which is a shame because IN queries have the same characteristics — and risks of undesirable hash and merge joins — as explicit joins. This has required considerable effort on my part to either deal with extra join rows, or to unroll the query logic into cursor code or ADO code.

5. It’s Complicated

SQL query optimization, that is. It may well be that what I’ve observed on Streamload’s database servers is because our I/O and/or memory systems are effective enough to make the extra index hits caused by looping to be cheaper than creating hash tables or powering merges. Or maybe it’s the other way around. Maybe hash table building requires so much memory and/or tempdb I/O that it’s faster for our system to do all the looped index hits. SQL’s query analyzer thinks it can put costs on different types of operations, and it has various statistics at its disposal, and maybe it’s right a lot of the time…but then again, if it knows so much, then why did I have to write this article, right?

Ironies & Tragedies

If you read the documentation for hash and merge joins, you’ll see the level of sophistication that Microsoft put into trying to make these techniques work for very large databases. The irony is that I’ve got a very large database, and I want no part of hash or merge joins for my high traffic queries, except for the aforementioned one-off or rare-off administrative purposes, or when lots of rows are involved.

You may get tragic hashing or merging behavior even when you have good indices in place. I’m not sure why this is: perhaps its statistics tell it that the indices aren’t specific enough, but it turns out that you have domain knowledge about your query such that you know that the indices are good, or maybe the query optimizer just has flawed logic when it comes to gambling on deep looping structures.

Conclusion

If you’ve got queries that used to be lightning fast and that don’t return or process a lot of rows, and now these queries are really starting to drag, and hash and merge joins are showing up in the execution paths, give the INNER LOOP join hint a try. It might throw you for a loop…or it might get you back to that lightning fast responsiveness you had before.

Shameless Plug

These and many other cutting-edge techniques power www.streamload.com, a web storage and delivery site offering unlimited free online storage for your files. Upload all your files from your computer to gain format-savvy access to them from anywhere. Features include image slideshows, audio and video streaming and playlists, and streamlined download management. Also, share and sell your own digital creations. This is not another drive site: This is your digital entertainment, online.

]]>

Leave a comment

Your email address will not be published.