SQL Server Performance Forum – Threads Archive

NOT EXISTS vs NOT IN vs LEFT OUTER JOIN

I’m having a debate with someone over which is more performant and I see that athttp://www.sql-server-performance.com/transact_sql.asp you favor NOT EXISTS, then LEFT OUTER JOIN then NOT IN. Assuming proper indexes why do you favor one over the other. Is it based on your experience or based on how you know the optimizer works). The more details the better. Thanks in advance

One doubt comes to my mind is all that depends on when you execute these queries with those 3 clauses alongside by checking the statistics & exec. plans. This is best way forward to prove rather than debating on a particular set of options. IMHO outer join are more efficient in performance Satya SKJ
Microsoft SQL Server MVP
Writer, Contributing Editor & Moderator
http://www.SQL-Server-Performance.Com
@http://www.askasqlguru.com/ This posting is provided AS IS with no rights for the sake of knowledge sharing. Knowledge is of two kinds. We know a subject ourselves or we know where we can find information on it.

in sql 2k5 these 3 will almost always produce the same query plan. Not exists is probably favored because it stops processing after the condition is met
while join and in have to go over the whole table. however i wouldn’t waste my time on perf diffs with these 3 options.
i have yet to seen them behave differently perf-wise. now logically…. that’s another story. _______________________________________________
Causing trouble since 1980
blog:http://weblogs.sqlteam.com/mladenp

The short answer is it depends.

Im my experiecne it depends on what you are doing and how the data is arranged. In general I see OUTER joins perform better on larger tables (300 million plus rows). So, I tend to use that more frequently. Additionally, I think it makes the code easier to read if the outer join method is used consistantly. However, there are times when a NOT EXISTS can be more performant. So, test, test and test.

Knowing what the optimizer does? Don’t try to outsmart the optimizer! [

]
RDBMS like SQL Server try to flatten operations to JOINs behind the scenes if they can. That’s what they are optimized for. So, I prefer the JOIN method, But that’s just me. Write the query in the way that is most intuitive to you.

Suppose I have a table of stocks StockId
StockSymbol And a table of stock price history, with average 100,000 records per stock, and I want to return all stocks containing at least one entry in the history table intuitively it seems to me it would be quicker to just test EXISTS rather than join and then filter DISTINCT. Are you saying that JOIN would be better in that instance?

[quote user="MartinSmithh"]
Suppose I have a table of stocks StockId
StockSymbol And a table of stock price history, with average 100,000 records per stock, and I want to return all stocks containing at least one entry in the history table intuitively it seems to me it would be quicker to just test EXISTS rather than join and then filter DISTINCT. Are you saying that JOIN would be better in that instance?
[/quote]
Well, the way you formulate the query in human language already suggests that you use EXISTS. You don’t want to return any data from the history table, it seems. So why JOIN on it?

From what has been discussed here in the past, the advice is to use EXISTS where possible: it may perform better than when you use a JOIN or an IN subquery, but never worse.
For NOT EXISTS there’s probably less of a difference.

Just checking. Thanks for the clarification!

]]>