consider the following query: SELECT a.field1, b.field2 FROM dbo.tableA a LEFT JOIN dbo.tableB b ON a.field3 = b.field3 WHERE b.field4 IN ('A', 'B', 'C') Can anyone tell me if there would ever be an advantage to this being a left join operation? Also, the developer I'm discussing this with is confident that the WHERE clause is evaluated before the JOIN and I don't see how that could even be possible, but I don't want to unequivocally state he is wrong unless I can show him somehow. Supporting info/ documentation would be appreciated.
The question of left join or not is more a question of, is it correct or not?. left join is not something you do for 'an advantage', it's something you do because it's logical, because you have a 1:many relationship, because it has meaning with your data.
The performance issue is an aside for me... of course I'd prefer to know which is faster, but from a query operation standpoint, I'd really like to know how the query is analyzed in the optimizer. My first reaction as the developer was explaining why he did this sort of thing pegged my "BS-o-meter", but he swears by it. I'm pretty sure he's smoking something, but I need to be able to show him the error of his ways.
[quote user="jholovacs"] consider the following query: SELECT a.field1, b.field2 FROM dbo.tableA a LEFT JOIN dbo.tableB b ON a.field3 = b.field3 WHERE b.field4 IN ('A', 'B', 'C') Can anyone tell me if there would ever be an advantage to this being a left join operation? Also, the developer I'm discussing this with is confident that the WHERE clause is evaluated before the JOIN and I don't see how that could even be possible, but I don't want to unequivocally state he is wrong unless I can show him somehow. Supporting info/ documentation would be appreciated. [/quote] The LEFT JOIN here is senseless! Since the WHERE clause is referring to the unpreserved tableB the LEFT JOIN is effectivly turned into an INNER JOIN and any halfway smart optimizer will rewrite such a query to an INNER JOIN. I have a more detailed example here: http://sql-server-performance.com/Community/forums/p/12669/70997.aspx#70997 On the logical level is FROM evaluated before WHERE and any implementation is free to optimize things as long as the final resultset is guaranteed to be correct. To be honest, this is actually quite basic stuff, but I guess it will help reading Itzik Ben-Gan's "Inside SQL Server 2005" books.
yeah, it didn't make sense to me either, which is why I asked him about it in the first place. He patiently explained to me that the WHERE clause was evaluated before the JOIN operation and filtered the result set for the JOIN, and it was a LEFT JOIN because that made it faster; the implication was that I was an idiot for being a DBA and not knowing this to be the case. I politely told him I thought he was dead flat wrong, but I really need to be able to prove it to be able to shut him up.
Wrong! You're the DBA. It is him who should prove his hypothesis by hard facts. I would ask him to run his query and show the execution plan. All bets that SQL Server will show an INNER JOIN and that you get exactly the same plan for the equivalent query using the INNER JOIN. So, with two most likely identical execution plans, why should one variante run faster than the other? Here's a "management summary" of me posts in the other thread I mentioned above:USE tempDB IF OBJECT_ID('#Table1', 'table') > 0 DROP TABLE #Table1, #Table2 CREATE TABLE #Table1(tid INT, c1 CHAR) CREATE TABLE #Table2(tid INT, c2 CHAR)INSERT INTO #Table1 SELECT 1, 'a' UNION ALL SELECT 2, 'b' UNION ALL SELECT 3, 'c' UNION ALL SELECT 4, 'd'INSERT INTO #Table2 SELECT 1, 'x' UNION ALL SELECT 2, 'y' UNION ALL SELECT 3, 'z'SELECT t1.*, t2.c2 FROM #Table1 t1 LEFT OUTER JOIN #Table2 t2 ON t1.tid=t2.tid WHERE t2.c2='x' SELECT t1.*, t2.c2 FROM #Table1 t1 INNER JOIN #Table2 t2 ON t1.tid=t2.tid WHERE t2.c2='x' tid c1 c2 ----------- ---- ---- 1 a x (1 row(s) affected) tid c1 c2 ----------- ---- ---- 1 a x (1 row(s) affected) Both statements produce the same resultset. Here are the execution plans: StmtText --------------------------------------------------------------------------------------------------------------------- SELECT t1.*, t2.c2 FROM #Table1 t1 LEFT OUTER JOIN #Table2 t2 ON t1.tid=t2.tid WHERE t2.c2='x' (1 row(s) affected) StmtText ------------------------------------------------------------------------------------------------------------------------------------------------------------------ |--Hash Match(Inner Join, HASH[t2].[tid])=([t1].[tid]), RESIDUAL[tempdb].[dbo].[#Table2].[tid] as [t2].[tid]=[tempdb].[dbo].[#Table1].[tid] as [t1].[tid])) |--Table Scan(OBJECT[tempdb].[dbo].[#Table2] AS [t2]), WHERE[tempdb].[dbo].[#Table2].[c2] as [t2].[c2]='x')) |--Table Scan(OBJECT[tempdb].[dbo].[#Table1] AS [t1])) (3 row(s) affected) StmtText --------------------------------------------------------------------------------------------------------------- SELECT t1.*, t2.c2 FROM #Table1 t1 INNER JOIN #Table2 t2 ON t1.tid=t2.tid WHERE t2.c2='x' (1 row(s) affected) StmtText ------------------------------------------------------------------------------------------------------------------------------------------------------------------ |--Hash Match(Inner Join, HASH[t2].[tid])=([t1].[tid]), RESIDUAL[tempdb].[dbo].[#Table2].[tid] as [t2].[tid]=[tempdb].[dbo].[#Table1].[tid] as [t1].[tid])) |--Table Scan(OBJECT[tempdb].[dbo].[#Table2] AS [t2]), WHERE[tempdb].[dbo].[#Table2].[c2] as [t2].[c2]='x')) |--Table Scan(OBJECT[tempdb].[dbo].[#Table1] AS [t1])) (3 row(s) affected) On the logical level the FROM clause is evaluated first. That includes JOINs. The WHERE clause is logically evaluated thereafter. So, the filtering happens after the JOIN has been processed. There are many, many threads in almost any RDBMS community about this. Just google, for example, the MS public newsgroups. Here's just one example: http://groups.google.de/group/microsoft.public.sqlserver.programming/msg/ee85c52a7c1c909d As I mentioned before, Itzik has dedicated two chapters in one of his books about SQL Server 2005 to logical and physical query processing. Tell your developer to get a copy of these books before making such funny statements again.
We don't fire our developers when they screw something up. We make them fix it; it's far more painful that way.
[quote user="jholovacs"] We don't fire our developers when they screw something up. We make them fix it; it's far more painful that way. [/quote] Hmmm... I never thought about it.[]