Using semi joins for distributed databases | SQL Server Performance Forums

SQL Server Performance Forum – Threads Archive

Using semi joins for distributed databases

HI!
Recently I have heard that semi joins give better performance compared to outer joins when working with distributed databases.
I have gone through BOL and did bit googling but coudnt find any useful material regarding semi joins
BOL says abt left semi join
"The Left Semi Join logical operator returns each row from the first (top) input when there is a matching row in the second (bottom) input"
Can you please explain about the difference come in performance in using semi joins and outer joins
You can find those in BOL Left Semi Join The Left Semi Join logical operator returns each row from the first (top) input when there is a matching row in the second (bottom) input. If no join predicate exists in the Argument column, each row is a matching row. Right Semi Join
The Right Semi Join logical operator returns each row from the second (bottom) input when there is a matching row in the first (top) input. If no join predicate exists in the Argument column, each row is a matching row.

Yes, it is documented in BOL – only problem is that neither SQL 7.0 nor SQL 2000 seem to support it: SELECT T1.*
FROM T1
LEFT SEMI JOIN T2
ON T1.col = T2.col Server: Msg 155, Level 15, State 1, Line 3
‘semi’ is not a recognized join option. [xx(]
Second thought … The equivalent of that, provided that you’re not returning values from the outer table, is to use an EXISTS clause in the WHERE statement, instead of a JOIN.
One site has asked to do it like below
Semi-join Perform a join, but just show columns from one table, and include nulls, like in the outer join from #10. Employee left-semi-join Employee.Department = Department.Department Department
But taht also failed
]]>