MASSIVE performance decrease using Left Outer Join | SQL Server Performance Forums

SQL Server Performance Forum – Threads Archive

MASSIVE performance decrease using Left Outer Join

Hi, Was referred to this forum by Satya when he came down to advise our company on SQL query optimisation (Hi Satya, if you’re watching), so immediately thought of it when I ran into this issue which has totally perplexed me, wonder if anyone can help. I have a quite big SQL query, and when run with an inner join it returns 8400 records in 1 second. Left outer SHOULD only return 8600 records, so not much difference. The trouble is, I need to use a left outer join, and when I change the inner join to this it takes 2 seconds per row, therefore the query takes forever to run. [V] I have simplified the query in terms of output columns, this does not change the behaviour though. I have highlighted the join that changes below in RED. If anyone has any advice they can give on this one it would be really appreciated. Thanks,
Will SELECT matchedProducts.ProdCode AS ProdCode
FROM dbo.DataUpload du INNER JOIN
dbo.BusinessUnitCountryRelationship ON du.BusinessUnitID = dbo.BusinessUnitCountryRelationship.BusinessUnitID AND du.DealerCountry = dbo.BusinessUnitCountryRelationship.CountryAKA LEFT OUTER JOIN
(SELECT rl.BusinessUnitID, rl.AKA, dbo.Country.ID AS CountryID
FROM dbo.Partner rl INNER JOIN
dbo.BusinessUnit Reseller ON rl.PartnerID = Reseller.ID INNER JOIN
dbo.Country ON Reseller.RefCountry = dbo.Country.ID
WHERE (rl.Deleted = 0) AND (Reseller.Deleted = 0)) MatchedResellers ON
dbo.BusinessUnitCountryRelationship.CountryID = MatchedResellers.CountryID AND du.DealerName = MatchedResellers.AKA AND du.BusinessUnitID = MatchedResellers.BusinessUnitID LEFT OUTER JOIN
(SELECT pl.BusinessUnitID, pl.AKA, pl.CountryID, prods.ProdCode
FROM dbo.vw_OrganisedProducts prods INNER JOIN
dbo.BusinessUnitProduct pl ON prods.ProdID = pl.ProductID
WHERE (pl.Deleted = 0)) matchedProducts ON du.MATCode = matchedProducts.AKA AND du.BusinessUnitID = matchedProducts.BusinessUnitID
WHERE (du.Processed = 1) AND (du.Deleted = 0)
Is there any indexes in the tables?
You have already checked this?
http://support.microsoft.com/default.aspx?scid=kb;EN-US;836136
http://support.microsoft.com/default.aspx?scid=kb;EN-US;840406
http://support.microsoft.com/default.aspx?scid=kb;EN-US;830887
Frank Kalis
Microsoft SQL Server MVP
http://www.insidesql.de
Heute schon gebloggt?http://www.insidesql.de/blogs
Ich unterstuetze PASS Deutschland e.V. http://www.sqlpass.de)
Hi, Primary keys for the tables are DataUpload.ID
BusinessUnitCountryRelationship.ID
Partner.ID
BusinessUnit.ID
Country.ID As you can probably tell, vw_OrganisedProducts is a view, which I should have listed as well, please find this below, again I have restricted output columns to reduce There are no additional indexes on the tables. Thanks,
Will vw_OrganisedProducts
SELECT prods.ProductCode AS ProdCode, prods.ID AS ProdID
FROM dbo.Products prodType INNER JOIN
dbo.Products prods ON prodType.ID = prods.ParentID INNER JOIN
dbo.Products prodGroup ON prodType.ParentID = prodGroup.ID INNER JOIN
dbo.Products prodFamily ON prodGroup.ParentID = prodFamily.ID
WHERE (prods.Deleted = 0) AND (prods.EndProduct = 1) AND (prodType.ProductType = 1) AND (prodGroup.ProductGroup = 1) AND (prodFamily.ProductFamily = 1) AND (prodType.Deleted = 0) AND (prodGroup.Deleted = 0) AND (prodFamily.Deleted = 0)
quote:Originally posted by FrankKalis You have already checked this?
http://support.microsoft.com/default.aspx?scid=kb;EN-US;836136
http://support.microsoft.com/default.aspx?scid=kb;EN-US;840406
http://support.microsoft.com/default.aspx?scid=kb;EN-US;830887
Frank Kalis
Microsoft SQL Server MVP
http://www.insidesql.de
Heute schon gebloggt?http://www.insidesql.de/blogs
Ich unterstuetze PASS Deutschland e.V. http://www.sqlpass.de)

you have checked the links provided by Frank Kalis?
I have checked the links (thanks Frank for those) and we are using SP4 on the server, so my understanding is that those problems should not occur.
look at the execution plan for the query with the inner join and with the outer join,
what are the significant differences? also, run: SET STATISTICS IO ON, run the queries, and list the output of each
Regarding the output of the execution plans, I am not sure really how to interpret the results, but the main difference seems to be that there is an additional "Table Spool/Lazy Spool" coming down from one of the Nested Loops, and then branching out of this are some of the nested loops. In the INNER JOIN execution plan, these nested loops are all at the top level. Here is the output from STATISTICS IO, as you can see the really massive change is the usage of ‘BusinessUnitProduct’. WITH LEFT OUTER JOIN:
Table ‘BusinessUnit’. Scan count 3, logical reads 6, physical reads 0, read-ahead reads 0.
Table ‘Partner’. Scan count 8, logical reads 24, physical reads 0, read-ahead reads 0.
Table ‘Country’. Scan count 8, logical reads 16, physical reads 0, read-ahead reads 0.
Table ‘Products’. Scan count 44, logical reads 6908, physical reads 0, read-ahead reads 0.
Table ‘BusinessUnitCountryRelationship’. Scan count 12, logical reads 24, physical reads 0, read-ahead reads 0.
Table ‘DataUpload’. Scan count 1, logical reads 554, physical reads 0, read-ahead reads 0.
Table ‘BusinessUnitProduct’. Scan count 142362, logical reads 569448, physical reads 0, read-ahead reads 0. WITH INNER JOIN:
Table ‘BusinessUnit’. Scan count 3, logical reads 6, physical reads 0, read-ahead reads 0.
Table ‘Partner’. Scan count 6, logical reads 18, physical reads 0, read-ahead reads 0.
Table ‘Country’. Scan count 6, logical reads 12, physical reads 0, read-ahead reads 0.
Table ‘Products’. Scan count 26, logical reads 672, physical reads 0, read-ahead reads 0.
Table ‘BusinessUnitCountryRelationship’. Scan count 1, logical reads 2, physical reads 0, read-ahead reads 0.
Table ‘DataUpload’. Scan count 1, logical reads 554, physical reads 0, read-ahead reads 0.
Table ‘BusinessUnitProduct’. Scan count 1, logical reads 4, physical reads 0, read-ahead reads 0.
You have highest load on BusinessUnitProduct. Do you have an index on the ProductID for this table? What is in the vw_OrganisedProducts view? How many records do you in this table? If we can reduce substantially reduce the reads on the BusinessUnitProduct, I think that will provide the substantial difference. May the Almighty God bless us all!
www.empoweredinformation.com
Sorry, break. To continue, How about the BusinessUnitID? Do you have index on this column? May the Almighty God bless us all!
www.empoweredinformation.com
Hello Commander! There is not an index on the BusinessUnitProduct.ProductID nor is there an index on the BusinessUnitID field. For the SQL of vw_OrganisedProducts, please see the 4th post on this subject. There are 230 records in the BusinessUnitProduct table. I don’t understand how using a left outer join can make so much difference…
of course there is an index,
otherwise how you explain the fast plan with the inner join there is something about the query with the outer join that causes a very poor execution plan
I think you’re right Joe, it would appear the issue is that for some reason the Optimiser chooses a different plan when you use a LEFT OUTER, the question is why and how do we amend this…
run both of the queries again,
this time run profiler, select the performance, showplan statistics event
add the binary data columns, save the trace as a file and sent it to me
Sorry, been busy. I suggest you run the index tuning wizard and see what’s the recommendation. May the Almighty God bless us all!
www.empoweredinformation.com
How does it perform if you remove the derived table from the right hand side of the join? e.g. something like this… bet it flies… SELECT matchedProducts.ProdCode AS ProdCode
FROM dbo.DataUpload du INNER JOIN
dbo.BusinessUnitCountryRelationship ON du.BusinessUnitID = dbo.BusinessUnitCountryRelationship.BusinessUnitID AND du.DealerCountry = dbo.BusinessUnitCountryRelationship.CountryAKA LEFT OUTER JOIN
(SELECT rl.BusinessUnitID, rl.AKA, dbo.Country.ID AS CountryID
FROM dbo.Partner rl INNER JOIN
dbo.BusinessUnit Reseller ON rl.PartnerID = Reseller.ID INNER JOIN
dbo.Country ON Reseller.RefCountry = dbo.Country.ID
WHERE (rl.Deleted = 0) AND (Reseller.Deleted = 0)) MatchedResellers ON
dbo.BusinessUnitCountryRelationship.CountryID = MatchedResellers.CountryID AND du.DealerName = MatchedResellers.AKA AND du.BusinessUnitID = MatchedResellers.BusinessUnitID LEFT OUTER JOIN
dbo.BusinessUnitProduct matchedProducts ON du.MATCode = matchedProducts.AKA AND du.BusinessUnitID = matchedProducts.BusinessUnitID
LEFT OUTER JOIN dbo.vw_OrganisedProducts prods ON prods.ProdID = pl.ProductID
WHERE (du.Processed = 1) AND (du.Deleted = 0) and pl.deleted = 0
NB – there is an extra left outer join above but you should be fine so long as no products in the BusinessUnitProduct don’t have corresponding rows in the OrganisedProducts (where deleted =0) Hope this helps,
Dave. ps. Good to see the forum’s still thriving – sorry i been gone so long – been v.v. busy www.skyscanner.net – flight search
Dave, the SQL statement you used references the "pl" table even though you have removed it from the view…? Joe I will be sending you a trace file soon. Thanks,
Will
the problem is caused by a significant statistics estimation error
in the joining the products tables in the view, you could try updating all statistics on Products with FULL SCAN
if that does not change the execution plan
you might have to not use the view,
then force a certain join order anyways, focus on the execution plan from SET STATISTICS_PROFILE ON
it is the loop join to a table scan of BusinessUnitProduct executed 142,362 times that you want to reduced check the execution plan with the following join order reversed
FROM dbo.vw_OrganisedProducts prods
INNER JOIN dbo.BusinessUnitProduct pl ON prods.ProdID = pl.ProductID if the plan does not change, then try OPTION(FORCEORDER)
Hi Joe, I tried changing the order of the join but it didn’t make any difference. I am getting a syntax error when trying to use OPTION(FORCEORDER), can you give me an example of how you use this? Also, what does updating statistics do? Is this something I should need to run once, or every time the query is run? Thanks,
Will
Use: option(force order). Read about query hints in BOL.
Hi, Thanks guys, got force order working. On our development server (less records in various tables, but same amount in "Products" table, output 5000 records) it made a MASSIVE difference, going from 5 minutes without force order to 1 SECOND with! On our live server though (more records in various tables apart from "Products" table, output 100,000 records) it only cut a 4 minute query down to 3 minutes 30. How do I update statistics? Will
quote:Originally posted by wodell
How do I update statistics?
Look at update statistics topic in BOL. You can either allow autoupdating statistics or schedule update statistics to run during off hours. For batch process that inserts/updates/deletes significant number of rows you can add update statistics as the last step.
]]>