SQL Server Performance

MASSIVE performance decrease using Left Outer Join

Discussion in 'T-SQL Performance Tuning for Developers' started by wodell, May 30, 2006.

  1. wodell New Member

    Hi,

    Was referred to this forum by Satya when he came down to advise our company on SQL query optimisation (Hi Satya, if you're watching), so immediately thought of it when I ran into this issue which has totally perplexed me, wonder if anyone can help.

    I have a quite big SQL query, and when run with an inner join it returns 8400 records in 1 second. Left outer SHOULD only return 8600 records, so not much difference.

    The trouble is, I need to use a left outer join, and when I change the inner join to this it takes 2 seconds per row, therefore the query takes forever to run. [V]

    I have simplified the query in terms of output columns, this does not change the behaviour though. I have highlighted the join that changes below in RED.

    If anyone has any advice they can give on this one it would be really appreciated.

    Thanks,
    Will

    SELECT matchedProducts.ProdCode AS ProdCode
    FROM dbo.DataUpload du INNER JOIN
    dbo.BusinessUnitCountryRelationship ON du.BusinessUnitID = dbo.BusinessUnitCountryRelationship.BusinessUnitID AND du.DealerCountry = dbo.BusinessUnitCountryRelationship.CountryAKA LEFT OUTER JOIN
    (SELECT rl.BusinessUnitID, rl.AKA, dbo.Country.ID AS CountryID
    FROM dbo.Partner rl INNER JOIN
    dbo.BusinessUnit Reseller ON rl.PartnerID = Reseller.ID INNER JOIN
    dbo.Country ON Reseller.RefCountry = dbo.Country.ID
    WHERE (rl.Deleted = 0) AND (Reseller.Deleted = 0)) MatchedResellers ON
    dbo.BusinessUnitCountryRelationship.CountryID = MatchedResellers.CountryID AND du.DealerName = MatchedResellers.AKA AND du.BusinessUnitID = MatchedResellers.BusinessUnitID LEFT OUTER JOIN
    (SELECT pl.BusinessUnitID, pl.AKA, pl.CountryID, prods.ProdCode
    FROM dbo.vw_OrganisedProducts prods INNER JOIN
    dbo.BusinessUnitProduct pl ON prods.ProdID = pl.ProductID
    WHERE (pl.Deleted = 0)) matchedProducts ON du.MATCode = matchedProducts.AKA AND du.BusinessUnitID = matchedProducts.BusinessUnitID
    WHERE (du.Processed = 1) AND (du.Deleted = 0)
  2. smy New Member

    Is there any indexes in the tables?
  3. FrankKalis Moderator

  4. wodell New Member

    Hi,

    Primary keys for the tables are

    DataUpload.ID
    BusinessUnitCountryRelationship.ID
    Partner.ID
    BusinessUnit.ID
    Country.ID

    As you can probably tell, vw_OrganisedProducts is a view, which I should have listed as well, please find this below, again I have restricted output columns to reduce

    There are no additional indexes on the tables.

    Thanks,
    Will

    vw_OrganisedProducts
    SELECT prods.ProductCode AS ProdCode, prods.ID AS ProdID
    FROM dbo.Products prodType INNER JOIN
    dbo.Products prods ON prodType.ID = prods.ParentID INNER JOIN
    dbo.Products prodGroup ON prodType.ParentID = prodGroup.ID INNER JOIN
    dbo.Products prodFamily ON prodGroup.ParentID = prodFamily.ID
    WHERE (prods.Deleted = 0) AND (prods.EndProduct = 1) AND (prodType.ProductType = 1) AND (prodGroup.ProductGroup = 1) AND (prodFamily.ProductFamily = 1) AND (prodType.Deleted = 0) AND (prodGroup.Deleted = 0) AND (prodFamily.Deleted = 0)
  5. smy New Member

  6. wodell New Member

    I have checked the links (thanks Frank for those) and we are using SP4 on the server, so my understanding is that those problems should not occur.
  7. joechang New Member

    look at the execution plan for the query with the inner join and with the outer join,
    what are the significant differences?

    also, run: SET STATISTICS IO ON, run the queries, and list the output of each
  8. wodell New Member

    Regarding the output of the execution plans, I am not sure really how to interpret the results, but the main difference seems to be that there is an additional "Table Spool/Lazy Spool" coming down from one of the Nested Loops, and then branching out of this are some of the nested loops. In the INNER JOIN execution plan, these nested loops are all at the top level.

    Here is the output from STATISTICS IO, as you can see the really massive change is the usage of 'BusinessUnitProduct'.

    WITH LEFT OUTER JOIN:
    Table 'BusinessUnit'. Scan count 3, logical reads 6, physical reads 0, read-ahead reads 0.
    Table 'Partner'. Scan count 8, logical reads 24, physical reads 0, read-ahead reads 0.
    Table 'Country'. Scan count 8, logical reads 16, physical reads 0, read-ahead reads 0.
    Table 'Products'. Scan count 44, logical reads 6908, physical reads 0, read-ahead reads 0.
    Table 'BusinessUnitCountryRelationship'. Scan count 12, logical reads 24, physical reads 0, read-ahead reads 0.
    Table 'DataUpload'. Scan count 1, logical reads 554, physical reads 0, read-ahead reads 0.
    Table 'BusinessUnitProduct'. Scan count 142362, logical reads 569448, physical reads 0, read-ahead reads 0.

    WITH INNER JOIN:
    Table 'BusinessUnit'. Scan count 3, logical reads 6, physical reads 0, read-ahead reads 0.
    Table 'Partner'. Scan count 6, logical reads 18, physical reads 0, read-ahead reads 0.
    Table 'Country'. Scan count 6, logical reads 12, physical reads 0, read-ahead reads 0.
    Table 'Products'. Scan count 26, logical reads 672, physical reads 0, read-ahead reads 0.
    Table 'BusinessUnitCountryRelationship'. Scan count 1, logical reads 2, physical reads 0, read-ahead reads 0.
    Table 'DataUpload'. Scan count 1, logical reads 554, physical reads 0, read-ahead reads 0.
    Table 'BusinessUnitProduct'. Scan count 1, logical reads 4, physical reads 0, read-ahead reads 0.
  9. cmdr_skywalker New Member

    You have highest load on BusinessUnitProduct. Do you have an index on the ProductID for this table? What is in the vw_OrganisedProducts view? How many records do you in this table? If we can reduce substantially reduce the reads on the BusinessUnitProduct, I think that will provide the substantial difference.

    May the Almighty God bless us all!
    www.empoweredinformation.com
  10. cmdr_skywalker New Member

    Sorry, break. To continue, How about the BusinessUnitID? Do you have index on this column?

    May the Almighty God bless us all!
    www.empoweredinformation.com
  11. wodell New Member

    Hello Commander!

    There is not an index on the BusinessUnitProduct.ProductID nor is there an index on the BusinessUnitID field.

    For the SQL of vw_OrganisedProducts, please see the 4th post on this subject.

    There are 230 records in the BusinessUnitProduct table.

    I don't understand how using a left outer join can make so much difference...
  12. joechang New Member

    of course there is an index,
    otherwise how you explain the fast plan with the inner join

    there is something about the query with the outer join that causes a very poor execution plan
  13. wodell New Member

    I think you're right Joe, it would appear the issue is that for some reason the Optimiser chooses a different plan when you use a LEFT OUTER, the question is why and how do we amend this...
  14. joechang New Member

    run both of the queries again,
    this time run profiler, select the performance, showplan statistics event
    add the binary data columns,

    save the trace as a file and sent it to me
  15. cmdr_skywalker New Member

    Sorry, been busy. I suggest you run the index tuning wizard and see what's the recommendation.

    May the Almighty God bless us all!
    www.empoweredinformation.com
  16. dhilditch New Member

    How does it perform if you remove the derived table from the right hand side of the join?

    e.g. something like this... bet it flies...

    SELECT matchedProducts.ProdCode AS ProdCode
    FROM dbo.DataUpload du INNER JOIN
    dbo.BusinessUnitCountryRelationship ON du.BusinessUnitID = dbo.BusinessUnitCountryRelationship.BusinessUnitID AND du.DealerCountry = dbo.BusinessUnitCountryRelationship.CountryAKA LEFT OUTER JOIN
    (SELECT rl.BusinessUnitID, rl.AKA, dbo.Country.ID AS CountryID
    FROM dbo.Partner rl INNER JOIN
    dbo.BusinessUnit Reseller ON rl.PartnerID = Reseller.ID INNER JOIN
    dbo.Country ON Reseller.RefCountry = dbo.Country.ID
    WHERE (rl.Deleted = 0) AND (Reseller.Deleted = 0)) MatchedResellers ON
    dbo.BusinessUnitCountryRelationship.CountryID = MatchedResellers.CountryID AND du.DealerName = MatchedResellers.AKA AND du.BusinessUnitID = MatchedResellers.BusinessUnitID LEFT OUTER JOIN
    dbo.BusinessUnitProduct matchedProducts ON du.MATCode = matchedProducts.AKA AND du.BusinessUnitID = matchedProducts.BusinessUnitID
    LEFT OUTER JOIN dbo.vw_OrganisedProducts prods ON prods.ProdID = pl.ProductID
    WHERE (du.Processed = 1) AND (du.Deleted = 0) and pl.deleted = 0


    NB - there is an extra left outer join above but you should be fine so long as no products in the BusinessUnitProduct don't have corresponding rows in the OrganisedProducts (where deleted =0)

    Hope this helps,
    Dave.

    ps. Good to see the forum's still thriving - sorry i been gone so long - been v.v. busy

    www.skyscanner.net - flight search
  17. wodell New Member

    Dave, the SQL statement you used references the "pl" table even though you have removed it from the view...?

    Joe I will be sending you a trace file soon.

    Thanks,
    Will
  18. joechang New Member

    the problem is caused by a significant statistics estimation error
    in the joining the products tables in the view,

    you could try updating all statistics on Products with FULL SCAN
    if that does not change the execution plan
    you might have to not use the view,
    then force a certain join order

    anyways, focus on the execution plan from SET STATISTICS_PROFILE ON
    it is the loop join to a table scan of BusinessUnitProduct executed 142,362 times that you want to reduced

    check the execution plan with the following join order reversed
    FROM dbo.vw_OrganisedProducts prods
    INNER JOIN dbo.BusinessUnitProduct pl ON prods.ProdID = pl.ProductID

    if the plan does not change, then try OPTION(FORCEORDER)
  19. wodell New Member

    Hi Joe,

    I tried changing the order of the join but it didn't make any difference. I am getting a syntax error when trying to use OPTION(FORCEORDER), can you give me an example of how you use this?

    Also, what does updating statistics do? Is this something I should need to run once, or every time the query is run?

    Thanks,
    Will
  20. mmarovic Active Member

    Use: option(force order). Read about query hints in BOL.
  21. wodell New Member

    Hi,

    Thanks guys, got force order working. On our development server (less records in various tables, but same amount in "Products" table, output 5000 records) it made a MASSIVE difference, going from 5 minutes without force order to 1 SECOND with!

    On our live server though (more records in various tables apart from "Products" table, output 100,000 records) it only cut a 4 minute query down to 3 minutes 30.

    How do I update statistics?

    Will
  22. mmarovic Active Member

    quote:Originally posted by wodell
    How do I update statistics?
    Look at update statistics topic in BOL. You can either allow autoupdating statistics or schedule update statistics to run during off hours.

    For batch process that inserts/updates/deletes significant number of rows you can add update statistics as the last step.

Share This Page