SQL Server Performance

Paging extremely large resultsets and query probs

Discussion in 'T-SQL Performance Tuning for Developers' started by bcochran, Jan 2, 2004.

  1. bcochran New Member

    I am working on a query that needs to page through well over 1 million rows. We are working with 1 million while it is in test, but this number will grow very large. The select statement is causing a lot of problems with the sorting. We can manipulate the query so that it will run in under 2 seconds. The statement is as follows:

    SELECT TOP 25 i.ItemID, ifd2.ItemFieldDefaultId
    FROM tblItems i
    LEFT JOIN tblItemFields ifs2 ON i.ItemId = ifs2.ItemId
    INNER JOIN tblItemFieldDefaults ifd2 ON
    ifd2.ItemFieldTypeId = ifs2.FieldTypeId
    AND ifd2.ItemFieldDefaultId = ifs2.FieldValue
    AND ifd2.ItemFieldProjectId = ifs2.itemrootid
    INNER JOIN tblObjectRelations ors on ors.objectid = i.itemparentid
    INNER JOIN tblUserSecurity us ON
    us.UserSecurityUserId = 1076
    AND us.UserSecurityObjectId = ors.ObjectId
    WHERE i.ItemCreationDate <= 'Dec 31 2003 12:00PM'
    AND i.itemrootid = 1100 and ifs2.fieldtypeid = 252
    AND ors.L > 1
    AND ors.R < 200000
    AND us.UserSecurityValue & 1 = 1
    AND ors.ObjectStatus IN (1,4)
    AND ors.ObjectTypeID=6
    ORDER BY ifd2.ItemFieldDefaultId DESC

    When this is run with the TOP 25, everything runs great. When it returns anything over 410 rows, it seems to want to re-sort everything (according to the execution plan) and takes about 4 minutes. SELECT TOP 409 works great as well. It has gotten very confusing and was wondering if anyone might have any ideas as to why something like this would slow it down. If we remove rows with a more narrow WHERE clause, it seems to re-sort as well. With this being the case, none of the paging routines (temp table, different TOP SELECTS) will work. This is a dynamiccaly generated SQL statement based on sort criteria chosen at the client. The lists are used in real time, and cannot have very long query times. All where and join fields are indexed appropriately, and it looks like we have the correct indexes, etc. that we're needing.

    Any help or ideas will be greatly appreciated!
  2. joechang New Member

    if there a difference in the execution plans between 409 and 410 rows?
    in the above case, the plan should be a loop join for low row counts up to a certain point, at which it may want to try hash or merge joins,
    however, if the indexes & keys are not setup correctly, the hash or merge could be really bad.
    same with parallel execution and HT enabled.
    also, check the relative cost of the query at TOP 100, 150, 200, 400, 410
  3. bcochran New Member

    Yes, the execution plan is definitely different between 409 and 410, and you are right that at 410 it is using a hash/inner join instead of a loop join. So this is due to indexes? We have indexes on all fields that are joined and filtered on, and some multi-column indexes for the multi-column joins, etc. I'm not sure what else to do with those.
  4. Luis Martin Moderator

    Run Index Tunning Wizard from SQL Analyzer to see if there is more recomendations about indexs,
    Using differents TOP.


    Luis Martin

    ...Thus mathematics may be defined as the subject in which we never know what we are talking about, nor whether what we are saying is true.
    Bertrand Russell
  5. ChrisFretwell New Member

    Not a huge thing, but you
    LEFT JOIN tblItemFields ifs2 and then use a non-null where clause on a field from this table (and ifs2.fieldtypeid = 252).
    Which means it will only return inner join rows. So if you really want this where clause, you can remove the left join, or if you really want the left join then changes your where to and (ifs2.fieldtypeid = 252 or ifs2.fieldtypeid is null) or move the where part of this one up to the on part of the join clause.

    This may or may now make a difference, but its cleaner code.

    Chris
  6. ChrisFretwell New Member

    Oh, and one more thing, what is the execution plan if you dont use Top x (ie all rows).

    chris
  7. bcochran New Member

    The LEFT JOIN was actually supposed to be joined on the ifs2.fieldtypeid = 252 as well, it just got moved while trying to optimize the query, and never got moved back. That join should actually be:

    LEFT JOIN tblItemFields ifs2 ON i.ItemId = ifs2.itemId AND ifs2.FieldTypeId = 252

    The execution plan for all rows is the same as 410 and above. I have run the index tuner but id doesn't suggest anything different. I'm at a loss here, and can't figure this one out. It's very high priority, but nothing seems to help.
  8. bcochran New Member

    I have found that if I remove the ORDER clause it works great at all TOPs that I specify which would probably speed up the paging. While this isn't an option, it may help in diagnosing what is going on!
  9. ChrisFretwell New Member

    Yes and no. Do you have an index on ItemFieldDefaultId? Is it clustered?

    If you have a clustered index on this field, sql whould use the clustered index, for sorting/selecting. If you dont have one, then sql will pick what seems best.

    Can you test adding a clustered index if you dont have one?

  10. Twan New Member

    You could also perhaps try change the query a little
    a)


    SELECT TOP 25
    i.ItemID,
    ifd2.ItemFieldDefaultId
    FROM tblItemFieldDefaults ifd2
    INNER JOIN tblItemFields ifs2
    RIGHT JOIN tblItems i
    INNER JOIN tblObjectRelations ors
    INNER JOIN tblUserSecurity us
    ON us.UserSecurityUserId = 1076
    AND us.UserSecurityObjectId = ors.ObjectId
    AND us.UserSecurityValue & 1 = 1
    on ors.objectid = i.itemparentid
    AND ors.L > 1
    AND ors.R < 200000
    AND ors.ObjectStatus IN (1,4)
    AND ors.ObjectTypeID=6
    ON i.ItemId = ifs2.ItemId
    AND i.ItemCreationDate <= 'Dec 31 2003 12:00PM'
    AND i.itemrootid = 1100
    ON ifd2.ItemFieldTypeId = ifs2.FieldTypeId
    AND ifd2.ItemFieldDefaultId = ifs2.FieldValue
    AND ifd2.ItemFieldProjectId = ifs2.itemrootid
    AND ifs2.fieldtypeid = 252
    ORDER BY
    ifd2.ItemFieldDefaultId DESC

    and b) force a hash join where you think it should be

    and c) assuming that there is no other possible criteria for the tblItemFieldDefaults table, add a non-clustered index ( ItemFieldDefaultId, ItemFieldTypeId, ItemFieldProjectId ) in that order. If necessary try forcing it and forceplan.

    NOT sure how this would fit in with dynamically generating the statement though... how varied are the possible combination of where clauses and order by statements? If you solve the performance for this one then do you have another 10, 100, 1000 permutations to also work on?

    PS which table is the large one? Is it tblitems? If so then you could also try



    SELECT TOP 25
    i.ItemID,
    ifd2.ItemFieldDefaultId
    FROM tblItemFieldDefaults ifd2
    INNER JOIN tblItemFields ifs2
    RIGHT JOIN (SELECT
    i.ItemID
    FROM tblItems i
    INNER JOIN tblObjectRelations ors
    INNER JOIN tblUserSecurity us
    ON us.UserSecurityUserId = 1076
    AND us.UserSecurityObjectId = ors.ObjectId
    AND us.UserSecurityValue & 1 = 1
    on ors.objectid = i.itemparentid
    AND ors.L > 1
    AND ors.R < 200000
    AND ors.ObjectStatus IN (1,4)
    AND ors.ObjectTypeID=6
    WHERE i.ItemCreationDate <= 'Dec 31 2003 12:00PM'
    AND i.itemrootid = 1100 ) i
    ON i.ItemId = ifs2.ItemId
    ON ifd2.ItemFieldTypeId = ifs2.FieldTypeId
    AND ifd2.ItemFieldDefaultId = ifs2.FieldValue
    AND ifd2.ItemFieldProjectId = ifs2.itemrootid
    AND ifs2.fieldtypeid = 252
    ORDER BY
    ifd2.ItemFieldDefaultId DESC
  11. bcochran New Member

    Since these are auto-generated, the sort could (unfortunately) come many different fields in many different tables, so yes, this is just one example of a query that will need to be run. Everything in the application is very generic, so we never know what we're going to be sorting on. Some of the fields will be SQL_VARIANTS as well. I am away until Monday, so I will try these suggestions at that time and see if they work. Thank you to everyone for your suggestions so far!
  12. bcochran New Member

    I have a clustered index on the three fields that are searched on ItemFieldDefaults, and I changed them to the order in which it was formed, and it has changed the number from 410 to something higher. It is definitely the Hash Match that is killing me. Anytime that is used it is incredibly slow. I do notice that every time it does the hash join, it is sorting afterwards. When it does the nested join, it is sorting before.
  13. bcochran New Member

    Correction on that olast post, when doing the loop join, it does not do a sort at all, so I'm assuming it's using the clustered index in this case to do the sorting.
  14. Twan New Member


    Hi ya,

    you could force the loop join by adding the word loop to which ever of the join conditions is causing you the problem. I have had to do this for some of my queries too...

    Cheers
    Twan

Share This Page