Paging extremely large resultsets and query probs | SQL Server Performance Forums

SQL Server Performance Forum – Threads Archive

Paging extremely large resultsets and query probs

I am working on a query that needs to page through well over 1 million rows. We are working with 1 million while it is in test, but this number will grow very large. The select statement is causing a lot of problems with the sorting. We can manipulate the query so that it will run in under 2 seconds. The statement is as follows: SELECT TOP 25 i.ItemID, ifd2.ItemFieldDefaultId
FROM tblItems i
LEFT JOIN tblItemFields ifs2 ON i.ItemId = ifs2.ItemId
INNER JOIN tblItemFieldDefaults ifd2 ON
ifd2.ItemFieldTypeId = ifs2.FieldTypeId
AND ifd2.ItemFieldDefaultId = ifs2.FieldValue
AND ifd2.ItemFieldProjectId = ifs2.itemrootid
INNER JOIN tblObjectRelations ors on ors.objectid = i.itemparentid
INNER JOIN tblUserSecurity us ON
us.UserSecurityUserId = 1076
AND us.UserSecurityObjectId = ors.ObjectId
WHERE i.ItemCreationDate <= ‘Dec 31 2003 12:00PM’
AND i.itemrootid = 1100 and ifs2.fieldtypeid = 252
AND ors.L > 1
AND ors.R < 200000
AND us.UserSecurityValue & 1 = 1
AND ors.ObjectStatus IN (1,4)
AND ors.ObjectTypeID=6
ORDER BY ifd2.ItemFieldDefaultId DESC When this is run with the TOP 25, everything runs great. When it returns anything over 410 rows, it seems to want to re-sort everything (according to the execution plan) and takes about 4 minutes. SELECT TOP 409 works great as well. It has gotten very confusing and was wondering if anyone might have any ideas as to why something like this would slow it down. If we remove rows with a more narrow WHERE clause, it seems to re-sort as well. With this being the case, none of the paging routines (temp table, different TOP SELECTS) will work. This is a dynamiccaly generated SQL statement based on sort criteria chosen at the client. The lists are used in real time, and cannot have very long query times. All where and join fields are indexed appropriately, and it looks like we have the correct indexes, etc. that we’re needing. Any help or ideas will be greatly appreciated!
if there a difference in the execution plans between 409 and 410 rows?
in the above case, the plan should be a loop join for low row counts up to a certain point, at which it may want to try hash or merge joins,
however, if the indexes & keys are not setup correctly, the hash or merge could be really bad.
same with parallel execution and HT enabled.
also, check the relative cost of the query at TOP 100, 150, 200, 400, 410

Yes, the execution plan is definitely different between 409 and 410, and you are right that at 410 it is using a hash/inner join instead of a loop join. So this is due to indexes? We have indexes on all fields that are joined and filtered on, and some multi-column indexes for the multi-column joins, etc. I’m not sure what else to do with those.
Run Index Tunning Wizard from SQL Analyzer to see if there is more recomendations about indexs,
Using differents TOP.
Luis Martin …Thus mathematics may be defined as the subject in which we never know what we are talking about, nor whether what we are saying is true.
Bertrand Russell
Not a huge thing, but you
LEFT JOIN tblItemFields ifs2 and then use a non-null where clause on a field from this table (and ifs2.fieldtypeid = 252).
Which means it will only return inner join rows. So if you really want this where clause, you can remove the left join, or if you really want the left join then changes your where to and (ifs2.fieldtypeid = 252 or ifs2.fieldtypeid is null) or move the where part of this one up to the on part of the join clause. This may or may now make a difference, but its cleaner code. Chris
Oh, and one more thing, what is the execution plan if you dont use Top x (ie all rows). chris
The LEFT JOIN was actually supposed to be joined on the ifs2.fieldtypeid = 252 as well, it just got moved while trying to optimize the query, and never got moved back. That join should actually be: LEFT JOIN tblItemFields ifs2 ON i.ItemId = ifs2.itemId AND ifs2.FieldTypeId = 252 The execution plan for all rows is the same as 410 and above. I have run the index tuner but id doesn’t suggest anything different. I’m at a loss here, and can’t figure this one out. It’s very high priority, but nothing seems to help.
I have found that if I remove the ORDER clause it works great at all TOPs that I specify which would probably speed up the paging. While this isn’t an option, it may help in diagnosing what is going on!
Yes and no. Do you have an index on ItemFieldDefaultId? Is it clustered? If you have a clustered index on this field, sql whould use the clustered index, for sorting/selecting. If you dont have one, then sql will pick what seems best. Can you test adding a clustered index if you dont have one?
You could also perhaps try change the query a little
a)
SELECT TOP 25
i.ItemID,
ifd2.ItemFieldDefaultId
FROM tblItemFieldDefaults ifd2
INNER JOIN tblItemFields ifs2
RIGHT JOIN tblItems i
INNER JOIN tblObjectRelations ors
INNER JOIN tblUserSecurity us
ON us.UserSecurityUserId = 1076
AND us.UserSecurityObjectId = ors.ObjectId
AND us.UserSecurityValue & 1 = 1
on ors.objectid = i.itemparentid
AND ors.L > 1
AND ors.R < 200000
AND ors.ObjectStatus IN (1,4)
AND ors.ObjectTypeID=6
ON i.ItemId = ifs2.ItemId
AND i.ItemCreationDate <= ‘Dec 31 2003 12:00PM’
AND i.itemrootid = 1100
ON ifd2.ItemFieldTypeId = ifs2.FieldTypeId
AND ifd2.ItemFieldDefaultId = ifs2.FieldValue
AND ifd2.ItemFieldProjectId = ifs2.itemrootid
AND ifs2.fieldtypeid = 252
ORDER BY
ifd2.ItemFieldDefaultId DESC and b) force a hash join where you think it should be and c) assuming that there is no other possible criteria for the tblItemFieldDefaults table, add a non-clustered index ( ItemFieldDefaultId, ItemFieldTypeId, ItemFieldProjectId ) in that order. If necessary try forcing it and forceplan. NOT sure how this would fit in with dynamically generating the statement though… how varied are the possible combination of where clauses and order by statements? If you solve the performance for this one then do you have another 10, 100, 1000 permutations to also work on? PS which table is the large one? Is it tblitems? If so then you could also try SELECT TOP 25
i.ItemID,
ifd2.ItemFieldDefaultId
FROM tblItemFieldDefaults ifd2
INNER JOIN tblItemFields ifs2
RIGHT JOIN (SELECT
i.ItemID
FROM tblItems i
INNER JOIN tblObjectRelations ors
INNER JOIN tblUserSecurity us
ON us.UserSecurityUserId = 1076
AND us.UserSecurityObjectId = ors.ObjectId
AND us.UserSecurityValue & 1 = 1
on ors.objectid = i.itemparentid
AND ors.L > 1
AND ors.R < 200000
AND ors.ObjectStatus IN (1,4)
AND ors.ObjectTypeID=6
WHERE i.ItemCreationDate <= ‘Dec 31 2003 12:00PM’
AND i.itemrootid = 1100 ) i
ON i.ItemId = ifs2.ItemId
ON ifd2.ItemFieldTypeId = ifs2.FieldTypeId
AND ifd2.ItemFieldDefaultId = ifs2.FieldValue
AND ifd2.ItemFieldProjectId = ifs2.itemrootid
AND ifs2.fieldtypeid = 252
ORDER BY
ifd2.ItemFieldDefaultId DESC

Since these are auto-generated, the sort could (unfortunately) come many different fields in many different tables, so yes, this is just one example of a query that will need to be run. Everything in the application is very generic, so we never know what we’re going to be sorting on. Some of the fields will be SQL_VARIANTS as well. I am away until Monday, so I will try these suggestions at that time and see if they work. Thank you to everyone for your suggestions so far!
I have a clustered index on the three fields that are searched on ItemFieldDefaults, and I changed them to the order in which it was formed, and it has changed the number from 410 to something higher. It is definitely the Hash Match that is killing me. Anytime that is used it is incredibly slow. I do notice that every time it does the hash join, it is sorting afterwards. When it does the nested join, it is sorting before.
Correction on that olast post, when doing the loop join, it does not do a sort at all, so I’m assuming it’s using the clustered index in this case to do the sorting.

Hi ya, you could force the loop join by adding the word loop to which ever of the join conditions is causing you the problem. I have had to do this for some of my queries too… Cheers
Twan
]]>