fastest where clause for a table scan | SQL Server Performance Forums

SQL Server Performance Forum – Threads Archive

fastest where clause for a table scan

I am designing a database and an application and am considering query performance. I have the option of querying on: 1 a list of ints i.e. where column = 1 or column = 3 or column = 7
2 a range of ints i.e. where column >=1 and column <= 10
3 a binary operation on an int i.e. where (column and 0x000000ff) = 10
4 a like operation on a string i.e. where column like ‘%something%’ My tests seem to show 2 and 3 have the least server CPU cost and are the fastest. Any recommendations? Thanks, Simon
It also depends on any other indexes present, on the primary key, on the column list of your query, etc. You may be able to improve #1 by doing SELECT … FROM tbl WHERE column = 1
UNION ALL SELECT … FROM tbl WHERE column = 3
UNION ALL SELECT … FROM tbl WHERE column = 7 #4 will always be slow because your criteria string starts with a wildcard. Unless you want to add a full-text index, and the syntax to go with that, you won’t be able to improve response time.
… and by the way: if SQL does a table scan for a query with criteria, it is an indication that the indexes are not appropriate for the query. For better performance, you want SQL to do an index seek for the criteria. This only happens if the indexes are designed to cover your criteria.
Thanks for that – in reality the column would be indexed but not PK. Would that make any difference to how each of the where clauses perform against each other? Re #4 would it be quicker to do [where SUBSTRING(column, 5, 14) = ‘something’] ?
<blockquote id="quote"><font size="1" face="Verdana, Arial, Helvetica" id="quote">quote:<hr height="1" noshade id="quote"><br />Re #4 would it be quicker to do [where SUBSTRING(column, 5, 14) = ‘something’] ?<br /><hr height="1" noshade id="quote"></font id="quote"></blockquote id="quote"><br />No, most likely not, since this is not sargeable and the existing index wouldn’t be of any use anyway.<br /><br />SQL Server can use multiple indexes to satisfy a query. Search for "index intersection" here on the site. [<img src=’/community/emoticons/emotion-1.gif’ alt=’:)‘ />]<br /><br />–<br />Frank Kalis<br />Moderator<br />Microsoft SQL Server MVP<br />Webmaster:<a target="_blank" href=http://www.insidesql.de>http://www.insidesql.de</a>
quote:Originally posted by Adriaan
You may be able to improve #1 by doing SELECT … FROM tbl WHERE column = 1
UNION ALL SELECT … FROM tbl WHERE column = 3
UNION ALL SELECT … FROM tbl WHERE column = 7

Adding the union clauses seemed to slow things down according to the ISQL server trace (156ms as opposed to 46ms for the ORed select). Is there an efficiency tradeoff on the number of values used – I’m using 20 in my tests?
quote:Originally posted by Adriaan You may be able to improve #1 by doing SELECT … FROM tbl WHERE column = 1
UNION ALL SELECT … FROM tbl WHERE column = 3
UNION ALL SELECT … FROM tbl WHERE column = 7
I don’t think that’s correct. Roji. P. Thomas
Microsoft SQL Server MVP
http://toponewithties.blogspot.com

AFAIK SQL Server will sometimes use UNION ALL in the execution plan if you have an OR list like that. I guess it all depends largely on the selectivity of the index for the given values.
quote:Originally posted by Adriaan AFAIK SQL Server will sometimes use UNION ALL in the execution plan if you have an OR list like that. I guess it all depends largely on the selectivity of the index for the given values.
May be, But definitely not in the above mentioned scenario. Whether or not Optimizer internally replace an OR to UNION ALL, UNION ALL is a better option in the following scenario. SELECT OrderID FROM Orders
WHERE OrderDate > ‘19960710’
OR CustomerID = ‘VINET’ SELECT OrderID FROM Orders
WHERE OrderDate > ‘19960710’
UNION –ALL
SELECT OrderID FROM Orders
WHERE CustomerID = ‘VINET’ Infact here we are forcing Index Intersection, which performs far better than an index scan. Roji. P. Thomas
Microsoft SQL Server MVP
http://toponewithties.blogspot.com

Roji, thanks for the clarification.[<img src=’/community/emoticons/emotion-1.gif’ alt=’:)‘ />]
If you want to find the best way to rewrite a Query, let SQL Optimizer for Visual Studio find it for your. If you go into options and turn the intelligence up to 10, which takes a little longer it will go through 1,000’s of options and determine the best way to rewrite the SQL for your environment. You’ll also we able to look at all the version that it created and see how each rewrite affects the SQL Exectuion Plan. You can give it a free try athttp://www.extensibles.com/modules.php?name=Products&op=SSP The Relentless One
No Bug says alive http://www.extensibles.com
<blockquote id="quote"><font size="1" face="Verdana, Arial, Helvetica" id="quote">quote:<hr height="1" noshade id="quote"><i>Originally posted by TherelentlessOne</i><br /><br />If you want to find the best way to rewrite a Query, let SQL Optimizer for Visual Studio find it for your. If you go into options and turn the intelligence up to 10, which takes a little longer it will go through 1,000’s of options and determine the best way to rewrite the SQL for your environment. You’ll also we able to look at all the version that it created and see how each rewrite affects the SQL Exectuion Plan. You can give it a free try at<a target="_blank" href=http://www.extensibles.com/modules.php?name=Products&op=SSP>http://www.extensibles.com/modules.php?name=Products&op=SSP</a><br /><hr height="1" noshade id="quote"></font id="quote"></blockquote id="quote"><br />No software is as good as a performance consultant [<img src=’/community/emoticons/emotion-2.gif’ alt=’:D‘ />]<br /><br />Roji. P. Thomas<br /<a target="_blank" href=http://toponewithties.blogspot.com>http://toponewithties.blogspot.com</a><br />
One other thing, range queries work best with non-clustered indexes I am told<img src=’/community/emoticons/emotion-1.gif’ alt=’:)‘ /><br /><br />and about SQL optimizer.. tried that and many of its "best" suggestions were slower than my home grown code<img src=’/community/emoticons/emotion-1.gif’ alt=’:)‘ /> and sometimes the query it makes provides different output so watch out.. it is buggy<img src=’/community/emoticons/emotion-1.gif’ alt=’:)‘ /><br /><br /><br />Michael B<br />Sr. DBA<br /><br />"The fear of the Lord is the beginning of knowledge,<br />but fools despise wisdom and instruction." Proverbs 1:7
i think simon may be long gone on this
but my understanding of the Optimizer in question is that it will try to find the best way to write the exact query provided. sometimes, it is necessary to ask the question: what is the right query for my business logic, for which may involve changes that a software tool cannot consider
Michael, that is an assumption that could start a big discussion here. If you check BOL, it says that clustered indexes are best for range queries. If you check the regulars here, the consensus seems to be that the best use for a clustered index is to be based on an identity column that is also the primary key. Because all nonclustered indexes include the clustered index key, the clustered index itself should really have a small footprint (hence the identity field). With an identity value for the clustered index, you can be sure that all new entries are added at the top of the index, never anywhere else, and that the key values never change. With most composite clustered indexes, you see much the opposite of all that. Assuming that your nonclustered indexes are in good health (regular maintenance) then range queries will profit from the ideal clustered index even if the filtering is not on the key to that clustered index.
<blockquote id="quote"><font size="1" face="Verdana, Arial, Helvetica" id="quote">quote:<hr height="1" noshade id="quote"><i>Originally posted by TherelentlessOne</i><br /><br />If you want to find the best way to rewrite a Query, let SQL Optimizer for Visual Studio find it for your. If you go into options and turn the intelligence up to 10, which takes a little longer it will go through 1,000’s of options and determine the best way to rewrite the SQL for your environment. You’ll also we able to look at all the version that it created and see how each rewrite affects the SQL Exectuion Plan. You can give it a free try at<a target="_blank" href=http://www.extensibles.com/modules.php?name=Products&op=SSP>http://www.extensibles.com/modules.php?name=Products&op=SSP</a><br /><hr height="1" noshade id="quote"></font id="quote"></blockquote id="quote"><br /><br />Is there any similar product available for SPs? Or maybe as a stand-alone product? <br />Would be good if I could write my query with a couple of possible variations of input parameters and then it will search for the best. <img src=’/community/emoticons/emotion-1.gif’ alt=’:)‘ />
quote:Originally posted by simon9813 I am designing a database and an application and am considering query performance. I have the option of querying on: 1 a list of ints i.e. where column = 1 or column = 3 or column = 7
I wrote an article about handling such problems:http://www.sql-server-performance.com/mm_list_random_values.asp
There is also the famous article:http://www.sommarskog.se/arrays-in-sql.html
That article offers many different solutions but is missing one from mine. Erland’s article is a must read if you are dealing with that kind of problems.
quote:
2 a range of ints i.e. where column >=1 and column <= 10
This is the preferable solution because it can range scan the index on the column.
quote:
3 a binary operation on an int i.e. where (column and 0x000000ff) = 10
If you have an index on the column better use range query if possible.
quote:
4 a like operation on a string i.e. where column like ‘%something%’
Avoid it. See Frank’s explanation.

]]>