SQL Server Performance

fastest where clause for a table scan

Discussion in 'T-SQL Performance Tuning for Developers' started by simon9813, Sep 8, 2006.

  1. simon9813 New Member

    I am designing a database and an application and am considering query performance.

    I have the option of querying on:

    1 a list of ints i.e. where column = 1 or column = 3 or column = 7
    2 a range of ints i.e. where column >=1 and column <= 10
    3 a binary operation on an int i.e. where (column and 0x000000ff) = 10
    4 a like operation on a string i.e. where column like '%something%'

    My tests seem to show 2 and 3 have the least server CPU cost and are the fastest.

    Any recommendations?

    Thanks,

    Simon
  2. Adriaan New Member

    It also depends on any other indexes present, on the primary key, on the column list of your query, etc.

    You may be able to improve #1 by doing

    SELECT ... FROM tbl WHERE column = 1
    UNION ALL SELECT ... FROM tbl WHERE column = 3
    UNION ALL SELECT ... FROM tbl WHERE column = 7

    #4 will always be slow because your criteria string starts with a wildcard. Unless you want to add a full-text index, and the syntax to go with that, you won't be able to improve response time.
  3. Adriaan New Member

    ... and by the way: if SQL does a table scan for a query with criteria, it is an indication that the indexes are not appropriate for the query.

    For better performance, you want SQL to do an index seek for the criteria. This only happens if the indexes are designed to cover your criteria.
  4. simon9813 New Member

    Thanks for that - in reality the column would be indexed but not PK. Would that make any difference to how each of the where clauses perform against each other?

    Re #4 would it be quicker to do [where SUBSTRING(column, 5, 14) = 'something'] ?
  5. FrankKalis Moderator

    <blockquote id="quote"><font size="1" face="Verdana, Arial, Helvetica" id="quote">quote:<hr height="1" noshade id="quote"><br />Re #4 would it be quicker to do [where SUBSTRING(column, 5, 14) = 'something'] ?<br /><hr height="1" noshade id="quote"></font id="quote"></blockquote id="quote"><br />No, most likely not, since this is not sargeable and the existing index wouldn't be of any use anyway.<br /><br />SQL Server can use multiple indexes to satisfy a query. Search for "index intersection" here on the site. [<img src='/community/emoticons/emotion-1.gif' alt=':)' />]<br /><br />--<br />Frank Kalis<br />Moderator<br />Microsoft SQL Server MVP<br />Webmaster:<a target="_blank" href=http://www.insidesql.de>http://www.insidesql.de</a>
  6. simon9813 New Member

    quote:Originally posted by Adriaan


    You may be able to improve #1 by doing

    SELECT ... FROM tbl WHERE column = 1
    UNION ALL SELECT ... FROM tbl WHERE column = 3
    UNION ALL SELECT ... FROM tbl WHERE column = 7


    Adding the union clauses seemed to slow things down according to the ISQL server trace (156ms as opposed to 46ms for the ORed select). Is there an efficiency tradeoff on the number of values used - I'm using 20 in my tests?
  7. Roji. P. Thomas New Member

    quote:Originally posted by Adriaan

    You may be able to improve #1 by doing

    SELECT ... FROM tbl WHERE column = 1
    UNION ALL SELECT ... FROM tbl WHERE column = 3
    UNION ALL SELECT ... FROM tbl WHERE column = 7

    I don't think that's correct.

    Roji. P. Thomas
    Microsoft SQL Server MVP
    http://toponewithties.blogspot.com
  8. Adriaan New Member

    AFAIK SQL Server will sometimes use UNION ALL in the execution plan if you have an OR list like that. I guess it all depends largely on the selectivity of the index for the given values.
  9. Roji. P. Thomas New Member

    quote:Originally posted by Adriaan

    AFAIK SQL Server will sometimes use UNION ALL in the execution plan if you have an OR list like that. I guess it all depends largely on the selectivity of the index for the given values.
    May be, But definitely not in the above mentioned scenario.

    Whether or not Optimizer internally replace an OR to UNION ALL, UNION ALL is a better option in the following scenario.

    SELECT OrderID FROM Orders
    WHERE OrderDate > '19960710'
    OR CustomerID = 'VINET'

    SELECT OrderID FROM Orders
    WHERE OrderDate > '19960710'
    UNION --ALL
    SELECT OrderID FROM Orders
    WHERE CustomerID = 'VINET'

    Infact here we are forcing Index Intersection, which performs far better than an index scan.

    Roji. P. Thomas
    Microsoft SQL Server MVP
    http://toponewithties.blogspot.com
  10. Adriaan New Member

    Roji, thanks for the clarification.[<img src='/community/emoticons/emotion-1.gif' alt=':)' />]
  11. TherelentlessOne New Member

    If you want to find the best way to rewrite a Query, let SQL Optimizer for Visual Studio find it for your. If you go into options and turn the intelligence up to 10, which takes a little longer it will go through 1,000's of options and determine the best way to rewrite the SQL for your environment. You'll also we able to look at all the version that it created and see how each rewrite affects the SQL Exectuion Plan. You can give it a free try athttp://www.extensibles.com/modules.php?name=Products&op=SSP



    The Relentless One
    No Bug says alive

    http://www.extensibles.com
  12. Roji. P. Thomas New Member

    <blockquote id="quote"><font size="1" face="Verdana, Arial, Helvetica" id="quote">quote:<hr height="1" noshade id="quote"><i>Originally posted by TherelentlessOne</i><br /><br />If you want to find the best way to rewrite a Query, let SQL Optimizer for Visual Studio find it for your. If you go into options and turn the intelligence up to 10, which takes a little longer it will go through 1,000's of options and determine the best way to rewrite the SQL for your environment. You'll also we able to look at all the version that it created and see how each rewrite affects the SQL Exectuion Plan. You can give it a free try at<a target="_blank" href=http://www.extensibles.com/modules.php?name=Products&op=SSP>http://www.extensibles.com/modules.php?name=Products&op=SSP</a><br /><hr height="1" noshade id="quote"></font id="quote"></blockquote id="quote"><br />No software is as good as a performance consultant [<img src='/community/emoticons/emotion-2.gif' alt=':D' />]<br /><br />Roji. P. Thomas<br /<a target="_blank" href=http://toponewithties.blogspot.com>http://toponewithties.blogspot.com</a><br />
  13. MichaelB Member

    One other thing, range queries work best with non-clustered indexes I am told<img src='/community/emoticons/emotion-1.gif' alt=':)' /><br /><br />and about SQL optimizer.. tried that and many of its "best" suggestions were slower than my home grown code<img src='/community/emoticons/emotion-1.gif' alt=':)' /> and sometimes the query it makes provides different output so watch out.. it is buggy<img src='/community/emoticons/emotion-1.gif' alt=':)' /><br /><br /><br />Michael B<br />Sr. DBA<br /><br />"The fear of the Lord is the beginning of knowledge,<br />but fools despise wisdom and instruction." Proverbs 1:7
  14. joechang New Member

    i think simon may be long gone on this
    but my understanding of the Optimizer in question is that it will try to find the best way to write the exact query provided.

    sometimes, it is necessary to ask the question: what is the right query for my business logic, for which may involve changes that a software tool cannot consider
  15. Adriaan New Member

    Michael, that is an assumption that could start a big discussion here.

    If you check BOL, it says that clustered indexes are best for range queries.

    If you check the regulars here, the consensus seems to be that the best use for a clustered index is to be based on an identity column that is also the primary key. Because all nonclustered indexes include the clustered index key, the clustered index itself should really have a small footprint (hence the identity field). With an identity value for the clustered index, you can be sure that all new entries are added at the top of the index, never anywhere else, and that the key values never change. With most composite clustered indexes, you see much the opposite of all that.

    Assuming that your nonclustered indexes are in good health (regular maintenance) then range queries will profit from the ideal clustered index even if the filtering is not on the key to that clustered index.
  16. brimba New Member

    <blockquote id="quote"><font size="1" face="Verdana, Arial, Helvetica" id="quote">quote:<hr height="1" noshade id="quote"><i>Originally posted by TherelentlessOne</i><br /><br />If you want to find the best way to rewrite a Query, let SQL Optimizer for Visual Studio find it for your. If you go into options and turn the intelligence up to 10, which takes a little longer it will go through 1,000's of options and determine the best way to rewrite the SQL for your environment. You'll also we able to look at all the version that it created and see how each rewrite affects the SQL Exectuion Plan. You can give it a free try at<a target="_blank" href=http://www.extensibles.com/modules.php?name=Products&op=SSP>http://www.extensibles.com/modules.php?name=Products&op=SSP</a><br /><hr height="1" noshade id="quote"></font id="quote"></blockquote id="quote"><br /><br />Is there any similar product available for SPs? Or maybe as a stand-alone product? <br />Would be good if I could write my query with a couple of possible variations of input parameters and then it will search for the best. <img src='/community/emoticons/emotion-1.gif' alt=':)' />
  17. mmarovic Active Member

    quote:Originally posted by simon9813

    I am designing a database and an application and am considering query performance.

    I have the option of querying on:

    1 a list of ints i.e. where column = 1 or column = 3 or column = 7
    I wrote an article about handling such problems:http://www.sql-server-performance.com/mm_list_random_values.asp
    There is also the famous article:http://www.sommarskog.se/arrays-in-sql.html
    That article offers many different solutions but is missing one from mine. Erland's article is a must read if you are dealing with that kind of problems.


    quote:
    2 a range of ints i.e. where column >=1 and column <= 10
    This is the preferable solution because it can range scan the index on the column.


    quote:
    3 a binary operation on an int i.e. where (column and 0x000000ff) = 10
    If you have an index on the column better use range query if possible.


    quote:
    4 a like operation on a string i.e. where column like '%something%'
    Avoid it. See Frank's explanation.

Share This Page