SQL Server Performance

How make this WHERE sargeable?

Discussion in 'T-SQL Performance Tuning for Developers' started by joker197cinque, Jun 5, 2003.

  1. joker197cinque New Member

    WHERE Grafico IS NOT NULL AND RIGHT(Grafico, 3) IN ('jpg', 'jpe', 'gif')

    thx

    -- fabrizio
  2. bambola New Member

    how about creating a calulated column on RIGHT(Grafico, 3) with an index?

    Bambola.
  3. vbkenya New Member

    You don't even need the IS NOT NULL part of the WHERE clause. The IN('jpg','jpe','gif') precludes the selection of any other value apart of those 3 file types in your Grafico column.


    NHO
  4. gaurav_bindlish New Member

    I think this will work as

    WHERE Grafico like 'jpg%'
    OR Grafico like 'jpe%'
    OR Grafico like 'gif%'

    Not very sure if this is the optimal solution. Try executing the same and see the execution plna if this makes sense.

    HTH.

    Gaurav
  5. satya Moderator

    I thought this is appropriate forum, so moved the thread from Perf.Tuning for DBA's forum.

    _________
    Satya SKJ
  6. vbkenya New Member

    Gaurav, I think Joker197 is trying to pick only records with a Grafico (read Graphic filename extension - My Italian is obviuosly poor) value of either jpg , gif or jpe. You put the '%' after the desired file extension instead of at the beginning. Re-writing your solution, It would then read like this:

    WHERE Grafico like '%jpg'
    OR Grafico like '%jpe'
    OR Grafico like '%gif'

    This will work fine because the IN clause will not support the wildcard character.

    NHO
  7. gaurav_bindlish New Member

    Sorry about that I confused RIGHT() with LEFT(). See the solution suggested by vbkenya is alternative to what was initially posted in forum. But I would like to mention that the solution suggested will not have any SARG Conditions as for a string comparison to be SARG, the % operatoe should come at end. Let me explain it this way- if the query optimizer has to llok for record like '%jpg', it has to scan through all the records and hence the condition is non-sarg.

    Gaurav
  8. bambola New Member


    I tested it on a table of 240,000 rows, on a column of varchar(20) and here are the results:

    LIKE '%jpg' or Grafico like '%jpe' runs in 8000-8200 ms.

    RIGHT(Grafico, 3) IN ('jpg', 'jpe', 'gif') runs in 1700 - 1800 ms

    calculated column + INDEX on it
    calc_column IN ('jpg', 'jpe', 'gif') runs in 40-60 ms !

    Bambola.


  9. gaurav_bindlish New Member

    Maintaining this calculated index is also an overhead. This may be a concern if there are too many inserts in the table.

    Gaurav
  10. vbkenya New Member

    ....however, in Jokers case the logic involves looking at the last three characters (file extension) in that field. The other solution would be to break up the field into 2 columns, GraficoFilename and GraficoFileExtension and use the extension field for the query


    SELECT......WHERE GraficoExtension IN ('jpg','jpe','gif')

    or he could achieve almost the same results with


    SELECT.....WHERE GraficoExtension ='jpg'
    OR GraficoExtension ='jpe'
    OR GraficoExtension ='gif'


    OR even this shorter SARGable wildcard mask (Assuming that there are no other extensions that start with the letters 'jp'


    SELECT.....WHERE GraficoExtension ='gif'
    OR GraficoExtension LIKE 'jp%'


    Try this...But if you do, an index on the Graphicoextension field would benefit all the three queries.




    NHO
  11. bambola New Member

    It's a non-clustered index on char/vharchar(3). Doesn't seem too hard to maintain. In the time it takes to run 3 LIKE '%xxx' searched, I could run 3 searches and a DBCC DBREINDEX()... <img src='/community/emoticons/emotion-1.gif' alt=':)' /><br /><br />Bambola.
  12. vbkenya New Member

    Haven't tried those statistics yet but I believe that Joker should take that path or at least try it and tell us.

    NHO
  13. bradmcgehee New Member

    Great testing bambola. This is the kind of data I love to see.

    -----------------------------
    Brad M. McGehee, MVP
    Webmaster
    SQL-Server-Performance.Com
  14. bambola New Member

    Thanks Brad [:I]

    Bambola.
  15. gaurav_bindlish New Member

    Sorry I mentioned calculated index - I meant maintaining computed column. This means whenever I am doing a insert, I have to insert at an additional place (of course after preprocessing the data) and also the non-clustered index (which is not a big deal). Anyway, I think computed column is great idea if there are lot of queries running on the system.

    Gaurav
  16. vbkenya New Member

    If a computed column is implemented then the queries I posted above could still be used. The queries will require an index on this computed column.

    Please bear in mind the side-effects of creating an index on this type of column - like the ANSI_NULLS, ANSI_PADDING, ANSI_WARNINGS, ARITHABORT, CONCAT_NULL_YIELDS_NULL, QUOTED_IDENTIFIER and NUMERIC_ROUNDABORT settings for connections that will need to utilize that index.


    NHO
  17. seasider New Member

    I don't think that we are dealing with a calculated column here. It appears that the values are all filenames and you are trying to get the extension of each filename, is this correct?

    If it is then what you need to do is properly normalise your data. If this column was normalised then you would have columns: (i) filename; (ii) extension. This would resolve the issue and make searching on either column much easier.

    All you need to do to implement this is change the original insert into the table to insert into the 2 columns instead of one.
  18. vbkenya New Member

    Right. That's what I thought and suggested. Could Joker197cinque respond and clarify the usefulness of this line of thinking?

    NHO
  19. bambola New Member

    The idea with a calculated column was to find a solution that will not break the application logic. If changes to the database are possible, there are of course other solution to condiser like seasider suggested.

    In both cases, added column or calculated column, I'll stick with

    WHERE Grafico IN ('jpg','jpe','gif')

    without % before or after.

    Bambola.

Share This Page