SQL Server Performance

Using Functions on right side of a WHERE clause

Discussion in 'Performance Tuning for DBAs' started by bekkens, Jun 28, 2006.

  1. bekkens New Member

    I want to know why SQL Server generates 2 different execution plans for the 2 queries below ?<br />On big datavolumes the execution time difference between the 2 queries is very big and not acceptable.<br /><br />Is there a way to use the second query but force the first execution plan ?<br /><br />Thanks.<br /><br />Functionality<br />-------------<br />The estimated_start_time column contains the UTC datetime (and is indexed).<br />Reporting users must give the desired time range but specifying it in local datetime.<br />The UDF <b>dbo.WE2TEST </b> just converts the given local datetime into UTC datetime. In this way SQL Server should be able to use the index on the estimated_start_time column.<br /><br />Query 1<br />-------<br />SELECT *<br /> FROM dbo.POM_ENTRY<br /> WHERE estimated_start_time BETWEEN <b>dbo.WE2TEST</b>(CONVERT(datetime,'2006-04-05 14:00:00',120)) AND <b>dbo.WE2TEST</b>(CONVERT(datetime,'2006-04-05 16:00:00',120))<br /><br />Query 2<br />-------<br />SELECT *<br /> FROM POM_ENTRY<br /> WHERE estimated_start_time BETWEEN CONVERT(datetime,'2006-04-05 15:00:00',120) AND CONVERT(datetime,'2006-04-05 17:00:00',120)<br /><br />Execution Plan 1<br />----------------<br />select * from POM_ENTRY WHERE estimated_start_time BETWEEN dbo.WE2TEST(CONVERT(datetime,'2006-04-05 14:00:00',120)) AND dbo.WE2TEST(CONVERT(datetime,'2006-04-05 16:00:00',120))<br /> |--Filter(WHERE<img src='/community/emoticons/emotion-6.gif' alt=':(' />[POM_ENTRY].[estimated_start_time]&gt;=[dbo].[WE2TEST]('Apr 5 2006 2:00PM') AND [POM_ENTRY].[estimated_start_time]&lt;=[dbo].[WE2TEST]('Apr 5 2006 4:00PM')))<br /> |--Clustered Index Scan(OBJECT<img src='/community/emoticons/emotion-6.gif' alt=':(' />[IVECO].[dbo].[POM_ENTRY].[pk_pom_entry]))<br /><br />Execution Plan 2<br />----------------<br />select * from POM_ENTRY WHERE estimated_start_time BETWEEN CONVERT(datetime,'2006-04-05 15:00:00',120) AND CONVERT(datetime,'2006-04-05 17:00:00',120)<br /> |--Bookmark Lookup(BOOKMARK<img src='/community/emoticons/emotion-6.gif' alt=':(' />[Bmk1000]), OBJECT<img src='/community/emoticons/emotion-6.gif' alt=':(' />[IVECO].[dbo].[POM_ENTRY]))<br /> |--Index Seek(OBJECT<img src='/community/emoticons/emotion-6.gif' alt=':(' />[IVECO].[dbo].[POM_ENTRY].[ix_pom_entry_start_time]), SEEK<img src='/community/emoticons/emotion-6.gif' alt=':(' />[POM_ENTRY].[estimated_start_time] &gt;= 'Apr 5 2006 3:00PM' AND [POM_ENTRY].[estimated_start_time] &lt;= 'Apr 5 2006 5:00PM') ORDERED FORWARD)<br />
  2. Adriaan New Member

    Using a UDF in a WHERE clause is the same as telling SQL Server to totally ignore indexes. The UDF must process the date on each and every row, before the BETWEEN criteria can be applied.

    Without the UDFs, SQL Server can look at the index to find entries between the given dates, and ignore the rest. The main thing is that it knows what to expect from the CONVERT function.

    By the way, you don't have to use CONVERT here, as your dates are in the best possible format already:

    WHERE estimated_start_time BETWEEN '2006-04-05 15:00:00' AND '2006-04-05 17:00:00'
  3. bekkens New Member

    Thanks for the fast reply.<br /><br />In my opinion MS SQL Server 2000 does not work in an optimal way.<br /><br />I can understand that the optimizer can not use his indexes when a UDF is applied on the left side of the WHERE clause: WHERE UDF(column) BETWEEN A AND B.<br />But if you put the opposite function on the right side WHERE column BETWEEN UDF(A) AND UDF(B) then the optimizer should take this into account.<br /><br />It has no sense to calculate it for each and every row since it is a deterministic function and the values A and B do not change.<br />He should only calculate it once and use this result to make the filtering<br /><br />Maybe I expect too much from MS <img src='/community/emoticons/emotion-1.gif' alt=':)' /><br /><br />So, if I understand correctly, SQL Server will perform the calculation for each row !<br /><br /><br />
  4. Adriaan New Member

    All that SQL knows is that the UDF will return a date (otherwise there will be a run-time error) but it doesn't know how this returned date will relate to the date that is entered on the column.

    Because the result is unpredictable to SQL Server (I can see how it is predictable to you, but that is an entirely different matter) SQL Server assumes that the index is of no use.

    How are you calculating UTC - does your table have entries from locations in different time zones, and are you adding/subtracting a number of hours from the date on the column, according to the location?

    Of course you can do this in-line, and without a UDF. It really depends on how and where you're querying the UTC datetime parameters. You can't do a straight DATEADD() on the column in the WHERE statement, since it will have the same effect on performance as the UDF.
  5. mmarovic Active Member

    quote:Originally posted by Adriaan

    Using a UDF in a WHERE clause is the same as telling SQL Server to totally ignore indexes. The UDF must process the date on each and every row, before the BETWEEN criteria can be applied.

    Without the UDFs, SQL Server can look at the index to find entries between the given dates, and ignore the rest. The main thing is that it knows what to expect from the CONVERT function.

    By the way, you don't have to use CONVERT here, as your dates are in the best possible format already:

    WHERE estimated_start_time BETWEEN '2006-04-05 15:00:00' AND '2006-04-05 17:00:00'
    Adriaan, this is still range query, column is not wrapped into udf, range limits are. So index can be used, but statistics can't. I think, in the query 2, query optimizer actually decoded what starting and ending valuer are, so it used statistics and figured out index on date provides the best execution plan. First query can be forced to use that index using index query hint:



    SELECT *
    FROM dbo.POM_ENTRY (index = ix_pom_entry_start_time)
    WHERE estimated_start_time BETWEEN dbo.WE2TEST(CONVERT(datetime,'2006-04-05 14:00:00',120)) AND dbo.WE2TEST(CONVERT(datetime,'2006-04-05 16:00:00',120))

  6. merrillaldrich New Member

    The problem is (I think) that SQL Server has no way to tell that the result of your UDF is a constant - if it could, it could then resolve it to a constant value once and use that result on the whole set. Otherwise, it has to evaluate for each row.

    e.g. '2006-06-22' is a constant, but the result of f( '2006-06-22' ) may not be a constant for some possible function body. If there can be a case where it is not a constant result, then the server needs a way to know that, in order to tell whether your particular function yields a constant result or not.
  7. Adriaan New Member

    Hm yes - I got carried away with the UTC thingy.

    The best option is probably this - note that the date literals are already in the 120 format:

    DECLARE @Dt1 DATETIME, @Dt2 DATETIME

    SET @Dt1 = dbo.WE2TEST('2006-04-05 14:00:00')
    SET @Dt2 = dbo.WE2TEST('2006-04-05 16:00:00')

    SELECT *
    FROM dbo.POM_ENTRY
    WHERE estimated_start_time BETWEEN @Dt1 AND @Dt2

Share This Page