SQL Server Performance

Response to the datetime article by Bryan Syverson

Discussion in 'Please Tell Us What You Think' started by bambola, Jun 9, 2003.

  1. bambola New Member

    <font face="Verdana">I would like to refer to the datetime article on this excelent site <a target="_blank" href=http://www.sql-server-performance.com/bs_date_time.asp>http://www.sql-server-performance.com/bs_date_time.asp</a>) written by Bryan Syverson. This article is based on some wrong assumptions. I would appreciate your comments about it. To make things simple, I included a few little scripts to prove my point.<br /><blockquote id="quote"><font size="1" face="Verdana, Arial, Helvetica" id="quote">quote:<hr height="1" noshade id="quote"><br />Datetime and smalldatetime are like the floating-point data types, float and real, in that they#%92re approximate numerics. That means the value retrieved from SQL Server may be different from the value that was originally stored. <br /><hr height="1" noshade id="quote"></blockquote id="quote"></font id="quote"><br />Reading the datetime as float, will not always result the correct value. <br /><pre><font color="blue">DECLARE @d datetime<br />SELECT @d = <font color="red">'1900-01-24 13:12:10.990'</font id="red"><br />SELECT @d as_datetime, <font color="purple">CONVERT</font id="purple">(datetime, <font color="purple">CAST</font id="purple">(@d as float)) as_float</font id="blue"><br /><br /><font color="green">Results are: 1900-01-24 13:12:10.<b>990</b> as_datetime, 1900-01-24 13:12:10.<b>987</b> as_float.</font id="green"></pre> <br />We can see that casting datetime to float and back result a value different than the initial. The reason is simple: Datetime and smalldatetime are not really floating point datatype. They are actually 2 4-bytes integers (or 2 2-bytes integers in case of smalldatetime). One represents the number of days past since 1900-01-01. The other, the number of clock ticks past since midnight (1 second = 300 clock ticks). <br /><br />To calculate the number of days past since 1900-01-01<br /><pre><font color="green">-- Calculate number of days since 1900-01-01</font id="green"><br /><font color="blue">SELECT <font color="purple">DATEDIFF</font id="purple">(dd, 0, @d) as number_of_days</font id="blue"><br /><br /><font color="green">This results 23 days.</font id="green"></pre>Calculating the time part is a bit more tricky. We have to first calculate the number of second past since midnight, multiply is by 300 clock ticks and add the fruction of the milliseconds:<br /><pre><font color="green">-- Calculate the number of clock ticks:</font id="green"><br /><font color="blue">SELECT <br /> (<br /> <font color="purple">DATEPART</font id="purple">(hour, @d) * 60 * 60<br /> + <font color="purple">DATEPART</font id="purple">(minute, @d) * 60<br /> + <font color="purple">DATEPART</font id="purple">(second, @d)<br /> ) * 300 <br /> + <font color="purple">CAST</font id="purple">(<font color="purple">DATEPART</font id="purple">(ms,@d)/1000.*300 as int) as number_of_clock_ticks</font id="blue"><br /><br /><font color="green">This results 14259297 clock ticks since midnight.</font id="green"></pre>Let's try to read the 2 integers that are composing the datetime. <pre><font color="blue"><br />SELECT <font color="purple">CAST</font id="purple">(<font color="purple">SUBSTRING</font id="purple">(<font color="purple">CONVERT</font id="purple">(binary(<img src='/community/emoticons/emotion-11.gif' alt='8)' />, @d), 1, 4) as int) date_part_as_int, -- first 4 bytes<br /> <font color="purple">CAST</font id="purple">(<font color="purple">SUBSTRING</font id="purple">(<font color="purple">CONVERT</font id="purple">(binary(<img src='/community/emoticons/emotion-11.gif' alt='8)' />, @d), 5, <img src='/community/emoticons/emotion-11.gif' alt='8)' /> as int) time_part_as_int -- last 4 bytes</font id="blue"><br /><br /><font color="green">The results are 23 days and 14259297 clock ticks which are the <b>exact</b> same values calculated above.</font id="green"></pre>And reverse back to datetime:<br /><pre><font color="blue">SELECT <font color="purple">CONVERT</font id="purple">(datetime, <font color="purple">CAST</font id="purple">(23 as binary(4)) + <font color="purple">CAST</font id="purple">(14259297 as binary(4)), 120)</font id="blue"><br /><br /><font color="green">Results 1900-01-24 13:12:10.990 which <b>is</b> the initial vaule.</font id="green"></pre><br />Note that for this example I used .990 as milliseconds to avoid rounding problems. Milliseconds are always rounded to .000 .003 or .007 in following way:<br /><pre><br />.990 .991 --&gt; .990<br />.992 .993 .994 --&gt; .993<br />.995 .996 .997 .998 --&gt; .997<br />.999 --&gt; .000</pre><pre><font color="blue"> <br />DECLARE @d1 datetime, @d2 datetime<br />SELECT @d1 = <font color="red">'2002-02-02 23:59:59:998'</font id="red"><br /> , @d2 = <font color="red">'2002-02-02 23:59:59:999'</font id="red"><br />SELECT @d1, @d2 <br /> , <font color="purple">CAST</font id="purple">(<font color="purple">FLOOR</font id="purple">(<font color="purple">CAST</font id="purple">(@d1 AS float))AS datetime) <br /> , <font color="purple">CAST</font id="purple">(<font color="purple">FLOOR</font id="purple">(<font color="purple">CAST</font id="purple">(@d2 AS float))AS datetime) </font id="blue"><br /><font color="green"> <br />Results: 2002-02-02 00:00:00.000 2002-02-<b>03</b> 00:00:00.000<br /> 2002-02-02 00:00:00.000 2002-02-<b>03</b> 00:00:00.000</font id="green"></pre><br />This assignment might not have much sense: you are not likely to find yourself asigning 998 milliseconds to a datetime variable. But you can see that rounding occures <b>before</b> asigning the value. And the value retrieved is <b>never</b> different from the one stored. <br /> <br />To see the difference between the conversion to float and int, run this script:<br /><pre><font color="blue"> <br />SET NOCOUNT ON<br /><br />DECLARE @x datetime<br />DECLARE @t table <br /> ( base_date datetime<br /> , as_float datetime<br /> , as_binary binary(<img src='/community/emoticons/emotion-11.gif' alt='8)' /><br /> )<br />DECLARE @i int<br />SELECT @x = <font color="red">'1900-01-01 13:12:10.999'</font id="red"> <br />SELECT @i = 1<br />SELECT @x = <font color="purple">DATEADD</font id="purple">(millisecond, @i, @x)<br /><br /> <br />WHILE @i &lt;= 10<br />BEGIN<br /> SELECT @x = <font color="purple">DATEADD</font id="purple">(millisecond, @i, @x)<br /> INSERT INTO @t<br /> SELECT @x<br /> , <font color="purple">CONVERT</font id="purple">(datetime,cast(@x as float)) <br /> , <font color="purple">CONVERT</font id="purple">(binary(<img src='/community/emoticons/emotion-11.gif' alt='8)' />,@x)<br /> SELECT @i = @i + 1<br />END<br /> <br />SELECT <br /> base_date<br /> , as_float<br /> , <font color="purple">CASE </font id="purple">WHEN as_float &lt;&gt; base_date THEN 'WRONG' ELSE 'OK' END result_float <br /> , <font color="purple">CONVERT</font id="purple">(datetime, as_binary) binary_to_date<br /> , <font color="purple">CASE </font id="purple">WHEN <font color="purple">CONVERT</font id="purple">(datetime, as_binary) &lt;&gt; base_date THEN 'WRONG' ELSE 'OK' END result_binary<br />FROM @t</font id="blue"></pre> <br />There is not doubt that there are limitations to the datetime variable. I would have liked, for example, to be able to store values before 1753. I also cannot understand why millisecond are rounded and not stored correctly. However, as long as you are reading datetimes correctly, <u>the system always returns the same value stored</u>.<br /> <br />If you choose to refer to datetime at lower level, and avoid having the datetime algorytm run on all values (in some cases performance can improve dramatically!), you must read it as 2 integers according to the shown above. This is what it is - 2 integers. Not a float. <img src='/community/emoticons/emotion-1.gif' alt=':)' /><br /> <br />A few words about searching. SQL Server does not have seperate datatypes for date and time. As long as this is the case (and from what I have heard in the next version there will be), it would be wrong to refer to datetime as date only. <br /> <br />When searching a date range, we have to make sure our time portion is set correctly. If time is irelevant, start_date time portion should be set to '00:00:00: and the end_date to '23:59:59.997'.<br /> <br />This can be done in serveral ways:<br /> <br /> <pre>-<font color="blue"><font color="purple">CONVERT</font id="purple">(varchar(10), start_date, 120) </font id="blue"><br /> (ISO is always the best way to handle dates)</pre><br /> <pre>-<font color="blue"><font color="purple">CAST</font id="purple">(<font color="purple">SUBSTRING</font id="purple">(<font color="purple">CAST</font id="purple">(start_date as binary(<img src='/community/emoticons/emotion-11.gif' alt='8)' />), 1, 4) as datetime) </font id="blue"><br /> not that intuitive, but works much better than the above.</pre><br /> <pre>-<font color="blue"><font color="purple">DATEADD</font id="purple">(dd, <font color="purple">DATEDIFF</font id="purple">(dd, 0, start_date ), 0) </font id="blue"><br /> This is probably the most elegant and efficient way. </pre><br />Note that the last 2 would be very efficient in queries that needs to group by day (ignoring the time). <br /><br />To set the end_date, you should first set the time part to zero, then add a day and substruct 2 milliseconds (substructing only one will result .999 that rounds up to .000 of the next day). <br /><br /> <pre>-<font color="blue"><font color="purple">DATEADD</font id="purple">(millisecond, -2, <font color="purple">DATEADD</font id="purple">(dy,1, end_date_00)) </font id="blue"><br /><br /> or simply concat <font color="red">'23:59:59.997'</font id="red"> to the date</pre>I hope this shades some light on the datetime variable. <br /> <br />Bambola.<br /><br /></font id="Verdana"><br />
  2. gaurav_bindlish New Member

    Great observation bambola...

    Gaurav
  3. vbkenya New Member

    This would be a great article out there. Why don't you give the issue a little more exposure?

    NHO
  4. bradmcgehee New Member

    As vbkenya has suggested, why don't you expand this post into an article, and I will post it on my website. Don't worry about grammar, I can clean it up for you.



    -----------------------------
    Brad M. McGehee, MVP
    Webmaster
    SQL-Server-Performance.Com
  5. bambola New Member

    Thanks guys that's really nice of you. And thank you Brad for your offer. I would really love to have my own article published on this site. But truth is, when it comes to words I'm sort of lost...<br />Hey, I ended up being a programmer and not a writer! <img src='/community/emoticons/emotion-5.gif' alt=';-)' /><br /><br />Anyway, I will try to do my best and see how it goes. Is there's anything in particular about datetime that you would like to see in it? Any input suggestions corrections from you guys would be highly appreciated. <br /><br />Waiting for yout input!<br /><br />Bambola. <br />
  6. bradmcgehee New Member

    One of the best tips I learned as a beginning writer is to try to write as if you are telling a friend how to do something, using your own words. If you can do that, you can write an article.

    -----------------------------
    Brad M. McGehee, MVP
    Webmaster
    SQL-Server-Performance.Com
  7. gaurav_bindlish New Member

    Great tip Brad. Really worth noting down.

    Gaurav
  8. bambola New Member

    Thanks for the tip, Brad. I'll try it.

    Bambola.
  9. vbkenya New Member

    As an add on to your discussion about the storage and handling of date and time variables, you could include the fact that even the guys at Microsoft themselves have some 'false' documentation the BOL about the storage of time values in some system tables. e.g in the msdb.sysjobschedules table date and time values are indicated as being stored as integers. if you run a query against this table and try to convert these values to datetime, you can get into serious trouble. What these MS folks have done is to strip the date and time values of any '/', '-' or ':' separators and store the results as integers.

    Despite my initial confusion, these storage tactic seems to be more prone to accuracy than the one currently being used in other tables within the system (It still requires 4 bytes for the date and 4 bytes for the time). It would also prevent most of the rounding errors that prompted your response.






    NHO

Share This Page