SQL Server Performance

Zero Length String Checking

Discussion in 'T-SQL Performance Tuning for Developers' started by BDRichardson, Mar 17, 2006.

  1. BDRichardson New Member

    Please would someone kindly clarify for me which is the most efficient way to determine whether a string (VARCHAR) value has a zero length.

    I am sure that the answer would vary according to circumstances, but there must be a few common rules.

    For example, which of the following is most efficient:


    1. LEN( @StringValue) = 0
    2. @StringValue = ''


    Any assistance would be greatly appreciated.
  2. mmarovic Active Member

    I guess @StringValue = '' could be marginally faster, but there is no better proof then testing.
  3. Adriaan New Member

    The accepted wisdom in programming languages is that the LEN function should be faster, as the length of a string in a variable length variable is recorded at the start of the variable in memory.

    Still I get the impression that T-SQL is different - I did a small test and it seems like calling the LEN function actually takes up additional CPU time.

    Obviously you won't see much difference in a one-time check on a single variable ...
  4. mmarovic Active Member

    I think varchar variable is implemented as a c-string, which means null character (ansi 0) is string terminator.
  5. FrankKalis Moderator

  6. mmarovic Active Member

    Just to add to confusion:
    declare @s varchar(10), @t varchar(10), @p varchar(10)

    set @s = '123'

    set @t = replace(@s, '3', char(0))

    select @t, len(@t), @t + 'a', len(@t+'a')

    set @t = replace(@t, char(0), '!')

    select @t, len(@t)
  7. Adriaan New Member

    Mirko,

    Not sure what you're testing here. When you run this ...

    select case when char(0) = '' then 1 else 0 end

    ... you will see that char(0) is not the same as a ZLS.
  8. mmarovic Active Member

    Adriaan,

    both examples show @varcharVariable = char(0) is not the zero length string, but it looks like char(0) does have special meaning and its handling is inconsistent.

    It looks like characters after the first char(0) are not displayed but all other operations when char(0) is the part of the string are far from intuitive at least.
  9. FrankKalis Moderator

  10. FrankKalis Moderator

  11. mmarovic Active Member

    <blockquote id="quote"><font size="1" face="Verdana, Arial, Helvetica" id="quote">quote:<hr height="1" noshade id="quote"><i>Originally posted by FrankKalis</i><br /><br /<a target="_blank" href=http://www.sql-server-performance.com/forum/topic.asp?TOPIC_ID=11940&SearchTerms=char>http://www.sql-server-performance.com/forum/topic.asp?TOPIC_ID=11940&SearchTerms=char</a>(0)<br /><br />--<br />Frank Kalis<br />Microsoft SQL Server MVP<br /<a target="_blank" href=http://www.insidesql.de>http://www.insidesql.de</a><br />Heute schon gebloggt?<a target="_blank" href=http://www.insidesql.de/blogs>http://www.insidesql.de/blogs</a><br />Ich unterstuetze PASS Deutschland e.V. <a target="_blank" href=http://www.sqlpass.de>http://www.sqlpass.de</a>)<br /><hr height="1" noshade id="quote"></font id="quote"></blockquote id="quote">Hey, I new I was involved in a such issue, I just couldn't remember what exactly it was and how it was resolved. I am getting senile [<img src='/community/emoticons/emotion-6.gif' alt=':(' />]
  12. mmarovic Active Member

    Anyway, can anyone explain how does replacing char(0) with '!' in '12'+char(0) produce '!'?
  13. Adriaan New Member

    It gets better and better ...

    declare @s varchar(10), @t varchar(10), @p varchar(10)
    set @s = '123' + char(0) + '4'
    select @s
    ----------
    123
    Suggests char(0) is indeed a closing character

    select len(@s)
    -----------
    5
    Suggests that LEN function can look beyond char(0)

    select @s, len(@s)
    ---------- -----------
    123 4 5
    This is very weird!

    set @s = replace(@s, char(0), '@')
    select @s, len(@s)
    ---------- -----------
    123@4 5
    This makes reasonable sense.
  14. FrankKalis Moderator

  15. BDRichardson New Member

    Thankyou to you all for your assistance with my query.

    I executed the following script, and my conclusion from executing it many times, that to directly compare a string to '' is very slightly more efficient than using the LEN function. The difference really is very insignificant, but there does seem to be one, in my opinion.




    IF EXISTS( SELECT * FROM #tbl) DROP TABLE #tbl

    SET NOCOUNT ON

    -- Populate temporary table with data to test
    CREATE TABLE #tbl
    (
    Pk INT IDENTITY( 0, 1) PRIMARY KEY,
    Val VARCHAR( 20)
    )

    DECLARE @idx INT
    DECLARE @IdxString VARCHAR( 10)

    SET @idx = 0

    WHILE @idx < 1000000 BEGIN
    SET @IdxString = CAST( @idx AS VARCHAR( 10))
    INSERT #tbl( Val) VALUES( 'Index' + REPLICATE( '0', 6 - LEN( @IdxString)) + @IdxString)
    SET @idx = @idx + 1
    END

    -- Test 1
    DECLARE @dtmStartTest1 DATETIME
    SET @dtmStartTest1 = GETDATE()

    SELECT COUNT( *) FROM #tbl WHERE NOT LEN( Val) = 0

    PRINT 'Time Elapsed for Test 1: ' + CAST( DATEDIFF( ms, @dtmStartTest1, GETDATE()) AS VARCHAR( 20)) + ' milliseconds'

    -- Test 2
    DECLARE @dtmStartTest2 DATETIME
    SET @dtmStartTest2 = GETDATE()

    SELECT COUNT( *) FROM #tbl WHERE NOT Val = ''

    PRINT 'Time Elapsed for Test 2: ' + CAST( DATEDIFF( ms, @dtmStartTest2, GETDATE()) AS VARCHAR( 20)) + ' milliseconds'
  16. Madhivanan Moderator

    quote:Originally posted by mmarovic

    Anyway, can anyone explain how does replacing char(0) with '!' in '12'+char(0) produce '!'?

    SELECT '12'+char(0),replace('12'+char(0),char(0),'!')

    Result

    1212!


    Madhivanan

    Failing to plan is Planning to fail
  17. Adriaan New Member

    <blockquote id="quote"><font size="1" face="Verdana, Arial, Helvetica" id="quote">quote:<hr height="1" noshade id="quote"><i>Originally posted by BDRichardson</i><br /><br />Thankyou to you all for your assistance with my query.<br /><br />I executed the following script, and my conclusion from executing it many times, that to directly compare a string to '' is very slightly more efficient than using the LEN function. The difference really is very insignificant, but there does seem to be one, in my opinion.<hr height="1" noshade id="quote"></font id="quote"></blockquote id="quote">Hi Brian,<br /><br />Apologies for kidnapping your thread. Looks like you've found your answer already.<br /><br />Don't mind us![<img src='/community/emoticons/emotion-1.gif' alt=':)' />]
  18. BDRichardson New Member

    No worries. [<img src='/community/emoticons/emotion-5.gif' alt=';)' />]<br /><br />I'm only too grateful that the world is full of so many helpful people!
  19. mmarovic Active Member

    quote:Originally posted by Madhivanan


    quote:Originally posted by mmarovic

    Anyway, can anyone explain how does replacing char(0) with '!' in '12'+char(0) produce '!'?

    SELECT '12'+char(0),replace('12'+char(0),char(0),'!')

    Result

    1212!
    I see that works, however I have the problem understaning results of the script I've posted before:
    declare @s varchar(10), @t varchar(10), @p varchar(10)

    set @s = '123'

    set @t = replace(@s, '3', char(0))

    select @t, len(@t)

    set @t = replace(@t, char(0), '!')

    select @t, len(@t)

    On my comp results are:


    ---------- -----------
    12 3

    (1 row(s) affected)


    ---------- -----------
    ! 1

    (1 row(s) affected)
  20. Madhivanan Moderator

    &gt;&gt;I'm only too grateful that the world is full of so many helpful people!<br /><br />Thats what the name WORLD means for [<img src='/community/emoticons/emotion-1.gif' alt=':)' />]<br /><br />Madhivanan<br /><br />Failing to plan is Planning to fail
  21. Madhivanan Moderator

    Strange. I tested here and got


    ---------- -----------
    12 3

    (1 row(s) affected)


    ---------- -----------
    12! 3

    (1 row(s) affected)


    Anything wrong in settings?

    Madhivanan

    Failing to plan is Planning to fail
  22. mmarovic Active Member

    Really strange! [:0]

    It is not just presentation issue, because the len function returns 1 in my case.

    I've just figured out that there are not services packs on the instance I used for testing. I'm going to install it and see if that was the problem.
  23. Madhivanan Moderator

    Yes. Service pack may be the problem

    Madhivanan

    Failing to plan is Planning to fail
  24. mmarovic Active Member

    I've just upgraded to sp 3a and results are the same. [<img src='/community/emoticons/emotion-6.gif' alt=':(' />]
  25. Madhivanan Moderator

    Strange. I dont understand where goes wrong

    Madhivanan

    Failing to plan is Planning to fail
  26. Twan New Member

    What collation are you using?

    Cheers
    Twan
  27. mmarovic Active Member

    quote:Originally posted by Twan

    What collation are you using?

    Cheers
    Twan
    Czech_SI_AS
  28. mmarovic Active Member

    It is the collation problem! [:0] I've created database with SQL_Latin1_General_1250_CI_AS collation and results are finally correct.

    It is still strange that eastern europe collation affects char(0) handling!
  29. Twan New Member

    It looks like it works properly if you use a binary sort order, but not if you use a CS_AI, CS_AS, CI_AI or CI_AS one...

    charindex( char(0), 'anything at all' ) always returns 1, eventhough ascii( 'anything at all' ) returns the correct value (i.e. <> 0

    looks like a bug to me...

    Cheers
    Twan
  30. mmarovic Active Member

    I vote for a bug rather then a feature [<img src='/community/emoticons/emotion-1.gif' alt=':)' />]. However, it works at least with one CI_AS collation (SQL_Latin1_General_1250_CI_AS).<br /><br />Anyway, thanks for the help.
  31. Twan New Member

    create table #t(
    col1 varchar(10) collate Latin1_General_CS_AI,
    col2 varchar(10) collate Latin1_General_CS_AS,
    col3 varchar(10) collate Latin1_General_CI_AI,
    col4 varchar(10) collate Latin1_General_CI_AS,
    col5 varchar(10) collate Latin1_General_BIN,
    )

    insert into #t values ( '123', '123', '123', '123', '123' )
    select
    charindex( char(0), col1), ASCII(col1),
    charindex( char(0), col2), ASCII(col1),
    charindex( char(0), col3), ASCII(col1),
    charindex( char(0), col4), ASCII(col1),
    charindex( char(0), col5), ASCII(col1)
    from #t


    drop table #t




    149149149149049
  32. RGKN New Member

    B D Richardson,

    The test you ran isn't looking at

    @StringValue = ''
    and
    LEN(@StringValue) = 0

    what you seem to be testing is

    SELECT count(*) from #tbl
    where StringValue = ''

    SELECT count(*) from #tbl
    where LEN(StringValue) = 0

    This is slightly different as originally you asked about assessing whether a variable is a zero length string and the test you ran part way down seem to test selecting a count from a table. These will behave differently, especially as the selection from tables will be able to use indexes if indexes there be.

    I think that the following code allows you to test which is quicker for assessing variables and there is not much in, it if anything at all. Most of the tests I did had @StringValue = '' come out marginally quicker regardless of the order of the tests, but I would say it was far from conclusive and I didn't run many, I leave that to you!

    Regards,

    Robert.


    DECLARE @StringValue AS VARCHAR(10)
    DECLARE @Counter AS INT

    DECLARE @dtmStartTest1 DATETIME
    DECLARE @dtmStartTest2 DATETIME
    DECLARE @MaxCount AS INT

    --Set a number of iterations
    SET @MaxCount = 1000000
    SET @Counter = 1
    SET @StringValue = ''


    SET @dtmStartTest2 = GETDATE()
    WHILE @Counter < @MaxCount
    BEGIN
    IF @StringValue = ''
    BEGIN
    SET @Counter = @Counter + 1
    END
    END

    SELECT 'Time Elapsed for @StringValue = :' + CAST( DATEDIFF( ms, @dtmStartTest2, GETDATE()) AS VARCHAR( 20)) + ' milliseconds'

    SET @Counter = 1


    SET @dtmStartTest1 = GETDATE()
    WHILE @Counter < @MaxCount
    BEGIN
    IF LEN(@StringValue) = 0
    BEGIN
    SET @Counter = @Counter + 1
    END
    END

    SELECT 'Time Elapsed for LEN(@StringValue):' + CAST( DATEDIFF( ms, @dtmStartTest1, GETDATE()) AS VARCHAR( 20)) + ' milliseconds'



  33. FrankKalis Moderator

    Next time I'm online at home (where I have NNTP access, I'll post this to the private MVP newsgroup. Let's wait and see what the dev's from Microsoft have to say. What I suspect right now, is that it definitely isn't a bug. It's a feature! [<img src='/community/emoticons/emotion-2.gif' alt=':D' />] Probably a feature that's been carried around for ages and on which some TAP customers rely on. Just like the weired ISNUMERIC() behaviour.<br /><br />Madhivanan, I'm also going to post to that group the question why a DATETIME has 53 '-' signs when viewed in text mode.<br /><br />--<br />Frank Kalis<br />Microsoft SQL Server MVP<br /<a target="_blank" href=http://www.insidesql.de>http://www.insidesql.de</a><br />Heute schon gebloggt?<a target="_blank" href=http://www.insidesql.de/blogs>http://www.insidesql.de/blogs</a><br />Ich unterstuetze PASS Deutschland e.V. <a target="_blank" href=http://www.sqlpass.de>http://www.sqlpass.de</a>)
  34. mmarovic Active Member

    <blockquote id="quote"><font size="1" face="Verdana, Arial, Helvetica" id="quote">quote:<hr height="1" noshade id="quote">This is slightly different as originally you asked about assessing whether a variable is a zero length string and the test you ran part way down seem to test selecting a count from a table. These will behave differently, especially as the selection from tables will be able to use indexes if indexes there be.<hr height="1" noshade id="quote"></font id="quote"></blockquote id="quote">That's a good point, espacially one about indexing.<br /><br />Developers without enough db experience often ask similar question: which syntax is faster, is better to use function f1 instea of f2. However in sql the main thing is to build criteria to allow execution that will read the smallest number of rows possible. Pay attention: the number of rows returned is constant, specified by requirements, but the most efficient implementation will access the smallest number of items (index and/or data rows) possible.<br /><br />In the case of the query tested: <pre id="code"><font face="courier" size="2" id="code">SELECT COUNT( *) FROM #tbl WHERE NOT Val = ''</font id="code"></pre id="code"> is recommended, not because it is marginally faster without indexes, but because with index on val column only empty string rows will be accessed and that would be much more efficient. When you use len(val) = 0 criteria index scan has to be performed so all the rows from the index would be accessed not just empty string rows.<br /><br />Frank: Looking forward for report about the issue. ("feature") [<img src='/community/emoticons/emotion-5.gif' alt=';)' />] Btw, how is your investigation about count, in, exists, outer join going?
  35. Madhivanan Moderator

    &gt;&gt;Madhivanan, I'm also going to post to that group the question why a DATETIME has 53 '-' signs when viewed in text mode.<br /><br />Well. Actually it has 54 '-'s [<img src='/community/emoticons/emotion-1.gif' alt=':)' />]<br /><br />Madhivanan<br /><br />Failing to plan is Planning to fail
  36. mmarovic Active Member

    To correct myself, instead of:
    SELECT COUNT( *) FROM #tbl WHERE NOT Val = ''
    next query would make efficient use of index on val column:
    select count(*) from #tbl where val > ''
  37. FrankKalis Moderator

    quote:
    Frank: Looking forward for report about the issue. ("feature") Btw, how is your investigation about count, in, exists, outer join going?
    Working on that. Apart from fulltime job, I had to do a lot of work of PASS Deutschland recently. But I'm confident to get this done this month and send it to Brad.

    --
    Frank Kalis
    Microsoft SQL Server MVP
    http://www.insidesql.de
    Heute schon gebloggt?http://www.insidesql.de/blogs
    Ich unterstuetze PASS Deutschland e.V. http://www.sqlpass.de)
  38. mmarovic Active Member

    Uh, judging by how much Shane looks to be busy we can't expect the article before May. I'm really curious to see your finds. Can you send me the article before it is published?
  39. FrankKalis Moderator

    That's a good idea anyway. I will do so. [<img src='/community/emoticons/emotion-1.gif' alt=':)' />]<br /><br />--<br />Frank Kalis<br />Microsoft SQL Server MVP<br /<a target="_blank" href=http://www.insidesql.de>http://www.insidesql.de</a><br />Heute schon gebloggt?<a target="_blank" href=http://www.insidesql.de/blogs>http://www.insidesql.de/blogs</a><br />Ich unterstuetze PASS Deutschland e.V. <a target="_blank" href=http://www.sqlpass.de>http://www.sqlpass.de</a>)

Share This Page