Zero Length String Checking | SQL Server Performance Forums

SQL Server Performance Forum – Threads Archive

Zero Length String Checking

Please would someone kindly clarify for me which is the most efficient way to determine whether a string (VARCHAR) value has a zero length. I am sure that the answer would vary according to circumstances, but there must be a few common rules. For example, which of the following is most efficient:
1. LEN( @StringValue) = 0
2. @StringValue = ”
Any assistance would be greatly appreciated.
I guess @StringValue = ” could be marginally faster, but there is no better proof then testing.
The accepted wisdom in programming languages is that the LEN function should be faster, as the length of a string in a variable length variable is recorded at the start of the variable in memory. Still I get the impression that T-SQL is different – I did a small test and it seems like calling the LEN function actually takes up additional CPU time. Obviously you won’t see much difference in a one-time check on a single variable …
I think varchar variable is implemented as a c-string, which means null character (ansi 0) is string terminator.
Without actually knowing it, I’m with Mirko on this one. I think T-SQL handles variables slightly different from other programming languages. In this regard might this be interesting:http://www.sqlservercentral.com/columnists/jtravis/understandingthedifferencebetweenisnull.asp
Frank Kalis
Microsoft SQL Server MVP
http://www.insidesql.de
Heute schon gebloggt?http://www.insidesql.de/blogs
Ich unterstuetze PASS Deutschland e.V. http://www.sqlpass.de)
Just to add to confusion:
declare @s varchar(10), @t varchar(10), @p varchar(10) set @s = ‘123’ set @t = replace(@s, ‘3’, char(0)) select @t, len(@t), @t + ‘a’, len(@t+’a’) set @t = replace(@t, char(0), ‘!’) select @t, len(@t)

Mirko, Not sure what you’re testing here. When you run this … select case when char(0) = ” then 1 else 0 end … you will see that char(0) is not the same as a ZLS.
Adriaan, both examples show @varcharVariable = char(0) is not the zero length string, but it looks like char(0) does have special meaning and its handling is inconsistent. It looks like characters after the first char(0) are not displayed but all other operations when char(0) is the part of the string are far from intuitive at least.
Hadn’t we had this discussion before? IIRC, CHAR(0) is the terminator character for Query Analyzer. —
Frank Kalis
Microsoft SQL Server MVP
http://www.insidesql.de
Heute schon gebloggt?http://www.insidesql.de/blogs
Ich unterstuetze PASS Deutschland e.V. http://www.sqlpass.de)
http://www.sql-server-performance.com/forum/topic.asp?TOPIC_ID=11940&SearchTerms=char(0) —
Frank Kalis
Microsoft SQL Server MVP
http://www.insidesql.de
Heute schon gebloggt?http://www.insidesql.de/blogs
Ich unterstuetze PASS Deutschland e.V. http://www.sqlpass.de)
<blockquote id="quote"><font size="1" face="Verdana, Arial, Helvetica" id="quote">quote:<hr height="1" noshade id="quote"><i>Originally posted by FrankKalis</i><br /><br /<a target="_blank" href=http://www.sql-server-performance.com/forum/topic.asp?TOPIC_ID=11940&SearchTerms=char>http://www.sql-server-performance.com/forum/topic.asp?TOPIC_ID=11940&SearchTerms=char</a>(0)<br /><br />–<br />Frank Kalis<br />Microsoft SQL Server MVP<br /<a target="_blank" href=http://www.insidesql.de>http://www.insidesql.de</a><br />Heute schon gebloggt?<a target="_blank" href=http://www.insidesql.de/blogs>http://www.insidesql.de/blogs</a><br />Ich unterstuetze PASS Deutschland e.V. <a target="_blank" href=http://www.sqlpass.de>http://www.sqlpass.de</a>)<br /><hr height="1" noshade id="quote"></font id="quote"></blockquote id="quote">Hey, I new I was involved in a such issue, I just couldn’t remember what exactly it was and how it was resolved. I am getting senile [<img src=’/community/emoticons/emotion-6.gif’ alt=’:(‘ />]
Anyway, can anyone explain how does replacing char(0) with ‘!’ in ’12’+char(0) produce ‘!’?
It gets better and better … declare @s varchar(10), @t varchar(10), @p varchar(10)
set @s = ‘123’ + char(0) + ‘4’
select @s
———-
123
Suggests char(0) is indeed a closing character select len(@s)
———–
5
Suggests that LEN function can look beyond char(0) select @s, len(@s)
———- ———–
123 4 5
This is very weird! set @s = replace(@s, char(0), ‘@’)
select @s, len(@s)
———- ———–
[email protected] 5
This makes reasonable sense.


SELECT ’12’+’!’ —-
12! (1 row(s) affected) —
Frank Kalis
Microsoft SQL Server MVP
http://www.insidesql.de
Heute schon gebloggt?http://www.insidesql.de/blogs
Ich unterstuetze PASS Deutschland e.V. http://www.sqlpass.de)
Thankyou to you all for your assistance with my query. I executed the following script, and my conclusion from executing it many times, that to directly compare a string to ” is very slightly more efficient than using the LEN function. The difference really is very insignificant, but there does seem to be one, in my opinion.
IF EXISTS( SELECT * FROM #tbl) DROP TABLE #tbl SET NOCOUNT ON — Populate temporary table with data to test
CREATE TABLE #tbl
(
Pk INT IDENTITY( 0, 1) PRIMARY KEY,
Val VARCHAR( 20)
) DECLARE @idx INT
DECLARE @IdxString VARCHAR( 10) SET @idx = 0 WHILE @idx < 1000000 BEGIN
SET @IdxString = CAST( @idx AS VARCHAR( 10))
INSERT #tbl( Val) VALUES( ‘Index’ + REPLICATE( ‘0’, 6 – LEN( @IdxString)) + @IdxString)
SET @idx = @idx + 1
END — Test 1
DECLARE @dtmStartTest1 DATETIME
SET @dtmStartTest1 = GETDATE() SELECT COUNT( *) FROM #tbl WHERE NOT LEN( Val) = 0 PRINT ‘Time Elapsed for Test 1: ‘ + CAST( DATEDIFF( ms, @dtmStartTest1, GETDATE()) AS VARCHAR( 20)) + ‘ milliseconds’ — Test 2
DECLARE @dtmStartTest2 DATETIME
SET @dtmStartTest2 = GETDATE() SELECT COUNT( *) FROM #tbl WHERE NOT Val = ” PRINT ‘Time Elapsed for Test 2: ‘ + CAST( DATEDIFF( ms, @dtmStartTest2, GETDATE()) AS VARCHAR( 20)) + ‘ milliseconds’

quote:Originally posted by mmarovic Anyway, can anyone explain how does replacing char(0) with ‘!’ in ’12’+char(0) produce ‘!’?

SELECT ’12’+char(0),replace(’12’+char(0),char(0),’!’) Result 1212!
Madhivanan Failing to plan is Planning to fail
<blockquote id="quote"><font size="1" face="Verdana, Arial, Helvetica" id="quote">quote:<hr height="1" noshade id="quote"><i>Originally posted by BDRichardson</i><br /><br />Thankyou to you all for your assistance with my query.<br /><br />I executed the following script, and my conclusion from executing it many times, that to directly compare a string to ” is very slightly more efficient than using the LEN function. The difference really is very insignificant, but there does seem to be one, in my opinion.<hr height="1" noshade id="quote"></font id="quote"></blockquote id="quote">Hi Brian,<br /><br />Apologies for kidnapping your thread. Looks like you’ve found your answer already.<br /><br />Don’t mind us![<img src=’/community/emoticons/emotion-1.gif’ alt=’:)‘ />]
No worries. [<img src=’/community/emoticons/emotion-5.gif’ alt=’;)‘ />]<br /><br />I’m only too grateful that the world is full of so many helpful people!
quote:Originally posted by Madhivanan
quote:Originally posted by mmarovic Anyway, can anyone explain how does replacing char(0) with ‘!’ in ’12’+char(0) produce ‘!’?

SELECT ’12’+char(0),replace(’12’+char(0),char(0),’!’) Result 1212!
I see that works, however I have the problem understaning results of the script I’ve posted before:
declare @s varchar(10), @t varchar(10), @p varchar(10) set @s = ‘123’ set @t = replace(@s, ‘3’, char(0)) select @t, len(@t) set @t = replace(@t, char(0), ‘!’) select @t, len(@t) On my comp results are:
———- ———–
12 3 (1 row(s) affected)
———- ———–
! 1 (1 row(s) affected)

&gt;&gt;I’m only too grateful that the world is full of so many helpful people!<br /><br />Thats what the name WORLD means for [<img src=’/community/emoticons/emotion-1.gif’ alt=’:)‘ />]<br /><br />Madhivanan<br /><br />Failing to plan is Planning to fail
Strange. I tested here and got
———- ———–
12 3 (1 row(s) affected)
———- ———–
12! 3 (1 row(s) affected)
Anything wrong in settings? Madhivanan Failing to plan is Planning to fail
Really strange! [:0] It is not just presentation issue, because the len function returns 1 in my case. I’ve just figured out that there are not services packs on the instance I used for testing. I’m going to install it and see if that was the problem.
Yes. Service pack may be the problem Madhivanan Failing to plan is Planning to fail
I’ve just upgraded to sp 3a and results are the same. [<img src=’/community/emoticons/emotion-6.gif’ alt=’:(‘ />]
Strange. I dont understand where goes wrong Madhivanan Failing to plan is Planning to fail
What collation are you using? Cheers
Twan
quote:Originally posted by Twan What collation are you using? Cheers
Twan
Czech_SI_AS

It is the collation problem! [:0] I’ve created database with SQL_Latin1_General_1250_CI_AS collation and results are finally correct. It is still strange that eastern europe collation affects char(0) handling!

It looks like it works properly if you use a binary sort order, but not if you use a CS_AI, CS_AS, CI_AI or CI_AS one… charindex( char(0), ‘anything at all’ ) always returns 1, eventhough ascii( ‘anything at all’ ) returns the correct value (i.e. <> 0 looks like a bug to me… Cheers
Twan
I vote for a bug rather then a feature [<img src=’/community/emoticons/emotion-1.gif’ alt=’:)‘ />]. However, it works at least with one CI_AS collation (SQL_Latin1_General_1250_CI_AS).<br /><br />Anyway, thanks for the help.
create table #t(
col1 varchar(10) collate Latin1_General_CS_AI,
col2 varchar(10) collate Latin1_General_CS_AS,
col3 varchar(10) collate Latin1_General_CI_AI,
col4 varchar(10) collate Latin1_General_CI_AS,
col5 varchar(10) collate Latin1_General_BIN,
) insert into #t values ( ‘123’, ‘123’, ‘123’, ‘123’, ‘123’ )
select
charindex( char(0), col1), ASCII(col1),
charindex( char(0), col2), ASCII(col1),
charindex( char(0), col3), ASCII(col1),
charindex( char(0), col4), ASCII(col1),
charindex( char(0), col5), ASCII(col1)
from #t
drop table #t
149149149149049

B D Richardson, The test you ran isn’t looking at @StringValue = ”
and
LEN(@StringValue) = 0 what you seem to be testing is SELECT count(*) from #tbl
where StringValue = ” SELECT count(*) from #tbl
where LEN(StringValue) = 0 This is slightly different as originally you asked about assessing whether a variable is a zero length string and the test you ran part way down seem to test selecting a count from a table. These will behave differently, especially as the selection from tables will be able to use indexes if indexes there be. I think that the following code allows you to test which is quicker for assessing variables and there is not much in, it if anything at all. Most of the tests I did had @StringValue = ” come out marginally quicker regardless of the order of the tests, but I would say it was far from conclusive and I didn’t run many, I leave that to you! Regards, Robert.
DECLARE @StringValue AS VARCHAR(10)
DECLARE @Counter AS INT DECLARE @dtmStartTest1 DATETIME
DECLARE @dtmStartTest2 DATETIME
DECLARE @MaxCount AS INT –Set a number of iterations
SET @MaxCount = 1000000
SET @Counter = 1
SET @StringValue = ”
SET @dtmStartTest2 = GETDATE()
WHILE @Counter < @MaxCount
BEGIN
IF @StringValue = ”
BEGIN
SET @Counter = @Counter + 1
END
END SELECT ‘Time Elapsed for @StringValue = :’ + CAST( DATEDIFF( ms, @dtmStartTest2, GETDATE()) AS VARCHAR( 20)) + ‘ milliseconds’ SET @Counter = 1
SET @dtmStartTest1 = GETDATE()
WHILE @Counter < @MaxCount
BEGIN
IF LEN(@StringValue) = 0
BEGIN
SET @Counter = @Counter + 1
END
END SELECT ‘Time Elapsed for LEN(@StringValue):’ + CAST( DATEDIFF( ms, @dtmStartTest1, GETDATE()) AS VARCHAR( 20)) + ‘ milliseconds’
Next time I’m online at home (where I have NNTP access, I’ll post this to the private MVP newsgroup. Let’s wait and see what the dev’s from Microsoft have to say. What I suspect right now, is that it definitely isn’t a bug. It’s a feature! [<img src=’/community/emoticons/emotion-2.gif’ alt=’:D‘ />] Probably a feature that’s been carried around for ages and on which some TAP customers rely on. Just like the weired ISNUMERIC() behaviour.<br /><br />Madhivanan, I’m also going to post to that group the question why a DATETIME has 53 ‘-‘ signs when viewed in text mode.<br /><br />–<br />Frank Kalis<br />Microsoft SQL Server MVP<br /<a target="_blank" href=http://www.insidesql.de>http://www.insidesql.de</a><br />Heute schon gebloggt?<a target="_blank" href=http://www.insidesql.de/blogs>http://www.insidesql.de/blogs</a><br />Ich unterstuetze PASS Deutschland e.V. <a target="_blank" href=http://www.sqlpass.de>http://www.sqlpass.de</a>)
<blockquote id="quote"><font size="1" face="Verdana, Arial, Helvetica" id="quote">quote:<hr height="1" noshade id="quote">This is slightly different as originally you asked about assessing whether a variable is a zero length string and the test you ran part way down seem to test selecting a count from a table. These will behave differently, especially as the selection from tables will be able to use indexes if indexes there be.<hr height="1" noshade id="quote"></font id="quote"></blockquote id="quote">That’s a good point, espacially one about indexing.<br /><br />Developers without enough db experience often ask similar question: which syntax is faster, is better to use function f1 instea of f2. However in sql the main thing is to build criteria to allow execution that will read the smallest number of rows possible. Pay attention: the number of rows returned is constant, specified by requirements, but the most efficient implementation will access the smallest number of items (index and/or data rows) possible.<br /><br />In the case of the query tested: <pre id="code"><font face="courier" size="2" id="code">SELECT COUNT( *) FROM #tbl WHERE NOT Val = ”</font id="code"></pre id="code"> is recommended, not because it is marginally faster without indexes, but because with index on val column only empty string rows will be accessed and that would be much more efficient. When you use len(val) = 0 criteria index scan has to be performed so all the rows from the index would be accessed not just empty string rows.<br /><br />Frank: Looking forward for report about the issue. ("feature") [<img src=’/community/emoticons/emotion-5.gif’ alt=’;)‘ />] Btw, how is your investigation about count, in, exists, outer join going?
&gt;&gt;Madhivanan, I’m also going to post to that group the question why a DATETIME has 53 ‘-‘ signs when viewed in text mode.<br /><br />Well. Actually it has 54 ‘-‘s [<img src=’/community/emoticons/emotion-1.gif’ alt=’:)‘ />]<br /><br />Madhivanan<br /><br />Failing to plan is Planning to fail
To correct myself, instead of:
SELECT COUNT( *) FROM #tbl WHERE NOT Val = ”
next query would make efficient use of index on val column:
select count(*) from #tbl where val > ”

quote:
Frank: Looking forward for report about the issue. ("feature") Btw, how is your investigation about count, in, exists, outer join going?
Working on that. Apart from fulltime job, I had to do a lot of work of PASS Deutschland recently. But I’m confident to get this done this month and send it to Brad. —
Frank Kalis
Microsoft SQL Server MVP
http://www.insidesql.de
Heute schon gebloggt?http://www.insidesql.de/blogs
Ich unterstuetze PASS Deutschland e.V. http://www.sqlpass.de)
Uh, judging by how much Shane looks to be busy we can’t expect the article before May. I’m really curious to see your finds. Can you send me the article before it is published?
That’s a good idea anyway. I will do so. [<img src=’/community/emoticons/emotion-1.gif’ alt=’:)‘ />]<br /><br />–<br />Frank Kalis<br />Microsoft SQL Server MVP<br /<a target="_blank" href=http://www.insidesql.de>http://www.insidesql.de</a><br />Heute schon gebloggt?<a target="_blank" href=http://www.insidesql.de/blogs>http://www.insidesql.de/blogs</a><br />Ich unterstuetze PASS Deutschland e.V. <a target="_blank" href=http://www.sqlpass.de>http://www.sqlpass.de</a>)
]]>