SQL Server Performance

Skip duplicate before concate How?

Discussion in 'General Developer Questions' started by joseln, Aug 16, 2006.

  1. joseln New Member

    Hi all,

    i have a column value A,B,A,C,D,B

    i want to concatenate each value with a seperator "/" like A/B/C/D without duplication. so when i run the above values in loop but as i dont want to concatenate duplicates, meaning, when the loop runs 3rd time as "A" is already there in the first position it has to skip it and goes to the next record ie "C" .

    i will be able to resolve this by written a seperate function or procedure, but is there any way in SQL Server like "StrComp" of VB.

    thanks
  2. Roji. P. Thomas New Member

    Here you go

    Create table #Test(id int IDENTITY(1,1), SomeValue Char(1))
    INSERT INTO #Test VALUES('A')
    INSERT INTO #Test VALUES('B')
    INSERT INTO #Test VALUES('A')
    INSERT INTO #Test VALUES('C')
    INSERT INTO #Test VALUES('D')
    INSERT INTO #Test VALUES('B')

    DECLARE @c VARCHAR(100)
    SET @c = ''

    SELECT @c = @c + CASE WHEN @c = '' THEN '' ELSE '' END + SomeValue
    FROM #Test
    GROUP BY SomeValue

    SELECT @c

    Roji. P. Thomas
    Microsoft SQL Server MVP
    http://toponewithties.blogspot.com
  3. FrankKalis Moderator

    Another way would be:


    Create table #Test(id int IDENTITY(1,1), SomeValue Char(1))
    INSERT INTO #Test VALUES('A')
    INSERT INTO #Test VALUES('B')
    INSERT INTO #Test VALUES('A')
    INSERT INTO #Test VALUES('C')
    INSERT INTO #Test VALUES('D')
    INSERT INTO #Test VALUES('B')

    DECLARE @c VARCHAR(100)

    SELECT @c = ISNULL( @c + '', '' ) + SomeValue
    FROM #Test
    GROUP BY SomeValue

    SELECT @c
    DROP TABLE #Test


    --
    Frank Kalis
    Moderator
    Microsoft SQL Server MVP
    Webmaster:http://www.insidesql.de
  4. Roji. P. Thomas New Member

    quote:Originally posted by FrankKalis

    Another way would be:


    Create table #Test(id int IDENTITY(1,1), SomeValue Char(1))
    INSERT INTO #Test VALUES('A')
    INSERT INTO #Test VALUES('B')
    INSERT INTO #Test VALUES('A')
    INSERT INTO #Test VALUES('C')
    INSERT INTO #Test VALUES('D')
    INSERT INTO #Test VALUES('B')

    DECLARE @c VARCHAR(100)

    SELECT @c = ISNULL( @c + '', '' ) + SomeValue
    FROM #Test
    GROUP BY SomeValue

    SELECT @c
    DROP TABLE #Test

    I like that. Infact I like anything that replaces a CASE WHEN

    Roji. P. Thomas
    Microsoft SQL Server MVP
    http://toponewithties.blogspot.com
  5. Adriaan New Member

    quote:I like that. Infact I like anything that replaces a CASE WHEN

    As long as you don't use COALESCE, which definitely takes more time than CASE WHEN. Not sure how ISNULL compares to CASE WHEN - could also be slower ...

    ... not so that you'd notice for a single row/value, but with large rowsets or large numbers of iterations, you will see a difference.
  6. Roji. P. Thomas New Member

    quote:Originally posted by Adriaan


    quote:I like that. Infact I like anything that replaces a CASE WHEN

    As long as you don't use COALESCE, which definitely takes more time than CASE WHEN. Not sure how ISNULL compares to CASE WHEN - could also be slower ...
    Why do you think COALESCE is slower than CASE WHEN ? Can you point me to a link where it is proven.

    I always prefer COALESCE Over ISNULL. The reasons are documented here.

    http://toponewithties.blogspot.com/2004/08/differences-between-coalesce-and.html

    Roji. P. Thomas
    Microsoft SQL Server MVP
    http://toponewithties.blogspot.com
  7. FrankKalis Moderator

    <blockquote id="quote"><font size="1" face="Verdana, Arial, Helvetica" id="quote">quote:<hr height="1" noshade id="quote"><br />... not so that you'd notice for a single row/value, but with large rowsets or large numbers of iterations, you will see a difference.<br /><hr height="1" noshade id="quote"></font id="quote"></blockquote id="quote"><br />On SQL Server 2000 you are limited to VARCHAR(8000) and, to be honest, it's a presentational issue anyway. [<img src='/community/emoticons/emotion-1.gif' alt=':)' />]<br /><br />--<br />Frank Kalis<br />Moderator<br />Microsoft SQL Server MVP<br />Webmaster:<a target="_blank" href=http://www.insidesql.de>http://www.insidesql.de</a>
  8. Adriaan New Member

    Frank,<br /><br />Sorry for going off-topic. It might be a presentational issue, but it can happen in any type of procedure ...<br /><br /><br />Roji,<br /><br />I don't have any link to support my claim. Just one of those things you run into when troubleshooting poorly performing queries. <br /><br />Checking the execution plans for ISNULL, COALESCE and CASE WHEN, it's funny to notice that the definitions for the "Compute Scalar" are slightly different:<br /><br />ISNULL - (DEFINE<img src='/community/emoticons/emotion-6.gif' alt=':(' />[Expr1002]=isnull(<i>date_column</i>, 'Jan 1 1900 12:00AM')))<br /><br />COALESCE - (DEFINE<img src='/community/emoticons/emotion-6.gif' alt=':(' />[Expr1002]=If (<i>date_column</i>&lt;&gt;NULL) then <i>date_column</i> else 'Jan 1 1900 12:00AM'))<br /><br />CASE WHEN - (DEFINE<img src='/community/emoticons/emotion-6.gif' alt=':(' />[Expr1002]=If (<i>date_column</i>=NULL) then 'Jan 1 1900 12:00AM' else <i>date_column</i>))<br /><br />They all weigh in at the same percentage, but if you trace it you should notice a difference.<br /><br />Some quick tests suggest that I was right that COALESCE is definitely never the fastest, CASE WHEN is average and ISNULL is best. Plus it will depend on exactly what you're doing, how many rows have null on that column, and whether the column is indexed.<br /><br />What I didn't test is what happens when you have a whole list of columns and you want the first one that doesn't have NULL - which is what COALESCE is designed to do - and indeed it does seem quicker in that application.
  9. FrankKalis Moderator

    Although COALESCE is advertised as the ANSI equivalent of ISNULL, COALESCE is slightly different. It is basically just a shortcut to multiple CASE expressions and as such offers advanced functionality compared to ISNULL. When comparing COALESCE and CASE WHEN one would probably expect that both do the same and therefore perform the same. It seems that they doesn't and I tend to think that COALESCE is less optimized in SQL Server. But there are also other quirks when using COALESCE. Consider this:<br /><pre id="code"><font face="courier" size="2" id="code"><br />CREATE TABLE #t<br />(<br /> c1 CHAR<br />)<br />INSERT INTO #t VALUES (NULL)<br /><br />SELECT<br /> ISNULL(c1,'Frank')<br /> , COALESCE(c1,'Frank')<br /> ,CASE WHEN c1 IS NULL THEN 'Frank' ELSE c1 END<br />FROM<br /> #t<br /><br />DROP TABLE #t<br /> <br />---- ----- ----- <br />F Frank Frank<br /><br />(1 row(s) affected)<br /></font id="code"></pre id="code"><br />Now, when you read very carefully through BOL, you realise that both functions (ISNULL and COALESCE) just do what is described in BOL. Still they yield different results. Same with numbers:<br /><pre id="code"><font face="courier" size="2" id="code"><br />SELECT <br /> 7 / ISNULL(NULL, 2.00)<br /> , 7 / COALESCE(NULL, 2.00)<br /> , 7 / CASE WHEN NULL IS NULL THEN 2.00 END<br /><br /> <br />----------- ----------- ----------- <br />3.500000 3.500000 3.500000<br /><br />(1 row(s) affected)<br /></font id="code"></pre id="code"><br />But it's getting interesting when data type precedence is coming into play here. <br /><pre id="code"><font face="courier" size="2" id="code"><br />SELECT<br /> 7 / ISNULL(CAST(NULL AS INT), 2.00)<br /> , 7 / COALESCE(CAST(NULL AS INT), 2.00)<br /> , 7 / CASE WHEN CAST(NULL AS INT) IS NULL THEN 2.00 END<br /> <br />----------- ------------------ ----------- <br />3 3.5000000000000 3.500000<br /><br />(1 row(s) affected)<br /></font id="code"></pre id="code"><br />Don't ask for an explanation! [<img src='/community/emoticons/emotion-1.gif' alt=':)' />] I tend to blame it on "by design" and as long as I don't need the extended functionality of COALESCE, I prefer ISNULL.<br /><br />--<br />Frank Kalis<br />Moderator<br />Microsoft SQL Server MVP<br />Webmaster:<a target="_blank" href=http://www.insidesql.de>http://www.insidesql.de</a>
  10. Roji. P. Thomas New Member

    While we are at it, here is a repro of a bug WHERE ISNULL lies.

    CREATE TABLE #Temp(id int)
    Insert INTO #TEMP values(1) --This is insignificant

    SELECT B.ID, ISNULL(B.ID,'Is Null'), COALESCE(B.ID,'Is Null')
    FROM #TEMP A
    LEFT JOIN (
    SELECT b.id
    from (select 'Isn''t Null' id) b
    LEFT join #TEMP d on 1=0
    ) B ON 1=0

    DROP TABLE #T

    Roji. P. Thomas
    Microsoft SQL Server MVP
    http://toponewithties.blogspot.com
  11. mmarovic Active Member

    IMO, you should make sure optimal join algorithm is used and you have best possible indexes (and db model to start with). Speed difference between scalar functions is really insignificant.
  12. FrankKalis Moderator

    <blockquote id="quote"><font size="1" face="Verdana, Arial, Helvetica" id="quote">quote:<hr height="1" noshade id="quote"><i>Originally posted by mmarovic</i><br /><br />IMO, you should make sure optimal join algorithm is used and you have best possible indexes (and db model to start with). Speed difference between scalar functions is really insignificant.<br /><hr height="1" noshade id="quote"></font id="quote"></blockquote id="quote"><br />Agreed, but it's sometimes fun to try to figure out what's going on behind the scenes. [<img src='/community/emoticons/emotion-1.gif' alt=':)' />]<br /><br />--<br />Frank Kalis<br />Moderator<br />Microsoft SQL Server MVP<br />Webmaster:<a target="_blank" href=http://www.insidesql.de>http://www.insidesql.de</a>
  13. Madhivanan Moderator

    http://sqljunkies.com/WebLog/amachanic/archive/2004/11/30/5311.aspx


    Madhivanan

    Failing to plan is Planning to fail
  14. mmarovic Active Member

    <blockquote id="quote"><font size="1" face="Verdana, Arial, Helvetica" id="quote">quote:<hr height="1" noshade id="quote"><i>Originally posted by FrankKalis</i><br /><br /><blockquote id="quote"><font size="1" face="Verdana, Arial, Helvetica" id="quote">quote:<hr height="1" noshade id="quote"><i>Originally posted by mmarovic</i><br /><br />IMO, you should make sure optimal join algorithm is used and you have best possible indexes (and db model to start with). Speed difference between scalar functions is really insignificant.<br /><hr height="1" noshade id="quote"></font id="quote"></blockquote id="quote"><br />Agreed, but it's sometimes fun to try to figure out what's going on behind the scenes. [<img src='/community/emoticons/emotion-1.gif' alt=':)' />]<br /><hr height="1" noshade id="quote"></font id="quote"></blockquote id="quote"><br />I guess I am getting old... [<img src='/community/emoticons/emotion-6.gif' alt=':(' />]
  15. FrankKalis Moderator

    <blockquote id="quote"><font size="1" face="Verdana, Arial, Helvetica" id="quote">quote:<hr height="1" noshade id="quote"><br />I guess I am getting old... <br /><hr height="1" noshade id="quote"></font id="quote"></blockquote id="quote"><br />You are already!<br /><br />...at least compared to me. [<img src='/community/emoticons/emotion-4.gif' alt=':p' />]<br /><br />--<br />Frank Kalis<br />Moderator<br />Microsoft SQL Server MVP<br />Webmaster:<a target="_blank" href=http://www.insidesql.de>http://www.insidesql.de</a>
  16. mmarovic Active Member

    I rather use Luis for that comparison. [:0)]

Share This Page