SQL Server Performance

Extracting rowset from delimited string

Discussion in 'Contribute Your SQL Server Scripts' started by joechang, Jul 8, 2003.

  1. joechang New Member

    this is a summary of the post by abhi, with contributions from joe chang and bambola

    the purpose is: starting from a string like 'Tom|Dick|Harry|John|Joe'
    get a rowset
    Tom
    Dick
    Harry
    John
    Joe

    the solution requires a table populated with sequential integers starting from 1:

    CREATE TABLE RSequence (ID int NOT NULL,
    CONSTRAINT [PK_RSequence] PRIMARY KEY CLUSTERED ( [ID] )
    WITH FILLFACTOR = 100 ON [PRIMARY] )
    GO
    EXEC sp_indexoption 'dbo.RSequence', 'disallowrowlocks', TRUE
    GO
    EXEC sp_indexoption 'dbo.RSequence', 'disallowpagelocks', TRUE
    GO

    SET NOCOUNT ON
    TRUNCATE TABLE RSequence1

    DECLARE @I INT SELECT @I = 1
    WHILE @I <= 8000 BEGIN -- this value should be the upper limit on the string length
    INSERT RSequence (ID) VALUES (@I)
    SET @I = @I+1
    END
    GO

    the sp_indexoption to disallowrowlocks may be important if a table scan is involved in the execution plan, as small table scans uses row locks

    create the table RSequence and populate it with the above scripts, the try the following example:

    DECLARE @list varchar(700), @len int, @delim char(1)
    SELECT @delim = '|'
    SELECT @list = 'Tom|Dick|Harry|John|Joe'
    SELECT @len = LEN(@list)

    SELECT ltrim(substring(@list, S1.ID, min(s2.ID)-S1.ID-1))
    FROM RSequence AS S1, RSequence AS S2
    WHERE substring(@delim+@list+@delim,S1.ID,1)=@delim
    AND substring(@delim+@list+@delim,S2.ID,1)=@delim
    AND S1.ID < S2.ID AND S1.ID < @len AND S2.ID <= @len+2
    GROUP BY S1.ID


    bambola provides a function wrapper

    CREATE FUNCTION split_string(@list varchar(8000), @delim char(1), @len int)
    RETURNS TABLE
    AS
    RETURN
    SELECT ltrim(substring(@list, S1.ID, min(s2.ID)-S1.ID-1)) ITEM
    FROM RSequence AS S1, RSequence AS S2
    WHERE substring(@delim + @list + @delim, S1.ID, 1) = @delim
    AND substring(@delim + @list + @delim, S2.ID, 1) = @delim
    AND S1.ID < S2.ID AND S1.ID < @len AND S2.ID < @len+3
    GROUP BY S1.ID

    this can also be wrapped in a stored procedure
    i used it in that manner where the calling procedure creates a temp table with a fixed name (ex. #T1) that has atleast a column with a specified name (ex. element).
    the extract stored proc inserts the extracted values into the element column.
    The calling procedure can create the temp table #T1 with other columns as necessary

    in abhi's example, the delimited list values itself has more than field
    ex. 'asahay.com,1,Y|asahay1.com,2,N|asahay2.com,2,N'

    yielding
    asahay.com,1,Y
    asahay1.com,2,N
    asahay2.com,2,N

    the core of this query:

    SELECT ltrim(substring(@list, S1.ID, min(s2.ID)-S1.ID-1))
    FROM RSequence AS S1, RSequence AS S2
    WHERE substring(@delim+@list+@delim,S1.ID,1)=@delim
    AND substring(@delim+@list+@delim,S2.ID,1)=@delim
    AND S1.ID < S2.ID GROUP BY S1.ID

    was from Joe Celko (www.celko.com)

    My customizations for SQL Server are the disallowrowlocks above, and the additional WHERE clause conditions:
    AND S1.ID < @len AND S2.ID <= @len+2

    these additional conditions are not required for logical correctness,
    however they do further reduce the range of the search to find the position of each delimiter

    try the query with and w/o the last two WHERE clause conditions
  2. bambola New Member

    A small correction.
    AND S1.ID < S2.ID AND S1.ID <= @len AND S2.ID <= @len+2
  3. Kristen New Member

    This doesn't seem to work if the final value in the delimited list is a single character, or a blank (i.e. no blank value returned in that instance)

    Example: put the following into the above example

    SELECT @list = 'Tom|Dick|Harry|John|Joe|X'

    or

    SELECT @list = 'Tom|Dick|Harry|John|Joe|'

    (Note that the following does return a blank:

    SELECT @list = 'Tom|Dick||John|Joe'

    )

    I've looked at the source, but its beyond my ken to work out what might be needed to fix this edge condition, for now I've just added a dummy delimiter on the end, taking advantage of it not returning a blank currently!

    Regards,

    Kristen
  4. bambola New Member

    It does with a small correction


    SELECT ltrim(substring(@list, S1.ID, min(s2.ID)-S1.ID-1))
    FROM RSequence AS S1, RSequence AS S2
    WHERE substring(@delim+@list+@delim,S1.ID,1)=@delim
    AND substring(@delim+@list+@delim,S2.ID,1)=@delim
    AND S1.ID < S2.ID AND S1.ID <= @len AND S2.ID <= @len+2
    GROUP BY S1.ID


    CREATE FUNCTION split_string(@list varchar(8000), @delim char(1), @len int)
    RETURNS TABLE
    AS
    RETURN
    SELECT ltrim(substring(@list, S1.ID, min(s2.ID)-S1.ID-1)) ITEM
    FROM RSequence AS S1, RSequence AS S2
    WHERE substring(@delim + @list + @delim, S1.ID, 1) = @delim
    AND substring(@delim + @list + @delim, S2.ID, 1) = @delim
    AND S1.ID < S2.ID AND S1.ID <= @len AND S2.ID <= @len+2
    GROUP BY S1.ID
  5. Adriaan New Member

    Does it need to be a function? If a stored procedure will do, then you can use a temporary table and do something along these lines:

    DECLARE @List varchar(8000), @Delim char(1), @LastPos INT, @NextPos INT

    CREATE TABLE #TMP (ItemName VARCHAR(20))

    /* Values for testing ... */
    SET @List = '|| | | Tom|Dick|Harry|John|Joe| | '
    SET @Delim = '|'
    /**/

    WHILE LEFT(LTRIM(@List), 1) = @Delim
    BEGIN
    SET @List = LTRIM(RTRIM(SUBSTRING(@List, 2, LEN(@List))))
    END
    WHILE RIGHT(RTRIM(@List), 1) = @Delim
    BEGIN
    SET @List = LTRIM(RTRIM(SUBSTRING(@List, 1, LEN(@List) - 1)))
    END

    SET @LastPos = 1

    WHILE @LastPos > 0
    BEGIN
    SET @NextPos = CHARINDEX(@Delim, @List, @LastPos + 1)
    IF @LastPos = 1
    BEGIN
    IF @NextPos = 0
    BEGIN
    INSERT INTO #TMP SELECT @List
    SET @LastPos = 0
    END
    ELSE
    BEGIN
    INSERT INTO #TMP
    SELECT SUBSTRING(@List, @LastPos, (@NextPos - @LastPos))
    SET @LastPos = @NextPos
    END
    END
    ELSE
    BEGIN
    IF @NextPos = 0
    BEGIN
    INSERT INTO #TMP
    SELECT RTRIM(SUBSTRING(@List, @LastPos + 1, LEN(@List)))
    SET @LastPos = 0
    END
    ELSE
    BEGIN
    INSERT INTO #TMP
    SELECT SUBSTRING(@List, @LastPos + 1, (@NextPos - @LastPos - 1))
    SET @LastPos = @NextPos
    END
    END
    END

    SELECT * FROM #TMP
  6. bambola New Member

    It doesn't *need* to be a function. But as a function you can easily join it in a query.

    But this is not the point. Look at the execution plan and you will see that Joe's solution performs much better. It does index seek while the stored procedure is very much like a cursor, and it uses a temp table which is something you'd usually like to avoid.
  7. EPardx New Member

    How do you use the individual coma delimited results from this query to populate multiple columns after the select?
    I think this is useful for someone that wants to turn this into a multiple row insert.
  8. Adriaan New Member

    EPardx,

    Look at the function definition:

    CREATE FUNCTION ........ RETURNS TABLE

    This means that you can call the function in a FROM clause, like this:

    INSERT INTO target_table (column_list)
    SELECT *
    FROM split_string('asahay.com,1,Y|asahay1.com,2,N|asahay2.com,2,N', '|', 1)
  9. Roji. P. Thomas New Member

  10. EPardx New Member

    quote:Originally posted by Adriaan

    EPardx,

    Look at the function definition:

    CREATE FUNCTION ........ RETURNS TABLE

    This means that you can call the function in a FROM clause, like this:

    INSERT INTO target_table (column_list)
    SELECT *
    FROM split_string('asahay.com,1,Y|asahay1.com,2,N|asahay2.com,2,N', '|', 1)


    yes that would attempt to enter "asahay1.com,1,Y" into a single string column in the target table, what I wanted was to enter "asahay1.com" into col1 and "1" into col 2 and "Y" into col 3 of the same row and then proceed to do the next entry ... and so on.
  11. EPardx New Member

    I have an iterative method to split into both rows (separated by a delimeter) and columns (knowing that there are x columns and only space delimeter currently works for the column case)



    DECLARE @list varchar(700), @len int, @delim char(1), @rowdelim char(1)
    SELECT @rowdelim = '|'
    SELECT @delim = ' '
    SELECT @len = 157
    SELECT @list = 'asahay.com 1 Y|asahay1.com 2 N|asahay2.com 2 N|asahay3.com 2 N|asahay.com 1 Y|asahay1.com 2 N|asahay2.com 2 N|asahay3.com 2 N|asahay3.com 2 N|asahay3.com 2 N'

    SELECT Column1,
    ltrim(left(ITEM,charindex(@delim,ITEM)-1)) AS Column2,
    ltrim(substring(ITEM,charindex(@delim,ITEM),@len)) AS ITEM
    FROM (
    SELECT
    ltrim(left(ITEM,charindex(@delim,ITEM)-1)) AS Column1,
    ltrim(substring(ITEM,charindex(@delim,ITEM),@len)) AS ITEM
    FROM split_string(@list, @rowdelim, @len)
    )T1



    now you can use the
    INSERT INTO testtable (col1, col2, col3)
    SELECT *
    FROM split_string_to_columns(@list, @delim, @rowdelim, @len)

    the split_string is the function mentioned above to split into rows.

    to get columns with the appropriate values ... I still am not sure why ',' is not working right ... any comments will be very welcomed.



  12. Adriaan New Member

    This parsing stuff doesn't need a UDF. Read the string, identify the individual values and where they should go, and make it work.

    This sort of procedure can only be tailor-made - you can't really come up with something generic.
  13. bhushanvinay New Member

    Hi can this be approached in a diff way
    select @values = 'asahay.com,,asahay1.com,2,asahay2.com,2,N '
    select @values = 'SELECT '''+@values
    select @values = replace(@values, ',' , ''' UNION ALL SELECT '''
    select @values = @values + ''''
    select @sql= 'insert into #temp '+ @values
    execute(@sql)
    select * from #temp. [:)]
    sneeky.
    This means that you can call the function in a FROM clause, like this:

    INSERT INTO target_table (column_list)
    SELECT *
    FROM split_string('asahay.com,1,Y|asahay1.com,2,N|asahay2.com,2,N', '|', 1)
  14. satya Moderator

    Appreciate your insight here, it might help users that are looking for such solution.

Share This Page