SQL Server Performance

Cursor tuning, parsing comma-separated list

Discussion in 'T-SQL Performance Tuning for Developers' started by jaywt, Nov 21, 2002.

  1. jaywt New Member

    I have a table with three varchar fields containing comma-separated lists, and an integer key. Each of the lists contains 1 to about 10 values; in a single row, each list contains the same number of items; and the lists are ordered such that the 1st item of the 1st list is intended to be paired with the 1st item of the second list, and so on. The table contains about a 300,000 rows.<br /><br />A sample row would look something like:<br /><br />key list1 list2 list3<br />1 a,b,c 1,2,3 x,y,z<br /><br />The goal is to "unpack" the table into a new table so that we get something like the following:<br /><br />key value1 value2 value2<br />1 a 1 x<br />1 b 2 y<br />1 c 3 z<br /><br />and so on.<br /><br />In Access, I wrote a VB function to accomplish this using ADO Recordsets in basically a two-level nested loop structure. The first level looped through the rows of the original table; the second parsed the lists within the row, appending one row to a new table for each set of items parsed.<br /><br />In SQL Server, I attempted to write a stored procedure using the same logic. In the sp, I loop through the rows via a cursor, then use a 2nd While loop to parse the row, inserting a new row for each set of items parsed. Much to my suprise, the Access version works over ten times faster, finishing within about 8 minutes, writing an average of 161,613 rows per minute to the new table, whereas the SQL Server version writes an average about 13,551 rows per minute to the destination table! [xx(]<br /><br />Since both the number of items within the lists as well as the number of characters in each item are variable, I can't think of a way to do this without a cursor.<br /><br />Is the performance difference I'm seeing typical of SQL Server cursors vs looping through ADO recordsets in Access VB? If not, would anyone have a thought as to what I might be doing wrong?<br /><br />Any advice would be greatly appreciated. <br /><br />Thank you! [<img src='/community/emoticons/emotion-1.gif' alt=':)' />]<br /><br />Jay<br /><br /><br /><br /><br />
  2. bradmcgehee New Member

    SQL Server cursors, if possible, should generally be avoided, as they are never good performers. If you have looked at the tips I have on cursors, you might find some ideas on how to improve the performance of your cursor.

    Ideally, you should avoid using a cursor, and instead use a derived query (there is an article on them on this website, or if that won't work, try using temp tables. I don't have enough information to determine which method is best for you, but I am confident that you can do what you want to do without using a cursor, using one of the other methods instead.


    ------------------
    Brad M. McGehee
    Webmaster
    SQL-Server-Performance.Com
  3. arb New Member

    Without giving this too much thought, you could use a query similar to the one below to split your comma-seperated fields. The performance of this is not too bad, and is much better than using a cursor or a while loop to split a string. This code will only split the first list (list1) - you could insert the results of this into a temp table, then repeat for the other lists and join the three temp tables... Shouldn't be too hard to figure out.


    -- This piece of nastiness splits a comma seperated string and returns a table of values
    SELECT ID,
    NULLIF(SUBSTRING(',' + list1 + ',' , COMMA.Pos , CHARINDEX(',' , ',' + list1 + ',' , COMMA.Pos) - COMMA.Pos) , '') AS AppNum
    FROM (
    SELECT (A0.ID + A1.ID + A2.ID) AS Pos
    FROM
    (SELECT 0 ID UNION SELECT 1 UNION SELECT 2 UNION SELECT 3 UNION SELECT 4 UNION
    SELECT 5 UNION SELECT 6 UNION SELECT 7 UNION SELECT 8 UNION SELECT 9) A0,
    (SELECT 0 ID UNION SELECT 10 UNION SELECT 20 UNION SELECT 30 UNION SELECT 40 UNION
    SELECT 50 UNION SELECT 60 UNION SELECT 70 UNION SELECT 80 UNION SELECT 90) A1,
    (SELECT 0 ID UNION SELECT 100 UNION SELECT 200 UNION SELECT 300 UNION SELECT 400 UNION
    SELECT 500 UNION SELECT 600 UNION SELECT 700 UNION SELECT 800 UNION SELECT 900) A2
    ) AS COMMA, tblLists
    WHERE COMMA.Pos <= LEN(',' + list1 + ',')
    AND SUBSTRING(',' + list1 + ',' , COMMA.Pos - 1, 1) = ','
    AND CHARINDEX(',' , ',' + list1 + ',' , COMMA.Pos) - COMMA.Pos > 0
    ORDER BY ID, COMMA.Pos
  4. jaywt New Member

    quote:Originally posted by bradmcgehee

    SQL Server cursors, if possible, should generally be avoided, as they are never good performers. If you have looked at the tips I have on cursors, you might find some ideas on how to improve the performance of your cursor.

    Ideally, you should avoid using a cursor, and instead use a derived query (there is an article on them on this website, or if that won't work, try using temp tables. I don't have enough information to determine which method is best for you, but I am confident that you can do what you want to do without using a cursor, using one of the other methods instead.

    I have tried most of the cursor tuning recommendations posted on this site; but since the consensus seems to be that there is a way to do this without cursors, I'll focus my attention there for now. Thank you very much for your advice.

    Jay


  5. jaywt New Member

    quote:Originally posted by arb

    Without giving this too much thought, you could use a query similar to the one below to split your comma-seperated fields. The performance of this is not too bad, and is much better than using a cursor or a while loop to split a string. This code will only split the first list (list1) - you could insert the results of this into a temp table, then repeat for the other lists and join the three temp tables... Shouldn't be too hard to figure out.


    -- This piece of nastiness splits a comma seperated string and returns a table of values
    SELECT ID,
    NULLIF(SUBSTRING(',' + list1 + ',' , COMMA.Pos , CHARINDEX(',' , ',' + list1 + ',' , COMMA.Pos) - COMMA.Pos) , '') AS AppNum
    FROM (
    SELECT (A0.ID + A1.ID + A2.ID) AS Pos
    FROM
    (SELECT 0 ID UNION SELECT 1 UNION SELECT 2 UNION SELECT 3 UNION SELECT 4 UNION
    SELECT 5 UNION SELECT 6 UNION SELECT 7 UNION SELECT 8 UNION SELECT 9) A0,
    (SELECT 0 ID UNION SELECT 10 UNION SELECT 20 UNION SELECT 30 UNION SELECT 40 UNION
    SELECT 50 UNION SELECT 60 UNION SELECT 70 UNION SELECT 80 UNION SELECT 90) A1,
    (SELECT 0 ID UNION SELECT 100 UNION SELECT 200 UNION SELECT 300 UNION SELECT 400 UNION
    SELECT 500 UNION SELECT 600 UNION SELECT 700 UNION SELECT 800 UNION SELECT 900) A2
    ) AS COMMA, tblLists
    WHERE COMMA.Pos <= LEN(',' + list1 + ',')
    AND SUBSTRING(',' + list1 + ',' , COMMA.Pos - 1, 1) = ','
    AND CHARINDEX(',' , ',' + list1 + ',' , COMMA.Pos) - COMMA.Pos > 0
    ORDER BY ID, COMMA.Pos

    That is one dense piece of SQL! I will definitely give it a try. Thanks very much!

    Jay

Share This Page