SQL Server Performance

Updates cap at 32,700 rows

Discussion in 'General Developer Questions' started by tmarrin1177, Mar 10, 2006.

  1. tmarrin1177 New Member

    Hi,

    I am running a transaction that loops through a table and runs a calculation, updating the result to a blank field in the same row. The transaction works correctly, except it fails to update beyond 32,700 rows. The code:

    declare @row_id int
    set @row_id = 1
    while @row_id < 800,001
    begin
    begin tran calc_log
    update dbo.MyTable set col_D = log(col_A/col_B)* log(col_A/col_B) where row_id = @row_id
    set @row_id = @row_id + 1
    commit tran calc_log
    print @row_id
    end

    This code executes up to row 32,700 and just hangs. I reran the process a few times and it always stops around row 32,700. Is this being caused by a sql_server resource issue? How can I rewrite the code to get this to run for a larger dataset?

    Thanks in advance,

    Tim

  2. Luis Martin Moderator

    I've moved to relevant Forum.

    BTW: The column row_id in MyTable is also int or smallint?



    Luis Martin
    Moderator
    SQL-Server-Performance.com

    Although nature commences with reason and ends in experience it is necessary for us to do the opposite, that is to commence with experience and from this to proceed to investigate the reason.
    Leonardo Da Vinci

    Nunca esperes el reconocimiento de tus hijos, eso ocurrirá luego de tu muerte


    All postings are provided “AS IS” with no warranties for accuracy.



  3. Tahsin New Member

    Is there a smallint data type declared somewhere in MyTable? Could you post the data definitions of MyTable along with the contents of the transaction calc_log?

    - Tahsin
  4. tmarrin1177 New Member

    Hi Luis,

    row_id is int

    Tim
  5. tmarrin1177 New Member

    Hi Tahsin,

    col_D, col_A, col_B are all real datatype

  6. tmarrin1177 New Member

    Observation: when I initially run this transaction, the speed of the updates is quite fast - as the updates pile up, the rate at which they execute is significantly slower, indicating possibly that a cache or server memory is filling up. Could there be a way to free up resources in the transaction?
  7. satya Moderator

    Refer to the KBAhttp://support.microsoft.com/?id=224453 and linkhttp://www.sqljunkies.com/Article/7F8518F9-FDAA-4FF3-8FC5-25E8946C8D0C.scuk that will help to assess the blocking issues whenever this transaction is pending and running that is causing CPU to a rocket high.


    BTW while this transaction is running how about the schedule of Transaction log backup and any observation on TEMPDB contention?

    HTH

    Satya SKJ
    Contributing Editor & Forums Moderator
    http://www.SQL-Server-Performance.Com
    This posting is provided “AS IS” with no rights for the sake of knowledge sharing.
  8. mmarovic Active Member

    Is there any row where col_a = 0 or col_b = 0? I am not sure what is the reason. What is db recovery model?

    Anyway, try next code:

    declare @row_id int
    set @row_id = 1
    while @row_id <= 800
    begin
    begin tran calc_log
    update dbo.MyTable set col_D = - square(log(col_A/col_B)) where row_id between @rowId and @row_id + 999
    set @row_id = @row_id + 1000
    commit tran calc_log
    print @row_id
    end
  9. Twan New Member

    Hi ya,

    Are there any negative numbers in col_A or col_B, or is col_B zero anywhere?

    Also what is the reason for updating one row at a time?

    rather than using "where row_id < 800001"

    transaction log is not likely to get massive unless colD is a brand new row and heavily indexed? another alternative to storing the data may be to create a calculated column or view and not store the data at all? (assuming that colD is not updated independently of A and B

    Doing a begin tran/commit for every single row will also be hurting performance, batch updating in chunks can be done bigger if performance is required...?

    Cheers
    Twan
  10. mmarovic Active Member

    Yes, I forgot to ask about negative values.
  11. tmarrin1177 New Member

    Hi,

    There are no negative values or 0's - as for why I am iterating through each row, I don't necessarily have a good reason...I looked up some transact sql examples online and put this together.

  12. tmarrin1177 New Member

    mmarovic, the change you suggested worked very well. I was unaware of the power of batch/bulk querying like this. Many thanks!

    Can you briefly explain what a db recovery model is? (db newbie I am)
  13. tmarrin1177 New Member

    Twan, thanks for the help. Can you explain how I can check the db transaction log?
  14. mmarovic Active Member

    Take a look at BOL (topic: Selecting a Recovery Model). Explanation there is much better then anything I can do in my broken english.
  15. tmarrin1177 New Member

    can this batch model be used to execute a stored procedure for each row instead of the log calculation? ie in replacement of a cursor (which I am finding VERY SLOW!!!)

  16. satya Moderator

    During this operation if you can maintain regular backups for Transaction log then the sizes will be taken care. Using FULL RECOVERY MODEL will help the database recoverability, but bear in mind you have to maintain regular intervals of backup log in order to manage the size of virtual log during the insert operations.

    Satya SKJ
    Contributing Editor & Forums Moderator
    http://www.SQL-Server-Performance.Com
    This posting is provided “AS IS” with no rights for the sake of knowledge sharing.
  17. druer New Member

    This might be a stupid suggestion but consider removing the PRINT @ROW_ID command that is part of the loop. It could be that the command is terminating because of the fact that 32,700 messages are being returned and that might be some kind of limit on the number of messages that can be processed in the query analyzer or by an sql connection in general. Or change it to print only if it is divisible by 100 or something to limit the number.

    Just a thought, going down a different path,
    Dalton
  18. mmarovic Active Member

    quote:Originally posted by tmarrin1177

    can this batch model be used to execute a stored procedure for each row instead of the log calculation? ie in replacement of a cursor (which I am finding VERY SLOW!!!)


    No. Cursor is slow for the same reason your original solution is slow. That reason is row by row processing. In sql world it is allways faster to use data set approach. Probably the fastest solution would be:
    update dbo.MyTable set col_D = - square(log(col_A/col_B))
    Problem with this solution is that would create long running transaction. On each checkpoint if db recovery model is simple, or when transaction log is backed-up (full recovery) sql server marks all space of completed transaction free for reuse. However, your transaction may be incomplete at the time that occurs (if you use code above) so it will keep producing log space that can't be reduced before transaction finishes. That log file size may go out of controll. This is the reason I rather suggested spliting transaction in smaller batches. It is still much faster then row by row processing, but at the same time allows keeping transaction log size under controll.

Share This Page