SQL Server Performance

Using loops to improve performance?

Discussion in 'SQL Server 2005 T-SQL Performance Tuning' started by null, Oct 29, 2008.

  1. null New Member

    I've seen this a handful of times over the years...
    When updating "large" tables, say, a table with 200-300 million rows, instead of updating the entire table at once, the developer chose to update chunks at a time: maybe a set of 10,000 or 100,000 records per batch. Aside from that handful of times, though, I've seen far more instances where loops were in use, were refactored into a single update, and performance improved dramatically.
    So why is there a pattern of using loops to update chunks of data when the data set is "large"? What makes this more manageable for the server vs. sticking with a single update?
  2. madhuottapalam New Member

    In general, in set-based operations looping can easily be avoided with a join. But your scenario is a classic example of where you may want to use a loop (or cursor). If you update the whole table in one stretch, there is a chance of the system escalating to a table lock, and every other process that wants to access the table has to wait. If you do the update in chunks, the table lock can be avoided. Madhu
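
    The chunked approach described above can be sketched as a simple batch loop. This is a minimal illustration, not anyone's exact code from the thread; the table and columns (dbo.Orders, IsArchived, OrderDate) are hypothetical, and SQL Server 2005's UPDATE TOP (n) limits each pass to a small batch so locks stay short-lived:

    ```sql
    -- Update a large table in 10,000-row batches instead of one statement.
    -- Each batch acquires its own (smaller, shorter-lived) locks.
    DECLARE @rows INT;
    SET @rows = 1;

    WHILE @rows > 0
    BEGIN
        UPDATE TOP (10000) dbo.Orders
        SET IsArchived = 1
        WHERE IsArchived = 0
          AND OrderDate < '20080101';

        SET @rows = @@ROWCOUNT;  -- 0 once no qualifying rows remain
    END
    ```

    Note that the WHERE clause must exclude rows already updated, otherwise the loop would reprocess the same batch forever.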
  3. FrankKalis Moderator

    The main reason is to save server resources (log space, locks, etc.) by splitting one large transaction into multiple smaller ones.
  4. null New Member

    Thanks Guys.
    I suspected this may have had something to do w/saving resources w/locks and the transaction log. I'm familiar w/how that could work w/locks, but how does the benefit carry over to the transaction log? Is it similar, in that the transaction log is locked while a transaction is being logged? If that's the case, I could see how logging much smaller transactions would keep concurrency up. Is this correct?
    Also, is this method sometimes used as a crutch for insufficient hardware? One server where I've seen this done has since had a pretty massive hardware upgrade. Would it be a bad idea to see how a single update would perform now?
  5. mufford New Member

    Hi null, (I love your handle, btw!)
    As someone who has performed many batch updates on large data sets (100m+), I thought I'd share some of the reasons why I prefer them over single, massive updates.
    1) Transaction size. I prefer many small transactions over one large transaction. One reason is that a transaction remains active until it's committed, which prevents that portion of the log from being truncated. Thus, smaller transactions mean a smaller impact on the transaction log.
    2) Error handling. This goes hand-in-hand with #1. In the event of an error or the need to cancel the process (e.g. unexpected blocking occurs), smaller transactions will result in a shorter rollback time. The rollback will usually take at least as long as the update has been executing; meaning, if you get 30 minutes into a large update before the unexpected occurs, it will take at least 30 minutes to roll back. This also means that when an error occurs, you won't need to start all over again; you can pick up from the last record updated.
    3) System resources. Depending on the process, executing the update in separate chunks can lessen the stress on tempdb and reduce disk contention.
    4) Throttling. On large updates, it's prudent to assume there may be a noticeable impact on the server. You may want to build in a delay as a "breather" so the server can catch up on any pending requests. Having a way to throttle the batch size or the delay between batches can be useful if the impact is greater than initially anticipated, or if you need to temporarily "pause" while another process finishes (e.g. an executive report).
    Obviously, there may be instances where an update is an "all-or-nothing" event, in which case chunking may not be possible. However, if you have a choice in the matter, I suggest always breaking the update into smaller pieces.
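
    Points 1 and 4 above can be combined in one loop: each batch committed as its own transaction (so the log can be truncated between batches), with a delay as the "breather". This is a hedged sketch, not mufford's actual script; the batch size, delay, table, and column names are all illustrative:

    ```sql
    -- Throttled batch update: each batch is its own transaction,
    -- with a short pause between batches to let other work through.
    DECLARE @BatchSize INT;
    DECLARE @rows INT;
    SET @BatchSize = 10000;
    SET @rows = 1;

    WHILE @rows > 0
    BEGIN
        BEGIN TRAN;

        UPDATE TOP (@BatchSize) dbo.Orders
        SET IsArchived = 1
        WHERE IsArchived = 0;

        SET @rows = @@ROWCOUNT;
        COMMIT;  -- committing per batch keeps the active log small

        IF @rows > 0
            WAITFOR DELAY '00:00:02';  -- the "breather" between batches
    END
    ```

    Because @BatchSize and the WAITFOR interval are ordinary values, they are easy to tune (or set from a config table) if the server impact turns out greater than expected.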
  6. mufford New Member

  7. null New Member

    I appreciate the added feedback Michelle, thx for taking the time to add that.