SQL Server Performance Forum – Threads Archive

Update query optimization

Hello, I am using an update query like this: UPDATE table SET field1 = 0 WHERE id > 0 and id < 100000 and field3= 1 table contains 1M records There clustered index on id and non clustered index on field 3 This query takes more than 30 sec to append (an update on all records of this table takes more than 4 minutes !!!!) [<img src=’/community/emoticons/emotion-6.gif’ alt=’

‘ />] Is there a way to optimize this ? Thanks

30s may not actually be that bad. From the information shown the query could be updating up to 100,000 records, how many does it actually update when run? How many columns are in the table? If the table has many columns then it will take longer to read in 100,000 rows than it would with fewer columns. What is the selectivity on field3 column like? Are all the values equal to 1, if they are then the only restriction will be on the clustered index so the index on field3 would be useless. If there are a small number of rows returned by the restriction on Field3 then it may be worth changing this to be the clustered index but not without testing first. Are there anymore indexes on the table? These could be degrading performance, especially if there is an index on Field1. Are there any triggers firing on the table?

If there are no ‘id’ values less than 0 then you don’t need the ‘id>0’ part. It could save you some index searches.

Perhaps you could break the where clause into multiple parts and run multiple copies of the update at the same time for more parallelism.
For example run the following queries at the same time: UPDATE table SET field1 = 0
WHERE id > 0 and id < 25000
and field3= 1 UPDATE table SET field1 = 0
WHERE id >= 25000 and id < 50000
and field3= 1 UPDATE table SET field1 = 0
WHERE id >= 50000 and id < 75000
and field3= 1 UPDATE table SET field1 = 0
WHERE id >= 75000 and id < 100000
and field3= 1
…

I think updaating the table in batches like roltoff suggested is one good option. In this case, i understans that there will be less amount of logging and less log consumption. This will consume less resources and hence the query should eecute faster… HTH. Gaurav

Batches are good but ‘parallelism’ may not be actually achieved. The UPDATE operator will execute serially. The remainder of the query i.e. the WHERE clause can execute in parallel. SQL Server identifies the rows affected by the data modification in parallel but it applies the changes serially. As for the log….I thought every modification (Table and index) has to be logged….

try by creating a index with id & field3 and avoid id < 0 Soma

Right. -Fragment your job into batches
-Remove any non-clustered indexes on Field1
-Have indexes on both the ID field and Field3.
-Run your job when nobody else is running any DML on the table You should get less than 30 secs. NHO

]]>