
SQL Server Performance Forum – Threads Archive

Deleting many rows – use clustered index?

Hi all,
I have a table that holds about 6 million records:
id (PK)
res_id (FK)
qst_id (FK)
number (smallint)
value (varchar(8000))

I've got some questions about how I can optimize some operations.
First, users of our application can start a cleanup operation that could affect, for instance, 100,000 rows. At the moment this has a severe impact on performance. The delete is done on the res_id column, which is part of a composite non-clustered unique index: res_id, qst_id, number. Is deleting faster when SQL Server can use a clustered index, and if so, should I consider making this index the clustered index, even though it's a composite index?
Any other suggestions on how to speed up a delete of this size?

The qst_id column refers to another table, named questions. I had to add an index on qst_id because when I wanted to delete a record in the questions table, SQL Server was checking that no records in this table referred to it. I don't really need that index for anything else. How can I get rid of it without dropping the foreign key?

The whole table would be a lot smaller if we split our database up and made a separate database for each client. Can we expect a large performance gain if we do this?

We need to support other character sets in the near future, so I have to change the varchar column to nvarchar. Does this affect performance? Will our database more or less double in size because of 2 bytes per character instead of one?

Thanks a lot for any suggestions.
Best regards, Jeroen
One way to get performance with a delete is: remove all indexes, perform the delete, then recreate the indexes.
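A minimal sketch of that approach; the table and index names below are hypothetical, since the thread doesn't give them:

-- Sketch only: substitute your own table/index names.
-- 1. Drop the indexes the DELETE doesn't need.
DROP INDEX results.IX_results_qst_id

-- 2. Perform the delete.
DELETE FROM results
WHERE res_id = 12345   -- example cleanup predicate

-- 3. Recreate the dropped indexes afterwards.
CREATE INDEX IX_results_qst_id ON results (qst_id)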
Luis Martin
Moderator
SQL-Server-Performance.com
All postings are provided "AS IS" with no warranties for accuracy.
Although, if you're doing a DELETE FROM table WHERE colx = ... it's advisable to retain the indexes that satisfy the WHERE clause, to speed it up. If you're doing a mass (indiscriminate) delete, however, remove all indexes. The downside of that is the associated table locks and the time lost recreating the indexes afterwards.
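To check whether the WHERE clause really is satisfied by an index before deciding what to drop, you can inspect the plan without executing the statement (the table name is again hypothetical):

SET SHOWPLAN_TEXT ON
GO
DELETE FROM results WHERE res_id = 12345   -- example predicate
GO
SET SHOWPLAN_TEXT OFF
GO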
Tom Pullen
DBA, Oxfam GB
If the rows to be deleted outnumber the rows to be kept, it's better to export the rows you want to keep, use TRUNCATE TABLE to clear the table, and then import the exported rows back into it.
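A rough sketch of that approach, assuming the table is called results (a hypothetical name) and id is an IDENTITY column; note that TRUNCATE TABLE will fail if other tables reference this one through foreign keys:

-- Export the rows to keep into a temp table.
SELECT id, res_id, qst_id, number, value
INTO #keep
FROM results
WHERE res_id <> 12345   -- example: everything we want to keep

TRUNCATE TABLE results   -- fast, minimally logged

-- Re-import; IDENTITY_INSERT is only needed if id is an IDENTITY column.
SET IDENTITY_INSERT results ON
INSERT INTO results (id, res_id, qst_id, number, value)
SELECT id, res_id, qst_id, number, value
FROM #keep
SET IDENTITY_INSERT results OFF

DROP TABLE #keep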
Satya SKJ
Moderator
http://www.SQL-Server-Performance.Com/forum
This posting is provided “AS IS” with no rights for the sake of knowledge sharing.
Thanks for all your suggestions. It looks like there’s not really much I can do.
There are far more rows to be kept than to be deleted, and some time ago I had to reindex the table myself, which took so much time that I'm afraid dropping and recreating the indexes is not an option. The index that's being used is not the clustered index. Can I expect some performance gain if I make that index clustered? It's a composite unique index (3 fields). Other options I'm considering:
– split the database into separate, smaller databases
– add a flag field to the table and do all deletes at night (see the sketch below)
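A sketch of what that second option might look like (table, column, and predicate are hypothetical; SET ROWCOUNT is the SQL Server 2000-era way to batch a DELETE):

-- Add the flag once.
ALTER TABLE results ADD to_delete bit NOT NULL DEFAULT 0

-- Daytime cleanup only marks rows, which is cheap.
UPDATE results SET to_delete = 1 WHERE res_id = 12345   -- example

-- A nightly job purges flagged rows in batches to limit locking
-- and transaction-log growth.
SET ROWCOUNT 5000
WHILE 1 = 1
BEGIN
    DELETE FROM results WHERE to_delete = 1
    IF @@ROWCOUNT = 0 BREAK
END
SET ROWCOUNT 0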
As long as free space is not an issue on the hard disk where the data files are kept, you can deploy the clustered index in place of the non-clustered one. Also check whether SQL Server is doing a lot of I/O. Tools like SQL Profiler and the Blocker Script output can tell you whether particular queries are generating these large I/Os; tuning or optimizing those individual queries from a statistics and index point of view may help from the disk I/O perspective. An improper index strategy can add to the disk subsystem's workload: for example, instead of doing an Index Seek, the SQL Server optimizer may choose a Table/Index Scan, resulting in a lot of unnecessary I/O. As you have seen, adding appropriate indexes reduces the load on the disks and thus the disk I/O. If the proper indexes already exist, adding more spindles to the RAID configuration may also help.
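To measure the I/O of an individual statement (hypothetical table name again), SET STATISTICS IO reports the logical and physical reads per table:

SET STATISTICS IO ON
DELETE FROM results WHERE res_id = 12345   -- example statement
SET STATISTICS IO OFF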
Satya SKJ
Moderator
http://www.SQL-Server-Performance.Com/forum
This posting is provided “AS IS” with no rights for the sake of knowledge sharing.
quote:I've got some questions about how I can optimize some operations.
First, users of our application can start a cleanup operation that could affect, for instance, 100,000 rows. At the moment this has a severe impact on performance. The delete is done on the res_id column, which is part of a composite non-clustered unique index: res_id, qst_id, number. Is deleting faster when SQL Server can use a clustered index, and if so, should I consider making this index the clustered index, even though it's a composite index?
Any other suggestions on how to speed up a delete of this size? The qst_id column refers to another table, named questions.

I would give this a try:
1. Make res_id the clustered index.
2. Keep id as a nonclustered PK.
3. Make qst_id, res_id, number a composite non-clustered index (if you really need that combination). You could instead use just qst_id + number, since res_id will be included anyway as part of the clustered key; if qst_id + number is selective enough, that could be even better.

Then test whether this improves performance.
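Expressed as DDL, assuming the table is called results and the primary key constraint PK_results (both names hypothetical, and assuming no other table references the id column; otherwise drop those foreign keys first):

-- Recreate the PK as nonclustered so res_id can be clustered.
ALTER TABLE results DROP CONSTRAINT PK_results

CREATE CLUSTERED INDEX IX_results_res_id ON results (res_id)

ALTER TABLE results
ADD CONSTRAINT PK_results PRIMARY KEY NONCLUSTERED (id)

-- Either the full combination, which keeps the uniqueness guarantee...
CREATE UNIQUE INDEX IX_results_qst_res_num ON results (qst_id, res_id, number)

-- ...or the shorter variant, since every nonclustered index
-- already carries the clustering key (res_id):
-- CREATE INDEX IX_results_qst_num ON results (qst_id, number)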
quote:The whole table will be a lot smaller if we split our database up, and make separate databases for each client. Can we expect a large performance gain if we do this?

Since you have just 6 million rows, that is probably not necessary. You can test whether the split would lessen the b-tree depth of your indexes. Let me know if you need a script for checking that.
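For reference, one way to check the depth on SQL Server 2000, along the lines of the script offered above (hypothetical table name; auto-created statistics may also show up in the output):

SELECT name AS index_name,
       INDEXPROPERTY(OBJECT_ID('results'), name, 'IndexDepth') AS depth
FROM sysindexes
WHERE id = OBJECT_ID('results')
  AND indid BETWEEN 1 AND 250   -- skip the heap (0) and text/image (255) rows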