Hi, All: I have a huge table in a production DB need to do some cleanup duties, I decide to only keep one month data in that table and delete all others. I am afraid this may cause a serious performance hit because on the average there are about 10~15 rows inserted into this table per second. The primary key of this table is a INTEGER with IDENTITY defined on it. The system can not afford long down time. Thanks John
the size of the table should not impact the insert performance in a significant manner. if you only want to keep a small amount of data to keep the size manageable, that is another matter. the fastest way to delete a high percentage of row from a large table is 1. delete rows from tables that have a FK to this table 2. drop the FK's to this table 3. drop indexes 4. select the row you want to keep into a new table 5. drop the existing table 6. rename the new table to the original name 7. create indexes 8. create FK you can test the time by just doing step 4.
Yes, the best way to minimize interruptions to such a heavily used table is to create another table and copy all the data you want to keep into the new table and then rename the new table to the actual one. This way the only time the users will be interrupted is during the rename and possibly during indexing.
Step 5 is "drop the existing table", does SQL Server allows a table been dropped if it is currently been used? Will this action crash these applications use the table?
you should drop existing table and rename the new table inside a transaction with isolation level or lock hints that will not allow others to access it. both are fast operations and should not create a problem (once you obtain an exclusive lock on the existing table). Other processes would simply have to wait till it's done. creating indexes though might be more or an issue depending on the size of the (new) table and the indexes. Bambola.
How can that be done though while allowing an insert rate of 15 rows/sec? I'd suggest that you'll need a job that kicks off every hour or so... it creates a result set of the rows you want to delete (a nolock select into a temp table, or a read only nolock cursor) and then deletes each one in turn inside a transaction. The speed of deletion need not be that important here, as long as the insert rate can be maintained... Is that a fair assumption? As an aside at the rate of growth your identity field will run out within 5 years, is that acceptable? Cheers Twan
Hi, Twan:<br /><br />So you think an insert rate of 15 rows/sec makes joechang's proposal not workable?<br /><br />Actually I create a job very similar to your way:<br /><br />STEP1:<br /><br />--Run this loop:<br /><br />set rowcount 10000 --delete 10,000 rows each time<br />while 1=1 <br /> begin<br /> DELETE FROM table1 WHERE tdate < DATEADD(dd, -30, GETDATE())<br /> IF @@rowcount = 0<br /> BREAK<br /> ELSE<br /> CHECKPOINT<br /> end<br /><br />STEP2:<br /><br />--Run this query:<br /><br />IF NOT EXISTS (SELECT 1 FROM table1 WHERE tdate < DATEADD(dd, -30, GETDATE()))<br /> BEGIN<br /> UPDATE msdb..sysjobs<br /> SET enabled = 0<br /> WHERE name = 'deletejob'<br /> END<br /><br /><br />Then I create another job called 'deletejob' to call the above job, and the 'deletejob' is running every 10 minutes.<br /><br />Unfortunately, at traffic time (around 8pm) sometimes it still creates some locks...<br /><br />I did RESEED IDENTITY last year.<br /><br />By the way, I scare my manager to death when I told him I plan to delete the table and recreate it, a good Halloween joke <img src='/community/emoticons/emotion-1.gif' alt='' /><br /><br />John
zxmssp: I had many similar situations like the one you're describing. Usually, I would solve the problem by batching the deletes and using rowlocks. Yes, it's quitely slow but it gives you the maximum flexibility because it would be deleting the table in chunks and at the same time won't affect concurrency. Here's an example: -- Delete one thousand rows at a time: declare @CHUNK_SIZE int, @rowcount int set @CHUNK_SIZE = 1000 set rowcount @CHUNK_SIZE set @rowcount = 1 -- Keep deleting 1000 at a time until all are deleted: while @rowcount > 0 begin delete from some_hot_table with (rowlock) where some_condition = 'true' set @rowcount = @@rowcount end
Hi ya, yes I think that in your case Joe suggestion is unfortunately not going to help. The selecting of the rows to keep will take a considerable amount of time, by the time it has finished there will be hundreds of new rows Also - ensure that you have an index on tdate (possibly even a clustered index if it makes sense with the rest of the columns/indexes/accesses) - I'd use a parameter into the proc rather than a variable or calculation and use with recompile on the proc - Combine both of the above into a single job, also making the enhancement that homam talked about above The job set to do Code: @declare @tdate set @tdate = DATEADD(dd, -30, GETDATE()) exec proc_delete_rows @tdate with recompile [code] also a proc is created as follows [code] create proc proc_delete_rows( @tdate datetime ) as begin -- Delete one thousand rows at a time: declare @CHUNK_SIZE int, @rowcount int set @CHUNK_SIZE = 1000 set rowcount @CHUNK_SIZE set @rowcount = @CHUNK_SIZE -- Keep deleting 1000 at a time until all are deleted: while @rowcount = @CHUNK_SIZE begin begin tran delete from table1 with (rowlock) WHERE tdate <= @tdate set @rowcount = @@rowcount commit tran end -- while end -- proc [code] Cheers Twan
Hi, Twan: I run the procedure manually and I receive the following error message: Server: Msg 1204, Level 19 SQL Server has run out of LOCKS. Rerun your statement when there are fewer active users, or ask the system administrator to reconfigure SQL Server with more LOCKS. First I think it is caused by the ROWLOCK hint, so I remove it, but the thing happens again. So I issue a DELETE statement directly by only deleting one day's data, guess what, the ghost appears again. But when I query the table, I found the data has been successfully deleted, is this error message a bogus one?
It almost sounds like you're executing the entire procedure within another transaction...? As long as the delete is batched and committed after each batch then you shoudn't get a lock error... Cheers Twan
Hi, Twan: I don't understand what you mean "you're executing the entire procedure within another transaction"? I have two procedures and one job, one procedure contains the while loop and another procedure execute the first procedure, then a job to run the second procedure everyday. Is it possible that the server was not releasing the locks fast enough? Do you think it is helpful that I add WAITFOR DELAY after the "commit tran" statement? begin begin tran delete from table1 with (rowlock) WHERE tdate <= @tdate set @rowcount = @@rowcount commit tran WAITFOR DELAY '00:00:05' end Thanks John
I use performance monitor to monitor the normal lock request in the database (without run the job) Lock Name Average Min Max Datababase 46 0 139 Extent 4 0 77 Key 109782 0 373145 Page 4411 0 17530 RID 75149 0 309422 Table 178 2 378 I also monitor the lock memory(KB) Average Min Max Lock momory (KB) 1334 808 2008 The momery of the server is 2GB and 1GB is assigned to SQL Server, dynamically controlled by SQL Server. Is it normal to have so many locks on the server? Run sp_configure on the server and the the option for locks is: locks5000214748364700 Thanks John
Hi John, What I meant was having a begin tran, commit tran around the procedure call itself. If the procedure is run as it is then the commit should release the lock immediately as it is part of committing the transaction, so a waitfor won't help I'm at a loss to explain why it might be running out of locks... anyone else have any ideas? Cheers Twan