SQL Server Performance

How do I delete records based on a day of the week

Discussion in 'General Developer Questions' started by SanetteWessels, Aug 15, 2003.

  1. SanetteWessels New Member

    Let me explain...

We have an Archive table that grows by around 10 million records per day. I need to clean this table up: keep the last 2 weeks of data in full, and for anything older keep only the records for Mondays.

    DECLARE @dDate DATETIME
    SET @dDate = getdate() - 14

    DELETE tbl_ArchiveInventory
    WHERE @dDate > ArchiveInventoryDate
    AND DATENAME(DW, ArchiveInventoryDate) <> 'Monday'

    Is there any other way? I need speed...speed...speed!! [8D]

    Sanette



    SQL Developer
    JustEnough Software Corporation
  2. SQL_Guess New Member

Create a temp table.

[optional - will improve performance, may not be feasible] Choose your poison for getting minimal logging (set the recovery model to simple, bulk insert, etc. - bear in mind the potential impact on recoverability).

bcp out the data you need to keep (a far smaller subset than what you want to remove) to the temp table.

Truncate the real table.

bcp the data you need to keep from the temp table back into the real table.

Either drop the temp table, or truncate it.

Undo the minimal-logging changes above.

This would be specifically for your first, BIG run.
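A rough T-SQL sketch of those steps (untested; SELECT INTO stands in for the bcp out/in pair, and the keep rules are taken from the original question):

-- 1. copy the rows worth keeping: the last 2 weeks in full, plus older Mondays
SELECT *
INTO #Keep
FROM tbl_ArchiveInventory
WHERE ArchiveInventoryDate >= GETDATE() - 14
   OR DATENAME(DW, ArchiveInventoryDate) = 'Monday'

-- 2. empty the real table; TRUNCATE is minimally logged, unlike DELETE
TRUNCATE TABLE tbl_ArchiveInventory

-- 3. reload the keepers (if ArchiveInventoryID is an identity column, this
--    needs SET IDENTITY_INSERT ON and an explicit column list)
INSERT INTO tbl_ArchiveInventory
SELECT * FROM #Keep

-- 4. clean up
DROP TABLE #Keep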
  3. bambola New Member

    Try this:


declare @t table (d datetime)
declare @date datetime
select @date = '2003-08-15'

-- build the Mondays in the two-week window before @date
insert into @t
select dates.d
from (
    select dateadd(dd, seq.id, @date - 14) as d
    from (
        -- 0..29: a cross join of units and tens
        select (d0.id + d1.id) as id
        from (select 0 id union select 1 union select 2 union select 3 union select 4
              union select 5 union select 6 union select 7 union select 8 union select 9) d0,
             (select 0 id union select 10 union select 20) d1
    ) seq
) dates
where dates.d < @date
and datename(dw, dates.d) = 'Monday'
order by dates.d

delete
from tbl_ArchiveInventory
where ArchiveInventoryDate > dateadd(dy, -14, @date)
and dateadd(dd, 0, datediff(dd, 0, ArchiveInventoryDate)) NOT IN
    (select d from @t)

    Bambola
  4. gaurav_bindlish New Member

I would recommend having a computed column in the table for storing the day. Create a clustered or covering non-clustered index on the date and day columns.

The other option is to use DATEPART with the weekday option. String comparison is always slower than integer comparison.
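For illustration, a minimal sketch of that DATEPART approach (untested; SET DATEFIRST pins the weekday numbering, since DATEPART(dw, ...) otherwise depends on the session's language/DATEFIRST setting; table and column names are from the thread):

-- pin the weekday numbering so Monday is always 1, whatever the login's language
SET DATEFIRST 1

DECLARE @dDate DATETIME
SET @dDate = GETDATE() - 14

DELETE tbl_ArchiveInventory
WHERE ArchiveInventoryDate < @dDate
AND DATEPART(dw, ArchiveInventoryDate) <> 1  -- 1 = Monday under DATEFIRST 1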

    Gaurav
    Moderator
Man thrives, oddly enough, only in the presence of a challenging environment - L. Ron Hubbard
  5. gaurav_bindlish New Member

    Bambola, I think your query will prevent use of any index on ArchiveInventoryDate since you are using dateadd(dd, 0, datediff(dd, 0, ArchiveInventoryDate)). What do you say?

I did like the idea of storing the valid dates in a table, though. Can you rewrite the query using a join of the two tables?
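For what it's worth, a possible join rewrite looks something like this (untested sketch; it reuses @t and @date from Bambola's post above and keeps the same predicates, so the date-floor function still runs against the column and SARGability is not restored):

-- @t is the Monday table built in Bambola's post above
DELETE AI
FROM tbl_ArchiveInventory AI
LEFT JOIN @t t
    ON dateadd(dd, 0, datediff(dd, 0, AI.ArchiveInventoryDate)) = t.d
WHERE AI.ArchiveInventoryDate > dateadd(dy, -14, @date)
AND t.d IS NULL  -- no matching Monday in @t, so the row goes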

    Gaurav
    Moderator
Man thrives, oddly enough, only in the presence of a challenging environment - L. Ron Hubbard
  6. SQL_Guess New Member

We are talking about moving, depending on whether it is 10 million per weekday or per calendar day, 100 to 140 million rows + ((x - 2) Mondays x 10 million) rows... don't you think trying to do this as a non-logged operation would be better, or are you just trying to tune the SELECTs?

x being the number of Mondays currently in the table.



  7. SQL_Guess New Member

Another thought:

@bambola, you have:

    delete
    from tbl_ArchiveInventory
    where ArchiveInventoryDate > dateadd(dy, -14, @date)
    and dateadd(dd, 0, datediff(dd, 0, ArchiveInventoryDate)) NOT IN
    (select d from @t)

If I'm understanding SARGable correctly, "dateadd(dd, 0, datediff(dd, 0, ArchiveInventoryDate))" will make this not SARGable.

Since we have the data in @t in standard datetime format, why not just do:

    delete
    from tbl_ArchiveInventory
    where ArchiveInventoryDate > dateadd(dy, -14, @date)
    and ArchiveInventoryDate NOT IN (select d from @t)

    I'm sure I am misunderstanding something here ...
  8. gaurav_bindlish New Member

SQL_Guess, I think Bambola was trying to strip out the time portion and compare only the date part. This is because 03-11-2003 11:59:59.002 is not equal to 03-11-2003 12:00:00.000.

    Gaurav
    Moderator
Man thrives, oddly enough, only in the presence of a challenging environment - L. Ron Hubbard
  9. bambola New Member

I see everyone is so worried about that SARGable thing. But...
1 - who said that there is an index on that field?
2 - what about the DATENAME(DW, ArchiveInventoryDate) <> 'Monday'? It's a function, isn't it?

So to my understanding, in both cases it will go over all the rows that match the first condition and run the function on the column.

From the test I did on a table of 1.5 million rows, deleting 4000 rows, both queries took about 4 seconds without an index, and less than 100 ms with an index. In both cases Sanette's initial query was faster by 40-50 milliseconds.

I did find a way to match Sanette's query, but it's more complicated, and since performance does not improve, I'd still go with the query Sanette wrote.

Here it is if anyone is curious:

-- the first 4 bytes of a datetime hold the day number, so this compares
-- dates while ignoring the time portion
where cast(substring(cast(d_senddate as binary(8)), 1, 4) as int) >
      cast(substring(cast((dateadd(dy, -14, @date)) as binary(8)), 1, 4) as int)
and cast(substring(cast(@d as binary(8)), 1, 4) as int) NOT IN
    (select cast(substring(cast(d as binary(8)), 1, 4) as int) from @t)

So to answer your question, Sanette: add an index on the date column if you don't have one. Otherwise, wait for someone to come up with a better solution :)

SQL_Guess - your query will work only if ArchiveInventoryDate has only the date part and the time part is 0. If that is the case, you can remove the function I'm using and it will be SARGable.

Bambola.
If we are to achieve results never before accomplished, we must expect to employ methods never before attempted. - Francis Bacon
  10. Twan New Member

    What if you create a computed column called delete_date and set it according to your logic... here is some untested code (i.e. off the cuff...)

    alter table tbl_ArchiveInventory
    add DeleteRecordDate as case datename( dw, ArchiveInventoryDate )
    when 'Monday' then '9999-12-31'
    else dateadd( day, 14, ArchiveInventoryDate ) end

    create index ix_tbl_ArchiveInventory_DeleteRecordDate on tbl_ArchiveInventory( DeleteRecordDate )

    Delete from tbl_ArchiveInventory where DeleteRecordDate <= getdate()


    This could be modified if you want days only...

create function fn_get_DeleteRecordDays( @archivedate datetime )
returns int
as
begin
    -- style 112 gives yyyymmdd with zero padding, e.g. 20030815
    return convert( int, convert( varchar(8), @archivedate, 112 ) )
end

alter table tbl_ArchiveInventory
add DeleteRecordDays as case datename( dw, ArchiveInventoryDate )
when 'Monday' then 100000000 -- to make it greater than any yyyymmdd value (99991231)
else dbo.fn_get_DeleteRecordDays( dateadd( d, 14, ArchiveInventoryDate ) ) end

    create index ix_tbl_ArchiveInventory_DeleteRecordDays on tbl_ArchiveInventory( DeleteRecordDays )

    declare @days int
set @days = dbo.fn_get_DeleteRecordDays( getdate() )
    Delete from tbl_ArchiveInventory where DeleteRecordDays <= @days
  11. bambola New Member

    Sanette - could you tell us if this column has an index on it and if so, what kind of index?



    Bambola.
    If we are to achieve results never before accomplished, we must expect to employ methods never before attempted. - Francis Bacon
  12. SanetteWessels New Member

Wow! Since I last checked this site, things have heated up!! [:D]

Bambola, there is a primary clustered key on ArchiveInventoryId (identity column) and a unique index on InventoryID and ArchiveInventoryDate with a fill factor of 90%.

I have also tried some other options. Please have a look at my code. It removed around 300 million records and took 21 hours! (I must just add that it removed records from 2 tables.)

DECLARE @iIncrement INT
SET @iIncrement = 5000 -- number of InventoryIDs to work with in one delete statement

CREATE TABLE #ItemsToDelete (ArchiveInventoryID INT)
CREATE CLUSTERED INDEX IDX_temptbl_ArchiveInventoryID ON #ItemsToDelete (ArchiveInventoryID)

DECLARE @dDate DATETIME
    , @iIDMin INT
    , @iIDMax INT

SET @dDate = CONVERT(DATETIME, FLOOR(CONVERT(FLOAT, getdate()))) - 14 -- need to save 2 weeks of daily records

SELECT @iIDMin = MIN(ArchiveInventoryID) FROM tbl_ArchiveInventory
SELECT @iIDMax = MAX(ArchiveInventoryID) FROM tbl_ArchiveInventory

WHILE @iIDMin <= @iIDMax
BEGIN
    IF OBJECT_ID('tempdb..#ItemsToDelete') IS NOT NULL
        TRUNCATE TABLE #ItemsToDelete

    INSERT INTO #ItemsToDelete (ArchiveInventoryID)
    SELECT ArchiveInventoryID
    FROM tbl_ArchiveInventory
    WHERE ArchiveInventoryID < @iIDMin + @iIncrement
    AND ArchiveInventoryID >= @iIDMin
    AND @dDate > ArchiveInventoryDate
    AND DATENAME(DW, ArchiveInventoryDate) <> 'Monday'

    DELETE tbl_ArchiveInventoryStatus
    FROM #ItemsToDelete T
    INNER JOIN tbl_ArchiveInventory AI
        ON T.ArchiveInventoryID = AI.ArchiveInventoryID
    INNER JOIN tbl_ArchiveInventoryStatus AIS
        ON AI.ArchiveInventoryID = AIS.ArchiveInventoryID
    WHERE @dDate > AI.ArchiveInventoryDate
    AND DATENAME(DW, AI.ArchiveInventoryDate) <> 'Monday'

    DELETE tbl_ArchiveInventory
    FROM #ItemsToDelete T
    INNER JOIN tbl_ArchiveInventory AI
        ON T.ArchiveInventoryID = AI.ArchiveInventoryID
    WHERE @dDate > AI.ArchiveInventoryDate
    AND DATENAME(DW, AI.ArchiveInventoryDate) <> 'Monday'

    SET @iIDMin = @iIDMin + @iIncrement
END

Sanette

SQL Developer
JustEnough Software Corporation
------------------------------------------------------------------------------------
"Walking on water and coding of a spec is easy, as long as both are frozen" - A friend
  13. bambola New Member

Could you tell us how often you have to run this procedure?
I think the problem doesn't come from the date column, but from deleting so
many rows on tables that have a clustered index...
I would try dropping the indexes, maybe keeping a non-clustered one on the joined column, looping over the table with SET ROWCOUNT x, and recreating the indexes after the procedure is done.
I will try to do some tests later on and let you know what I come up with.
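A minimal sketch of that SET ROWCOUNT batching idea (untested; the batch size of 100,000 is an arbitrary assumption, and the index drop/recreate steps are omitted):

DECLARE @dDate DATETIME
SET @dDate = CONVERT(DATETIME, FLOOR(CONVERT(FLOAT, GETDATE()))) - 14

SET ROWCOUNT 100000              -- cap each delete at 100k rows
WHILE 1 = 1
BEGIN
    DELETE tbl_ArchiveInventory
    WHERE ArchiveInventoryDate < @dDate
    AND DATENAME(DW, ArchiveInventoryDate) <> 'Monday'

    IF @@ROWCOUNT = 0 BREAK      -- nothing left that matches
END
SET ROWCOUNT 0                   -- back to unlimited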

    Bambola.
  14. SanetteWessels New Member

It will only run once a month, or maybe more often if needed. But bear in mind, we have a window period of 8 hours every night to do the rest of our number-crunching stuff.
The reason I added the while loop was so that if they need to 'crash' the proc, they can, without waiting for the rollback that would happen if it were all in one statement.

    Sanette

    SQL Developer
    JustEnough Software Corporation
    ------------------------------------------------------------------------------------
    "Walking on water and coding of a spec is easy, as long as both are frozen" - A friend
  15. SanetteWessels New Member

I also think it has to do with the table size. I have noted that SQL Server gets slower and slower with some statements as table size grows. I am not talking marginally slower, but sometimes 2 or 3 times slower. That is also another reason why I added the WHILE loop, so that only x records are handled in one go. The fewer records SQL has to deal with, the faster it works. [?]

So are we then saying this is it, and the user will either have to run the procedure more often or just be patient for it to finish?

    Sanette

    SQL Developer
    JustEnough Software Corporation
    ------------------------------------------------------------------------------------
    "Walking on water and coding of a spec is easy, as long as both are frozen" - A friend
  16. bambola New Member

Another thing I would consider is keeping the Monday rows in an archive table. In that case, instead of deleting so many rows, you would copy all the Mondays into the archive table and then truncate the big one.
That might be the fastest way.
And if your queries need to work on both tables, you can create a view that UNION ALLs the two tables, as sketched below.
And yes, I would run it more often if the tables grow so fast.
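A minimal sketch of that copy-and-truncate idea with a UNION ALL view (untested; the archive table and view names are assumptions):

-- copy the Mondays out; SELECT INTO is minimally logged
SELECT *
INTO tbl_ArchiveInventoryMondays   -- archive table name is an assumption
FROM tbl_ArchiveInventory
WHERE DATENAME(DW, ArchiveInventoryDate) = 'Monday'

-- the most recent two weeks of non-Monday rows would need copying back after this
TRUNCATE TABLE tbl_ArchiveInventory
GO

-- queries that need both tables can read through one view
CREATE VIEW vw_ArchiveInventory
AS
SELECT * FROM tbl_ArchiveInventory
UNION ALL
SELECT * FROM tbl_ArchiveInventoryMondays
GO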

    Bambola.

  17. SanetteWessels New Member

    Thanks, Bambola much appreciated!

    Sanette

    SQL Developer
    JustEnough Software Corporation
    ------------------------------------------------------------------------------------
    "Walking on water and coding of a spec is easy, as long as both are frozen" - A friend
  18. bambola New Member

You're fast!
Anyway, let us know what you end up doing and how it performs.

    Bambola.
  19. SQL_Guess New Member

    quote:Originally posted by SanetteWessels

It will only run once a month, or maybe more often if needed. But bear in mind, we have a window period of 8 hours every night to do the rest of our number-crunching stuff.
The reason I added the while loop was so that if they need to 'crash' the proc, they can, without waiting for the rollback that would happen if it were all in one statement.

    Sanette

    SQL Developer
    JustEnough Software Corporation
    ------------------------------------------------------------------------------------
    "Walking on water and coding of a spec is easy, as long as both are frozen" - A friend

Sanette,

Was there any reason you couldn't go with my original proposal of bcp'ing out only the data you want to keep, truncating the table, then reinserting that data?

I would have estimated that what you keep is less than what you need to delete (assuming you had more than about 4 1/2 weeks of data). Was the delete list smaller than the keep list?

Also, can you look at turning off logging? Your deletes are all logged, afaics (as far as I can see).

CiaO (or Totsiens)
  20. SanetteWessels New Member

Good morning SQL_Guess,

The only problem is that data gets into that table via a "Consolidation" run. Our system is an automated one and there is no user intervention; everything happens on a schedule. We do not have access to our customers' databases to do such operations (BCPs etc.). Plus, that table is accessed throughout the day, and there is no "downtime" on the users' side in which I could remove all the records from the table and move the data back from the temp table. But not a bad idea, though! Thanks for your input.

Goodbye! [:D]

Sanette

SQL Developer
JustEnough Software Corporation
------------------------------------------------------------------------------------
"Walking on water and coding of a spec is easy, as long as both are frozen" - A friend
  21. SanetteWessels New Member

SQL_Guess, you mentioned "turning off logging"? Where do I do that?

    Sanette
  22. Twan New Member


    You don't have any delete triggers on the table do you? If so, then these would also slow down deletes by an order of magnitude (especially if the triggers were written assuming single row deletes and not optimised for multi-row deletes...)

    How many indexes are there on the table? These would also reduce delete performance

    Did you try adding a computed column being the date after which the record can be deleted, create a non clustered index on that column and delete based on that?

    I'd also try locking the table to start off with, rather than trying to lock 400m individual rows...(I realise that lock escalation will kick in at some point)

re the procedure above, I would change the # table to a table variable and change the deletes as below (NOTE though that the procedure is inserting 400m rows and deleting 800m rows, so this could well be slower than deleting the rows from the table directly...?)

    DECLARE @iIncrement INT
    SET @iIncrement = 5000 -- number of InventoryID's to work with in one delete statement

    declare @ItemsToDelete table(ArchiveInventoryID INT)

    DECLARE @dDate DATETIME
    , @iIDMin INT
    , @iIDMax INT

    SET @dDate = CONVERT(DATETIME,FLOOR(CONVERT(FLOAT,getdate()))) - 14 -- need to save 2 weeks of daily records.

    SELECT @iIDMin = MIN(ArchiveInventoryID) FROM tbl_ArchiveInventory
    SELECT @iIDMax = MAX(ArchiveInventoryID) FROM tbl_ArchiveInventory

    WHILE @iIDMin <= @iIDMax

BEGIN
    begin tran

    TRUNCATE TABLE @ItemsToDelete

    INSERT INTO @ItemsToDelete (ArchiveInventoryID)
    SELECT ArchiveInventoryID
    FROM tbl_ArchiveInventory
    WHERE ArchiveInventoryID BETWEEN @iIDMin AND @iIDMin + @iIncrement
    AND @dDate > ArchiveInventoryDate
    AND DATENAME(DW, ArchiveInventoryDate) <> 'Monday'

    DELETE tbl_ArchiveInventoryStatus
    FROM @ItemsToDelete T
    -- loop probably not required but definitely do not want a hash join
    -- additional where clause not required since @ table only has deletable rows
    INNER LOOP JOIN tbl_ArchiveInventoryStatus AIS
        ON T.ArchiveInventoryID = AIS.ArchiveInventoryID

    DELETE tbl_ArchiveInventory
    FROM @ItemsToDelete T
    -- loop probably not required but definitely do not want a hash join
    -- additional where clause not required since @ table only has deletable rows
    INNER LOOP JOIN tbl_ArchiveInventory AI
        ON T.ArchiveInventoryID = AI.ArchiveInventoryID

    SET @iIDMin = @iIDMin + @iIncrement

    commit tran

END

    Cheers
    Twan
  23. SQL_Guess New Member

Morning Sanette,

    By "turning off logging", I meant:

Using minimally logged operations, like TRUNCATE, bcp, or BULK INSERT. Due to their minimal logging, these operations are faster than plain inserts/deletes. Bear in mind the following, from BOL:


    quote:When using the full recovery model, all row-insert operations performed by bcp are logged in the transaction log. For large data loads, this can cause the transaction log to fill rapidly. To help prevent the transaction log from running out of space, a minimally logged bulk copy can be performed if all of these conditions are met:

    The recovery model is simple or bulk-logged.

    The target table is not being replicated.

    The target table does not have any triggers.

    The target table has either 0 rows or no indexes.

    The TABLOCK hint is specified. For more information, see Controlling the Locking Behavior.
Any bulk copy into an instance of Microsoft® SQL Server™ that does not meet these conditions is logged.

Obviously, changing the recovery model has implications for the recoverability of the DB, hence my original idea was for your "mass" initial unload, where I assumed you would have some kind of window. Don't you just hate 24x7?
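A minimal sketch of the recovery-model switch around such a load (untested; the database name, file paths, and the BULK INSERT stand-in for bcp are assumptions; note that switching models breaks the log backup chain until a full backup is taken):

-- switch to bulk-logged so bcp/BULK INSERT can be minimally logged
ALTER DATABASE ArchiveDB SET RECOVERY BULK_LOGGED

-- the load itself (path and table are placeholders)
BULK INSERT tbl_ArchiveInventory
FROM 'C:\keep_rows.bcp'
WITH (TABLOCK)   -- TABLOCK is one of the minimal-logging conditions from BOL

-- switch back and re-establish the backup chain
ALTER DATABASE ArchiveDB SET RECOVERY FULL
BACKUP DATABASE ArchiveDB TO DISK = 'C:\ArchiveDB_full.bak'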

    CiaO
  24. SanetteWessels New Member

    quote:You don't have any delete triggers on the table do you? If so, then these would also slow down deletes by an order of magnitude (especially if the triggers were written assuming single row deletes and not optimised for multi-row deletes...)
    No triggers

    quote:How many indexes are there on the table? These would also reduce delete performance
    1 Clustered on ArchiveInventoryID and 1 Unique on InventoryID and ArchiveInventoryDate

I am going to implement your suggestions, Twan, and will let you know what the result was. This will only be tomorrow, as I only work from 8 to 1 and the process takes a number of hours to execute. Thanks!

    SQL Guess: Don't you just hate 24x7
    YIP!!

    Sanette
  25. bambola New Member

    > Did you try adding a computed column being the date after which the record can be deleted,
    > create a non clustered index on that column and delete based on that?

The date column already has an index, so adding the calculated column is not needed. Also keep in
mind that the table has a clustered index, which means that once a row is deleted, all non-clustered
indexes will need to be updated, so I don't think it is a good idea. It is probably deleting
a row with a clustered index + updating other indexes + logging that's causing the problem,
not finding the date column.

    > I'd also try locking the table to start off with, rather than trying to lock 400m individual
    > rows...(I realise that lock escalation will kick in at some point)

Locking the table means no other processes would be able to access it (unless they use the NOLOCK hint),
and I understood that is not possible. Maybe this is a case for letting SQL handle the locks. It will
probably not be a row-level lock anyway, but page locks and KEY locks, since the index is updated.

The table datatype can be a good idea, as long as the row count is not too high; when it is, a temporary table gets created instead. In my test with 5000 rows there was no problem and the table datatype was used.

    > TRUNCATE TABLE @ItemsToDelete
    You cannot truncate a table datatype. You'd have to go with delete here.

Another idea that might solve both the logging and the performance issues -
still thinking about moving the Mondays rather than deleting the other days.
Since SELECT INTO is a minimally logged operation:

SELECT * -- all columns, since this table will replace the original
INTO tbl_ArchiveInventory_new
FROM tbl_ArchiveInventory
WHERE DATENAME(DW, ArchiveInventoryDate) = 'Monday'
AND ...

    DROP TABLE tbl_ArchiveInventory

    EXEC sp_rename 'tbl_ArchiveInventory_new', 'tbl_ArchiveInventory'

    Create whatever indexes you had on tbl_ArchiveInventory

In my opinion, it should be much faster. I don't know about the implications, but I remember
once opening a trace to see how ALTER TABLE is handled (when doing it from EM), and this is what I saw in it: insert into new_table, drop table, and rename new_table.

    Bambola.
  26. SQL_Guess New Member

    @bambola,

Unfortunately, Sanette's 24x7 requirement kinda negates my truncate-and-move thoughts, as well as any table renaming (see my original post regarding a way to do this, back when this thread was young and innocent).

    It's ugly, but I don't know what more she can do.

PS: at 400 million rows, isn't SQL going to escalate to table locks? Doesn't that mean we are all the way back to the table being locked exclusively, and no one can get in?

@Sanette - when you ran this for 21 hours, were users able to access the table at all?

    CiaO
  27. bambola New Member

Truncating and renaming the table should be quite fast, so I don't think there is a problem here. The SELECT from tbl_ArchiveInventory into the new table can be done with the NOLOCK hint to avoid lock problems. To handle new inserts, you can keep the last id inserted and then add the rows inserted after it (I am assuming rows are not updated). So a lock would be held only for the time it takes to add the new rows (so you won't lose information) and while you recreate the indexes on the new table. That seems much less than with the other possibility. Actually, it should be pretty fast; I would give it a shot - something like the sketch below.
And this procedure should probably run more often, so the table never gets this big.
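A hedged sketch of that sequence (untested; it assumes rows are never updated and that ArchiveInventoryID is an ever-increasing identity, as mentioned earlier in the thread):

DECLARE @iLastID INT
SELECT @iLastID = MAX(ArchiveInventoryID) FROM tbl_ArchiveInventory (NOLOCK)

-- minimally logged copy of the rows worth keeping; readers are not blocked
SELECT *
INTO tbl_ArchiveInventory_new
FROM tbl_ArchiveInventory (NOLOCK)
WHERE ArchiveInventoryID <= @iLastID
AND DATENAME(DW, ArchiveInventoryDate) = 'Monday'  -- plus the two-week keep rule

-- short exclusive window: pick up rows that arrived during the copy, then swap
BEGIN TRAN
INSERT INTO tbl_ArchiveInventory_new
SELECT * FROM tbl_ArchiveInventory
WHERE ArchiveInventoryID > @iLastID

EXEC sp_rename 'tbl_ArchiveInventory', 'tbl_ArchiveInventory_old'
EXEC sp_rename 'tbl_ArchiveInventory_new', 'tbl_ArchiveInventory'
COMMIT TRAN

-- recreate the indexes the original table had, then drop the old copy
DROP TABLE tbl_ArchiveInventory_old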

    Bambola.
  28. Twan New Member

>>
The date column already has an index, so adding the calculated column is not needed. Also keep in mind that the table has a clustered index, which means that once a row is deleted, all non-clustered indexes will need to be updated, so I don't think it is a good idea. It is probably deleting a row with a clustered index + updating other indexes + logging that's causing the problem, not finding the date column.
>>

The date column that is there now is not selective enough. The idea with adding the computed column is that only a small proportion of the table ever has the computed date less than getdate(). This should allow fast deletion via the index.

Updating the other indexes will need to be done irrespective of having a clustered index or not... after all, the row is being deleted... :)

Yep, logging will be adding a massive overhead, but that's one you can't get around if you want the database to remain recoverable...

Locks should escalate from row to page to table, so yes, users will feel the effect of this...

During the run it might be worth 'profiling' (Profiler and Perfmon) what happens, also with regard to checkpointing, log resizing, etc.

Cheers
Twan
  29. SanetteWessels New Member

    Just to give you all an update....

    I implemented Twan's suggestions yesterday and it ran 14 hours. Better than 21 hours!

Seeing that this proc is still in development, it has not been tested on the production server yet, so I do not have any feedback from the users as to accessing the table or not.

This process will form part of the 'daily' run that gets executed at night. (That is IF I can get it to run within a 30-minute window!! [:0]) This is the only time that I will have exclusive rights to the tables mentioned. If I can get either the BCP option or the INSERT INTO option done in this time frame, then it's fine; otherwise I will have to go with what I have now, and the users will just have to run and kill this proc every day until such time as the table is "cleared".

    Sanette

    SQL Developer
    JustEnough Software Corporation

    quote:Walking on water and coding of a spec is easy, as long as both are frozen - A friend
  30. Twan New Member


    30 minutes to delete 400 million rows sounds a little ambitious to me...

perhaps some more lateral thinking is required... Are you able to create a partitioned table with a view combining it all back together? Then you can redefine your view as and when required and drop/create tables.

so if you have up to 13 tables (i.e. one for each non-Monday in the fortnight plus one for Monday), each set up with the same columns/datatypes, etc., but with a check constraint on ArchiveInventoryDate for either a specific date and/or datepart - something like the sketch below
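A hedged sketch showing just two of the member tables (untested; names, columns, and the DATEPART-based check constraints are assumptions, and DATEPART(dw, ...) depends on the session's DATEFIRST setting):

-- one member table per weekday bucket, each with a CHECK constraint
CREATE TABLE tbl_ArchiveInventory_Mon (
    ArchiveInventoryID INT NOT NULL PRIMARY KEY,
    ArchiveInventoryDate DATETIME NOT NULL
        CHECK (DATEPART(dw, ArchiveInventoryDate) = 2)  -- Monday under the default DATEFIRST 7
)

CREATE TABLE tbl_ArchiveInventory_Tue (
    ArchiveInventoryID INT NOT NULL PRIMARY KEY,
    ArchiveInventoryDate DATETIME NOT NULL
        CHECK (DATEPART(dw, ArchiveInventoryDate) = 3)  -- Tuesday
)
GO

CREATE VIEW vw_ArchiveInventoryAll
AS
SELECT ArchiveInventoryID, ArchiveInventoryDate FROM tbl_ArchiveInventory_Mon
UNION ALL
SELECT ArchiveInventoryID, ArchiveInventoryDate FROM tbl_ArchiveInventory_Tue
GO

-- clearing a day is then a TRUNCATE of one member table instead of a huge DELETE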

not sure, just thinking out loud

really thinking that bcp'ing in several hundred million rows will take > 30 minutes, and that's without any indexes...

    Twan

  31. SQL_Guess New Member

Let's face it: 30 mins means way fewer deletes.

How many rows are inserted per day? 10,000,000? How many can you delete in 30 minutes? You may very well want to consider something like a controlled delete process, doing batches. Remember, if you are trying to delete 300 million in 30 mins and the users kill the SP... all your deletes roll back, and you've achieved nothing. How about looping through in smaller chunks of, say, 1 million at a time? Granted, 300 x 1-million-row deletes > 1 x 300-million-row delete in total execution time, BUT it is better to delete 20 million a day than to try to do 300 million, get killed, and tomorrow have to try 310 million...

I like the idea of changing to a partitioned table.

@Sanette: you say you insert 10 million a day? Via batch load, or by individual inserts? If you're doing it via a batch, can't we try and change the insert process to remove unnecessary data? This goes back to my original idea of dumping the KEEP data into a table, adding the new data, truncating the original table, renaming the KEEP table to prod, and away you go...

Otherwise, to be honest, if you're trying to keep 23.5 x 7 (your 30-minute window), then the trickle delete may be the only way to go... and that's assuming you can delete more in a day than you insert; otherwise you're always going to fall further and further behind, until you can force them to give you a real window to clean up.

    Good Luck !!!
  32. SanetteWessels New Member

    <blockquote id="quote"><font size="1" face="Verdana, Arial, Helvetica" id="quote">quote:<hr height="1" noshade id="quote">you say you insert 10 Mil a day ? Via batch load, or by individual inserts <hr height="1" noshade id="quote"></blockquote id="quote"></font id="quote"><br />Yes (sometimes more, sometimes less) and yes it is via batch mode.<br /><br /><blockquote id="quote"><font size="1" face="Verdana, Arial, Helvetica" id="quote">quote:<hr height="1" noshade id="quote">change the insert process to remove unnecessary data <hr height="1" noshade id="quote"></blockquote id="quote"></font id="quote"><br /><br />I will definitely build that in once I get this table cleared of all the unwanted stuff.<br /><br />I have also added a filter than will clear a million at a time. (That is fairly fast.. around 8 min). So, yes thanks for all the help so far. The user will just have to run this proc over and over as often as possible during the day. [<img src='/community/emoticons/emotion-2.gif' alt=':D' />]<br /><br />( NS: Brrr dis BAIE koud vandag!!)<br /><br />Sanette
  33. SQL_Guess New Member

    <blockquote id="quote"><font size="1" face="Verdana, Arial, Helvetica" id="quote">quote:<hr height="1" noshade id="quote"><i>Originally posted by SanetteWessels</i><br /><br />I will definitely build that in once I get this table cleared of all the unwanted stuff.<br /><br />I have also added a filter than will clear a million at a time. (That is fairly fast.. around 8 min). So, yes thanks for all the help so far. The user will just have to run this proc over and over as often as possible during the day. [<img src='/community/emoticons/emotion-2.gif' alt=':D' />]<br /><hr height="1" noshade id="quote"></blockquote id="quote"></font id="quote"><br /><br />Well, good luck with that1<br /><br /><blockquote id="quote"><font size="1" face="Verdana, Arial, Helvetica" id="quote">quote:<hr height="1" noshade id="quote"><br />( NS: Brrr dis BAIE koud vandag!!)<br /><br />Sanette<br /><hr height="1" noshade id="quote"></blockquote id="quote"></font id="quote"><br /><br />@Sanette - Jy woon nie in ons lieflikke sneeuland, hier in die Wes-Kaap ? Ek is in Kaapstad , btw.
  34. SanetteWessels New Member

quote:Don't you live in our lovely snow country, here in the Western Cape? I'm in Cape Town, btw.
No, in Johannesburg. I wish I lived in the beautiful Cape! Good luck with the cold!!

    Sanette

    SQL Developer
    JustEnough Software Corporation

    quote:Walking on water and coding of a spec is easy, as long as both are frozen - A friend
