SQL Server Performance

Delete rows or mark them as not valid

Discussion in 'T-SQL Performance Tuning for Developers' started by Cesar, May 27, 2005.

  1. Cesar New Member

    Is it a good practice delete rows in a table when they are no longer valid? If so, it can be deleted whatever number of rows and in whatever position?

    Or is it always better add a column in the table and simply put a signal in that column for the rows that are no longer valid? If so, is there a limit of ‘no valid#%92 rows in a table with this system?

    Thanks
  2. Madhivanan Moderator

    It is better not to keep unwanted records. So delete them.
    If you want to keep them, then use trigger to move them to temporary table and dont keep them with the active table.


    Madhivanan

    Failing to plan is Planning to fail
  3. dineshasanka Moderator

    If you need historical data do not delete instead put a status filed and make it D (disble) or S suspened.
    if not definetly better to delete them

    quote:Originally posted by Cesar

    Is it a good practice delete rows in a table when they are no longer valid? If so, it can be deleted whatever number of rows and in whatever position?

    Or is it always better add a column in the table and simply put a signal in that column for the rows that are no longer valid? If so, is there a limit of ‘no valid#%92 rows in a table with this system?

    Thanks

  4. satya Moderator

    If they are not used then I will consider another database or table as archived for historic MI purpose.

    Satya SKJ
    Moderator
    http://www.SQL-Server-Performance.Com/forum
    This posting is provided “AS IS” with no rights for the sake of knowledge sharing.
  5. Cesar New Member

    Ok I see, then can I delete as many rows as I want anywhere? And this will not affect the table or DB performance because the identity column (also Primary Key in my tables) will have a lot of gaps without the correlative index? I mean for example the primary column values like this: 1, 2, 3, 15, 16, 25, 26, 27, 50, 51, 75, 76, 105, 106, 107, 180, 181, 182, etc.
  6. FrankKalis Moderator

    Sure, you can delete as much rows from a table as there are. That depends on the WHERE clause.
    Performance shouldn't suffer, as gaps in the IDENTITY sequence have nothing to do with performance.

    --
    Frank Kalis
    SQL Server MVP
    http://www.insidesql.de
  7. Madhivanan Moderator

    What are you going to have problem if there are gaps? I think logically it is correct


    Madhivanan

    Failing to plan is Planning to fail
  8. satya Moderator

    After a while you can reseed the continuity of Identify field.

    Satya SKJ
    Moderator
    http://www.SQL-Server-Performance.Com/forum
    This posting is provided “AS IS” with no rights for the sake of knowledge sharing.
  9. Cesar New Member


    You may wonder why I ask this. And it is because I was told things like this:

    The follow answer is in response to my suggestion to ‘update#%92 a set of repeating rows in a frequent user process, deleting the current inserted rows and insert the new ones (only when the user decides to change the sent data) :

    quote: You never want to just delete and reinsert EVERYTHING. That's crazy and will result in some really messed up indexes, a lot of contention, and generally just bad performance. This is especially true if it's a high transaction system.


  10. joechang New Member

    yes deleting does create gaps in indexes, thats why there is the index defrag and reindex feature, or you just simply drop the index and recreate manually, either way, the indexes are no longer messed up.

    my considerations: if the table does not have many indexes, and if none of the execution plans show table scans, or would otherwise need to read many inactive rows at the index leaf level, you could just leave the rows in place with a status flag.

    if the table does have many indexes, and the archive table does not need many indexes, or if you have queries that must scan all active records, then do move the inactive records to a archive table
  11. Cesar New Member

    My tables only has only one automated index (Identity and Primary Key at the same time) to identify the records. So, I can delete all the current rows of a specified customer ordering process when he/she decides change the sent data, and then insert the new ones? Because I understand that 'the gaps' will not affect the performance, and because I suppose that delete and reinsert data (as a system to 'update' repeating rows) will not have any other problems.<br /><br />Which are your opinions on this? (It' s very important to me) [<img src='/community/emoticons/emotion-1.gif' alt=':)' />]<br /><br />Thanks<br /><br />
  12. FrankKalis Moderator

    http://www.microsoft.com/technet/prodtechnol/sql/2000/maintain/ss2kidbp.mspx

    --
    Frank Kalis
    SQL Server MVP
    http://www.insidesql.de
  13. FrankKalis Moderator

  14. Cesar New Member

    Hi [<img src='/community/emoticons/emotion-1.gif' alt=':)' />],<br /><br />Is an Online Transaction Processing (OLTP)
  15. Cesar New Member

  16. Cesar New Member

    Solved, now I already can see the article
  17. ghemant Moderator

    Dear sir,
    Its good practice to mark the invalid rows using FLAG/STATUS if u dont want to delete the row, i do the same way since last 2 yrs, and it doesnt affect db performance bcoz whenever the query fired it have to have the criteria as conditioned for FLAG and thus , or its feasible that u can move the data to another db for archive it move to another db to warehouse it...

    thanx.

    hsGoswami
    ghemant@gmail.com
  18. dhilditch New Member

    For data like this - customer details - bear in mind that Data Protection issues come into play and it may be that by law you MUST delete the data. If you can keep the data and want to keep historical changes, I like the following solution:

    Have a column called status tinyint type, default it to 0.

    Create a view for your table that your application will use which is:-

    SELECT * FROM mytable
    WHERE status = 0

    Use that to select from and insert to.

    When it comes to deleting, since you are inserting afterwards, use the following code to delete:

    UPDATE mytable
    set status = status + 1
    where customerid = X

    And then you will have nice versioned data stored in your database. You can archive this off later if you need to or delete as appropriate to comply with data protection.

    Put a clustered index on customerid with 100% fill factor, a non-clustered index on status (not clustered as you are updating this column) and bobs your uncle.

    Dave

    david@matiogi.com
    www.matiogi.com
  19. rockmoose New Member

    Nice technique Dave.
    However it will be difficult to maintain FK references to the table, unless there is s surrogate key as well.
    I disagree about the 100% fillfactor on customerid, this will force sql server to reallocate the data when a new
    version of a particular customerid is inserted.
  20. dhilditch New Member

    I was basing the 100% fillfactor on having an identity property on the customerid column. Whenever a new customer is added, the primary key value will always be larger than any that already exist so no re-organisation is needed and in fact it would be wasteful with a primary key on your customerid column to not set a 100% fill factor in my opinion.

    And regarding FK references, surely there wouldn't be an FK reference to the status column? Except maybe a static status table that describes each status value if you want but I'd avoid that.

    david@matiogi.com
    www.matiogi.com
  21. rockmoose New Member

    Ok,
    I was thinking You kept the same customerid and incrementing the status everytime there was a new version of the customer.
  22. dhilditch New Member

    Yes I was thinking that.

    david@matiogi.com
    www.matiogi.com

Share This Page