SQL Server Performance Forum – Threads Archive

Keeping Data History

Hello I need to design a database system that keeps all data history for some tables.
Every time a row in the table is updated, the entire old row should be kept as history either in the same table with a date column, or in a different table, But on the same database. What is the best course of action from performance, fragmentation and low maintenance point of view? Thanks

How often is the data updated? Can rows be deleted? do you want to track deletes? Where you do this depends on how you plan to use the data. You can create an ‘audit’ table that will store the ID of the row changed and one record for every update, with the old and new values. This model will only leave current data in your main table and you’d have to join to do point in time checks. You can ‘expire’ the existing row and insert a new one with the updated with a range of effective dates. Then all your selects would have to go on the effective/expired date ranges even when all you want is current data. You can write old rows into a new table with effective/expired dates. Your main table will only have current data. You would write a union type query when you wanted all data over time. As you can see, how you intend to access your data should drive how you set this up structurally. You can do any of these with a trigger on your main table. Chris

Yes, like Chris said. In triggers you can query the special "deleted" and "inserted" tables to look at the data "before" and "after" the event. The only problem is that you cannot refer to TEXT and NTEXT columns in those special tables – you can retrieve the "after" data on those columns by joining "inserted" to the actual table, but you cannot retrieve the "before" data for those same columns. We have created an auditing system with triggers in the past, on a column-by-column basis, using the IF UPDATED(<column_name&gt<img src=’/community/emoticons/emotion-5.gif’ alt=’

‘ /> syntax. The problem there is that you cannot use interactive T-SQL but you must write out the checks and INSERT statement for each column that you need to cover. With a row-based audit the coding would be easier, but for UPDATE events it will be very difficult to pinpoint exactly what was changed – though there is a bitwise trick that you can play with the UPDATED() function. Another advantage of a row-based system is that you can handle batch operations more easily.

Thanks you very much The table may be updated several thousands times a day. I plan on keeping Data which is a month or two old. I don’t care about keeping deleted rows, only updated ones. Most of my queries will fetch the new rows so I think I prefer the separate table with effective/expired date solution, but I’m worried about fragmentation problems since every update will actually cause insert +delete. Is it advisable to use an Instead Of Trigger for that?
Would that degraded performance since each update will cause the insert+delete?

quote:
Would that degraded performance since each update will cause the insert+delete?

Every UPDATE consists of an INSERT and a DELETE operation behind the scenes. That’s wy a trigger is able to work with the inserted and deleted tables.
As for fragmentation, see if this helps:
http://www.microsoft.com/technet/prodtechnol/sql/2000/maintain/ss2kidbp.mspx —
Frank Kalis
SQL Server MVP
http://www.insidesql.de

]]>