SQL Server Performance

duplication in two tables

Discussion in 'General DBA Questions' started by Tahir, Feb 24, 2010.

  1. Tahir New Member

    Hi, I am having sql server 2005 enterprise edition with sp3. I am running a job for archiving older data. The job first copy data from one table2 to table1 and when the copy completes, it deletes the record from the orignial table. Hear is the job script:"
    BEGIN TRANSACTION
    Begin try
    declare @Ids bigint
    Set @Ids=0
    select @Ids=max(id) from table1
    INSERT INTO table1
    SELECT * FROM table2 where UtcTime < DateADD(DAY, -90, getdate());
    delete from table2 where id IN( SELECT id FROM table1 where id > @Ids)

    COMMIT TRANSACTION;
    END Try
    Begin Catch
    IF @@TRANCOUNT > 0
    ROLLBACK TRANSACTION;
    End Catch
    "Now when the job executes it copies records from table2 to table1. Next time the job does not copy any data cause when the jobs runs if finds duplication in both table1 and table2. My question is when the insertion in table1 successfully made, I delete those rows from table2 and if any thing happens every thing roll back. Where the problem is? I need help
  2. Adriaan New Member

    I think you first need to concentrate on the problem with duplicates that is occurring in table1.
    Solve that, and it may already eliminate the problems with the transfer to table2.
  3. Tahir New Member

    Thanks. OK but what in your opinion is the solution to avoid duplicate values being coming into table1. Do you thin i should make the id in table1 as autogenerated value?
  4. Adriaan New Member

    Table1 is your history table.
    For each key value from table2, do you want to see only the last archived row? Then you have two optons:
    (1) Update the matching rows in table1 (matching key, and answering criteria on UTCTime) then insert unmatched rows (answering criteria on UTCTime) from table2.
    (2) Delete the matching rows from table1 (matching key, and answering criteria on UTCTime) then insert all rows from table2 (answering criteria on UTCTime).
    If you want to keep previously archived rows, then either add an IDENTITY column to table1, and drop the uniqueness for the current unique constraint, or extend the key to table1 with the UTCTime.
  5. Tahir New Member

    OK but I didn't pick your answer clearly. Let me repeat. Table1 is my history table, since table2 is production table and i don't want to have so much records in it, that is why I archive 3 months older data in table1, when data successfully copies to table1 i delete these records from table2. In my script every thing is in Transaction, if the copy to table1 is successful then it should be deleted from table2 or the transaction should be roll back. Why the records are their in both table1 and table2.Thanks
  6. Adriaan New Member

    Your history table appears to have the same key constraints as your production table.
    Let's say you have a record with key value 'A' in your production table. You're moving it to your history table - so far, so good. Apparently your production table is getting populated again and there is another row with key value 'A'.
    So the next time you want to archive from your production table, you try to copy the row with key value 'A' to your history table. Since your history table has a unique constraint on that key value, the insert will fail.
    If you want to keep the old archived row, and add this newer row as well, you could add the UPC timestamp to the key constraint on the history table, and make sure the archiving routine will copy this value over too.
    If you only need the latest row with key value 'A', then you have two options, which will have the exact same result:

    (1)
    DELETE FROM history
    WHERE history.keyvalue
    IN (SELECT keyvalue FROM production WHERE ...archiving_criteria ...);
    INSERT INTO history (columns)
    SELECT columns
    FROM poduction
    WHERE ...archiving_criteria ...;
    (2)
    UPDATE history
    SET history.column = production.column, etc
    FROM history
    INNER JOIN production ON history.keycolumn = production.keycolumn
    WHERE ...archiving_criteria ...;
    INSERT INTO history (columns)
    SELECT columns
    FROM poduction
    LEFT JOIN history ON production.keycolumn = history.keycolumn
    WHERE history.keycolumn IS NULL
    AND ...archiving_criteria ...;
  7. Tahir New Member

    its ok i got it but my production table has identity column which is autogenerated value when it gets deleted it does not start from the deleted id but rather a new id is generated for the new record, then how can the old value get inserted into it(Production table). Thanks again for your help.
  8. Adriaan New Member

    Is the identity column the only unique key on the production table? It is often a substitute key, next to a natural key (which can consist of one or more columns). The natural key is probably what is causing the duplicate problems.
  9. Tahir New Member

    Yes the only unique column in the production table is identity. What I want to convey is that, let me explain the scenario with an example:table1 is having following records:"table1id Name1 Tahir2 Adriaan Now suppose I execute the following statement Delete from table1 where id = 2 select * from table1resultid Name1 TahirNow I insert another value in table 1insert Into table1 values('Adriaan')Now If I execute the following statementSelect * from table1 resultid Name1 Tahir3 Adriaan"This is what I want to convey, the new identity column is 3 not 2, when I delete an id from table1 then how does it come again in table1, which cause duplication.
  10. Adriaan New Member

    Identity columns can get reset ... Also note that if the identity column is not also the primary key, that you must add a unique constraint to the identity column in order to assure uniqueness.
    Anyway, in your example 'Tahir' and 'Adriaan' are the natural key. If there is no unique constraint on the production table (besides the one on the identity column) that doesn't mean there is not a candidate key - a (set of) column(s) that contains unique (sets of) values.
  11. Tahir New Member

    Yes we can reset the identity column means explicitly but if we don't want to reset it explicitly then SQL server don't reset explicitly. 2ndly what do you mean by Natural Key. 3rd thing I learn a lot from your discussion, can you give me a url where I can find all about keys in sql server and identity column. Is it possible to find out a collection of stored procedure on the net so that I learn from it how complex stored procedures are written. Thanks and you are doing a great service to humanity.

Share This Page