SQL Server Performance

Duplicate Records Error

Discussion in 'SQL Server 2005 Integration Services' started by rohit2900, Aug 7, 2008.

  1. rohit2900 Member

    Hi,
    We have a SSIS package running on one server and its doing lost of tasks including updating, inserting records in different tables on different servers.
    Now in one step its creating a temp table and inserting records in it (max will be around 250 records at once) and the data in this table will be used to insert records in a other table on some other database, now after this we are adding a identity column in it and initializing it with the max of PK of the table in which data is to be inserted + 1, and then inserting all the records in the destination table. Now my concern is while inserting these records if some users insert any records through front-end then my insert in SSIS will fail.
    FYI: We cannot have a table level lock while inserting.
    I've suggested the below solution
    Instead of adding an identity column in temp table, use cursor to insert one by one record and in each record take max of PK and add 1 for that particular insert.
    Can you guys suggest any more solutions and what impact my solution will be having on performance and how should I test its performance as to what extent it will affect it.
    Rohit
  2. Adriaan New Member

    Assuming the data in the temp table has a unique key that is also a unique key on the target table, I would add a nullable column on the temp table for the identity value.
    Insert the data into the target table.
    Update the column on the temp table with the identity values from the target table, joining the two on the unique key column(s).
  3. rohit2900 Member

    Hi...
    This is not what I'm lookin for..please read my post again.
    What I'm looking for is we have a package running on server A and it will insert some records in a table on server B. it will create a temp table with the exact schema as the destination table and after inserting records in temp table except the PK column, its initilizing it with the max of PK in destination table plus 1 with increment of one its making all the records unique (for e.g if max of PK is 1000 then in the temp table it will start with 1001 for first, 1002 for second and theron) and then inserting all the records in the destination table, including the PK which is not auto generated. now if in between if any use inserts some records in the table then my insert which is in the SSIS package will fail as PK cannot be duplicate.
    I think I've explained the scenerio.


  4. Jon M Member

    Maybe you can add a script that will do the following:
    1. Reserve the rows by inserting X amount of rows into the destination table. When I say "reserve", you will insert rows into the destination table depending on the number of rows to be inserted from the temp table. But you need to have a field that will determine which rows were reserved on the destination table...maybe you will have to add a field named STATUS that you will initially set to 'Pending'.
    2. Update the temp table keyfield using the keyfield from the destination table where status is pending. Maybe you can use a cursor or ForEach Loop container.
    3. Insert the records from the temp table by using an update statement. Something like this:
    update destination
    set destination fields = a.fields
    from temp a
    where destination.keyfield = a.keyfield and destination.STATUS ='Pending'
    Jon M
  5. rachel New Member

    Why isn't the PK on the destination table autogenerated? If it's a sequential numeric ID, I think that would be the simplest solution.
  6. Madhivanan Moderator

    Try something like thisdeclare
    @t1 table(i int, v varchar(100))
    insert into @t1
    select top 10 row_number() over (order by name),name from sysobjectsdeclare
    @t2 table(i int, v varchar(100))
    insert into @t2
    select max_i+i,v from @t1 cross join (select max(i) as max_i from @t1) as tselect
    * from @t1
    select * from @t2
  7. rohit2900 Member

    Thx Guys for your Prompt response...
    But I would like to add few more things as I can't make any data model change and whatever I've to do is in the package only. And can you guys through some light on the solution I suggested like what can be the pros & cons of that approach. or any new logic that I can add in existing package.
    Rachel: I know that the simplest and best solution but the data model is been designed in this way a long back and we are doing some enhancements and as I said earlier I can't make any data model change.
    Thanks,
  8. Madhivanan Moderator

    [quote user="rohit2900"]
    Thx Guys for your Prompt response...
    But I would like to add few more things as I can't make any data model change and whatever I've to do is in the package only. And can you guys through some light on the solution I suggested like what can be the pros & cons of that approach. or any new logic that I can add in existing package.
    Rachel: I know that the simplest and best solution but the data model is been designed in this way a long back and we are doing some enhancements and as I said earlier I can't make any data model change.
    Thanks,
    [/quote]
    Did you try my suggestion?
  9. rohit2900 Member

    Madhivanan: I tried your query, but frankly speaking I didn't understand as what exactly this query is doing. As its giving two sets of results with different id's and same names....1-10 in first and 11-20 in the second set... And how should I use it in my package?
    Can you explain it .
    Thanks.
  10. Madhivanan Moderator

    [quote user="rohit2900"]
    Madhivanan: I tried your query, but frankly speaking I didn't understand as what exactly this query is doing. As its giving two sets of results with different id's and same names....1-10 in first and 11-20 in the second set... And how should I use it in my package?
    Can you explain it .
    Thanks.
    [/quote]
    It is just an example. You can get some ideas from it [:)]

Share This Page