SQL Server Performance

Deleting Duplicate Records in a table

Discussion in 'General Developer Questions' started by desabhatla, Jul 8, 2003.

  1. desabhatla New Member

    Hi,
    I need a help from all of you. I have a SQL Server table where data comes from an external Oracle System which contains around 35 columns. We are picking up only 5 fields from the source which contains one ID column and description fields for the rest 4 fields. My table also consists of a primary key id and the 5 columns picked from the Oracle system. My problem arises from the fact that few duplicate records get inserted into my table although the primary identifying key is different. The description fields of few records are same although the Id of them are different. I have restriction of not able to picking up distinct values from the Source and whatever I have to do is at my system only. Now my purpose is to delete one of the sets of the duplicate records that I have at my current system. I am not able to do it using the query I used. Can anyone of you please help eliminate only one set of the two duplciate records set. My query below can eliminate all the records including the original and duplcate records.


    select * FROM TB_APP035_SUB_BUSINESS_TIER3
    WHERE EXISTS (SELECT NULL FROM TB_APP035_SUB_BUSINESS_TIER3 b WHERE
    b.SUB_BUSINESS_NAM_TIER3=TB_APP035_SUB_BUSINESS_TIER3.SUB_BUSINESS_NAM_TIER3 and
    b.SUB_BUSINESS_ID=TB_APP035_SUB_BUSINESS_TIER3.SUB_BUSINESS_ID and
    b.BUS_TEAM_TIER3_CD=TB_APP035_SUB_BUSINESS_TIER3.BUS_TEAM_TIER3_CD and
    b.BUS_TEAM_TIER2_CD=TB_APP035_SUB_BUSINESS_TIER3.BUS_TEAM_TIER2_CD
    GROUP BY
    b.SUB_BUSINESS_NAM_TIER3,b.SUB_BUSINESS_ID,b.BUS_TEAM_TIER3_CD,b.BUS_TEAM_TIER2_CD
    HAVING
    count(b.SUB_BUSINESS_ID)=2
    )

    Thanks in advance for your help.

    Thanks & Regards,
    Rajesh Desabhatla
    GE India Business Center
    Fifth Floor, Edison Towers,
    S P Road
    Hyderabad 500 004 India.
    Tel Office # +91 40 2788 1839
    Tel Residence # +91 40 2717 5490
    Cell # 9885207592

  2. bambola New Member

    Assuming SUB_BUSINESS_ID is an IDENTITY column.


    SELECT *
    FROM TB_APP035_SUB_BUSINESS_TIER3
    WHERE EXISTS
    (
    SELECT NULL
    FROM TB_APP035_SUB_BUSINESS_TIER3 b
    WHERE b.SUB_BUSINESS_NAM_TIER3 = TB_APP035_SUB_BUSINESS_TIER3.SUB_BUSINESS_NAM_TIER3
    AND b.SUB_BUSINESS_ID = TB_APP035_SUB_BUSINESS_TIER3.SUB_BUSINESS_ID
    AND b.BUS_TEAM_TIER3_CD = TB_APP035_SUB_BUSINESS_TIER3.BUS_TEAM_TIER3_CD
    AND b.BUS_TEAM_TIER2_CD = TB_APP035_SUB_BUSINESS_TIER3.BUS_TEAM_TIER2_CD
    HAVING TB_APP035_SUB_BUSINESS_TIER3.SUB_BUSINESS_ID < MAX(b.SUB_BUSINESS_ID)
    )
    To delete replace the first line with DELETE.

    Bambola.
  3. desabhatla New Member

    Thanks bambola. Your solution worked very well.

    Thanks & Regards,
    Rajesh Desabhatla
    GE India Business Center
    Fifth Floor, Edison Towers,
    S P Road
    Hyderabad 500 004 India.
    Tel Office # +91 40 2788 1839
    Tel Residence # +91 40 2717 5490
    Cell # 9885207592
  4. desabhatla New Member

    Bambola,
    A new problem has arisen because of the new implementational query. As above our data comes from external Oracle Datasource and each time new duplicate records get created because of the problem which I mentioned above. Now if I use

    HAVING TB_APP035_SUB_BUSINESS_TIER3.SUB_BUSINESS_ID < MAX(b.SUB_BUSINESS_ID)

    it is showing me records of earlier existing record and not the newly added duplicate record(Which has a greater SUB_BUSINESS_ID than the existing record). The old record has already associations with some other tables and hence I am unable to delete that record. Is it possible that I get the newly added duplicate record which has the larger value of SUB_BUSINESS_ID than the existing record.

    Thanks in advance for your help.


    Thanks & Regards,
    Rajesh Desabhatla
    GE India Business Center
    Fifth Floor, Edison Towers,
    S P Road
    Hyderabad 500 004 India.
    Tel Office # +91 40 2788 1839
    Tel Residence # +91 40 2717 5490
    Cell # 9885207592
  5. bambola New Member

    If SUB_BUSINESS_ID can only increment, than you need to keep the MIN(SUB_BUSINESS_ID) value and delete the others.
    HAVING TB_APP035_SUB_BUSINESS_TIER3.SUB_BUSINESS_ID > MIN(b.SUB_BUSINESS_ID)

    Bambola.

Share This Page