SQL Server Performance

Help with finding distinct sets of data

Discussion in 'ALL SQL SERVER QUESTIONS' started by darkangelBDF, Apr 22, 2015.

  1. darkangelBDF Member

    Hi there. I have a question regarding filtering data that is unique. I know this sounds a bit vague. Let me explain. I have a live table (TableA) with a PK consisting of Field1 + Field2 + Field3.

    I also have a staging table where data gets loaded into before being pumped into the live table. There is no primary key on this table as they would like to trap duplicates as part of the error logging. My question is this... How would I trap the duplicates when Field1 could appear more than once, Field2 could appear more than once and Field3 could appear more than once. I.e. COUNT(FieldName) will not help as either of these fields can have one value appear more than once.

    Is there any way I can do an IF EXISTS (find duplicates)?
  2. moh_hassan20 New Member

    If the primary key is: Field1 , Field2 , Field3 , and you import these data into the staging table , so duplicate rows is : any row that have the same (Field1 , Field2 , Field3)
    so to get these duplicate rows
    Code:
    select field1, field2, field3  , count(*) row_count
    from mystagingTable
    group by field1, field2, field3
    HAVING COUNT(*) > 1
    
    you can do more process on these duplicates using Common Table Expressions (CTE) ,
    example: find duplication of field1 within the duplicted rows:
    Code:
    ;with cte
    as
    (
    select field1, field2, field3  , count(*) row_count
    from mystagingTable
    group by field1, field2, field3
    HAVING COUNT(*) > 1
    )
    select te. field1,  count(cte.field1 )  from cte
    group by cte.field1
    
    

Share This Page