SQL Server Performance

Query Re-Write Help

Discussion in 'ALL SQL SERVER QUESTIONS' started by L0st_Pr0phet, May 31, 2013.

  1. L0st_Pr0phet Member

    Hi,
    I have a third party query that runs for 4 hours and max out 6 CPUS. I want to re write it but when I do I get a row discrepancy.

    Original Query:
    SELECT Relevant, COUNT(*)
    FROM LegalReviewView_ATW_2012_ECC
    WHERE (IsRepresentation=0 OR
    (IsRepresentation=1 AND DedupSHA5BinHash + zcxcst = IsRepresentationOfHash + zcxcst
    OR IsRepresentationOfHash not in
    (Select IsRepresentationOfHash
    from LegalReviewView_ATW_2012_ECC
    WHERE DedupSHA5BinHash + zcxcst = IsRepresentationOfHash + zcxcst
    )
    )
    )
    GROUP BY Relevant
    ORDER BY Relevant

    My Query:


    Select
    v1.Relevant,
    Count(*)
    From LegalReviewView_ATW_2012_ECC v1
    Left Outer Join LegalReviewView_ATW_2012_ECC v2 on
    v1.IsRepresentationOfHash = v2.IsRepresentationOfHash And v2.DedupSHA5BinHash + v2.zcxcst = v2.IsRepresentationOfHash + v2.zcxcst
    Where v1.IsRepresentation=0
    OR (v1.IsRepresentation=1 AND v1.DedupSHA5BinHash + v1.zcxcst = v1.IsRepresentationOfHash + v1.zcxcst
    OR v2.IsRepresentationOfHash is null)
    GROUP BY v1.Relevant

    ORDER BY v1.Relevant

    I am get a count difference of 4,926 rows in my query compared to the original, can anyone spot why?? Any help would be much appreciated.

    Thanks
    Stewart
  2. Shehap MVP, MCTS, MCITP SQL Server

    You can try that one below:

    SELECT Relevant, COUNT(*)
    FROM LegalReviewView_ATW_2012_ECC
    WHERE (IsRepresentation=0 OR
    (IsRepresentation=1 AND DedupSHA5BinHash + zcxcst = IsRepresentationOfHash + zcxcst
    OR DedupSHA5BinHash <> IsRepresentationOfHash
    )
    )
    GROUP BY Relevant
    ORDER BY Relevant

    If it still persists, you can try this one below

    Create table #temp( IsRepresentationOfHash primary clustered index )
    Insert into #temp Select distinct IsRepresentationOfHash from LegalReviewView_ATW_2012_ECC
    WHERE DedupSHA5BinHash <> IsRepresentationOfHash

    SELECT Relevant, COUNT(*)
    FROM LegalReviewView_ATW_2012_ECC inner join #temp on #Temp.IsRepresentationOfHash =LegalReviewView_ATW_2012_ECC.IsRepresentationOfHash
    WHERE (IsRepresentation=0 OR
    (IsRepresentation=1 AND DedupSHA5BinHash + zcxcst = IsRepresentationOfHash + zcxcst
    ))
    GROUP BY Relevant
    ORDER BY Relevant

    Please let me know your feedback on both queries

Share This Page