Dear sir/madam, I have got a table which contains about 30,000 rows including duplicate rows. Now, I've been asked to remove the duplicate records from this table and I should not allow any duplicates in the future. How to get around this problem?? Please mail me the solution to: erajendar@rediffmail.com Regards, Rajendar ok
You can remove the duplicate records from your table by something like the following: BEGIN TRAN SELECT DISTINCT * FROM myTable INTO #temptable DELETE FROM myTable INSERT INTO myTable SELECT * FROM #temptable DROP #temptable COMMIT TRAN Next you have to define a Primary Key on the table, this will prevent duplicates from being inserted.
If you have an identity field or some other unique value in table you can do this: DELETE FROM table_name WHERE EXISTS ( SELECT NULL FROM table_name a WITH (NOLOCK) WHERE table_name.unique_column = a.unique_column HAVING table_name.id_column + < MAX(a.id_column) ) unique_column - is the column you need to be DISTINCT. id_column - is an identity column. You can change the MAX() function to your needs. Bambola.
I just had the same issue and used the info below which is from an article at http://www.devguru.com/features/tutorials/AdvancedSQL/advancedSQL.html it worked great. I set it up as a stored procedure which uses a view to return the duplicates Finding Duplicates - I modified these to my data and set it up as a view If we group records together by certain identifying fields we can then use a Count function to extract those that are duplicated. This query utilizes a sub-query and, based on their first and last names, returns a list of DelegateIDs for those delegates that have duplicate records. SELECT Delegate.DelegateID, Delegate.FirstName, Delegate.LastName FROM Delegate INNER JOIN (SELECT FirstName, LastName FROM Delegate GROUP BY FirstName, LastName HAVING Count(DelegateID) > 1 ) AS Duplicates ON Delegate.FirstName = Duplicates.FirstName AND Delegate.LastName = Duplicates.LastName Removing Duplicates The following query removes all duplicates from the delegate table, leaving only the originals of the duplicated record. Here, we define a record as having a duplicate if there is another record with matching FirstName and LastName fields. This query relies on a view being set up that returns a list of duplicate delegates. The view is called DupDels. DELETE FROM Delegate WHERE DelegateID = ANY ( SELECT DupDels.DelegateID FROM DupDels LEFT JOIN (SELECT Min(DelegateID) AS DelegateID, FirstName, LastName FROM DupDels GROUP BY FirstName, LastName) AS FirstDup ON DupDels.DelegateID = FirstDup.DelegateID WHERE FirstDup.DelegateID IS NULL ) This worked out great for me and I hope it helps. MacDaddio
I just had the same issue and used the info below which is from an article at http://www.devguru.com/features/tutorials/AdvancedSQL/advancedSQL.html it worked great. I set it up as a stored procedure which uses a view to return the duplicates Finding Duplicates - I modified these to my data and set it up as a view If we group records together by certain identifying fields we can then use a Count function to extract those that are duplicated. This query utilizes a sub-query and, based on their first and last names, returns a list of DelegateIDs for those delegates that have duplicate records. SELECT Delegate.DelegateID, Delegate.FirstName, Delegate.LastName FROM Delegate INNER JOIN (SELECT FirstName, LastName FROM Delegate GROUP BY FirstName, LastName HAVING Count(DelegateID) > 1 ) AS Duplicates ON Delegate.FirstName = Duplicates.FirstName AND Delegate.LastName = Duplicates.LastName Removing Duplicates The following query removes all duplicates from the delegate table, leaving only the originals of the duplicated record. Here, we define a record as having a duplicate if there is another record with matching FirstName and LastName fields. This query relies on a view being set up that returns a list of duplicate delegates. The view is called DupDels. DELETE FROM Delegate WHERE DelegateID = ANY ( SELECT DupDels.DelegateID FROM DupDels LEFT JOIN (SELECT Min(DelegateID) AS DelegateID, FirstName, LastName FROM DupDels GROUP BY FirstName, LastName) AS FirstDup ON DupDels.DelegateID = FirstDup.DelegateID WHERE FirstDup.DelegateID IS NULL ) This worked out great for me and I hope it helps. MacDaddio