SQL Server Performance Forum – Threads Archive

select distinct + another column?

Hi, I’m trying to issue a statement like this: select distinct(TransactionNum), TransactionDescription from Transaction But it doesn’t work. What I want it to do is give me one record from the transaction table for every TransactionNum, but no duplicate TransactionNums. The reason I’m trying to do this is I want to make TransactionNum the PK, but I cant because there are currently duplicate entries in this table. I know I can identify the duplicates using a select/group by/having statement and then delete them, but I’m working with a 1.6B row table and it would take around 30 hours. I have found that selecting the data out of the table into a new table is much faster (4 hours instead of 30 hours). Since I want to make TransactionNum the PK, this is not an option:
select distinct(TransactionNum, TransactionDescription) from Transaction Because there are cases where a single transactionnum has been entered multiple times with not quite identical descriptions, so it could return non-unique transactionNums. Is there a way using distinct to remove duplicate records, but still get all the columns? In the event that there are duplicate TransactioNums, I dont particularly care which one’s description gets returned. Thanks! Steve

I did this once. Let me think….. If you dont care which description gets picked
AND
If you want to go to the new table option how about this insert newtable
select transactionnum, min(transactiondescription) as transacationdescription from transaction
group by transactionnum, min(transactiondescription
You could also do max on the description, but if you really dont care which, just pick an order and take the first.
Chris

Chris, Thanks for the idea! I’ll give this a try. Steve

]]>