SQL Server Performance Forum – Threads Archive

T-SQL Expert help!!

I have data in this format in a table…..
ID firstname lastname phone address city state zip
1 John Doe 0 Test 0 NY 0
1 John Doe 45543 0 NY 0 456
1 0 0 33333 0 0 0 0
2 jane doe 0 example 0 CA 0
2 Jane doe 0 0 SD 0 123
2 0 0 45343 0 0 0 0
Now Basically the final result sould be one row with all the gaps filled from each other ID firstname lastname phone address city state zip
1 John Doe 33333 Test NY NY 000
2 jane doe 45343 example SD CA 123 Basically some values are found in some rows and vice versa, i need just on row for each id number with complete values filled.
Please help!

Why don’t update the table when you have new data like phone or zip?
Luis Martin
Moderator
SQL-Server-Performance.com One of the symptoms of an approaching nervous breakdown is the belief that one’s work is terribly important
Bertrand Russell
All postings are provided â€œAS ISâ€ with no warranties for accuracy.

Because this is how the table is. it get imported from a text file into a table.

Well, I’m not a developer but if I were you, I’ll work with text file before import to SQL. Let me tell you why. 1rs row has 7 fields.
2dn row has 5 fields and phone is NY when I’ll expect 0 or some number.
3rd row has 8 fields. But I suggest to wait for developers members experts. Luis Martin
Moderator
SQL-Server-Performance.com One of the symptoms of an approaching nervous breakdown is the belief that one’s work is terribly important
Bertrand Russell
All postings are provided â€œAS ISâ€ with no warranties for accuracy.

That was probabaly a typo ID firstname lastname phone address city state zip
1 John Doe 0 Test 0 NY 0
1 John Doe 45543 0 NY 0 456
1 0 0 33333 0 0 0 0
2 jane doe 0 example 0 CA 0
2 Jane doe 0 0 SD 0 123
2 0 0 45343 0 0 0 0 basically some values are stored in one row and other in the other rows for one ID.

Just wanted to clarify exactly what you are trying to do- do you want to end up with one row in the table for each person??? Or do you want a query that will give the row? Does the table have a primary key?

Please post the DDL of that table. Also how many occurences can there be for each ID? Unlimited?
Surely this is doable in T SQL, but frankly, I would do this data scrubbing in a scripting language outside SQL Server, since I suspect every T SQL solution to be some kind of kludge here anyway.
Hm, just reread the thread. How do you import the file? Have you considered doing this cleansing with DTS?
—
Frank
http://www.insidesql.de

The function MAX may get you a good approximation of what you want (see BOL for details) eg. SELECT [ID], max(firstname), max(lastname), max(phone), max(address), max(city), max(state), max(zip)
FROM yourtable
GROUP BY [ID] However if you have multiple firstnames for an ID such as Robert and Bob, you will only get the last alphabetically, i.e. Robert. You would however be best to have only one row per person and if you don’t have a value for a field use NULL instead of 0. What everyone else is saying is: Rethink how your data is stored. Regards, Robert

One possible solution: Store data in an auxiliary table first. Next, copy all distinct ids with null values for all other columns in another auxiliary tables. Then, update one column at time filtering out rows with meaningless column values. Next copy rows to production table. The problem may be when you have two different meaningfull column values for the same ID. You have to figure out rule which value should be stored. If you have dateCreated/Modified then it should probably be value from the last created/modified row.

UPDATE my_staging_table
SET
my_staging_table.address = B.address
FROM (SELECT * FROM my_staging_table
WHERE TYPE = ‘datasource1’) B
WHERE my_staging_table.ID = B.ID
AND my_staging_table.TYPE = ‘datasource3’ DELETE FROM my_staging_table
Where my_staging_table.ID
In ( SELECT ID FROM my_staging_table
WHERE TYPE = ‘datasource3’)
And TYPE = ‘datasource1’ UPDATE my_staging_table
SET
my_staging_table.city = B.city
FROM (SELECT * FROM my_staging_table
WHERE TYPE = ‘datasource2’) B
WHERE my_staging_table.ID = B.ID
AND my_staging_table.TYPE = ‘datasource3’
DELETE FROM my_staging_table
Where my_staging_table.ID
In ( SELECT ID FROM my_staging_table
WHERE TYPE = ‘datasource3’)
And TYPE = ‘datasource2’ Here in example I used and it seems to work. You see im getting three rows for each ID number. Each one of these three rows stores a value that the others dont have. I was able to to put a flag for each type of row and called it datasource1, datasource2 etc.. So in this example Im grabing the contents of datasource1 and putting it in datasource3 then deleting, grabbing the contents of datasource2 and putting it in datasourc3 then deleting it. Finally I just end up with datasource3 all filled up. I maybe did not specify the whole situation so maybe it confused soem of you. Sorry but thanks for your valuable help. I m gonna try your solutions to see if they are better.

]]>