SQL Server Performance

Adding tabular duplicates from a flat file

Discussion in 'General Developer Questions' started by Tahsin, Oct 3, 2006.

  1. Tahsin New Member

    Hello Friends,
    I have the following stumper:

    I have a flat file with a couple of million records in a format similar to this:

    1:, NULL
    1374023,2005-09-28
    2542849,2005-12-17
    10:, NULL
    1678797,2004-09-03
    1524903,2005-12-22
    972246,2005-05-29
    10119:, NULL
    2463090,2005-04-26
    514398,2005-10-11
    1012:, NULL
    1037308,2005-10-13
    109094,2005-10-24
    10120:, NULL
    109409,2005-12-20
    564451,2005-12-30

    I was trying to modify it using Ultra-Edit, but apparently you cannot copy and paste one column into multiple columns (in Column mode). I was also unable to create a macro to accomplish this. I was able to generate a script using cursor based logic to do what I want, but it will be incredibly slow if I apply it to more than a few thousand records.

    I want the file/table to look like the following:

    1:, 1374023, 2005-09-28
    1:, 2542849, 2005-12-17
    10:, 1678797, 2004-09-03
    10:, 1524903, 2005-12-22
    10:, 972246, 2005-05-29
    10119:, 2463090, 2005-04-26
    10119:, 514398, 2005-10-11
    1012:, 1037308, 2005-10-13
    1012:, 109094, 2005-10-24
    10120:, 109409, 2005-12-20
    10120:, 564451,2005-12-30

    Any ideas?

    Thanks!
  2. Roji. P. Thomas New Member

  3. Code Carpenter New Member

    It will take a few (1<img src='/community/emoticons/emotion-11.gif' alt='8)' /> steps, and does not involve SQL, but try the following:<br />1. Remove all blanks from the file.<br />2. Remove the :'s from the file.<br />3. Open the file using Excel. IMPORTANT: Treat first column as number, second as text.<br />4. Insert two columns in from of column A.<br />5. Set A1 = C1.<br />6. Set all of column B to ':'<br />7. Set A2 to '=IF(D2="NULL",C2,A1)'<br />8. Copy A2 down to the rest of column A.<br />9. Select all four columns for the full set of rows.<br />10. Copy (Ctrl-C)<br />11. Select Sheet2<br />12. Paste Special, use the radio button 'Value'.<br />13. Sort by column D. This will move all the NULLS to the bottom.<br />14. Delete the NULL elements.<br />15. Sort by column A, and column D. This will put the file in the correct order.<br />16. Copy all of this into the original file.<br />17. Replace the tabs with ,<br />18. Replace the ,:, with :,<br />19. Save the new file. Now your file looks as you want it.<br />20. Write back here saying how easy that was, easier than it looked. <img src='/community/emoticons/emotion-1.gif' alt=':)' /><br /><br />Here is what I got using that method...<br /><br />1:,1374023,2005-09-28<br />1:,2542849,2005-12-17<br />10:,1678797,2004-09-03<br />10:,972246,2005-05-29<br />10:,1524903,2005-12-22<br />1012:,1037308,2005-10-13<br />1012:,109094,2005-10-24<br />10119:,2463090,2005-04-26<br />10119:,514398,2005-10-11<br />10120:,109409,2005-12-20<br />10120:,564451,2005-12-30<br />
  4. Tahsin New Member

    Thanks for the response Code/Roji. I actually used the cursor based method, inserted an IDENTITY column along with a clustered primary key and all this data into a table. It took 22 minutes to complete a few million records, but since this is a 1 time shot, I didn't really care. I was really trying to find a program that would let you copy multiple instances of a single attribute in column mode, but was unsuccessful. Ultra-Edit, PSPad and the leading text editors don't seem to have this option available.

    Roji, your solution will not work, because it doesn't address the problem correctly. Code,
    I will give your solution a try since it appears to be correct. Don't really know which method is easier (mine or yours), but it's always good to learn a new approach.

    Thank you guys.
  5. Adriaan New Member

    You would be amazed what MS Word can do for you with Find and Replace (Ctrl-H) - wildcards and all. I've used it for converting flat-text data files many times, for instance when the source database did not offer any data dump facilities, and all you can do is print reports to files - which you then have to re-organize into data.

    After you clean up with Word, you can use Access to convert the lines of data into tables and columns.

    You can also do the whole thing in Access.

Share This Page