Importing Poorly Formatted Text Files using SSIS

Case 3: Variable Columns

In the previous cases we have the same number of columns throughout. What if the number of columns also varied? For example, consider the following text file:

P1,P0001,Product 1
P2,P0002,Product 2,01/10/2007
P3,P0003,Product 3
P4,P0004,Product 4
P5,P0005,Product 5
P6,P0006,Product 6,21/06/2007
P7,P0007,Product 7
P8,P0008,Product 8,15/10/2008
P9,P0009,Product 9
P10,P00010,Product 10

You can see that the date is available for only few columns. You will not be able to use traditional text file handling with SSIS for above case.

Again, the difference with the previous two cases is the script component. To start with you will need to add new output column named Date. Then we need to add some .NET code for the script component:

Public Overrides Sub Input0_ProcessInputRow(ByVal Row As Input0Buffer)

Dim strRow As String
Dim strColSeperator As String
Dim rowValues As String()

rowValues = Row.Line.Split(CChar(“,”))
Row.ShCode = rowValues.GetValue(0).ToString()
Row.Code = rowValues.GetValue(1).ToString()
Row.Description = rowValues.GetValue(2).ToString()

If rowValues.GetUpperBound(0) = 3 Then
Row.Date = rowValues.GetValue(3).ToString()
End If 

End Sub

End Class 

In this case aftr splitting, the script checks the number of elements the array has. If it 3 then you have a date column in the data row. Again we will check the data viewer output:


Sample

For your reference I have added sample SSIS package for you along with the text files I have used. Download here

Case

Package

Text File

Case 1

MultipleRowDelDifferentRows.dtsx

MultipleRowDelDifferentRows.txt

Case 2

MultipleRowDelSameRows.dtsx

MultipleRowDelSameRows.txt

Case 3

MultipleNumberofColums.dtsx

MultipleNumberofColums.txt

You may need to change all the connection manager file paths to suit to your file path.

 Conclusion
I am sure you have come across with many situations with text files. What are those situations and what were alternatives you implemented? I would love to hear your feedback at dineshasanka@gmail.com

]]>

Leave a comment

Your email address will not be published.