Importing Poorly Formatted Text Files using SSIS

Next select Output Columns and add two columns , code and description by clicking the Add Column button. Allocate the correct data type and length for each column. In this case I have selected string [DT_STR] as data type and 50 as the data length.  

An important configuration of the script component is Script option. Select the Script option and click the Design Script button. You will be taken to the Microsoft Visual Studio to add necessary .NET code.

Public Overrides Sub Input0_ProcessInputRow(ByVal Row As Input0Buffer)

Dim strRow As String
Dim strColSeperator As String
Dim rowValues As String()
strRow = Row.Line.ToString()
If strRow.Contains(“,”) Then
strColSeperator = (“,”)
ElseIf strRow.Contains(“;”) Then
strColSeperator = “;”
End If

rowValues = Row.Line.Split(CChar(strColSeperator))
Row.Code = rowValues.GetValue(0).ToString()
Row.Description = rowValues.GetValue(1).ToString()

 End Sub

Above is the only code you need to add. From the Contains function we will identify the column separator for the row. Then using split function and passing the correct column delimiter we can separate the two columns. 

For demonstration purposes I have added a data viewer. Above is the output of the data viewer.


You can see that data was separated despite containing different column delimiters.  

Here is what the final package looks like:


Case 2: different Column Delimiters in Same Row

In Case 1 we discussed text files which have different column separators in different columns. What if you have different column delimiters in same row. For example, consider the following data set in a text file.

P1;P0001,Product 1
P2;P0002,Product 2
P3;P0003,Product 3
P4;P0004,Product 4
P5;P0005,Product 5
P6;P0006,Product 6
P7;P0007,Product 7
P8;P0008,Product 8
P9;P0009,Product 9
P10;P00010,Product 10 

The first two columns are separated by semi colon while the other two columns are separated by a comma. There is a similar situation with Case 1 – the only difference is the script component.

You will need to add a another column to the output columns, namely Sh_code.

Public Overrides Sub Input0_ProcessInputRow(ByVal Row As Input0Buffer)

Dim strRow As String
Dim strColSeperator As String
Dim rowValues1 As String()
Dim rowValues2 As String() 

rowValues1 = Row.Line.Split(CChar(“;”))
Row.ShCode = rowValues1.GetValue(0).ToString()
rowValues2 = rowValues1.GetValue(1).ToString().Split(CChar(“,”))
Row.Code = rowValues2.GetValue(0).ToString()
Row.Description = rowValues2.GetValue(1).ToString() 

End Sub 

This time we have used two split functions to seperate the data. The final output will appear as below:


Continues…

Leave a comment

Your email address will not be published.