How to Eliminate Unions After a Lookup Operation
The lookup is an essential component in SQL Server Integration Services (SSIS). In scenarios when a lookup is unable to find the correct match we should log the relevant record with some predefined value like NA or unknown.
Let’s assume that we have a text file which contains employee identity numbers. Using these numbere we need to find the employee id in the HumanResources.Employee table. The following example is a common way of doing this using an SSIS Package.
In the above example a flat file source is directed to the lookup and the lookup checks the existing records. In the lookup Configure Error Output option you should set the Lookup Output Error to Redirect row. Because of the redirect configuration, error records will be directed to a separate data flow. In this flow, the derived column is introduced to include unknown records. This derived control will look like this:
The next step is to union both data flows into one data stream by using a Union data flow task however there is an alternative way of achieving the same results which is illustarted below:
You can see that the redirect and union is removed from the previous package. There are two changes to the original package to acheive this.
- The Error configuration is changed to Ignore Failure from Redirect Row. When the Error configuration is set to Ignore Failure and when no records are found it will return null.
- The Derived Column column is modified as illustrated below:
Using this implementation you don’t need to use the Union data flow, which reduces the number of memory buffers. Hence this example will help improve the performance of the Package. In the case where several lookups are perfromed you can have only one Derived Column to derive all the keys, which will be a another performance enhancement over the inital example.