This article, covering Validation, is the final in a 4 part series exploring the Features and Properties of SSIS.
Recall from part 3, that ETL operations are time bound and may take several hours to pull and transform millions of records. In the scenario where a package fails on the last task after executing all the other tasks; it is possible to use checkpoint and start from the point of failure as discussed above. Another method to avoid time waste from task failure is to identify potential problems at the beginning. This process is known as validating a package.
Validation is the process of ensuring the package will execute successfully during runtime given its current property settings. It raises warnings or errors alerting of any potential or real problems with the package during execution. There are two types of validation, Design-time and Run-time validation, each having a further two validation sub-types: package level and component level validation.
Design–time validation – done by the SSIS designer when the package is opened in designer and reports a warning and errors as follows (Note the little icon on the right side of the task), this validation occurs twice at package level and at component level:
Run-time validation – occurs when the package is executed. Happens in two phases as given below similar to design time validation:
Package level validation (a.k.a. Early Validation) – Whenever the runtime calls the Execute method of package, the package is validated before executing. Everything in the package gets validated down to the containers and components.
Component level validation (a.k.a. Late Validation) – Whenever the runtime calls Execute method of the task, the task is validated again before its execution.
DelayValidation Property – This is a property of a package and of all its components. By default its value is FALSE, indicating a task is validated during package level validation and also during component level validation. If set to TRUE, the validation of that task will be skipped during package level validation and will be validated at component level validation only (Note there is no way to entirely skip the validation).
Setting the DelayValidation property to TRUE is required in the case of Task2 being dependent on prior execution of Task1. For example, Task1 creates a table and Task2 uploads data into the table created by Task1. If the DelayValidation property of Task2 isn’t set to TRUE, the execution of the package will fail because of validation error as follows.
Once the DelayValidation property of container is set to TRUE, the validation of all tasks inside the container will be skipped during early validation as shown below.
ValidateExternalMetadata Property – Every task and containers has got DelayValidation property to delay the validation until runtime, but this is not the case with data flow source components. OLEDB have a different property, called ValidateExternalMetadata, with default value TRUE, of which validates the external metadata (the component connects to its external data source and validates the columns in its input or output collections against the columns at the external data source).
This first series of articles discussed advance features and properties of SSIS; Event Logging, Event Handlers, Transaction Support, Checkpoint Restart-ability and SSIS validation process. The next series will cover SSIS architecture, SSIS buffer management, Types of transformation and execution tree. These concepts will enchance understanding of the internals of SSIS. Resulting knowledge will enable optimised package developmemt and performance.
SQL Server Books Online (BOL) / MSDN