SQL Server Integration Services Features And Properties Part 3
This article is part 3 of a 4 part series that explores the Features and Properties of SSIS. This article will explore Checkpoints and Restarting an SSIS Package.
Normally extraction, tranformation and loading (ETL) operations are complex and time consuming processes, often dealing with millions of records. Such operations may take several hours to complete.
Consider a scenario where a package has 10 tasks, and on execution the package failed at the 8th task (due to a temporary issue such as connection break, request time-out etc). What would happen if the package was re-run? By default the package will start from the first task again, irrespective of its last execution outcome. To handle this kind of scenario, SSIS provides checkpoint restart functionality. This enables the package, on the next execution, to start from the point of failure instead of starting from the beginning.
Once checkpoint functionality for a package is enabled, SSIS runtime will store information/status about the package execution to a checkpoint file (a text file at a location to be specified when enabling checkpoint). If a package using checkpoints fails, the SSIS runtime will not execute the package from the beginning in next attempt, instead, it will process the checkpoint file and start execution of the package from the container/task that was being processed at the point of failure or that failed in the last attempt.
Checkpoint restart functionality can significantly simplify the recoverability of packages that contain complex operations and can provide significant time savings for packages containing long-running tasks as the package does not need to reprocess tasks prior to the checkpoint.
Checkpoint functionality is enabled at package level and FailPackageOnFailure property needs to be set to TRUE for all tasks included in checkpoint restartability.
Note: To have execution begin from a task following a failure on that task, setting its FailPackageOnFailure property to true is a mandate.
Enabling Checkpoint Functionality
There are 3 relevant properties to consider when enabling the checkpoint functionality.
SaveCheckpoints – must be set to True (the default is False) in order to enable checkpoint support, indicates whether the package saves checkpoints information to a file.
CheckpointFileName – specify the location and name of the checkpoint log file.
CheckpointUsage – determines package behavior following the restart after a failure and takes on one of three possible values:
Never – the default value which specifies that the checkpoint file is not used (regardless of other two settings) and the package runs from the start of the package workflow on subsequent run after the failure.
Always – specifies that the checkpoint file is always used and that the package restarts from the point of the previous execution failure. If the checkpoint file is not found, the package fails.
IfExists – similar to the Always setting, mandates checkpoint use if SaveCheckpoints has been enabled and CheckpointFileName property has a valid value (or a valid checkpoint file exists). Following package failure, if the checkpoint file does not exist it will launch from the start of the package workflow.
Please note these traits of checkpoint functionality:
- Applies to the package’s control flow and not to the package’s data flow.
- An OnPreExecute event on the package will not execute if the package is restarted using a checkpoint file.
- Variable values are persisted in the checkpoint file except for variables of type Object.
- Properties set by a configuration are persisted in the checkpoint file.
- A ForEach loop will iterate over all items in a collection regardless of whether the package previously failed during iteration. The checkpoint file does not store information about how many iterations the ForEach loop has gone through.
- The above is not true of the For Loop as this is dependent on variable values which are persistant in the checkpoint file.
In the example is a package with three tasks, first observe the default behavior of the package (CheckpointUsage = Never). In the images below, notice that on subsequent execution after the failure in first attempt, the execution started from the beginning of the package workflow.
Below illustrates use of checkpoint functionality (CheckpointUsage = Always/IfExists). The images below show that on subsequent execution attempts the package started from the failed task and not from the beginning.
The last artilce in this series will explore how to Validate an SSIS Package.