The Restart Framework exists to cater for a deficiency within SSIS, that being the poor support for restartability. Let's define what I mean by restartability:
A SSIS execution that fails should, when re-executed, have the ability to start from the previous point of failure.
SSIS provides a feature called checkpoint files that are intended to help in this scenario but I am of the opinion that checkpoint files are an inadequate solution to the problem, I explain why in my blog post
Why I don't use SSIS checkpoint files
The Restart Framework was designed to overcome the shortcomings of checkpoint files. Note that the Restart Framework only supports SSIS 2012 at the time of writing and will never support any version prior to that due to it being built upon the SSIS Catalog
(aka SSISDB) that debuted in SSIS 2012.
Let's define some important terms that you will need to become familiar with if you are going to use the Restart Framework.
An ETLJob is the definition of some work that an end-to-end ETL process needs to perform. An ETLJob would typically incorporate many SSIS packages. Each ETLJob has a name (termed ETLJobName) which can be any value you want, some example ETLJobNames might include:
- Nightly Data Warehouse Load
- Monthly Reconciliation
- All backups
Each ETLJob contains one or more ETLJobStages. These are the "building blocks" of your solution and for each ETLJobStage there must exist a package in your SSIS project with a matching name. For example, an ETLJobStage with the name "FactInternetSales"
will require a SSIS package called "FactInternetSales.dtsx".
The Restart Framework allows the declaration of dependencies between ETLJobStages. This is a fundamental tenet of the Restart Framework as it needs to know the order in which ETLJobStages need to occur in order that it can restart execution from the previous
point of failure.
blah blah blah
Crucially, when a previously-failed ETLJob is restarted it retains the same ETLJobHistoryId, compare this to SSIS' own execution_id which will be different whenever an ETLJob is restarted.
The ETLJobHstoryId can be particularly useful when used for lineage purposes in a data warehouse loading routine. Every inserted or updated record can have the ETLJobHistoryId stored against it which is useful for providing ...