THE SQL Server Blog Spot on the Web

Welcome to SQLblog.com - The SQL Server blog spot on the web Sign in | |
in Search

Jamie Thomson

This is the blog of Jamie Thomson, a freelance data mangler in London

Restart Framework added to SSIS Reporting Pack

On 31st March 2014 I released version 1.2.0.0 of SSIS Reporting Pack, my open source project that aims to enhance the SSIS Catalog that was introduced in SSIS 2012. This is a big release because it includes an entirely new feature  -the Restart Framework.

Introduction

The Restart Framework exists to cater for a deficiency within SSIS, that being the poor support for restartability. Let's define what I mean by restartability:

A SSIS execution that fails should, when re-executed, have the ability to start from the previous point of failure.

SSIS provides a feature called checkpoint files that are intended to help in this scenario but I am of the opinion that checkpoint files are an inadequate solution to the problem, I explain why in my blog post Why I don't use SSIS checkpoint files.
The Restart Framework was designed to overcome the many shortcomings of checkpoint files.

One of the fundamental tenets of the Restart Framework is that the packages that you, the developer, build for your solution should not be required to contain any variables, parameters, tasks, or event handlers in order to make them work with the Restart Framework. In fact your packages should be agnostic of the fact that they are being executed by the Restart Framework.

TL;DR: A video that demonstrates the installation and base functionality of the Restart Framework can be viewed at https://www.youtube.com/watch?v=syV0Wpwhlnk

Terminology

Let's define some important terms that you will need to become familiar with if you are going to use the Restart Framework.

ETLJob

An ETLJob is the definition of some work that an end-to-end ETL process needs to perform. An ETLJob would typically incorporate many SSIS packages. Each ETLJob has a name (termed ETLJobName) which can be any value you want, some example ETLJobNames might include:
  • Nightly Data Warehouse Load
  • Monthly Reconciliation
  • All backups

ETLJobStage

Each ETLJob contains one or more ETLJobStages. These are the "building blocks" of your solution and for each ETLJobStage there must exist a package in your SSIS project with a matching name. For example, an ETLJobStage with the name "FactInternetSales" will require a SSIS package called "FactInternetSales.dtsx".

The Restart Framework allows the declaration of dependencies between ETLJobStages - an ETLJobStage cannot start until all ETLJobStages with a lower ETLJobStageOrder have completed successfully. This is a fundamental tenet of the Restart Framework as it needs to know the order in which ETLJobStages need to occur in order that it can restart execution from the previous point of failure.

The Restart Framework provides some stored procedures that should be used to define ETLJobs, ETLJobStages and the dependencies between them.

One important point to make about ETLJobStages is that the Restart Framework only supports restartability of a failed ETLJobStage, the Restart Framework has no control (and, indeed, does not care) what occurs within that ETLJobStage. The implication therefore is that the onus is on the package developer to ensure that each ETLJobStage is re-runnable from the start of that package in the event of failure; in other words an ETLJobStage must be idempotent.

ETLJobHistory

Each time an ETLJob is executed a record is inserted into a table called ETLJobHistory and a unique ETLJobHistoryId is assigned. Crucially, when a previously-failed ETLJob is restarted it retains the same ETLJobHistoryId, compare this to SSIS' own execution_id which will be different whenever an ETLJob is restarted.

The ETLJobHistoryId can be particularly useful when used for lineage purposes in a data warehouse loading routine. Every inserted or updated record can have the ETLJobHistoryId stored against it which is useful for providing lineage information such as when the record was inserted/updated.

What's included

SsisReportingPack database

This is the same database that houses usp_ssiscatalog and all of its supporting code modules. All of the database objects that support the Restart Framework are in a schema called RestartFramework.


20140408 restartframeworkschema.png

SSIS packages

The Restart Framework consists of two packages that must be included in every SSIS project that is intending to use the Restart Framework hence they will need to be added into your SSIS project within Visual Studio.
Root.dtsx

This package must be executed in order to have any execution be managed by the Restart Framework. It takes a single parameter, ETLJobName, to indicate which ETLJob it should execute. Root.dtsx will interrogate the Restart Framework metadata in the SsisReportingPack database to determine which ETLJobStages are included.

For each ETLJobStage Root.dtsx will fire off a new instance of ThreadControllor.dtsx, passing it a ThreadID and an ETLJobStageOrder.

Root.dtsx can fire off eight concurrent instances of ThreadController.dtsx. This number if configurable however eight is the maximum. You could easily extend Root.dtsx to fire off more than eight if you so desired.

Here is a screenshot of Root.dtsx control flow:
20140412 22-05-10 root.dtsx.png

ThreadControllor.dtsx
This package is responsible for calling your packages that actually do some work. It receives a ThreadId and ETLJobStageOrder from Root.dtsx which it uses to interrogate the database to get a list of ETLJobStageNames that it needs to execute. It loops over that list and executes a package of the same name from the current project.
When an ETLJobStage completes successfully it is the job of ThreadController.dtsx to update the database to indicate that this has occurred.
Here is a screenshot of ThreadController.dtsx control flow:
20140412 22-13-06 threadcontroller.dtsx.png
Published Saturday, April 12, 2014 11:28 PM by jamiet

Comment Notification

If you would like to receive an email when updates are made to this post, please register here

Subscribe to this post's comments using RSS

Comments

 

Marco Schreuder said:

Maybe you can pull in the average execution time of the last 3 runs of a package and use it to first execute the long running packages with the same prio? That can save a lot in the total execution time.

April 16, 2014 7:37 AM
 

jamiet said:

Hi Marco,

Nice idea. So far with this I've tread a careful line between simplicity and lots of functionality and that may fall into the latter category. I'm also a little bit wary of the framework making assumptions as to what can run before something else - I'd rather leave that as an exercise for the user. Its certainly something folks could do to extend it for themselves though.

regards

Jamie

April 16, 2014 8:13 AM
 

suhail ali said:

It looks like when packages are run in parallel when one package finishes another one doesn't take it's place until all the packages submitted finish and then the next set of parallel packages are run.  Is that right Jamie? I might need to test it myself to verify and I am just looking at the code.

April 17, 2014 1:16 PM
 

jamiet said:

Hi suhail,

Well, kind of. all packages for a given ETLJobStageOrder have to complete before moving onto the next JobStageOrder. That said, its possible that a thread may have multiple packages for the same ETLJobStageOrder assigned to it, effectively building a queue of packages for that thread. If there are more packages in that queue then the next will get executed without waiting on anything

Regards

Jamie

April 17, 2014 6:43 PM
 

Peter said:

Hi Jamie,

Thanks for great work. Its really simplifies the whole complexity many packages we have to deal with.

Just one question though, If there is a job fails, does it roll back for the next re-start or this is not part of this RestartFrameWork

Regards

June 10, 2014 7:35 AM
 

jamiet said:

Hi Peter,

Not entirely sure what you mean but I'll do my best to answer.

If you mean "rollback" in the transactional sense of the word (i.e. all work done under the transaction prior to the failure gets rolled back) then , no, there is no rollback mechanism. As I say above

"the onus is on the package developer to ensure that each ETLJobStage is re-runnable from the start of that package in the event of failure; in other words an ETLJobStage must be idempotent."

Hope that helps.

Jamie

June 10, 2014 7:45 AM
 

Robert said:

Hi Jamie,

I'm interested in making our packages restartable and your approach is very interesting. It seems like it would work very well as long as the child packages can be called in a straight sequence and they function independently. Do you have any ways to handle logical branching between the packages or passing values between parent and child?

July 4, 2014 7:38 PM
 

jamiet said:

Hi Robert,

Thanks for your interest.

" Do you have any ways to handle logical branching between the packages "

Not sure what you mean by that. Could you expand upon it?

"Do you have any ways to handle passing values between parent and child?"

Depends on whether the parent that you speak of is Root.dtsx or ThreadController.dtsx, or one of your own packages. If Root.dtsx/ThredaController.dtsx then the answer is "no", those packages don't know anything about your application therefore cannot pass values. If you're talking about passing values between your own packagess then nothing in this framework prevents you from doing that, you should implement it exactly as you would if you were not using this framework.

Hope that helps.

JT

July 7, 2014 4:46 AM
 

Robert said:

Hi Jamie,

Thanks for your response. Here are some more details on what we are trying to accomplish.

Our standard ETL job consists of a parent package that calls 15 child packages. The goal is to make our jobs restartable by replacing our parent package with your Root and ThreadController.

One problem is that our first child package generates config information like dynamic connection strings and file paths. Those values are passed back to the parent so they can be shared with the other child packages. If we replace our parent package with your Root.dtsx then we don't have a way to pass values back to the Root so they can be shared. I could get around this by writing the values to the database and changing the package configurations of each child package but I'm hoping to avoid this as I have over 300 packages that would need to be changed.

The second problem is that we use expressions on the precedence constraints to create several branches in our control flow. For example, if Success and @RowCount == 0 then follow Path A, otherwise, if Success and @RowCount > 0 follow Path B. Each path may have multiple steps that we want to be restartable at any point along the path. I don't see a way to mimic this behavior in your Restart Framework. I have ideas on how this feature could be implemented but don't see a way to do it as it stands now.

Am I understanding it correctly or am I missing something?

Thanks,

Robert

July 8, 2014 4:18 AM
 

jamiet said:

Hi Robert,

Easy one first...

"For example, if Success and @RowCount == 0 then follow Path A, otherwise, if Success and @RowCount > 0 follow Path B. Each path may have multiple steps that we want to be restartable at any point along the path. I don't see a way to mimic this behavior in your Restart Framework"

Correct. the Restart Framework doesn't allow definition of complicated workflow logic like that. Moreover, it never will. The goal here isn't to provide a way of defining complicated workflow logic - its to enable restartability of discrete code modules that have dependencies between them.

"If we replace our parent package with your Root.dtsx then we don't have a way to pass values back to the Root so they can be shared"

Well, that's not quite true. Once you start using Root.dtsx then you can whatever changes to it that you like (probably should have said that in my previous reply). Once you start using it then its your code, the other side of that coin is that if you break the Restart Framework thru any changes that you make...you're on your own.

regards

Jamie

July 8, 2014 11:02 AM
 

Robert said:

Thanks, Jamie. You've done some great work with this Restart Framework and we will be using it in our environment. If we end up customizing our instance I will share it with you to be considered in future versions. Thanks for the tips.

July 8, 2014 12:38 PM
 

John said:

Really great work Jamie. its just but awesome!!    

Just a simple question, If a job within the same  ETLJobStageOrder fails, does it mean others within the same ETLJobStageOrder fails or this only affects the next  ETLJobStageOrder which according to your notes above will not run.

Many thanks

August 26, 2014 12:07 PM
 

jamiet said:

Hi John,

Good question.

If a JobStage fails then no, other JobStages with the same JobStageOrder will not be deemed to have failed. Each JobStage is considered to be a discrete code module that can fail or succeed independently.

You are correct that no JobStages with a higher JobStageOrder will run.

Regards

Jamie

August 26, 2014 5:54 PM

Leave a Comment

(required) 
(required) 
Submit

This Blog

Syndication

Powered by Community Server (Commercial Edition), by Telligent Systems
  Privacy Statement