Welcome to SQLblog.com - The SQL Server Blog Spot on the Web

Andy Leonard

Andy Leonard is a Data Philosopher at Enterprise Data & Analytics, an SSIS Trainer, Consultant, and developer; a Business Intelligence Markup Language (Biml) developer and BimlHero; and a SQL Server database and data warehouse developer, community mentor, engineer, and farmer. He is a co-author of SQL Server Integration Services Design Patterns, the author of Managing Geeks - A Journey of Leading by Doing, and the author of the Stairway to Integration Services series.

  • IESSIS1 2-6 May 2016 in Chicago!

    Want to learn more about SSIS (from me!)? IESSIS1: Immersion Event on Learning SQL Server Integration Services is 2-6 May in Chicago! It’s not too late to register.

    This 1-300-level, 5-day SQL Server Integration Services (SSIS) training class was created to train technology professionals in the fine art of using SSIS to build data integration and Extract-Transform-Load (ETL) solutions. Developed by Andy Leonard, the training is built around labs and emphasizes a hands-on approach. Most technologists learn SSIS by doing, so this training is designed to maximize the time attendees spend working with SSIS.

    At the conclusion of the training, attendees will have been exposed to:

    • Using SSIS to develop data integration solutions
    • Using SSIS to load a data warehouse dimension
    • Troubleshooting real-world SSIS Data Flow Task errors
    • Deploying SSIS Solutions
    • Managing, monitoring, and administering SSIS in the enterprise

    Target audience: Database professionals, application developers, and business intelligence practitioners interested in acquiring or expanding their existing SSIS skill set.

    I hope to see you there!

    :{>

  • Introducing DILM Suite

    Kevin Hazzard (Blog | @KevinHazzard), Kent Bradshaw, and I are excited to announce DILM Suite!

    I can hear you thinking, “What is DILM, Andy?” I’m glad you asked! Data Integration Lifecycle Management (DILM) is the art and science of managing data integration in the enterprise.

    “Ok, so what is this DILM Suite of which you write?” Another excellent question. Sit back and let Grandpa Andy tell you a story…

    A Story

    A long time ago, in a place way out in the sticks (named Farmville), a redneck engineer started cyphering (that’s what redneck engineers call “thinking”) about how best to manage data integration with SSIS in the enterprise. He had lots of thoughts about solving business problems, developing SSIS, monitoring and testing ETL, and simplifying the execution of lots of SSIS packages.

    The redneck engineer developed SSIS Frameworks to manage execution and logging, then championed design patterns for repeatable development, and then started using Biml to automate design patterns development. There were still some gaps in enterprise DevOps, at least from a data integration perspective.

    He started working on DILM, and talked a couple of his awesome buddies into helping him.

    And Now…

    Today you can keep track of our work at the DILM Suite website. At launch, our product list includes:

    The coolest part? Two of the three applications listed above are FREE!

    Please join our mailing list to learn about new stuff first. You can follow us on social media – Twitter and Facebook.

    :{>

  • Tinkering with SSIS 2016 RC2

    I’ve been testing SSIS 2016 as the CTPs (Community Technology Previews) and RCs (Release Candidates) have been made available. I’ve been most interested in changes to the SSIS Catalog. Getting RC2 running posed a couple of challenges for me, so I thought I would blog about my test setup – in case anyone else is experiencing similar challenges.

    I want to give kudos to the SSIS and SSDT Development Teams at Microsoft. From what I can tell, they’ve been cranking out some good code the past few months. In addition, the team members have been responsive to my questions and concerns about the SSIS 2016 release. I admire the work they’ve done and I’m very grateful for their help. As I wrote in the past, Microsoft is listening. This blog post should not be construed as a complaint against them or their work.

    My Software Testing Setup

    I’m a big believer in virtual machines. I haven’t quite made the leap to cloud VMs for testing, although I have quite a few VMs in the cloud – at Azure and Amazon, no Google cloud… yet.

    I use VirtualBox. I started using VirtualBox years ago, when it was the only free VM platform I could find that supported 64-bit guests. I still like it. It does what I want and the price (free) is right. :)

    I’m also tinkering with Windows Server 2016 Technical Preview 4. I don’t think SQL Server 2016 supports installations on Windows 7. I tried installing one of the earlier SQL Server 2016 CTPs on Windows 7 and got an error. (Why would I want to install SQL Server 2016 on Windows 7? Don’t get me [or Sahil] started…)

    Installing the Software

    At the time of this writing (11 Apr 2016), there’s a version of Visual Studio 15 in Preview. After getting the OS up and running, I performed a custom installation of Visual Studio 15 Preview. My only customization? I didn’t include the Web stuff, just VB and C#.

    Next, I installed SQL Server 2016 RC2. This was also a custom installation: I was only interested in the relational engine, backwards-compatibility connection components, SSRS, and SSIS. I like that the SQL Server 2016 Installation Center provides links to SSMS and SSDT:

    [Screenshot: SQL Server 2016 Installation Center links to install SSMS and SSDT]

    The SSMS 2016 RC2 link works just fine. The SSDT 2016 RC2 link does not work.

    Fear not! The SSIS team has provided a set of updated links for SSIS 2016 SSDT for RC2. There’s other good information in that post. If you want to tinker with SSIS 2016 RC2, I encourage you to read it.

    But Wait, There’s More

    Once I’d done all this, I could create an SSIS project and add a Script Task to a package. But I could not open the Visual Studio Tools for Applications (VSTA) code editor. When I clicked the “Edit Script…” button in the Script Task Editor, nothing happened.

    I contacted the SSIS Development Team (we hang out), and let them know what I was seeing. They are aware of the issue and sent the following screenshot:

    [Screenshot: Repairing the Microsoft Visual Studio Tools for Applications 2015 installation]

    Repairing the Microsoft Visual Studio Tools for Applications 2015 installation corrected the issue with the VSTA editor.

    Time to Test!

    So now, it’s time for me to get some more SSIS 2016 testing completed! Now that you know one way to set up your own test environment, you can sing along at home!

    :{>

    Learn more:

    Related Training:
    IESSIS1: Immersion Event on Learning SQL Server Integration Services
    IESSIS2: Immersion Event on Advanced SQL Server Integration Services

    Related Articles:
    Stairway to Integration Services
    Stairway to Biml

    Related Books
    SSIS Design Patterns

    Help!
    Enterprise Data & Analytics – We are here to help.

  • BI’s Not Dead

    Timo Elliott [Blog | @timoelliott] wrote yesterday that BI is Dead, and Julie Koesmarno [Blog | @mssqlgirl] was kind enough to tweet about it. The compelling point in Mr. Elliott’s post is:

    “The most charitable view is that Gartner feels it has to exaggerate the demise of BI in order to get customers to pay attention to the changes before it’s too late.”

    Mr. Elliott goes on to point out Gartner is redefining the term “BI” to apply only to what most BI folks now refer to as “self-service BI.” But Gartner also admits:

    “’IT-modeled data structures… promote governance and reusability across the organization’ and ‘In many organizations, extending IT-modeled structures in an agile manner and combining them with any of the sources listed above is a core requirement’”

    If you’re not confused, you’re reading it wrong. I can hear you asking, “Andy, what do you think?” I’m glad you asked!

    What I Think

    I won’t pretend to understand the subtleties of Gartner’s logic at the time of this writing. I promise to dig more and try to get my head around what they’re saying and writing. I doubt, however, that Gartner is being wishy-washy. I think they’re trying to tell us something about a trend in Business Analytics.

    What’s the Trend?

    <trend>Self-service BI is gaining traction.</trend>

    I’ve been in the Business Intelligence field since before I knew what it was called. When I did learn what it was called, we called it Decision Support Systems (DSS) and (in manufacturing) Manufacturing Execution Systems (MES). Later, I learned that the part I really enjoyed was called Data Acquisition, a vital part of Supervisory Control and Data Acquisition (SCADA). That led me to a career in Data Integration.

    I’ve heard variations on this theme of “data integration is dead” for two decades. It’s not true. I believe it may one day become true, but today’s not the day.

    “Why Isn’t the Report of BI’s Demise Accurate, Andy?”

    Consider Grant Fritchey’s [Blog | @grfritchey] recent foray into R. Grant is, in my humble opinion, one of the smartest and most capable data minds on the planet. He’s an engineer who gets the value of analysis. Don’t take my word for it; ask anyone who’s read his books or heard Grant present. I promise I am not picking on Grant; I am using his post to make a point. And that point is this:

    “Data analytics is often hard.” – Andy, circa 2016

    Self-service BI is in better shape today than at any time in the past. The plethora of available tools is awesome and empowers users like never before.

    But…

    Raw data is rarely in a format that lends itself to consumption by self-service BI tools.

    I believe this is the lesson of Grant’s post.

    It’s Not Just Big Data, Either

    There’s a lot of buzz surrounding Big Data, and the buzz is right and proper most of the time. But most of my customers do not have data volumes that reach the threshold of Big Data. As I wrote a few years back, Little Data Remains Important and most of my customers have Complex (or Difficult, or Hard) Data – not Big Data.

    Just Add Garbage…

    Data Quality is important because garbage in equals garbage out. But what you may not realize is that data collected into any store for analysis is going to be aggregated. Small variations in data quality, when summed or multiplied, bend predictive analytics lines in less-than-accurate directions. Sometimes the effect is so large that a solution becomes useless for predictive analytics even when the ratio of signal to noise (good-quality data to bad-quality data) is north of 99%. Note: that isn’t the data quality threshold for all solutions, but it is for some. All data-based solutions become useless once data quality dips below some threshold, and I’d hazard a guesstimate that most solutions cross that line with a good:bad ratio still in the ninety-something percentile.
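    To make the aggregation point concrete, here’s a toy illustration with entirely made-up numbers (not from any customer data): 1,000 rows should each carry the value 100, but 1% of them picked up a unit-conversion error and carry 10,000 instead. The dataset is 99% clean, yet the reported total is nearly double the true total.

    ```csharp
    using System;
    using System.Linq;

    class DataQualityToyExample
    {
        static void Main()
        {
            // 1,000 rows, each of which *should* contain the value 100.
            double[] rows = Enumerable.Repeat(100.0, 1000).ToArray();

            // 1% of the rows (10 rows) suffer a unit-conversion error: 100 becomes 10,000.
            for (int i = 0; i < 10; i++)
                rows[i] = 10000.0;

            double trueTotal = 1000 * 100.0;   // 100,000
            double reportedTotal = rows.Sum(); // 990 * 100 + 10 * 10,000 = 199,000

            Console.WriteLine($"True total: {trueTotal:N0}, reported total: {reportedTotal:N0}");
            Console.WriteLine($"Aggregate error: {(reportedTotal - trueTotal) / trueTotal:P0}"); // ~99%
        }
    }
    ```

    Row-level quality of 99% still produced an aggregate that is off by roughly 99%. That, in a nutshell, is how small variations bend analytics lines.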

    Combined, complexity and data quality limit the effectiveness of self-service BI. It’s not the fault of the individual or the individual self-service BI tool. It’s the data. As I stated, data analytics is often hard.

    Conclusion

    In my opinion, Gartner is right to raise the flag about the shift towards self-service BI because this shift will impact the workloads for many IT shops. For now, there’s still plenty of data collection, master data management, integration, and cleansing work to be had; much of it in support of self-service BI tools.

    :{>

    Learn more:

    Related Training:
    IESSIS1: Immersion Event on Learning SQL Server Integration Services
    IESSIS2: Immersion Event on Advanced SQL Server Integration Services

    Related Articles:
    Stairway to Integration Services
    Stairway to Biml

    Related Books
    SSIS Design Patterns

    Help!
    Enterprise Data & Analytics – We are here to help.

  • SQL Server Developer Edition is FREE!

    I’ve blogged about the value of SQL Server Developer Edition for years. It’s Enterprise Edition with an End User License Agreement (EULA) that states you cannot use the SKU for production. Developer Edition used to sell for around $50 USD, but it is now free!

    Download SQL Server 2014 Developer Edition today!

    :{>

  • Announcing Biml Academy!

    Announcing Enterprise Data & Analytics Biml Academy – 5 Days of Free Training – 9-13 May!

    Welcome to Biml Academy! Join me for a five-day online event and learn Biml now!

    Biml Academy – Lesson 0 – SSIS Design Patterns – Before we dive into code generation, let’s discuss and demonstrate some SSIS Design Patterns.
    Biml Academy – Lesson 1 – Build Your First SSIS Package with Biml – Learn how to build your first SSIS package using Biml.
    Biml Academy – Lesson 2 – Use Biml with SSIS Design Patterns – Learn how to use Biml with SSIS Design Patterns.
    Biml Academy – Lesson 3 – Use Biml to Build and Load a Staging Database – Learn how to build and load a staging database using Biml.
    Biml Academy – Lesson 4 – Automate SSIS Design Patterns with Biml to Load a Dimension – Learn how to automate SSIS Design Patterns with Biml to load a star schema dimension table.

    Updates

    I estimate each lesson will average 1 hour. 

    The lessons will be recorded. 

    :{>

  • The Recording for DILM with SSIS Catalog Compare is now Available

    The recording for the (free!) Enterprise Data & Analytics webinar “DILM with Catalog Compare” is now available. Registration is required.

    Enjoy!

    :{>

  • Free Webinar 28 Mar at 1:00 PM EDT - DILM with Catalog Compare

    Join me 28 Mar 2016 at 1:00 PM EDT for DILM with Catalog Compare - another (FREE!) webinar from Enterprise Data & Analytics!

    DILM (Data Integration Lifecycle Management) is how data integration is done in the modern DevOps enterprise. Join Andy Leonard, Data Philosopher at Enterprise Data & Analytics, for a look at a helpful and exciting new utility – SSIS Catalog Compare – that will help you manage enterprise data integration with SSIS.

    Register today!

    :{>

  • Recording Available for “Biml 201: Batched Incremental Load in the SSIS Framework”

    You can view the recording for Biml 201: Batched Incremental Load in the SSIS Framework - a free webinar from Enterprise Data & Analytics (registration required).

    The source code is available here.

    Enjoy!

    :{>

    Next webinar: DILM with Catalog Compare - 28 Mar 2016, 1:00 PM EDT. Register today!

  • Free Webinar–“Biml 201: Batched Incremental Load in the SSIS Framework” 7 Mar

    Join me 7 Mar 2016 at 12:00 PM ET for a (FREE!) webinar titled Biml 201: Batched Incremental Load in the SSIS Framework.

    While delivering the (FREE!) webinar titled SSIS Framework Community Edition, my friend John Sterrett (blog | @johnsterrett) asked about using Biml to generate SSIS packages that execute in an SSIS Framework like the SSIS Framework Community Edition (download the code and documentation here). I said, “No, but I can whip one up!” So I did. And now it’s scheduled for 7 Mar at noon ET.

    I hope to see you there!

    :{>

  • The Recording “Introducing SSIS Framework Community Edition” is Available!

    The recording for the free webinar titled Introducing SSIS Framework Community Edition is now available! Registration is required. You may download the code and documentation here.

    Enjoy!

    :{>

  • Easy SSIS Migration with SSIS Catalog Compare Recording is Available!

    The recording for the webinar titled Easy SSIS Migration with SSIS Catalog Compare is now available!

    Enjoy!

    :{>

  • Microsoft is Listening

    There’s a back-story to my post titled SSIS 2016 CTP 3 and Data Flow Column LineageID. I’ll get to the story in a minute, but first I’d like to say that every company has excellent customer service. You don’t believe me? Ask them, “Company, do you have excellent customer service?” They will all answer, “Yes. Yes we do.”

    Microsoft has been engaging the software development community and MVPs for years – decades, even. As a former SQL Server MVP, I was repeatedly told Microsoft is listening and that my thoughts and suggestions were welcome. So I shared my thoughts and suggestions, like many of my colleagues in the community. Most of our suggestions were not implemented. Many were marked “Works as designed” or “Will not fix” at Microsoft Connect. Granted, no one can implement every feature or even correct every single bug. It’s a) not possible; and b) not cost-effective to do so. But after quite a bit of time making requests and up-voting the excellent requests of others – and then seeing disappointing responses – I (and many others) realized some other force was driving the agenda for Microsoft’s development teams and overriding our suggestions. Many became jaded as a result. I will confess some jading occurred on my part.

    Recently, I and others began hearing fresh calls for thoughts and suggestions. My first (somewhat jaded) thought was, “That’s nice. That’s real nice.” (…which is a reference to an old Southern US joke that I will classify as “unkind”…)

    The Story

    About the time SQL Server 2016 CTP 3.1 was released, I read about the features I covered in the post titled SSIS 2016 CTP 3 and Data Flow Column LineageID. As I worked through some demo packages, however, I found some things did not appear to work – at least not yet; not in the Visual Studio 2015 SSIS template. I communicated with the SSIS team and was pleasantly surprised to receive an email stating they would attempt to work the changes into CTP 3.3. I was optimistic, if cautiously so.

    After SQL Server 2016 CTP 3.3 was released, I received a follow-up email from the team member who had addressed the concerns I identified – apologizing for the long wait and pointing to a detailed blog post describing the implementation of the updated features. </Applause!>

    Conclusion

    It’s one thing to say you’re listening to your community. It’s a different thing to actually listen to your community. One is some words. The other is some action.

    I’m happy to report Microsoft is, in fact, listening. I couldn’t be happier. Thank you!

    :{>

  • SSIS 2016 CTP 3 and Data Flow Column LineageID

    Back When We Used to Carve Our Own Chips Out of Wood…

    Back in the old days (2005), SSIS Data Flow LineageIDs were created at design-time. Here’s a screenshot of an SSIS 2005 package’s XML:

    [Screenshot: SSIS 2005 package XML showing design-time LineageID values]

    When an error occurred, the ID property of the column in error was supplied to the ErrorColumn field of the Error Output, as shown here:

    [Screenshot: The ErrorColumn value in the Error Output]

    Although it wasn’t simple, it was possible to use the ID value supplied in the ErrorColumn field to identify the offending column in a Data Flow Task. It was a manual process that involved:

    • Opening the package’s XML
    • Finding the column ID value shown in the ErrorColumn field of the Error Output
    • Reading the LineageID value for that same column
    • Tracing the LineageID back to its original assignment in the Data Flow Task:

    [Screenshot: Tracing LineageID 49 back to its original assignment in the Data Flow Task]

    The LineageID “49” maps to the “name” column in the image at the top of this post. The “name” column raised an error because someone (me) made the destination column way too small to hold any name values.

    Back in the day, some folks came up with some pretty clever ways to automate identifying the name of the column causing SSIS Data Flow errors.
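    As a rough sketch of that kind of automation (this is my own illustration, not any particular community tool), you could load the .dtsx XML and look up the column whose lineage ID matches the number captured from the Error Output. The attribute names (“lineageId”, “name”) follow the pre-2012 package XML, and the file path and ErrorColumn value below are hypothetical:

    ```csharp
    using System;
    using System.Linq;
    using System.Xml.Linq;

    class LineageIdLookup
    {
        static void Main()
        {
            // Hypothetical package path and ErrorColumn value captured from the Error Output.
            string packagePath = @"C:\SSIS\LoadCustomers.dtsx";
            string errorColumnId = "49";

            XDocument package = XDocument.Load(packagePath);

            // Find any element whose lineageId attribute matches and report its name attribute.
            var columnNames = package.Descendants()
                .Where(e => (string)e.Attribute("lineageId") == errorColumnId)
                .Select(e => (string)e.Attribute("name"))
                .Where(n => !string.IsNullOrEmpty(n))
                .Distinct();

            foreach (string columnName in columnNames)
                Console.WriteLine($"LineageID {errorColumnId} maps to column: {columnName}");
        }
    }
    ```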

    In Later Versions of SSIS…

    The LineageID property of Data Flow columns changed in later versions of SSIS. In SSIS 2012, the LineageID property is there, but the value looks more like an SSIS package path than its SSIS 2005 counterpart:

    [Screenshot: The LineageID value in SSIS 2012 package XML, formatted like an SSIS package path]

    The LineageID attribute in SSIS 2016 CTP 3.3 looks similar:

    [Screenshot: The LineageID attribute in SSIS 2016 CTP 3.3 package XML]

    A New Beginning…

    SSIS 2016 CTP 3.3 offers a solution. First, there are now two new columns in the SSIS Data Flow Component Error Output – ErrorCode – Description and ErrorColumn – Description:

    [Screenshot: The new ErrorCode – Description and ErrorColumn – Description columns in the Error Output]

    The new columns provide extremely useful (plain language) error metadata that will, in my opinion, greatly reduce the amount of time required to identify data-related load failures in the Data Flow Task.

    But that’s not all. If you configure a log to capture the DiagnosticEx event, you will receive a message that provides the Data Flow column ID. To have a look, add a new log to an SSIS package that contains a configured Data Flow Task. On the Details tab, select the DiagnosticEx event:

    [Screenshot: Selecting the DiagnosticEx event on the Details tab of the logging configuration]

    When the package runs, a DiagnosticEx event will record XML describing the DTS:PipelineColumnMap. Viewing the XML in either a text file or SQL Server is no fun, but if you copy the XML and paste it into an XML file in Visual Studio (SSDT), you can format it nicely, as shown below:

    [Screenshot: The DTS:PipelineColumnMap XML formatted in Visual Studio]

    It’s possible to call a new method (GetIdentificationStringByID) on the ComponentMetadata class in a Script Component, passing it the LineageID of a Data Flow column, and get back the IdentificationString of the column. You can learn more about the GetIdentificationStringByID method – and everything I’ve written in this post – by reading Bo Fan’s (excellent) blog post on the subject here.
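    Here’s a minimal sketch of that call inside a Script Component attached to a Data Flow error path. The ScriptMain, UserComponent, and Input0Buffer plumbing is generated by the Script Component designer, and the ErrorColumnName output column is one I’m assuming you added to hold the result:

    ```csharp
    using Microsoft.SqlServer.Dts.Pipeline.Wrapper;

    public class ScriptMain : UserComponent
    {
        public override void Input0_ProcessInputRow(Input0Buffer Row)
        {
            // Cast to the SQL Server 2016 (version 130) metadata interface,
            // which exposes GetIdentificationStringByID.
            IDTSComponentMetaData130 metaData =
                this.ComponentMetaData as IDTSComponentMetaData130;

            if (metaData != null && Row.ErrorColumn > 0)
            {
                // Translate the lineage ID captured in ErrorColumn into a readable
                // identification string for the column that caused the error.
                Row.ErrorColumnName = metaData.GetIdentificationStringByID(Row.ErrorColumn);
            }
        }
    }
    ```

    When the package runs, rows redirected to the error output then carry the failing column’s identification string alongside the numeric ErrorColumn value.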

    :{>

  • Congratulations to BimlHero Cathrine Wilhelmsen!

    [Photo: Cathrine Wilhelmsen, BimlHero]

    I’m happy to pass along the news that my friend Cathrine Wilhelmsen (blog | @cathrinew) has been recognized as a BimlHero! Congratulations, Cathrine!

    You can learn more about Biml from Cathrine and Scott Currie at SQLBits in May 2016. On 4 May, Scott and Cathrine deliver a full day of Biml training titled Learn Biml Today: Start Using It Tomorrow. Note, at the time of this writing I am also listed as a co-presenter but I will, unfortunately, be unable to attend SQLBits this year. :( But go anyway! See Scott and Cathrine!

    :{>
