World of Whatever

A blog about SQL Server, SSIS, C# and whatever else I happen to be dealing with in my professional life.

Find ramblings

Loading...

Tuesday, April 21, 2015

Biml - separate dataflows into separate packages

I have an upcoming client with overly complicated SSIS packages. One package alone has 80 dataflow tasks, very few of which are related to each other. You wouldn't believe, but they have maintenance issues with this package! Especially with regard to concurrent editing of an SSIS package which doesn't exist. To simplify their solution, part of my proposal is that we take the massive, monolithic SSIS package and break it out into many teeny-tiny SSIS packages that just focus on a single task. This will allow for one developer to change the sales order detail dataflow task while another can safely edit the sales employee data flow task as they'll be contained in completely separate packages. I'll then build a master package that ensures proper invocation of the subpackages.

To get started, I'm going to use Mist to reverse engineer my packages into Biml. Let's assume it generates the following package.

<Biml xmlns="http://schemas.varigence.com/biml.xsd">
    <Packages>
        <Package Name="SourcePackage">
            <Tasks>
                <Dataflow Name="DFT 1" />
                <Dataflow Name="DFT 2" />
                <ExecuteProcess Name="EPT Do not include" Executable="C:\Windows\System32\cmd.exe" />
                <Container Name="SEQ A">
                    <Tasks>
                        <Dataflow Name="DFT A1" />
                    </Tasks>
                </Container>
            </Tasks>
        </Package>
        <Package Name="SourcePackage2">
            <Tasks>
                <Dataflow Name="DFT 1" />
                <Dataflow Name="DFT 2" />
                <ExecuteProcess Name="EPT Do not include" Executable="C:\Windows\System32\cmd.exe" />
                <Container Name="SEQ A">
                    <Tasks>
                        <Dataflow Name="DFT A1" />
                    </Tasks>
                </Container>
            </Tasks>
        </Package>
    </Packages>
</Biml>
I only want the data flow tasks extracted from this. You can also see that I have data flow tasks encapsulated into containers. That's a pain to enumerate through in the .NET object model but Biml gives us lovely LINQ shortcuts. The first thing I want to do is find all of dataflow tasks. RootNode.AllDefinedSuccessors().OfType() generates an enumerable list.

For each of the items in our list, we want to do something. Here I'm creating a new package with the name of the original package + the data flow task. I assume this will be unique but it's not guaranteed. Within my Tasks collection, I simply call the GetBiml method which preserves the definition of my data flow.

<#@ template language="C#" tier="2" #>
<Biml xmlns="http://schemas.varigence.com/biml.xsd">
    <Packages>
    <# 
    foreach(var t in RootNode.AllDefinedSuccessors().OfType<AstDataflowTaskNode>())
    {
    #>
        <Package Name="<#=t.FirstThisOrParent<AstPackageNode>() #> - <#=t.Name #>">
            <Tasks>
                <#=t.GetBiml() #>
            </Tasks>
        </Package>
    <#
    }
    #>        
    </Packages>
</Biml>
If you're in BIDS Helper, simply highlight both files and click generate SSIS packages. Out comes 8 SSIS packages, the 2 originals and then one for each data flow.

Think about the possibilities this opens up for you! Maybe I'd also like to add in an auditing framework (I would). Maybe I want to then turn on logging and configuration since this client is on 2008. That's just as easy. Oh Biml, I <3 you.

Wednesday, April 1, 2015

Never trust the SSMS GUI

There's a group of SQL Server professionals that try to help on StackOverflow and DBA.StackExchange.com. One of the most telling symptoms of a bad question is people using the GUI to administer their SQL Server instances. "I clicked here, here, here and bad thing happened". What did you actually do? Do you know?

It's ok, you don't have to be ultra hardcore and memorize every bit of syntax or spend time looking through Books Online to find the specific command you're looking. You can use the GUI to administer your machine. Just don't click the OK button. Instead, look up.

That beautiful Script button is the one you're looking for. That generates all the DDL that clicking OK would have done but now you have something you can inspect. "Trust, but verify" as someone once said. Not only can you inspect it, you can share those commands with someone else.

In today's example, I learned the GUI was silently swallowing the error. I had created an operator for myself but misspelled my last name. I correct my name, clicked OK and ... it never changed. Weird, I must have clicked Cancel. This time, I ensured I clicked that big OK button.

I hit refresh on the Operators list and what the heck? Why am I still "Bill Fellow?" I demand an "S"! Once more unto the GUI and this time, I clicked Script Action to new query window.

EXEC msdb.dbo.sp_update_operator @name=N'Bill Fellows', 
        @enabled=1, 
        @pager_days=0, 
        @email_address=N'BillinKC@world.domination', 
        @pager_address=N'', 
        @netsend_address=N''
Perfect, this is what I want. I'm updating the operator...
Msg 14262, Level 16, State 1, Procedure sp_update_operator, Line 61
The specified @name ('Bill Fellows') does not exist.
Or not. I could have been spent time yelling at the SQL Server for being "dumb" or posted some question with exact reproduction and if someone was inclined, they could have verified that yes, it doesn't appear to rename. But look what happens when you actually run the command yourself. It tells you exactly why it's not working.

Asking questions is good. We should all endeavor to ask questions of everything. But even better is go find the answer.

Happy scripting.

Thursday, March 26, 2015

Biml SSIS ErrorCode and ErrorColumn

Did you know that ErrorCode and ErrorColumn are "reserved" column names in an SSIS Data Flow? I've been doing this for ten years now and I had never run into this until this week. My client's application has an ErrorCodes table defined like this
USE [tempdb]
GO
CREATE TABLE [dbo].[ErrorCodes]
(
    [ErrorCodeId] [uniqueidentifier] NOT NULL
,   [ErrorCode] [nvarchar](3) NOT NULL
,   [ErrorText] [nvarchar](50) NOT NULL
,   [CreateDate] [datetime] NOT NULL
,   [CreatedBy] [uniqueidentifier] NULL
,   [ModifiedDate] [datetime] NULL
,   [ModifiedBy] [uniqueidentifier] NULL
,   [ModuleReference] [uniqueidentifier] NULL
,   [RowStatusId] [uniqueidentifier] NULL
,   CONSTRAINT [PK_dboErrorCode] PRIMARY KEY CLUSTERED ([ErrorCodeId] ASC)
);
We needed to replicate the data out of the application tables so naturally, I had described the pattern in Biml and let it run. Until it blew up on me. The 1.7.0 release of BIDS Helper uses the new and improved Biml engine and it actually reported an error on emission. Contrast that with the 1.6.0 release which happily emitted the DTSX. This of course is documented in the Release Notes for Mist 4.0 Update 1

BimlScript Errors/Warnings

  • Added an error for duplicate dataflow column nodes. This usually arises with "reserved" column names "ErrorCode" and "ErrorText" (sic)

That should actually be "ErrorColumn". There's an ErrorDescription that is added in the OnError event and I've sent an email along to Varigence to get that corrected but I'm not seeing an ErrorText anywhere.

Reproduction

The following Biml is sufficient for you to see the error generated (and to show off the new error reporting functionality in BIDS Helper 1.7.0)
<Biml xmlns="http://schemas.varigence.com/biml.xsd">
    <Connections>
        <OleDbConnection Name="CM_OLE" ConnectionString="Data Source=localhost\dev2014;Initial Catalog=tempdb;Provider=SQLNCLI10.1;Integrated Security=SSPI;Auto Translate=False;" />
    </Connections>
    <Packages>
        <Package ConstraintMode="Linear" Name="ErrorCodeTest">
            <Variables>
                <Variable DataType="String" Name="SimulateBadTable"><![CDATA[SELECT * FROM (VALUES (NULL, NULL)) D(ErrorCode, ErrorColumn); ]]></Variable>
            </Variables>
            <Tasks>
                <Dataflow Name="DFT Test">
                    <Transformations>
                        <OleDbSource ConnectionName="CM_OLE" Name="OLE_SRC dbo_ErrorCodes">
                            <VariableInput VariableName="User.SimulateBadTable"></VariableInput>
                        </OleDbSource>
                        <DerivedColumns Name="Do Nothing"></DerivedColumns>
                    </Transformations>
                </Dataflow>
            </Tasks>
        </Package>
    </Packages>
</Biml>
Component OLE_SRC dbo_ErrorCodes of type AstOleDbSourceNode in Dataflow DFT Test has duplicate column reference with name 'ErrorColumn' on ouput 'Error'. This may be due to the 'ErrorCode' and 'ErrorColumn' columns that are automatically added to error output paths by SSIS. Ensure that all column names are unique.

Known issue

This is apparently a known issue based on this Connect Item SSIS - ErrorCode column in source table causes duplicate ErrorCode to be introduced but it's marked as Closed - Fixed. I'm not seeing the fixed part with SQL Server 2014

For those Bimling, it's also logged over on the Varigence forums but they simply reference the Connect item.

What to do

I don't know yet (sadly, this excites me). My lazy pattern was using the table selector for my OLE DB Source but for this one table, I'd need to explicitly grab the column list and alias the ErrorCode column as something else. What I'd rather do is rename the Error path's columns but that does not seem possible.

Wednesday, February 25, 2015

Biml - Snippets

I've been working with Biml for a year and a half now. With the Intellisense built into BIDS Helper or Mist itself, I can bang out some code fairly quick. My mental parser isn't too bad either, I can read and generally see what is/isn't set correctly in it. Except for the Script Tasks and Components. Even in Mist, they still kick me in the pants. What's the ProjectCoreName and how does that differ from the ScriptTaskProjectName and should it differ? What's the crazy syntax for escaping my code within the code? Yeah, I don't care anymore.

I no longer care, because I have a snippet. If only I had a Donk! A snippet is like a macro — type some mnemonic keystroke and if you want the snippet, hit Tab. The C# snippet that comes to mind is cw which autocompletes to Console.WriteLine. Man, that'd be helpful for slingin' Biml.

Wait, why haven't I used them? I know you can create custom snippets for .NET so why not one for "XML?" Yeah self, why not? To save you the trouble of arguing with yourself for not being clever, I'm going to tell you to start creating your own snippets, contribute them to bimlscript.com and let's get cranking.

Getting started with snippets

I don't know that you have to, but there's a very handy tool called Snippet Designer that makes it a cinch to create snippets.

Highlight the text you're interested in and in your right click menu, Export as Snippet. You can ignore the Create Snippet..., that's Red Gate's SQL Prompt and won't create the right type of snippet for these file types.

For a script task, I'm just going to assume I'm starting with an brand new Biml file so I've selected everything but that and put it into a snippet. You'll then be presented with a nice little editor so you can use things like anchors and such which I made heavy use of in TextPad's clip library.

I save it out and it goes into a file called C:\Users\bfellows\Documents\Visual Studio 2012\Code Snippets\XML\My Xml Snippets\ScriptTaskCS_2012.snippet

<?xml version="1.0" encoding="utf-8"?>
<CodeSnippets xmlns="http://schemas.microsoft.com/VisualStudio/2005/CodeSnippet">
  <CodeSnippet Format="1.0.0">
    <Header>
      <SnippetTypes>
        <SnippetType>Expansion</SnippetType>
      </SnippetTypes>
      <Title>ScriptTaskCS_2012</Title>
      <Author>admin</Author>
      <Description>
      </Description>
      <HelpUrl>
      </HelpUrl>
      <Shortcut>
      </Shortcut>
    </Header>
    <Snippet>
      <Code Language="xml"><![CDATA[    <ScriptProjects>
        <ScriptTaskProject ProjectCoreName="ST_12345" Name="ST_12345" VstaMajorVersion="0">
            <ReadOnlyVariables>
                <Variable Namespace="System" VariableName="MachineName" DataType="Boolean" />
            </ReadOnlyVariables>
            <Files>
                <File Path="ScriptMain.cs" BuildAction="Compile">using System;
using System.Data;
using Microsoft.SqlServer.Dts.Runtime;
using System.Windows.Forms;

namespace ST_12345
{
    [Microsoft.SqlServer.Dts.Tasks.ScriptTask.SSISScriptTaskEntryPointAttribute]
    public partial class ScriptMain : Microsoft.SqlServer.Dts.Tasks.ScriptTask.VSTARTScriptObjectModelBase
    {
        public void Main()
        {
            bool fireAgain = false;
            string message = Dts.Variables["System::MachineName"].Value.ToString();
            Dts.Events.FireInformation(0, "Log MachineName", message, string.Empty, 0, ref fireAgain);

            Dts.TaskResult = (int)ScriptResults.Success;
        }

        enum ScriptResults
        {
            Success = Microsoft.SqlServer.Dts.Runtime.DTSExecResult.Success,
            Failure = Microsoft.SqlServer.Dts.Runtime.DTSExecResult.Failure
        };
    }
}                </File>
                <File Path="Properties\AssemblyInfo.cs" BuildAction="Compile">
using System.Reflection;
using System.Runtime.CompilerServices;

//
// General Information about an assembly is controlled through the following
// set of attributes. Change these attribute values to modify the information
// associated with an assembly.
//
[assembly: AssemblyTitle("AssemblyTitle")]
[assembly: AssemblyDescription("")]
[assembly: AssemblyConfiguration("")]
[assembly: AssemblyCompany("I <3 @billinkc")]
[assembly: AssemblyProduct("ProductName")]
[assembly: AssemblyCopyright("Copyright @  2015")]
[assembly: AssemblyTrademark("")]
[assembly: AssemblyCulture("")]
//
// Version information for an assembly consists of the following four values:
//
//      Major Version
//      Minor Version
//      Build Number
//      Revision
//
// You can specify all the values or you can default the Revision and Build Numbers
// by using the '*' as shown below:

[assembly: AssemblyVersion("1.0.*")]
                </File>
            </Files>
            <AssemblyReferences>
                <AssemblyReference AssemblyPath="System" />
                <AssemblyReference AssemblyPath="System.Data" />
                <AssemblyReference AssemblyPath="System.Windows.Forms" />
                <AssemblyReference AssemblyPath="System.Xml" />
                <AssemblyReference AssemblyPath="Microsoft.SqlServer.ManagedDTS.dll" />
                <AssemblyReference AssemblyPath="Microsoft.SqlServer.ScriptTask.dll" />
            </AssemblyReferences>
        </ScriptTaskProject>
    </ScriptProjects>
    <Packages>
        <Package Name="BasicScriptTask" ConstraintMode="Linear">
            <Tasks>
                <Script ProjectCoreName="ST_12345" Name="SCR Do Stuff">
                    <ScriptTaskProjectReference ScriptTaskProjectName="ST_12345" />
                </Script>
            </Tasks>
        </Package>
    </Packages>]]></Code>
    </Snippet>
  </CodeSnippet>
</CodeSnippets>
Now when I need a script task, my workflow is
  1. Add new biml file
  2. Right click, Insert Snippet (Ctrl-K, Ctrl-X)
  3. Navigate to My Xml Snippets, select ScriptTaskCS_2012
  4. Replace the Name attribute for Package and replace all instances of ST_1235 with something a little more unique

If I click Generate SSIS Package, the biml engine is going to fire up and emit an SSIS package with a script task. How cool is that? Think about how you can leverage snippets and CallBimlScript: Replicate-o-matic, Don't Repeat Your Biml, Callable BimlScript (Caller), etc.

Like this? Joost van Rossum (b|t) just posted Creating BIML Script Component Transformation (rownumber). That's your framework for creating a Script Component, acting as a transform. Add that to your Snippets and now you have an example of each.

For large libraries, you might want to make those CoreNames unique. I'll see if there's an API call for generating a unique name. Also, you can make this into an in-line project script as Scott shows on Creating Script Task Projects inline. The difference between the two approaches boils down to do you want to create shareable, project level tasks or per-package tasks.

I am very excited about integrating snippets into my biml workflow and I hope this has opened your eyes to another means for speeding your development.

Tuesday, February 24, 2015

Is my SSIS package running in 32 or 64 bit mode?

32 versus 64 bit backstory

I might have looked at some of the questions over on StackOverflow and I've lost track of the number of times a package has worked but then it doesn't on another machine and it ends up being a mismatch between the "bittedness" of their driver and their execution model.

32 or 64 bit?

Windows was a 32 Operating System, OS, until around 2003. In the main Windows folder, you had a System folder, which was for 16 bit libraries, and System32 folder, which was for 32 bit libraries and applications. All is well and good.

Then the 64 architecture made it into Windows and now we have a third folder, this one called SysWoW64 which contains? .... 32 bit applications. Of course. The 64bit applications and libraries are in the System32 folder. I am not making this up. So, 32 bit in the folder named 64, 64 bit in the folder named 32. Got it.

But wait, there's more. Not only can your OS come in 32 and 64 bit flavours, so can your applications and drivers! Applications usually install in the Program Files directory. If you are on a 64 bit OS, then your "Program Files" is going to contain your 64 bit executables while your "Program Files (x86)" will contain your 32 bit executables. But don't worry, if you're on a 32 bit OS, there won't be an (x86) folder and the 32 bit executables will be in "Program Files."

So what?

Think of 32 vs 64 bit as height in inches. A two year old is probably 32" while a 17 years old may be 64". If I put a knick knack on the top shelf, only the 17 year old can reach (address) it. If I make the mistake of putting it on a low shelf, 17 year old can't be bothered to bend down to pick it up but the 2 year old can and will do their best Godzilla impersonation on it.

What's drivers got to do with it

SSIS can target a variety of sources and sinks out of the box. Flat files, web services, Active Directory, SQL Server: piece of cake. Excel, Access, Informix, DB2, MySQL, Oracle: not so much with the cake. The problem is that you need special drivers to get SSIS to talk to these providers. Some, like Excel* are part of the base installation. Others might require an special download.

Some drivers come in both kinds, 32 and 64 bit. Others, are only found in the 32 bit variety. I'm sure there's some esoteric driver that only works in 64 bit but I've never found someone lamenting this on a forum.

Often, with these providers, you will create a Data Source Name, DSN, to provide configuration information or provide your own unique file that starts with TNS and ends with ORA. What's important to realize is that you will need to align these configuration values with the correct bit version of your driver and your target package execution mode.

The executable for ODBC driver administration is odbcad32.exe. That tool exists in SysWow64, our 32 bit app location, and also in System32, our 64 bit app location but it's still physically called odbcad32.exe in both locations. Have I mentioned I deal with this confusion with some frequency? I don't know why, it seems so readily apparent. Further muddying the waters, I believe it was the Server 2003 the Control Panel, Administrator Tools, ODBC Data Sources only pointed to one of them. You had to know the other existed and find it to use it. The current interface at least lists it twice and indicates which is 32 versus 64 bit.

Running SSIS packages

SSIS packages "run" by getting called from an executable named dtexec.exe. If you have SSIS/BIDS/SSDT installed on your machine, you likely have two versions of dtexec.exe. Assuming default paths, you likely have the following them installed at the following paths.

SQL Server Version32 bit64 bit (default)
2005C:\Program Files (x86)\Microsoft SQL Server\90\DTS\Binn\DTExec.exeC:\Program Files\Microsoft SQL Server\90\DTS\Binn\DTExec.exe
2008C:\Program Files (x86)\Microsoft SQL Server\100\DTS\Binn\DTExec.exeC:\Program Files\Microsoft SQL Server\100\DTS\Binn\DTExec.exe
2012C:\Program Files (x86)\Microsoft SQL Server\110\DTS\Binn\DTExec.exeC:\Program Files\Microsoft SQL Server\110\DTS\Binn\DTExec.exe
2014C:\Program Files (x86)\Microsoft SQL Server\120\DTS\Binn\DTExec.exeC:\Program Files\Microsoft SQL Server\120\DTS\Binn\DTExec.exe

This matters as the default executable that gets run will likely be the 64 bit version. If you're trying to execute a package in 32 bit mode from the command line, you will need to explicitly reference the dtexec.exe in the x86 folder.

Observant folks may see that the dtexec offers a /X86 option. Don't believe it. The only way to get the correct bit-ness is to explicitly call the correct dtexec.exe The documentation even says as much but nobody reads documentation. This option is only used by SQL Server Agent. This option is ignored if you run the dtexec utility at the command prompt.

SSISDB notes

For those working with the Project Deployment Model (2012/2014), you don't have to worry about paths when spawning an execution instance. It's a simple matter of passing True/False to the @user32bitruntime parameter (line 6 below)

   1:  EXEC [SSISDB].[catalog].[create_execution]
   2:      @package_name = N'PartialLookup.dtsx'
   3:  ,   @execution_id = @execution_id OUTPUT
   4:  ,   @folder_name = N'POC'
   5:  ,   @project_name = N'BimlTest'
   6:  ,   @use32bitruntime = True
   7:  ,   @reference_id = NULL

Tracking whether an execution instance was 32 or 64 bit isn't readily apparent with the native reports but a simple query against catalog.executions will reveal it.

SELECT
    E.use32bitruntime
,   *
FROM
    catalog.executions AS E
WHERE
    E.execution_id = @OperationIDFromReportUpperLeftCorner;

Putting it all together, mostly

If I'm in 32 bit space, I can only work with the drivers and data source names I know about. The DSN I'm looking for might be in 64 bit space but I'll never be able to reach it from the depths of 32 bit. To paraphrase: 32 is 32, and 64 is 64, and ne'er the twain shall meet.

You might swear up and down you created the DSN or ran it in 32 bit mode but JET's not installed and this a bug with SSIS but before you post a question on StackOverflow, the MSDN forums or Connect, double check.

How do I double check?

Finally, what I was originally putting this post together to cover. For 2012/2014 packages in the project deployment model, you already have your answer. For everything else, there's a Script Task.

I know, I heard you saw ewwwww. It's not true. When you run your package from the command line, the first two lines

Microsoft (R) SQL Server Execute Package Utility
Version 12.0.2000.8 for 64-bit
Right there, it says this is the 64 bit version of dtexec and that I don't need to patch my VM. But, there are other ways of invoking a package. What if I started my package via .NET code or am running it in Visual Studio?

The quickest way I've been able to determine is to evaluate IntPtr.Size. If it's 4, then it's 32 bit, if it's 8, then it's 64.

Assuming you pass in System::InteractiveMode as a read parameter, this little script will fire an information event alerting you whether you're 32 or 64 bit. It can also pop up a message box, because everyone loves those.

            bool fireAgain = false;
            string message = string.Empty;
            message = string.Format("I am running in {0} mode", (IntPtr.Size == 4) ? "32 bit":"64 bit");
            Dts.Events.FireInformation(0, "Log bittedness", message, string.Empty, 0, ref fireAgain);
            if ((bool)this.Dts.Variables["System::InteractiveMode"].Value)
            {
                MessageBox.Show(message);
            }

Biml

A post wouldn't be complete without some Biml! If you wanted to, you could add this little bit into every package you emit and then you'd know, absolutely know whether stumbled onto a bug or the problem exists between keyboard and chair.

Don't be afraid of that hot mess of code. All it does is generate an SSIS package that uses a Script Task that has the above code in it. You can use the CallBimlScript trick from the Replicate-O-Matic post to encapsulate the script task into an external file but that's not really as flexible as I'd want it to be.

<Biml xmlns="http://schemas.varigence.com/biml.xsd">

<ScriptProjects>
    <ScriptTaskProject ProjectCoreName="ST_12345" Name="ST_12345" VstaMajorVersion="0">
        <ReadOnlyVariables>
            <Variable Namespace="System" VariableName="InteractiveMode" DataType="Boolean" />
        </ReadOnlyVariables>
        <Files>
            <File Path="ScriptMain.cs" BuildAction="Compile">using System;
using System.Data;
using Microsoft.SqlServer.Dts.Runtime;
using System.Windows.Forms;

namespace ST_12345
{
    [Microsoft.SqlServer.Dts.Tasks.ScriptTask.SSISScriptTaskEntryPointAttribute]
    public partial class ScriptMain : Microsoft.SqlServer.Dts.Tasks.ScriptTask.VSTARTScriptObjectModelBase
    {
        public void Main()
        {
            bool fireAgain = false;
            string message = string.Empty;
            message = string.Format("I am running in {0} mode", (IntPtr.Size == 4) ? "32 bit":"64 bit");
            Dts.Events.FireInformation(0, "Log bittedness", message, string.Empty, 0, ref fireAgain);
            if ((bool)this.Dts.Variables["System::InteractiveMode"].Value)
            {
                MessageBox.Show(message);
            }

            Dts.TaskResult = (int)ScriptResults.Success;
        }

        enum ScriptResults
        {
            Success = Microsoft.SqlServer.Dts.Runtime.DTSExecResult.Success,
            Failure = Microsoft.SqlServer.Dts.Runtime.DTSExecResult.Failure
        };
    }
}                </File>
            <File Path="Properties\AssemblyInfo.cs" BuildAction="Compile">
using System.Reflection;
using System.Runtime.CompilerServices;

//
// General Information about an assembly is controlled through the following
// set of attributes. Change these attribute values to modify the information
// associated with an assembly.
//
[assembly: AssemblyTitle("AssemblyTitle")]
[assembly: AssemblyDescription("Bill is awesome")]
[assembly: AssemblyConfiguration("")]
[assembly: AssemblyCompany("")]
[assembly: AssemblyProduct("ProductName")]
[assembly: AssemblyCopyright("Copyright @  2015")]
[assembly: AssemblyTrademark("")]
[assembly: AssemblyCulture("")]
//
// Version information for an assembly consists of the following four values:
//
//      Major Version
//      Minor Version
//      Build Number
//      Revision
//
// You can specify all the values or you can default the Revision and Build Numbers
// by using the '*' as shown below:

[assembly: AssemblyVersion("1.0.*")]
                </File>
        </Files>
        <AssemblyReferences>
            <AssemblyReference AssemblyPath="System" />
            <AssemblyReference AssemblyPath="System.Data" />
            <AssemblyReference AssemblyPath="System.Windows.Forms" />
            <AssemblyReference AssemblyPath="System.Xml" />
            <AssemblyReference AssemblyPath="Microsoft.SqlServer.ManagedDTS.dll" />
            <AssemblyReference AssemblyPath="Microsoft.SqlServer.ScriptTask.dll" />
        </AssemblyReferences>
    </ScriptTaskProject>
</ScriptProjects>
<Packages>
    <Package Name="CheckMyBits" ConstraintMode="Linear">
        <Tasks>
            <Script ProjectCoreName="ST_12345" Name="SCR Do Stuff">
                <ScriptTaskProjectReference ScriptTaskProjectName="ST_12345" />
            </Script>
        </Tasks>
    </Package>
</Packages>
</Biml>

That's a lot of Biml to try and remember, if only there was a way to keep that handy...

Wednesday, February 18, 2015

Slimming down the SSIS Script Task

SSIS Script Task History

Gather 'round children, I want to tell you a tale of woe. The 2005 release of SQL Server Integration Services allowed you to use any .NET language you wanted in a Script Task or Script Component, as long as you liked Visual Basic .NET. The 2008 release of SSIS allowed us to use either "Microsoft Visual C# 2008" or "Microsoft Visual Basic 2008". Many .NET devs rejoiced over this and that's what today's post is about.

This, is what the standard Task would generate as template code.

/*
   Microsoft SQL Server Integration Services Script Task
   Write scripts using Microsoft Visual C# 2008.
   The ScriptMain is the entry point class of the script.
*/

using System;
using System.Data;
using Microsoft.SqlServer.Dts.Runtime;
using System.Windows.Forms;

namespace ST_d80f6050516944fa8639234f7b2e50b9.csproj
{
    [System.AddIn.AddIn("ScriptMain", Version = "1.0", Publisher = "", Description = "")]
    public partial class ScriptMain : Microsoft.SqlServer.Dts.Tasks.ScriptTask.VSTARTScriptObjectModelBase
    {

        #region VSTA generated code
        enum ScriptResults
        {
            Success = Microsoft.SqlServer.Dts.Runtime.DTSExecResult.Success,
            Failure = Microsoft.SqlServer.Dts.Runtime.DTSExecResult.Failure
        };
        #endregion

        /*
        The execution engine calls this method when the task executes.
        To access the object model, use the Dts property. Connections, variables, events,
        and logging features are available as members of the Dts property as shown in the following examples.

        To reference a variable, call Dts.Variables["MyCaseSensitiveVariableName"].Value;
        To post a log entry, call Dts.Log("This is my log text", 999, null);
        To fire an event, call Dts.Events.FireInformation(99, "test", "hit the help message", "", 0, true);

        To use the connections collection use something like the following:
        ConnectionManager cm = Dts.Connections.Add("OLEDB");
        cm.ConnectionString = "Data Source=localhost;Initial Catalog=AdventureWorks;Provider=SQLNCLI10;Integrated Security=SSPI;Auto Translate=False;";

        Before returning from this method, set the value of Dts.TaskResult to indicate success or failure.
        
        To open Help, press F1.
    */

        public void Main()
        {
            // TODO: Add your code here
            Dts.TaskResult = (int)ScriptResults.Success;
        }
    }
}
That's 50 lines in total, my trim job brings that down to 23.

I hate regions. Hate them with the fury of a thousand suns. Ctrl-M, Ctrl-P stops all outlining but I have to click that every time I open the script, or I have to remove the stupid #region lines. That's a minor annoyance but one I lived through.

The 2012/2014 release of SSIS was designed to make it easier for people to get started. We had these getting started videos that were suggested every time you create a new integration services project which is really charming when your job is an ETL developer. As part of the rookie developer changes, the default Script Task now provides you with a lot more hand holding with regard to developing your first Task. The following is that template

#region Help:  Introduction to the script task
/* The Script Task allows you to perform virtually any operation that can be accomplished in
 * a .Net application within the context of an Integration Services control flow. 
 * 
 * Expand the other regions which have "Help" prefixes for examples of specific ways to use
 * Integration Services features within this script task. */
#endregion


#region Namespaces
using System;
using System.Data;
using Microsoft.SqlServer.Dts.Runtime;
using System.Windows.Forms;
#endregion

namespace ST_12345
{
    /// <summary>
    /// ScriptMain is the entry point class of the script.  Do not change the name, attributes,
    /// or parent of this class.
    /// </summary>
    [Microsoft.SqlServer.Dts.Tasks.ScriptTask.SSISScriptTaskEntryPointAttribute]
    public partial class ScriptMain : Microsoft.SqlServer.Dts.Tasks.ScriptTask.VSTARTScriptObjectModelBase
    {
        #region Help:  Using Integration Services variables and parameters in a script
        /* To use a variable in this script, first ensure that the variable has been added to 
         * either the list contained in the ReadOnlyVariables property or the list contained in 
         * the ReadWriteVariables property of this script task, according to whether or not your
         * code needs to write to the variable.  To add the variable, save this script, close this instance of
         * Visual Studio, and update the ReadOnlyVariables and 
         * ReadWriteVariables properties in the Script Transformation Editor window.
         * To use a parameter in this script, follow the same steps. Parameters are always read-only.
         * 
         * Example of reading from a variable:
         *  DateTime startTime = (DateTime) Dts.Variables["System::StartTime"].Value;
         * 
         * Example of writing to a variable:
         *  Dts.Variables["User::myStringVariable"].Value = "new value";
         * 
         * Example of reading from a package parameter:
         *  int batchId = (int) Dts.Variables["$Package::batchId"].Value;
         *  
         * Example of reading from a project parameter:
         *  int batchId = (int) Dts.Variables["$Project::batchId"].Value;
         * 
         * Example of reading from a sensitive project parameter:
         *  int batchId = (int) Dts.Variables["$Project::batchId"].GetSensitiveValue();
         * */

        #endregion

        #region Help:  Firing Integration Services events from a script
        /* This script task can fire events for logging purposes.
         * 
         * Example of firing an error event:
         *  Dts.Events.FireError(18, "Process Values", "Bad value", "", 0);
         * 
         * Example of firing an information event:
         *  Dts.Events.FireInformation(3, "Process Values", "Processing has started", "", 0, ref fireAgain)
         * 
         * Example of firing a warning event:
         *  Dts.Events.FireWarning(14, "Process Values", "No values received for input", "", 0);
         * */
        #endregion

        #region Help:  Using Integration Services connection managers in a script
        /* Some types of connection managers can be used in this script task.  See the topic 
         * "Working with Connection Managers Programatically" for details.
         * 
         * Example of using an ADO.Net connection manager:
         *  object rawConnection = Dts.Connections["Sales DB"].AcquireConnection(Dts.Transaction);
         *  SqlConnection myADONETConnection = (SqlConnection)rawConnection;
         *  //Use the connection in some code here, then release the connection
         *  Dts.Connections["Sales DB"].ReleaseConnection(rawConnection);
         *
         * Example of using a File connection manager
         *  object rawConnection = Dts.Connections["Prices.zip"].AcquireConnection(Dts.Transaction);
         *  string filePath = (string)rawConnection;
         *  //Use the connection in some code here, then release the connection
         *  Dts.Connections["Prices.zip"].ReleaseConnection(rawConnection);
         * */
        #endregion


        /// <summary>
        /// This method is called when this script task executes in the control flow.
        /// Before returning from this method, set the value of Dts.TaskResult to indicate success or failure.
        /// To open Help, press F1.
        /// </summary>
        public void Main()
        {
            // TODO: Add your code here

            Dts.TaskResult = (int)ScriptResults.Success;
        }

        #region ScriptResults declaration
        /// <summary>
        /// This enum provides a convenient shorthand within the scope of this class for setting the
        /// result of the script.
        /// 
        /// This code was generated automatically.
        /// </summary>
        enum ScriptResults
        {
            Success = Microsoft.SqlServer.Dts.Runtime.DTSExecResult.Success,
            Failure = Microsoft.SqlServer.Dts.Runtime.DTSExecResult.Failure
        };
        #endregion

    }
}

Mercy, I get all twitchy in the eye just looking at it. That is 113 lines of code and for those that really pay attention to such things, there's a delightful mix of tabs and space characters. Again, after I go through my defluffing process, I'm back to 23 lines. But, the stripping process is much slower. I have more regions to deal with and inaccurate comments to clean and it's just a lot more work than my OCD brain should have to expend.

There must be a way to take the training wheels off a Script Task. Today I found that switch. Meet the ProjectTemplatePath property. On my 2012 installation, it points to C:\Program Files (x86)\Microsoft SQL Server\110\DTS\Binn\VSTA11_IS_ST_CS_Template.vstax

VSTA11_IS_ST_CS_Template.vstax

In fine Microsoft tradition, the vstax file is a ... what? Anyone? This fine document states An Open Packaging Container (OPC) file that contains one or more project templates which to me says "a zip file." So, I copied that out of my installation location and unzipped it. 10 files
  • AssemblyInfo.cs
  • IS%20Script%20Task%20Project.csproj
  • Resources.Designer.cs
  • Resources.resx
  • ScriptMain.cs
  • Settings.Designer.cs
  • Settings.settings
  • VSTA11_IS_ST_CS_Template.vstatemplate
  • vstax.manifest
  • [Content_Types].xml
Oh, how lovely! Point your favourite text editor at ScriptMain.cs. Look familiar? That is your ScriptMain, except it has a token of $safeprojectname$ instead of ST_12345 for the namespace.

There be dragons here

What we're about to do has the possibility of breaking your Visual Studio/BIDS/SSDT installation and you should not do it.

Really, by mucking about you and you alone are responsible for your actions. You break it, you fix it.

Excellent, you're still here. Step 1. Edit the contents of ScriptMain.cs as you see fit. Mine looks like the following.

using System;
using System.Data;
using Microsoft.SqlServer.Dts.Runtime;
using System.Windows.Forms;

namespace $safeprojectname$
{
    [Microsoft.SqlServer.Dts.Tasks.ScriptTask.SSISScriptTaskEntryPointAttribute]
    public partial class ScriptMain : Microsoft.SqlServer.Dts.Tasks.ScriptTask.VSTARTScriptObjectModelBase
    {
        public void Main()
        {
            Dts.TaskResult = (int)ScriptResults.Success;
        }

        enum ScriptResults
        {
            Success = Microsoft.SqlServer.Dts.Runtime.DTSExecResult.Success,
            Failure = Microsoft.SqlServer.Dts.Runtime.DTSExecResult.Failure
        };

    }
}
I do comment my code, usually to the level that F/X Cop stops barking at me but I don't want to start with the annoying boilerplate comments.

Step 2 Create a new version of the vstax file. I could not figure out how to make the native windows compressed folder thing to not compress the archive. 7-Zip however makes this a snap.

  1. Select all 10 files in the folder. Do not select the enclosing folder or your archive will be wrong
  2. Right click and choose the Add to Archive option
  3. Change your "Archive format" to zip and the "Compression level" to Store
  4. Give the archive a name like VSTA11_IS_ST_CS_Template.vstax
If the resulting zip is approximately 8KB, you have compression turned on. This won't cause SSDT to break, but it also won't be able to instantiate the Task editor until you fix it.

Step 3 Backup. Make a copy of your existing VSTA11_IS_ST_CS_Template.vstax file in the Binn folder. Keep this safe.

Step 4 Replace the vstax file in the Binn folder with the one you just created. You will be prompted to perform an admin task with this since it's a protected folder. I clicked yes, but you shouldn't because you machine may catch on fire.

Step 5 Test. You don't even have to restart Visual Studio. Just drag a new Script Task onto your canvas and click edit script. If you've done everything correctly, you'll be sporting a slimmer Script Task. If this doesn't work, then replace your modified vstax file with the copy you made in Step 3.

Thursday, January 22, 2015

Biml - Replicate-O-Matic

I while back I posted about Copy all the tables. Since then, I've done a lot more Biml and one of the problems I had was duplicated logic across lots of .biml files. I say I had the problem because I didn't know about CallBimlScript. Well, I knew about it because yeah, I can include a file but I didn't realize the power there. CallBimlScript allows you to make functions for your biml. You can define parameters to it. That changes everything in my mind. Let's look the copy all tables example but this time, let's modularize it.

We'll create two biml files. One will be inc_Package.biml and the other will be Driver.biml. The definition of Driver.Biml remains the same, we're just going to cut out the actual Package block(s).

<#@ template language="C#" hostspecific="true" #>
<#@ import namespace="System.Data" #>
<#@ import namespace="System.Data.SqlClient" #>
<#@ import namespace="System.IO" #>
<Biml xmlns="http://schemas.varigence.com/biml.xsd">
<!--
<#
    string connectionStringSource = @"Server=localhost\dev2012;Initial Catalog=AdventureWorksDW2012;Integrated Security=SSPI;Provider=SQLNCLI11.1";
    string connectionStringDestination = @"Server=localhost\dev2012;Initial Catalog=AdventureWorksDW2012_DEST;Integrated Security=SSPI;Provider=SQLNCLI11.1";

    string SrcTableQuery =     @"
SELECT
    SCHEMA_NAME(t.schema_id) AS schemaName
,   T.name AS tableName
FROM
    sys.tables AS T
WHERE
    T.is_ms_shipped = 0
    AND T.name <> 'sysdiagrams';
";

    DataTable dt = null;
    dt = ExternalDataAccess.GetDataTable(connectionStringSource, SrcTableQuery);
#>    
-->
    <Connections>
        <OleDbConnection
            Name="SRC"
            CreateInProject="false"
            ConnectionString="<#=connectionStringSource#>"
            RetainSameConnection="false">
        </OleDbConnection>
        <OleDbConnection
            Name="DST"
            CreateInProject="false"
            ConnectionString="<#=connectionStringDestination#>"
            RetainSameConnection="false">
        </OleDbConnection>
    </Connections>

    <Packages>
        <# foreach (DataRow dr in dt.Rows) { #>
            <#=CallBimlScript("inc_Package.biml", dr[0].ToString(), dr[1].ToString())#>
        <# } #>
    </Packages>
</Biml>
So that looks a little easier to understand. We get our dataset from our external source, create the connections and for each table in our dataset, we call the inc_Package.biml file passing in columns 0 and 1, which corresponds to our schema name and table name.

Our inc_Package.biml file isn't that different either. What's crucial is the first two lines where we define our property values of schema and table. Our file expects two parameters now and then we reference them in the code as <#=schema#>, where previously we would have used <#=dr[0].ToString()#>

<#@ property name="schema" type="String" #>
<#@ property name="table" type="String" #>

            <Package ConstraintMode="Linear"
                Name="<#=schema#>_<#=table#>"

            >
            <Variables>
                <Variable Name="SchemaName" DataType="String"><#=schema#></Variable>
                <Variable Name="TableName" DataType="String"><#=table#></Variable>
                <Variable Name="QualifiedTableSchema"
                          DataType="String"
                          EvaluateAsExpression="true">"[" +  @[User::SchemaName] + "].[" +  @[User::TableName] + "]"</Variable>
            </Variables>
            <Tasks>
                <Dataflow
                    Name="DFT"
                >
                    <Transformations>
                        <OleDbSource
                            Name="OLE_SRC <#=schema#>_<#=table#>"
                            ConnectionName="SRC"
                        >
                            <TableFromVariableInput VariableName="User.QualifiedTableSchema"/>
                        </OleDbSource>
                        <OleDbDestination
                            Name="OLE_DST <#=schema#>_<#=table#>"
                            ConnectionName="DST"
                            KeepIdentity="true"
                            TableLock="true"
                            UseFastLoadIfAvailable="true"
                            KeepNulls="true"
                            >
                            <TableFromVariableOutput VariableName="User.QualifiedTableSchema" />                        
                        </OleDbDestination>
                    </Transformations>
                </Dataflow>

            </Tasks>
            </Package>

I right click on Driver.biml and select Generate SSIS Packages and boom! many packages are created to replicate all the data between my source and destination. I can't believe this is available in BIDS Helper, aka free version, of the biml engine. I have so much code to go and simplify.

Further reading