A blog about SQL Server, SSIS, C# and whatever else I happen to be dealing with in my professional life.

Find ramblings

Tuesday, April 21, 2015

Biml - separate dataflows into separate packages

I have an upcoming client with overly complicated SSIS packages. One package alone has 80 dataflow tasks, very few of which are related to each other. You wouldn't believe, but they have maintenance issues with this package! Especially with regard to concurrent editing of an SSIS package which doesn't exist. To simplify their solution, part of my proposal is that we take the massive, monolithic SSIS package and break it out into many teeny-tiny SSIS packages that just focus on a single task. This will allow for one developer to change the sales order detail dataflow task while another can safely edit the sales employee data flow task as they'll be contained in completely separate packages. I'll then build a master package that ensures proper invocation of the subpackages.

To get started, I'm going to use Mist to reverse engineer my packages into Biml. Let's assume it generates the following package.

<Biml xmlns="http://schemas.varigence.com/biml.xsd">
    <Packages>
        <Package Name="SourcePackage">
            <Tasks>
                <Dataflow Name="DFT 1" />
                <Dataflow Name="DFT 2" />
                <ExecuteProcess Name="EPT Do not include" Executable="C:\Windows\System32\cmd.exe" />
                <Container Name="SEQ A">
                    <Tasks>
                        <Dataflow Name="DFT A1" />
                    </Tasks>
                </Container>
            </Tasks>
        </Package>
        <Package Name="SourcePackage2">
            <Tasks>
                <Dataflow Name="DFT 1" />
                <Dataflow Name="DFT 2" />
                <ExecuteProcess Name="EPT Do not include" Executable="C:\Windows\System32\cmd.exe" />
                <Container Name="SEQ A">
                    <Tasks>
                        <Dataflow Name="DFT A1" />
                    </Tasks>
                </Container>
            </Tasks>
        </Package>
    </Packages>
</Biml>
I only want the data flow tasks extracted from this. You can also see that I have data flow tasks encapsulated into containers. That's a pain to enumerate through in the .NET object model but Biml gives us lovely LINQ shortcuts. The first thing I want to do is find all of dataflow tasks. RootNode.AllDefinedSuccessors().OfType() generates an enumerable list.

For each of the items in our list, we want to do something. Here I'm creating a new package with the name of the original package + the data flow task. I assume this will be unique but it's not guaranteed. Within my Tasks collection, I simply call the GetBiml method which preserves the definition of my data flow.

<#@ template language="C#" tier="2" #>
<Biml xmlns="http://schemas.varigence.com/biml.xsd">
    <Packages>
    <# 
    foreach(var t in RootNode.AllDefinedSuccessors().OfType<AstDataflowTaskNode>())
    {
    #>
        <Package Name="<#=t.FirstThisOrParent<AstPackageNode>() #> - <#=t.Name #>">
            <Tasks>
                <#=t.GetBiml() #>
            </Tasks>
        </Package>
    <#
    }
    #>        
    </Packages>
</Biml>
If you're in BIDS Helper, simply highlight both files and click generate SSIS packages. Out comes 8 SSIS packages, the 2 originals and then one for each data flow.

Think about the possibilities this opens up for you! Maybe I'd also like to add in an auditing framework (I would). Maybe I want to then turn on logging and configuration since this client is on 2008. That's just as easy. Oh Biml, I <3 you.

No comments: