A blog about SQL Server, SSIS, C# and whatever else I happen to be dealing with in my professional life.

Find ramblings

Thursday, January 6, 2022

SSIS Azure Feature Pack and the Flexible File components

SSIS Azure Feature Pack and the Flexible File components

The Azure Feature Pack for SSIS is something I had not worked with before today. I have a client that wants to use the Flexible File Task/Flexible File Source/Flexible File Destination but they were having issues. The Flexible File tools allow you to work with Azure Blob storage. We were dealing with ADLS Gen2 but the feature pack can work with classic blob storage as well. In my hubris, I said no problem, I know SSIS. Dear reader, I did not know as much as I thought I did...

Our scenario was simple. We had a root folder datalake and subfolders of Raw. And into that we were needed to land and then consume files. Easy peasy. The Flexible File Destination allows us to write. The Flexible File Source allows us to read and we can configure a Foreach File enumerator to use the "Foreach Data Lake Storage Gen2 File Enumerator" to interact with the file system. Everything is the same as Windows except we use forward slashes instead of backslashes for path separators.

I started with the Flexible File Source after I manually created a CSV and uploaded it to data lake. One of the things I wasn't sure about was path syntax - do I need to specify the leading slash, do I need to specify the trailing slash, etc. I think I determined it didn't matter, it'd figure out whether folder path needs a trailing slash if it wasn't specified.

As I was testing things, changing that value and the value in the Flat File Destination each time was going to be annoying so I wanted to see what my options were for using Expresssions. Expressions are the secret sauce to making your SSIS packages graceful. The development teams took the same approach they have with the ADO.NET components in that there are no expressions on the components themselves. Instead, if you want to make the Flexible File Source/Flexible File Destination dynamic at all, you're going to have to look at the Data Flow Tasks Expressions and then configure there.

Here I used an SSIS package variable @[User::ADLSPath] so I could easily switch out /datalake, datalake, datalake/raw, datalake/Raw/, /datalake/raw/, and evaluate case sensitivity, path manipulation, etc.

F5

Transfer data error : System.IO.FileNotFoundException: Could not load file or assembly 'Microsoft.Azure.Storage.Common, Version=11.1.3.0, Culture=neutral, PublicKeyToken=31bf3856ad364e35' or one of its dependencies. The system cannot find the file specified. File name: 'Microsoft.Azure.Storage.Common, Version=11.1.3.0, Culture=neutral, PublicKeyToken=31bf3856ad364e35'

I tried it again, same issue. I flipped from 64 to 32 bit (I had installed both sets of the Feature Pack). I reread the install requirements document. I looked through github bugs. I contemplated asking a question on StackOverflow. I ended up trying to reproduce the error on my desktop instead of the client's VM. Same bloody issue! Finally, I said to heck with it, maybe it more strongly worded, and I'll get this assembly and install to the GAC. Maybe something went wonky with the installs on both machines.

How do I install something into the global assembly cache?

StackOverflow answer On your file system, you might have a utility called "gacutil.exe" From a administrative command prompt, I'd type "CD \" and then "dir /s /b gacutil.exe" That should provide a list of all the instances of gacutil on your machine. Pick one, I don't care which. If there's a space in the path name, then you'll need to wrap it with double quotes. "C:\Program Files (x86)\Microsoft SDKs\Windows\v10.0A\bin\NETFX 4.8 Tools\x64\gacutil.exe" /? That would bring up the help text for using the gacutil that lives at that path. If the double quotes were not there, you'd get an error message stating
'C:\Program' is not recognized as an internal or external command, operable program or batch file.

How do I get Microsoft.Azure.Storage.Common.dll?

I hope you comment below and tell me an easier way because I am not a nuget master. Visual Studio has a nuget package manager in it. However, it only works in the contet of a solution or project and the project type must support packages. If you have a single project in your solution and that project is an SSIS project, attempting to use the nuget package manager will result in an error
Operation Failed. No projects supported by NuGet in the solution. Well fiddlesticks, I guess we have to give up.

Or, add a script task to the SSIS package and then click Edit Script. In the new instance of Visual Studio, guess what - it thinks it supports NuGet. It does not, when you close the editor, the packages go away and when the script runs, it does not have the brains to get the assemblies back from NuGet land. So, don't.close.the.editor. yet. In the NuGet Package manager, on the Browse tab, type in Microsoft.Azure.Storage.Common and look at that - Deprecated. Last stable version 11.2.3. But this error indicates the component expects 11.1.3.0 so in the Version, scroll all the way back and find it. Click the check box and click Install button. Yes, you agree to the other packages as well as the MIT license.

At this point, in the volatile/temporary file storage on your computer, you have an on disk representation of your empty script Task with a NuGet package reference. We need to find that location, e.g. C:\Users\bfellows\AppData\Local\Temp\Vsta\SSIS_ST150\VstacIlBNvWB__0KemyB8zd1UMw\ Copy the desired DLLs out into a safe spot (because if I have to do it here, I'll likely have to do it on the server and the other three development VMs) and then use the gacutil to install them.

Right click on Solution VstaProjects and choose Open in Terminal. The assembly we're looking for will be located at .\packages\Azure.Storage.Common.12.9.0\lib\netstandard2.0\Azure.Storage.Common.dll. Assume I copied it to C:\temp

"C:\Program Files (x86)\Microsoft SDKs\Windows\v10.0A\bin\NETFX 4.8 Tools\x64\gacutil.exe" -if C:\temp\Azure.Storage.Common.dll will force install the dll to the GAC. Now when I run SSIS, we'll see whether that has resolved our error.

Will the real error please stand up

It did resolve our error, but not. The error that was reported, missing assembly was Mickey Mouse, mate. Spurious. Not genuine.

Look what error decided to show up now that it could error out "better"
Transfer data error : Microsoft.DataTransfer.Common.Shared.HybridDeliveryException: ADLS Gen2 operation failed for: Operation returned an invalid status code 'BadRequest'. Account: 'datalakedev'. FileSystem: 'datalake'. ErrorCode: 'TlsVersionNotPermitted'. Message: 'The TLS version of the connection is not permitted on this storage account.'

If I go to my storage account, under Settings, Configuration, there I can change Minimum TLS version from 1.2 to 1.1. Oh, except that still isn't kosher - same error. 1.0 it is and lo and behold, I have data transfer. The root cause is not a missing assembly, it is a red herring error message that could only be resolved by adding the assembly to the global assembly cache.

Rant

How in the hell would a normal person make the connection between "Could not load file or assembly" and Oh, I need to change the TLS? What's really galling is the fact that when I used the Flexible File Source for my data flow, I specified a file on blob storage and SSIS was able to connect and read that file because it identified the associated metadata. I has two columns and here are the data types (defaulted to string but who cares, that's consistent with flat file source). BUT IT PICKED UP THE METADATA. IT COULD TALK TO AZURE BLOB STORAGE EVEN THOUGH IT ONLY ALLOWED 1.2! And yet, when it came time to run, it could not talk on the same channel. Can you see how I lost a large portion of my day trying to decipher this foolishness?

By the way, knowing that the root cause it a mismatch between the TLS SSIS/my computer is using and the default on the storage account, let's go back to the writeup for the Azure Feature Pack

Use TLS 1.2 The TLS version used by Azure Feature Pack follows system .NET Framework settings. To use TLS 1.2, add a REG_DWORD value named SchUseStrongCrypto with data 1 under the following two registry keys. HKEY_LOCAL_MACHINE\SOFTWARE\WOW6432Node\Microsoft\.NETFramework\v4.0.30319 HKEY_LOCAL_MACHINE\SOFTWARE\Microsoft\.NETFramework\v4.0.30319

So, sometimes an error message isn't the real error message but you have to clear away all the garbage between you and the source of the error to figure out what it really is.

No comments: