World of Whatever

A blog about SQL Server, SSIS, C# and whatever else I happen to be dealing with in my professional life.

Find ramblings

Thursday, February 20, 2020

Making a delimited list

Making a delimited list

There are various ways to concatenate values together. A common approach I see is that people will add a delimiter and then the value and loop until they finish. Then they take away the first delimiter. Generally, that's easier coding, prepending a delimiter, than to append the delimiter and not do it for the final element of a list. Or add it and then remove the final delimiter from the resulting string.

Gentle reader, there is a better way. And has been for quite some time but if you weren't looking for it, you might not know it exists. In .NET, it's String.Join

Look at the following code, what would you rather write? The Avoid this block or simply use the libraries?

using System;
using System.Collections.Generic;
using System.Linq;
using System.Xml.Linq;
using System.Text;
                    
public class Program
{
    public static void Main()
    {
        // generate a list of numbers
        List data = (Enumerable.Range(0, 10)).ToList();

        string delimiter = ",";
        // Avoid this, unless you know you need to do it for a specific reason
        {
            StringBuilder sb = new StringBuilder();
            foreach(var i in data)
            {
                sb.Append(delimiter);
                sb.Append(i);
            }
            
            // Convert the string builder to a string and then strip the first
            // character out of it
            string final = sb.ToString().Substring(delimiter.Length);
            
            Console.WriteLine(final);
        }
        
        // Make a comma delimited list
        Console.WriteLine(String.Join(delimiter, data));
        
        // What if we want to do something, like put each element in an XML tag?
        Console.WriteLine(String.Join(string.Empty, data.Select(x => string.Format("{0}", x)).ToList()));

    }
}
Output of running the above
0,1,2,3,4,5,6,7,8,9
0,1,2,3,4,5,6,7,8,9
0123456789
Gist for .net

But Bill, I use Python. The syntax changes but the concept remains the same. delimiter.join(data). Whatever language you use, there's probably an equivalent method. Look for it and use it. Don't write your own implementation.

Was there a better way to have done this? Let me know.

Tuesday, November 19, 2019

Generating characters in TSQL

Generating characters in TSQL

I had to do a thing* and it involved generating "codes" as numbers were too hard for people. So, if you have need to convert an arbitrary number into characters, this is your lucky day/post.

Background

As I get longer in the tooth programming becomes more accessible, I find that people might not have been exposed to underpinnings of how things used to work. Strings were just a bunch of characters put together and a character was a subset of the Latin alphabet shoved into 128 characters (0 to 127). The characters below 32 were referred to as the non-printable characters or control characters. Things above 32 are what you see on a US keyboard. There was a time, if you bought a programming book, it would have an ASCII table somewhere in the reference. Capital A is character 65, Capital Z is character 90 (65/A + 25 characters later). In TSQL, the CHAR function takes a number and gives you the ASCII character for the value so SELECT CHAR(66) AS B; will generate a capital B.

The mod or modulus function will return the remainder after division. Modding a value is a handy way to constrain a value between 0 and an upper threshold. In this case, if I modded any number by 26 (because there are 26 characters in the English alphabet), I'll get 0 to 25 as my result.

Knowing that the modulus function will give me 0 to 25 and knowing that my target character range starts at 65, I could use the previous expression to print any number's ascii value like SELECT CHAR((2147483625 % 26) + 65) AS StillB;. Break that apart, we do the modulus, %, which gives us the value of 1 which we then add to the starting offset (65).

Rolling all that together, here's a quick little tester to see what we can then do with it.

SELECT
    D.rn
,   ASCII_ORD.ord_value
,   ASCII_ORD.replicate_count
    -- CHAR converts a number to a character
,   CHAR(ASCII_ORD.ord_value) AS ord_value_as_character
    -- REPLICATE repeats a string N times
,   REPLICATE(CHAR(ASCII_ORD.ord_value), ASCII_ORD.replicate_count) AS RepeatedCharacter
    -- CONCAT is a null and type approach for string building (requires 2012+)
,   CONCAT(CHAR(ASCII_ORD.ord_value), ASCII_ORD.replicate_count) AS ConcatenatedCharacter
FROM
(
    -- Generate 0 to N-1 rows
    SELECT TOP (300)
        ROW_NUMBER() OVER (ORDER BY (SELECT NULL)) -1
    FROM
        sys.all_columns AS AC
)D(rn)
CROSS APPLY
(
    -- There are 26 characters in the English language
    -- 65 is the ASCII ordinal position of a capital A
    SELECT
        D.rn % 26 + 65
    ,   D.rn / 26 + 1
) ASCII_ORD(ord_value, replicate_count)
ORDER BY
    D.rn
;

Ultimately, it was decided that using a combination of character and digits (ConcatenatedCharacter) might be more user friendly than purely a repeated character approach. Neither of which will help you when you're in the 2 billion range like our sample input of 2147483625

Key takeaways

Don't confuse the CHAR function with the char data type. Similar but different

That's why books always had ASCII tables in them

Modulus function can generate a bounded set of numbers

Older developers might know some weird tricks/trivia

Even older developers will scoff at memorized ASCII tables in favor of EBCDIC tables

Thursday, September 5, 2019

Using Newtonsoft.Json with Biml

Using Newtonsoft.Json with Biml

Twitter provided an opportunity for a quick blog post

#sqlhelp #biml I would have the metadata in a Json structure. How would you parse the json in the C# BIML Script? I was thinking use Newtonsoft.Json but I don't know how to add the reference to it

Adding external assemblies is a snap but here I'll show how to use the NewtonSoft Json library to parse a Json based metadata structure and then use that in our Biml.

<Biml xmlns="http://schemas.varigence.com/biml.xsd">
<#
// Given the following structure

///{
///  "packages": [
///    "p1",
///    "p2",
///    "p3"
///  ]
///}

// Assume the json file is located as specified
string sourceFile = @"C:\ssisdata\trivial.json";

// Read the data into a string variable
string json = System.IO.File.ReadAllText(sourceFile);
 
// Deserialize the json into a dictionary of strings (packages) and a list of strings (p1, p2, p3)
Dictionary<string, List<string>> metadata = JsonConvert.DeserializeObject<Dictionary<string, List<string>>>(json);
#>
<Packages>
<#
// Shred the dictionary for our values
foreach (string item in metadata["packages"])
{
    //WriteLine(String.Format("<!-- {0} -->", item));
#>
    <Package Name="<#=item #>" />
<#
}
#> 
</Packages>
</Biml>

<#@ import namespace="Newtonsoft.Json" #>
<#* Assuming we have GAC'ed the assembly *#>
<#@ assembly name= "Newtonsoft.Json.dll" #>

The gist is also posted in case I mangled the hand crafted html entities above.

Also, not covered is GAC'ing the assembly but you can use an explicit path to your DLL name="C:\where\did\I\put\this\Newtonsoft.Json.dll"

Wednesday, February 27, 2019

Biml in Azure aka CallBimlScriptContent

CallBimlScriptContent was introduced with the migration from Mist to BimlStudio. Why is this cool? You do not have to use files sitting on your computer as the source for your Biml. As long as you can reconstitute the Biml contents into a string, you can store your scripts where ever you'd like. If you want them in a database, that's great. Store them in the cloud? Knock yourself out.

As a consultant, the latter is rather compelling. Maybe I'm only licensing my clients to use accelerators during our engagement. If I leave files on the file system after I roll off, or they image my computer and accidentally collect them, I am David fighting Goliath. CallBimlScriptContent is a means to protect myself and my IP. Let's look at a trivial example. I set a C# string with an empty Package tag (hooray for doubling up my double quotes). Within my Packages collection, I invoke CallBimlScriptContent passing in my Biml content.

<Biml xmlns="http://schemas.varigence.com/biml.xsd">
<#
// Do something here to populate myBimlFile
string myBimlFile = @"<Package Name=""ABC"" />";
#>    
    <Packages>
        <#=CallBimlScriptContent(myBimlFile)#>
    </Packages>
</Biml>
The rendered Biml for the above would look like
<Biml xmlns="http://schemas.varigence.com/biml.xsd">
  <Packages>
    <Package Name="ABC" />
  </Packages>
</Biml>

It's super that it works, but that's not convenient. Like, at all! Plus, good luck trying to embed any complexity in that string.

So, let's try something a little more complex. Conceptually, imagine we have two Biml Scripts we might choose to call inc_Package_00.biml and inc_Package_10.biml <#@ property name="parameterName" type="string" #>

inc_Package_00.biml

<Package Name="ABC" />

inc_Package_10.biml

<#@ property name="packageName" type="string" #>
<Package Name="packageName" />
Our original code could then look like
<Biml xmlns="http://schemas.varigence.com/biml.xsd">
<#
// Do something here to populate myBimlFile
string myBimlFile =System.IO.File.ReadAllText(@"C:\tmp\inc_Package_00.biml");
#>    
    <Packages>
        <#=CallBimlScriptContent(myBimlFile, "Package_00)"#>
    </Packages>
</Biml>
Do you need to pass parameters? It's no different than what you're used to doing
<Biml xmlns="http://schemas.varigence.com/biml.xsd">
<#
// Do something here to populate myBimlFile
string myBimlFile =System.IO.File.ReadAllText(@"C:\tmp\inc_Package_10.biml");
#>    
    <Packages>
        <#=CallBimlScriptContent(myBimlFile, "Package_10)"#>
    </Packages>
</Biml>

In the next post, I'll show you how to use reference data stored in tables or Azure as your BimlScript Content. Stay tuned!

Thursday, December 20, 2018

My github repository

In preparation for my talk at the Kansas City SQL Server User Group this afternoon, I am putting this post here so people can get the materials easily.

Lightning Talks

Friday, October 26, 2018

SQL Server Agent Job Sort Order

SQL Server Agent Job Sort Order

Today's post could also be titled "I have no idea what is happening here." We have an agent job, "Job - Do Stuff". We then created a few hundred jobs (templates for the win) all named like "Job - Do XYZ" where XYZ is a mainframe module identifier. When I'm scrolling through the list of jobs, it takes a few passes for my eye to find Do Stuff between DASD and DURR. I didn't want to change the leading portion of my job but I wanted my job to be sorted first. I open an ASCII table and find a useful character that sorts before the dash. Ah, asterisk, ASCII 42 comes before dash, ASCII 45.

Well, that was unexpected. In my reproduction here, the job names will take the form of the literal string "JOB " (trailing space there). I then use a single ASCII character as separator. A use another string literal "CHAR(" and then I display the ASCII ordinal value and for completeness, I close the parenthesis. Thus, JOB * CHAR(42) and JOB - CHAR(45). Assuming I sort ascending alphabetically, which under the sheets I would convert each character to its ASCII value, would lead to me JOB * CHAR(42) on top.

That ain't the way it's being sorted in SSMS though. Let's figure out "is this an application issue or a database issue?" Jobs are stored in the database msdb in a table called sysjobs in the dbo schema. Let's start there.

SELECT
    S.name AS JobName
FROM
    msdb.dbo.sysjobs AS S
WHERE
    S.name LIKE 'JOB%'
ORDER BY
    S.name;

Huh.

Ok, so what goes into sorting? Collations


SELECT
    S.name AS SchemaName
,   T.name AS TableName
,   C.name AS ColumnName
,   T2.name AS DataTypeName
,   C.collation_name AS ColumnCollationName
,   T2.collation_name AS TypeCollationName
FROM
    msdb.sys.schemas AS S
    INNER JOIN
        msdb.sys.tables AS T
        ON T.schema_id = S.schema_id
    INNER JOIN
        msdb.sys.columns AS C
        ON C.object_id = T.object_id
    INNER JOIN
        msdb.sys.types AS T2
        ON T2.user_type_id = C.user_type_id
WHERE
    S.name = 'dbo'
    AND T.name = 'sysjobs'
    AND C.name = 'name';

The name column for dbo.sysjobs is of data type sysname which uses the collation of "SQL_Latin1_General_CP1_CI_AS". If it's the collation causing the "weird" sort, then we should be able to reproduce it, right?

SELECT *
FROM
(
    VALUES
        ('JOB - CHAR(45)' COLLATE SQL_Latin1_General_CP1_CI_AS)
    ,   ('JOB * CHAR(42)' COLLATE SQL_Latin1_General_CP1_CI_AS)
) D(jobName)
ORDER BY
    D.jobName COLLATE SQL_Latin1_General_CP1_CI_AS;

Nope, not the collation since this returns in the expected sort order.

At this point, I waste a lot time going down rabbit holes that this isn't, because in my reproduction was not verbatim. I neglected to preface my strings with an N thus leaving them as ascii strings, not unicode strings.

SELECT *
FROM
(
    VALUES
        (N'JOB - CHAR(45)' COLLATE SQL_Latin1_General_CP1_CI_AS)
    ,   (N'JOB * CHAR(42)' COLLATE SQL_Latin1_General_CP1_CI_AS)
) D(jobName)
ORDER BY
    D.jobName COLLATE SQL_Latin1_General_CP1_CI_AS;

Running that, we get the same sort from sysjobs. At this point, I remember something about unicode sorting being different than old school dictionary sort like I was expecting. And after finding this answer on collations I'm happy simply setting my quest aside and stepping away from the keyboard.

Oh, but if you want to see what the glorious sort order is for characters in the printable range (32 to 127), my script is below. Technically, 127 is a cheat since it's the DELETE but I include it because of where it sorts.

Make the jobs

This script has two templates in it - @MischiefManaged deletes a job and @Template creates a job. I query against sys.all_columns to get a sequential set of numbers from 1 to (127 -32). I use that number and string concatenation (requires 2012+) plus the CHAR function to translate the number into the corresponding ASCII character. It will print out "JOB ' CHAR(39)" once complete because I'm lazy.

DECLARE
    @Template nvarchar(max) = N'
use msdb;
IF EXISTS (SELECT * FROM dbo.sysjobs AS S WHERE S.name = ''<JobName/>'')
BEGIN
    EXECUTE dbo.sp_delete_job @job_name = ''<JobName/>'';
END
EXECUTE dbo.sp_add_job
    @job_name = N''<JobName/>''
,   @enabled = 1
,   @notify_level_eventlog = 0
,   @notify_level_email = 2
,   @notify_level_page = 2
,   @delete_level = 0
,   @category_name = N''[Uncategorized (Local)]'';

EXECUTE dbo.sp_add_jobserver
    @job_name = N''<JobName/>''
,   @server_name = @@SERVERNAME;

EXEC dbo.sp_add_jobstep
    @job_name = N''<JobName/>''
,   @step_name = N''MinimumViableJob''
,   @step_id = 1
,   @cmdexec_success_code = 0
,   @on_success_action = 2
,   @on_fail_action = 2
,   @retry_attempts = 0
,   @retry_interval = 0
,   @os_run_priority = 0
,   @subsystem = N''TSQL''
,   @command = N''SELECT 1''
,   @database_name = N''msdb''
,   @flags = 0;

EXEC dbo.sp_update_job
    @job_name = N''<JobName/>''
,   @start_step_id = 1;
'
,   @MischiefManaged nvarchar(4000) = N'
use msdb;
IF EXISTS (SELECT * FROM dbo.sysjobs AS S WHERE S.name = ''<JobName/>'')
BEGIN
    EXECUTE dbo.sp_delete_job @job_name = ''<JobName/>'';
END'
,   @Token sysname = '<JobName/>'
,   @JobName sysname
,   @Query nvarchar(max);

DECLARE
    CSR CURSOR
FAST_FORWARD
FOR
SELECT
    J.jobName
FROM
(
    SELECT TOP (127-31)
        31 + (ROW_NUMBER() OVER (ORDER BY (SELECT NULL))) AS rn
    FROM sys.all_columns AS AC
) D(rn)
    CROSS APPLY
    (
        SELECT
            CONCAT('JOB ', CHAR(D.rn), ' CHAR(', D.rn, ')')
    )J(jobName)

OPEN CSR;
FETCH NEXT FROM CSR INTO @JobName;

WHILE @@FETCH_STATUS = 0
BEGIN
    BEGIN TRY  
        SET @Query = REPLACE(@Template, @Token, @JobName);
        ---- Uncomment the following to clean up our jobs
        --SET @Query = REPLACE(@MischiefManaged, @Token, @JobName);
        EXECUTE sys.sp_executesql @Query, N'';
    END TRY
    BEGIN CATCH
        PRINT @JobName;
    END CATCH
    FETCH NEXT FROM CSR INTO @JobName;
END
CLOSE CSR;
DEALLOCATE CSR;

At this point, you can refresh the Jobs list in SSMS and the result is this job sort.

Once you're satisfied with how things look, uncomment this line SET @Query = REPLACE(@MischiefManaged, @Token, @JobName); and rerun the script. All will be cleaned up.

Let's just chalk sorting up there with timezones, ok? Sounds easy but isn't. If you know more than me, please explain away in the comments section and share your knowledge.

Thursday, October 18, 2018

Polling in SQL Agent

Polling in SQL Agent

A fun question over on StackOverflow asked about using SQL Agent with SSIS to poll for a file's existence. As the comments indicate, there's a non-zero startup time associated with SSIS (it must validate the metadata associated to the sources and destinations), but there is a faster, lighter weight alternative. Putting together a host of TSQL ingredients, including undocumented extended stored procedures, the following recipe could be used as a SQL Agent job step.

If you copy and paste the following query into your favorite instance of SQL Server, it will execute for one minute and it will complete by printing the words "Naughty, naughty".

SET NOCOUNT ON;
-- http://www.patrickkeisler.com/2012/11/how-to-use-xpdirtree-to-list-all-files.html
DECLARE
    -- Don't do stupid things like adding spaces into folder names
    @sourceFolder varchar(260) = 'C:\ssisdata\Input'
    -- Have to use SQL matching rules, not DOS/SSIS
,   @fileMask sysname = 'SourceData%.txt'
    -- how long to wait between polling
,   @SleepInSeconds int = 5
    -- Don't exceed 24 hours aka 86400 seconds
,   @MaxTimerDurationInSeconds int = (3600 * 0) + (60 * 1) + 0
    -- parameter for xp_dirtree 0 => top folder only; 1 => subfolders
,   @depth int = 1
    -- parameter for xp_dirtree 0 => directory only; 1 => directory and files
,   @collectFile int = 1
,   @RC bigint = 0;

-- Create a table variable to capture the results of our directory command
DECLARE
    @DirectoryTree table
(
    id int IDENTITY(1, 1)
,   subdirectory nvarchar(512)
,   depth int
,   isFile bit
);

-- Use our sleep in seconds time to generate a delay time string
DECLARE
    @delayTime char(10) = CONVERT(char(10), TIMEFROMPARTS(@SleepInSeconds/60 /60, @SleepInSeconds/60, @SleepInSeconds%60, 0, 0), 108)
,   @stopDateTime datetime2(0) = DATEADD(SECOND, @MaxTimerDurationInSeconds, CURRENT_TIMESTAMP);

-- Force creation of the folder
EXECUTE dbo.xp_create_subdir @sourceFolder;

-- Load the results of our directory
INSERT INTO
    @DirectoryTree
(
    subdirectory
,   depth
,   isFile
)
EXECUTE dbo.xp_dirtree
    @sourceFolder
,   @depth
,   @collectFile;

-- Prime the pump
SELECT
    @RC = COUNT_BIG(1)
FROM
    @DirectoryTree AS DT
WHERE
    DT.isFile = 1
    AND DT.subdirectory LIKE @fileMask;

WHILE @rc = 0 AND @stopDateTime > CURRENT_TIMESTAMP
BEGIN

    -- Load the results of our directory
    INSERT INTO
        @DirectoryTree
    (
        subdirectory
    ,   depth
    ,   isFile
    )
    EXECUTE dbo.xp_dirtree
        @sourceFolder
    ,   @depth
    ,   @collectFile;

    -- Test for file existence
    SELECT
        @RC = COUNT_BIG(1)
    FROM
        @DirectoryTree AS DT
    WHERE
        DT.isFile = 1
        AND DT.subdirectory LIKE @fileMask;

    IF @RC = 0
    BEGIN
        -- Put our process to sleep for a period of time
        WAITFOR DELAY @delayTime;
    END
END

-- at this point, we have either exited due to file found or time expired
IF @RC > 0
BEGIN
    -- Take action when file was found
    PRINT 'Go run SSIS or something';
END
ELSE
BEGIN
    -- Take action for file not delivered in expected timeframe
    PRINT 'Naughty, naughty';
END

If you rerun the above query, in a separate window, assuming you have xp_cmdshell enabled, firing the following query will create a file with the expected pattern. Instead, it'll print out "Go run SSIS or something"

DECLARE
    @sourceFolder varchar(260) = 'C:\ssisdata\Input'
,   @fileMask sysname = REPLACE('SourceData%.txt', '%', CONVERT(char(10), CURRENT_TIMESTAMP, 120))
DECLARE
    @command varchar(1000) = 'echo > ' + @sourceFolder + '\' + @fileMask;

-- If you get this error
--Msg 15281, Level 16, State 1, Procedure sys.xp_cmdshell, Line 1 [Batch Start Line 0]
--SQL Server blocked access to procedure 'sys.xp_cmdshell' of component 'xp_cmdshell' because this component is turned off as part of the security configuration for this server. A system administrator can enable the use of 'xp_cmdshell' by using sp_configure. For more information about enabling 'xp_cmdshell', search for 'xp_cmdshell' in SQL Server Books Online.
--
-- Run this
--EXECUTE sys.sp_configure'xp_cmdshell', 1;
--GO
--RECONFIGURE;
--GO
EXECUTE sys.xp_cmdshell @command;

Once you're satisfied with how that works, now what? I'd likely set up a step 2 which is the actual running of the SSIS package (instead of printing a message). What about the condition that a file wasn't found? I'd likely use throw/raiserrror or just old fashioned divide by zero to force the first job step to fail. And then specify a reasonable number of @retry_attempts and @retry_interval.