A blog about SQL Server, SSIS, C# and whatever else I happen to be dealing with in my professional life.

Find ramblings

Tuesday, September 5, 2017

Biml Query Table Builder

Biml Query Table Builder

We previously noted the awesomeness that is SQL Server 2012+'s ability to expose a query's metadata. Let's look how we can couple that information with creating Biml Table objects.

Prep work

Add a static Biml file to your project that defines and OLE DB Connection and then a database and schema. e.g.

<Biml xmlns="http://schemas.varigence.com/biml.xsd">
    <Connections>
        <OleDbConnection Name="msdb" ConnectionString="Provider=SQLNCLI11;Data Source=localhost\dev2016;Integrated Security=SSPI;Initial Catalog=msdb" />
    </Connections>
    <Databases>
        <Database Name="msdb" ConnectionName="msdb" />
    </Databases>
    <Schemas>
        <Schema Name="dbo" DatabaseName="msdb" />
    </Schemas>
</Biml>

Now that we have a database connection named msdb and a valid database and schema, save the file and let's get into the good stuff.

Given the reference query in previous post, Drive level latency information, we would need to declare the following Biml within our Tables collection.

        <Table Name="Query 28" SchemaName="msdb.dbo">
            <Columns>
                <Column Name="Drive" DataType="String" Length="4" Precision="0" Scale="0" IsNullable="true" />
                <Column Name="Volume Mount Point" DataType="String" Length="512" Precision="0" Scale="0" IsNullable="true" />
                <Column Name="Read Latency" DataType="Int64" Length="8" Precision="19" Scale="0" IsNullable="true" />
                <Column Name="Write Latency" DataType="Int64" Length="8" Precision="19" Scale="0" IsNullable="true" />
                <Column Name="Overall Latency" DataType="Int64" Length="8" Precision="19" Scale="0" IsNullable="true" />
                <Column Name="Avg Bytes/Read" DataType="Int64" Length="8" Precision="19" Scale="0" IsNullable="true" />
                <Column Name="Avg Bytes/Write" DataType="Int64" Length="8" Precision="19" Scale="0" IsNullable="true" />
                <Column Name="Avg Bytes/Transfer" DataType="Int64" Length="8" Precision="19" Scale="0" IsNullable="true" />
            </Columns>
        </Table>

That could be accomplished purely within the declarative nature of Biml wherein we do lots of text nuggets <#= "foo" #> but that's going to be ugly to maintain as there's a lot of conditional logic to muck with. Instead, I'm going to create a C# method that returns the the Biml table object (AstTableNode). To do that, we will need to create a Biml Class nugget <#+ #>. I ended up creating two methods: GetAstTableNodeFromQuery and then a helper method to translate the SQL Server data types into something Biml understood.

<#+
    /// <summary>
    /// Build out a Biml table based on the supplied query and connection. 
    /// This assumes a valid SQL Server 2012+ OLEDB Connection is provided but the approach
    /// can be adapted based on providers and information schemas.
    /// We further assume that column names in the query are unique.
    /// </summary>
    /// <param name="connection">An OleDbConnection</param>
    /// <param name="query">A SQL query</param>
    /// <param name="schemaName">The schema our table should be created in</param>
    /// <param name="queryName">A name for our query</param>
    /// <returns>Best approximation of a SQL Server data type</returns>
    public AstTableNode GetAstTableNodeFromQuery(AstOleDbConnectionNode connection, string query, string schemaName, string queryName)
    {
        string template = @"SELECT
    DEDFRS.name
,   DEDFRS.is_nullable
,   DEDFRS.system_type_name
,   DEDFRS.max_length
,   DEDFRS.precision
,   DEDFRS.scale
FROM
    sys.dm_exec_describe_first_result_set(N'{0}', NULL, NULL) AS DEDFRS ORDER BY DEDFRS.column_ordinal;";
        AstTableNode atn = null;

        atn = new AstTableNode(null);
        atn.Name = queryName;
        atn.Schema = this.RootNode.Schemas[schemaName];
        string queryActual = string.Format(template, query.Replace("'", "''"));
        
        string colName = string.Empty;
        string typeText = string.Empty;
        System.Data.DbType dbt = DbType.UInt16;
        int length = 0;
        int precision = 0;
        int scale = 0;

        try
        {
            System.Data.DataTable dt = null;
            dt = ExternalDataAccess.GetDataTable(connection, queryActual);
            foreach (System.Data.DataRow row in dt.Rows)
            {
                try
                {
                    AstTableColumnBaseNode col = new AstTableColumnNode(atn);
                    // This can be empty -- see DBCC TRACESTATUS (-1)
                    if(row[0] == DBNull.Value)
                    {
                        atn.Annotations.Add(new AstAnnotationNode(atn){Tag = "Invalid", Text = "No Metadata generated"});
                        break;
                    }
                    else
                    {
                        colName = row[0].ToString();
                    }
                    
                    typeText = row[2].ToString();
                    dbt = TranslateSqlServerTypes(row[2].ToString());
                    length = int.Parse(row[3].ToString());
                    precision = int.Parse(row[4].ToString());
                    scale = int.Parse(row[5].ToString());
                    
                    col.Name = colName;
                    col.IsNullable = (bool)row[1];
                    col.DataType = dbt;
                    col.Length = length;
                    col.Precision = precision;
                    col.Scale = scale;
                    
                    atn.Columns.Add(col);
                }
                catch (Exception ex)
                {
                    // Something went awry with making a column for our table
                    AstTableColumnBaseNode col = new AstTableColumnNode(atn);
                    col.Name = "FailureColumn";
                    col.Annotations.Add(new AstAnnotationNode(col){Tag = "colName", Text = colName});
                    col.Annotations.Add(new AstAnnotationNode(col){Tag = "typeText", Text = typeText});
                    col.Annotations.Add(new AstAnnotationNode(col){Tag = "dbtype", Text = dbt.ToString()});
                    col.Annotations.Add(new AstAnnotationNode(col){Tag = "Error", Text = ex.Message});
                    col.Annotations.Add(new AstAnnotationNode(col){Tag = "Stack", Text = ex.StackTrace});
                    atn.Columns.Add(col);
                }
            }
        }
        catch (Exception ex)
        {
            // Table level failures
            AstTableColumnBaseNode col = new AstTableColumnNode(atn);
            col.Name = "Failure";
            col.Annotations.Add(new AstAnnotationNode(col){Tag = "Error", Text = ex.ToString()});
            col.Annotations.Add(new AstAnnotationNode(col){Tag = "SourceQuery", Text = query});
            col.Annotations.Add(new AstAnnotationNode(col){Tag = "QueryActual", Text = queryActual});
            atn.Columns.Add(col);
        }
        return atn;
    }
    
    /// <summary>
    /// A rudimentary method to convert SQL Server data types to Biml types. Doesn't cover
    /// UDDT, sql_variant(well)
    /// </summary>
    /// <param name="typeName">Data type with optional length/scale/precision</param>
    /// <returns>Best approximation of a SQL Server data type</returns>
    public DbType TranslateSqlServerTypes(string typeName)
    {
        // typeName might contain length - strip it
        string fixedName = typeName;
        if(typeName.Contains("("))
        {
            fixedName = typeName.Substring(0, typeName.IndexOf("("));
        }
        // Approximate translation of https://msdn.microsoft.com/en-us/library/System.Data.DbType.aspx
        // https://docs.microsoft.com/en-us/dotnet/framework/data/adonet/sql-server-data-type-mappings
        Dictionary<string, DbType> translate = new Dictionary<string, DbType> {
            
            {"bigint", DbType.Int64 }
        ,   {"binary", DbType.Binary }
        ,   {"bit", DbType.Boolean }
        ,   {"char", DbType.AnsiStringFixedLength }
        ,   {"date", DbType.Date }
        ,   {"datetime", DbType.DateTime }
        ,   {"datetime2", DbType.DateTime2 }
        ,   {"datetimeoffset", DbType.DateTimeOffset }
        ,   {"decimal", DbType.Decimal }
        ,   {"float", DbType.Double }
        //,   {"geography", 
        //,   {"geometry", 
        //,   {"hierarchyid", 
        ,   {"image", DbType.Binary }
        ,   {"int", DbType.Int32 }
        ,   {"money", DbType.Decimal }
        ,   {"nchar", DbType.StringFixedLength }
        ,   {"ntext", DbType.String }
        ,   {"numeric", DbType.Decimal }
        ,   {"nvarchar", DbType.String }
        ,   {"real", DbType.Single }
        ,   {"smalldatetime", DbType.DateTime }
        ,   {"smallint", DbType.Int16 }
        ,   {"smallmoney", DbType.Decimal }
        ,   {"sql_variant", DbType.Object }
        ,   {"sysname", DbType.String }
        ,   {"text", DbType.String }
        ,   {"time", DbType.Time }
        ,   {"timestamp", DbType.Binary }
        ,   {"tinyint", DbType.Byte }
        ,   {"uniqueidentifier", DbType.Guid }
        ,   {"varbinary", DbType.Binary }
        ,   {"varchar", DbType.AnsiString }
        ,   {"xml", DbType.Xml }
        };
        
        try
        {
            return translate[fixedName];
        }
        catch
        {
            return System.Data.DbType.UInt64;
        }
    }
<#+

Good grief, that's a lot of code, how do I use it? The basic usage would be something like

    <Tables>
        <#=  GetAstTableNodeFromQuery(this.RootNode.OleDbConnections["msdb"], "SELECT 100 AS demo", "dbo", "DemoQuery").GetBiml() #>
    </Tables>

The call to GetAstTableNodeFromQuery return an AstTableNode which is great, but what we really want is the Biml behind it so we chain a call to .GetBiml() onto the end.

What would make that better though is to make it a little more dynamic. Let's improve the code to create tables based on a pairs of names and queries. I'm going to use a Dictionary called namedQueries to hold the names and queries and then enumerate through them, calling our GetAstTableNodeFromQuery for each entry.

<#
    Dictionary<string, string> namedQueries = new Dictionary<string,string>{{"Query 28", @"-- Drive level latency information (Query 28) (Drive Level Latency)
-- Based on code from Jimmy May
SELECT tab.[Drive], tab.volume_mount_point AS [Volume Mount Point], 
    CASE 
        WHEN num_of_reads = 0 THEN 0 
        ELSE (io_stall_read_ms/num_of_reads) 
    END AS [Read Latency],
    CASE 
        WHEN num_of_writes = 0 THEN 0 
        ELSE (io_stall_write_ms/num_of_writes) 
    END AS [Write Latency],
    CASE 
        WHEN (num_of_reads = 0 AND num_of_writes = 0) THEN 0 
        ELSE (io_stall/(num_of_reads + num_of_writes)) 
    END AS [Overall Latency],
    CASE 
        WHEN num_of_reads = 0 THEN 0 
        ELSE (num_of_bytes_read/num_of_reads) 
    END AS [Avg Bytes/Read],
    CASE 
        WHEN num_of_writes = 0 THEN 0 
        ELSE (num_of_bytes_written/num_of_writes) 
    END AS [Avg Bytes/Write],
    CASE 
        WHEN (num_of_reads = 0 AND num_of_writes = 0) THEN 0 
        ELSE ((num_of_bytes_read + num_of_bytes_written)/(num_of_reads + num_of_writes)) 
    END AS [Avg Bytes/Transfer]
FROM (SELECT LEFT(UPPER(mf.physical_name), 2) AS Drive, SUM(num_of_reads) AS num_of_reads,
             SUM(io_stall_read_ms) AS io_stall_read_ms, SUM(num_of_writes) AS num_of_writes,
             SUM(io_stall_write_ms) AS io_stall_write_ms, SUM(num_of_bytes_read) AS num_of_bytes_read,
             SUM(num_of_bytes_written) AS num_of_bytes_written, SUM(io_stall) AS io_stall, vs.volume_mount_point 
      FROM sys.dm_io_virtual_file_stats(NULL, NULL) AS vfs
      INNER JOIN sys.master_files AS mf WITH (NOLOCK)
      ON vfs.database_id = mf.database_id AND vfs.file_id = mf.file_id
      CROSS APPLY sys.dm_os_volume_stats(mf.database_id, mf.[file_id]) AS vs 
      GROUP BY LEFT(UPPER(mf.physical_name), 2), vs.volume_mount_point) AS tab
ORDER BY [Overall Latency] OPTION (RECOMPILE);"}};
#>
    <Tables>
        <# foreach(var kvp in namedQueries){ #>
        <#=  GetAstTableNodeFromQuery(this.RootNode.OleDbConnections["msdb"], kvp.Value, "dbo", kvp.Key).GetBiml() #>
        <# } #>
    </Tables>

How can we improve this? Let's get rid of the hard coded query names and actual queries. Tune in to the next installment to see how we'll make that work.

Full code is over on github

No comments: