A blog about SQL Server, SSIS, C# and whatever else I happen to be dealing with in my professional life.

Find ramblings

Monday, June 7, 2010

T-SQL Tuesday #007: Summertime in the SQL TVPs

I have finally made time to participate in a T-SQL Tuesday and thanks to Jorge for hosting this month's edition. I have installed R2 but I have not yet used any new hotness. I have, however, been experiencing SQL 2008 love the past few weeks and this post will cover User Defined Table Types and in particular, how and why to use them as Table-Valued Parameters.

History

Back when I was loading data into the database, we used singletons and we liked it! Ok, maybe we didn't like it but we'd do it. Quite often you'd queue up some insert statements or maybe you'd get all fancy and format a file and use bcp to stream some data into the database. Maybe you were good and used a staging table, maybe you did it against the production table. Maybe you were an idiot and dropped the #Consumer table except you forgot the # and whoosh, away went the most important table (that had no foreign keys so SQL Server happily dropped it). Perhaps you know there is a limit on the number of parameters a stored procedure can accept because you wished it could have 2101. If you were lucky enough to be in a 2005 world, oh the abuses you could perform with XML parameters! No longer were you encumbered by anything, you could mark your data up with XML and stream all the values into the database at once. Oh the fun we had parsing that out and tracking down errors. All of these methods were so maintainable, so elegant, type safe and offered such high performance.

Ok, so I lied, a lot. But we performed lots of perversions and MacGyverisms in the name of getting data into the database and/or processing it as a set---we are doing set-based processing, right? These methods and plenty of other ones, they'd work, mostly but they were usually littered with comments like HACK, DO NOT TOUCH, etc.

Shiny

SQL Server 2008 introduced user-defined table types which allowed developers to pass in a table as a parameter to a method. How cool is that? It is pretty cool actually, no more munging with a # or ## table. It's pretty straight forward, using the same CREATE syntax you would use with a table, you will create a table type. With your fancy new table in hand, you can create a stored procedure that takes a your type as a parameter.

Introductory example

IF EXISTS
(
    SELECT
        *
    FROM
        sys.types T
    WHERE
        T.name = 'CONTRIVED_EXAMPLE'
)
BEGIN
    PRINT 'Dropping type dbo.CONTRIVED_EXAMPLE'
    DROP TYPE dbo.CONTRIVED_EXAMPLE
END
PRINT 'Creating type dbo.CONTRIVED_EXAMPLE'
GO
-----------------------------------------------------------------------------
-- Type: dbo.CONTRIVED_EXAMPLE
-- Author: Bill Fellows
-- Date: 2010-06-07
--
-- This is a user-defined table type demonstrating the syntax
-- for a table valued parameter
--
-- See also:
--
-- Modified:
--
-----------------------------------------------------------------------------
CREATE TYPE
    dbo.CONTRIVED_EXAMPLE AS TABLE
(
     twitter_handle varchar(15) NOT NULL
,    have_met bit NOT NULL
)
GO
That doesn't look much different than a create table statement, it has a schema, a name and the column definitions.

IF EXISTS
(
    SELECT
        *
    FROM
        sys.tables T
    WHERE
        T.name = 'CONTRIVED_EXAMPLE_TABLE'
)
BEGIN
    PRINT 'Dropping type dbo.CONTRIVED_EXAMPLE_TABLE'
    DROP TABLE dbo.CONTRIVED_EXAMPLE_TABLE
END
PRINT 'Creating table dbo.CONTRIVED_EXAMPLE_TABLE'
GO
-----------------------------------------------------------------------------
-- Type: dbo.CONTRIVED_EXAMPLE_TABLE
-- Author: Bill Fellows
-- Date: 2010-06-07
--
-- This is a boring table demonstrating the syntax
-- for a table valued parameter
--
-- See also:
--
-- Modified:
--
-----------------------------------------------------------------------------
CREATE TABLE
    dbo.CONTRIVED_EXAMPLE_TABLE 
(
    row_id int identity(1,10) NOT NULL PRIMARY KEY
,    twitter_handle varchar(15) NOT NULL
,    have_met bit NOT NULL
)
GO
What is different? There's the obvious, one is stored in sys.tables, the other is in sys.types. The row in sys.types is marginally interesting as it shows a value of 1 (true) for is_table_type as well as is_user_defined

At this point, we've done nothing with the user-defined table type, so let's actually use it. Here is a stored procedure with a single parameter.


IF EXISTS
(
    SELECT
        SO.*
    FROM
        dbo.sysobjects SO
    WHERE
        SO.id = OBJECT_ID('dbo.EchoBack')
        AND OBJECTPROPERTY(SO.id, 'IsProcedure') = 1
)
BEGIN
    PRINT 'Dropping stored procedure dbo.EchoBack'
    DROP PROCEDURE dbo.EchoBack
END
GO
PRINT 'Creating stored procedure dbo.EchoBack'
GO
-----------------------------------------------------------------------------
-- Function: dbo.EchoBack
-- Author: Bill Fellows
-- Date: 2010-06-07
--
-- This procedure handles merging
--
-- Recordsets:
-- The contents of our table valued parameter will be echoed back to the 
-- caller.
--
-- Side-effects:
-- None
--
-- See also:
-- your mom
--
-- Modified:
--
-----------------------------------------------------------------------------
CREATE PROCEDURE dbo.EchoBack
(
    @tvp dbo.CONTRIVED_EXAMPLE READONLY
)
AS
BEGIN
    SET NOCOUNT ON
    
    SELECT
        T.*
    FROM
        @tvp T

END
GO
PRINT 'Granting rights for dbo.EchoBack'
GRANT EXECUTE ON dbo.EchoBack TO public
GO
Nothing fancy at all, except for that optional READONLY clause, you could confuse the parameter with any old UDT. Let's set about using our new EchoBack procedure
DECLARE @p1 dbo.CONTRIVED_EXAMPLE 
INSERT INTO @p1 values('billinkc', 1)
INSERT INTO @p1 values('peschkaj', 1)
INSERT INTO @p1 values('mikeSQL', 1)
INSERT INTO @p1 values('SQLChicken', 0)

SELECT * FROM @p1

EXECUTE dbo.EchoBack @p1

Wasn't that thrilling? I have heart palpitations here and am just dizzy with the exertion.

Pros, Cons

Pros before !pros

Why would you want to use table valued parameters? The number one reason in my mind is set based operations. Rather than doing some hokum on parameters 10, 12 and 2100, just treat your inbound data in a set-based fashion. It's going to be less painful for you today and 8 months down the road when you have to crack this proc back open and remember what you were doing.

I'm a fan of that whole "ACID thing" and passing in all the data that comprises a a set sure seems like a good way to get atomicity, especially compared to working from a singleton methodology and some signal to indicate done.

Who likes weakly typed data? Who misses VB4? No one, we are in a database world, we like digital answers. Yes, No, level, not-level. These are the answers we crave. Table Valued parameters are strongly typed because they are a table.

Never underestimate the value of simplicity. The syntax for making a UDTT is as familiar as a table DDL statement. Using a TVP, is just like using a normal table with the exception that it's read-only. See cons below. Where the simplicity really shines is not in the TSQL realm---it's in the .NET framework. A dataset can now be passed to a stored procedure as a table valued parameter. Simply mark up the parameter of your SqlCommand object with ".SqlDbType = SqlDbType.Structured" and Bam! the heavy lifting is done!

Cons

"SQL Server does not maintain statistics on columns of table-valued parameters" I'm no internals genius, but I think I've heard enough to know that no statistics or out of date statistics can mean bad optimizer choices. The MSDN TVP article referenced below has some metrics on when they are good vs bulk insert operations. For the small datasets I've been working with, it's been good to me but as always test it both ways.

Once a user-defined table type is used as a TVP, it's schema-bound and you can't change the definition of it. It makes sense as you can't have compiled objects expecting one signature and getting another. In fact, that'd suck a great deal. The trouble I ran into with regard to UDTT is our deploy methodology. Every database object we create, we wrap with drop/create/permission assignment statements. It might be overkill to do this but it's one way to make sure you have a repeatable process. With TVPs, we can't drop the type until all the procs with their fingers on the declared type have been dropped. And you can see from above, we typically do the drop and create all within a single file, one file per object. The quick fix we're using is to combine the proc and type creation into a single file with the order of operations as drop proc, drop type, create type, create proc. This will fail if multiple procs used the same UDTT and at that point, we'd have to rethink our deployment strategy.

A local user-defined table type is a normal table but once passed into a proc, it becomes read only. You can omit the READONLY parameter attribute but SQL Server knows it should be RO and won't allow you to update the parameter within the confines of the proc.

Invocation

TSQL example

First of all, let's spin up a fancier proc than the echo proc above.
IF EXISTS
(
    SELECT
        SO.*
    FROM
        dbo.sysobjects SO
    WHERE
        SO.id = OBJECT_ID('dbo.TwitterAdd')
        AND OBJECTPROPERTY(SO.id, 'IsProcedure') = 1
)
BEGIN
    PRINT 'Dropping stored procedure dbo.TwitterAdd'
    DROP PROCEDURE dbo.TwitterAdd
END
GO
PRINT 'Creating stored procedure dbo.TwitterAdd'
GO
-----------------------------------------------------------------------------
-- Function: dbo.TwitterAdd
-- Author: Bill Fellows
-- Date: 2010-06-07
--
-- This procedure handles merging
--
-- Recordsets:
-- Business rule violations (optional)
--
-- Side-effects:
-- 0-N rows written to twitter table
--
-- See also:
-- CONTRIVED_EXAMPLE (your mom)
--
-- Modified:
--
-----------------------------------------------------------------------------
CREATE PROCEDURE dbo.TwitterAdd
(
    @tvp dbo.CONTRIVED_EXAMPLE READONLY
)
AS
BEGIN
    SET NOCOUNT ON
    
    MERGE
        dbo.CONTRIVED_EXAMPLE_TABLE DESTINATION
    USING
        (SELECT twitter_handle, have_met FROM @tvp T) AS SOURCE (twitter_handle, have_met)
        ON (SOURCE.twitter_handle = DESTINATION.twitter_handle)
        -- Never forget I met someone
    WHEN MATCHED AND DESTINATION.have_met = 0
        THEN UPDATE SET DESTINATION.have_met = SOURCE.have_met
    WHEN NOT MATCHED
        THEN INSERT(twitter_handle, have_met) VALUES (SOURCE.twitter_handle, SOURCE.have_met)
        -- merge statements must be terminated by a semi-colon
        -- and don't you forget it!
        ;

END
GO
PRINT 'Granting rights for dbo.TwitterAdd'
GRANT EXECUTE ON dbo.TwitterAdd TO public
GO
This is a two-for-one SQL 2008 shiny-new-hotness post: it has TVP and the Merge statement but I'm not explaining merge today. Invocation of the proc would still function the same
declare @p1 dbo.CONTRIVED_EXAMPLE
insert into @p1 values(N'billinkc',1)
insert into @p1 values(N'MidnightDBA',0)

exec TwitterAdd @tvp=@p1

C#

/// <summary>
/// This method demonstrates the ease of passing a dataset in
/// as a TVP
/// </summary>
public static void TSQLTuesday007()
{
    string connectionString = @"Data Source=localhost;Initial Catalog=master;Integrated Security=True";
    System.Data.SqlClient.SqlConnection conn = new System.Data.SqlClient.SqlConnection(connectionString);
    System.Data.DataTable dataTable = null;

    // note that my data table does not have to be the same
    // as my UDTT, nor do my columns have to have the same name
    // Types-yes. Ordinal, probably so
    dataTable = new System.Data.DataTable("Sample");
    dataTable.Columns.Add("tweep", System.Type.GetType("System.String"));
    dataTable.Columns.Add("have_met", System.Type.GetType("System.Boolean"));

    // add rows to my data table but really, this could be any source
    dataTable.Rows.Add(new object[] { "billinkc", true });
    dataTable.Rows.Add(new object[] { "BrentO", true });
    dataTable.Rows.Add(new object[] { "buckwoody", false });

    // Hooray for #sqlsat35 and meeting Jen & Sean
    dataTable.Rows.Add(new object[] { "MidnightDBA", true });

    System.Data.SqlClient.SqlConnection connection = null;
    System.Data.DataSet results = null;
    System.Data.SqlClient.SqlCommand command = null;
    System.Data.SqlClient.SqlDataReader dataReader = null;
    connection = new System.Data.SqlClient.SqlConnection(connectionString);
    try
    {
        connection.Open();
        command = new System.Data.SqlClient.SqlCommand("TwitterAdd");
        command.CommandType = System.Data.CommandType.StoredProcedure;
        command.Connection = connection;

        // Assigning a table valued parameter looks much like any other parameter
        System.Data.SqlClient.SqlParameter tvp = command.Parameters.AddWithValue("@tvp", dataTable);

        // this is the only special sauce (not required but helpful)
        tvp.SqlDbType = System.Data.SqlDbType.Structured;
        tvp.TypeName = "dbo.CONTRIVED_EXAMPLE";

        dataReader = command.ExecuteReader();
        if (dataReader.HasRows)
        {
            results = new System.Data.DataSet();
            results.Tables.Add();
            results.Tables[0].Load(dataReader);
            PPrint(results);
        }
    }
    catch (Exception ex)
    {
        Console.WriteLine(ex);
    }
}
I actually cheated on the TSQL example, that's simply the capture from SQL Profiler

Conclusion

Table Valued Parameters, like any other feature in the database can save you time and trouble if used wisely.

References

p.s.

The careful observer might notice that my screenshot of EchoBack does not match the TVP definition. I used an identity column for the first time and the TSQL went well but when I went to wire up the C#, I kept getting "INSERT into an identity column not allowed on table variables" I ran across Bob Beauchemin's post on the subject but it seemed like a kludge to my tired eyes.

No comments: