A blog about SQL Server, SSIS, C# and whatever else I happen to be dealing with in my professional life.

Find ramblings

Wednesday, June 3, 2009

CLR Aggregate

In September, I'll be giving a presentation on the CLR in SQL Server 2005+ to our local .NET user group. Of the five elements you can create as CLR database objects, the one that's been killing me is the custom aggregate function. I have live code of UDFs and stored procs. Triggers, I'm fine cribbing from my book's example. User Defined Types, meh not exciting but I could come up with something that isn't just a 2000 era alias. But aggregates, what cases have people come up with to write their own? My book, The Rational Guide to SQL Server CLR Integration, has a Count that includes a tally for nulls. It works but really, the same thing could be accomplished with a coalesce call to the data and the native count aggregate. I'm sure there are some advanced statistical functions that could be written but unfortunately statistics was never my strong suit (I do love saying Poisson distribution, even if I don't remember what it was).

As I was drifting off to sleep this evening, inspiration struck. I've seen plenty of questions in TSQL forums asking how they can turn a rows of data into a delimited string. While it's usually a precursor to them doing bad things like storing multiple values in a single column, it seemed to be a common enough scenario that a custom aggregate could be of use.

It's late so I'm assuming you are familiar enough with programming to create a database project, compile, sign and deploy it. Also, I need to find a graceful way of embedding code. Pre tags work, but it strips out the C# doc tags. Just found http://www.manoli.net/csharpformat/format.aspx to do that. And it works great on their site but gets roosterblocked on blogger.




// <copyright file="CsvAggregate.cs" company="billfellows.net">
// Copyright (c) This makes SourceAnalysis happy. All rights reserved.
// </copyright>
// <author>Bill Fellows</author>
// <email>bill.fellows ZHAT gmail.com</email>
// <date>2009-06-02</date>
// <summary>Custom CLR aggregate that builds a multi-valued column.</summary>

using System;
using System.Data;
using System.Data.SqlClient;
using System.Data.SqlTypes;
using Microsoft.SqlServer.Server;

/// <summary>
/// A CLR aggregate that will build a comma separated value string
/// for the given inputs. XML Path might be a more elegant solution
/// </summary>
[Serializable]
[Microsoft.SqlServer.Server.SqlUserDefinedAggregate(Format.UserDefined, MaxByteSize = 8000)]
public struct CsvAggregate : IBinarySerialize
{
/// <summary>
/// "CLRPresentation.CsvAggregate" is marked for native serialization,
/// but field "input" of type "CLRPresentation.CsvAggregate" is
/// not valid for native serialization.
/// </summary>
private System.Text.StringBuilder input;

/// <summary>
/// Called for each grouping set
/// </summary>
public void Init()
{
this.input = new System.Text.StringBuilder();
}

/// <summary>
/// Each row that comes into the aggregate
/// </summary>
/// <param name="value">the value to be aggregated</param>
public void Accumulate(SqlString value)
{
if (!value.IsNull)
{
this.input.Append(string.Format("{0},", value));
}
}

/// <summary>
/// Join the results of parallelized stuff back together
/// </summary>
/// <param name="group">Fold that group into ours</param>
public void Merge(CsvAggregate group)
{
// What, if any, is the implication of a Merge being called?
// Would there be an expectation of an ordering in this aggregate?
this.input.Append(group.input);
}

/// <summary>
/// Aggregation complete, send the results back
/// </summary>
/// <returns>A CSV of stuff piped into aggregate</returns>
public SqlString Terminate()
{
return new SqlString(this.input.ToString().TrimEnd(','));
}

/// <summary>
/// Part and parcel of implementing IBinarySerialize.
/// Handles writing the internal structure to disk
/// </summary>
/// <param name="reader">A reader object to deserialize our udt</param>
public void Read(System.IO.BinaryReader reader)
{
this.input = new System.Text.StringBuilder(reader.ReadString());
}

/// <summary>
/// Handles writing the internal structure of our stringbuilder
/// to disk. We can cheat because a string will work the same
/// for serialization purposes.
/// </summary>
/// <param name="writer">A writer to serialize our udt</param>
public void Write(System.IO.BinaryWriter writer)
{
writer.Write(this.input.ToString());
}
}

Enjoy

No comments: