A blog about SQL Server, SSIS, C# and whatever else I happen to be dealing with in my professional life.

Find ramblings

Tuesday, May 20, 2008

Improved Merge Join

There is a limitation to Merge Join that I've run into with a project. Currently, we are bumping a list of ids from a text file against Active Directory to tie a user to their email address. It's a bit ugly but it works.

Now we are getting requests to understand why people aren't showing up in the output. The whole "id in AD" is a manual process and so it's going to have issues, that's a given. The requests we are getting in boil down to lots of research for me and I don't like that kind of work, unless it's something interesting. A broken process does not intrigue me especially when I can't fix it. I'm investigating a variety of avenues but the basic problem boils down to "what rule has caused a candidate to be excluded from the stream?" Stated eve better would be "what candidates have been excluded from the stream by what rule?"

What would help me out greatly, I think, would be an error stream or some facility like on Conditional Split for the Merge Join task. We've got 2 streams coming in but we could have 3 outputs: what matched, unmatched stream 1 data, unmatched stream 2 data.

To Do
Determine how much effort it would be to extend (ha!) the base Merge Join task to provide additional output for tracking/logging purposes. Additional solutions may involve combing import log to derive information, reworking to use a lookup component or something I haven't yet thought of.

No comments: