There is a limitation to Merge Join that I've run into with a project. Currently, we are bumping a list of ids from a text file against Active Directory to tie a user to their email address. It's a bit ugly but it works.
Now we are getting requests to understand why people aren't showing up in the output. The whole "id in AD" thing is a manual process, so it's going to have issues; that's a given. The requests boil down to lots of research for me, and I don't like that kind of work unless it's something interesting. A broken process does not intrigue me, especially when I can't fix it. I'm investigating a variety of avenues, but the basic problem boils down to "what rule has caused a candidate to be excluded from the stream?" Stated even better: "what candidates have been excluded from the stream by what rule?"
What would help me out greatly, I think, would be an error stream or some facility like on Conditional Split for the Merge Join task. We've got 2 streams coming in but we could have 3 outputs: what matched, unmatched stream 1 data, unmatched stream 2 data.
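To make the three-output idea concrete, here is a minimal sketch in Python rather than SSIS. It is an illustration of the behavior I'm wishing for, not how the Merge Join component is implemented. The function name, the `id` and `mail` fields, and the sample rows are all made up for the example.

```python
def merge_join_three_way(left, right, left_key, right_key):
    """Sort-merge join returning (matched, left_only, right_only)."""
    # Merge Join expects sorted inputs; sort here so the sketch is self-contained.
    left = sorted(left, key=left_key)
    right = sorted(right, key=right_key)
    matched, left_only, right_only = [], [], []
    i = j = 0
    while i < len(left) and j < len(right):
        lk, rk = left_key(left[i]), right_key(right[j])
        if lk < rk:
            left_only.append(left[i])    # no partner in the right stream
            i += 1
        elif lk > rk:
            right_only.append(right[j])  # no partner in the left stream
            j += 1
        else:
            # Collect the run of equal keys on each side, then pair them all up.
            i_end = i
            while i_end < len(left) and left_key(left[i_end]) == lk:
                i_end += 1
            j_end = j
            while j_end < len(right) and right_key(right[j_end]) == lk:
                j_end += 1
            for l_row in left[i:i_end]:
                for r_row in right[j:j_end]:
                    matched.append((l_row, r_row))
            i, j = i_end, j_end
    # Whatever is left on either side had nothing to match against.
    left_only.extend(left[i:])
    right_only.extend(right[j:])
    return matched, left_only, right_only


# Hypothetical data: ids from the text file vs. rows pulled from AD.
file_rows = [{"id": "u3"}, {"id": "u1"}, {"id": "u2"}]
ad_rows = [{"id": "u2", "mail": "u2@example.com"},
           {"id": "u4", "mail": "u4@example.com"}]

matched, file_only, ad_only = merge_join_three_way(
    file_rows, ad_rows,
    left_key=lambda r: r["id"],
    right_key=lambda r: r["id"],
)
```

With this shape, `file_only` is exactly the list of people I keep getting asked about. Inside SSIS, the closest approximation I know of is to set the Merge Join to a full outer join and hang a Conditional Split off it that tests each side's key for NULL.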
To Do
Determine how much effort it would be to extend (ha!) the base Merge Join task to provide additional outputs for tracking/logging purposes. Other solutions may involve combing the import log to derive information, reworking the package to use a Lookup component, or something I haven't yet thought of.
A blog about SQL Server, SSIS, C# and whatever else I happen to be dealing with in my professional life.
Tuesday, May 20, 2008