A blog about SQL Server, SSIS, C# and whatever else I happen to be dealing with in my professional life.

Find ramblings

Friday, April 30, 2010

Submitting conference abstracts

After listening to the webcast for Submitting a winning abstract at PASS, I was inspired to try and actually get myself onto the speaking circuit. While I've picked up plenty of technical knowledge these past 10 years, I know my presentation skills have gone to shite. Preparing to deliver this should be a step in the right direction. I submitted an abstract for CLR in SQL Server: A love story in 5 parts to the Kansas City Developers Conference. We'll see whether I'm picked. I had been asked to do this presentation last September for the Kansas City Dot Net User Group but was bumped so I've had it at 90% complete for some time now. With some hard work and the confidence of having spoken locally and/or regionally, perhaps in a few years I'll be ready to try speaking at SQL PASS. Until then, "small moves, Sparks. Small moves"

Tuesday, April 27, 2010

Proofreader needed!

This post is rated U, for Ultra-pedantic.  Something I noticed today is that you can tell at a glance whether you are running the SQL Server Configuration Management mmc snap-in for 2005 vs 2008.  In the two screen shots, 2005 has a title bar of "SQL Server Configuration Manager" while 2008 proudly proclaims itself "Sql Server Configuration Manager"

Run them for yourself if you wish to verify my Earth-shattering discovery
C:\WINDOWS\system32\mmc.exe /32 "C:\WINDOWS\system32\SQLServerManager.msc"
C:\WINDOWS\system32\mmc.exe /32 C:\WINDOWS\system32\SQLServerManager10.msc



Monday, April 26, 2010

SSIS Derived Column Transformation Editor, a turn for the worse in 2008

This year my company is finally moving forward with SQL Server 2008 and my current project is the first to use the 2008 SSIS engine.  In general, I haven't had that much of a geekgasm over it.  I made my peace with VB in the scripting components a long time ago so C# wasn't that big of an upsell.  This post however, is about a deficit in the SSIS Derived Column Transformation Editor in the 2008 world.  I might have also stumbled onto a bug which is exciting.  

Given the following expression, 
0 == 0 ? (DT_STR,3,1252)"pre" : (DT_STR,3,1252)"pst"
one would expect that to have a data type of non-unicode string (DT_STR) due to the explicit cast.  However, that is not the case, plug that expression into either 2005 or 2008 and it'll set the data type as Unicode string (DT_WSTR).  In 2005, you still had the option of "fixing" data types but that feature has been removed in 2008.  See attached screen shots.
The solution with much thanks to @VidasM is to cast the results of the ternary operator to string. I had tried that but operator precedence threw me off.
(DT_STR,3,1252)(0 == 0 ? "pre" : "pst"))

Sunday, April 25, 2010

Server 2008 R2, flying blind

I'm only flying blind due to a precedence set back when I was a wee laddie. You see, video and computer games, while they came with a manual, those were only to be read between the time you purchased the item and the eternity of a drive it took for a parent to get you home. In March, I pulled the trigger and bought an MSDN license. Since then, I haven't made the time to actually read about all the things I have in my possession, I've just been an installing fool. For $5000 bucks, I am trying my damnedest to get my money's worth. I'm typing this on a refurb'ed HP Pavilion I snagged from Microcenter. It seemed a shame to throw away a perfectly good Win 7 install but I needed to taste Windows Server 2008 R2. It's been interesting so far. If I've been gluttonous in installing software, this machine is paying the penalty for it. To Microsoft's credit, they have been warning me all along that I'm doing stupid things. I converted this machine to a Primary Domain Controller (PDC) so I could play with some AD things. And of course, I need SQL Server on here. Oh, but I shouldn't install that on the same box as the PDC. As much as I'd love to blog about all the things I've done to this poor machine, I wanted to capture here two links that were important.

The first is that Team Foundation Server does not run on a 64bit OS. 64bit is the way of the future, as I understand it, for MS products so flying blind, I selected the 64bit Enterprise Edition of Windows Server 2008 R2. That's a bit of a misstatement, I selected Enterprise ed, 64bit was the only option for W2k8. Doh! [Edit 2010-05-06]To clarify, TFS 2008 does not run on a 64bit OS. TFS 2010 does[/edit]

The other thing I wanted to point out, is that among the many ways you can configure the OS to provide some desktop functionality by adding the "Desktop Experience" feature. For some asinine reason, my install also required the ink and stylus to make the DE experience go. Time to reboot and then I can jam to soma.fm while I work with MVC2

Thursday, April 22, 2010

Python text parser

@Neil_Hambly is having some file import issues "BCP file Quoted identifier, comma seperated. some data fields have quotes. so client having import issues that issue is that these TEXT files are not going to work form them due to the data already having quotes etc.. 2/2 any suggestions" It sounded like some python scripting might be helpful for standardizing/resetting his input files. Python was my preferred tool for parsing through files as one point in my life and I thought I'd dust the cobwebs off that part of my life and attempt a solution.

Much thanks to Doug Hellmann and his Python Module of the Week for the syntax refresher.

Given these two files f1.txt
header1|"header 2"|"header | 3"|header 4
delimited file|1234|Pop goes the weasel|bob's your uncle
Looks at these c,,,,|5321|abc|bob's your uncle
and f2.txt
h1,h2,h3
a,"b,c",d
",1",2,"3"
Z,Y,"X,"

I created a file called parseIt.py
mport csv
import glob

# helpful refreshers from http://www.doughellmann.com/PyMOTW/csv/index.html
csv.register_dialect('pipes', delimiter='|')
for fname in glob.glob(r'*.txt'):
    fin = open(fname, 'rt')
    print "processing ", fname
    if fname == 'f1.txt' :
        reader = csv.reader(fin, dialect='pipes')
    else:
        reader = csv.reader(fin)

    # Perform whatever action with the parsed data, either rewrite into something simpler or
    # make direct database calls
    # This code will simply rewrite everything into a consistent output format

    fout = open(fname + ".out.csv", 'wt')
    # writer = csv.writer(fout, quoting=csv.QUOTE_ALL)
    # writer = csv.writer(fout, quoting=csv.QUOTE_MINIMAL)
    writer = csv.writer(fout, quoting=csv.QUOTE_NONNUMERIC)
    # writer = csv.writer(fout, quoting=csv.QUOTE_NONE)
    for line in reader:
        print(line)
        writer.writerow(line)
    fin.close()
    fout.close()

So what's that do for us? Depending on which writer/quoting style you select above, it could give you any of the following. QUOTE_ALL, QUOTE_MINIMAL and QUOTE_NONNUMERIC presented below for f1.txt and f2.txt
"header1","header 2","header | 3","header 4"
"delimited file","1234","Pop goes the weasel","bob's your uncle"
"Looks at these c,,,,","5321","abc","bob's your uncle"

header1,header 2,header | 3,header 4
delimited file,1234,Pop goes the weasel,bob's your uncle
"Looks at these c,,,,",5321,abc,bob's your uncle

"header1","header 2","header | 3","header 4"
"delimited file","1234","Pop goes the weasel","bob's your uncle"
"Looks at these c,,,,","5321","abc","bob's your uncle"

"h1","h2","h3"
"a","b,c","d"
",1","2","3"
"Z","Y","X,"

h1,h2,h3
a,"b,c",d
",1",2,3
Z,Y,"X,"

"h1","h2","h3"
"a","b,c","d"
",1","2","3"
"Z","Y","X,"

Unfortunately, I'm going to be late for work or I'd love to show off some of the nifty things you can do with the data or the hacks you can employ to standardize it if it's even less formatted than I'm imagining.

Wednesday, April 21, 2010

Top N producers by X

A quick hit that sounded interesting. @grrl_geek was looking for "I have table of biz divisions. Each div has separate customers. Need to find each division's top 10 customers. I have table of biz divisions. Each div has separate customers. Need to find each division's top 10 customers."

As I'm racing to get this in ahead of others, I'll dispense with the usual cruft. Be sure to know what Ranking function you need to use. See references below.


SET NOCOUNT ON
DECLARE @STUFF TABLE
(
    division int NOT NULL
,   person varchar(50) NOT NULL
,   sales float NOT NULL
)

DECLARE @topN int
SELECT @topN = 3

INSERT INTO
    @STUFF
SELECT 10 AS division, 'Tom 10' AS person, 100.01
UNION ALL SELECT 10 AS division, 'Jerry 10' AS person, 30.30
UNION ALL SELECT 10 AS division, 'Paco 10' AS person, 400.01
UNION ALL SELECT 10 AS division, 'Alfred 10' AS person, 87.01
UNION ALL SELECT 10 AS division, 'John 10' AS person, 5.01
UNION ALL SELECT 10 AS division, 'Paul 10' AS person, 700.11
UNION ALL SELECT 10 AS division, 'George 10' AS person, 700.89
UNION ALL SELECT 10 AS division, 'Ringo 10' AS person, 1.87
UNION ALL SELECT 10 AS division, 'Pierre 10' AS person, 100.01
UNION ALL SELECT 10 AS division, 'Mike 10' AS person, 100.01
UNION ALL SELECT 10 AS division, 'Trevor 10' AS person, 99.99
UNION ALL SELECT 10 AS division, 'Herbie 10' AS person, 18.81
UNION ALL SELECT 10 AS division, 'Tim 10' AS person, 21.01

UNION ALL SELECT 20 AS division, 'Jerry 20' AS person, 30.30
UNION ALL SELECT 20 AS division, 'Paco 20' AS person, 400.01
UNION ALL SELECT 20 AS division, 'Alfred 20' AS person, 87.01
UNION ALL SELECT 20 AS division, 'John 20' AS person, 5.01
UNION ALL SELECT 20 AS division, 'Paul 20' AS person, 700.11
UNION ALL SELECT 20 AS division, 'George 20' AS person, 700.89
UNION ALL SELECT 20 AS division, 'Ringo 20' AS person, 1.87
UNION ALL SELECT 20 AS division, 'Pierre 20' AS person, 200.01
UNION ALL SELECT 20 AS division, 'Mike 20' AS person, 200.01
UNION ALL SELECT 20 AS division, 'Trevor 20' AS person, 99.99
UNION ALL SELECT 20 AS division, 'Herbie 20' AS person, 18.81
UNION ALL SELECT 20 AS division, 'Tim 20' AS person, 21.01

UNION ALL SELECT 30 AS division, 'Jerry 30' AS person, 30.30
UNION ALL SELECT 30 AS division, 'Paco 30' AS person, 400.01
UNION ALL SELECT 30 AS division, 'Alfred 30' AS person, 87.01
UNION ALL SELECT 30 AS division, 'John 30' AS person, 5.01
UNION ALL SELECT 30 AS division, 'Paul 30' AS person, 700.11
UNION ALL SELECT 30 AS division, 'George 30' AS person, 700.89
UNION ALL SELECT 30 AS division, 'Ringo 30' AS person, 1.87
UNION ALL SELECT 30 AS division, 'Pierre 30' AS person, 300.01
UNION ALL SELECT 30 AS division, 'Mike 30' AS person, 300.01
UNION ALL SELECT 30 AS division, 'Trevor 30' AS person, 99.99
UNION ALL SELECT 30 AS division, 'Herbie 30' AS person, 18.81
UNION ALL SELECT 30 AS division, 'Tim 30' AS person, 21.01
;
WITH RANKED AS
(
    SELECT
        S.division
    ,   S.person
    ,   S.sales
    ,   RANK() OVER (PARTITION BY S.division ORDER BY S.sales DESC) AS sales_rank
    FROM
        @STUFF S
)
SELECT
    R.*
FROM
    RANKED R
WHERE
    R.sales_rank <= @topN

References
http://msdn.microsoft.com/en-us/library/ms189798.aspx